DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Schmid et al. (WO 2018/102717), and further in view of Qi et al. ("Human-centric Indoor Scene Synthesis Using Stochastic Grammar", June 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition, pg. 5899-5908).
As per claim 1, Schmid et al., hereinafter Schmid, discloses a method comprising: 
sampling a scene graph; applying first data representative of the scene graph to a first machine learning model trained to predict updated scene graphs having synthetic attribute distributions modeled after real-world attribute distributions ([0032] where “In some implementations, once the optical flow is determined, the optical flow can be used to detect and track motion of moving objects depicted in the first image and the second images. Based on the detected and tracked motion, the motion of these moving objects can be modeled and their motion in future images can be predicted based on the model”); 
computing, using the first machine learning model and based at least in part on the first data, second data representative of a transformed scene graph, the transformed scene graph including updated attributes for at least one object represented by initial attributes within the scene graph ([0049] where “To generate the optical flow 234, the subsystem 250 first generates, from the depth map 220, an initial three-dimensional (3D) point cloud 222 corresponding to the pixels in the scene depicted in the first image. The subsystem 250 can generate the initial 3D point cloud 222 using estimated or known camera intrinsics. The subsystem 250 then transforms, using the segmentation masks 218 and the object motion output 228, the initial 3D point cloud 222 to generate an initial transformed 3D point cloud 230. Subsequently, the subsystem 250 transforms, using the camera motion output 226, the initial transformed 3D point cloud 230 to generate a final transformed 3D point cloud 232. The system then determines the optical flow 234 by projecting the final transformed 3D point cloud 232 to a two-dimensional representation of the scene in the second image”);
rendering, based at least in part on the second data, image data representative of an image ([0085] where “The scene structure decoder subnetwork includes one or more deconvolutional neural network layers. The system depth-to-space upsamples the second encoded representation using the one or more deconvolutional neural network layers to generate a second decoded representation, i.e., an up-sampled feature map that has the same resolution as the first and second image”);
generating, based at least in part on the second data, ground truth data representative of corresponding ground truth ([0104] where “if ground-truth optical flow, object masks, or object motions are available in the training input, the system can be trained to minimize, for example an L1 regression loss between predicted {U(x,y), V(x,y)} and ground-truth {UGT(x,y), VGT(x,y)} flow vectors”); and
training a second machine learning model using the image data and the ground truth data ([0104] where “if ground-truth optical flow, object masks, or object motions are available in the training input, the system can be trained to minimize, for example an L1 regression loss between predicted {U(x,y), V(x,y)} and ground-truth {UGT(x,y), VGT(x,y)} flow vectors”).
It is noted Schmid does not explicitly teach a scene graph generated using a scene grammar. However, this is known in the art as taught by Qi et al., hereinafter Qi. Qi discloses a method of generating a scene graph using a scene grammar (where the stochastic grammar is considered a scene grammar) (Abstract).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Qi into Schmid because Schmid discloses a method of generating a scene graph and Qi discloses a scene grammar could be utilized for the purpose of improving the quality of the scene graph.
As per claim 2, Schmid and Qi demonstrated all the elements as disclosed in claim 1, and Qi further discloses wherein the scene grammar is a probabilistic grammar (where the stochastic grammar is a probabilistic grammar).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Qi into Schmid because Schmid discloses a method of generating a scene graph and Qi discloses a scene grammar could be utilized for the purpose of improving the quality of the scene graph.
As per claim 3, Schmid and Qi demonstrated all the elements as disclosed in claim 1, and Schmid further discloses wherein the initial attributes and the corresponding updated attributes include at least one of a location, a pose, a dimension, a color, a class, or an identification value ([0087], which describes transforming a point cloud generated from the depth map, which is considered an attribute).
As per claim 4, Schmid and Qi demonstrated all the elements as disclosed in claim 1, and Schmid further discloses wherein the first machine learning model includes a graph convolutional network ([0084]).
As per claim 5, Schmid and Qi demonstrated all the elements as disclosed in claim 1, and Schmid further discloses wherein the first machine learning model is further trained to predict the transformed scene graphs having the synthetic attribute distributions that are tailored to a task of the second machine learning model ([0032] where the future images are considered the second machine learning model).
As per claim 6, Schmid and Qi demonstrated all the elements as disclosed in claim 5, and Schmid further discloses wherein the first machine learning model is trained using a first loss function for predicting the transformed scene graphs having the synthetic attribute distributions modeled after the real-world attribute distributions, and using a second loss function different from the first loss function for predicting the transformed scene graphs having the synthetic attribute distributions, the synthetic attribute distributions being tailored to the task of the second machine learning model ([0104] where “if ground-truth optical flow, object masks, or object motions are available in the training input, the system can be trained to minimize, for example an L1 regression loss between predicted {U(x,y), V(x,y)} and ground-truth {UGT(x,y), VGT(x,y)} flow vectors”). As for the first loss function and the second loss function, since Schmid discloses predicting a future scene graph using a regression loss, it would have been obvious to one of ordinary skill in the art to expand the loss function to multiple loss functions for the purpose of improving the quality of the scene graph.
As per claim 9, Schmid and Qi demonstrated all the elements as disclosed in claim 1.
As for the limitation wherein the second machine learning model is tested on a real-world validation dataset prior to deployment, since testing against a real-world dataset before deployment is notoriously well known in the art (Official Notice), it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the method for the purpose of ensuring better results.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Schmid et al. (WO 2018/102717) and Qi et al. ("Human-centric Indoor Scene Synthesis Using Stochastic Grammar", June 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition, pg. 5899-5908), and further in view of Chatty (EP 3185113).
As per claim 7, Schmid and Qi demonstrated all the elements as disclosed in claim 1.
It is noted Schmid and Qi do not explicitly teach wherein a first subset of the initial attributes are mutable and a second subset of the initial attributes are fixed, further wherein only the first subset account for the updated attributes within the transformed scene graph. However, this is known in the art as taught by Chatty. Chatty discloses a method of generating objects in a scene in which a subset of attributes could be selected for transformation (Abstract).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Chatty into Schmid and Qi because Schmid and Qi disclose a method of generating a scene graph and Chatty discloses a subset of attributes could be selected for transformation for the purpose of better defining the scene graph.
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Schmid et al. (WO 2018/102717) and Qi et al. ("Human-centric Indoor Scene Synthesis Using Stochastic Grammar", June 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition, pg. 5899-5908), and further in view of Sharma et al. (US 10,474,917).
As per claim 8, Schmid and Qi demonstrated all the elements as disclosed in claim 1.
It is noted Schmid and Qi do not explicitly teach wherein the ground truth data is automatically generated using the image data to correspond to a task of the second machine learning model. However, this is known in the art as taught by Sharma et al., hereinafter Sharma. Sharma discloses an image processing method in which the ground truth data is generated from training the second machine learning model ([0026]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Sharma into Schmid and Qi because Schmid and Qi disclose a method of generating a scene graph and Sharma discloses the ground truth data could be generated for the purpose of shortening the path to the final segmentation.

Allowable Subject Matter
Claims 10-23 are allowed.
The following is a statement of reasons for the indication of allowable subject matter:  
The closest prior art, Schmid et al. (WO 2018/102717) and Qi et al. ("Human-centric Indoor Scene Synthesis Using Stochastic Grammar", June 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition, pg. 5899-5908), does not explicitly teach
determining a discrepancy by comparing synthetic attribute distributions corresponding to the synthetic images to real-world attribute distributions corresponding to real-world images;
based at least in part on the discrepancy, generating fourth data representative of network update information; and training the machine learning model using the fourth data.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN R YANG whose telephone number is (571)272-7666. The examiner can normally be reached 9:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached on (571) 272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RYAN R YANG/
Primary Examiner, Art Unit 2616
May 21, 2022