Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
DETAILED ACTION
Claim Interpretations
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

Use of the word “means” (or “step for”) in a claim with functional language creates a rebuttable presumption that the claim element is to be treated in accordance with 35 U.S.C. 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph).  The presumption that 35 U.S.C. 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph) is invoked is rebutted when the function is recited with sufficient structure, material, or acts within the claim itself to entirely perform the recited function.  
Absence of the word “means” (or “step for”) in a claim creates a rebuttable presumption that the claim element is not to be treated in accordance with 35 U.S.C. 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph).  The presumption that 35 U.S.C. 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph) is not invoked is rebutted when the claim element recites function but fails to recite sufficiently definite structure, material or acts to perform that function. 


Since the claim limitation(s) invokes 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, claims 2, 17 and 21 have been interpreted to cover the corresponding structure described in the specification that achieves the claimed function, and equivalents thereof.  

A review of the specification shows that the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph limitation: for example, FIG. 4 and paragraph [0048] (“Inverse renderer 402 is trained to generate an implicit representation 408 (also referred to as a scene representation) of an object from an input image 400 of the object (e.g., a two-dimensional input image) from a particular view. Forward renderer 406 generates an output 412 based on the implicit representation 408 from the inverse renderer 402”); and paragraph [0050] (“A shear rotation module can be particularly helpful for facilitating machine learning models based on rotational equivariance”).


If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

	
	
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-5, 10 and 17 are rejected under 35 U.S.C. 102 (a)(1) as being anticipated by YANG et al  (U.S. Patent Application Publication 2018/0234671 A1).

Regarding claim 1, YANG discloses a method comprising: 
providing an input image depicting a view of an object to a machine learning model (Paragraph [0044], system 150 includes a plurality of machine-learning networks. In the embodiment shown in FIG. 1B, system 150 includes at least geometric flow network (GFN) 160, image completion network (ICN) 170, and loss network 180; paragraph [0048], GFN 160 receiving a single 2D source image 164. Source image 164 includes a view of the 3D object (an automobile) from the source viewpoint VS; paragraph [0052], GFM “geometric flow model” 162 of GFN 160 is trained to perform such a transformation without the 3D data normally employed to perform such a rotational transformation, e.g. by explicitly rotationally transforming (or moving) pixel values via the flow field) that has been trained based on a constraint of equivariance under rotations (Paragraph [0049], GFN 160 may receive one or more viewpoint parameters that indicate the target viewpoint VT={θT, ϕT}. For instance, the received viewpoint parameters may include one or more of VT={θT, ϕT} and/or rotational transformation R=VT−Vs={Δθ, Δϕ}={θT−θS, ϕT−ϕS}. In at least one embodiment, the one or more viewpoint parameters includes one or more parameters indicating the source viewpoint, VS={θS, ϕS}) between a training object (Paragraph [0048], source image 164 includes a view of the 3D object (an automobile) from the source viewpoint VS) and a model-generated representation of the training object (Paragraph [0058], target image 190 is from target viewpoint VT); and 
generating, using the machine learning model and based on the provided input image (Paragraphs [0048]-[0049], GFN 160 receiving a single 2D source image 164 and rotational transformation R), at least one of an output image that depicts the object from a rotated view that is different from the view of the object in the input image, or a three-dimensional representation of the object (Paragraph [0050], based on the 2D data encoding source image 164 and the one or more viewpoint parameters, GFN 160 generates an intermediate image 174. Intermediate image 174 includes an intermediate view of the object from the target viewpoint. The intermediate view of the object is a rotated view of the object, where the rotation is based on the rotational transformation; paragraphs [0057]-[0058], the intermediate image 174 is provided to ICN 170 for image completion … ICM 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160)).
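For illustration only, the rotational transformation quoted from paragraph [0049] of YANG, R=VT−VS={θT−θS, ϕT−ϕS}, can be sketched as a component-wise difference of viewpoint parameters. The names `Viewpoint` and `rotational_transformation` are hypothetical and do not appear in YANG, and treating θ and ϕ as angles in degrees is an assumption.

```python
from typing import NamedTuple


class Viewpoint(NamedTuple):
    """Hypothetical container for the two viewpoint parameters {theta, phi}."""
    theta: float  # first viewpoint angle (unit assumed to be degrees)
    phi: float    # second viewpoint angle (unit assumed to be degrees)


def rotational_transformation(source: Viewpoint, target: Viewpoint) -> Viewpoint:
    """Return R = V_T - V_S = {theta_T - theta_S, phi_T - phi_S}."""
    return Viewpoint(target.theta - source.theta, target.phi - source.phi)


if __name__ == "__main__":
    v_s = Viewpoint(theta=30.0, phi=10.0)  # source viewpoint V_S
    v_t = Viewpoint(theta=90.0, phi=25.0)  # target viewpoint V_T
    print(rotational_transformation(v_s, v_t))
```

The sketch only shows how the source viewpoint, target viewpoint, and relative rotation relate; it does not model how GFN 160 uses them.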

	Regarding claim 2, YANG discloses everything claimed as applied above (see claim 1), and YANG further discloses wherein the machine learning model includes: 
an inverse renderer (FIG. 1B; paragraph [0052], GFM “geometric flow model” 162 of GFN 160 is trained to perform such a transformation without the 3D data normally employed to perform such a rotational transformation, e.g. by explicitly rotationally transforming (or moving) pixel values via the flow field); and 
a forward renderer (Paragraph [0058], ICM “image completion model” 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160)).

	Regarding claim 3, YANG discloses everything claimed as applied above (see claim 2), and YANG further discloses wherein generating the at least one of the output image that depicts the object from the rotated view that is different from the view of the object in the input image or the three-dimensional representation of the object (See claim 1) comprises generating the at least one of the output image that depicts the object from the rotated view that is different from the view of the object in the input image or the three-dimensional representation of the object with the forward renderer (FIG. 1B; paragraph [0050], GFN 160 generates an intermediate image 174. Intermediate image 174 includes an intermediate view of the object from the target viewpoint. The intermediate view of the object is a rotated view of the object, where the rotation is based on the rotational transformation; paragraph [0057], the intermediate image 174 is provided to ICN 170 for image completion; paragraph [0058], ICM “image completion model” 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160)).

	Regarding claim 4, YANG discloses everything claimed as applied above (see claim 3), and YANG further discloses generating an implicit representation of the object with the inverse renderer based on the input image (FIG. 1B; paragraph [0050], GFN 160 generates an intermediate image 174. Intermediate image 174 includes an intermediate view of the object from the target viewpoint. The intermediate view of the object is a rotated view of the object, where the rotation is based on the rotational transformation).

	Regarding claim 5, YANG discloses everything claimed as applied above (see claim 4), and YANG further discloses wherein the forward renderer generates the at least one of the output image that depicts the object from the rotated view that is different from the view of the object in the input image or the three-dimensional representation of the object (FIG. 1B; paragraphs [0057]-[0058], essentially, ICM 172 is trained to hallucinate (or predict) the disoccluded region of intermediate image 174. Essentially, the ICM 172 is trained to generate a prediction of the disoccluded portion of the object and/or a prediction for the disoccluded region of the intermediate image 174. The ICM 172 updates the incomplete region of intermediate image 174 with the prediction … ICM 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160), as well as a prediction (or hallucination) of the disoccluded portion of the object in the target image 190) based on the implicit representation generated by the inverse renderer (Paragraph [0050], GFN 160 generates an intermediate image 174. Intermediate image 174 includes an intermediate view of the object from the target viewpoint. The intermediate view of the object is a rotated view of the object, where the rotation is based on the rotational transformation; paragraph [0057], the intermediate image 174 is provided to ICN 170 for image completion).

	Regarding claim 10, YANG discloses everything claimed as applied above (see claim 4), and YANG further discloses wherein generating the implicit representation of the object with the inverse renderer (FIG. 1B; paragraph [0050], based on the 2D data encoding source image 164 and the one or more viewpoint parameters, GFN 160 generates an intermediate image 174. Intermediate image 174 includes an intermediate view of the object from the target viewpoint. The intermediate view of the object is a rotated view of the object, where the rotation is based on the rotational transformation) based on the input image comprises generating the implicit representation in a single forward pass of the inverse renderer (Paragraph [0048], GFN 160 receiving a single 2D source image 164. Source image 164 includes a view of the 3D object (an automobile) from the source viewpoint VS; paragraphs [0057]-[0058], the intermediate image 174 is provided to ICN 170 for image completion … ICM 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160)).

	Regarding claim 17, YANG discloses a system (FIG. 1B; paragraph [0040], an image generation system (IGS) 150) comprising: 
a processor (Paragraph [0040], an image generation computing device (IGCD) 158; paragraph [0143], FIG. 9 shows computing device 900, one or more processors 914); - 39 -Attorney Docket No.: 122202-7187 (P49108US1) 
a memory device (Paragraph [0143], computing device 900 includes a bus 910 that directly or indirectly couples the following devices: memory 912) containing instructions (Paragraph [0147], memory 912 includes computer storage media in the form of volatile and/or nonvolatile memory. Memory 912 may be non-transitory memory. As depicted, memory 912 includes instructions 924), which when executed by the processor (Paragraph [0147], instructions 924, when executed by processor(s) 914 are configured to cause the computing device to perform any of the operations described herein, in reference to the above discussed figures) cause the processor to: 
provide an input image depicting a view of an object to a machine learning model (Paragraph [0044], system 150 includes a plurality of machine-learning networks. In the embodiment shown in FIG. 1B, system 150 includes at least geometric flow network (GFN) 160, image completion network (ICN) 170, and loss network 180; paragraph [0048], GFN 160 receiving a single 2D source image 164. Source image 164 includes a view of the 3D object (an automobile) from the source viewpoint VS; paragraph [0052], GFM “geometric flow model” 162 of GFN 160 is trained to perform such a transformation without the 3D data normally employed to perform such a rotational transformation, e.g. by explicitly rotationally transforming (or moving) pixel values via the flow field) that has been trained based on a constraint of equivariance under rotations (Paragraph [0049], GFN 160 may receive one or more viewpoint parameters that indicate the target viewpoint VT={θT, ϕT}. For instance, the received viewpoint parameters may include one or more of VT={θT, ϕT} and/or rotational transformation R=VT−Vs={Δθ, Δϕ}={θT−θS, ϕT−ϕS}. In at least one embodiment, the one or more viewpoint parameters includes one or more parameters indicating the source viewpoint, VS={θS, ϕS}) between a training object (Paragraph [0048], source image 164 includes a view of the 3D object (an automobile) from the source viewpoint VS) and a model-generated representation of the training object (Paragraph [0058], target image 190 is from target viewpoint VT); and 
generate, using the machine learning model and based on the provided input image (Paragraphs [0048]-[0049], GFN 160 receiving a single 2D source image 164 and rotational transformation R), at least one of an output image that depicts the object from a rotated view that is different from the view of the object in the input image, or a three-dimensional representation of the object (Paragraph [0050], based on the 2D data encoding source image 164 and the one or more viewpoint parameters, GFN 160 generates an intermediate image 174. Intermediate image 174 includes an intermediate view of the object from the target viewpoint. The intermediate view of the object is a rotated view of the object, where the rotation is based on the rotational transformation; paragraphs [0057]-[0058], the intermediate image 174 is provided to ICN 170 for image completion … ICM 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160)).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 6, 11-12, 20 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over YANG et al. (U.S. Patent Application Publication 2018/0234671 A1) in view of DUPONT et al. ("Equivariant Neural Rendering," Arxiv.org, Cornell University Library, December 2020, 14 pages, listed in IDS submitted by Applicant on 09/24/2021).

Regarding claim 6, YANG discloses everything claimed as applied above (see claim 5).
 	However, YANG does not specifically disclose wherein generating the at least one of the output image that depicts the object from the rotated view that is different from the view of the object in the input image or the three-dimensional representation of the object based on the implicit representation comprises rotating the implicit representation of the object.
	In the similar field of endeavor, DUPONT discloses (Abstract, we propose a framework for learning neural scene representations directly from images, without 3D supervision. Our key insight is that 3D structure can be imposed by ensuring that the learned representation transforms like a real 3D scene …) wherein generating the at least one of the output image that depicts the object from the rotated view that is different from the view of the object in the input image or the three-dimensional representation of the object (FIG. 4, page 3, last paragraph of the right hand side, we first map the images through the inverse renderer to obtain their scene representations z1 = f(x1) and z2 = f(x2). We then rotate each encoded representation by its relative transformation RθZ, such that ẑ1 = RθZ z1 and ẑ2 = (RθZ) −1z2) based on the implicit representation comprises rotating the implicit representation of the object (Page 4, FIG. 4, we encode two images x1, x2 of the same scene into their respective scene representations z1, z2).
	YANG and DUPONT are analogous art because both pertain to using a training method to rotate the object in the input image. It would have been obvious to a person of ordinary skill in the art before the effective filing date 

	Regarding claim 11, YANG discloses everything claimed as applied above (see claim 1).
 	However, YANG does not specifically disclose further comprising training the machine learning model based on the constraint of equivariance under rotations between the training object and the model-generated representation of the training object by: 
providing a first input training image depicting a first view of the training object to the machine learning model; 
providing a second input training image depicting a second view of the training object to the machine learning model; 
generating a first implicit representation of the training object based on the first input training image; 
generating a second implicit representation of the training object based on the second input training image; 
rotating the first implicit representation of the training object; 
rotating the second implicit representation of the training object; 
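generating a first output training image based on the rotated first implicit representation of the object; 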
generating a second output training image based on the rotated second implicit representation of the object; 
comparing the first input training image to the second output training image; and 
comparing the second input training image to the first output training image.
In the similar field of endeavor, DUPONT discloses (Abstract, we propose a framework for learning neural scene representations directly from images, without 3D supervision. Our key insight is that 3D structure can be imposed by ensuring that the learned representation transforms like a real 3D scene …) further comprising training the machine learning model based on the constraint of equivariance under rotations between the training object and the model-generated representation of the training object (Pages 3-4; “4.Model”, FIG. 4. Model training) by: 
providing a first input training image depicting a first view of the training object to the machine learning model (Page 4, FIG. 4, image x1 corresponding to scene depicting a first view of the training object); 
providing a second input training image depicting a second view of the training object to the machine learning model (Page 4, FIG. 4, image x2 corresponding to scene depicting a second view of the training object); 
generating a first implicit representation of the training object based on the first input training image (Page 3, the last paragraph of the right hand side, scene representation z1 = f(x1)); 
generating a second implicit representation of the training object based on the second input training image (Page 3, the last paragraph of the right hand side, scene representation z2 = f(x2)); 
rotating the first implicit representation of the training object (FIG. 4, page 3, last paragraph of the right hand side, we then rotate representation z1 by its relative transformation RθZ, such that ẑ1 = RθZ z1); 
rotating the second implicit representation of the training object (FIG. 4, page 3, last paragraph of the right hand side, we then rotate representation z2 by its relative transformation RθZ, such that ẑ2 = (RθZ) −1z2); 
generating a first output training image based on the rotated first implicit representation of the object (Page 4, FIG. 4, output g(ẑ1)); 
generating a second output training image based on the rotated second implicit representation of the object (Page 4, FIG. 4, output g(ẑ2)); 
comparing the first input training image to the second output training image (Page 4, left hand side, ||x1 − g(ẑ2)||); and 
comparing the second input training image to the first output training image (Page 4, left hand side, ||x2 − g(ẑ1)||).
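The training steps mapped above to DUPONT (encode both views, rotate each representation into the other view, render, and compare against the swapped images) can be sketched as follows. This is a minimal illustration under stated assumptions, not DUPONT's implementation: the function names are hypothetical, the inverse renderer f and forward renderer g are passed in as plain callables, the scene representation is assumed to be a 3×N array of points, and the relative transformation is assumed to be a 3×3 rotation matrix.

```python
import numpy as np


def rotation_z(angle_rad):
    """3x3 rotation matrix about the z axis (stand-in for the relative rotation)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def equivariance_comparisons(x1, x2, rotation, inverse_renderer, forward_renderer):
    """Return (||x1 - g(z2_hat)||, ||x2 - g(z1_hat)||) for one training pair."""
    z1 = inverse_renderer(x1)        # z1 = f(x1)
    z2 = inverse_renderer(x2)        # z2 = f(x2)
    z1_hat = rotation @ z1           # rotate z1 toward the second view
    z2_hat = rotation.T @ z2         # inverse rotation (transpose) toward the first view
    out1 = forward_renderer(z1_hat)  # first output training image, g(z1_hat)
    out2 = forward_renderer(z2_hat)  # second output training image, g(z2_hat)
    return np.linalg.norm(x1 - out2), np.linalg.norm(x2 - out1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(3, 5))  # toy "image" of the first view
    rot = rotation_z(0.7)
    identity = lambda a: a            # toy renderers, assumed perfectly equivariant
    d1, d2 = equivariance_comparisons(points, rot @ points, rot, identity, identity)
    print(d1, d2)
```

With the identity renderers used in the toy run, both comparisons are zero up to floating-point error, which is the condition an equivariant model is trained toward.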
YANG and DUPONT are analogous art because both pertain to using a training method to rotate the object in the input image. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG to incorporate the teachings of DUPONT, applying the equivariant neural rendering taught by DUPONT to provide the rotated transition from the intermediate image to the target 

	Regarding claim 12, the combination of YANG in view of DUPONT discloses everything claimed as applied above (see claim 11).
 	However, YANG does not specifically disclose wherein the training further comprises minimizing a loss function based on the comparing of the first input training image to the second output training image and the comparing of the second input training image to the first output training image.          
In the similar field of endeavor, DUPONT discloses wherein the training further comprises minimizing a loss function (Page 4, left hand side, we can then ensure our model obeys these transformations by minimizing Lrender =||x2 − g(ẑ1)|| + ||x1 − g(ẑ2)||) based on the comparing of the first input training image to the second output training image (Page 4, left hand side, ||x1 − g(ẑ2)||) and the comparing of the second input training image to the first output training image  (Page 4, left hand side, ||x2 − g(ẑ1)||).
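As a hedged sketch of the quoted loss, Lrender = ||x2 − g(ẑ1)|| + ||x1 − g(ẑ2)||, the function below sums the two comparison terms. The function and argument names are hypothetical, and the choice of the Frobenius (Euclidean) norm is an assumption, since the quoted passage does not specify which norm is used.

```python
import numpy as np


def render_loss(x1, x2, rendered_from_z1_hat, rendered_from_z2_hat):
    """L_render = ||x2 - g(z1_hat)|| + ||x1 - g(z2_hat)||.

    The norm is assumed to be the Frobenius norm (NumPy's default for arrays).
    """
    term_from_first = np.linalg.norm(x2 - rendered_from_z1_hat)   # x2 vs. first output
    term_from_second = np.linalg.norm(x1 - rendered_from_z2_hat)  # x1 vs. second output
    return float(term_from_first + term_from_second)


if __name__ == "__main__":
    x1 = np.zeros((2, 2))
    x2 = np.ones((2, 2))
    # Perfect renders: g(z1_hat) equals x2 and g(z2_hat) equals x1.
    print(render_loss(x1, x2, x2, x1))
```

Minimizing this quantity over the model parameters drives each rendered output toward the image of the other view.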
YANG and DUPONT are analogous art because both pertain to using a training method to rotate the object in the input image. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG to incorporate the teachings of DUPONT, applying the equivariant neural rendering taught by DUPONT to provide a loss function for minimizing the loss in order to satisfy the 
         
Regarding claim 20, YANG discloses a non-transitory machine-readable medium comprising code that, when executed by a processor (Paragraph [0143], FIG. 9 shows computing device 900, memory 912, one or more processors 914; paragraph [0147], memory 912 includes computer storage media in the form of volatile and/or nonvolatile memory. Memory 912 may be non-transitory memory. As depicted, memory 912 includes instructions 924. Instructions 924, when executed by processor(s) 914 are configured to cause the computing device to perform any of the operations described herein, in reference to the above discussed figures), causes the processor to: 
provide an input image depicting a view of an object to a machine learning model (Paragraph [0044], system 150 includes a plurality of machine-learning networks. In the embodiment shown in FIG. 1B, system 150 includes at least geometric flow network (GFN) 160, image completion network (ICN) 170, and loss network 180; paragraph [0048], GFN 160 receiving a single 2D source image 164. Source image 164 includes a view of the 3D object (an automobile) from the source viewpoint VS; paragraph [0052], GFM “geometric flow model” 162 of GFN 160 is trained to perform such a transformation without the 3D data normally employed to perform such a rotational transformation, e.g. by explicitly rotationally transforming (or moving) pixel values via the flow field) that has been trained (Paragraph [0049], GFN 160 may receive one or more viewpoint parameters that indicate the target viewpoint VT={θT, ϕT}. For instance, the received viewpoint parameters may include one or more of VT={θT, ϕT} and/or rotational transformation R=VT−Vs={Δθ, Δϕ}={θT−θS, ϕT−ϕS}. In at least one embodiment, the one or more viewpoint parameters includes one or more parameters indicating the source viewpoint, VS={θS, ϕS}); and 
generate, using the machine learning model and based on the provided input image (Paragraphs [0048]-[0049], GFN 160 receiving a single 2D source image 164 and rotational transformation R), at least one of an output image that depicts the object from a rotated view that is different from the view of the object in the input image, or a three-dimensional representation of the object (Paragraph [0050], based on the 2D data encoding source image 164 and the one or more viewpoint parameters, GFN 160 generates an intermediate image 174. Intermediate image 174 includes an intermediate view of the object from the target viewpoint. The intermediate view of the object is a rotated view of the object, where the rotation is based on the rotational transformation; paragraphs [0057]-[0058], the intermediate image 174 is provided to ICN 170 for image completion … ICM 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160)).
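 	However, YANG does not specifically disclose a machine learning model that has been trained based on at least two training images depicting different views of a training object.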
	In the similar field of endeavor, DUPONT discloses (Abstract, we propose a framework for learning neural scene representations directly from images, without 3D supervision. Our key insight is that 3D structure can be imposed by ensuring that the learned representation transforms like a real 3D scene …) a machine learning model (Page 3, “4. Model”; page 4, FIG. 4, Model training) that has been trained based on at least two training images depicting different views of a training object (Page 4, FIG. 4, we encode two images x1, x2 of the same scene into their respective scene representations z1, z2. Since they are representations of the same scene viewed from different points, we can rotate each one into the other. The rotated scene representations ẑ1, ẑ2 should then be decoded to match the swapped image pairs x2, x1).
	YANG and DUPONT are analogous art because both pertain to using a training method to rotate the object in the input image. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG to incorporate the teachings of DUPONT, applying the equivariant neural rendering taught by DUPONT to provide the model training based on at least two training images depicting different views of a training object. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify YANG 

	Regarding claim 22, the combination of YANG in view of DUPONT discloses everything claimed as applied above (see claim 20).
 	However, YANG does not specifically disclose wherein the machine learning model has been trained based on the at least two training images without three-dimensional supervision.
	In the similar field of endeavor, DUPONT discloses wherein the machine learning model has been trained based on the at least two training images (Page 4, FIG. 4, we encode two images x1, x2 of the same scene into their respective scene representations z1, z2. Since they are representations of the same scene viewed from different points, we can rotate each one into the other. The rotated scene representations ẑ1, ẑ2 should then be decoded to match the swapped image pairs x2, x1) without three-dimensional supervision (Page 1, right hand side, our model is trained with no 3D supervision and only requires images and their relative poses to learn equivariant scene representations).
	YANG and DUPONT are analogous art because both pertain to using a training method to rotate the object in the input image. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG to incorporate the teachings of DUPONT, applying the equivariant neural rendering taught by DUPONT to provide the model training based on at least two training images depicting .

Claims 7-9 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over YANG et al. (U.S. Patent Application Publication 2018/0234671 A1) in view of DUPONT et al. ("Equivariant Neural Rendering," Arxiv.org, Cornell University Library, December 2020, 14 pages, listed in IDS submitted by Applicant on 09/24/2021) in view of SCHRODER et al. ("Fast rotation of volume data on parallel architectures," IEEE Conference on Visualization, October 1991, pp. 50-57, listed in IDS submitted by Applicant on 09/24/2021).

	Regarding claim 7, the combination of YANG in view of DUPONT discloses everything claimed as applied above (see claim 6), the combination of YANG in view of DUPONT wherein rotating the implicit representation of the object comprises performing a rotation of the implicit representation of the object (See claim 6).  
	However, the combination of YANG in view of DUPONT does not specifically disclose that the rotation is a shear rotation.
	In addition, SCHRODER (“Fast rotation of volume data on parallel architectures”) discloses that the rotation is a shear rotation (Page 52 of SCHRODER describes “shear decomposition of rotation” and right hand side of page 52 describes “Another way to factor an orientation into seven shear matrices is based on the observation that any orientation can be expressed as a rotation of the form Ra’RbRa …”).
	Thus, SCHRODER discloses that a rotation can be performed as a shear rotation for fast rotation of volume data. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the combination of YANG in view of DUPONT by the teachings of SCHRODER, applying the shear rotation in the trained model in order to perform a shear rotation of the implicit representation of the object.
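
	For illustration only, the general idea of expressing a rotation as a product of shears can be sketched with the well-known two-dimensional three-shear factorization. This is an assumption chosen for brevity and is not SCHRODER's three-dimensional factorization into shear matrices quoted above; the function names shear_x, shear_y, matmul, rotation_from_shears and rotation are hypothetical.

```python
import math

def shear_x(a):
    """2x2 shear along the x axis: x' = x + a*y, y' = y."""
    return [[1.0, a], [0.0, 1.0]]

def shear_y(b):
    """2x2 shear along the y axis: x' = x, y' = y + b*x."""
    return [[1.0, 0.0], [b, 1.0]]

def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation_from_shears(theta):
    """Compose a rotation by theta as shear_x * shear_y * shear_x.
    Not defined at theta = pi (and odd multiples), where tan(theta/2) diverges."""
    a = -math.tan(theta / 2.0)
    b = math.sin(theta)
    return matmul(matmul(shear_x(a), shear_y(b)), shear_x(a))

def rotation(theta):
    """Standard 2x2 rotation matrix, for comparison."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

theta = 0.7
from_shears = rotation_from_shears(theta)
direct = rotation(theta)
max_diff = max(abs(from_shears[i][j] - direct[i][j])
               for i in range(2) for j in range(2))
```

	Each shear moves data along a single axis only, which is why such factorizations are attractive for resampling grid data; the comparison above checks that the three shears reproduce the standard rotation matrix up to floating-point error.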

	Regarding claim 8, the combination of YANG in view of DUPONT in view of SCHRODER discloses everything claimed as applied above (see claim 7).
	However, YANG does not specifically disclose wherein the three-dimensional representation comprises an explicit three-dimensional representation including at least one of a voxel grid, a mesh or a point cloud.
	In the similar field of endeavor, DUPONT discloses wherein the three-dimensional representation comprises an explicit three-dimensional representation including at least one of a voxel grid, a mesh or a point cloud (Page 3, left hand side, “indeed, meshes, voxels, point clouds (and so on) paired with their appropriate rendering function all satisfy this equation”; right hand side, “While our formulation applies to general transformations and scene representations, we focus on the case where the scene representations are deep voxels and the family of transformations is 3D rotations”).


	Regarding claim 9, the combination of YANG in view of DUPONT in view of SCHRODER discloses everything claimed as applied above (see claim 7).
 	However, YANG does not specifically disclose wherein the implicit representation of the object comprises a tensor or a latent space of an autoencoder. 
	In the similar field of endeavor, DUPONT discloses wherein the implicit representation of the object comprises a tensor or a latent space of an autoencoder (Page 3, left hand side, “Implicit representations, in contrast, are abstract and need not be human interpretable. For example, z could be the latent space of an autoencoder and g a neural network”). 
	YANG and DUPONT are analogous art because both pertain to using a training method for rotating an object in an input image. It would have been obvious to a person of ordinary skill in the art before the effective filing date 

Regarding claim 21, the combination of YANG in view of DUPONT discloses everything claimed as applied above (see claim 20), and YANG further discloses wherein the machine learning model includes an inverse renderer (FIG. 1B; paragraph [0052], GFM “geometric flow model” 162 of GFN 160 is trained to perform such a transformation without the 3D data normally employed to perform such a rotational transformation, e.g. by explicitly rotationally transforming (or moving) pixel values via the flow field) and a forward renderer (Paragraph [0058], ICM “image completion model” 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160)).
	However, YANG does not specifically disclose a shear rotation module.
In addition, SCHRODER (“Fast rotation of volume data on parallel architectures”) discloses a shear rotation module (Page 52 of SCHRODER describes “shear decomposition of rotation” and right hand side of page 52 describes “Another way to factor an orientation into seven shear matrices is based on the observation that any orientation can be expressed as a rotation of the form Ra’RbRa …”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG by the teachings of SCHRODER, and implement a shear rotation module in the machine learning model in order to perform a shear rotation of the implicit representation of the object.

Claims 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over YANG et al  (U.S. Patent Application Publication 2018/0234671 A1) in view of WORRALL et al (“Interpretable Transformations with Encoder-Decoder Networks,”  Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 5726-5735, listed in IDS submitted by Applicant on 06/03/2021).

	Regarding claim 15, YANG discloses everything claimed as applied above (see claim 1).
 	However, YANG does not specifically disclose further comprising training the machine learning model based on at least two input training images without three-dimensional supervision of the training. 
	In the similar field of endeavor, WORRALL discloses further comprising training the machine learning model (Page 5739, right hand side, “3. Method. In this section we design a neural network to learn an interpretable transformation equivariant feature-space. Our method can cope with continuous transformations on intervals, for example, uniform scalings and stretches, and continuous transformations on circles, such as, geometric rotation and relighting, but not discrete transformations, like vertical flips”) based on at least two input training images without three-dimensional supervision of the training (Page 5739, right hand side, “3.1. Problem Setup” describes a training set of many images x and sets up the relative transformations without three-dimensional supervision).
	YANG and WORRALL are analogous art because both pertain to using a training method for rotating an object in an input image. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG to incorporate the teachings of WORRALL, applying the neural network that learns an interpretable transformation equivariant feature-space taught by WORRALL to provide at least two input training images without three-dimensional supervision for training the machine learning model. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify YANG according to the relied-upon teachings of WORRALL to obtain the invention as specified in claim 15.

	Regarding claim 16, the combination of YANG in view of WORRALL discloses everything claimed as applied above (see claim 15).
	However, YANG does not specifically disclose further comprising testing the trained machine learning model without providing pose information to the trained machine learning model.
	In the similar field of endeavor, WORRALL discloses further comprising testing the trained machine learning model without providing pose information to the trained machine learning model (Page 5743, right hand side, “Real faces For fun, we feed images of real faces into our system, to recognize basic pose, shape, appearance, and lighting. We take internet images, cropping out background and hair”; page 5744, FIG. 9, “We pass images of real faces through our system re-orienting 50° from the initial pose, while fixing all other transformation parameters. Despite being trained on artificial data, the system is able to extract basic pose, shape, appearance and illumination. The system struggles to match shape properly, since these are far from the training set”).
	YANG and WORRALL are analogous art because both pertain to using a training method for rotating an object in an input image. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG to incorporate the teachings of WORRALL, applying the neural network that learns an interpretable transformation equivariant feature-space taught by WORRALL to pass images to the trained machine learning model without providing pose information, thereby testing the trained machine learning model by having it extract the basic pose of the object. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify YANG according to the relied-upon teachings of WORRALL to obtain the invention as specified in claim 16.

Claims 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over YANG et al (U.S. Patent Application Publication 2018/0234671 A1) in view of SCHRODER et al (“Fast rotation of volume data on parallel architectures,” IEEE Conference on Visualization, October 1991, pp. 50-57, listed in IDS submitted by Applicant on 09/24/2021).

Regarding claim 18, YANG discloses everything claimed as applied above (see claim 17), and YANG further discloses wherein the machine learning model includes an inverse renderer (FIG. 1B; paragraph [0052], GFM “geometric flow model” 162 of GFN 160 is trained to perform such a transformation without the 3D data normally employed to perform such a rotational transformation, e.g. by explicitly rotationally transforming (or moving) pixel values via the flow field) and a forward renderer (Paragraph [0058], ICM “image completion model” 172 generates target image 190 based on intermediate image 174 and the prediction of the disoccluded portion 198 of the object. Similar to intermediate image 174, target image 190 is from target viewpoint VT. Target image 190 includes a target view of the object that includes the common portion of the object (as rotated from the source viewpoint VS via GFN 160)).
However, YANG does not specifically disclose a shear rotation module.
In addition, SCHRODER (“Fast rotation of volume data on parallel architectures”) discloses a shear rotation module (Page 52 of SCHRODER describes “shear decomposition of rotation” and right hand side of page 52 describes “Another way to factor an orientation into seven shear matrices is based on the observation that any orientation can be expressed as a rotation of the form Ra’RbRa …”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG by the teachings of SCHRODER, and implement a shear rotation module in the machine learning model in order to perform a shear rotation of the implicit representation of the object.

	Regarding claim 19, the combination of YANG in view of SCHRODER discloses everything claimed as applied above (see claim 18).
However, YANG does not specifically disclose wherein a model architecture of the machine learning model, including the shear rotation module, is fully differentiable.
	In addition, SCHRODER discloses wherein a model architecture of the machine learning model, including the shear rotation module, is fully differentiable (right hand side of page 52 describes “Another way to factor an orientation into seven shear matrices is based on the observation that any orientation can be expressed as a rotation of the form Ra’RbRa …”).
	It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the image generation system taught by YANG by the teachings of SCHRODER, and implement shear rotation matrices in the machine learning model in order to perform a shear rotation of the implicit representation of the object.

Allowable Subject Matter
Claims 13-14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

	Dependent claim 13 depends from dependent claim 12 and recites the additional limitations of “comparing the first implicit representation to the rotated second implicit representation; and comparing the second implicit representation to the rotated first implicit representation” for performing the additional comparison. However, the search results (the Examiner has listed additional prior art directed to similar inventions on the PTO-892) fail to show the obviousness of the claims as a whole. None of the prior art cited, alone or in combination, teaches or provides motivation for the above limitations recited in claim 13. Dependent claim 14 depends from dependent claim 13 and would be allowable for the same reasons.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Xilin Guo whose telephone number is (571)272-5786. The examiner can normally be reached Monday - Friday 9:00 AM-5:30 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/XILIN GUO/Primary Examiner, Art Unit 2616