Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 19-22 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Claim 19 recites "the one or more extraction models" in line 14, which lacks proper antecedent basis.  Because it cannot be determined which extraction model or models this phrase refers to, the scope of the claim is unclear.  Claims 20-22 depend on claim 19 but fail to cure the cited deficiency.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-5, 7, 13-16, 23, 24, and 26 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Tran et al. ("On Learning 3D Face Morphable Model from In-the-Wild Images"; hereinafter "Tran").
Regarding claim 1, Tran discloses A processor-implemented method ("computer graphics," pg. 157, sec. 1, para. 1), the method comprising: determining albedo data in a canonical space and depth data in the canonical space based on input image data including an object, using one or more neural network-based extraction models ("we use two deep networks to decode the shape, albedo parameters into the 3D facial shape and albedo respectively," pg. 160, sec. 3.2.1, para. 2; "the reference UV space … the reference shape," pg. 161, col. 1, para. 5; the Fig. 4 caption describes how the shape data includes a z spatial dimension, i.e. depth data); generating deformed albedo data and deformed depth data by applying a target shape deformation value respectively to the albedo data and the depth data ("modifying one or more elements in the albedo or shape representation … manipulate the semantic attribute, such as growing beard, smiling," pg. 169, col. 1, para. 2); generating resultant shaded data by performing shading based on the deformed depth data and a target illumination value (resultant shaded data Cuv on pg. 161, Eq. (9) is based on shape data N and illumination data L-bHb; note that the shape data N would be the deformed shape data described above in order to render the modified expressions described on pg. 169, col. 1, para. 2); generating intermediate image data based on the resultant shaded data and the deformed albedo data (intermediate image/texture data Tuv on pg. 161, Eq. (9) is based on the resultant shaded data Cuv and the albedo data Auv; note that the albedo data Auv would be the deformed albedo data described above in order to render the modified expressions described on pg. 169, col. 1, para. 2); and generating reconstructed image data from the intermediate image data and the deformed depth data based on a target pose value (pg. 161, Eq. (10) shows calculation of the reconstructed image data Î(m,n) from the intermediate image/texture data Tuv and the [deformed] shape data Suv based on a target projection/pose m).
Regarding claim 2, Tran discloses determining the albedo data in the canonical space from the input image data using a neural network-based albedo extraction model; and determining the depth data in the canonical space from the input image data using a neural network-based depth extraction model ("we use two deep networks to decode the shape, albedo parameters into the 3D facial shape and albedo respectively," pg. 160, sec. 3.2.1, para. 2; "the reference UV space … the reference shape," pg. 161, col. 1, para. 5; the Fig. 4 caption describes how the shape data includes a z spatial dimension, i.e. depth data).
Regarding claim 3, Tran discloses wherein the albedo data in the canonical space corresponds to albedo data when the object is deformed into a canonical shape which is a reference, and the depth data in the canonical space corresponds to depth data when the object is deformed into the canonical shape ("the reference UV space … the reference shape used has the mouth open," pg. 161, col. 1, para. 5).
Regarding claim 4, Tran discloses performing a backward warping operation on each of the albedo data and the depth data based on the target shape deformation value ("provides semantic parameters allowing access to different components including 3D shape, albedo, lighting and projection matrix," pg. 164, sec. 4.1.3, para. 1; "Decomposing face image into individual components … edit the face by manipulating any component," pg. 168, sec. 4.4.3, para. 1; see pg. 169, sec. "Attribute Manipulation").
Regarding claim 5, Tran discloses extracting a surface normal element of the object from the deformed depth data; and generating the resultant shaded data by performing the shading based on the extracted surface normal element and the target illumination value ("the surface normal map," pg. 161, sec. 3.2.3, para. 1; see Eq. (9)).
Regarding claim 7, Tran discloses generating the reconstructed image data by deforming a pose of the object in each of the intermediate image data and the deformed depth data based on the target pose value, and combining the intermediate image data in which the pose of the object is deformed and depth data in which the pose of the object is deformed ("the 3D shape/mesh S is projected to the image plane via Eq. (4)," pg. 161, sec. 3.2.3, para. 1; see Eq. (4) and Eq. (10)).
Regarding claim 13, it is rejected using the same citations and rationales set forth in the rejection of claim 1, with the additional limitation of one or more processors ("computer graphics," pg. 157, sec. 1, para. 1).
Regarding claims 14, 15, and 16, they are rejected using the same citations and rationales set forth in the rejections of claims 2, 5, and 7 respectively.
Regarding claim 23, Tran discloses A processor-implemented method ("computer graphics," pg. 157, sec. 1, para. 1), the method comprising: decomposing an input image into an albedo component and a depth component using a trained neural network-based extraction model ("we use two deep networks to decode the shape, albedo parameters into the 3D facial shape and albedo respectively," pg. 160, sec. 3.2.1, para. 2; the Fig. 4 caption describes how the shape data includes a z spatial dimension, i.e. depth data); deforming the albedo component and the depth component based on a target shape deformation value corresponding to a local geometric change of an object of the input image ("modifying one or more elements in the albedo or shape representation … manipulate the semantic attribute, such as growing beard, smiling," pg. 169, col. 1, para. 2); shading the deformed depth component based on a target illumination value (resultant shaded data Cuv on pg. 161, Eq. (9) is based on shape data N and illumination data L-bHb; note that the shape data N would be the deformed shape data described above in order to render the modified expressions described on pg. 169, col. 1, para. 2); generating an intermediate image by combining the deformed albedo component and the shaded deformed depth component (intermediate image/texture data Tuv on pg. 161, Eq. (9) is based on the resultant shaded data Cuv and the albedo data Auv; note that the albedo data Auv would be the deformed albedo data described above in order to render the modified expressions described on pg. 169, col. 1, para. 2); and adjusting a pose of the intermediate image based on the deformed depth component and a target pose value (pg. 161, Eq. (10) shows calculation of the reconstructed image data Î(m,n) from the intermediate image/texture data Tuv and the [deformed] shape data Suv based on a target projection/pose m).
Regarding claim 24, Tran discloses performing a vector dot product operation between the deformed albedo component and the shaded deformed depth component (see pg. 161, Eq. (9)).
Regarding claim 26, Tran discloses wherein the target shape deformation value corresponds to a facial expression of the object ("modifying one or more elements in the albedo or shape representation … manipulate the semantic attribute, such as growing beard, smiling," pg. 169, col. 1, para. 2).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 6, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Tran.
Regarding claim 6, Tran discloses performing the shading using spherical harmonics ("spherical harmonics," pg. 161, sec. 3.2.3, para. 1; see Eq. (9)).
Tran does not describe second-order spherical harmonics.
The Examiner takes Official Notice that both the concepts and the advantages of using second-order spherical harmonics were well known and expected before the effective filing date of the claimed invention, and it would have been obvious to use second-order spherical harmonics in Tran in order to increase accuracy and lower engineering costs by employing known mathematical conventions.
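For illustration only, the shading referred to in the rejection of claim 6 can be sketched as follows. This is a minimal sketch, assuming that "second-order" denotes spherical harmonic bands 0 through 2 (nine coefficients) and assuming the commonly used real-valued basis constants. The function names `sh_basis` and `shade` and the coefficient ordering are hypothetical and are not taken from Tran or from the claims.

```python
def sh_basis(normal):
    """Return the 9 real spherical harmonic basis values (bands 0-2)
    evaluated at a unit surface normal (x, y, z).

    The ordering and constants follow a common convention; they are an
    assumption here, not a statement of Tran's exact formulation.
    """
    x, y, z = normal
    return [
        0.282095,                      # band 0 (constant term)
        0.488603 * y,                  # band 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,              # band 2
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]


def shade(normal, lighting):
    """Shading at one pixel: weighted sum of basis values, where
    `lighting` is a hypothetical list of 9 illumination coefficients."""
    if len(lighting) != 9:
        raise ValueError("expected 9 lighting coefficients for bands 0-2")
    return sum(l_b * h_b for l_b, h_b in zip(lighting, sh_basis(normal)))
```

Under these assumptions, the scalar returned by `shade` would correspond to the shaded value at a single pixel, which is then combined with the albedo at that pixel.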
Regarding claim 12, Tran does not specifically recite A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, configure the processor to perform the method of claim 1.
The Examiner takes Official Notice that both the concepts and the advantages of using A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, configure the processor to perform the method were well known and expected before the effective filing date of the claimed invention, and it would have been obvious to use such a computer-readable medium in order to allow software to be stored in a non-volatile manner.
Regarding claim 18, Tran does not specifically recite An electronic apparatus comprising the apparatus of claim 13 and a display.
The Examiner takes Official Notice that both the concepts and the advantages of using An electronic apparatus comprising … a display were well known and expected before the effective filing date of the claimed invention, and it would have been obvious to use such an apparatus in order to allow a user to visualize the results of the reconstructed face images.

Claims 8-11, 17, 19, 20, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Tran in view of Volkov et al. (US 2022/0392133; hereinafter "Volkov").
Regarding claim 8, Tran discloses wherein … the target illumination value … are values extracted from another input image data that is different from the input image data ("replacing the lighting of a target face image using lighting from a source face," pg. 169, col. 1, para. 1), and an object in the other input image data is the same as the object in the input image data (any source and target face images can be used, including face images having the same or different faces).
Tran does not disclose wherein the target shape deformation value, and the target pose value are values extracted from another input image data that is different from the input image data.
In the same art of facial reconstruction and facial editing, Volkov teaches the target shape deformation value, and the target pose value are values extracted from another input image data that is different from the input image data ("generating an output frame of an output video that includes a modified image of the target face and the target head adopting the pose of the head and the facial expression of the source actor," abstract), and an object in the other input image data is the same as the object in the input image data (any source and target face images can be used, including face images having the same or different faces).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Volkov to Tran.  The motivation would have been "for realistic head turns and face animation synthesis" (Volkov, para. 21).
Regarding claim 9, Tran discloses wherein … the target illumination value … are values extracted from another image data ("replacing the lighting of a target face image using lighting from a source face," pg. 169, col. 1, para. 1) including an object that is different from the object in the input image data (any source and target face images can be used, including face images having the same or different faces).
Tran does not disclose wherein the target shape deformation value, and the target pose value are values extracted from another image data.
In the same art of facial reconstruction and facial editing, Volkov teaches the target shape deformation value, and the target pose value are values extracted from another image data ("generating an output frame of an output video that includes a modified image of the target face and the target head adopting the pose of the head and the facial expression of the source actor," abstract) including an object that is different from the object in the input image data (any source and target face images can be used, including face images having the same or different faces).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Volkov to Tran.  The motivation would have been "for realistic head turns and face animation synthesis" (Volkov, para. 21).
Regarding claim 10, Tran discloses wherein … the target illumination value … are values extracted from another input image data ("replacing the lighting of a target face image using lighting from a source face," pg. 169, col. 1, para. 1) using a neural network-based extraction model (e.g. neural network model "E" of Fig. 2 is used for lighting extraction) other than the one or more neural network-based extraction models (e.g. neural network models "Ds" or "DA" are used for extraction of shape and albedo data as recited in claim 1).
Tran does not disclose wherein the target shape deformation value, and the target pose value are values extracted from another image data.
In the same art of facial reconstruction and facial editing, Volkov teaches the target shape deformation value, and the target pose value are values extracted from another image data ("generating an output frame of an output video that includes a modified image of the target face and the target head adopting the pose of the head and the facial expression of the source actor," abstract).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Volkov to Tran.  The motivation would have been "for realistic head turns and face animation synthesis" (Volkov, para. 21).
Regarding claim 11, Tran discloses wherein … the target illumination value … are values extracted from another input image data ("replacing the lighting of a target face image using lighting from a source face," pg. 169, col. 1, para. 1), and the one or more extraction models are trained by updating parameters of the one or more extraction models based on the reconstructed image data and the other image data ("The entire network is end-to-end trained to reconstruct the input images, with the loss function," pg. 162, sec. 3.2.5, para. 1; see also pg. 163, sec. "Intermediate Semi-Supervised Training").
Tran does not disclose wherein the target shape deformation value, and the target pose value are values extracted from another image data.
In the same art of facial reconstruction and facial editing, Volkov teaches the target shape deformation value, and the target pose value are values extracted from another image data ("generating an output frame of an output video that includes a modified image of the target face and the target head adopting the pose of the head and the facial expression of the source actor," abstract).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Volkov to Tran.  The motivation would have been "for realistic head turns and face animation synthesis" (Volkov, para. 21).
Regarding claim 17, it is rejected using the same citations and rationales set forth in the rejection of claim 10.
Regarding claim 19, Tran discloses A processor-implemented method ("computer graphics," pg. 157, sec. 1, para. 1), the method comprising: determining albedo data in a canonical space and depth data in the canonical space ("we use two deep networks to decode the shape, albedo parameters into the 3D facial shape and albedo respectively," pg. 160, sec. 3.2.1, para. 2; "the reference UV space … the reference shape," pg. 161, col. 1, para. 5; the Fig. 4 caption describes how the shape data includes a z spatial dimension, i.e. depth data) based on first training image data using a neural network-based first extraction model ("given a set of K 2D face images … learn an encoder," pg. 160, col. 2, para. 2); extracting an illumination value from second training image data ("estimating the lighting parameters Lsource of the source image," pg. 169, col. 1, para. 1); generating deformed albedo data and deformed depth data by applying [a] shape deformation value respectively to the albedo data and the depth data ("modifying one or more elements in the albedo or shape representation … manipulate the semantic attribute, such as growing beard, smiling," pg. 169, col. 1, para. 2); generating resultant shaded data by performing shading based on the deformed depth data and [an] illumination value (resultant shaded data Cuv on pg. 161, Eq. (9) is based on shape data N and illumination data L-bHb; note that the shape data N would be the deformed shape data described above in order to render the modified expressions described on pg. 169, col. 1, para. 2); generating intermediate image data based on the resultant shaded data and the deformed albedo data (intermediate image/texture data Tuv on pg. 161, Eq. (9) is based on the resultant shaded data Cuv and the albedo data Auv; note that the albedo data Auv would be the deformed albedo data described above in order to render the modified expressions described on pg. 169, col. 1, para. 2); generating reconstructed image data from the intermediate image data and the deformed depth data based on [a] pose value (pg. 161, Eq. (10) shows calculation of the reconstructed image data Î(m,n) from the intermediate image/texture data Tuv and the [deformed] shape data Suv based on a target projection/pose m); and training the one or more extraction models by updating parameters of the one or more extraction models based on the reconstructed image data and the second training image data ("We jointly learn the model and the model fitting algorithm via weak supervision, by leveraging a large collection of 2D images," pg. 158, col. 2, bullet #3; "The entire network is end-to-end trained to reconstruct the input images, with the loss function," pg. 162, sec. 3.2.5, para. 1).
Tran does not disclose extracting a shape deformation value and a pose value from second data.
In the same art of facial reconstruction and facial editing, Volkov teaches extracting a shape deformation value and a pose value from second data ("generating an output frame of an output video that includes a modified image of the target face and the target head adopting the pose of the head and the facial expression of the source actor," abstract).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Volkov to Tran.  The motivation would have been "for realistic head turns and face animation synthesis" (Volkov, para. 21).
Regarding claim 20, the combination of Tran and Volkov renders obvious iteratively correcting the parameters of the first extraction model such that a difference between the reconstructed image data and the second training image data is reduced ("we introduce intermediate loss functions to guide the training in the early iterations," Tran, pg. 163, col. 2, para. 3).
Regarding claim 22, the combination of Tran and Volkov renders obvious using the trained one or more extraction models to generate reconstructed image data from input image data including an object (e.g. Tran, pg. 169, col. 1, paras. 1-2).

Claim 25 is rejected under 35 U.S.C. 103 as being unpatentable over Tran in view of Thies et al. ("Real-time Expression Transfer for Facial Reenactment"; hereinafter "Thies").
Regarding claim 25, Tran discloses determining a surface normal of the deformed depth component; and applying an illumination element to the surface normal (see pg. 161, Eq. (9)).
Tran does not specifically recite determining a surface normal through pixel-wise regression of local neighboring pixels of the deformed depth component.
In the same art of facial reconstruction, Thies teaches determining a surface normal through pixel-wise regression of local neighboring pixels of the deformed depth component ("the measured input color sequence CI and depth sequence XI … are aligned in image space and can be indexed by the same pixel coordinates; i.e., the color and back-projected 3D position in an integer pixel location p … a normal field NI … is obtained as the cross product of the partial derivatives of XI with respect to the continuous image coordinates," pg. 6, col. 1, para. 1).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Thies to Tran.  The motivation would have been "to reconstruct high-quality facial performance" (Thies, pg. 2, col. 1, para. 1).
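For illustration only, the normal computation quoted from Thies (a normal field obtained as the cross product of the partial derivatives of the depth data with respect to image coordinates) can be sketched as follows. This is a minimal sketch, assuming a depth map stored as a list of rows, an orthographic back-projection in which pixel (x, y) maps to the 3D point (x, y, depth), and central differences over neighboring pixels. The function name `normals_from_depth` and these choices are hypothetical and are not taken from Thies, Tran, or the claims.

```python
import math


def normals_from_depth(depth):
    """Estimate a unit surface normal for each interior pixel of a depth map.

    Border pixels are left as None because central differences need a
    neighbor on both sides.
    """
    h, w = len(depth), len(depth[0])
    normals = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Partial derivatives of depth, approximated from the
            # left/right and upper/lower neighboring pixels.
            dzdx = (depth[y][x + 1] - depth[y][x - 1]) / 2.0
            dzdy = (depth[y + 1][x] - depth[y - 1][x]) / 2.0
            # Tangent vectors are (1, 0, dzdx) and (0, 1, dzdy);
            # their cross product is (-dzdx, -dzdy, 1).
            nx, ny, nz = -dzdx, -dzdy, 1.0
            length = math.sqrt(nx * nx + ny * ny + nz * nz)
            normals[y][x] = (nx / length, ny / length, nz / length)
    return normals
```

Under these assumptions, a flat depth map yields normals pointing along the z axis, and a depth map that rises along x tilts the normals toward negative x.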

Allowable Subject Matter
Claim 21 is objected to as being dependent upon a rejected base claim, but would be allowable if the rejections under 35 U.S.C. 112 are overcome and if it is rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  The combination of Tran and Volkov renders obvious extracting a shape deformation value, an illumination value, and a pose value from second image data to apply to depth and albedo values of first image data.  However, claim 21 requires a specific architecture having a first neural network trained to extract depth and albedo data and a second neural network trained to extract shape deformation, illumination, and pose data, and the claim further requires a specific training technique involving the two separate networks that are not taught or rendered obvious by the known prior art.  Note that such an architecture and training technique are allowable in the context of the interconnected limitations of the parent claim, but not necessarily in a broader context.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ryan McCulley whose telephone number is (571) 270-3754. The examiner can normally be reached Monday through Friday, 8:00am - 4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571) 272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RYAN MCCULLEY/
Primary Examiner, Art Unit 2611