DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 7, 10-11, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (U.S. PG-PUB NO. 20160202756 A1) in view of Dai (CN 105760809 A).
-Regarding claim 1, Wu discloses an artificial intelligence (AI) apparatus comprising (Abstract; FIGS. 1-18): a two-dimensional (2D) image sensor configured to capture a 2D image of a head of a person (FIG. 1, sensor 102; FIG. 3, sensor 304; FIG. 10, steps 1004, 1010; [0023], “gaze tracking … (2D) image sensors”; [0028]; [0029], “a visible light image sensor”; FIG. 2 head pose 206; [0049], “2D image”; [0072]; [0074]); a three-dimensional (3D) image sensor configured to capture 3D image information of the head of the person (FIG. 1, sensor 102; FIG. 3, sensor 304, [0035]; FIG. 10, steps 1006, 1010; [0028]; [0029], “depth image sensor”; FIG. 2 head pose 206, [0030]; [0037]-[0038], “3D head pose”; [0049], “any suitable manner”, “3D image”; [0074]), wherein the 3D image information corresponds to at least one of 3D ([0033], “head coordinates”, “head center”; FIG. 3; [0036], “depth image”; [0037], “estimated from depth data”; [0038], “3D head pose”), or 3D rotation information ([0033], “head rotation”; [0036]; [0038]); and a processor configured to (FIG. 18, 1802): associate the captured 2D image with the captured 3D image information corresponding to the captured 2D image ([0036], “face alignment … provide 2D coordinates … further converted to 3D coordinates if depth image data is also available”; [0039]; [0072], “determining 3D positions of the facial features from the 2D positions”; FIG. 7); select 3D head pose information for determining a rotation direction of the head from the captured 3D image information (FIGS. 2-3; [0030], “pose 206”, “head rotation matrix”, “gaze direction”; [0033]; [0036]; [0038], “3D head pose”, “rotation matrix”; [0039], “depth imaging may be utilized”); select a 2D image associated with the selected 3D head pose information ([0033], “3D head pose … may be estimated from a 2D visible spectrum image”; [0036]; [0039]; FIG. 7); determine 3D relative coordinates as a reference for correcting the selected 3D head pose information based on 2D coordinates with respect to a predetermined landmark point of the selected 2D image ([0026], “3D location … face model”; [0028], “calibrated”; [0033]; [0036], “converted to 3D coordinates if depth image data is also available … calibrated”; [0039], “2D face landmark points … to estimate the corresponding 3D positions. … 3D positions may be iteratively estimated”; FIGS. 5B-5C; FIG. 8), wherein the 3D relative coordinates correspond to coordinates for determining a correlation between the captured 3D image information and a corrected 3D head pose information ([0026], “determine three-dimensional (3D) locations of facial landmarks captured from image data”, “may be obtained via a stereo camera, and/or via 3D generic face models”; [0037], “corresponding 3D coordinates may be estimated from depth data”; [0038]-[0039]; FIGS. 5B-5C; FIG. 8); correct the captured 3D image information based on the determined 3D relative coordinates ([0037], “For tracking head pose, a person-specific 3D face model may be calibrated for each person”; FIGS. 5B-5C; FIG. 8; [0038]-[0039]; FIG. 10, [0050], “adjustment”; [0072]-[0074]); and determine the corrected captured 3D image information based on a particular predetermined landmark point of each 2D image ([0026]-[0028]; [0036], “facial landmark … landmark points … estimated from 2D RGB”; [0037]-[0038]; [0039], “estimate the person's head pose”, “iteratively minimizing the error between the predicted projection of a known 3D model and 2D landmarks tracked”; FIGS. 5A-5C; FIG. 8; FIG. 10; [0072]-[0074]).
Wu teaches determining 3D relative coordinates based on a depth image. However, Wu is silent as to determining 3D relative coordinates as a reference for the selected 3D head pose information.
In the same field of endeavor, Dai teaches determining 3D relative coordinates as a reference for the selected 3D head pose information (Dai: Abstract, “head model”; FIGS. 1-5; Page 4, 2nd paragraph, “head pose may be considered … as a reference, and then fitting with the target cloud”; Page 4, 5th paragraph, “acquisition of a point cloud … as a reference”; Page 4, 13th paragraph, “determine a three-dimensional coordinate of a feature point … three-dimensional image”).
Dai further teaches comparing for estimating a head pose using a two-dimensional image (Dai: Page 5, 9th paragraph, “estimating a head pose using a two-dimensional image”; Page 4, 14th paragraph, “two-dimensional coordinates … combined … three-dimensional coordinates … calculated”, “aligned”). Dai teaches wherein the 3D relative coordinates correspond to coordinates for determining a correlation between the captured 3D image information and a corrected 3D head pose information (Dai: Page 6, 4th paragraph, “three-dimensional coordinates … three dimensional image”; FIGS. 1-5) and correcting the captured 3D image information based on the determined 3D relative coordinates (Dai: Abstract; FIGS. 1-5).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Wu with the teaching of Dai by using three-dimensional relative coordinates as a reference in order to improve the performance of head pose estimation.
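As an illustrative sketch only, the conversion cited from Wu at [0036], in which 2D landmark coordinates are “further converted to 3D coordinates if depth image data is also available”, might be expressed as follows. The pinhole camera model, the intrinsic parameters fx, fy, cx, and cy, and both function names are assumptions introduced for illustration; they are not taken from Wu, Dai, or the application.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]


def backproject_landmark(u: float, v: float, depth_m: float,
                         fx: float, fy: float,
                         cx: float, cy: float) -> Point3D:
    """Convert a 2D landmark pixel (u, v) plus its depth into camera-frame 3D.

    Assumes an ideal pinhole camera (hypothetical intrinsics fx, fy, cx, cy).
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def relative_to_reference(point: Point3D, reference: Point3D) -> Point3D:
    """Express a 3D landmark relative to a chosen reference point."""
    return tuple(p - r for p, r in zip(point, reference))


# Usage: a landmark 100 px to the right of the principal point, 2 m away,
# expressed relative to a reference point 1 m in front of the camera.
landmark = backproject_landmark(420, 240, 2.0, 500, 500, 320, 240)
offset = relative_to_reference(landmark, (0.0, 0.0, 1.0))
```

Under these assumptions, coordinates expressed relative to a reference point are one possible reading of “3D relative coordinates”; the references may define the term differently.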
-Regarding claim 2, the combination further discloses wherein the processor is further configured (FIG. 18, step 1802) to detect a landmark point of the head from the captured 2D image (FIG. 10, 1008, [0049]), wherein the landmark point corresponds to a feature point for recognizing a head position and a head direction ([0039], “2D face landmark points … may be used to estimate the corresponding 3D positions”, “to estimate the person's head pose”; [0046], FIG.8, step 818; FIG. 10, steps 1010-1012, 1030;  [0049]-[0050]; [0032], “head coordinate … including eyeball center …”; [0033]; [0036]).
-Regarding claim 7, the combination further discloses wherein the 3D relative coordinates are determined based at least in part by selecting a plurality of 2D images associated with the selected 3D head pose information and on 2D coordinate information with respect to a specific predetermined landmark point of each of the plurality of 2D images ([0026]; [0036], “converted to 3D coordinates if depth image data is also available”; [0039], “2D face landmark points … to estimate the corresponding 3D positions. … 3D positions may be iteratively estimated”; FIGS. 5B-5C; FIG. 8).
-Regarding claim 10, Wu discloses a method of estimating a head pose, the method comprising (Abstract; FIGS. 1-18): capturing, by a two-dimensional (2D) image sensor, a 2D image of a head of a person (FIG. 1, sensor 102; FIG. 3, sensor 304; FIG. 10, steps 1004, 1010; [0023], “gaze tracking … (2D) image sensors”; [0028]; [0029], “a visible light image sensor”; FIG. 2 head pose 206; [0049], “2D image”; [0072]; [0074]); capturing, by a three-dimensional (3D) image sensor, 3D image information of the head (FIG. 1, sensor 102; FIG. 3, sensor 304, [0035]; FIG. 10, steps 1006, 1010; [0028]; [0029], “depth image sensor”; FIG. 2 head pose 206, [0030]; [0037]-[0038], “3D head pose”; [0049], “any suitable manner”, “3D image”; [0074]); associating the captured 2D image with the captured 3D image information corresponding to the captured 2D image ([0036], “face alignment … provide 2D coordinates … further converted to 3D coordinates if depth image data is also available”; [0039]; [0072], “determining 3D positions of the facial features from the 2D positions”; FIG. 7); selecting 3D head pose information for determining a rotation direction of the head from the captured 3D image information ([0030], “pose 206”; [0033], “gaze direction”; [0036]; [0037], “estimated from depth data”; [0038], “3D head pose”, “rotation matrix”; [0039], “depth imaging may be utilized”); selecting a 2D image associated with the selected 3D head pose information ([0033], “3D head pose … may be estimated from a 2D visible spectrum image”; [0036]; [0039]; FIG. 7); determining 3D relative coordinates as a reference for correcting the selected 3D head pose information based on 2D coordinates with respect to a predetermined landmark point of the selected 2D image ([0026], “3D location … face model”; [0028], “calibrated”; [0033]; [0036], “converted to 3D coordinates if depth image data is also available … calibrated”; [0039], “2D face landmark points … to estimate the corresponding 3D positions. … 3D positions may be iteratively estimated”; FIGS. 5B-5C; FIG. 8); and correcting the captured 3D image information based on the determined 3D relative coordinates ([0037], “For tracking head pose, a person-specific 3D face model may be calibrated for each person”; FIGS. 5B-5C; FIG. 8; [0038]-[0039]; FIG. 10, [0050], “adjustment”; [0072]-[0074]); and determining the corrected captured 3D image information based on a particular predetermined landmark point of each 2D image ([0026]-[0028]; [0036], “facial landmark … landmark points … estimated from 2D RGB”; [0037]-[0038]; [0039], “estimate the person's head pose”, “iteratively minimizing the error between the predicted projection of a known 3D model and 2D landmarks tracked”; FIGS. 5A-5C; FIG. 8; FIG. 10; [0072]-[0074]).
Wu teaches determining 3D relative coordinates based on a depth image. However, Wu is silent as to determining 3D relative coordinates as a reference for the selected 3D head pose information.
In the same field of endeavor, Dai teaches determining 3D relative coordinates as a reference for the selected 3D head pose information (Dai: Abstract; FIG. 1; Page 4, 2nd paragraph, “head pose may be considered … as a reference, and then fitting with the target cloud”; Page 4, 5th paragraph, “acquisition of a point cloud … as a reference”; Page 4, 13th paragraph, “determine a three-dimensional coordinate of a feature point … three-dimensional image”).
Dai further teaches comparing for estimating a head pose using a two-dimensional image (Dai: Page 5, 9th paragraph, “estimating a head pose using a two-dimensional image”; Page 4, 14th paragraph, “two-dimensional coordinates … combined … three-dimensional coordinates … calculated”, “aligned”). Dai teaches wherein the 3D relative coordinates correspond to coordinates for determining a correlation between the captured 3D image information and a corrected 3D head pose information (Dai: Page 6, 4th paragraph, “three-dimensional coordinates … three dimensional image”; FIGS. 1-5) and correcting the captured 3D image information based on the determined 3D relative coordinates (Dai: Abstract; FIGS. 1-5).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Wu with the teaching of Dai by using three-dimensional relative coordinates as a reference in order to improve the performance of head pose estimation.
-Regarding claim 11, the combination further discloses detecting a landmark point of the head from the captured 2D image (FIG. 10, 1008, [0049]), wherein the landmark point corresponds to a feature point for recognizing a head position and a head direction ([0039], “2D face landmark points … may be used to estimate the corresponding 3D positions”, “to estimate the person's head pose”; [0046], FIG. 8, step 818; FIG. 10, steps 1010-1012, 1030; [0049]-[0050]; [0032], “head coordinate … including eyeball center …”; [0033]; [0036]).
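-Regarding claim 16, the combination further discloses the limitations corresponding to those addressed for claim 7 ([0026]; [0036], “converted to 3D coordinates if depth image data is also available”; [0039], “2D face landmark points … to estimate the corresponding 3D positions. … 3D positions may be iteratively estimated”; FIGS. 5B-5C; FIG. 8).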
Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (U.S. PG-PUB NO. 20160202756 A1) in view of Dai (CN 105760809 A), and further in view of Lodato et al. (U.S. PATENT NO. 10417810 B2).
-Regarding claims 5 and 14, Wu in view of Dai discloses the apparatus of claim 1 and the method of claim 10, respectively.
Wu in view of Dai is silent as to wherein the processor is further configured to: synchronize a 2D timestamp of the 2D image sensor with a 3D timestamp of the 3D image sensor by storing a timestamp when a photography start signal is input as metadata of the 2D image, and associating the 3D image information having a same timestamp as a stored timestamp included in metadata of the 2D image with the 2D image.
In the same field of endeavor, Lodato teaches synchronizing a 2D timestamp of the 2D image sensor with a 3D timestamp of the 3D image sensor by storing a timestamp when a photography start signal is input as metadata of the 2D image (Lodato: Col. 6, lines 10-22, “depth data … synchronized with the 2D color data … with a common instance in time (… timestamp …) … with corresponding metadata”; FIG. 1), and associating the 3D image information having a same timestamp as a stored timestamp included in metadata of the 2D image with the 2D image (Lodato: Abstract, “receiving metadata”; FIG. 1, Metadata 108; FIG. 19; Col. 2, lines 60-62; Col. 4, lines 6-8, lines 51-55; Col. 6, lines 10-22).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Wu in view of Dai with the teaching of Lodato by synchronizing a 2D timestamp of the 2D image sensor with a 3D timestamp of the 3D image sensor by storing a timestamp when a photography start signal is input as metadata of the 2D image, and associating the 3D image information having a same timestamp as a stored timestamp included in metadata of the 2D image with the 2D image, in order to provide a form of media content for paired 2D and 3D images that allows for comparison and interoperability and aids head pose estimation.
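As an illustrative sketch only, the timestamp-based association recited in claims 5 and 14 might look like the following. The class names Frame2D and Info3D, the metadata key "timestamp", and the integer timestamp unit are hypothetical and are not drawn from Lodato or the application; exact-equality matching is used because the claim language recites “a same timestamp”.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Frame2D:
    image_id: str
    metadata: Dict[str, int]  # hypothetical: {"timestamp": <capture-start time>}


@dataclass
class Info3D:
    timestamp: int
    head_pose: Tuple[float, float, float]  # hypothetical (roll, pitch, yaw)


def associate_by_timestamp(frames: List[Frame2D],
                           infos: List[Info3D]) -> List[Tuple[Frame2D, Info3D]]:
    """Pair each 2D frame with the 3D information carrying the same timestamp."""
    by_time = {info.timestamp: info for info in infos}
    pairs = []
    for frame in frames:
        match = by_time.get(frame.metadata.get("timestamp"))
        if match is not None:  # frames without a matching 3D record are skipped
            pairs.append((frame, match))
    return pairs


# Usage: only the frame stamped 200 has 3D information with the same timestamp.
frames = [Frame2D("a", {"timestamp": 100}), Frame2D("b", {"timestamp": 200})]
infos = [Info3D(200, (0.0, 1.0, 2.0)), Info3D(300, (3.0, 4.0, 5.0))]
pairs = associate_by_timestamp(frames, infos)
```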
Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (U.S. PG-PUB NO. 20160202756 A1) in view of Dai (CN 105760809 A), and further in view of Chen et al. (CN 107341827 A).
-Regarding claims 6 and 15, Wu in view of Dai discloses the apparatus of claim 1 and the method of claim 10, respectively.
Wu in view of Dai is silent as to wherein the 3D image information includes head rotation information including at least a roll value, a pitch value, or a yaw value of the head, and wherein the 3D head pose information for determining a rotation direction of the head from the captured 3D image information is selected based at least in part on comparing at least one of the roll value, the pitch value, or the yaw value of the captured 3D image information with a head direction reference value.
In the same field of endeavor, Chen teaches wherein the 3D image information includes head rotation information including at least a roll value, a pitch value, or a yaw value of the head (Chen: Abstract, “Euler angle”; Figures 2a-2c; Page 4, 4th paragraph, “Euler angle of the head posture”, “3D image”; Page 7, last paragraph, “pitch”), and wherein the 3D head pose information for determining a rotation direction of the head from the captured 3D image information is selected based at least in part on comparing at least one of the roll value, the pitch value, or the yaw value of the captured 3D image information with a head direction reference value (Chen: Figure 1a; Figures 1c-1d; Figures 2a-2c; Figure 3a; Page 5, 4th paragraph, “determine position … rotating rigid body … person’s head”; Page 5, 6th paragraph, “rotated … according to Euler angle”; Page 7, 13th paragraph, “rotating of the head”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Wu in view of Dai with the teaching of Chen by obtaining 3D image information that includes head rotation information including at least a roll value, a pitch value, or a yaw value of the head, in order to easily compare the two-dimensional dynamic effect and to improve the fusion result and the quality of head pose estimation.
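As an illustrative sketch only, selecting 3D head pose information by comparing a roll, pitch, or yaw value with a head direction reference value, as recited in claims 6 and 15, might be written as follows. The dictionary layout, the function names, and the 5-degree tolerance are assumptions for illustration; neither Chen nor the application is cited here as specifying a tolerance or a comparison rule.

```python
from typing import Dict, List


def angular_diff_deg(a: float, b: float) -> float:
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def select_head_poses(samples: List[Dict[str, float]],
                      reference: Dict[str, float],
                      tolerance_deg: float = 5.0) -> List[Dict[str, float]]:
    """Keep samples whose angles lie within a tolerance of the reference.

    Only the axes present in `reference` are compared, so a caller may
    compare any one of roll, pitch, or yaw, or any combination of them.
    """
    selected = []
    for sample in samples:
        if all(abs(angular_diff_deg(sample[axis], ref)) <= tolerance_deg
               for axis, ref in reference.items()):
            selected.append(sample)
    return selected


# Usage: keep only samples whose yaw is within 5 degrees of a 30-degree reference.
samples = [
    {"roll": 0.0, "pitch": 2.0, "yaw": 29.0},
    {"roll": 0.0, "pitch": 0.0, "yaw": 50.0},
]
front_left = select_head_poses(samples, {"yaw": 30.0})
```

Wrapping the difference avoids treating angles near +180 and -180 degrees as far apart, which is a design choice of this sketch rather than a teaching of the cited art.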
Claims 8-9 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (U.S. PG-PUB NO. 20160202756 A1) in view of Dai (CN 105760809 A), and further in view of Gausebeck et al. (U.S. PG-PUB NO. 20190026958 A1).
-Regarding claims 8 and 17, Wu in view of Dai discloses the apparatus of claim 1 and the method of claim 10, respectively.
Wu in view of Dai discloses generating learning data by labeling each 2D image with the corrected captured 3D image information with respect to the predetermined landmark point of the selected 2D image (Wu: [0024], “machine-learning”, “labeled ground truth”; [0026]-[0028]; [0036], “facial landmark”; [0037]-[0038]; [0039], “estimate the person's head pose”, “iteratively minimizing the error between the predicted projection of a known 3D model and 2D landmarks tracked”; FIGS. 5A-5C; FIG. 8; FIG. 10; [0072]), and training a 3D head pose reasoning model for inferring a particular 3D head pose information (Wu: [0038], “person's head pose may be measured relative to the reference model”; [0054], “training”) from a predefined 2D image using the generated learning data, wherein the 3D head pose reasoning model corresponds to a neural network model (Wu: Abstract; FIGS. 2, 5A-5C, 8, 10; [0026], “Face-model-based approaches may determine three-dimensional (3D) locations of facial landmarks captured from image data”; [0039], “estimate the person's head pose, … by iteratively minimizing the error between the predicted projection of a known 3D model and 2D landmarks tracked”).
Wu in view of Dai is silent as to expressly labeling each 2D image and as to wherein the 3D head pose reasoning model corresponds to a neural network model.
In the same field of endeavor, Gausebeck teaches labeling each 2D image and wherein the 3D head pose reasoning model corresponds to a neural network model (Gausebeck: Abstract; FIGS. 1, 5, 8-12; [0030]-[0031], “neural network”; [0038], “labels … 2D image”; FIG. 9 component 928).
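As an illustrative sketch only, generating learning data by labeling each 2D image with corrected 3D information, as recited in claims 8 and 17, might be organized as follows. The image identifiers, the three-value pose tuple, and the function name are hypothetical; the sketch covers only the labeling step and does not represent the training of any neural network described in Gausebeck.

```python
from typing import Dict, List, Tuple

Pose = Tuple[float, float, float]  # hypothetical corrected (roll, pitch, yaw)


def build_learning_data(image_ids: List[str],
                        corrected_pose: Dict[str, Pose]) -> List[Tuple[str, Pose]]:
    """Label each 2D image with its corrected 3D pose.

    Images without a corrected pose are skipped rather than given a guess.
    """
    return [(image_id, corrected_pose[image_id])
            for image_id in image_ids
            if image_id in corrected_pose]


# Usage: image "b" has no corrected pose, so it is left out of the learning data.
learning_data = build_learning_data(
    ["a", "b", "c"],
    {"a": (0.0, 0.0, 0.0), "c": (1.0, 2.0, 3.0)},
)
```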

-Regarding claims 9 and 18, Wu in view of Dai, and further in view of Gausebeck, discloses the apparatus of claim 8 and the method of claim 17, respectively.
The modification further discloses wherein the processor is further configured to (Wu: FIG. 18, 1802): input a 2D head image of the head of a user to the trained 3D head pose information reasoning model for determining the particular 3D head pose information of the inputted 2D head image (Wu: FIG. 10, steps 1004-1012; [0026]-[0028]; [0036], “facial landmark”; [0037]-[0038]; [0039], “estimate the person's head pose”, “iteratively minimizing the error between the predicted projection of a known 3D model and 2D landmarks tracked”; FIGS. 5A-5C; FIG. 8), and determine a head position and a head direction of the user based on the determined particular 3D head pose information of the inputted 2D head image (Wu: [0033], “3D head pose … may be estimated from a 2D visible spectrum image”; [0036]; [0039], “2D face landmark points … to estimate the corresponding 3D positions. … 3D positions may be iteratively estimated”; FIGS. 5B-5C; FIG. 8; FIG. 10; [0072]).
Allowable Subject Matter
Claims 3-4 and 12-13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIAO LIU whose telephone number is (571)272-4539.  The examiner can normally be reached on Monday-Thursday and Alternate Fridays 8:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached at (571) 272-7882.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/XIAO LIU/Examiner, Art Unit 2664

/PING Y HSIEH/Primary Examiner, Art Unit 2664