DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7, 10, 12-19, and 26-30 are rejected under 35 U.S.C. 103 as being unpatentable over Khamis et al., US 2021/0366146 A1, in view of Akbas et al., US 10,853,970 B1.
Regarding claim 1, Khamis teaches an apparatus (Fig. 1A; pose estimation system 100), comprising: a detector to identify a first subject in a first image captured by a first image capture device based on a first set of two-dimensional kinematic keypoints in the first image, the first set of two-dimensional kinematic keypoints corresponding to one or more joints of the first subject, the first image capture device associated with a first view of the first subject (par. 0030; The pose estimation system 100 receives an image 102 (e.g., a single image) and predicts a pose 138 of a human depicted in the image 102, where the kinematic structure 128 is leveraged (at a relatively coarse level) to propagate convolutional feature updates between keypoints or body parts, and The keypoints 140 relate to different body parts (or joints) of the human body such as ankles, hips, head, shoulders, elbows, etc.); a multi-view associator to verify the first subject using the first image and a second image captured by a second image capture device, the second image capture device associated with a second view of the first subject, the second view different than the first view (par. 0094; The camera 794 may capture image data having a human. The pose estimation system 700 may predict (i.e., by verifying) the pose of the human in the image data and send the predicted pose to the AR server. The AR server may use those poses (i.e., first and second views of the human) for online or offline processing of the captured images (i.e., first image and second image) to add visual effects or improve the human tracking.); and a keypoint generator to generate three-dimensional keypoints for the first subject using the first set of two-dimensional kinematic keypoints and a second set of keypoints in the second image (Fig. 1B, par. 0047; With the sets of features 122 at a relatively coarse level, the kinematic feature updater 124 uses a series of convolutional blocks 126 to update the sets of features 122 to generate updated sets of features 130 based on the kinematic structure 128.).
Khamis fails to teach the following recited limitation. However, Akbas teaches a biomechanics analyzer to determine a performance metric for the first subject using the three-dimensional keypoints (col. 7 lines 1-18; structural errors in pose might be more important than the localization error measured by the traditional evaluation metrics such as MPJPE (mean per joint position error) and PCK (percentage of correct keypoints), and the Pose Structure Score (PSS) is a new metric used to determine the correctness of pose estimation models.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the teachings of Khamis with the teachings of Akbas in order to predict three-dimensional human poses from a single image (Akbas, col. 3 lines 4-5).

Regarding claims 2, 16, 17 and 27, Khamis and Akbas teach all the limitations in claims 1, 15 and 26.  Khamis further teaches a bounding box generator to generate a first bounding box for the first subject in the first image and a second bounding box for a second subject in the second image, the multi-view associator to identify the first subject in the first image using the first bounding box and in the second image using the second bounding box (par. 0039; the heatmap predictor 132 generates one or more heatmaps 136, where a heatmap 136 predicts the probability of a keypoint 140 occurring at each pixel in the image 102. In some examples, the heatmap 136 generated by the heatmap predictor 132 includes all the keypoints 140 of the pose 138.).

Regarding claims 3, 18, and 28, Khamis and Akbas teach all the limitations in claims 2, 17 and 27.  Khamis further teaches a tracker to assign a first subject identifier to the first bounding box and a second subject identifier to the second bounding box, the multi-view associator to associate the first subject identifier and the second subject identifier with the first subject (par. 0093; The client AR application 780 includes a motion tracker 782 configured to permit the computing device 790 to detect and track its position relative to the physical space, an environment detector 784 configured to permit the computing device 790 to detect the size and location of different types of surfaces (e.g., horizontal, vertical, angled).).

Regarding claim 4, Khamis and Akbas teach all the limitations in claim 1.  Khamis further teaches an image augmenter to increase a resolution of at least one of the first image or the second image (par. 0057; an upsampler 260 that uses one or more convolutional blocks 262 to increase the resolution 216 (at a relatively low level) of the updated sets of features 230 to a resolution 264.).

Regarding claims 5, 19, and 29, Khamis and Akbas teach all the limitations in claims 3, 18 and 28.  Khamis further teaches wherein the multi-view associator is to execute a neural network model to associate the first subject identifier and the second subject identifier with the first subject (par. 0054; the pose estimation system 100 may include a convolutional neural network trainer 152 that trains the pose estimation system 100 to obtain CNN parameters 162, which includes the kinematic structure 128. For example, the convolutional neural network trainer 152 may apply training data 154 to the pose estimation system 100, which predicts the heatmaps 136.).

Regarding claim 7, Khamis and Akbas teach all the limitations in claim 1. Khamis further teaches wherein the first image and the second image each include a second subject, the detector to identify the second subject based on a third set of two-dimensional kinematic keypoints in the first image and a fourth set of two-dimensional kinematic keypoints in the second image (par. 0039; The pose estimation system 100 estimates the pose 138 of a subject (e.g., human) in the image 102 by estimating the location of the keypoints 140.).

Regarding claim 10, Khamis and Akbas teach all the limitations in claim 1.  Khamis further teaches wherein the detector is to execute a two-dimensional pose estimation algorithm to identify the first set of two-dimensional kinematic keypoints (par. 0069; The image 402 may be denoted by I, where I denotes an n×n image. A human pose is represented by K 2D keypoints, e.g. head, left ankle, etc.).

Regarding claim 12, Khamis and Akbas teach all the limitations in claim 1.  Khamis further teaches wherein the performance metric includes one or more of velocity, acceleration, shoulder sway, center of mass, or stride frequency of the first subject (par. 0033; The kinematic structure 128 encodes structural and geometric information about the structure of the human body, which have been learned by the pose estimation system 100 during a training phase.).

Regarding claim 13, Khamis and Akbas teach all the limitations in claim 1. Akbas further teaches wherein the biomechanics analyzer is to assign a first weight to one or more of the three-dimensional keypoints to determine a first performance metric and assign a second weight to the one or more of the three-dimensional keypoints to determine a second performance metric, the second performance metric different than the first performance metric (col. 8 lines 44-47; During training, because the lower branch 40 is kept frozen, only weights in the upper branch 10 are learned. Weights are not determined for the lower branch 40. The upper branch 10 is the network that is being trained.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the teachings of Khamis with the teachings of Akbas in order to predict three-dimensional human poses from a single image (Akbas, col. 3 lines 4-5).

Regarding claim 14, Khamis teaches a system (Fig. 1A; pose estimation system 100) comprising: a first image capture device to generate first image data, the first image data including a first view of a subject (par. 0095; The user may use the camera 794 on the computing device 790 to capture a scene from the physical space (e.g. moving the camera around to capture a specific area), and the client AR application 780 is configured to detect the set of visual feature points and track the movement of the set of visual feature points over time.); a second image capture device to generate second image data, the second image data including a second view of the subject (par. 0095; The user may use the camera 794 on the computing device 790 to capture a scene from the physical space (e.g. moving the camera around to capture a specific area), and the client AR application 780 is configured to detect the set of visual feature points and track the movement of the set of visual feature points over time.); and a processor to: predict first positions of two-dimensional keypoints of the subject based on the first image data (par. 0035; the pose estimation system 100 uses one or more convolutional blocks 134 to predict the location of the keypoints 140 based on the updated sets of features 130. In some examples, the pose 138 is a 2D pose.); assign a first identifier to the subject in the first image data based on the first positions of the two-dimensional keypoints (par. 0030; The pose 138 is identified by a set of keypoints 140 that are estimated by the pose estimation system 100. The keypoints 140 relate to different body parts (or joints) of the human body such as ankles, hips, head, shoulders, elbows, etc.); predict second positions of two-dimensional keypoints of the subject based on the second image data (par. 0035; the pose estimation system 100 uses one or more convolutional blocks 134 to predict the location of the keypoints 140 based on the updated sets of features 130. In some examples, the pose 138 is a 2D pose.); assign a second identifier to the subject in the second image data based on the second positions of two-dimensional keypoints (par. 0030; The pose 138 is identified by a set of keypoints 140 that are estimated by the pose estimation system 100. The keypoints 140 relate to different body parts (or joints) of the human body such as ankles, hips, head, shoulders, elbows, etc.); identify the subject as a first subject in the first image data and the second image based on the first identifier and the second identifier (par. 0039; the heatmap 136 identifies the locations of the first keypoint 140-1 through N keypoint 140-N. In some examples, the heatmap predictor 132 generates a separate heatmap 136 for each keypoint 140 of a pose 138. For example, a first heatmap depicts the location of a first keypoint 140-1, a second heatmap depicts the location of a second keypoint 140-2, a third heatmap depicts the location of a third keypoint 140-3, and so forth.); predict three-dimensional keypoints for the first subject based on the first positions of the two-dimensional keypoints and the second positions of the two-dimensional keypoints in the second image (par. 0094; The camera 794 may capture image data having a human. The pose estimation system 700 may predict the pose of the human in the image data and send the predicted pose to the AR server. The AR server may use those poses for online or offline processing of the captured images to add visual effects or improve the human tracking.).
Khamis fails to teach the following recited limitation. However, Akbas teaches determine a performance metric for the subject using the three-dimensional keypoints (col. 7 lines 1-18; structural errors in pose might be more important than the localization error measured by the traditional evaluation metrics such as MPJPE (mean per joint position error) and PCK (percentage of correct keypoints), and the Pose Structure Score (PSS) is a new metric used to determine the correctness of pose estimation models.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the teachings of Khamis with the teachings of Akbas in order to predict three-dimensional human poses from a single image (Akbas, col. 3 lines 4-5).

Regarding claim 26, Khamis teaches at least one non-transitory computer readable medium comprising instructions that, when executed, cause at least one processor (par. 0091; a non-transitory computer-readable medium 694 storing executable instructions that when executed by the processors 692.) to at least: identify a first subject in a first image captured by a first image capture device based on a first set of two-dimensional kinematic keypoints in the first image, the first set of two-dimensional kinematic keypoints corresponding to one or more joints of the first subject, the first image capture device associated with a first view of the first subject (par. 0030; The pose estimation system 100 receives an image 102 (e.g., a single image) and predicts a pose 138 of a human depicted in the image 102, where the kinematic structure 128 is leveraged (at a relatively coarse level) to propagate convolutional feature updates between keypoints or body parts, and The keypoints 140 relate to different body parts (or joints) of the human body such as ankles, hips, head, shoulders, elbows, etc.); verify the first subject using the first image and a second image captured by a second image capture device, the second image capture device associated with a second view of the first subject, the second view different than the first view (par. 0094; The camera 794 may capture image data having a human. The pose estimation system 700 may predict (i.e., by verifying) the pose of the human in the image data and send the predicted pose to the AR server. The AR server may use those poses (i.e., first and second views of the human) for online or offline processing of the captured images (i.e., first image and second image) to add visual effects or improve the human tracking.); and generate three-dimensional keypoints for the first subject using the first set of two-dimensional kinematic keypoints and a second set of keypoints in the second image (Fig. 1B, par. 0047; With the sets of features 122 at a relatively coarse level, the kinematic feature updater 124 uses a series of convolutional blocks 126 to update the sets of features 122 to generate updated sets of features 130 based on the kinematic structure 128.).
Khamis fails to teach the following recited limitation. However, Akbas teaches determine a performance metric for the first subject using the three-dimensional keypoints (col. 7 lines 1-18; structural errors in pose might be more important than the localization error measured by the traditional evaluation metrics such as MPJPE (mean per joint position error) and PCK (percentage of correct keypoints), and the Pose Structure Score (PSS) is a new metric used to determine the correctness of pose estimation models.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the teachings of Khamis with the teachings of Akbas in order to predict three-dimensional human poses from a single image (Akbas, col. 3 lines 4-5).

Regarding claim 30, Khamis and Akbas teach all the limitations in claim 26.  Khamis further teaches wherein the instructions, when executed, cause the at least one processor to increase a resolution of at least one of the first image or the second image (par. 0057; The pose estimation system 200 includes an upsampler 260 that uses one or more convolutional blocks 262 to increase the resolution 216 (at a relatively low level) of the updated sets of features 230 to a resolution 264.).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AYODEJI O AYOTUNDE whose telephone number is (571)270-7983. The examiner can normally be reached Monday - Friday, 7:00am-3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Yuwen Pan, can be reached at 571-272-7855. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AYODEJI O AYOTUNDE/
Primary Examiner, Art Unit 2649