Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED OFFICE ACTION

Status of Claims

Claims 1-20 are pending in this Office Action.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

1.	Claims 1, 2, 3, 4, 5, 17, 18, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (USPAT 9311713) in view of Helge Rhodin (NPL Doc: "Learning Monocular 3D Human Pose Estimation from Multi-view Images," June 2018, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Pages 8437-8444).

As per Claim 1, Han et al. teaches a method for training a three-dimensional (3D) pose estimation model using unlabeled images (Col. 4: “…The model 130 of the articulated object may include three-dimensionally (3D)-modeled information of poses performable by the articulated object based on a structure of joints included in the articulated object….”), comprising: receiving as input a plurality of unlabeled images of a particular object each captured from a different viewpoint (Col. 6, lines 9-25: “…Labeled data and unlabeled data may be jointly modeled. A change in a viewpoint with respect to a pose of an articulated object may be processed using an adaptive hierarchical classification scheme. Both the labeled data and the unlabeled data may be used…”).
Han et al. does not explicitly teach training a model, using the plurality of unlabeled images, for use by a process that estimates a three-dimensional (3D) pose for a given 2-dimensional (2D) image.
However, within analogous art, Helge Rhodin teaches training a model, using the plurality of unlabeled images (Page 8438, Col. 2, Section 3 (Approach): “…we train our network using a novel loss function that adds view-consistency terms to a standard supervised loss evaluated on a small amount of labeled data and a regularization term that penalizes drift from initial pose predictions. This formulation enables us to use unlabeled multi-view footage by estimating jointly the body pose in a person-centered coordinate system…”), for use by a process that estimates a three-dimensional (3D) pose for a given 2-dimensional (2D) image (2D pose data for pre-training of 3D estimation is taught at Page 8440, Col. 2: “…The first three levels are initialized through transfer learning by pre-training on a 2D pose estimation task, as proposed by [16], and then kept constant. For the weak supervision experiments, the network is pre-trained on L….” AND Page 8438, Col. 2: “…Our goal is to leverage multi-view images, for which the true 3D human pose is unknown, to train a deep network to predict 3D pose from a single image. To this end, we train our network using a novel loss function that adds view-consistency terms to a standard supervised loss evaluated on a small amount of labeled data and a regularization term that penalizes drift from initial pose predictions…”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Helge Rhodin with the teaching of the Estimator Training Method And Pose Estimating Method Using Depth Image of Han et al. because the Learning Monocular 3D Human Pose Estimation from Multi-view Images of Helge Rhodin provides a system and method for 3D human pose estimation that uses a deep network architecture to process image datasets.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Helge Rhodin within the teaching of Han et al. in order to provide 3D human pose estimation that uses a deep network architecture to process image datasets.

As per Claim 2,  Combination of Han et al. and Helge Rhodin teach claim 1, 
Han et al. does not explicitly teach wherein the plurality of unlabeled images include images without 3D pose annotations.
Within analogous art,  Helge Rhodin teaches wherein the plurality of unlabeled images include images without 3D pose annotations ( Page 8438- Col. 2 – Figure 2 (b)  AND Page 8438-Col. 2 – 3. Approach- “…3D human pose is unknown, to train a deep network to predict 3D pose from a single image. …This formulation enables us to use unlabeled multi-view footage by estimating jointly the body pose in a person-centered coordinate system and the rotation of that coordinate system with respect to the cameras….”) .  

As per Claim 3,  Combination of Han et al. and Helge Rhodin teach claim 1,
Han et al. does not explicitly teach wherein the plurality of unlabeled images are captured without calibrating camera position.
Within analogous art,  Helge Rhodin teaches wherein the plurality of unlabeled images are captured without calibrating camera position ( Page 8438 , Col. 2- “…This formulation enables us to use unlabeled multi-view footage by estimating jointly the body pose in a person-centered coordinate system and the rotation of that coordinate system with respect to the cameras….” AND Page 8440- Col. 1- “…unlike other recent multi-view approaches [24, 19], we do not require a full camera calibration. We only need the intrinsics….”) .  

As per Claim 4, Combination of Han et al. and Helge Rhodin teach claim 1,
Han et al. does not explicitly teach wherein the particular object is a human.
Within analogous art,  Helge Rhodin teaches wherein the particular object is a human (Abstract  “…Accurate 3D human pose …” And Figure 2) .  

As per Claim 5, Combination of Han et al. and Helge Rhodin teach claim 1,
Han et al. does not explicitly teach wherein the 3D pose is defined by 3D locations of joints with respect to a camera.
Within analogous art, Helge Rhodin teaches wherein the 3D pose is defined by 3D locations of joints with respect to a camera (Page 8439, Col. 1: “…let f_θ denote the mapping, with parameters θ, encoded by a CNN taking a monocular image I ∈ R^(w×h×3) as input and producing a 3D human pose p = f_θ(I) ∈ R^(3×N_J) as output, where N_J is the number of human joints in our model and the kth column of p denotes the position of joint k relative to the pelvis….”).

As per Claim 17, Han et al. teaches A non-transitory computer-readable medium storing computer instructions ( Col. 25 -lines 15-20)  that, when executed by one or more processors ( Col. 24- lines 47-55) , cause the one or more processors to perform a method comprising: 
receiving as input a plurality of unlabeled images of a particular object each captured from a different viewpoint( Col. 6, lines 9- 25 – “…Labeled data and unlabeled data may be jointly modeled. A change in a viewpoint with respect to a pose of an articulated object may be processed using an adaptive hierarchical classification scheme. Both the labeled data and the unlabeled data may be used…”);
Han et al. does not explicitly teach  training a model, using the plurality of unlabeled images, for use by a process that estimates a three-dimensional (3D) pose for a given 2-dimensional (2D) image.  
However, within analogous art, Helge Rhodin teaches training a model, using the plurality of unlabeled images (Page 8438, Col. 2, Section 3 (Approach): “…we train our network using a novel loss function that adds view-consistency terms to a standard supervised loss evaluated on a small amount of labeled data and a regularization term that penalizes drift from initial pose predictions. This formulation enables us to use unlabeled multi-view footage by estimating jointly the body pose in a person-centered coordinate system…”), for use by a process that estimates a three-dimensional (3D) pose for a given 2-dimensional (2D) image (2D pose data for pre-training of 3D estimation is taught at Page 8440, Col. 2: “…The first three levels are initialized through transfer learning by pre-training on a 2D pose estimation task, as proposed by [16], and then kept constant. For the weak supervision experiments, the network is pre-trained on L….” AND Page 8438, Col. 2: “…Our goal is to leverage multi-view images, for which the true 3D human pose is unknown, to train a deep network to predict 3D pose from a single image. To this end, we train our network using a novel loss function that adds view-consistency terms to a standard supervised loss evaluated on a small amount of labeled data and a regularization term that penalizes drift from initial pose predictions…”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Helge Rhodin with the teaching of the Estimator Training Method And Pose Estimating Method Using Depth Image of Han et al. because the Learning Monocular 3D Human Pose Estimation from Multi-view Images of Helge Rhodin provides a system and method for 3D human pose estimation that uses a deep network architecture to process image datasets.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Helge Rhodin within the teaching of Han et al. in order to provide 3D human pose estimation that uses a deep network architecture to process image datasets.

As per Claim 18, Han et al. teaches a system (Col. 25, lines 15-18), comprising:
a memory storing computer instructions (Col. 25, lines 20-27); and a processor that executes the computer instructions to perform a method comprising: receiving as input a plurality of unlabeled images of a particular object each captured from a different viewpoint (Col. 6, lines 9-25: “…Labeled data and unlabeled data may be jointly modeled. A change in a viewpoint with respect to a pose of an articulated object may be processed using an adaptive hierarchical classification scheme. Both the labeled data and the unlabeled data may be used…”);
Han et al. does not explicitly teach  training a model, using the plurality of unlabeled images, for use by a process that estimates a three-dimensional (3D) pose for a given 2-dimensional (2D) image. 
However, within analogous art, Helge Rhodin teaches training a model, using the plurality of unlabeled images (Page 8438, Col. 2, Section 3 (Approach): “…we train our network using a novel loss function that adds view-consistency terms to a standard supervised loss evaluated on a small amount of labeled data and a regularization term that penalizes drift from initial pose predictions. This formulation enables us to use unlabeled multi-view footage by estimating jointly the body pose in a person-centered coordinate system…”), for use by a process that estimates a three-dimensional (3D) pose for a given 2-dimensional (2D) image (2D pose data for pre-training of 3D estimation is taught at Page 8440, Col. 2: “…The first three levels are initialized through transfer learning by pre-training on a 2D pose estimation task, as proposed by [16], and then kept constant. For the weak supervision experiments, the network is pre-trained on L….” AND Page 8438, Col. 2: “…Our goal is to leverage multi-view images, for which the true 3D human pose is unknown, to train a deep network to predict 3D pose from a single image. To this end, we train our network using a novel loss function that adds view-consistency terms to a standard supervised loss evaluated on a small amount of labeled data and a regularization term that penalizes drift from initial pose predictions…”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Helge Rhodin with the teaching of the Estimator Training Method And Pose Estimating Method Using Depth Image of Han et al. because the Learning Monocular 3D Human Pose Estimation from Multi-view Images of Helge Rhodin provides a system and method for 3D human pose estimation that uses a deep network architecture to process image datasets.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Helge Rhodin within the teaching of Han et al. in order to provide 3D human pose estimation that uses a deep network architecture to process image datasets.

As per Claim 19, Combination of Han et al. and Helge Rhodin teach claim 18,
Han et al. does not explicitly teach wherein the plurality of unlabeled images are captured without calibrating camera position.  
Within analogous art,  Helge Rhodin teaches wherein the plurality of unlabeled images are captured without calibrating camera position ( Page 8438 , Col. 2- “…This formulation enables us to use unlabeled multi-view footage by estimating jointly the body pose in a person-centered coordinate system and the rotation of that coordinate system with respect to the cameras….” AND Page 8440- Col. 1- “…unlike other recent multi-view approaches [24, 19], we do not require a full camera calibration. We only need the intrinsics….”) .  

As per Claim 20, Combination of Han et al. and Helge Rhodin teach claim 18,
Han et al. does not explicitly teach wherein the plurality of unlabeled images include images without 3D pose annotations.
Within analogous art,  Helge Rhodin teaches wherein the plurality of unlabeled images include images without 3D pose annotations ( Page 8438- Col. 2 – Figure 2 (b)  AND Page 8438-Col. 2 – 3. Approach- “…3D human pose is unknown, to train a deep network to predict 3D pose from a single image. …This formulation enables us to use unlabeled multi-view footage by estimating jointly the body pose in a person-centered coordinate system and the rotation of that coordinate system with respect to the cameras….”) .  


2.	Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (USPAT 9311713) in view of Helge Rhodin (NPL Doc: "Learning Monocular 3D Human Pose Estimation from Multi-view Images," June 2018, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Pages 8437-8444) and further in view of Jiajun Wu (NPL Doc: "MarrNet: 3D Shape Reconstruction via 2.5D Sketches," December 9, 2017, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, Pages 1-8).


As per Claim 6, Combination of Han et al. and Helge Rhodin teach claim 1,
Combination of Han et al. and Helge Rhodin does not explicitly teach wherein the model is used by a first layer of the process that predicts 2.5 dimension (2.5D) pose for the given 2D image.
Within analogous art, Jiajun Wu teaches  wherein the model is used by a first layer of the process that predicts 2.5 dimension (2.5D) pose for the given 2D image ( 2.5D Sketches and estimator within the processing of 2D image taught within Page 3- Figure 2(a)  and  3. Approach) .  
	One of ordinary skill in the art would have been motivated to combine the teaching of Jiajun Wu with the combined teaching of Han et al. and Helge Rhodin because the MarrNet: 3D Shape Reconstruction via 2.5D Sketches of Jiajun Wu provides a system and method for efficient 3D shape reconstruction from images.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Jiajun Wu within the combined teaching of Han et al. and Helge Rhodin in order to provide efficient 3D shape reconstruction from images.

As per Claim 7, Combination of Han et al. and Helge Rhodin  and Jiajun Wu  teach claim 6,
Combination of Han et al. and Helge Rhodin does not explicitly teach wherein a second layer of the process implements 3D reconstruction of the 2.5D pose to estimate the 3D pose for the given 2D image.
Within analogous art, Jiajun Wu teaches wherein a second layer of the process implements 3D reconstruction of the 2.5D pose to estimate the 3D pose for the given 2D image ( Page 3-Figure 2 (b) AND Page 3 – Single -Image 3D Reconstruction and Page 4-3.2 – 3D Shape Estimation)  .  
	One of ordinary skill in the art would have been motivated to combine the teaching of Jiajun Wu with the combined teaching of Han et al. and Helge Rhodin because the MarrNet: 3D Shape Reconstruction via 2.5D Sketches of Jiajun Wu provides a system and method for efficient 3D shape reconstruction from images.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Jiajun Wu within the combined teaching of Han et al. and Helge Rhodin in order to provide efficient 3D shape reconstruction from images.

3.	Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (USPAT 9311713) in view of Helge Rhodin (NPL Doc: "Learning Monocular 3D Human Pose Estimation from Multi-view Images," June 2018, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Pages 8437-8444) and further in view of Jiajun Wu (NPL Doc: "MarrNet: 3D Shape Reconstruction via 2.5D Sketches," December 9, 2017, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, Pages 1-8) and Wei Yang (NPL Doc: "3D Human Pose Estimation in the Wild by Adversarial Learning," June 2018, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Pages 5255-5262).

As per Claim 8,  Combination of Han et al. and Helge Rhodin  and Jiajun Wu  teach claim 6,
Combination of Han et al. and Helge Rhodin and Jiajun Wu does not explicitly teach wherein the model: generates 2D heatmaps for each unlabeled image of the plurality of unlabeled images, and generates latent depth-maps for each unlabeled image of the plurality of unlabeled images.
Within analogous art, Wei Yang teaches wherein the model: generates 2D heatmaps for each unlabeled image of the plurality of unlabeled images (Page 5258, Col. 1, Figure 2, showing 2D heatmaps and depth maps; Page 5258, Col. 2, lines 6-11), and generates latent depth-maps for each unlabeled image of the plurality of unlabeled images (Page 5258, Col. 2: “…the depth information into this representation, we created P depth maps, which have the same resolution as the 2D heatmaps for body joints. Each map is a matrix denoting the depth of a body joint at the corresponding location. The heatmaps and depth maps…”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Wei Yang with the combined teaching of Han et al., Helge Rhodin and Jiajun Wu because the 3D Human Pose Estimation in the Wild by Adversarial Learning of Wei Yang provides a system and method that use a deep convolutional neural network for human pose estimation and 3D localization of body parts.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Wei Yang within the combined teaching of Han et al., Helge Rhodin and Jiajun Wu in order to provide a deep convolutional neural network for human pose estimation and 3D localization of body parts.


4.	Claims 10, 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (USPAT 9311713) in view of Helge Rhodin (NPL Doc: "Learning Monocular 3D Human Pose Estimation from Multi-view Images," June 2018, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Pages 8437-8444) and further in view of Jiajun Wu (NPL Doc: "MarrNet: 3D Shape Reconstruction via 2.5D Sketches," December 9, 2017, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, Pages 1-8), Wei Yang (NPL Doc: "3D Human Pose Estimation in the Wild by Adversarial Learning," June 2018, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Pages 5255-5262) and Bruce Xiaohan Nie (NPL Doc: "Monocular 3D Human Pose Estimation by Predicting Depth on Joints," October 2017, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, Pages 3447-3454).

As per Claim 10, Combination of Han et al. and Helge Rhodin  and Jiajun Wu and   Wei Yang teach claim 8,
Combination of Han et al. and Helge Rhodin  and Jiajun Wu and   Wei Yang does not explicitly teach wherein the first layer: normalizes the 2D heatmaps to generate normalized 2D heatmaps.
Within analogous art, Bruce Xiaohan Nie teaches wherein the first layer: normalizes the 2D heatmaps to generate normalized 2D heatmaps (normalization of the 2D pose within the first layer is taught at Page 3450, Col. 2: “…The input feature of the skeleton-LSTM at joint j is x_j^s = MP(Ŝ_j) where Ŝ_j is the normalized 2D pose by subtracting each joint location by the current joint location [x_j, y_j]. The structure of the multi-layer perceptron is visualized in Fig. 3….”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Bruce Xiaohan Nie with the combined teaching of Han et al., Helge Rhodin, Jiajun Wu and Wei Yang because the Monocular 3D Human Pose Estimation by Predicting Depth on Joints of Bruce Xiaohan Nie provides a system and method for localizing human joints in 3D human pose estimation using neural network models.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Bruce Xiaohan Nie within the combined teaching of Han et al., Helge Rhodin, Jiajun Wu and Wei Yang in order to localize human joints in 3D human pose estimation using neural network models.
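For illustration only, the limitation "normalizes the 2D heatmaps to generate normalized 2D heatmaps" can be pictured with a spatial softmax, which rescales a heatmap so that its entries are positive and sum to 1. The choice of softmax and the name spatial_softmax are assumptions made for this sketch; neither the claim language nor the cited passages specify the normalization used.

```python
import math


def spatial_softmax(heatmap):
    """Normalize a 2D heatmap (a list of rows) so its entries are positive and sum to 1.

    Hypothetical helper: softmax is one common normalization, assumed here.
    """
    # Subtract the largest value before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    exps = [[math.exp(value - peak) for value in row] for row in heatmap]
    total = sum(sum(row) for row in exps)
    return [[value / total for value in row] for row in exps]


# Toy 2x2 heatmap of raw (unnormalized) scores for a single joint.
raw = [[0.0, 1.0], [2.0, 3.0]]
normalized = spatial_softmax(raw)
```

After normalization the largest raw score still marks the most likely joint location, and the map can be read as a probability distribution over pixels.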

As per Claim 11, Combination of Han et al. and Helge Rhodin  and Jiajun Wu and   Wei Yang and   Bruce Xiaohan Nie  teach claim 10,
Combination of Han et al. and Helge Rhodin and Jiajun Wu and Bruce Xiaohan Nie does not explicitly teach wherein the first layer: converts the normalized 2D heatmaps to 2D pose coordinates.
Within analogous art, Wei Yang teaches wherein the first layer: converts the normalized 2D heatmaps to 2D pose coordinates (2D heatmaps for converting ( estimating ) to 2D pose taught within  Pages 5258- Col. 1& 2 - “Additionally, we also investigate using heatmaps as another information source, which is effective for 2D adversarial pose estimation…”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Wei Yang with the combined teaching of Han et al., Helge Rhodin, Jiajun Wu and Bruce Xiaohan Nie because the 3D Human Pose Estimation in the Wild by Adversarial Learning of Wei Yang provides a system and method that use a deep convolutional neural network for human pose estimation and 3D localization of body parts.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Wei Yang within the combined teaching of Han et al., Helge Rhodin, Jiajun Wu and Bruce Xiaohan Nie in order to provide a deep convolutional neural network for human pose estimation and 3D localization of body parts.
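As a hedged sketch of what converting a normalized 2D heatmap to 2D pose coordinates may look like, the following takes the expected pixel location under a heatmap whose entries sum to 1 (often called a soft-argmax). The function name and the toy heatmap are hypothetical, and the cited references are not asserted to use this exact computation.

```python
def heatmap_to_coordinates(normalized_heatmap):
    """Return the expected (x, y) pixel location under a heatmap that sums to 1.

    Hypothetical helper: x indexes columns and y indexes rows.
    """
    x = sum(p * col for row in normalized_heatmap for col, p in enumerate(row))
    y = sum(p * r for r, row in enumerate(normalized_heatmap) for p in row)
    return x, y


# Toy 3x3 heatmap for one joint: mass split evenly between two pixels in row 1.
heatmap = [
    [0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0],
]
x, y = heatmap_to_coordinates(heatmap)
```

Because the result is a weighted average rather than a hard maximum, the coordinate can fall between pixel centers, which is the usual reason this form is chosen when the conversion has to be differentiable.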

As per Claim 12, Combination of Han et al. and Helge Rhodin  and Jiajun Wu and   Bruce Xiaohan Nie and  Wei Yang teach claim 11,
Combination of Han et al. and Helge Rhodin and Jiajun Wu and Bruce Xiaohan Nie does not explicitly teach wherein the first layer: obtains relative depth values from the latent depth-maps and the normalized 2D heatmaps.
Within analogous art, Wei Yang teaches wherein the first layer: obtains relative depth values from the latent depth-maps and the normalized 2D heatmaps (Page 5258, Col. 2: “…the depth information into this representation, we created P depth maps, which have the same resolution as the 2D heatmaps for body joints. Each map is a matrix denoting the depth of a body joint at the corresponding location. The heatmaps and depth maps…”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Wei Yang with the combined teaching of Han et al., Helge Rhodin, Jiajun Wu and Bruce Xiaohan Nie because the 3D Human Pose Estimation in the Wild by Adversarial Learning of Wei Yang provides a system and method that use a deep convolutional neural network for human pose estimation and 3D localization of body parts.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Wei Yang within the combined teaching of Han et al., Helge Rhodin, Jiajun Wu and Bruce Xiaohan Nie in order to provide a deep convolutional neural network for human pose estimation and 3D localization of body parts.
Combination of Han et al. and Helge Rhodin  and Wei Yang and   Bruce Xiaohan Nie  does not explicitly teach wherein the relative depth values and the 2D pose coordinates define the 2.5D pose.
Within analogous art, Jiajun Wu teaches wherein the relative depth values and the 2D pose coordinates define the 2.5D pose ( Page 3 – 3.1 2.5D Sketch Estimation- “…The first component of our network (Figure 2a) takes a 2D RGB image as input, and predicts its 2.5D sketch: surface normal, depth, and silhouette….”) .
	One of ordinary skill in the art would have been motivated to combine the teaching of Jiajun Wu with the combined teaching of Han et al., Helge Rhodin, Bruce Xiaohan Nie and Wei Yang because the MarrNet: 3D Shape Reconstruction via 2.5D Sketches of Jiajun Wu provides a system and method for efficient 3D shape reconstruction from images.
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the teaching of Jiajun Wu within the combined teaching of Han et al., Helge Rhodin, Bruce Xiaohan Nie and Wei Yang in order to provide efficient 3D shape reconstruction from images.
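To make the claim 12 limitations concrete, the sketch below shows one possible way a relative depth value could be obtained from a latent depth-map and a normalized 2D heatmap, and how that value could be paired with 2D pose coordinates to form a 2.5D pose. The heatmap-weighted sum, the function names, and the toy values are all assumptions for illustration; they are not drawn from the claims or from the cited references.

```python
def relative_depth(normalized_heatmap, depth_map):
    """Heatmap-weighted sum of a depth map with the same resolution.

    Hypothetical helper: assumes the heatmap entries sum to 1.
    """
    return sum(
        p * d
        for heat_row, depth_row in zip(normalized_heatmap, depth_map)
        for p, d in zip(heat_row, depth_row)
    )


def make_pose_2_5d(xy, z_rel):
    """Pair 2D pose coordinates with a relative depth value for one joint."""
    return (xy[0], xy[1], z_rel)


# Toy 2x2 maps for a single joint; the depth at a pixel with zero heat is ignored.
heatmap = [[0.0, 0.25], [0.25, 0.5]]
depth_map = [[9.0, 0.4], [0.0, 0.2]]
z = relative_depth(heatmap, depth_map)
joint_2_5d = make_pose_2_5d((0.75, 0.75), z)
```

In this toy case the depth value 9.0 contributes nothing because its heat is zero, so the result depends only on depth where the joint is likely to be.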


It is noted that any citations to specific pages, columns, lines, or figures in the prior art references, and any interpretation of the references, should not be considered limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. See MPEP 2123.

Allowable Subject Matter

5.	Claims 9, 13, 14, 15, and 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

6.	The following is an examiner’s statement of reasons for the indication of allowable subject matter:

As to claim 9, the prior art of record does not teach or suggest the following limitation of claim 9: “…receiving at least one labeled image, each labeled image of the at least one labeled image being of a corresponding object different than the particular object and being annotated with a 2D pose of the corresponding object, wherein the at least one labeled image is used for determining heatmap loss during the training.”

As to claim 13, the prior art of record does not teach or suggest the following limitation of claim 13: “…wherein the second layer: determines a depth of a root joint, and uses the depth of the root joint to reconstruct scale normalized 3D locations of joints using perspective projection.”

As to claims 14 and 15, these claims depend on claim 13, which is objected to as containing allowable subject matter; therefore, claims 14 and 15 are likewise objected to as containing allowable subject matter over the prior art of record.

As to claim 16, this claim depends on claim 15, which is objected to as containing allowable subject matter; therefore, claim 16 is likewise objected to as containing allowable subject matter over the prior art of record.





Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Examiner’s Notes

7.	The Examiner acknowledges the prior art references listed below as pertinent to the claim limitations and inventive concept of the current application. Although these references were not relied upon to address the limitations of the claims, they are analogous art addressing key points of the inventive concept (image processing, 3D pose estimation, unlabeled data processing, neural networks, etc.).

1)	US-20200364554
2)	US-20200279428
3)	US-20200211206
4)	US-20200184668
5)	US-20200005538
6)	US-20190371080
7)	US-20070217676
8)	US-20070080967



Conclusion

8. 	Any inquiry concerning this communication or earlier communications from the examiner should be directed to OMAR S. ISMAIL whose telephone number is (571)272-9799 and Fax # (571)273-9799. The examiner can normally be reached on M-F: 9:00 AM - 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http:/ If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, David C. Payne can be reached on (571)272-3024. The fax phone number for the organization where this application or proceeding is assigned is (571)273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/OMAR S ISMAIL/
Primary Examiner, Art Unit 2637