DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant’s submission filed 5/16/2022 has been entered. Claims 1, 2, 9 and 14 have been amended. Claims 1-18 are pending in the current application.

Response to Arguments
Applicant's arguments filed 5/16/2022 with respect to the new claim limitations have been fully considered but are moot in view of the new ground(s) of rejection based on the newly cited references. 

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 1 recites the new claim limitation of “screening the detected 2D key points by selecting via a consensus operation of a single relation equation that defines associated relations between the 2D key points and the 3D key points for multiple images, wherein the consensus operation is configured to screen among a same frame of images of the hand under the different angles of view to include key points having a higher reliability”.
However, applicant’s specification discloses a single relation equation at Paragraph 0174 (“the relation equation between 2D key points and 3D key points can be established through the camera internal and external parameters”) and discloses at Paragraph 0215 that, for the 20 observations generated by each key point, some can be selected as credible observations by employing the RANSAC algorithm framework. Applicant’s specification fails to describe screening the detected 2D key points by selecting via a consensus operation of a single relation equation. Moreover, the single relation equation at Paragraph 0174 is not used to screen the 2D key points to include key points having a higher reliability. There is no direct relation between the claimed single relation equation and the screening of key points having a higher reliability.
To comply with the “written description” requirement of 35 U.S.C. § 112, first paragraph, an applicant must convey with reasonable clarity to those skilled in the art that, as of the filing date sought, he or she was in possession of the invention. The invention is, for purposes of the “written description” inquiry, whatever is now claimed. Vas-Cath Inc. v. Mahurkar, 935 F.2d 1555, 1563-64, 19 USPQ2d 1111, 1117 (Fed. Cir. 1991). For purposes of written description, one shows “possession” by descriptive means such as words, structures, figures, diagrams, and formulas that fully set forth the claimed invention. Lockwood v. American Airlines, Inc., 107 F.3d 1565, 1572, 41 USPQ2d 1961, 1966 (Fed. Cir. 1997). Such descriptive means are not found in the disclosure for the invention of the amended base claim 1.
Claims 9 and 14 are subject to the same rationale of rejection as the base claim 1. Claims 2-8 are rejected due to their dependency on claim 1. Claims 10-13 are dependent upon claim 9 and are rejected due to their dependency on claim 9. Claims 15-18 are dependent upon claim 14 and are rejected due to their dependency on claim 14.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the new claim limitation of “screening the detected 2D key points by selecting via a consensus operation of a single relation equation that defines associated relations between the 2D key points and the 3D key points for multiple images, wherein the consensus operation is configured to screen among a same frame of images of the hand under the different angles of view to include key points having a higher reliability”.
However, applicant’s specification discloses a single relation equation at Paragraph 0174 (“the relation equation between 2D key points and 3D key points can be established through the camera internal and external parameters”) and discloses at Paragraph 0215 that, for the 20 observations generated by each key point, some can be selected as credible observations by employing the RANSAC algorithm framework. Applicant’s specification fails to describe screening the detected 2D key points by selecting via a consensus operation of a single relation equation. Moreover, the single relation equation at Paragraph 0174 is not used to screen the 2D key points to include key points having a higher reliability. There is no direct relation between the claimed single relation equation and the screening of key points having a higher reliability.
Applicant has failed to particularly point out and distinctly claim the subject matter which applicant regards as the invention, at least in the amended base claim 1.
Claims 9 and 14 are subject to the same rationale of rejection as the base claim 1. Claims 2-8 are rejected due to their dependency on claim 1. Claims 10-13 are dependent upon claim 9 and are rejected due to their dependency on claim 9. Claims 15-18 are dependent upon claim 14 and are rejected due to their dependency on claim 14.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-18 are rejected under 35 U.S.C. 103 as being unpatentable over Guleryuz US-PGPUB No. 2019/0180473 (hereinafter Guleryuz) in view of Hebbalaguppe et al. US-PGPUB No. 2019/0107894 (hereinafter Hebbalaguppe), Fitzgibbon et al. US-PGPUB No. 2020/0226786 (hereinafter Fitzgibbon), Iqbal et al. US-PGPUB No. 2019/0278983 (hereinafter Iqbal), Sinha et al. US-PGPUB No. 2014/0010407 (hereinafter Sinha) and Kellogg et al. US-PGPUB No. 2019/0026948 (hereinafter Kellogg).
Re Claim 1: 
Guleryuz teaches a method for automatically generating labeled data of a hand, comprising: 
acquiring at least three images to be processed of the hand under different angles of view ( 
Guleryuz implicitly teaches the claim limitation. 
Guleryuz implicitly teaches capturing a collection of images of hand from different viewpoints/poses of the camera. 
Guleryuz teaches at Paragraph 0048 that the 3D skeleton model of the hand is used to determine a 3D key point 1020 that corresponds to the same location in the hand as the key point 1015 and at Paragraph 0051 that the 3D skeleton model is generated according to embodiments of the method 800 shown in FIG. 8 and the method 900 shown in FIG. 9. Guleryuz teaches at FIG. 8, Step 805 and Step 810, capturing 2D images of hand in training set of poses and identifying key points in 2D images of hand, and at Paragraph 0047 that images captured by the camera are projected onto the image plane 1005. Characteristics of the camera also determine a vanishing point 1010 that is an abstract point on the image plane 1005 where 2D projections of parallel lines in 3D space appear to converge. 
Guleryuz teaches at Paragraph 0020 that the learning phase includes generating one or more lookup tables (LUTs) 230 using training images of the hand 205 and the lengths of the phalanxes of the fingers and thumb are determined from the set of training images of the hand 205. Guleryuz teaches at Paragraph 0024 that values of the parameters that define the palm triangle 300 are learned using 2D images of the hand and at Paragraph 0037 that 2D images of a hand positioned in a training set of poses are captured and at Paragraph 0038 that the processor identifies key-points in the 2D images of the hand and at Paragraph 0051 that the 3D skeleton model is generated according to embodiments of the method 800 shown in FIG. 8); 
detecting 2D key points on the at least three images to be processed respectively (
Applicant’s specification discloses at Paragraph 0049 of the instant application publication that “a frame of image is a collection of images from different angles of view”. 
Guleryuz implicitly teaches capturing a collection of 2D images of hand from different viewpoints/poses of the camera and the association relation of the key points among the captured 2D images. 
Guleryuz teaches at Paragraph 0048 that the 3D skeleton model of the hand is used to determine a 3D key point 1020 that corresponds to the same location in the hand as the key point 1015 and at Paragraph 0051 that the 3D skeleton model is generated according to embodiments of the method 800 shown in FIG. 8 and the method 900 shown in FIG. 9. Guleryuz teaches at FIG. 8, Step 805 and Step 810, capturing 2D images of hand in training set of poses and identifying key points in 2D images of hand, and at Paragraph 0047 that images captured by the camera are projected onto the image plane 1005. Characteristics of the camera also determine a vanishing point 1010 that is an abstract point on the image plane 1005 where 2D projections of parallel lines in 3D space appear to converge. 
Guleryuz teaches at Paragraph 0020 that the learning phase includes generating one or more lookup tables (LUTs) 230 using training images of the hand 205 and the lengths of the phalanxes of the fingers and thumb are determined from the set of training images of the hand 205. Guleryuz teaches at Paragraph 0024 that values of the parameters that define the palm triangle 300 are learned using 2D images of the hand and at Paragraph 0037 that 2D images of a hand positioned in a training set of poses are captured and at Paragraph 0038 that the processor identifies key-points in the 2D images of the hand. 
Guleryuz teaches at Paragraph 0034 that one or more of the key points that are derived from the LUT 600 for one 3D pose are the same as or similar to one or more of the key points that are derived from the LUT 600 for another 3D pose and that a confidence score is derived for the dissimilar poses that can result from the same set of projected 2D coordinates. Guleryuz teaches at FIG. 7 the relationship of the 2D coordinates (circles 1, 2, 3, 4 and 5 in FIG. 6) that define the position of the skeleton model 705/710/715/720/725 of the finger in the finger pose plane. 
Guleryuz teaches at Paragraph 0047 that the images captured by the camera are projected onto the image plane 1005 and at Paragraph 0038 that the processor identifies key-points in the 2D images of the hand and at Paragraph 0039 that the processor determines lengths of phalanxes in the fingers and thumb of the hand based on the key points of the 2D images of the hand. 
Guleryuz teaches at Paragraph 0043 that the processor learns orientations of the palm triangle and the thumb triangle); 
reconstructing the 3D key points as a three-dimensional space representation of the hand with regard to the 2D key points screened on the same frame of image, in combination with a given finger bone length (
Applicant’s specification discloses at Paragraph 0049 of the instant application publication that “a frame of image is a collection of images from different angles of view”. 
Guleryuz implicitly teaches capturing a collection of 2D images of hand from different viewpoints/poses of the camera and the association relation of the key points among the captured 2D images. 
Guleryuz teaches at Paragraph 0051 that the 3D skeleton model is generated according to embodiments of the method 800 shown in FIG. 8. Guleryuz teaches at Paragraph 0052 that the processor identifies a first set of 3D key-points that are compliant with the 3D skeleton model of the hand and at Paragraph 0053 that the processor identifies second 3D key-points based on the first 3D key-points and a vanishing point associated with the image…the vanishing point is determined based on characteristics of a camera that acquired the 2D image….the second set of 3D key-points includes the camera-compliant key-point 1030. 
Guleryuz teaches at Paragraph 0042 that the processor can compare lengths of the phalanxes of the fingers and thumb in the skeleton model to lengths of the corresponding phalanxes in the 2D image to account for perspective projection and de-project the 2D image of the hand. 
Guleryuz teaches at Paragraph 0048-0049 that a 3D skeleton model of the hand is lifted from the 2D image on the basis of the noisy key-point 1015 extracted from the 2D image. The 3D skeleton model of the hand is used to determine a 3D key-point 1020 that corresponds to the same location in the hand as the key-point 1015. Guleryuz teaches at Paragraph 0049 that a modified 3D key-point 1030 is therefore determined by projecting the skeleton-compliant key-point 1020 onto the line 1025); 
projecting the 3D key points on the three-dimensional representation of the hand onto the at least three images to be processed (Applicant’s specification discloses at Paragraph 0049 of the instant application publication that “a frame of image is a collection of images from different angles of view”. 
Guleryuz implicitly teaches capturing a collection of 2D images of hand from different viewpoints/poses of the camera and the association relation of the key points among the captured 2D images. 
Guleryuz teaches at Paragraph 0042 that the processor can compare lengths of the phalanxes of the fingers and thumb in the skeleton model to lengths of the corresponding phalanxes in the 2D image to account for perspective projection and de-project the 2D image of the hand. Guleryuz teaches at Paragraph 0024 that values of the parameters that define the palm triangle 300 are learned using 2D images of the hand and at Paragraph 0037 that 2D images of a hand positioned in a training set of poses are captured and at Paragraph 0038 that the processor identifies key-points in the 2D images of the hand. 
Guleryuz teaches at Paragraph 0049 that a modified 3D key-point 1030 is therefore determined by projecting the skeleton-compliant key-point 1020 onto the line 1025); and 
generating the labeled data of the hand on the images to be processed by using the projected 3D key points on the at least three images to be processed (
Applicant’s specification discloses at Paragraph 0049 of the instant application publication that “a frame of image is a collection of images from different angles of view”. 
Guleryuz implicitly teaches capturing a collection of 2D images of hand from different viewpoints/poses of the camera and the association relation of the key points among the captured 2D images. 
Guleryuz teaches at Paragraph 0021 that the processor determines a 3D pose and location of the hand 205 using locations of the key-points to access 2D coordinates of the fingers and thumb from the LUTs 230 which stores the 2D coordinates of each finger and thumb as a function of a relative location of the fingertip and the palm knuckle and at Paragraph 0022 that the processor 225 then modifies the 3D locations of the key-points indicated by the skeleton model based on projections of the 3D locations of the key-points into an image plane along a line connecting the original noisy key-points to a vanishing point associated with the 2D image. 
Guleryuz teaches at Paragraph 0046 and block 903 of FIG. 9 that the processor generates a 3D skeleton model that represents the 3D pose of the hand and at Paragraph 0052 that the first set of 3D key points represents key points corresponding to tips of the fingers and thumb, joints of the fingers and thumb, palm knuckles of the fingers and thumb and a wrist location defined by the 3D skeleton model of the hand and at Paragraph 0055 that an updated 3D skeleton model is generated on the basis of the modified values of the noisy key points).

Hebbalaguppe teaches the claim limitation: acquiring at least three images to be processed of the hand under different angles of view (
Hebbalaguppe teaches at Paragraph 0032 that the media stream is captured by the RGB camera in the user’s FPV (first person view), at Paragraph 0044 the use of a large-scale 3D hand pose dataset having a plurality of training sample RGB images…the camera location may be chosen randomly in a spherical vicinity around the hand for each frame, and at Paragraph 0040 that temporal information includes a plurality of key-points on the hand present in the user’s field of view (FoV) in the frames. The plurality of key-points includes 21 hand key-points comprising 4 key points per finger and one key-point close to the wrist of the user’s hand. The gesture recognition system detects the plurality of key-points and learns/estimates a plurality of network-implicit 3D articulation priors having the plurality of key points of sample user’s hands from sample RGB images using the deep learning network…RGB images such as images 130, 132, 134 are received at the gesture recognition system at 502. The gesture recognition system may include the hand pose estimation module 502 for estimating temporal information associated with the gesture); 
detecting 2D key points on the at least three images to be processed respectively (
Hebbalaguppe teaches at Paragraph 0040 that temporal information includes a plurality of key-points on the hand present in the user’s field of view (FoV) in the frames. The plurality of key-points includes 21 hand key-points comprising 4 key points per finger and one key-point close to the wrist of the user’s hand. The gesture recognition system detects the plurality of key-points and learns/estimates a plurality of network-implicit 3D articulation priors having the plurality of key points of sample user’s hands from sample RGB images using the deep learning network…RGB images such as images 130, 132, 134 are received at the gesture recognition system at 502. The gesture recognition system may include the hand pose estimation module 502 for estimating temporal information associated with the gesture. 
Hebbalaguppe teaches at FIG. 7 and Paragraph 0054-0055 that the 21 key points detected by the hand pose detection module are shown as an overlay on the input images while testing the gesture recognition system. 
Hebbalaguppe teaches at Paragraph 0045 that the processor determines the 2D finger coordinates of the fingers and thumb based on the LUTs and relative locations of the tips of the fingers and the corresponding palm knuckle and that the first layer includes an LSTM layer…to learn long-term dependencies and patterns in the sequence of 3D coordinates of the 21 key-points detected on the user’s hand). 
It would have been obvious to one of ordinary skill in the art before the filing date of the instant application to have incorporated Hebbalaguppe’s teaching of capturing three or more 2D images of the hand as input for the gesture recognition of the hand into Guleryuz’s system of generating a 3D skeleton model of the hand based on the gesture recognition (learning orientations of finger pose planes) by identifying the 3D coordinates of the 3D key points of the 3D skeleton model. One of ordinary skill in the art would have been motivated to provide the 2D images captured from the different viewpoints of the camera in order to collect the 2D key points from the 2D images so that the 3D coordinates of the 3D key points of the 3D skeleton model can be determined. 
Because the perspective projection of Guleryuz requires a single relation equation, as evidenced in Iqbal/Fitzgibbon, Guleryuz at least suggests the claim limitation within the meaning of applicant’s specification: 
screening the detected 2D key points by selecting via a consensus operation of a single relation equation that defines associated relations between the 2D key points and the 3D key points for multiple images, wherein the consensus operation is configured to screen among a same frame of images of the hand under the different angles of view to include key points having a higher reliability (
Guleryuz teaches at Paragraph 0030 that the key points are filtered for outliers using techniques such as median or median absolute deviation to find and reject the outlier key points. 
Guleryuz teaches at Paragraph 0016 that the 3D locations of the key-points indicated by the skeleton model are modified based on projections of the 3D locations of the key-points into an image plane along a line connecting the original 2D key-points to a vanishing point associated with the 2D image and at Paragraph 0048-0052 that the 3D key-point 1020 is not necessarily consistent with the perspective projection (a single relation equation) of the initial key-point 1015 because the skeleton compliant key-point 1020 is not necessarily on a line 1025 between the initial key-point 1015 and the vanishing point 1010. A modified 3D key-point 1030 is therefore determined by projecting the skeleton compliant key-point 1020 onto the line 1025…This process is iterated until a convergence criterion for the key-point is satisfied…FIG. 11 is a flow diagram of a method 1100 of denoising key-points extracted from a 2D image of a hand…..The processor identifies second 3D key-points based on the first 3D key-points and a vanishing point associated with the image….the second set of 3D key-points includes the camera-compliant key-point 1030. Accordingly, Guleryuz teaches identifying/screening the 2D key points 1015 via a perspective projection equation (transformation) that defines associated relations/correspondences between the 2D key points 1015 and 3D key points 1030. The correspondences having higher reliability have been identified. 
Guleryuz teaches at Paragraph 0034 that information in the LUT 600 is used to determine when two or more dissimilar poses result in the same or a similar set of projected 2D coordinates of the finger, e.g., one or more of the key-points that are derived from the LUT 600 for one 3D pose are the same as or similar to one or more of the key-points that are derived from the LUT 600 for another 3D pose…a confidence score is derived for the dissimilar poses that can result from the same set of projected 2D coordinates…a distance from a current pose to a most distant pose that has the same 2D coordinates is used to generate a confidence score such as a high confidence score if the distance is zero or less than a threshold distance…the dissimilar poses are disambiguated on the basis of the confidence scores for the key-points or 2D coordinates that generate the dissimilar poses and at Paragraph 0052 that the processor identifies a first set of 3D key-points that are compliant with the 3D skeleton model of the hand). 
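For illustration only, outlier rejection by median absolute deviation of the kind cited from Paragraph 0030 of Guleryuz can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `mad_inlier_flags`, the cutoff multiplier `k`, and the use of scalar measurements are hypothetical and are not taken from Guleryuz.

```python
from statistics import median


def mad_inlier_flags(values, k=3.0):
    """Flag each value as an inlier (True) or outlier (False).

    A value is kept when its distance from the median is at most
    k times the median absolute deviation (MAD). The multiplier k
    is an assumed tuning parameter.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: at least half the values equal the median.
        return [v == med for v in values]
    return [abs(v - med) <= k * mad for v in values]


# Hypothetical repeated measurements of one quantity, with one gross error.
flags = mad_inlier_flags([10.0, 10.2, 9.9, 10.1, 25.0])
```

In this sketch the last measurement lies far from the median relative to the spread of the others and is therefore flagged as an outlier.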
With respect to the claim limitation of “a single relation equation”, Iqbal teaches at Paragraph 0031 that given the intrinsic camera parameters K, the relationship between the 3D location Pk and corresponding 2D projection pk can be written by a single relation equation (1) under a perspective projection. 
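For illustration only, a perspective projection relation between a 3D location and its 2D projection can be sketched as follows. The sketch assumes the standard pinhole model with zero skew and the 3D point expressed in the camera coordinate frame; the function name and the parameters `fx`, `fy`, `cx`, `cy` are hypothetical, and the notation may differ from equation (1) of Iqbal.

```python
def project_point(point_3d, fx, fy, cx, cy):
    """Project a camera-frame 3D point (X, Y, Z) to pixel coordinates.

    fx, fy are focal lengths in pixels and (cx, cy) is the principal
    point. Zero skew is assumed.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u, v)


# Hypothetical key point 2 m in front of a camera with assumed intrinsics.
u, v = project_point((0.1, -0.2, 2.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

A single equation of this form applies to every key point and every image once the camera parameters for that image are known.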
Fitzgibbon teaches at Paragraph 0043-0046 and 0050-0051 a perspective projection model expressed in terms of a single relation equation using a PnP algorithm that takes into consideration the correspondences (pairs of 2D image points depicting key points and 3D key point positions)…a fourth correspondence is used to remove ambiguity, thus creating a P4P algorithm…a floating key-point 204 is used to remove the ambiguity…the optimizer minimizes an energy function such that the higher reliability correspondences between Ki and the image point pi are included to minimize the energy function. Moreover, by eliminating the floating key points, the system retains the key points with higher reliability assessment values. Fitzgibbon teaches at Paragraph 0058-0059 that a probability model is employed for each possible key-point to determine whether or not the predicted label depicts the first key-point or whether or not another predicted label depicts the second key-point. 
It would have been obvious to one of ordinary skill in the art before the filing date of the instant application to have incorporated a specific single relation equation for the perspective projection between the 2D image points and 3D key points according to Iqbal/Fitzgibbon to implement the perspective projection between the 2D image points and 3D key points of Guleryuz. One of ordinary skill in the art would have provided a specific perspective projection to minimize the total least squares error when mapping the 2D image points and 3D key points of the hand. 
Moreover, Kellogg et al. US-PGPUB No. 2019/0026948 (hereinafter Kellogg) teaches at Paragraph 0029-0036 that the correlation of 2D feature points between images can be used to determine relative rotations and/or translations between 2D reference systems of the respective images….the presently disclosed technology can transform 2D feature points extracted from multiple images…convert the 2D point cloud into a 3D point cloud…locations of 3D feature points that correspond to extracted 2D feature points can be determined based on triangulation…or PnP based methods…outliers of the feature points can be removed using bundle adjustment based methods. 
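For illustration only, triangulation of a 3D point from multiple views can be sketched as a least-squares intersection of viewing rays. This is one common formulation and is not asserted to be the method of Kellogg. The sketch assumes each 2D feature point has already been back-projected into a world-frame ray given by a camera center and a direction; the function names are hypothetical.

```python
import math


def _solve3(a, b):
    """Solve the 3x3 linear system a x = b by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("rays are nearly parallel; point is not determined")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det(m) / d)
    return solution


def triangulate_rays(centers, directions):
    """Return the 3D point minimizing squared distance to all rays."""
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centers, directions):
        norm = math.sqrt(sum(x * x for x in d))
        u = [x / norm for x in d]
        for i in range(3):
            for j in range(3):
                # Entry of (I - u u^T), the projector orthogonal to the ray.
                m_ij = (1.0 if i == j else 0.0) - u[i] * u[j]
                a[i][j] += m_ij
                b[i] += m_ij * c[j]
    return _solve3(a, b)


# Three hypothetical camera centers observing the same point.
target = (0.1, -0.2, 2.0)
centers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
directions = [tuple(t - c for t, c in zip(target, ctr)) for ctr in centers]
point = triangulate_rays(centers, directions)
```

With noise-free rays the recovered point coincides with the observed point; with noisy rays it is the point closest to all rays in the least-squares sense.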
Sinha teaches at Paragraph 0026 that the outliers in the 2D-3D matches are pruned during pose estimation…key points with known 3D point correspondences are typically tracked over longer sequences and at Paragraph 0038 that a coarse location recognition procedure is employed to filter as many incorrect 2D-3D matches as possible during the image-based localization stage…fewer RANSAC hypotheses will be required during robust pose estimation and at Paragraph 0043 and Paragraph 0052 that the subsequent RANSAC based pose estimation step handles outliers and at Paragraph 0060 that RANSAC is used with three-point pose estimation to find a set of inliers…estimating the camera pose involves initially estimating the 3D position and 3D orientation of the video camera using a RANSAC procedure with three-point pose estimation to identify a set of inliers among the 2D-3D correspondences.
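For illustration only, the general RANSAC consensus loop (sample a minimal set, fit a model, count the correspondences that agree, keep the largest consensus set) can be sketched as follows. This is a toy sketch under stated assumptions: a pure 2D translation stands in for the three-point pose solver of Sinha, so the minimal sample is a single correspondence; the function name, threshold, iteration count, and data are hypothetical.

```python
import math
import random


def ransac_translation_inliers(src, dst, threshold=2.0, iterations=50, seed=0):
    """Return indices of correspondences consistent with the best 2D shift."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    best = []
    for _ in range(iterations):
        # Minimal sample: one correspondence defines a candidate shift.
        i = rng.randrange(len(src))
        tx = dst[i][0] - src[i][0]
        ty = dst[i][1] - src[i][1]
        # Consensus set: correspondences the candidate shift explains.
        consensus = [
            j for j, (p, q) in enumerate(zip(src, dst))
            if math.hypot(p[0] + tx - q[0], p[1] + ty - q[1]) <= threshold
        ]
        if len(consensus) > len(best):
            best = consensus
    return best


# Four correspondences share the shift (4, -3); the last two are wrong matches.
src = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5), (3, 7)]
dst = [(4, -3), (14, -3), (4, 7), (14, 7), (50, 50), (-20, 9)]
inliers = ransac_translation_inliers(src, dst)
```

In this sketch the two mismatched correspondences each agree only with themselves, so the retained consensus set consists of the four mutually consistent correspondences.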
Therefore, Iqbal/Fitzgibbon in view of Kellogg/Sinha teaches the claim limitation of screening the detected 2D key points by selecting via a consensus operation of a single relation equation that defines associated relations between the 2D key points and the 3D key points for multiple images, wherein the consensus operation is configured to screen among a same frame of images of the hand under the different angles of view to include key points having a higher reliability. 
It would have been obvious to one of ordinary skill in the art before the filing date of the instant application to have further incorporated Kellogg/Sinha’s removal of outlier key points via bundle adjustment based methods or other methods into the system and method of Guleryuz as modified by Fitzgibbon/Iqbal’s single relation equation for the perspective projection, in order to further enhance Guleryuz’s reconstruction of the 3D key points based on the 2D key points by screening the 2D key points. One of ordinary skill in the art would have been motivated to remove the lower reliability feature points. 

Re Claim 2: 
Claim 2 encompasses the same scope of invention as claim 1 except for the additional claim limitation that the at least three images to be processed are taken by a camera set, and the method further comprises: calibrating the camera set and obtaining camera internal and external parameters of the camera set, wherein the camera internal and external parameters are employed to generate the relation equation. 
Iqbal/Fitzgibbon and Guleryuz further teach the claim limitation that the at least three images to be processed are taken by a camera set, and the method further comprises: calibrating the camera set and obtaining camera internal and external parameters of the camera set (
Iqbal teaches at Paragraph 0031 that given the intrinsic camera parameters K, the relationship between the 3D location Pk and corresponding 2D projection pk can be written by a single relation equation (1) under a perspective projection. 
Fitzgibbon teaches at Paragraph 0041-0046 that the intrinsic camera parameters, such as the camera focal length, principal image point and skew parameter, are known to the PnP algorithm. 
Applicant’s specification discloses at Paragraph 0049 of the instant application publication that “a frame of image is a collection of images from different angles of view”. 
Guleryuz implicitly teaches capturing a collection of 2D images of hand from different viewpoints/poses of the camera and the association relation of the key points among the captured 2D images. 
Guleryuz teaches at Paragraph 0048 that the 3D skeleton model of the hand is used to determine a 3D key point 1020 that corresponds to the same location in the hand as the key point 1015 and at Paragraph 0051 that the 3D skeleton model is generated according to embodiments of the method 800 shown in FIG. 8 and the method 900 shown in FIG. 9. Guleryuz teaches at FIG. 8, Step 805 and Step 810, capturing 2D images of hand in training set of poses and identifying key points in 2D images of hand, and at Paragraph 0047 that images captured by the camera are projected onto the image plane 1005. Characteristics of the camera also determine a vanishing point 1010 that is an abstract point on the image plane 1005 where 2D projections of parallel lines in 3D space appear to converge. 
Guleryuz teaches at Paragraph 0020 that the learning phase includes generating one or more lookup tables (LUTs) 230 using training images of the hand 205 and the lengths of the phalanxes of the fingers and thumb are determined from the set of training images of the hand 205. Guleryuz teaches at Paragraph 0024 that values of the parameters that define the palm triangle 300 are learned using 2D images of the hand and at Paragraph 0037 that 2D images of a hand positioned in a training set of poses are captured and at Paragraph 0038 that the processor identifies key-points in the 2D images of the hand). 
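For illustration only, and not as a characterization of the equations of Iqbal, Fitzgibbon, Guleryuz or the instant application, the general perspective-projection relation between a 3D location and its 2D projection, given camera internal parameters K and external parameters R and t, can be sketched as follows. The function name `project_point` and the numeric values are hypothetical assumptions chosen for the example.

```python
def project_point(K, R, t, P):
    """Project 3D point P (world coordinates) to 2D pixel coordinates.

    K: 3x3 internal (intrinsic) matrix as nested lists.
    R: 3x3 rotation and t: length-3 translation (external parameters).
    All names and values here are illustrative assumptions.
    """
    # World -> camera coordinates: Pc = R * P + t.
    Pc = [sum(R[i][j] * P[j] for j in range(3)) + t[i] for i in range(3)]
    if Pc[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Camera -> homogeneous pixel coordinates: q = K * Pc.
    q = [sum(K[i][j] * Pc[j] for j in range(3)) for i in range(3)]
    # Perspective division yields the 2D projection.
    return (q[0] / q[2], q[1] / q[2])


# Hypothetical camera: focal length 500 px, principal point (320, 240), no skew.
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
print(project_point(K, R, t, [0.1, -0.05, 1.0]))
```

In this sketch the calibration step recited in the claim would be the source of K, R and t for each camera of the camera set.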

Re Claim 3: 
The claim 3 encompasses the same scope of invention as that of the claim 1 except additional claim limitation that calculating the finger bone length by: acquiring at least two images of the hand in a frame of image under different angles of view; performing gesture recognition for the hand by using the at least two images under different angles of view; performing detection of key points on each hand image in at least two images respectively in the case that the recognized gesture is a predefined simple gesture; reconstructing a three-dimensional representation of the hand by using the detected key points; and calculating the finger bone length of the hand according to the three-dimensional key points on the reconstructed three-dimensional representation of the hand. 
Guleryuz further teaches the claim limitation of calculating the finger bone length by: acquiring at least two images of the hand in a frame of image under different angles of view (Guleryuz teaches at Paragraph 0048 that the 3D skeleton model of the hand is used to determine a 3D key point 1020 that corresponds to the same location in the hand as the key point 1015 and at Paragraph 0051 that the 3D skeleton model is generated according to embodiments of the method 800 shown in FIG. 8 and the method 900 shown in FIG. 9. Guleryuz teaches at FIG. 8, Step 805 and Step 810 capturing 2D images of hand in training set of poses and identifying key points in 2D images of hand and at Paragraph 0047 that images captured by the camera are projected onto the image plane 1005. Characteristics of the camera also determine a vanishing point 1010 that is an abstract point on the image plane 1005 where 2D projections of parallel lines in 3D space appear to converge. 
Guleryuz teaches at Paragraph 0020 that the learning phase includes generating one or more lookup tables (LUTs) 230 using training images of the hand 205 and the lengths of the phalanxes of the fingers and thumb are determined from the set of training images of the hand 205. Guleryuz teaches at Paragraph 0024 that values of the parameters that define the palm triangle 300 are learned using 2D images of the hand and at Paragraph 0037 that 2D images of a hand positioned in a training set of poses are captured and at Paragraph 0038 that the processor identifies key-points in the 2D images of the hand); performing gesture recognition for the hand by using the at least two images under different angles of view (Guleryuz teaches at Paragraph 0020 that the learning phase includes generating one or more lookup tables (LUTs) 230 using training images of the hand 205 and the lengths of the phalanxes of the fingers and thumb are determined from the set of training images of the hand 205. Guleryuz teaches at Paragraph 0024 that values of the parameters that define the palm triangle 300 are learned using 2D images of the hand and at Paragraph 0037 that 2D images of a hand positioned in a training set of poses are captured and at Paragraph 0038 that the processor identifies key-points in the 2D images of the hand); performing detection of key points on each hand image in at least two images respectively in the case that the recognized gesture is a predefined simple gesture (Guleryuz teaches at FIG. 8, Step 805 and Step 810 capturing 2D images of hand in training set of poses and identifying key points in 2D images of hand and at Paragraph 0047 that images captured by the camera are projected onto the image plane 1005. 
Characteristics of the camera also determine a vanishing point 1010 that is an abstract point on the image plane 1005 where 2D projections of parallel lines in 3D space appear to converge); reconstructing a three-dimensional representation of the hand by using the detected key points (Applicant’s specification discloses at Paragraph 0049 of the instant application publication that “a frame of image is a collection of images from different angles of view”. 
Guleryuz implicitly teaches capturing a collection of 2D images of hand from different viewpoints/poses of the camera and the association relation of the key points among the captured 2D images. 
Guleryuz teaches at Paragraph 0052 that the processor identifies a first set of 3D key-points that are compliant with the 3D skeleton model of the hand and at Paragraph 0053 that the processor identifies second 3D key-points based on the first 3D key-points and a vanishing point associated with the image…the vanishing point is determined based on characteristics of a camera that acquired the 2D image….the second set of 3D key-points includes the camera-compliant key-point 1030. 
Guleryuz teaches at Paragraph 0042 that the processor can compare lengths of the phalanxes of the fingers and thumb in the skeleton model to lengths of the corresponding phalanxes in the 2D image to account for perspective projection and de-project the 2D image of the hand. 
Guleryuz teaches at Paragraph 0048-0049 that a 3D skeleton model of the hand is lifted from the 2D image on the basis of the noisy key-point 1015 extracted from the 2D image. The 3D skeleton model of the hand is used to determine a 3D key-point 1020 that corresponds to the same location in the hand as the key-point 1015. Guleryuz teaches at Paragraph 0049 that a modified 3D key-point 1030 is therefore determined by projecting the skeleton-compliant key-point 1020 onto the line 1025); and calculating the finger bone length of the hand according to the three-dimensional key points on the reconstructed three-dimensional representation of the hand (Guleryuz teaches at Paragraph 0042 that the processor can compare lengths of the phalanxes of the fingers and thumb in the skeleton model to lengths of the corresponding phalanxes in the 2D image to account for perspective projection and de-project the 2D image of the hand). 
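For illustration only, calculating a finger bone length from three-dimensional key points can be sketched as the Euclidean distance between consecutive joints of a finger. The 21-point key-point layout and the names `FINGER_CHAINS` and `bone_lengths` are hypothetical assumptions; neither Guleryuz nor the instant application is asserted to use this indexing.

```python
import math

# Hypothetical layout: index 0 is the wrist; each finger is a chain of four
# joints listed from the base of the finger to its tip.
FINGER_CHAINS = {
    "thumb": [1, 2, 3, 4],
    "index": [5, 6, 7, 8],
    "middle": [9, 10, 11, 12],
    "ring": [13, 14, 15, 16],
    "little": [17, 18, 19, 20],
}


def bone_lengths(keypoints_3d, chains=FINGER_CHAINS):
    """Return, per finger, the distances between consecutive 3D joints."""
    lengths = {}
    for finger, chain in chains.items():
        lengths[finger] = [
            math.dist(keypoints_3d[a], keypoints_3d[b])
            for a, b in zip(chain, chain[1:])
        ]
    return lengths
```

Each finger in this sketch yields three lengths, one per phalanx between adjacent joints.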
Re Claim 4: 
The claim 4 encompasses the same scope of invention as that of the claim 3 except additional claim limitation that after the step of performing detection of key points on each hand image in at least three images respectively, the calculation method for the finger bone length further comprises: screening the detected key points by using the association relation among the at least three images in the same frame of image, wherein the steps of reconstructing the three-dimensional representation of the hand by using the detected key points comprises: reconstructing the three-dimensional representation of the hand by using the screened key points.
Guleryuz further teaches the claim limitation that after the step of performing detection of key points on each hand image in at least three images respectively, the calculation method for the finger bone length further comprises: screening the detected key points by using the association relation among the at least three images in the same frame of image (Guleryuz implicitly teaches the claim limitation. Guleryuz teaches at Paragraph 0034 that one or more of the key points that are derived from the LUT 600 for one 3D pose are the same or similar to one or more of the key points that are derived from the LUT 600 for another 3D pose and a confidence score is derived for the dissimilar poses that can result from the same set of projected 2D coordinates. Guleryuz teaches at FIG. 7 the relationship of the 2D coordinates (circles 1, 2, 3, 4 and 5 in FIG. 6) that define the position of the skeleton model 705/710/715/720/725 of the finger in the finger pose plane. 
Guleryuz teaches at Paragraph 0047 that the images captured by the camera are projected onto the image plane 1005 and at Paragraph 0038 that the processor identifies key-points in the 2D images of the hand and at Paragraph 0039 that the processor determines lengths of phalanxes in the fingers and thumb of the hand based on the key points of the 2D images of the hand. 
Guleryuz teaches at Paragraph 0043 that the processor learns orientations of the palm triangle and the thumb triangle), wherein the steps of reconstructing the three-dimensional representation of the hand by using the detected key points comprises: reconstructing the three-dimensional representation of the hand by using the screened key points (Guleryuz teaches at Paragraph 0048-0049 that a 3D skeleton model of the hand is lifted from the 2D image on the basis of the noisy key-point 1015 extracted from the 2D image. The 3D skeleton model of the hand is used to determine a 3D key-point 1020 that corresponds to the same location in the hand as the key-point 1015. Guleryuz teaches at Paragraph 0049 that a modified 3D key-point 1030 is therefore determined by projecting the skeleton-compliant key-point 1020 onto the line 1025. 
Guleryuz teaches at Paragraph 0052 that the processor identifies a first set of 3D key-points that are compliant with the 3D skeleton model of the hand and at Paragraph 0053 that the processor identifies second 3D key-points based on the first 3D key-points and a vanishing point associated with the image…the vanishing point is determined based on characteristics of a camera that acquired the 2D image….the second set of 3D key-points includes the camera-compliant key-point 1030. 
Guleryuz teaches at Paragraph 0042 that the processor can compare lengths of the phalanxes of the fingers and thumb in the skeleton model to lengths of the corresponding phalanxes in the 2D image to account for perspective projection and de-project the 2D image of the hand). 
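For illustration only, one generic way to screen key-point detections using the relation among several views of the same frame is a reprojection-error consistency check: a candidate 3D point is projected into every view, and only the views whose detection lies near the projection are kept. This is a swapped-in, simplified technique and is not asserted to be the screening of Guleryuz or of the instant application. The candidate 3D point is assumed to be already available and in front of every camera, and the names `screen_views` and `max_error_px` are hypothetical.

```python
import math


def project(K, R, t, P):
    """Pinhole projection of 3D point P using intrinsics K and extrinsics R, t."""
    Pc = [sum(R[i][j] * P[j] for j in range(3)) + t[i] for i in range(3)]
    q = [sum(K[i][j] * Pc[j] for j in range(3)) for i in range(3)]
    return (q[0] / q[2], q[1] / q[2])


def screen_views(P, detections, cameras, max_error_px=5.0):
    """Return indices of views whose 2D detection agrees with candidate point P.

    detections: one (u, v) detection per view.
    cameras: one (K, R, t) tuple per view, in the same order.
    max_error_px: assumed pixel threshold for treating a detection as reliable.
    """
    kept = []
    for index, (detection, (K, R, t)) in enumerate(zip(detections, cameras)):
        error = math.dist(project(K, R, t, P), detection)
        if error <= max_error_px:
            kept.append(index)
    return kept
```

With three or more views, a detection that disagrees with the others can be discarded in this way while at least two consistent views remain for reconstruction.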

Re Claim 5: 
The claim 5 encompasses the same scope of invention as that of the claim 4 except additional claim limitation that the at least three images are taken by a camera set, and the step of reconstructing the three-dimensional representation of the hand by using the screened key points further comprises: reconstructing the three-dimensional representation of the hand by using the screened key points, in combination with the internal and external parameters of the camera set.
Guleryuz further teaches the claim limitation that the at least three images are taken by a camera set, and the step of reconstructing the three-dimensional representation of the hand by using the screened key points further comprises: reconstructing the three-dimensional representation of the hand by using the screened key points, in combination with the internal and external parameters of the camera set (Guleryuz teaches at Paragraph 0048-0049 that a 3D skeleton model of the hand is lifted from the 2D image on the basis of the noisy key-point 1015 extracted from the 2D image. The 3D skeleton model of the hand is used to determine a 3D key-point 1020 that corresponds to the same location in the hand as the key-point 1015. Guleryuz teaches at Paragraph 0049 that a modified 3D key-point 1030 is therefore determined by projecting the skeleton-compliant key-point 1020 onto the line 1025. 
Guleryuz teaches at Paragraph 0052 that the processor identifies a first set of 3D key-points that are compliant with the 3D skeleton model of the hand and at Paragraph 0053 that the processor identifies second 3D key-points based on the first 3D key-points and a vanishing point associated with the image…the vanishing point is determined based on characteristics of a camera that acquired the 2D image….the second set of 3D key-points includes the camera-compliant key-point 1030. 
Guleryuz teaches at Paragraph 0042 that the processor can compare lengths of the phalanxes of the fingers and thumb in the skeleton model to lengths of the corresponding phalanxes in the 2D image to account for perspective projection and de-project the 2D image of the hand). 
Re Claim 6: 
The claim 6 encompasses the same scope of invention as that of the claim 1 except additional claim limitation that acquiring positions of a plurality of three-dimensional key points in the three-dimensional representation of the hand, the three-dimensional representation of the hand being reconstructed according to at least two two-dimensional images of the hand; generating an auxiliary geometric structure associated with each key point according to the category of each key point in the plurality of three-dimensional key points; generating for each auxiliary geometric structure a set of auxiliary points on the surface of each auxiliary geometric structure; projecting the auxiliary points onto the at least two two-dimensional images; and acquiring edge nodes at the topmost, bottommost, leftmost and rightmost in the projection of the auxiliary points on the at least two two-dimensional images, and generating the gesture bounding box based on the four nodes.
Guleryuz further teaches the claim limitation that acquiring positions of a plurality of three-dimensional key points in the three-dimensional representation of the hand, the three-dimensional representation of the hand being reconstructed according to at least two two-dimensional images of the hand (Guleryuz teaches at Paragraph 0048-0049 that a 3D skeleton model of the hand is lifted from the 2D image on the basis of the noisy key-point 1015 extracted from the 2D image. The 3D skeleton model of the hand is used to determine a 3D key-point 1020 that corresponds to the same location in the hand as the key-point 1015. Guleryuz teaches at Paragraph 0049 that a modified 3D key-point 1030 is therefore determined by projecting the skeleton-compliant key-point 1020 onto the line 1025); generating an auxiliary geometric structure associated with each key point according to the category of each key point in the plurality of three-dimensional key points (Guleryuz teaches at Paragraph 0015 lengths of the phalanxes of the fingers and thumb are determined from a set of training images of the hand. The finger pose lookup tables are generated based on the lengths and anatomical constraints on ranges of motion of the joints that connect the phalanxes. The palm of the hand is represented as a palm triangle and a thumb triangle, which are defined by corresponding sets of vertices and parameters that define the palm triangle and the thumb triangle are also determined from the set of training images. Guleryuz teaches at Paragraph 0016 that a 3D pose of the fingers is then determined by rotating the 2D coordinates based on the orientation of the palm triangle. 
Guleryuz teaches at Paragraph 0042 that the processor can compare lengths of the phalanxes of the fingers and thumb in the skeleton model to lengths of the corresponding phalanxes in the 2D image to account for perspective projection and de-project the 2D image of the hand); generating for each auxiliary geometric structure a set of auxiliary points on the surface of each auxiliary geometric structure (Guleryuz teaches at Paragraph 0015 that lengths of the phalanxes of the fingers and thumb are determined from a set of training images of the hand. The finger pose lookup tables are generated based on the lengths and anatomical constraints on ranges of motion of the joints that connect the phalanxes. The palm of the hand is represented as a palm triangle and a thumb triangle, which are defined by corresponding sets of vertices, and parameters that define the palm triangle and the thumb triangle are also determined from the set of training images. Guleryuz teaches at Paragraph 0016 that a 3D pose of the fingers is then determined by rotating the 2D coordinates based on the orientation of the palm triangle); projecting the auxiliary points onto the at least two two-dimensional images (Guleryuz teaches at Paragraph 0016 that the 3D locations of the key points (including the set of vertices associated with the palm triangle and the thumb triangle) indicated by the skeleton model are modified based on projections of the 3D locations of the key points into an image plane along a line connecting the original 2D key points to a vanishing point associated with the 2D image); and acquiring edge nodes at the topmost, bottommost, leftmost and rightmost in the projection of the auxiliary points on the at least two two-dimensional images, and generating the gesture bounding box based on the four nodes (Guleryuz teaches at FIG. 1 that the thumb triangle and the palm triangle form a bounding box based on the four nodes). 
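For illustration only, generating a bounding box from the topmost, bottommost, leftmost and rightmost projected points can be sketched as follows. The sketch assumes image coordinates in which x increases to the right and y increases downward. The name `gesture_bounding_box` is hypothetical, and the input is assumed to be the auxiliary points already projected onto one two-dimensional image.

```python
def gesture_bounding_box(points_2d):
    """Return (x_min, y_min, x_max, y_max) from the four extreme projected points."""
    if not points_2d:
        raise ValueError("at least one projected point is required")
    # Edge nodes: the extreme points along each image axis.
    leftmost = min(points_2d, key=lambda p: p[0])
    rightmost = max(points_2d, key=lambda p: p[0])
    topmost = min(points_2d, key=lambda p: p[1])      # smallest y is the top row
    bottommost = max(points_2d, key=lambda p: p[1])
    return (leftmost[0], topmost[1], rightmost[0], bottommost[1])


# Hypothetical projected auxiliary points for one image.
print(gesture_bounding_box([(10, 40), (30, 5), (55, 20), (25, 60)]))
```

Applying the function once per image would give one box for each of the at least two two-dimensional images.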
Re Claim 7: 
The claim 7 encompasses the same scope of invention as that of the claim 6 except additional claim limitation that the step of projecting the auxiliary points onto the at least two two-dimensional images further comprises: calculating projection positions of the auxiliary points on the at least two two-dimensional images in combination with the internal and external parameters of the camera set, wherein the at least two images are taken by the camera set.
Guleryuz further teaches the claim limitation that the step of projecting the auxiliary points onto the at least two two-dimensional images further comprises: calculating projection positions of the auxiliary points on the at least two two-dimensional images in combination with the internal and external parameters of the camera set, wherein the at least two images are taken by the camera set (Guleryuz teaches at Paragraph 0016 that the 3D locations of the key points (including the set of vertices associated with the palm triangle and the thumb triangle) indicated by the skeleton model are modified based on projections of the 3D locations of the key points into an image plane along a line connecting the original 2D key points to a vanishing point associated with the 2D image. Guleryuz teaches at Paragraph 0048-0049 that a 3D skeleton model of the hand is lifted from the 2D image on the basis of the noisy key-point 1015 extracted from the 2D image. The 3D skeleton model of the hand is used to determine a 3D key-point 1020 that corresponds to the same location in the hand as the key-point 1015. Guleryuz teaches at Paragraph 0049 that a modified 3D key-point 1030 is therefore determined by projecting the skeleton-compliant key-point 1020 onto the line 1025. 
Guleryuz teaches at Paragraph 0052 that the processor identifies a first set of 3D key-points that are compliant with the 3D skeleton model of the hand and at Paragraph 0053 that the processor identifies second 3D key-points based on the first 3D key-points and a vanishing point associated with the image…the vanishing point is determined based on characteristics of a camera that acquired the 2D image….the second set of 3D key-points includes the camera-compliant key-point 1030. 
Guleryuz teaches at Paragraph 0042 that the processor can compare lengths of the phalanxes of the fingers and thumb in the skeleton model to lengths of the corresponding phalanxes in the 2D image to account for perspective projection and de-project the 2D image of the hand). 
Re Claim 8: 
The claim 8 further recites a computer program product that automatically generates labeled data of a hand, the product for causing one or more processors to execute the method according to claim 1. 
The claim 8 is in parallel with the claim 1 in the form of a computer program product claim. The claim 8 is subject to the same rationale of rejection as the claim 1. Additionally, Guleryuz further teaches the claim limitation of a computer program product that automatically generates labeled data of a hand, the product for causing one or more processors to execute the method according to claim 1 (Guleryuz Paragraph 0056). 

Re Claim 9: 
The claim 9 is in parallel with the claim 1 in the form of an apparatus claim. The claim 9 is subject to the same rationale of rejection as the claim 1. 

The claim 9 further recites the claim limitation of a device for automatically generating labeled data of a hand, comprising: 
an acquisition device for acquiring at least three images to be processed under different angles of view for a hand; 
a detection device for detecting 2D key points on the at least three images to be processed respectively; 
a screening device for screening the detected 2D key points by selecting via a consensus operation of a single relation equation that defines associated relations between the 2D key points and the 3D key points for multiple images, wherein the consensus operation is configured to screen among a same frame of images of the hand under the different angles of view to include key points having a higher reliability assessment value; 
a reconstruction device for reconstructing the 3D key points as a three-dimensional space representation of the hand with regard to the 2D key points screened on the same frame of image, in combination with a given finger bone length; 
a projection device for projecting the 3D key points on the three-dimensional representation of the hand onto the at least three images to be processed; and 
a labeling device for generating the labeled data of the hand on the images to be processed by using the projected 3D key points on the at least three images to be processed.

Guleryuz further teaches the claim limitation of a device for automatically generating labeled data of a hand (e.g., FIG. 2), comprising: 
an acquisition device (e.g., camera 215 of FIG. 2 for performing the operation 805 of FIG. 8); 
a detection device (e.g., processor 225 of FIG. 2 for performing the operation 810 of FIG. 8); 
a screening device (e.g., processor 225 of FIG. 2 for performing the operations of FIGS. 8-9 and FIG. 11); 
a reconstruction device for reconstructing the 3D key points as a three-dimensional space representation of the hand with regard to the 2D key points screened on the same frame of image, in combination with a given finger bone length (e.g., the processor 225 of FIG. 2 for performing the operation 1105 of FIG. 11 and the operations 815 and 820 of FIG. 8); 
a projection device (e.g., the processor 225 of FIG. 2 for performing the operation 905 and the operations of FIG. 10 in relation to the block 1110 and block 1115 of FIG. 11 and Paragraphs 0052-0053); and 
a labeling device (e.g., the processor 225 of FIG. 2 performing the operation of block 1125 of FIG. 11, in which an updated 3D skeleton model is generated based on the modified values of the noisy key points).

Re Claim 10: 
The claim 10 encompasses the same scope of invention as that of the claim 9 except additional claim limitation that the at least three images to be processed are taken by a camera set, and the device further comprises: a calibration device for calibrating the camera set and obtaining camera internal and external parameters of the camera set.
The claim 10 is in parallel with the claim 2 in the form of an apparatus claim. The claim 10 is subject to the same rationale of rejection as the claim 2. 
Re Claim 11: 
The claim 11 encompasses the same scope of invention as that of the claim 9 except additional claim limitation that a recognition device and a calculation device, wherein, the acquisition device further for acquiring at least two images of the hand in a frame of image under different angles of view; the recognition device for performing gesture recognition for the hand by using the at least two images under different angles of view; the detection device further for performing detection of key points on each hand image in at least two images respectively in the case that the recognized gesture is a predefined simple gesture; the reconstruction device further for reconstructing a three-dimensional representation of the hand by using the detected key points; and the calculation device for calculating the finger bone length of the hand according to the three-dimensional key points on the reconstructed three-dimensional representation of the hand.
The claim 11 is in parallel with the claim 3 in the form of an apparatus claim. The claim 11 is subject to the same rationale of rejection as the claim 3. 
Re Claim 12: 
The claim 12 encompasses the same scope of invention as that of the claim 11 except additional claim limitation that the at least two images are at least three images, and the device further comprises: a screening device for screening the detected key points by using an association relation among the at least three images in the same frame of image; and wherein the reconstruction device is used for reconstructing the three-dimensional representation of the hand by using the screened key points.
The claim 12 is in parallel with the claim 4 in the form of an apparatus claim. The claim 12 is subject to the same rationale of rejection as the claim 4. 
Re Claim 13: 
The claim 13 encompasses the same scope of invention as that of the claim 9 except additional claim limitation that an auxiliary geometric structure generation device, an auxiliary point generation device and a bounding box generation device, wherein, the acquisition device is further configured for acquiring positions of a plurality of three-dimensional key points in the three-dimensional representation of the hand, the three-dimensional representation of the hand is reconstructed according to at least two two-dimensional images of the hand; the auxiliary geometric structure generation device is configured for generating an auxiliary geometric structure associated with each key point according to the category of each key point in the plurality of three-dimensional key points; the auxiliary point generation device is configured for generating for each auxiliary geometric structure a set of auxiliary points on the surface of each auxiliary geometric structure; the projection device is further configured for projecting the auxiliary points onto the at least two two-dimensional images; and the bounding box generation device is configured for acquiring edge nodes at the topmost, bottommost, leftmost and rightmost in the projection of the auxiliary points on the at least two two-dimensional images, and generating the gesture bounding box based on the four nodes. 
The claim 13 is in parallel with the claim 6 in the form of an apparatus claim. The claim 13 is subject to the same rationale of rejection as the claim 6. 
Re Claim 14: 
The claim 14 is in parallel with the claim 1 in the form of an apparatus claim. The claim 14 is subject to the same rationale of rejection as the claim 1. 
The claim 14 recites a system for automatically generating labeled data of a hand, comprising: 
an image capture system comprising a camera set configured to acquire at least three images to be processed for the hand under different angles of view; and a labeling device configured to carry out the following operations: 
detecting 2D key points on the at least three images to be processed respectively;
screening the detected 2D key points by selecting via a consensus operation of a single relation equation that defines associated relations between the 2D key points and the 3D key points for multiple images, wherein the consensus operation is configured to screen among a same frame of images of the hand under the different angles of view to include key points having a higher reliability; 
reconstructing the 3D key points as a three-dimensional space representation of the hand with regard to the 2D key points screened on the same frame of image, in combination with a given finger bone length; 
projecting the 3D key points on the three-dimensional representation of the hand onto the at least three images to be processed; and 
generating the labeled data of the hand on the images to be processed by using the projected 3D key points on the at least three images to be processed.
Guleryuz further teaches the claim limitation of a system for automatically generating labeled data of a hand (e.g., FIG. 2), comprising: 
an image capture system (e.g., camera 215 of FIG. 2 for performing the operation 805 of FIG. 8); and a labeling device (e.g., the processor 225 of FIG. 2 performing the operation of block 1125 of FIG. 11, in which an updated 3D skeleton model is generated based on the modified values of the noisy key points).

Re Claim 15: 
The claim 15 encompasses the same scope of invention as that of the claim 14 except additional claim limitation that the operation of reconstructing the three-dimensional representation of the hand further comprises: reconstructing the three-dimensional space representation of the hand with regard to the screened key points on the same frame of image, in combination with the given finger bone length, and the camera internal and external parameters of the camera set. 
The claim 15 is in parallel with the claim 5 in the form of an apparatus claim. The claim 15 is subject to the same rationale of rejection as the claim 5. 
Re Claim 16: 
The claim 16 encompasses the same scope of invention as that of the claim 14 except additional claim limitation that a calculation device for the finger bone length, wherein, the image capture system is further configured to acquire at least two images of the hand in a frame of image under different angles of view; and the calculation device for the finger bone length is configured to carry out the following operations: performing gesture recognition for the hand by using the at least two images under different angles of view; performing detection of key points on each hand image in at least two images respectively in the case that the recognized gesture is a predefined simple gesture; reconstructing a three-dimensional representation of the hand by using the detected key points; and calculating the finger bone length of the hand according to the three-dimensional key points on the reconstructed three-dimensional representation of the hand.
The claim 16 is in parallel with the claim 3 in the form of an apparatus claim. The claim 16 is subject to the same rationale of rejection as the claim 3. 
Re Claim 17: 
The claim 17 encompasses the same scope of invention as that of the claim 16 except additional claim limitation that the at least two images are at least three images, and the calculation device for the finger bone length is further configured to carry out the following operation after the step of performing detection of key points on each hand image in at least three images respectively: screening the detected key points by using an association relation between the at least two images in the same frame of image, wherein the step of reconstructing the three-dimensional representation of the hand by using the detected key points comprises: reconstructing the three-dimensional representation of the hand by using the screened key points. 
The claim 17 is in parallel with the claim 4 in the form of an apparatus claim. The claim 17 is subject to the same rationale of rejection as the claim 4. 
Re Claim 18: 
The claim 18 encompasses the same scope of invention as that of the claim 14 except additional claim limitation that a calculation device for a gesture bounding box, wherein, the image capture system is further configured to acquire at least two images of the hand in a frame of image under different angles of view; and the calculation device for the gesture bounding box is configured to carry out the following operations: acquiring positions of a plurality of three-dimensional key points in the three-dimensional representation of the hand, the three-dimensional representation of the hand is reconstructed according to at least two two-dimensional images of the hand; generating an auxiliary geometric structure associated with each key point according to the category of each key point in the plurality of three-dimensional key points; generating for each auxiliary geometric structure a set of auxiliary points on the surface of each auxiliary geometric structure; projecting the auxiliary points onto the at least two two-dimensional images; and acquiring edge nodes at the topmost, bottommost, leftmost and rightmost in the projection of the auxiliary points on the at least two two-dimensional images, and generating the gesture bounding box based on the four nodes.
The claim 18 is in parallel with the claim 6 in the form of an apparatus claim. The claim 18 is subject to the same rationale of rejection as the claim 6. 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JIN CHENG WANG whose telephone number is (571)272-7665. The examiner can normally be reached Mon-Fri 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu, can be reached on 571-272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JIN CHENG WANG/Primary Examiner, Art Unit 2613