DETAILED ACTION
Response to Amendment
Applicant’s response to the last Office Action, filed on 04/19/2022, has been entered and made of record.
Applicant’s amendment has necessitated new grounds of rejection.  Thus, new grounds of rejection are presented in this Office Action.
Response to Arguments
Applicant’s arguments with respect to the claims have been considered; however, the arguments are directed to the newly added limitations of “wherein the detecting of the key points, the connecting of the key points, and the recognizing of the gesture are performed based on 2D information from the at least one 2D image without any depth information” and “actuating a control system of the vehicle or outputting a humanly perceivable information signal from the vehicle, automatically in response to and dependent on the signal indicating the gesture.”  Thus, Examiner has brought in references Bhuyan et al. (Hand pose recognition from monocular images by geometrical and texture analysis) and Arndt et al. (US 2016/0012301) to address the limitations added to the claims.
Claim Interpretation
As discussed in the Office action dated 08/11/2021, limitations of claims 13 and 14 invoke 35 U.S.C. 112(f) and are being interpreted to cover the corresponding structure described in the specification that achieves the claimed functions of the “device”.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 10, 11, 13, 14, 16-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Bhuyan et al. (Hand pose recognition from monocular images by geometrical and texture analysis) in view of Arndt et al. (US 2016/0012301).
With regards to claim 1, Bhuyan et al. discloses a method comprising the steps: 
a) detecting key points of body parts of a person in at least one 2D image from a monocular camera (2.6. Hand modeling: Para. 1 lines 6-9 – first bullet point, 4. Experimental results: Para. 1 lines 5-8, Fig. 7, “joint positions” “CCD camera”), 
b) connecting the key points to form a skeleton-like representation of the body parts of the person, wherein the skeleton-like representation represents a relative position and a relative orientation of respective individual ones of the body parts of the person (Abstract: Para. 1 lines 7-10, 2.6. Hand modeling: Para. 1 lines 6-23 – first to fourth bullet points, Fig. 7, “hand modeling” “skeletal model”), 
c) recognizing a gesture of the person based on the skeleton-like representation (2.8.2. Gesture classification: Para. 1 lines 21-25, Fig. 11, “Hand modelling” “Gesture recognition”), wherein the detecting of the key points, the connecting of the key points, and the recognizing of the gesture are performed based on 2D information from the at least one 2D image without any depth information (Introduction: Para. 8 lines 1-3, 4. Experimental results: Para. 1 lines 5-8, “2D images” “CCD camera”, where no depth information is used), 
d) producing a signal indicating the gesture (4. Experimental results: Para. 4 lines 6-12, Para. 5 lines 7-9, Fig. 21, Fig. 23, “results”).
Bhuyan et al. does not explicitly teach the monocular camera to be a monocular vehicle camera mounted on a vehicle and e) actuating a control system of the vehicle or outputting a humanly perceivable information signal from the vehicle, automatically in response to and dependent on the signal indicating the gesture.
However, Arndt et al. discloses the concept of a monocular camera mounted on a vehicle and actuating a control system of the vehicle automatically in response to and dependent on a signal indicating a gesture, allowing for more flexibility and more variety of uses of recognizing gestures and improving traffic safety (Para. 0002 lines 1-6, 0010 lines 1-3, 0030 lines 1-8, 0032 lines 1-11, 0033 lines 1-5, 0036 lines 1-5, 0037 lines 1-11, 0038 lines 1-5, 0039 lines 1-13, “traffic safety” “camera” “gestures”, “classification result”).  While Bhuyan et al. discloses recognizing gestures of a person from images, Arndt et al. teaches recognizing gestures of a person from images in the setting of a vehicle.  In both cases, gestures of a person from images obtained from a camera are recognized.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to include the concept of the monocular camera mounted on a vehicle and actuating a control system of the vehicle in response to and dependent on a signal indicating a gesture as taught by Arndt et al. into the method of Bhuyan et al.  Thus, Bhuyan et al. would be modified to use the gesture recognition method in the setting of a vehicle and mount the monocular camera on the vehicle and actuate the control system of the vehicle automatically in response to and dependent on a signal indicating a gesture.  The motivation for this would be to improve traffic safety and allow for more flexibility and more variety of uses of gesture recognition.
With regards to claim 10, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1, wherein the recognizing of the gesture is based on a gesture classification which has previously been trained (Bhuyan et al.: 2.8.2. Gesture classification: Para. 1 lines 21-25, 4. Experimental results: Para. 6 lines 6-9 – paragraph starting with “Finally, Fig. 24 shows…”, Para. 7 lines 5-8 – paragraph starting with “We compared our proposed method…”, Fig. 11, “training session” “Gesture recognition”).
With regards to claim 11, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1, wherein a number of the key points of the body parts of the person is a maximum of 20 (Bhuyan et al.: 2.6. Hand modeling: Para. 1 lines 6-9 – first bullet point, Fig. 7, where there are 15 key points, “joint positions”).
With regards to claim 13, Bhuyan et al. discloses a device configured: 
a) to detect key points of body parts of a person in at least one 2D image from a monocular camera (2.6. Hand modeling: Para. 1 lines 6-9 – first bullet point, 4. Experimental results: Para. 1 lines 5-8, Fig. 7, “joint positions” “CCD camera”), 
b) to connect the key points to form a skeleton-like representation of the body parts of the person, wherein the skeleton-like representation represents a relative position and a relative orientation of respective individual ones of the body parts of the person (Abstract: Para. 1 lines 7-10, 2.6. Hand modeling: Para. 1 lines 6-23 – first to fourth bullet points, Fig. 7, “hand modeling” “skeletal model”), 
c) to recognize a gesture of the person based on the skeleton-like representation (2.8.2. Gesture classification: Para. 1 lines 21-25, Fig. 11, “Hand modelling” “Gesture recognition”), wherein the detecting of the key points, the connecting of the key points, and the recognizing of the gesture are performed based on 2D information from the at least one 2D image without any depth information (Introduction: Para. 8 lines 1-3, 4. Experimental results: Para. 1 lines 5-8, “2D images” “CCD camera”, where no depth information is used), 
d) to produce a signal indicating the gesture (4. Experimental results: Para. 4 lines 6-12, Para. 5 lines 7-9, Fig. 21, Fig. 23, “results”).
Bhuyan et al. does not explicitly teach the monocular camera to be a monocular vehicle camera mounted on a vehicle and e) to actuate a control system of the vehicle or to output a humanly perceivable information signal from the vehicle, automatically in response to and dependent on the signal indicating the gesture.
However, Arndt et al. discloses the concept of a monocular camera mounted on a vehicle and actuating a control system of the vehicle automatically in response to and dependent on a signal indicating a gesture, allowing for more flexibility and more variety of uses of recognizing gestures and improving traffic safety (Para. 0002 lines 1-6, 0010 lines 1-3, 0030 lines 1-8, 0032 lines 1-11, 0033 lines 1-5, 0036 lines 1-5, 0037 lines 1-11, 0038 lines 1-5, 0039 lines 1-13, “traffic safety” “camera” “gestures”, “classification result”).  While Bhuyan et al. discloses recognizing gestures of a person from images, Arndt et al. teaches recognizing gestures of a person from images in the setting of a vehicle.  In both cases, gestures of a person from images obtained from a camera are recognized.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to include the concept of the monocular camera mounted on a vehicle and actuating a control system of the vehicle in response to and dependent on a signal indicating a gesture as taught by Arndt et al. into the device of Bhuyan et al.  Thus, Bhuyan et al. would be modified to use the gesture recognition method in the setting of a vehicle and mount the monocular camera on the vehicle and actuate the control system of the vehicle automatically in response to and dependent on a signal indicating a gesture.  The motivation for this would be to improve traffic safety and allow for more flexibility and more variety of uses of gesture recognition.
With regards to claim 14, the combination of Bhuyan et al. and Arndt et al. discloses a vehicle having a monocular vehicle camera and a device according to claim 13 (Arndt et al.: Para. 0002 lines 1-6, 0030 lines 1-8, 0032 lines 1-11, 0033 lines 1-5, 0036 lines 1-5, “camera” “gestures”, see also claim 13 rejection above).
With regards to claim 16, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1, wherein the step e) comprises the actuating of the control system of the vehicle automatically in response to and dependent on the signal indicating the gesture (Arndt et al.: Para. 0002 lines 1-6, 0010 lines 1-3, 0030 lines 1-8, 0032 lines 1-11, 0033 lines 1-5, 0036 lines 1-5, 0037 lines 1-11, 0038 lines 1-5, 0039 lines 1-13, “traffic safety” “camera” “gestures”, “classification result”).
With regards to claim 17, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1, wherein the step e) comprises the outputting of the humanly perceivable information signal, which communicates, from the vehicle to the person, a warning or an acknowledgment indicating that the person has been detected, automatically in response to and dependent on the signal indicating the gesture (Arndt et al.: Para. 0002 lines 1-6, 0010 lines 1-3, 0037 lines 1-11, 0039 lines 1-13, “audible or visual signals” “message” “gestures”, “classification result”).
With regards to claim 18, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1, wherein the gesture is a static gesture (Bhuyan et al.: 2.8.2. Gesture classification: Para. 1 lines 21-25, 5. Conclusion: Para. 1 lines 1-4, Fig. 11, Fig. 19, “Hand modelling” “Gesture recognition” “static hand gesture”).
With regards to claim 20, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1, wherein the at least one image is a single still monocular image (Bhuyan et al.: 4. Experimental results: Para. 1 lines 5-8, Fig. 7, “input images” “CCD camera”).
Claims 2-7, 15, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Bhuyan et al. (Hand pose recognition from monocular images by geometrical and texture analysis) in view of Arndt et al. (US 2016/0012301) and further in view of Bulzacki (US 2013/0278501).
With regards to claim 2, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1.
The combination of Bhuyan et al. and Arndt et al. does not explicitly teach further comprising forming a first group of a first subset of the body parts, and forming a second group of a second subset of the body parts, wherein the second subset is different from the first subset. 
However, Bulzacki discloses forming a first group of a first subset of the body parts, and forming a second group of a second subset of the body parts, where the second subset is different from the first subset, in order to recognize gestures (Para. 0075 lines 8-17 and 24-29, 0088 lines 5-7 and 26-27 and 29-30, Para. 0089 lines 1-33, 0090 lines 1-4 and 9-12 and 15-35, “fingertip of a hand...a reference point” “hip of a person and a reference point” “waist point”).  There are a finite number of ways to recognize a gesture from a skeleton-like representation of a person, such as by comparing positions of individual body parts and by forming groups of body parts as poses.  The combination of Bhuyan et al. and Arndt et al. discloses recognizing a gesture from the skeleton-like representation of the person, and the technique as taught by Bulzacki is just one of a finite number of ways to recognize a gesture from a skeleton-like representation of a person. 
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to try and include the technique of recognizing gestures by forming a first group of a first subset of the body parts, and forming a second group of a second subset of the body parts, where the second subset is different from the first subset, as taught by Bulzacki into the combination of Bhuyan et al. and Arndt et al., since one of ordinary skill in the art could have pursued the technique with a reasonable expectation of success of recognizing a gesture from the skeleton-like representation of the person.
With regards to claim 3, the combination of Bhuyan et al., Arndt et al., and Bulzacki discloses the method according to claim 2, wherein at least one of the body parts belongs to more than one of the groups (Bulzacki: Para. 0075 lines 8-17 and 24-29, 0088 lines 5-7 and 26-27 and 29-30, Para. 0089 lines 1-33, 0090 lines 1-4 and 9-12 and 15-35, “waist point” “reference point”).
With regards to claim 4, the combination of Bhuyan et al., Arndt et al., and Bulzacki discloses the method according to claim 2, wherein the gesture is a static gesture, and further comprising adjusting a number of the groups (Bulzacki: Para. 0075 lines 8-17 and 24-29, 0090 lines 1-4 and 9-12 and 15-40, “series of gesture data” “single frame”).
With regards to claim 5, the combination of Bhuyan et al., Arndt et al., and Bulzacki discloses the method according to claim 2, further comprising assigning a respective feature vector respectively to each one of the groups, wherein the forming of the groups comprises combining the key points associated with the related ones of the body parts respectively in each respective one of the groups, and wherein said feature vector of a respective one of the groups is based on coordinates of the key points which are combined in the respective group (Bulzacki: Para. 0088 lines 5-7 and 26-27 and 29-30, Para. 0089 lines 1-33, 0090 lines 1-4 and 9-12 and 15-35, “gesture data” “vectors in x and y coordinate system”).
With regards to claim 6, the combination of Bhuyan et al., Arndt et al., and Bulzacki discloses the method according to claim 5, further comprising merging the feature vectors of the groups with reference to a clustered pose directory to produce a final feature vector (Bulzacki: Para. 0088 lines 5-7 and 26-27 and 29-30, Para. 0089 lines 1-33, 0090 lines 1-4 and 9-12 and 15-35, 0166 lines 1-8, 0190 lines 1-9, 0212 lines 1-2, “gesture” “gesture data” “vectors in x and y coordinate system” “matrix” “GDF”). 
With regards to claim 7, the combination of Bhuyan et al., Arndt et al., and Bulzacki discloses the method according to claim 6, wherein the recognizing of the gesture is based on classifying the final feature vector (Bulzacki: Para. 0075 lines 34-38, 0212 lines 1-2, 0213 lines 1-3, “GDF array” “predict the gesture”). 
With regards to claim 15, the combination of Bhuyan et al., Arndt et al., and Bulzacki discloses the method according to claim 7, wherein at least one of the body parts belongs to more than one of the groups (Bulzacki: Para. 0075 lines 8-17 and 24-29, 0088 lines 5-7 and 26-27 and 29-30, Para. 0089 lines 1-33, 0090 lines 1-4 and 9-12 and 15-35, “waist point” “reference point”).
With regards to claim 21, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1. 
The combination of Bhuyan et al. and Arndt et al. does not explicitly teach wherein the body parts of the person include at least one body part selected from the group consisting of an upper body, shoulders, upper arms, elbows, legs, thighs, hips, knees, and ankles. 
However, Bulzacki discloses performing gesture recognition using body parts of a person including fingers, shoulders, and knees (Para. 0075 lines 8-17, Fig. 7, “shoulder” “knee” “finger tip”).  While the combination of Bhuyan et al. and Arndt et al. discloses performing gesture recognition using body parts of a person including hands and fingers, Bulzacki teaches a broader concept of using body parts of a person including fingers, shoulders, and knees for the same purpose of performing gesture recognition.  In both cases, a skeleton-like representation of body parts of a person is used to recognize a gesture of the person.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Bhuyan et al. and Arndt et al. to replace the technique of performing gesture recognition using limited body parts (hands and fingers) with a broader concept of performing gesture recognition using body parts of a person including fingers, shoulders, and knees as taught by Bulzacki since one of ordinary skill in the art would have been able to carry out such a substitution and the results from the substitution would be predictable to recognize a gesture of the person using a skeleton-like representation of body parts of the person.
Claims 8, 9, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Bhuyan et al. (Hand pose recognition from monocular images by geometrical and texture analysis) in view of Arndt et al. (US 2016/0012301) and further in view of Solomon et al. (Driver Attention and Behavior Detection with Kinect).
With regards to claim 8, the combination of Bhuyan et al. and Arndt et al. discloses the method according to claim 1.
The combination of Bhuyan et al. and Arndt et al. does not explicitly teach further comprising estimating a viewing direction of the person based on the skeleton-like representation.
However, Solomon et al. discloses a similar concept of determining and using a skeleton-like representation for recognition but in the setting of a vehicle, and teaches the concept of estimating a viewing direction of a person based on a skeleton representation in order to determine if the person is looking in the direction of travel, allowing for more flexibility and more variety of uses of the recognition system (Abstract: Para. 1 lines 5-14, IV. Experimental Evaluation and Analysis: Para. 2 lines 1-11, Para. 9 lines 1-3, Para. 10 lines 1-4, Fig. 18, Fig. 19, “skeleton tracking” “head pose down” “looking”). 
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to include the concept of estimating a viewing direction of a person based on a skeleton representation in the setting of a vehicle as taught by Solomon et al. into the method of the combination of Bhuyan et al. and Arndt et al.  The motivation for this would be to allow for more flexibility and more variety in uses of the gesture recognition system. 
With regards to claim 9, the combination of Bhuyan et al., Arndt et al., and Solomon et al. discloses the method according to claim 8, further comprising checking whether the viewing direction of the person is directed toward the monocular vehicle camera (Solomon et al.: IV. Experimental Evaluation and Analysis: Para. 2 lines 1-11, Para. 9 lines 1-3, Para. 10 lines 1-4, Fig. 18, Fig. 19, where the viewing direction is looking down and not directed towards the camera).
With regards to claim 12, the combination of Bhuyan et al., Arndt et al., and Solomon et al. discloses the method according to claim 8, further comprising classifying the person as a distracted road user when the gesture and the viewing direction indicate that the person has lowered his or her head and is looking at his or her hand (Solomon et al.: IV. Experimental Evaluation and Analysis: Para. 2 lines 1-11, Para. 9 lines 1-3, Para. 10 lines 1-4, V. Conclusion: Para. 1 lines 5-7, Fig. 18, Fig. 19, “distraction” “warning”).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Applicants are directed to consider additional pertinent prior art included on the Notice of References Cited (PTOL 892) attached herewith.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CAROL WANG whose telephone number is (571)272-5766.  The examiner can normally be reached on 9:30-3:30 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached on (571) 272-3638.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/CAROL WANG/            Primary Examiner, Art Unit 2662