DETAILED ACTION
Response to Amendment
Applicants’ response to the last Office Action, filed on 11/10/2021, has been entered and made of record.
In view of the Applicant’s amendments, the rejection under 35 U.S.C. 112 of claims 1-14 is expressly withdrawn.
Response to Arguments
Applicant’s arguments regarding the use of Misra et al., filed 11/10/2021, with respect to the rejection(s) of claim(s) 1-14 under 35 U.S.C. 102(a)(2) and 35 U.S.C. 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Mathe et al. (US 2010/0194872).  Examiner also strongly suggests considering the prior art made of record and not relied upon that is considered pertinent to applicant’s disclosure.
With regards to Applicant’s argument that the term “device” is not intended and not embodied as a generic “means for performing a specified function”, but rather as a concrete physical thing, Examiner notes that the term “device” is a generic placeholder and does not limit the scope of the claim to any specific structure for performing the claimed functions.  As Applicant indicates in the arguments on Pages 12 and 13 (in number 6), a “device” comprises electronic hardware such as a microcontroller, microprocessor, digital signal processor…and may be implemented in digital electronic circuits, computer hardware, or the like.  Thus, the term “device” does not denote a definite structure and is used as a generic placeholder.  Therefore, while the claim language in claims 13 and 14 does not specifically say "means for", the term "device" is a substitute for the term "means for" and thus still invokes 35 U.S.C. 112(f).
With regards to Applicant’s argument that Miranda does not teach forming groups of related ones of the body parts but the entire detected body skeleton representation defined based on the joint angles thereof is considered as a whole at each stage, Examiner respectfully 
With regards to Applicant’s argument regarding using depth data, Examiner notes that only claim 19 explicitly recites not using any depth information.  Claims 1-18 and 20 do not exclude the use of depth information.
Claim Interpretation
As discussed in the Office action dated 08/11/2021, limitations of claims 13 and 14 invoke 35 U.S.C. 112(f) and are being interpreted to cover the corresponding structure described in the specification that achieves the claimed functions of the “device”.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1, 11, 13, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Mathe et al. (US 2010/0194872).
With regards to claim 1, Mathe et al. discloses a method of recognizing gestures of a person from at least one image from a monocular camera, comprising the steps: 
a) detecting key points of the person in the at least one image from the monocular camera (Para. 0035 lines 1-5, 0061 lines 1-6, 0081 lines 1-7, 0082 lines 1-8, 0083 lines 1-5, “RGB camera” “joints”), 
b) connecting the key points to form a skeleton-like representation of body parts of the person, wherein the skeleton-like representation represents a relative position and a relative orientation of respective individual ones of the body parts of the person (Para. 0061 lines 1-6, 0081 lines 1-7, 0082 lines 1-8, 0083 lines 1-5, Fig. 10, "skeletal model"), 
c) recognizing a gesture of the person based on the skeleton-like representation (Para. 0035 lines 9-22, 0051 lines 1-5, 0089 lines 6-9, "interpret movements" "interpret one or more gestures"), and 
d) producing and outputting a signal indicating the gesture (Para. 0035 lines 9-22, 0089 lines 6-9, "control an application").
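For orientation only, steps a) through d) as recited in claim 1 can be pictured with the following minimal sketch.  It is not taken from Mathe et al. or from Applicant's disclosure.  The key point names, coordinates, bone list, and the single "raised forearm" rule are all hypothetical assumptions chosen for illustration, and step a) is assumed to have already produced 2D key points.

```python
import math

# Step a) is assumed done: hypothetical 2D key points in pixel coordinates.
KEY_POINTS = {
    "shoulder_r": (100.0, 100.0),
    "elbow_r": (130.0, 100.0),
    "wrist_r": (130.0, 60.0),
}

# Hypothetical pairs of key points that are connected into body parts.
BONES = [("shoulder_r", "elbow_r"), ("elbow_r", "wrist_r")]


def build_skeleton(key_points, bones):
    """Step b): connect key points and record each body part's
    relative position (offset) and relative orientation (angle)."""
    skeleton = {}
    for start, end in bones:
        (x0, y0), (x1, y1) = key_points[start], key_points[end]
        dx, dy = x1 - x0, y1 - y0
        # Image y grows downward, so negate dy to get a conventional angle.
        angle = math.degrees(math.atan2(-dy, dx))
        skeleton[(start, end)] = {"offset": (dx, dy), "angle_deg": angle}
    return skeleton


def recognize_gesture(skeleton):
    """Step c): toy rule (an assumption): a forearm pointing roughly
    straight up is labeled 'raised_forearm'."""
    forearm = skeleton[("elbow_r", "wrist_r")]
    if 60.0 <= forearm["angle_deg"] <= 120.0:
        return "raised_forearm"
    return "none"


def output_signal(gesture):
    """Step d): produce a signal indicating the gesture."""
    return {"gesture": gesture}


skeleton = build_skeleton(KEY_POINTS, BONES)
signal = output_signal(recognize_gesture(skeleton))
```

In this sketch the skeleton is only a dictionary of connected key point pairs; a real system would obtain the key points from an image and would use a far richer recognition step.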
With regards to claim 11, Mathe et al. discloses the method according to claim 1, wherein a number of the key points of the person is a maximum of 20 (Para. 0083 lines 1-2, “j18”).
With regards to claim 13, Mathe et al. discloses a device for recognizing gestures of a person from at least one image from a monocular camera, wherein the device is configured: 
a) to detect key points of the person in the at least one image from the monocular camera (Para. 0035 lines 1-5, 0061 lines 1-6, 0081 lines 1-7, 0082 lines 1-8, 0083 lines 1-5, “RGB camera” “joints”), 
b) to connect the key points to form a skeleton-like representation of body parts of the person, wherein the skeleton-like representation represents a relative position and a relative orientation of respective individual ones of the body parts of the person (Para. 0061 lines 1-6, 0081 lines 1-7, 0082 lines 1-8, 0083 lines 1-5, Fig. 10, "skeletal model"), 
c) to recognize a gesture of the person based on the skeleton-like representation (Para. 0035 lines 9-22, 0051 lines 1-5, 0089 lines 6-9, "interpret movements" "interpret one or more gestures"), and 
d) to produce and output an output signal indicating the gesture (Para. 0035 lines 9-22, 0089 lines 6-9, "control an application").
With regards to claim 20, Mathe et al. discloses the method according to claim 1, wherein the at least one image is a single still monocular image (Para. 0035 lines 1-5, 0061 lines 1-6, “image or frame of a scene capture by…the RGB camera”).
Claim(s) 1 and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bhuyan et al. (Hand pose recognition from monocular images by geometrical and texture analysis).
With regards to claim 1, Bhuyan et al. discloses a method of recognizing gestures of a person from at least one image from a monocular camera, comprising the steps: 
a) detecting key points of the person in the at least one image from the monocular camera (2.6. Hand modeling: Para. 1 lines 6-9 – first bullet point, 4. Experimental results: Para. 1 lines 5-8, Fig. 7, “joint positions” “CCD camera”), 
b) connecting the key points to form a skeleton-like representation of body parts of the person, wherein the skeleton-like representation represents a relative position and a relative orientation of respective individual ones of the body parts of the person (Abstract: Para. 1 lines 7-10, 2.6. Hand modeling: Para. 1 lines 6-23 – first to fourth bullet points, Fig. 7, “hand modeling” “skeletal model”), 
c) recognizing a gesture of the person based on the skeleton-like representation (2.8.2. Gesture classification: Para. 1 lines 21-25, Fig. 11, “Hand modelling” “Gesture recognition”), and 
d) producing and outputting a signal indicating the gesture (4. Experimental results: Para. 4 lines 6-12, Para. 5 lines 7-9, Fig. 21, Fig. 23, “results”).
With regards to claim 19, Bhuyan et al. discloses the method according to claim 1, wherein the at least one image is at least one 2D image from the monocular camera, and wherein the detecting of the key points, the connecting of the key points, and the recognizing of the gesture are performed based on 2D information from the at least one 2D image without any depth information (Introduction: Para. 8 lines 1-3, 4. Experimental results: Para. 1 lines 5-8, “2D images” “CCD camera”, where no depth information is used).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-7, 10, 15, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Mathe et al. (US 2010/0194872) in view of Miranda et al. (Real-time gesture recognition from depth data through key poses learning and decision forests).
With regards to claim 2, Mathe et al. discloses the method according to claim 1.
Mathe et al. does not explicitly teach further comprising forming groups respectively from related ones of the body parts. 
However, Miranda et al. discloses forming groups from related ones of the body parts to recognize gestures (IV. Key Pose Statistical Learning: A. Joint-angles Representation: Para. 2 lines 1-8, Para. 9 lines 6-7, B. Multi-class SVM formulation: Para. 1 lines 1-11, Fig. 6, Table I, “key pose” “gesture”).  There are a finite number of 
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to try and include the technique of forming groups from related ones of the body parts to recognize gestures as taught by Miranda et al. into Mathe et al. since one of ordinary skill in the art could have pursued the technique with a reasonable expectation of success of recognizing a gesture from the skeleton-like representation of the person.
With regards to claim 3, the combination of Mathe et al. and Miranda et al. discloses the method according to claim 2, wherein at least one of the body parts belongs to more than one of the groups (Miranda et al.: IV. Key Pose Statistical Learning: A. Joint-angles Representation: Para. 2 lines 1-8, Para. 9 lines 6-7, B. Multi-class SVM formulation: Para. 1 lines 1-11, Fig. 6, Table I, “key pose”).
With regards to claim 4, the combination of Mathe et al. and Miranda et al. discloses the method according to claim 2, wherein the gesture is a static gesture, and further comprising adjusting a number of the groups (Miranda et al.: IV. Key Pose Statistical Learning: A. Joint-angles Representation: Para. 2 lines 1-8, Para. 9 lines 6-7, B. Multi-class SVM formulation: Para. 1 lines 1-11, Fig. 6, Fig. 5, Table I, “key pose” “K”).
With regards to claim 5, the combination of Mathe et al. and Miranda et al. discloses the method according to claim 2 (see claim 2 rejection above). 
Mathe et al. does not explicitly teach further comprising assigning a respective feature vector respectively to each one of the groups, wherein the forming of the groups 
However, Miranda et al. teaches assigning a feature vector respectively to each one of the groups, wherein the forming of the groups comprises combining the key points associated with the related ones of the body parts in each respective one of the groups, and where the feature vector of a respective one of the groups is based on coordinates of the key points which are combined in the respective group to recognize gestures (III. Technical Overview: Para. 2 lines 1-4, IV. Key Pose Statistical Learning: A. Joint-angles Representation: Para. 2 lines 1-8, Para. 9 lines 6-7, B. Multi-class SVM formulation: Para. 1 lines 1-11, V. Gesture Recognition through Decision Forest: Para. 1 lines 1-6, A. Defining a gesture: Para. 1 lines 1-5, B. Recognizing gestures: Para. 1 lines 1-5, Para. 2 lines 1-3, Fig. 5, Fig. 6, Table I, “pose descriptor” “key pose”).  There are a finite number of ways to recognize a gesture from a skeleton-like representation of a person, by comparing positions of individual body parts and by using feature vectors.  Mathe et al. discloses recognizing a gesture from the skeleton-like representation of the person and the technique as taught by Miranda et al. is just one of a finite number of ways to recognize a gesture from a skeleton-like representation of a person. 
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to try and include the technique of assigning a feature vector respectively to each one of the groups, wherein the forming of the groups comprises combining the key points associated with the related ones of the body parts in each respective one of the groups, and where the feature vector of a respective one of the groups is based on coordinates of the key points which are combined in the 
With regards to claim 6, the combination of Mathe et al. and Miranda et al. discloses the method according to claim 5, further comprising merging the feature vectors of the groups with reference to a clustered pose directory to produce a final feature vector (Miranda et al.: V. Gesture Recognition Through Decision Forest: Para. 1 lines 1-6, A. Defining a gesture: Para. 1 lines 1-5, B. Recognizing gestures: Para. 1 lines 1-5, Para. 2 lines 1-3, Fig. 5, “gesture”). 
With regards to claim 7, the combination of Mathe et al. and Miranda et al. discloses the method according to claim 6, wherein the recognizing of the gesture is based on classifying the final feature vector (Miranda et al.: V. Gesture Recognition Through Decision Forest: Para. 1 lines 1-6, A. Defining a gesture: Para. 1 lines 1-5, B. Recognizing gestures: Para. 1 lines 1-5, Para. 2 lines 1-3, Fig. 5, Table I, “gesture”). 
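For orientation only, the grouping and feature vector limitations addressed for claims 3 and 5-7 above can be pictured with the following minimal sketch.  It is not the technique of Miranda et al. or of Applicant's disclosure.  The key point names, coordinates, and group definitions are hypothetical, and plain concatenation stands in for the merging step as a simplifying assumption; the clustered pose directory recited in claim 6 and the classification of claim 7 are not modeled.

```python
# Hypothetical 2D key points (arbitrary units).
KEY_POINTS = {
    "neck": (0.0, 0.0),
    "shoulder_r": (1.0, 0.0),
    "elbow_r": (2.0, 0.0),
    "wrist_r": (2.0, -1.0),
}

# Hypothetical groups of related body parts.  "shoulder_r" belongs to
# both groups, illustrating a body part that is in more than one group.
GROUPS = {
    "torso": ["neck", "shoulder_r"],
    "arm_r": ["shoulder_r", "elbow_r", "wrist_r"],
}


def group_feature_vector(key_points, members):
    """Build one group's feature vector from the coordinates of its key
    points, expressed relative to the group's first key point."""
    origin_x, origin_y = key_points[members[0]]
    vector = []
    for name in members:
        x, y = key_points[name]
        vector.extend([x - origin_x, y - origin_y])
    return vector


def final_feature_vector(key_points, groups):
    """Merge the per-group vectors into one final vector.  Concatenation
    in sorted group order is an assumption made for this sketch."""
    final = []
    for group_name in sorted(groups):
        final.extend(group_feature_vector(key_points, groups[group_name]))
    return final


arm_vector = group_feature_vector(KEY_POINTS, GROUPS["arm_r"])
final_vector = final_feature_vector(KEY_POINTS, GROUPS)
```

A classifier, trained beforehand, would then take the final vector as input; that step is omitted here.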
With regards to claim 10, Mathe et al. discloses the method according to claim 1. 
Mathe et al. does not explicitly teach wherein the recognizing of the gesture is based on a gesture classification which has previously been trained. 
However, Miranda et al. teaches recognizing the gesture based on a gesture classification which has previously been trained (III. Technical Overview: Para. 2 lines 1-4, IV. Key Pose Statistical Learning: A. Joint-angles Representation: Para. 2 lines 1-8, Para. 9 lines 6-7, B. Multi-class SVM formulation: Para. 1 lines 1-11, V. Gesture Recognition through Decision Forest: Para. 1 lines 1-6, A. Defining a gesture: Para. 1 lines 1-5, Para. 2 lines 1-5, Para. 3 lines 1-5, B. Recognizing gestures: Para. 1 lines 1-8, Para. 2 lines 1-3, VI. Results: C. Gesture recognition: Para. 1 lines 1-2, Fig. 2, “gesture training set” “SVM learning machine” “gesture learning machine”).  There are a finite number of ways to recognize a gesture from a skeleton-like representation of a person, by comparing positions of individual body parts and by using classification.  Mathe et al. discloses recognizing a gesture from the skeleton-like representation of the person and the technique as taught by Miranda et al. is just one of a finite number of ways to recognize a gesture from a skeleton-like representation of a person, by using a gesture classification which has been previously trained. 
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to try and include the technique of recognizing the gesture based on a gesture classification which has previously been trained as taught by Miranda et al. into Mathe et al. since one of ordinary skill in the art could have pursued the technique with a reasonable expectation of success of recognizing a gesture from the skeleton-like representation of the person.
With regards to claim 15, the combination of Mathe et al. and Miranda et al. discloses the method according to claim 7, wherein at least one of the body parts belongs to more than one of the groups (Miranda et al.: IV. Key Pose Statistical Learning: A. Joint-angles Representation: Para. 2 lines 1-8, Para. 9 lines 6-7, B. Multi-class SVM formulation: Para. 1 lines 1-11, Fig. 6, Table I, “key pose”).
With regards to claim 18, Mathe et al. discloses the method according to claim 1.
Mathe et al. does not explicitly teach wherein the gesture is a static gesture.
However, Miranda et al. discloses that each gesture is made up of static gestures, or key poses, and discloses recognizing the static gestures (IV. Key Pose Statistical Learning: A. Joint-angles Representation: Para. 2 lines 1-8, Para. 9 lines 6-7, B. Multi-class SVM formulation: Para. 1 lines 1-11, Fig. 6, Table I, “key pose”).  While 
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to modify Mathe et al. to replace the technique of recognizing gestures with recognizing gestures where the gestures are static gestures as taught by Miranda et al. since one of ordinary skill in the art would have been able to carry out such a substitution and the results from the substitution would be predictable to recognize gestures.
Claims 8, 9, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Mathe et al. (US 2010/0194872) in view of Solomon et al. (Driver Attention and Behavior Detection with Kinect).
With regards to claim 8, Mathe et al. discloses the method according to claim 1.
Mathe et al. does not explicitly teach further comprising estimating a viewing direction of the person based on the skeleton-like representation.
However, Solomon et al. discloses a similar concept of determining and using a skeleton-like representation for recognition but in the setting of a vehicle, and teaches the concept of estimating a viewing direction of a person based on a skeleton representation in order to determine if the person is looking in the direction of travel, allowing for more flexibility and more variety of uses of the recognition system (Abstract: Para. 1 lines 5-14, IV. Experimental Evaluation and Analysis: Para. 2 lines 1-11, Para. 9 lines 1-3, Para. 10 lines 1-4, Fig. 18, Fig. 19, “skeleton tracking” “head pose down” “looking”). 
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to include the concept of estimating a viewing direction of a person based on a skeleton representation in the setting of a vehicle as 
With regards to claim 9, the combination of Mathe et al. and Solomon et al. discloses the method according to claim 8, further comprising checking whether the viewing direction of the person is directed toward the monocular camera (Solomon et al.: IV. Experimental Evaluation and Analysis: Para. 2 lines 1-11, Para. 9 lines 1-3, Para. 10 lines 1-4, Fig. 18, Fig. 19, where the viewing direction is looking down and not directed towards the camera).
With regards to claim 12, the combination of Mathe et al. and Solomon et al. discloses the method according to claim 8, further comprising classifying the person as a distracted road user when the gesture and the viewing direction indicate that the person has lowered his or her head and is looking at his or her hand (Solomon et al.: IV. Experimental Evaluation and Analysis: Para. 2 lines 1-11, Para. 9 lines 1-3, Para. 10 lines 1-4, V. Conclusion: Para. 1 lines 5-7, Fig. 18, Fig. 19, “distraction” “warning”).
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Arndt et al. (US 2016/0012301) in view of Mathe et al. (US 2010/0194872).
With regards to claim 14, Arndt et al. discloses a vehicle having a monocular camera and a device for recognizing gestures of a person from images (Para. 0002 lines 1-6, 0030 lines 1-8, 0032 lines 1-11, 0033 lines 1-5, 0036 lines 1-5, “camera” “gestures”).
Arndt et al. does not explicitly teach the vehicle having the device as disclosed in claim 13.
However, Mathe et al. discloses the device according to claim 13 (see claim 13 rejection above).  While Arndt et al. discloses the device for recognizing gestures of a person from images, Mathe et al. teaches a device for recognizing gestures of a person 
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to modify Arndt et al. to replace the device for recognizing gestures of a person with the device for recognizing gestures of a person according to claim 13 as taught by Mathe et al. since one of ordinary skill in the art would have been able to carry out such a substitution and the results from the substitution would be predictable to recognize gestures of a person from images.
Claims 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Mathe et al. (US 2010/0194872) in view of Arndt et al. (US 2016/0012301).
With regards to claim 16, Mathe et al. discloses the method according to claim 1.
Mathe et al. does not explicitly teach wherein the monocular camera is a monocular vehicle camera mounted on a vehicle, and further comprising actuating a control system of the vehicle in response to and dependent on the signal indicating the gesture.
However, Arndt et al. discloses the concept of a monocular camera mounted on a vehicle and actuating a control system of the vehicle in response to and dependent on a signal indicating a gesture, allowing for more flexibility and more variety of uses of recognizing gestures and improving traffic safety (Para. 0002 lines 1-6, 0010 lines 1-3, 0030 lines 1-8, 0032 lines 1-11, 0033 lines 1-5, 0036 lines 1-5, 0037 lines 1-11, 0038 lines 1-5, 0039 lines 1-13, “traffic safety” “camera” “gestures”, “classification result”).  While Mathe et al. discloses recognizing gestures of a person from images, Arndt et al. teaches recognizing gestures of a person from images in the setting of a vehicle.  In both cases, gestures of a person from images obtained from a camera are recognized.

With regards to claim 17, Mathe et al. discloses the method according to claim 1.
Mathe et al. does not explicitly teach wherein the monocular camera is a monocular vehicle camera mounted on a vehicle, and further comprising automatically communicating, from the vehicle to the person, a warning or an acknowledgment indicating that the person has been detected, in response to and dependent on the signal indicating the gesture.
However, Arndt et al. discloses the concept of a monocular camera mounted on a vehicle and automatically communicating, from the vehicle to the person, a warning or an acknowledgment indicating that the person has been detected, in response to and dependent on the signal indicating the gesture, allowing for more flexibility and more variety of uses of recognizing gestures and improving traffic safety (Para. 0002 lines 1-6, 0010 lines 1-3, 0030 lines 1-8, 0032 lines 1-11, 0033 lines 1-5, 0036 lines 1-5, 0037 lines 1-11, 0038 lines 1-5, 0039 lines 1-13, “traffic safety” “camera” “gestures”, “classification result”).  While Mathe et al. discloses recognizing gestures of a person from images, Arndt et al. teaches recognizing gestures of a person from images in the setting of a vehicle.  In both cases, gestures of a person from images obtained from a camera are recognized.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Kontschieder et al. (US 10,083,233) discloses recognizing motor tasks using depth information computed from RGB images captured using an RGB camera and without a depth camera.
Pohl et al. (A driver-distraction-based lane-keeping assistance system) discloses using gaze information and head pose information to determine whether a driver is distracted.
Applicants are also directed to consider additional pertinent prior art included on the Notice of References Cited (PTOL 892) attached herewith.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CAROL WANG whose telephone number is (571)272-5766.  The examiner can normally be reached on 9:30-3:30 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached at (571) 272-3638.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/CAROL WANG/            Primary Examiner, Art Unit 2662