DETAILED ACTION
This is in response to applicant’s amendment/response filed on 03/29/2022, which has been entered and made of record. Claim(s) 1, 13 have been amended. Claim(s) 1-20 are pending in the application. 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 2, 3, 9, 12, 13, 14, 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (CN 104850846 B) in view of Trundle et al. (US 10836309).
Regarding claim 1, Chen discloses A method of training a model to detect human actor interactions (Chen, “The invention claims a video identification field, especially relates to a depth-based neural network of human body action identifying method and identifying system”), the method comprising: 
detecting a human actor in real-world data; tracking the human actor over time (Chen, “Referring to FIG. 1, in one embodiment of the present invention based on human body action identifying method flow chart of depth neural network. in step S11, obtaining the original depth data stream of the actor. In this embodiment, the step S11 of obtaining the original depth data stream of the actor comprises: monitoring the pair of actor in the view, and obtaining real-time RGB video stream collected by the kinect sensor, and using original depth flow data optically encoded obtaining actor”);
augmenting the real-world data by replacing at least a portion of the human with at least a portion of a virtual actor performing a target actor interaction to generate augmented real-world data, wherein the human actor is not performing the target actor interaction in the real-world driving data, whereby useful data is generated without having to acquire real-world data of an actor actually performing the target actor interaction (Chen, “through modeling the whole human body to feature extraction, sending the characteristic data into the restricted Boltzmann machine network for pre-processing, initializing the obtained weight BP neural network parameter, training the deep neural network model, and performing behaviour recognition to the characteristic extraction result; using multi-thread parallel processing, overlapping the extracted human body skeleton closing node data with the actual human body, and displaying the identified action in real time. establishing deep neural network system model by learning, can identify a plurality of human behaviour, such as common fall, call, stoop, reading, sitting, squatting and so on multiple human body behaviour. Please continue to refer to FIG. 1, in step S15, using multi-thread parallel processing, the extracted human framework node data is overlapped with the actual human body, and the identified behaviour is displayed in real time”); and
training a model, using the augmented real-world driving data, to detect the target driver interaction (Chen, “S14. extracting characteristic by modeling the whole human body, sending the characteristic data into the restricted Boltzmann machine network for pre-processing, initializing the obtained weight BP neural network parameter, training the deep neural network model, and using the trained deep neural network model to perform behavior identification to the characteristic extraction result; S15, using multi-thread parallel processing, overlapping the extracted human body skeleton closing node data with the actual human body, and displaying the identified action in real time; S16, establishing abnormal behaviour template library, and alarming the detected abnormal behaviour”).
On the other hand, Chen fails to explicitly disclose but Trundle discloses A method of training a model to detect driver interactions through a window of a vehicle (Trundle, col.10, lines 46-49, “The surveillance camera can collect image data, for example, that depicts the driver's hand position on the steering wheel, the driver's eye gaze (e.g., what the driver is looking at)”. Col.16, lines 35-46, “training a vehicle/driver model 134 includes one or more heuristics that can be applied to the monitoring data, for example, head pose/body pose, telemetry, hand position (e.g., whether a driver is touching a steering wheel of the vehicle 106), gaze target (e.g., whether the driver is looking out a front windshield of the vehicle 106)... Neural networks can be trained to output one or more heuristics that can be applied to the monitoring data”. Col.28, lines 14-18, “Each delivery truck of the multiple delivery trucks can be identified, for example, by a license plate number, an RFID tag in a window of the truck, or other physical markings on the truck that distinguish a particular delivery truck from each other truck in the fleet”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Chen and Trundle, to include all limitations of claim 1. That is, applying the machine learning of an actor’s behavior of Chen to the vehicle driver of Trundle. The motivation/ suggestion would have been to provide various monitoring hardware devices installed at a roadway intersection for tracking vehicle activity through the intersection, where distracted drivers can be detected and reported to a user (Trundle, col.1, lines 31-34).
Regarding claim 13, it recites similar limitations as claim 1 except that it further discloses one or more processors, the one or more processors being programmed to initiate executable operations.
Chen further discloses one or more processors, the one or more processors being programmed to initiate executable operations (Chen, “In addition, those skilled in the art can understand that all or part of the steps in the above embodiments method can be finished by program instruction related hardware, the corresponding program can be stored in a computer readable storage medium, the storage medium, such as ROM/RAM, Disk or optical disk and so on”).
Regarding claim 2, Chen in view of Trundle discloses The method of claim 1.
On the other hand, Chen fails to explicitly disclose but Trundle discloses wherein the target driver interaction includes a gaze or a gesture (Trundle, col.10, lines 46-49, “The surveillance camera can collect image data, for example, that depicts the driver's hand position on the steering wheel, the driver's eye gaze (e.g., what the driver is looking at)”. Col.16, lines 35-41, “training a vehicle/driver model 134 includes one or more heuristics that can be applied to the monitoring data, for example, head pose/body pose, telemetry, hand position (e.g., whether a driver is touching a steering wheel of the vehicle 106), gaze target (e.g., whether the driver is looking out a front windshield of the vehicle 106)”). The same motivation of claim 1 applies here.
Regarding claim 3, Chen in view of Trundle discloses The method of claim 1.
Chen further discloses generating a dataset of virtual drivers performing a target driver interaction (Chen, “a data processing module, used for using multi-thread parallel processing; the extracted human framework node data is overlapped with the actual human body, and the identified action is displayed in real time; a template establishing module, used for establishing abnormal behaviour template library”).
Regarding claim(s) 14, 15, they are interpreted and rejected for the same reasons set forth in claim(s) 2, 3, respectively.
Regarding claim 9, Chen in view of Trundle discloses The method of claim 1.
On the other hand, Chen fails to explicitly disclose but Trundle discloses wherein the window is a front windshield (Trundle, Col.16, lines 35-41, “training a vehicle/driver model 134 includes one or more heuristics that can be applied to the monitoring data, for example, head pose/body pose, telemetry, hand position (e.g., whether a driver is touching a steering wheel of the vehicle 106), gaze target (e.g., whether the driver is looking out a front windshield of the vehicle 106)”). The same motivation of claim 1 applies here.
Regarding claim 12, Chen in view of Trundle discloses The method of claim 1.
On the other hand, Chen fails to explicitly disclose but Trundle discloses wherein the real-world driving data is acquired from at least one of: one or more vehicles and one or more road infrastructure devices (Trundle, col.8, lines 8-12, “The one or more traffic monitoring devices 114 (e.g., a surveillance camera) are deployed at a roadway (e.g., at an intersection 108) of interest to monitor traffic in the roadway from, for example, vehicles, pedestrians, cyclists, etc.”). The same motivation of claim 1 applies here.
Claim(s) 4, 5, 16, 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (CN 104850846 B) in view of Trundle et al. (US 10836309), and further in view of Parker et al. (US 10885691).
Regarding claim 4, Chen in view of Trundle discloses The method of claim 3, wherein virtual drivers performing a target driver interaction have been disclosed.
On the other hand, Chen in view of Trundle fails to explicitly disclose but Parker discloses wherein generating the dataset of virtual character performing the target interaction is performed using a simulator, human computer models, procedural animation, or any combination thereof (Parker, col.15, lines 34-38, “in some implementations the application 810 may be another type of application that may include procedural animations based on motion capture data and/or that may transition between two different animations, such as educational software”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Parker into the combination of Chen and Trundle, to include all limitations of claim 4. That is, adding the procedural animations based on motion capture data of Parker to generate the virtual driver dataset of Chen and Trundle. The motivation/suggestion would have been that the use of procedural animation can result in a larger variety of animation within a game while reducing storage space for the game data 704 of a game (Parker, col.13, line 67-col.14, line 2).
Regarding claim 5, Chen in view of Trundle and Parker discloses The method of claim 4, wherein Parker discloses procedural animation includes scripted procedural animation techniques or motion capture (Parker, col.15, lines 34-38, “in some implementations the application 810 may be another type of application that may include procedural animations based on motion capture data and/or that may transition between two different animations, such as educational software”). The same motivation of claim 4 applies here.
Regarding claim(s) 16, 17, they are interpreted and rejected for the same reasons set forth in claim(s) 4, 5, respectively.
Claim(s) 6, 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (CN 104850846 B) in view of Trundle et al. (US 10836309), and further in view of Chen et al. (US 10861144).
Regarding claim 6, Chen in view of Trundle discloses The method of claim 1.
On the other hand, Chen in view of Trundle fails to explicitly disclose but Chen (US 10861144) discloses at least partially aligning at least a portion of the virtual human with an originally detected pose of the human (Chen (US 10861144), fig.9. Col.8, lines 46-49, 59-62, “At step 908, the 3D skeleton of step 906 is super-imposed or fit (e.g. as a part of augmented reality) over the original master image of the driver as has been obtained by the first camera 202 or at the first view-point. Further, the predicted skeleton may be leveraged to mimic motion of the driver as a part of augmented reality and may be used to training robots”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Chen (US 10861144) into the combination of Chen (CN 104850846 B) and Trundle, to include all limitations of claim 6. That is, applying the fitting of a skeleton over the driver of Chen (US 10861144) to the augmentation of Chen (CN 104850846 B) and Trundle. The motivation/suggestion would have been to predict the skeleton of a living being accurately based on images captured in the multi-camera environment (Chen (US 10861144), col.1, lines 60-62).
Regarding claim(s) 18, it is interpreted and rejected for the same reasons set forth in claim(s) 6.
Claim(s) 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (CN 104850846 B) in view of Trundle et al. (US 10836309), and further in view of Unklesbay et al. (WO 2019215550 A1).
Regarding claim 7, Chen in view of Trundle discloses The method of claim 6, wherein aligning the virtual driver has been disclosed.
On the other hand, Chen in view of Trundle fails to explicitly disclose but Unklesbay discloses matching substantially static body parts or substantially static body joints that are not involved in the target human interaction (Unklesbay, page 6, 2nd paragraph, “An SfM (Structure from Motion) technique could instead be applied on static (non-moving) parts of the face once the feature matching step below has been completed on multiple sequential frames to obtain a more accurate camera extrinsic model. Knowing the camera extrinsics provides a good 3D point representation of each of the landmarks which can help indicate the pose and further be used for more accurate augmented reality”). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Unklesbay into the combination of Chen and Trundle, to include all limitations of claim 7. That is, applying the face pose matching based on static parts of the face of Unklesbay to the driver matching of Chen and Trundle. The motivation/suggestion would have been that knowing the camera extrinsics provides a good 3D point representation of each of the landmarks, which can help indicate the pose and further be used for more accurate augmented reality (Unklesbay, page 6, 2nd paragraph).
Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (CN 104850846 B) in view of Trundle et al. (US 10836309), and further in view of Yi et al. (KR 20150069996 A).
Regarding claim 8, Chen in view of Trundle discloses The method of claim 6, wherein aligning the virtual driver has been disclosed.
On the other hand, Chen in view of Trundle fails to explicitly disclose but Yi discloses wherein at least partially aligning is performed using motion transfer from the virtual object to the real object (Yi, “the calibration to obtain the coordinates on the transparent display 11 in matching with the virtual model 26 can be used for matching, and calculated in the course of compensating the movement conversion matrix between the coordinate system”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Yi into the combination of Chen and Trundle, to include all limitations of claim 8. That is, applying the matching of a virtual object to a real object based on the movement conversion matrix of Yi to the matching of the virtual driver to the real driver of Chen and Trundle. The motivation/suggestion would have been to carry out matching that reflects the movement, with the transformation matrix subsequently used to compensate for the movement (Yi).
Claim(s) 10, 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (CN 104850846 B) in view of Trundle et al. (US 10836309), and further in view of Saxena et al. (US 20180288161).
Regarding claim 10, Chen in view of Trundle discloses The method of claim 1.
On the other hand, Chen in view of Trundle fails to explicitly disclose but Saxena discloses wherein the detecting and the tracking are performed using a human pose detector (Saxena, “[0067] At 212, sensor data is received. In some embodiments, the received sensor data includes data from one or more sensor devices of devices 102 of FIG. 1A. For example, data from a switch, a camera, a motion detector, an infrared detector (e.g., passive infrared sensor), a light detector, an accelerometer, a thermal detector/camera, a human pose/motion camera/detector”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Saxena into the combination of Chen and Trundle, to include all limitations of claim 10. That is, applying the human pose detector of Saxena to detect/track the driver motion of Chen and Trundle. The motivation/suggestion would have been that the controller hub may anticipate needs of the user by applying machine learning, specified preferences of the user, and/or automation rules to detected sensor inputs (Saxena, [0027]), and also to provide a type of detector to collect motion data.
Regarding claim(s) 20, it is interpreted and rejected for the same reasons set forth in claim(s) 10.
Claim(s) 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (CN 104850846 B) in view of Trundle et al. (US 10836309), and further in view of Mukherjee (US 20210232858).
Regarding claim 11, Chen in view of Trundle discloses The method of claim 1, wherein augmented real-world driving data has been disclosed.
On the other hand, Chen in view of Trundle fails to explicitly disclose but Mukherjee discloses improving the realism of the augmented data by using simulation-to-real domain adaptation techniques (Mukherjee, “[0098] “Domain-adapted” here in the application can be defined as a state where a data distribution difference between the domains of rendered images from a 3D model and the images obtained from a camera containing the object in a scene is alleviated or compensated without substantial degradation of data necessary to effectively train for object detection. Domain adaptation techniques such as the use of random or algorithmically chosen textures (e.g., noise, certain lighting conditions), and certain enhancement filters are adapted”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Mukherjee into the combination of Chen and Trundle, to include all limitations of claim 11. That is, applying the domain adaptation technique of Mukherjee to the augmented real-world driving data of Chen and Trundle. The motivation/suggestion would have been to alleviate or compensate for the data distribution difference between the domains of rendered images from a 3D model and the images obtained from a camera containing the object in a scene, without substantial degradation of data necessary to effectively train for object detection (Mukherjee, [0098]).
Claim(s) 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (CN 104850846 B) in view of Trundle et al. (US 10836309), and further in view of Konishi (US 9477881).
Regarding claim 19, Chen in view of Trundle discloses The system of claim 13, wherein augmented real-world driving data has been disclosed.
On the other hand, Chen in view of Trundle fails to explicitly disclose but Konishi discloses detecting the window of the vehicle in the real-world driving data (Konishi, col.9, line 63- col.10, line 5, “The passenger counting system may be configured so that the profile detection means limits a range in the input image, from which a partial image is cut out, to a predetermined range in which inside of a vehicle may be captured, a range in which the inside of a vehicle, which is determined by detecting a window frame by using image processing using template matching, is captured, or an existing range of a vehicle or an existing range of a window frame, which are detected by using a laser range scanner or a passage sensor”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Konishi into the combination of Chen and Trundle, to include all limitations of claim 19. That is, adding the window detection of Konishi to the system of Chen and Trundle. The motivation/ suggestion would have been to acquire an image in which the inside of the vehicle, which is an image capturing target, is captured (Konishi, col.6, lines 51-53).  

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GRACE Q LI whose telephone number is (571)270-0497. The examiner can normally be reached Monday - Friday, 8:00 am-5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung can be reached on (571)-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GRACE Q LI/
Examiner, Art Unit 2611
5/21/2022