DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 1-21 are presented for examination.
Claims 1-21 are allowed.

Terminal Disclaimer
The Terminal Disclaimer, filed 06/29/2022, has been acknowledged, approved, and placed in the file of record.

Invention
The present invention teaches: “In one embodiment, a system generates a plurality of driving scenarios to train a reinforcement learning (RL) agent and replays each of the driving scenarios to train the RL agent by: applying a RL algorithm to an initial state of a driving scenario to determine a number of control actions from a number of discretized control/action options for the ADV to advance to a number of trajectory states which are based on a number of discretized trajectory state options, determining a reward prediction by the RL algorithm for each of the controls/actions, determining a judgment score for the trajectory states, and updating the RL agent based on the judgment score.”
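
For illustration only, the training loop summarized in the quoted abstract can be sketched as follows. This is a minimal sketch under stated assumptions and is not the applicant's implementation: the one-dimensional state grid, the three action options, the tabular value estimates standing in for the RL agent, the equal weighting of the three judgment terms, and every name in the block (ACTIONS, N_STATES, judgment_score, replay_scenario) are hypothetical and do not appear in the application.

```python
import random

# Hypothetical discretized control/action options: step back, hold, step forward.
ACTIONS = [-1, 0, 1]
# Hypothetical number of discretized trajectory state options (positions 0..6).
N_STATES = 7


def judgment_score(trajectory, goal, obstacles):
    """Score a trajectory on destination, smoothness, and obstacle avoidance.

    The equal weighting of the three terms is an assumption of this sketch.
    """
    ends_at_goal = 1.0 if trajectory[-1] == goal else 0.0
    # Smoothness: penalize changes between successive position steps.
    steps = [b - a for a, b in zip(trajectory, trajectory[1:])]
    jerk = sum(abs(s2 - s1) for s1, s2 in zip(steps, steps[1:]))
    smooth = 1.0 / (1.0 + jerk)
    avoids = 0.0 if any(s in obstacles for s in trajectory) else 1.0
    return ends_at_goal + smooth + avoids


def replay_scenario(q, start, goal, obstacles, horizon, epsilon, lr, rng):
    """Replay one driving scenario and update the agent from its judgment score."""
    state, trajectory, taken = start, [start], []
    for _ in range(horizon):
        predicted = q[state]  # reward prediction for each action option
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))  # explore
        else:
            a = max(range(len(ACTIONS)), key=lambda i: predicted[i])  # exploit
        taken.append((state, a))
        # Advance to the next discretized trajectory state, clamped to the grid.
        state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        trajectory.append(state)
        if state == goal:
            break
    score = judgment_score(trajectory, goal, obstacles)
    # Move each visited (state, action) estimate toward the judgment score.
    for s, a in taken:
        q[s][a] += lr * (score - q[s][a])
    return trajectory, score


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(200):  # replay the same hypothetical scenario repeatedly
        replay_scenario(q, start=0, goal=6, obstacles=set(), horizon=12,
                        epsilon=0.2, lr=0.1, rng=rng)
    print(replay_scenario(q, 0, 6, set(), 12, 0.0, 0.1, rng))
```

The three terms of the hypothetical judgment_score correspond to the three criteria recited in claims 11 and 18 (ending at a planned destination state, smoothness, and obstacle avoidance); how the application actually computes or combines them is not stated in this action.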

EXAMINER’S AMENDMENT
An Examiner's amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this Examiner's amendment was given in an interview with Zhan John Cao, Reg. No. L 1167, on June 29, 2022.


The claims have been amended as follows:


9. (Currently Amended) The non-transitory machine-readable medium of claim 8, wherein the plurality of discretized control action options are generated based on a vehicle dynamic model for autonomous driving.

10. (Currently Amended) The non-transitory machine-readable medium of claim 8, wherein the plurality of discretized trajectory state options are generated by discretizing a region of interest for the driving scenario in view of a final destination trajectory state.

11. (Currently Amended) The non-transitory machine-readable medium of claim 8, wherein the judgment score includes scores representing whether the trajectory ends at a planned destination state, the trajectory is smooth, and the trajectory avoids one or more obstacles of an environment model.

12. (Currently Amended) The non-transitory machine-readable medium of claim 8, wherein the driving scenario includes one or more regions of interest (ROIs).

13. (Currently Amended) The non-transitory machine-readable medium of claim 8, wherein the RL agent includes an actor neural network and a critic neural network, and wherein the actor and critic neural networks are deep neural networks.

14. (Currently Amended) The non-transitory machine-readable medium of claim 13, wherein the actor neural network includes a convolutional neural network.

16. (Currently Amended) The data processing system of claim 15, wherein the plurality of discretized control action options are generated based on a vehicle dynamic model for autonomous driving.

17. (Currently Amended) The data processing system of claim 15, wherein the plurality of discretized trajectory state options are generated by discretizing a region of interest for the driving scenario in view of a final destination trajectory state.

18. (Currently Amended) The data processing system of claim 15, wherein the judgment score includes scores representing whether the trajectory ends at a planned destination state, the trajectory is smooth, and the trajectory avoids one or more obstacles of an environment model.

19. (Currently Amended) The data processing system of claim 15, wherein the driving scenario includes one or more regions of interest (ROIs).

20. (Currently Amended) The data processing system of claim 15, wherein the RL agent includes an actor neural network and a critic neural network, and wherein the actor and critic neural networks are deep neural networks.

21. (Currently Amended) The data processing system of claim 20, wherein the actor neural network includes a convolutional neural network.

Reason for Allowance
Claims 1-21 are allowed, all pending issues having been rendered moot. The following is an Examiner’s statement of reasons for allowance: The claims are allowed based on the Remarks/Arguments of the Applicants filed 06/16/2022, Pages 1-4. Further, the prior art of record fails to teach or suggest, either singly or in combination, the claimed subject matter of the invention. Therefore, independent claims 1, 8, and 15 are allowed, and claims 2-7, 9-14, and 16-21 are also allowed based on their dependency upon independent claims 1, 8, and 15.

Therefore, when the application is taken as a whole, incorporating all the respective limitations, none of the prior art discloses the features as claimed.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submission should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

          Beller et al. (US Pub. No.: 2020/0139967 A1) teaches “Techniques for determining to modify a trajectory based on an object are discussed herein. A vehicle can determine a drivable area of an environment, capture sensor data representing an object in the environment, and perform a spot check to determine whether or not to modify a trajectory. Such a spot check may include processing to incorporate an actual or predicted extent of the object into the drivable area to modify the drivable area. A distance between a reference trajectory and the object can be determined at discrete points along the reference trajectory, and based on a cost, distance, or intersection associated with the trajectory and the modified area, the vehicle can modify its trajectory. One trajectory modification includes following, which may include varying a longitudinal control of the vehicle, for example, to maintain a relative distance and velocity between the vehicle and the object.”

          Zhang et al. (US Pub. No.: 2019/0235516 A1) teaches “According to some embodiments, a system calculates a first trajectory based on a map and a route information. The system performs a path optimization based on the first trajectory, traffic rules, and an obstacle information describing obstacles perceived by the ADV. The path optimization is performed by performing a spline curve based path optimization on the first trajectory, determining whether a result of the spline curve based path optimization satisfies a first predetermined condition, performing a finite element based path optimization on the first trajectory in response to determining that the result of the spline curve based path optimization does not satisfy the first predetermined condition, performing a speed optimization based on a result of the path optimization, and generating a second trajectory based on the path optimization and the speed optimization to control the ADV.”
        
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BABAR SARWAR whose telephone number is (571)270-5584.  The examiner can normally be reached on Mon-Fri 9:00 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Faris S. Almatrahi, can be reached at (313)446-4821. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/BABAR SARWAR/Primary Examiner, Art Unit 3667