DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 13 and 22 are objected to because of the following informalities:  The claims depend from the same independent claim (claim 15) yet recite the same subject matter.  It appears that claim 13 should depend from claim 10.  Appropriate correction is required.



Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 5 and 19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claims 5 and 19 recite the limitation "the target trajectory".  There is insufficient antecedent basis for this limitation in the claims.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claim(s) 1-3, 8, 10, 13, 15-17, and 22 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Choi US 2018/0124423.

Regarding claims 1 and 10, Choi discloses a computer-implemented method to generate a motion planning cost function for an autonomous driving vehicle (ADV) (claim 1), and a non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations (claim 10), comprising:


collecting information for a driving environment surrounding the ADV using a plurality of sensors of the ADV (in at least paragraph [0056], sensor interface collects data from one or more sensors); 

generating a plurality of sample trajectories from a trajectory sample space for the driving environment (in at least paragraph [0020], block 202 generates a diverse set of prediction samples to provide accurate prediction of future trajectories 106); 



ranking the sample trajectories based on the determined rewards (in at least paragraphs [0022] and [0035], wherein block 204 ranks the samples using an RNN that is augmented with a fusion layer that incorporates interaction between agents and block 204 uses training in a multitask learning framework where the ranking objective is formulated using inverse optimal control (also known in the art as inverse reinforcement learning));

determining a highest ranked trajectory based on the ranking (in at least paragraph [0057], wherein the most likely future trajectories for agents are predicted based on the ranking and refinement module's determination of a set of predictions, and further claim 1, when a trajectory meets a predetermined condition (highest ranked)); and

selecting the highest ranked trajectory to control the ADV autonomously according to the highest ranked trajectory (in at least paragraph [0058], wherein a response module provides manual or automated actions responsive to the determined trajectories matching certain conditions).  
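
For illustration only, the ranking, determining, and selecting steps mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not a characterization of Choi or of the claimed invention: the trajectory representation (a list of speeds), the function names, and the toy reward are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: a trajectory is a sequence of speeds in m/s.
Trajectory = Sequence[float]


def rank_trajectories(
    samples: List[Trajectory],
    reward_fn: Callable[[Trajectory], float],
) -> List[Tuple[float, Trajectory]]:
    """Score every sample and return (reward, trajectory) pairs, best first."""
    scored = [(reward_fn(traj), traj) for traj in samples]
    # Sort on the reward only, so tied samples keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def select_highest_ranked(ranked: List[Tuple[float, Trajectory]]) -> Trajectory:
    """Return the trajectory at the top of the ranking."""
    return ranked[0][1]


def toy_reward(traj: Trajectory) -> float:
    """Hypothetical reward: penalize the total speed change between samples."""
    return -sum(abs(b - a) for a, b in zip(traj, traj[1:]))


samples = [[10.0, 12.0, 15.0], [10.0, 10.5, 11.0], [10.0, 14.0, 9.0]]
ranked = rank_trajectories(samples, toy_reward)
best = select_highest_ranked(ranked)
print(best)
```

In this sketch the reward function is passed in as a parameter, so a learned reward model could be substituted for the toy reward without changing the ranking or selection logic.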

Regarding claim 15, Choi discloses a computer-implemented method to train a rewards model for an autonomous driving vehicle (ADV), the method comprising: 



generating a plurality of sample trajectories from a trajectory sample space for a driving environment of the target trajectory (in at least paragraph [0020], block 202 generates a diverse set of prediction samples to provide accurate prediction of future trajectories 106); and 

generating a reward model by applying a rank based conditional inverse reinforcement learning algorithm to the sample trajectories and the target trajectory (in at least paragraphs [0022] and [0035], wherein block 204 ranks the samples using an RNN that is augmented with a fusion layer that incorporates interaction between agents and block 204 uses training in a multitask learning framework where the ranking objective is formulated using inverse optimal control (also known in the art as inverse reinforcement learning)).


Regarding claims 2 and 16, Choi discloses the limitations of claims 1 and 15 as shown above.  Choi further discloses the method, wherein the reward model comprises a machine learning model comprising a multi-layer perceptron neural network model (in at least paragraphs [0020]-[0022], wherein an RNN is augmented with a fusion layer that incorporates interaction between agents and a convolutional neural network, and further wherein it is old and well known that a perceptron is a single layer of a neural network, and therefore, since a plurality of neural networks are used, multiple layers of perceptrons are used).
  

Regarding claims 3 and 17, Choi discloses the limitations of claims 2 and 16 as shown above.  Choi further discloses the method, wherein the multi-layer perceptron neural network model includes an output layer to output a trajectory cost value (in at least paragraph [0043], wherein the RNN decoder outputs to a scoring block which scores the samples and tracks accumulated rewards).
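
For illustration only, a multi-layer perceptron whose output layer produces a single trajectory cost value can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the network of Choi or of the application: the layer sizes, weights, feature values, and function names are all hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def dense(inputs: Vector, weights: Matrix, biases: Vector) -> Vector:
    """One fully connected layer: each output is a weighted sum plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values: Vector) -> Vector:
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def trajectory_cost(
    features: Vector,
    w_hidden: Matrix,
    b_hidden: Vector,
    w_out: Matrix,
    b_out: Vector,
) -> float:
    """Hidden layer followed by an output layer with a single cost unit."""
    hidden = relu(dense(features, w_hidden, b_hidden))
    return dense(hidden, w_out, b_out)[0]


# Hypothetical weights and features, chosen only to make the arithmetic easy.
w_hidden = [[1.0, 0.0], [0.0, -1.0]]
b_hidden = [0.0, 0.0]
w_out = [[2.0, 3.0]]
b_out = [1.0]
cost = trajectory_cost([0.5, 0.25], w_hidden, b_hidden, w_out, b_out)
print(cost)
```

With these values the hidden layer yields [0.5, 0.0] after the activation, so the output layer returns 2.0 * 0.5 + 3.0 * 0.0 + 1.0 = 2.0.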


Regarding claims 8, 13 and 22, Choi discloses the limitations of claims 1 and 15 as shown above.  Choi further discloses the method, further comprising determining a plurality of features for each of the sample trajectories, and wherein the reward for each of the sample trajectories is determined based on the plurality of features (in at least paragraph [0015], scene context and trajectories derived from image-based features, and further paragraphs [0018] and [0022]).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 4-5 and 18-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Choi in view of Dey US 2010/0106603.

	Regarding claims 4 and 18, Choi discloses the limitations of claims 1 and 15 as shown above.  Choi further discloses the reward model comprises a model based on a combination of features for the driving environment (in at least paragraph [0015] and scene context and trajectories derived from image-

Regarding claims 5 and 19, the combination of Choi and Dey teaches the limitations of claims 4 and 18 as shown above.  Choi fails to explicitly disclose, however Dey teaches, the method, wherein the features comprise: acceleration, jerk, and velocity of the sample trajectory or the target trajectory, smoothness of roadway, or a distance from the sample trajectory or the target trajectory to surrounding obstacles observed on the roadway (in at least paragraphs [0008]-[0010], [0051]-[0052] and [0083]-[0087], wherein the speed is used as a feature for the sample or target trajectory).  Although Dey does not explicitly teach acceleration and jerk, a person of ordinary skill in the art at the time of the invention would easily ascertain these values, since speed is provided as a distance over time (mph), acceleration is the change in speed per unit time, and jerk is the change in acceleration per unit time.  Therefore, given a distance per unit time, each value would be obvious to a person of ordinary skill in the art.  It would have been obvious to a person of ordinary skill in the art at the time of the invention to provide the motion planning cost function as disclosed by Choi with the linear function and plurality of features as taught by Dey in order to accurately predict a decision making behavior.
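
For illustration only, the relationship relied upon above (acceleration as the change in speed per unit time, and jerk as the change in acceleration per unit time) can be sketched with finite differences. This is a minimal sketch under stated assumptions and is not taken from Dey or Choi: the speed samples, the one-second sampling interval, and the function name are hypothetical.

```python
from typing import List


def finite_difference(values: List[float], dt: float) -> List[float]:
    """Approximate the time derivative of evenly spaced samples."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [(b - a) / dt for a, b in zip(values, values[1:])]


# Hypothetical speed samples in m/s, taken one second apart.
speeds = [0.0, 2.0, 6.0, 12.0]
dt = 1.0

acceleration = finite_difference(speeds, dt)   # change in speed per second
jerk = finite_difference(acceleration, dt)     # change in acceleration per second

print(acceleration)  # [2.0, 4.0, 6.0]
print(jerk)          # [2.0, 2.0]
```

Each differencing step shortens the series by one sample, so at least three speed samples are needed to obtain one jerk value.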



Claims 6-7, 11-12, and 20-21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Choi in view of Duan “Learning State Representations for Robotic Control”.


Regarding claims 6, 11 and 20, Choi discloses the limitations of claims 1, 10 and 15, as shown above.  Choi further discloses wherein the reward model is generated by: generating a I_t, I_{t+1}), both branches of the Siamese network first extract a state representation pair (x_t, x_{t+1}). From the state pair, subsequent layers have to predict the action that will cause the transition from x_t to x_{t+1}. Clearly, the Siamese network is modeling an inverse dynamics of the poking task. Given a dynamic system, an inverse model predicts the action that will bring the current system state into the desired state. In optimal control settings, inverse models are commonly used as feedforward controllers [39]. It is primarily useful in trajectory type of tasks).  It would have been obvious to a person of ordinary skill in the art at the time of the invention to provide the motion planning cost function as disclosed by Choi with the Siamese network as taught by Duan in order to extract representations without reconstructing image data.


	Regarding claims 7, 12, and 21, the combination of Choi and Duan teaches the limitations of claims 6, 11 and 20.  Choi further discloses wherein the expert trajectory is generated based on a .  


	
Claims 9, 14, and 23 is/are rejected under 35 U.S.C. 103 as being unpatentable over Choi in view of Uchibe US 2017/0213151.

Regarding claims 9, 14, and 23, Choi discloses the limitations of claims 1, 11 and 15, as shown above.  Choi further discloses the plurality of sample trajectories is generated based on information for a driving environment of the ADV (in at least paragraph [0020], block 202 generates a diverse set of prediction samples to provide accurate prediction of future trajectories 106).  Choi fails to explicitly disclose, however Uchibe teaches, the method, wherein the plurality of sample trajectories is generated uniformly based on information for a driving environment of the ADV (in at least paragraphs [0159]-[0160], uniform sampling method over the entire state space).  It would have been obvious to a person of ordinary skill in the art at the time of the invention to provide the motion planning cost function as disclosed by Choi with the uniform distribution as taught by Uchibe in order to accurately determine samples based on a uniform distribution to increase the ability of a decision to be made based on cost/rewards from observed behaviors.
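
For illustration only, uniform sampling over a bounded state space can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the sampling method of Uchibe: the two state dimensions (lateral offset and speed), their bounds, the seed, and the function name are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def sample_uniform_states(
    bounds: Bounds, count: int, seed: int = 0
) -> List[Tuple[float, ...]]:
    """Draw `count` states, each coordinate uniform within its own bounds."""
    rng = random.Random(seed)  # seeded so the draw is repeatable
    return [
        tuple(rng.uniform(low, high) for low, high in bounds)
        for _ in range(count)
    ]


# Hypothetical state space: lateral offset in meters, speed in m/s.
bounds = [(-1.75, 1.75), (0.0, 30.0)]
states = sample_uniform_states(bounds, count=5)

for state in states:
    print(state)
```

Because every coordinate is drawn independently and uniformly, no region of the bounded space is favored over another, which is the property the uniform sampling discussion above relies on.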

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

- US10831208 discloses a computing system that can be programmed to determine a vehicle action based on vehicle sensor data input to a deep neural network (DNN) trained using an inverse reinforcement learning (IRL) system that includes a variational auto-encoder (VAE).  The computing system can be further programmed to operate a vehicle based on the vehicle action.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS K WILTEY whose telephone number is (571)272-7193.  The examiner can normally be reached on M-F 7-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John Olszewski can be reached on (571)272-2706.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/NICHOLAS K WILTEY/Primary Examiner, Art Unit 3669