DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 1-21 are presented for examination.
Claims 1-21 are rejected.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 21 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
101 Analysis – Step 1
Claim 21 is directed to a computer-implemented method (i.e., a process).
Therefore, Claim 21 is within at least one of the four statutory categories.
101 Analysis – Step 2A, Prong I
Regarding Prong I of the Step 2A analysis in the 2019 PEG, the claims are to be analyzed to determine whether they recite subject matter that falls within one of the following groups of abstract ideas: a) mathematical concepts, b) certain methods of organizing human activity, and/or c) mental processes.
Independent claim 21 includes the recitations “…building, by a system, an archive storing states reachable by an agent, the system configured to receive sensor data from one or more sensors, the sensor data describing an environment of the agent, the agent being trained to start from one or more starting states and reach one or more final states, the building comprising repeatedly performing: selecting a state from the archive, reaching, by the agent, the selected state, determining, from the selected state, one or more explore states reachable from the selected state by performing one or more actions at the selected state, determining, for each explore state, whether the explore state is already stored in the archive, and responsive to determining that an explore state is not already stored in the archive, storing the explore state in the archive; training a model based on trajectories stored in the archive, the model configured to receive input sensor data and determine an action to be performed based on the input sensor data; providing the trained model to a target system comprising a new agent; and executing the trained model by the new agent of the target system, the executing comprising, repeating following steps to reach a final state: receiving sensor data describing environment of the new agent, providing the sensor data as input to the trained model, determining using the trained model, a next action to be performed, and executing by the new agent instructions for performing the next action.” These “building …, storing…, performing…, selecting…, determining…, receiving…, providing…, executing…” steps recite an abstract idea.
The examiner submits that the foregoing “building …, storing…, performing…, selecting…, determining…, receiving…, providing…, executing…” steps constitute a “mental process” because, under their broadest reasonable interpretation, the claim covers performance of the limitations in the human mind. For example, the “building …, storing…, performing…, selecting…, determining…, receiving…, providing…, executing…” steps in the context of the claim encompass a user mentally selecting a state, determining which states are reachable from the selected state, and determining a next action to be performed.
Accordingly, the claim recites at least one abstract idea.
101 Analysis – Step 2A, Prong II
Regarding Prong II of the Step 2A analysis in the 2019 PEG, the claims are to be analyzed to determine whether the claim, as a whole, integrates the abstract idea into a practical application. As noted in the 2019 PEG, it must be determined whether any additional elements in the claim beyond the abstract idea integrate the exception into a practical application in a manner that imposes a meaningful limit on the judicial exception. The courts have indicated that additional elements merely using a computer to implement an abstract idea, adding insignificant extra-solution activity, or generally linking use of a judicial exception to a particular technological environment or field of use do not integrate a judicial exception into a “practical application.”
In the present case, the additional limitations beyond the above-noted abstract idea are the “one or more sensors…, and… the agent…, the new agent…, the trained model…” elements recited in the “building …, storing…, performing…, selecting…, determining…, receiving…, providing…, executing…” steps.

For the following reason(s), the examiner submits that the above identified additional limitations do not integrate the above-noted abstract idea into a practical application.
Regarding the additional limitations of the “one or more sensors…, and… the agent…, the new agent…, the trained model…” elements, the examiner submits that these elements are recited at a high level of generality (i.e., as generic computer components) such that they amount to no more than mere instructions to apply the exception using generic computer components. The “one or more sensors…, and… the agent…, the new agent…, the trained model…” elements are also recited at a high level of generality as generic data gathering/receiving/displaying means, such that they amount to insignificant extra-solution activity applying the recited abstract idea(s) in the field of navigation.
Thus, taken alone, the additional elements do not integrate the abstract idea into a practical application. Further, looking at the additional limitations as an ordered combination or as a whole, the limitations add nothing that is not already present when looking at the elements taken individually. For instance, there is no indication that the additional elements, when considered as a whole, reflect an improvement in the functioning of a computer or an improvement to another technology or technical field, apply or use the above-noted judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition, implement/use the above-noted judicial exception with a particular machine or manufacture that is integral to the claim, effect a transformation or reduction of a particular article to a different state or thing, or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is not more than a drafting effort designed to monopolize the exception (MPEP § 2106.05). Accordingly, the additional limitations do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
101 Analysis – Step 2B
Regarding Step 2B of the Revised Guidance, independent claim 21 does not include additional elements (considered both individually and as an ordered combination) that are sufficient to amount to significantly more than the judicial exception, for the same reasons as those discussed above with respect to determining that the claim does not integrate the abstract idea into a practical application. As discussed above with respect to integration of the abstract idea into a practical application, the additional limitations of the “one or more sensors…, and… the agent…, the new agent…, the trained model…” elements are recited at a high level of generality (i.e., as generic computer components) such that they amount to no more than mere instructions to apply the exception using generic computer components. These elements are also recited as generic data gathering/receiving/displaying means, such that they amount to insignificant extra-solution activity applying the recited abstract idea(s) in the field of navigation.
As explained, the additional elements are recited at a high level of generality to simply implement the abstract idea and are not themselves being technologically improved. See, e.g., MPEP § 2106.05; Alice Corp. v. CLS Bank Int'l, 573 U.S. 208, 223 (2014) (“[T]he mere recitation of a generic computer cannot transform a patent-ineligible abstract idea into a patent-eligible invention”). Additionally, the transmission of data such as “providing the trained model to a target system comprising a new agent; and executing the trained model by the new agent of the target system…” is not only an abstract idea, but further constitutes insignificant extra-solution activity (see MPEP § 2106.05, for example: Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network)). See, e.g., MPEP § 2106.05; In re Bilski, 545 F.3d 943, 963 (Fed. Cir. 2008) (en banc), aff’d on other grounds, 561 U.S. 593 (2010) (characterizing data gathering steps as insignificant extra-solution activity); see also CyberSource, 654 F.3d at 1371–72 (noting that even if some physical steps are required to obtain information from a database (e.g., entering a query via a keyboard, clicking a mouse), such data-gathering steps cannot alone confer patentability); Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016) (selecting information for collection, analysis and display constitutes insignificant extra-solution activity). Hence, claim 21 is not patent eligible.
As such, claim 21 is rejected under 35 U.S.C. 101 as being drawn to an abstract idea without significantly more, and thus is ineligible.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1-21 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Ogale et al. (US Pub. No. 2019/0034794 A1; hereinafter “Ogale”).

          Consider claims 1, 20, and 21:
                   Ogale teaches a self-driving vehicle, a computer-implemented method (See Ogale, e.g., “Systems, methods, devices, and other techniques for training a trajectory planning neural network system to determine waypoints for trajectories of vehicles. A neural network training system can train the trajectory planning neural network system on the multiple training data sets…” of Abstract, ¶ [0010], ¶ [0023], ¶ [0027], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718) comprising: receiving, from one or more sensors on a self-driving vehicle, sensor data describing an environment of the self-driving vehicle (See Ogale, e.g., “obtaining, by a neural network training system, multiple training data sets…a first training input that characterizes a set of waypoints that represent respective locations of a vehicle at each of a series of first time steps, (ii) a second training input that characterizes at least one of (a) environmental data that represents a current state of an environment of the vehicle…” of ¶ [0010], ¶ [0023], ¶ [0027], ¶ [0031], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); determining a state of the environment based on the sensor data (See Ogale, e.g., “obtaining, by a neural network training system…environmental data that represents a current state of an environment of the vehicle…” of ¶ [0010], ¶ [0023], ¶ [0027], ¶ [0031], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); determining an action to be performed by the self-driving vehicle by applying a trained model to the state of the environment (See Ogale, e.g., “…processing the first training input and the second training input according to current values of parameters of the trajectory planning neural network system to generate a set of output scores…determining an output error using the target output and the set of output scores, and adjusting the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0112], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), the trained model including an archive storing states reachable by an agent in a training environment, each state stored in the archive is associated with a trajectory for reaching the state (See Ogale, e.g., “…the training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), the archive generated by performing operations comprising: selecting a state from the archive, reaching, by the agent, the selected state (See Ogale, e.g., “…for a given set of training data, the first training input can characterize waypoint data representing a set of locations traversed by a human-operated vehicle at a series of time steps from 1 through n−1. The training target output 708 for the set of training data can then characterize data representing the location actually traversed by the human-operated vehicle at time step n, i.e., the time step that immediately follows the last time step represented by the waypoint data of the first training input…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), determining, from the selected state, one or more explore states reachable from the selected state by performing one or more actions at the selected state (See Ogale, e.g., “…the training inputs 704 and/or 706, the training target output 708, or both, of one or more training data sets are derived from results of one or more virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), determining, for each explore state, whether the explore state is already stored in the archive (See Ogale, e.g., “…the training system can train the trajectory planning neural network on a collection of training data sets that include some sets derived from records of human-operated vehicles driven in a real-world environment and other sets derived from results of virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), and responsive to determining that an explore state is not already stored in the archive, storing the explore state in the archive (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604. Examples of driving scenarios that may be emphasized during a training session include lane merges, unprotected left turns, lane changes, impending collisions, and post-collision activity…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); and instructing the self-driving vehicle to operate in the environment according to the determined action (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).
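
                   For illustration only, the archive-generation operations recited in claim 1 (selecting a state, reaching it, determining explore states, and storing only those not already in the archive) can be sketched as follows. This is a minimal sketch and is not taken from the application or from Ogale; the one-dimensional toy environment, the names `step` and `build_archive`, and the uniform random selection are all assumptions.

```python
import random

# Hypothetical toy environment: states are the integers 0..9 and the
# two available actions move one position left or right.
ACTIONS = (-1, 1)

def step(state, action):
    """Deterministic transition, clamped to 0..9 (an assumption)."""
    return max(0, min(9, state + action))

def build_archive(start_state, iterations, rng):
    # The archive maps each reachable state to a trajectory, i.e. the
    # list of actions that reaches that state from the start state.
    archive = {start_state: []}
    for _ in range(iterations):
        # Select a state from the archive (uniformly at random here).
        selected = rng.choice(sorted(archive))
        # Reach the selected state by replaying its stored trajectory.
        state = start_state
        for action in archive[selected]:
            state = step(state, action)
        # Determine the explore states reachable by one action.
        for action in ACTIONS:
            explore = step(state, action)
            # Store the explore state only if it is not already stored.
            if explore not in archive:
                archive[explore] = archive[selected] + [action]
    return archive

archive = build_archive(start_state=0, iterations=200, rng=random.Random(0))
```

                   In this sketch the archive is a dictionary keyed by state, so the "already stored" determination is a single membership test.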

          Consider claim 2:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein the agent is another self-driving vehicle in a real-world environment (See Ogale, e.g., “…The environment 600 can include a neural network training system 602, a training data repository 604, and the trajectory planning neural network system 102. The neural network training system 602 can include one or more computers in one or more locations…” of ¶ [0010], ¶ [0023], ¶ [0027], ¶ [0031], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), wherein the agent has one or more sensors configured to detect the environment as sensor data (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 3:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein the agent is a simulated vehicle in a simulated real-world environment (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 4:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 3. In addition, Ogale teaches wherein a simulator is used for building the archive and training the model (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604. Examples of driving scenarios that may be emphasized during a training session include lane merges, unprotected left turns, lane changes, impending collisions, and post-collision activity…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), wherein the simulator is configured to act in a deterministic mode (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718) and a stochastic mode (See Ogale, e.g., “…the training system can train the trajectory planning neural network on a collection of training data sets that include some sets derived from records of human-operated vehicles driven in a real-world environment and other sets derived from results of virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), wherein the simulator is used in the deterministic mode for building the archive (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), and wherein the simulator is used in the stochastic mode for training the model (See Ogale, e.g., “…the training system can train the trajectory planning neural network on a collection of training data sets that include some sets derived from records of human-operated vehicles driven in a real-world environment and other sets derived from results of virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 5:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein determining the state of the environment based on the sensor data comprises transforming the sensor data and identifying the state corresponding to the transformed sensor data (See Ogale, e.g., “obtaining, by a neural network training system…environmental data that represents a current state of an environment of the vehicle…” of ¶ [0010], ¶ [0023], ¶ [0027], ¶ [0031], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 6:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 5. In addition, Ogale teaches wherein transforming the sensor data comprises extracting one or more features of the environment from the sensor data (See Ogale, e.g., “obtaining, by a neural network training system…environmental data that represents a current state of an environment of the vehicle…” of ¶ [0010], ¶ [0023], ¶ [0027], ¶ [0031], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 7:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein the training environment is a resettable environment (See Ogale, e.g., “obtaining, by a neural network training system…environmental data that represents a current state of an environment of the vehicle…” of ¶ [0010], ¶ [0023], ¶ [0027], ¶ [0031], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), wherein reaching the selected state by the agent comprises: resetting the state of the agent to the selected state (See Ogale, e.g., “…the training system can train the trajectory planning neural network on a collection of training data sets that include some sets derived from records of human-operated vehicles driven in a real-world environment and other sets derived from results of virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 8:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein the training environment is a deterministic environment (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), wherein reaching the selected state by the agent comprises: accessing a trajectory of the selected state from the archive (See Ogale, e.g., “…the training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); and replaying the trajectory of the selected state by a sequence of actions corresponding to the trajectory (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 9:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein the training environment is a stochastic environment (See Ogale, e.g., “…the training system can train the trajectory planning neural network on a collection of training data sets that include some sets derived from records of human-operated vehicles driven in a real-world environment and other sets derived from results of virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), wherein the model is a first model, wherein reaching the selected state by the agent comprises: training a second model to follow the trajectory of the selected state (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604. Examples of driving scenarios that may be emphasized during a training session include lane merges, unprotected left turns, lane changes, impending collisions, and post-collision activity…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); and executing the second model to follow the trajectory (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 10:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein the archive is generated by performing the operations over multiple iterations (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604. Examples of driving scenarios that may be emphasized during a training session include lane merges, unprotected left turns, lane changes, impending collisions, and post-collision activity…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 11:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein storing the explore state in the archive comprises: determining a trajectory for the explore state based on the trajectory for the selected state and one or more actions performed to reach the explore state (See Ogale, e.g., “…the training inputs 704 and/or 706, the training target output 708, or both, of one or more training data sets are derived from results of one or more virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); and storing the trajectory for the explore state in association with the explore state in the archive (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604. Examples of driving scenarios that may be emphasized during a training session include lane merges, unprotected left turns, lane changes, impending collisions, and post-collision activity…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 12:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein the operations further comprise: responsive to determining that an explore state is already stored in the archive, accessing a previously stored trajectory associated with the explore state from the archive (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604. Examples of driving scenarios that may be emphasized during a training session include lane merges, unprotected left turns, lane changes, impending collisions, and post-collision activity…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); determining a new trajectory for the explore state based on the trajectory for the selected state and one or more actions performed to reach the explore state from the selected state (See Ogale, e.g., “…the training system then adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); comparing the previously stored trajectory to the new trajectory (See Ogale, e.g., “…adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718); and responsive to determining that the new trajectory is shorter than the previously stored trajectory, replacing the previously stored trajectory with the new trajectory in the archive (See Ogale, e.g., “…adjusts the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).
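
                   For illustration only, the trajectory comparison and replacement recited in claim 12 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from Ogale; the function name `update_archive`, the dictionary representation of the archive, and the use of list length as the measure of "shorter" are hypothetical.

```python
def update_archive(archive, selected, actions, explore):
    """Store or improve the trajectory recorded for `explore`.

    `archive` maps a state to the list of actions that reaches it.
    Returns True when the archive entry for `explore` was written.
    """
    # New trajectory: the trajectory for the selected state followed by
    # the actions performed to reach the explore state from it.
    new_trajectory = archive[selected] + list(actions)
    # Access the previously stored trajectory, if there is one.
    previous = archive.get(explore)
    # Compare, and replace only when the new trajectory is shorter
    # (length in actions is an assumed measure).
    if previous is None or len(new_trajectory) < len(previous):
        archive[explore] = new_trajectory
        return True
    return False

example = {"A": [], "B": ["x", "y", "z"]}
replaced = update_archive(example, "A", ["w"], "B")
```

                   Here state "B" was previously reached in three actions; the one-action route through "A" is shorter, so it replaces the stored trajectory.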

          Consider claim 13:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein the model is trained based on trajectories of states stored in the archive (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604. Examples of driving scenarios that may be emphasized during a training session include lane merges, unprotected left turns, lane changes, impending collisions, and post-collision activity…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 14:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 13. In addition, Ogale teaches wherein the model is trained using an imitation learning based technique, the imitation based learning technique evaluating one or more demonstrations of trajectories from a start state to a final state (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 15:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 13. In addition, Ogale teaches wherein the model is a neural network (See Ogale, e.g., “Systems, methods, devices, and other techniques for training a trajectory planning neural network system to determine waypoints for trajectories of vehicles. A neural network training system can train the trajectory planning neural network system on the multiple training data sets…” of Abstract, ¶ [0010], ¶ [0023], ¶ [0027], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 16:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 1. In addition, Ogale teaches wherein each state stored in the archive is assigned an explorative score indicating a likelihood that the agent will discover a new state from that state (See Ogale, e.g., “…The training data sets can be sampled (e.g., selected) from a pool of candidate training data sets that have been made available to the training system, e.g., training data sets stored in training data repository 604…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718), wherein selecting the state is based on the explorative scores of one or more states in the archive (See Ogale, e.g., “…the training inputs 704 and/or 706, the training target output 708, or both, of one or more training data sets are derived from results of one or more virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 17:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 16. In addition, Ogale teaches wherein the explorative score for a state stored in the archive is based on a timestamp when the state was stored in the archive (See Ogale, e.g., “…the training inputs 704 and/or 706, the training target output 708, or both, of one or more training data sets are derived from results of one or more virtual vehicles driven in a simulated environment…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 18:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 16. In addition, Ogale teaches wherein the explorative score for a state stored in the archive is based on a size of the trajectory of the state (See Ogale, e.g., “…processing the first training input and the second training input according to current values of parameters of the trajectory planning neural network system to generate a set of output scores…adjusting the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).

          Consider claim 19:
                   Ogale teaches everything claimed as implemented above in the rejection of claim 16. In addition, Ogale teaches wherein the explorative score for a state stored in the archive is based on one or more features in the state (See Ogale, e.g., “…processing the first training input and the second training input according to current values of parameters of the trajectory planning neural network system to generate a set of output scores…adjusting the current values of the parameters of the trajectory planning neural network system using the output error…” of ¶ [0031], ¶ [0047], ¶ [0096]-¶ [0103], ¶ [0108]-¶ [0113], and Fig. 1 elements 100-128, Fig. 4 steps 402-412, Fig. 6 elements 602-604, and Fig. 7 steps 700-718).
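
For illustration only, the explorative score recited in claims 16-19 (a per-state score based on a storage timestamp, the size of the state's trajectory, and features in the state, used to select a state from the archive) may be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Ogale or from the applicant's specification: the `Entry` fields, the multiplicative scoring formula, and score-weighted random sampling are hypothetical choices made only to give the claim language a concrete form.

```python
import random
from dataclasses import dataclass
from typing import Dict, Hashable, List, Optional


@dataclass
class Entry:
    trajectory: List[str]  # actions leading to the state
    stored_at: float       # timestamp when the state was stored
    num_features: int      # hypothetical count of notable features in the state


def explorative_score(entry: Entry, now: float) -> float:
    """Hypothetical score: higher for recent, short-trajectory, feature-rich states."""
    age = max(now - entry.stored_at, 0.0)
    recency = 1.0 / (1.0 + age)                      # timestamp-based term
    brevity = 1.0 / (1.0 + len(entry.trajectory))    # trajectory-size term
    feature_bonus = 1.0 + entry.num_features         # feature-based term
    return recency * brevity * feature_bonus


def select_state(
    archive: Dict[Hashable, Entry],
    now: float,
    rng: Optional[random.Random] = None,
) -> Hashable:
    """Sample one archived state with probability proportional to its score."""
    rng = rng or random.Random()
    states = list(archive)
    weights = [explorative_score(archive[s], now) for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


archive = {
    "recent_short": Entry(trajectory=[], stored_at=10.0, num_features=0),
    "old_long": Entry(trajectory=["x", "y"], stored_at=0.0, num_features=0),
}
chosen = select_state(archive, now=10.0, rng=random.Random(0))
```

Any monotone combination of the three terms would serve equally well for illustration; the product form is used here only because it keeps every score strictly positive, which weighted sampling requires.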

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

          Ding et al. (US Pub. No.: 2021/0286360 A1) teaches “Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying high-priority agents in the vicinity of a vehicle and, for only those agents which are high priority agents, generating data characterizing the agents using a first prediction model. In a first aspect, a system identifies multiple agents in an environment in a vicinity of a vehicle. The system generates a respective importance score for each of the agents by processing a feature representation of each agent using an importance scoring model. The importance score for an agent characterizes an estimated impact of the agent on planning decisions generated by a planning system of the vehicle which plans a future trajectory of the vehicle. The system identifies, as high-priority agents, a proper subset of the plurality of agents with the highest importance scores.”

          Muller et al. (US Pub. No.: 2019/0384303 A1) teaches “In various examples, a machine learning model—such as a deep neural network (DNN)—may be trained to use image data and/or other sensor data as inputs to generate two-dimensional or three-dimensional trajectory points in world space, a vehicle orientation, and/or a vehicle state. For example, sensor data that represents orientation, steering information, and/or speed of a vehicle may be collected and used to automatically generate a trajectory for use as ground truth data for training the DNN. Once deployed, the trajectory points, the vehicle orientation, and/or the vehicle state may be used by a control component (e.g., a vehicle controller) for controlling the vehicle through a physical environment. For example, the control component may use these outputs of the DNN to determine a control profile (e.g., steering, decelerating, and/or accelerating) specific to the vehicle for controlling the vehicle through the physical environment.”

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BABAR SARWAR whose telephone number is (571)270-5584.  The examiner can normally be reached on Mon-Fri 9:00 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Faris S. Almatrahi, can be reached on (313)446-4821.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BABAR SARWAR/
Primary Examiner, Art Unit 3667