DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Claims 1-20 are currently pending and have been examined in this application. This communication is the first action on the merits (FAOM). 

Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 06/15/2022. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because:
reference character 230 has been used to designate both Motion Forecast Data and Respective Trajectories.  
reference character 324 has been used to designate both Camera Image and Continuous Fusion.  
reference character 185 has been used to designate interaction system (mentioned in Fig. 1), prediction system (mentioned in Fig. 2), and third party trajectory system (mentioned in Fig. 6).
Reference character 715 is used to designate a “trajectory forecasting unit” in Fig. 7, but it is called “trajectory/behavior forecasting unit(s)” in the specification (¶ 150). 
Reference character 725 is used to designate an “object detection unit” in Fig. 7, but it is called “machine-learned object recognition/detection model application unit(s)” in the specification (¶ 150). 
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: 349 found in Figure 3B.  
Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.

The disclosure is objected to because of the following informalities:
In paragraph [0130], “network(s) 640” should read “network(s) 645”
In paragraph [0144], “model trainer 785” should read “model trainer 685”
In paragraph [0145], “model trainer 680” should read “model trainer 685”
The equations shown in paragraphs [0035-0036], [0038-0040], [0051-0052], [0091], [0093-0095], [0099], [0102], [0109], and [0112-0115] are blurry, making it difficult to discern the various symbols and variables.
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are:
an interaction transformer model… in claims 1-2, 7, 11, and 16; no structure could be explicitly found in the instant specification.
a prediction model… in claim 1; no structure could be explicitly found in the instant specification.
a recurrent model… in claims 7 and 16; no structure could be explicitly found in the instant specification.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



In claims 1-2, 7, 11, and 16, the limitations “interaction transformer model…”, “prediction model…”, and “recurrent model…” invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The disclosure is devoid of any structure that performs the functions in the claims. Rather, the disclosure repeats the titles “interaction transformer model”, “prediction model”, and “recurrent model” and describes their functions, but the disclosure does not clearly explain the nature of these items. Therefore, claims 1-2, 7, 11, and 16 along with the corresponding dependent claims 3-9, 12, and 14 are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claims 1-2, 7, 11, and 16 along with the corresponding dependent claims 3-9, 12, and 14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. It is unclear what performs or executes these models. The Examiner interprets each of these models as a generic processor that executes the functions recited in the claims. As best understood by the Examiner, the structure for the limitations “interaction transformer model…”, “prediction model…”, and “recurrent model…” is a processor, which may be found respectively at least in paragraphs [0087], [0082], and [0093] of the instant specification. Appropriate correction is required.
Additionally, dependent claims 3-9, 12, and 14 are rejected as being dependent on the previously rejected base claims and for failing to cure the deficiencies listed above. Specifically, the models are recited only in claims 1-2, 7, 11, and 16, and claims 3-9, 12, and 14 are dependent on the previously rejected claims. Appropriate correction and/or clarification is required to remedy the above deficiencies.

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-2, 7, 11, and 16 along with the corresponding dependent claims 3-9, 12, and 14 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
As described above, the disclosure does not provide adequate structure to perform the claimed functions of the models. The specification does not demonstrate that applicant has made an invention that achieves the claimed function because the invention is not described with sufficient detail such that one of ordinary skill in the art can reasonably conclude that the inventor had possession of the claimed invention. As best understood by the Examiner, the structure for the limitations “interaction transformer model…”, “prediction model…”, and “recurrent model…” has been assumed to correspond to a processor, and they may be found respectively at least in paragraphs [0087], [0082], and [0093] of the instant specification. Correction is required.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-6, 8-15, and 17 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Millard (US 11,016,481).

With respect to claim 1, Millard discloses a computing system (ABSTRACT, Fig. 2, Computing System 200), comprising: 
an interaction transformer model (Fig. 2, Mapping Engine 214) configured to receive a relative location embedding that describes relative respective locations of a plurality of actors with respect to an autonomous vehicle (Fig. 2, Sensor Data 212, Col. 6, lines 4-30, “The mapping engine 214 is operable to receive sensor data 212 from the sensor subsystems 204 and to generate a virtual representation of at least a portion of the robot's environment from the sensor data 212.”), and in response to receipt of the relative location embedding, generate motion forecast data with respect to the plurality of actors (Fig. 2, Occupancy Maps 216, Col. 7, line 66 - Col. 8, line 4, “Whereas the occupancy maps 216 that are provided as output from mapping engine 214 indicate current or recent locations of obstacles that were observed by the robot 202, the obstacle location prediction data 226 indicates predictions of where the obstacles will be located at one or more future times.”); 
a prediction model (Fig. 2, Obstacle Location Prediction Engine 218, Col. 8, lines 45-47, “The obstacle location prediction engine 218 can include object recognizer 220 and predictive model 222.”) configured to receive the motion forecast data (Col. 8, line 66 - Col. 9, line 4, “The obstacle location prediction engine 218 receives a series of occupancy maps 216 from the mapping engine 214. Each occupancy map 216 indicates locations of obstacles in the robot's environment at a different one of a series of time steps (e.g., a current time step and one or more preceding time steps).”), and in response to receipt of the motion forecast data, generate respective trajectories for the plurality of actors for a current time step and respective projected trajectories for a subsequent time step (Fig. 2, Obstacle Prediction Data 226, Col. 7, line 63 - Col. 8, line 4, “The obstacle location prediction engine 218 processes occupancy maps 216 from mapping engine 214 and outputs obstacle location prediction data 226. Whereas the occupancy maps 216 that are provided as output from mapping engine 214 indicate current or recent locations of obstacles that were observed by the robot 202, the obstacle location prediction data 226 indicates predictions of where the obstacles will be located at one or more future times.” and Col. 8, lines 18-22, “Obstacle location prediction data 226 indicates, for each obstacle represented in an occupancy map 216 at a current or recent time, a probability distribution of the object being located at one or more locations in the environment at one or more future times.”); 
a memory that stores a set of instructions (Fig. 6, Memory 620); and 
one or more processors (Fig. 6, Processor 610) which use the set of instructions to: 
for a first iteration corresponding with a first time step: 
generate first motion forecast data for a plurality of actors at the first time step using the interaction transformer model based on a first relative location embedding (Col. 9, lines 7-11, “The robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step. The mapping engine 214 may then process each iteration of sensor data 212 to generate an occupancy map 216 for a next time step.”); 
input the first motion forecast data into the prediction model (Col. 7, lines 63-65, “the obstacle location prediction engine 218 processes occupancy maps 216 from mapping engine 214 ...”); and 
receive, as an output of the prediction model (Col. 7, lines 65-66, “… and outputs obstacle location prediction data 226”), first respective trajectories of the plurality of actors for the first time step and respective projected trajectories for a second time step that is after the first time step (Col. 9, lines 11-16, “The obstacle location prediction engine 218 processes occupancy maps 216 from a set of the most recent time steps (e.g., a predetermined number n of the most recent time steps) to generate obstacle location predictions for one or more future time steps.”); and 
for a second iteration corresponding with the second time step: 
generate a second relative location embedding for the second time step based on the respective projected trajectories for the second time step from the first iteration (Col. 9, lines 49-61, “the predictive model 222 may process preceding obstacle location predictions (e.g., predicted occupancy maps from preceding time steps) to generate obstacle location predictions for time steps further in the future. For example, the predictive model may initially process occupancy maps 216 from time steps n−10 through n, where n is the current time or most recent time step, one at a time and in sequence to generate a predicted occupancy map for future time step n+1. The predicted occupancy map for future time step n+1 may then be provided as input to the predictive model 222 to generate the next predicted occupancy map for future time step n+2.”); 
analyze the second relative location embedding using the interaction transformer model to produce (Col. 7, lines 55-56, “the mapping engine 214 analyzes the sensor data 212”), as an output of the interaction transformer model, second motion forecast data for the plurality of actors at the second time step (Col. 9, lines 9-11, “The mapping engine 214 may then process each iteration of sensor data 212 to generate an occupancy map 216 for a next time step.”); and 
determine second respective trajectories of the plurality of actors for the second time step using the prediction model based on the second motion forecast data (Col. 9, lines 49-61, “the predictive model 222 may process preceding obstacle location predictions (e.g., predicted occupancy maps from preceding time steps) to generate obstacle location predictions for time steps further in the future. For example, the predictive model may initially process occupancy maps 216 from time steps n−10 through n, where n is the current time or most recent time step, one at a time and in sequence to generate a predicted occupancy map for future time step n+1. The predicted occupancy map for future time step n+1 may then be provided as input to the predictive model 222 to generate the next predicted occupancy map for future time step n+2.” and Col. 10, lines 26-28, “The obstacle location predictions can be provided in obstacle location prediction data 226.”).
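
For illustration only, the feedback loop described in the cited passage of Millard (Col. 9, lines 49-61), in which the predicted occupancy map for time step n+1 is provided back as input to obtain the map for n+2, can be sketched as follows. This is a minimal sketch under stated assumptions and is not Millard's implementation: the names `rollout` and `shift_right_model`, the list-of-lists grid encoding, and the toy "shift one cell" rule are hypothetical and were chosen only to make the iteration concrete.

```python
from typing import Callable, List, Sequence

# Assumed encoding: a grid of cells, 1 = occupied by an obstacle, 0 = free.
OccupancyMap = List[List[int]]
Model = Callable[[Sequence[OccupancyMap]], OccupancyMap]


def shift_right_model(history: Sequence[OccupancyMap]) -> OccupancyMap:
    """Hypothetical stand-in for a predictive model.

    It looks only at the most recent map and moves every obstacle one
    cell to the right; a learned model would use the whole history.
    """
    latest = history[-1]
    return [[0] + row[:-1] for row in latest]


def rollout(history: Sequence[OccupancyMap], model: Model, horizon: int) -> List[OccupancyMap]:
    """Predict `horizon` future maps, feeding each prediction back as input."""
    maps = list(history)            # observed maps up to the current time step n
    predictions: List[OccupancyMap] = []
    for _ in range(horizon):
        predicted = model(maps)     # map for the next future time step
        predictions.append(predicted)
        maps.append(predicted)      # the prediction becomes input for the following step
    return predictions


observed = [[[1, 0, 0, 0]]]         # one observed 1x4 map with an obstacle in cell 0
future = rollout(observed, shift_right_model, horizon=2)
```

With the toy model above, the first predicted map places the obstacle in the second cell and the second predicted map, computed from the first prediction, places it in the third cell.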

With respect to claim 2, Millard discloses the computing system of claim 1, wherein: the interaction transformer model (Fig. 2, Mapping Engine 214) is configured to generate the motion forecast data (Fig. 2, Occupancy Maps 216) by receiving the relative location embedding that describes the relative respective locations of the plurality of actors with respect to the autonomous vehicle (Fig. 2, Sensor Data 212) and by further receiving an input sequence (Col. 9, lines 7-9, “the robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step” and Fig. 4, Stage 402, Col. 13, lines 40-47) describing object detection data (Col. 6, line 33 - Col. 7, line 32), and in response to receipt of the relative location embedding and the input sequence, generate the motion forecast data with respect to the plurality of actors (Fig. 3, Occupancy Map 300, Col. 7, lines 34-51, i.e., “Black-shaded cells indicate that the corresponding portion of the environment represented by the cell is occupied by an obstacle (e.g., static objects). The white-shaded cells indicate that the corresponding portion of the environment represented by the cell is not occupied by an obstacle (e.g., dynamic objects).” and Col. 9, lines 1-11, “Each occupancy map 216 indicates locations of obstacles in the robot's environment at a different one of a series of time steps (e.g., a current time step and one or more preceding time steps). The occupancy maps 216 can be based on observations of the robot's environment at times corresponding to the series of time steps. For example, the robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step. The mapping engine 214 may then process each iteration of sensor data 212 to generate an occupancy map 216 for a next time step.”); 
the first motion forecast data is based on the first relative location embedding and a first input sequence corresponding with the first time step (Fig. 4, Stage 404, Col. 13, lines 51-55, “For each time step in the series of time steps where the robot provided an updated instance of sensor data, the mapping engine determines an occupancy map for the time step that indicates the current locations of obstacles in the environment at that time step.”); and 
the second motion forecast data is produced by using the interaction transformer model to analyze the second relative location embedding and a second input sequence corresponding with the second time step (Fig. 4, Stage 404, Col. 13, lines 51-55, “For each time step in the series of time steps where the robot provided an updated instance of sensor data, the mapping engine determines an occupancy map for the time step that indicates the current locations of obstacles in the environment at that time step.”).

With respect to claim 3, Millard discloses the computing system of claim 2, wherein the second input sequence is based at least in part on the first motion forecast data (Col. 9, lines 53-58, “The predictive model may initially process occupancy maps 216 from time steps n−10 through n, where n is the current time or most recent time step, one at a time and in sequence to generate a predicted occupancy map for future time step n+1.”).

With respect to claim 4, Millard discloses the computing system of claim 1, wherein the instructions further comprise: generate respective trajectory sequences for the plurality of actors (Fig. 4, Stage 406, Col. 14, lines 8-12, “An obstacle location prediction engine provides a sequence of occupancy maps for a series of time steps to a predictive model. The predictive model processes the sequence of occupancy maps one at a time and then outputs an occupancy prediction for a future time step.”), the respective trajectory sequences comprising the first respective trajectories of the plurality of actors for the first time step (Fig. 4, Stage 406, Col. 14, lines 23-25, “A series of occupancy maps up to a current time step based on sensor data can be processed to generate a first predicted occupancy map for a first future time step.”) and the second respective trajectories of the plurality of actors for the second time step (Fig. 4, Stage 406, Col. 14, lines 25-28, “An input based on the first predicted occupancy map can then be processed by the predictive model to generate a second predicted occupancy map for a second future time step.”).

With respect to claim 5, Millard discloses the computing system of claim 2, wherein the second input sequence further describes one or more features for each of the plurality of actors respectively at the first time step or the second time step, the one or more features comprising at least one of: 
a derivative of the location of the respective actor relative to the autonomous vehicle; 
a size of the respective actor (Col. 10, line 45 - Col. 11, line 12, i.e., “Motion models for dynamic obstacles may specify travel characteristics for the obstacle, such as expected velocities, accelerations, trajectories, directional changes, or a combination of these and/or other characteristics. the obstacle location prediction engine 218 may provide the motion model with information indicating a current location of the obstacle, a recent location of the obstacle at one or more preceding time steps, a current direction that the obstacle is moving or facing/oriented, a recent direction that the obstacle was moving or facing/oriented at one or more preceding time steps, an indication of whether the obstacle is traveling with other obstacles (e.g., whether a person riding a bicycle and a person walking are traveling together and are unlikely to separate, or whether the person riding the bicycle and the person walking are independent), a pose or posture of the obstacle, an operational status of the obstacle, or a combination of values for all or some of these parameters and/or other parameters.”); 
an orientation of the respective actor relative to the autonomous vehicle (Col. 10, line 45 - Col. 11, line 12, i.e., “Motion models for dynamic obstacles may specify travel characteristics for the obstacle, such as expected velocities, accelerations, trajectories, directional changes, or a combination of these and/or other characteristics. the obstacle location prediction engine 218 may provide the motion model with information indicating a current location of the obstacle, a recent location of the obstacle at one or more preceding time steps, a current direction that the obstacle is moving or facing/oriented, a recent direction that the obstacle was moving or facing/oriented at one or more preceding time steps, an indication of whether the obstacle is traveling with other obstacles (e.g., whether a person riding a bicycle and a person walking are traveling together and are unlikely to separate, or whether the person riding the bicycle and the person walking are independent), a pose or posture of the obstacle, an operational status of the obstacle, or a combination of values for all or some of these parameters and/or other parameters.”); or 
a center location of the respective actor.

With respect to claim 6, Millard discloses the computing system of claim 1, wherein: 
the first relative location embedding describes the respective locations of the plurality of actors with respect to the autonomous vehicle at the first time step (Col. 6, lines 4-30, “The mapping engine 214 is operable to receive sensor data 212 from the sensor subsystems 204 and to generate a virtual representation of at least a portion of the robot's environment from the sensor data 212.” and Col. 9, lines 7-9, “the robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step.”); and 
the second relative location embedding describes the respective locations of the plurality of actors with respect to the autonomous vehicle at the second time step (Col. 6, lines 4-30, “The mapping engine 214 is operable to receive sensor data 212 from the sensor subsystems 204 and to generate a virtual representation of at least a portion of the robot's environment from the sensor data 212.” and Col. 9, lines 7-9, “The robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step.”).

With respect to claim 8, Millard discloses the computing system of claim 1, wherein at least one of the interaction transformer model or the prediction model comprises one or more neural networks (Col. 8, lines 51-52, “The predictive model 222 can be a machine-learning model such as a deep neural network.”).

With respect to claim 9, Millard discloses the computing system of claim 1, wherein the prediction model comprises one or more multi-layer perceptrons (Col. 8, lines 55-65, “Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, e.g., the next hidden layer or the output layer. Each layer of the network generates an output from a received input in accordance with current values of a respective set of parameters. The predictive model 222 includes one or more convolutional layers followed by one or more recurrent layers. The recurrent layers are followed by transpose convolutional layers mirroring the convolutional layers at the input.”).

With respect to claim 10, Millard discloses a computer-implemented method (Fig. 6), the method comprising: 
for a first iteration corresponding with a first time step: 
inputting, by a computing system comprising one or more computing devices (Fig. 6, computing device 600), a first relative location embedding that describes relative respective locations of a plurality of actors with respect to an autonomous vehicle into an interaction transformer model that is configured to receive a relative location embedding (Col. 6, lines 4-30, “The mapping engine 214 is operable to receive sensor data 212 from the sensor subsystems 204 and to generate a virtual representation of at least a portion of the robot's environment from the sensor data 212.” and Col. 9, lines 7-9, “the robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step.”), and in response to receipt of the relative location embedding, generate motion forecast data with respect to the plurality of actors (Col. 9, lines 9-11, “The mapping engine 214 may then process each iteration of sensor data 212 to generate an occupancy map 216 for a next time step.”); 
receiving, by the computing system and as an output of the interaction transformer model, first motion forecast data for a first plurality of actors at the first time step (Col. 7, line 66 - Col. 8, line 2, “The occupancy maps 216 that are provided as output from mapping engine 214 indicate current or recent locations of obstacles that were observed by the robot 202.”); 
inputting, by the computing system, the first motion forecast data into a prediction model, the prediction model (Col. 8, lines 45-47, “The obstacle location prediction engine 218 can include object recognizer 220 and predictive model 222.”) configured to receive the motion forecast data (Col. 8, line 66 - Col. 9, line 4, “the obstacle location prediction engine 218 receives a series of occupancy maps 216 from the mapping engine 214. Each occupancy map 216 indicates locations of obstacles in the robot's environment at a different one of a series of time steps (e.g., a current time step and one or more preceding time steps).”), and in response to receipt of the motion forecast data, generate respective trajectories for the plurality of actors for a current time step and respective projected trajectories for a subsequent time step (Col. 7, line 63 - Col. 8, line 4, “The obstacle location prediction engine 218 processes occupancy maps 216 from mapping engine 214 and outputs obstacle location prediction data 226. Whereas the occupancy maps 216 that are provided as output from mapping engine 214 indicate current or recent locations of obstacles that were observed by the robot 202, the obstacle location prediction data 226 indicates predictions of where the obstacles will be located at one or more future times.” and Col. 8, lines 18-22, “Obstacle location prediction data 226 indicates, for each obstacle represented in an occupancy map 216 at a current or recent time, a probability distribution of the object being located at one or more locations in the environment at one or more future times.”); and 
receiving, by the computing system and as an output of the prediction model (Col. 7, lines 63-66, “The obstacle location prediction engine 218 processes occupancy maps 216 from mapping engine 214 and outputs obstacle location prediction data 226.”), first respective trajectories of the plurality of actors for the first time step and respective projected trajectories for a second time step that is after the first time step (Col. 9, lines 11-16, “The obstacle location prediction engine 218 processes occupancy maps 216 from a set of the most recent time steps (e.g., a predetermined number n of the most recent time steps) to generate obstacle location predictions for one or more future time steps.”); and 
for a second iteration corresponding with the second time step: 
generating, by the computing system, a second relative location embedding for the second time step based on the respective projected trajectories for the second time step from the first iteration (Col. 9, lines 49-61, “the predictive model 222 may process preceding obstacle location predictions (e.g., predicted occupancy maps from preceding time steps) to generate obstacle location predictions for time steps further in the future. For example, the predictive model may initially process occupancy maps 216 from time steps n−10 through n, where n is the current time or most recent time step, one at a time and in sequence to generate a predicted occupancy map for future time step n+1. The predicted occupancy map for future time step n+1 may then be provided as input to the predictive model 222 to generate the next predicted occupancy map for future time step n+2.”); 
analyzing, by the computing system using the interaction transformer model, the second relative location embedding to produce (Col. 7, lines 55-56, “the mapping engine 214 analyzes the sensor data 212”), as an output of the interaction transformer model, second motion forecast data for the plurality of actors at the second time step (Col. 9, lines 9-11, “The mapping engine 214 may then process each iteration of sensor data 212 to generate an occupancy map 216 for a next time step.”); and 
determining, by the computing system using the prediction model, second respective trajectories of the plurality of actors for the second time step using the prediction model based on the second motion forecast data (Col. 9, lines 49-61, “the predictive model 222 may process preceding obstacle location predictions (e.g., predicted occupancy maps from preceding time steps) to generate obstacle location predictions for time steps further in the future. For example, the predictive model may initially process occupancy maps 216 from time steps n−10 through n, where n is the current time or most recent time step, one at a time and in sequence to generate a predicted occupancy map for future time step n+1. The predicted occupancy map for future time step n+1 may then be provided as input to the predictive model 222 to generate the next predicted occupancy map for future time step n+2.” and Col. 10, lines 26-28, “The obstacle location predictions can be provided in obstacle location prediction data 226.”).
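
For illustration only, the feedback loop described in the cited passage of Millard (Col. 9, lines 49-61), in which the predicted occupancy map for time step n+1 is fed back as input to predict time step n+2, can be sketched as follows. The names `rollout`, `predict_next`, and `shift_right` are hypothetical, and the stand-in model is an assumption chosen only to make the sketch runnable. None of this is taken from Millard or from the present application.

```python
from typing import Callable, List

Grid = List[List[float]]  # occupancy values, one inner list per grid row


def rollout(history: List[Grid],
            predict_next: Callable[[List[Grid]], Grid],
            horizon: int,
            window: int = 11) -> List[Grid]:
    """Autoregressively predict `horizon` future occupancy maps.

    Each predicted map is appended to the input sequence, so the
    prediction for step n+2 is conditioned on the prediction for n+1.
    """
    sequence = list(history)
    predictions: List[Grid] = []
    for _ in range(horizon):
        # Use at most the `window` most recent maps (e.g. steps n-10 .. n).
        next_map = predict_next(sequence[-window:])
        predictions.append(next_map)
        sequence.append(next_map)  # feed the prediction back as input
    return predictions


def shift_right(maps: List[Grid]) -> Grid:
    """Hypothetical stand-in model: move each occupied cell one column right."""
    last = maps[-1]
    return [[0.0] + row[:-1] for row in last]


if __name__ == "__main__":
    observed = [[1.0, 0.0, 0.0, 0.0]]  # a single 1x4 occupancy map
    for step, grid in enumerate(rollout([observed], shift_right, horizon=2), 1):
        print(f"n+{step}: {grid}")
```

In this sketch the second prediction depends only on maps produced or retained by the loop itself, which mirrors the iteration structure relied upon above. A learned model would replace `shift_right`.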

With respect to claim 11, Millard discloses the computer-implemented method of claim 10, wherein: 
the interaction transformer model (Mapping Engine 214) is configured to generate the motion forecast data (Occupancy Maps 216) by receiving the relative location embedding that describes the relative respective locations of the plurality of actors with respect to the autonomous vehicle (Sensor Data 212) and by further receiving an input sequence (Col. 9, lines 7-9, “the robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step” and Fig. 4, stage 402, Col. 13, lines 40-47) describing object detection data (Col. 6, line 33 - Col. 7, line 32), and in response to receipt of the relative location embedding and the input sequence, generate the motion forecast data with respect to the plurality of actors (Fig. 3, Occupancy Map 300, Col. 7, lines 34-51, i.e., “black-shaded cells indicate that the corresponding portion of the environment represented by the cell is occupied by an obstacle (e.g., static objects). The white-shaded cells indicate that the corresponding portion of the environment represented by the cell is not occupied by an obstacle (e.g., dynamic objects).” and Col. 9, lines 1-11, “Each occupancy map 216 indicates locations of obstacles in the robot's environment at a different one of a series of time steps (e.g., a current time step and one or more preceding time steps). The occupancy maps 216 can be based on observations of the robot's environment at times corresponding to the series of time steps. For example, the robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step. The mapping engine 214 may then process each iteration of sensor data 212 to generate an occupancy map 216 for a next time step.”); 
the first motion forecast data is based on the first relative location embedding and a first input sequence corresponding with the first time step (Fig. 4, stage 404, Col. 13, lines 51-55, “For each time step in the series of time steps where the robot provided an updated instance of sensor data, the mapping engine determines an occupancy map for the time step that indicates the current locations of obstacles in the environment at that time step.”); and 
the second motion forecast data is produced by using, by the computing system, the interaction transformer model to analyze the second relative location embedding and a second input sequence corresponding with the second time step (Fig. 4, stage 404, Col. 13, lines 51-55, “For each time step in the series of time steps where the robot provided an updated instance of sensor data, the mapping engine determines an occupancy map for the time step that indicates the current locations of obstacles in the environment at that time step.”).

With respect to claim 12, Millard discloses the computer-implemented method of claim 11, wherein the second input sequence is based at least in part on the first motion forecast data (Col. 9, lines 53-58, “The predictive model may initially process occupancy maps 216 from time steps n−10 through n, where n is the current time or most recent time step, one at a time and in sequence to generate a predicted occupancy map for future time step n+1.”).

With respect to claim 13, Millard discloses the computer-implemented method of claim 10, further comprising generating, by the computing system, respective trajectory sequences for the plurality of actors (Fig. 4, stage 406, Col. 14, lines 8-12, “an obstacle location prediction engine provides a sequence of occupancy maps for a series of time steps to a predictive model. The predictive model processes the sequence of occupancy maps one at a time and then outputs an occupancy prediction for a future time step.”), the respective trajectory sequences comprising the first respective trajectories of the plurality of actors for the first time step (Fig. 4, stage 406, Col. 14, lines 23-25, “A series of occupancy maps up to a current time step based on sensor data can be processed to generate a first predicted occupancy map for a first future time step.”) and the second respective trajectories of the plurality of actors for the second time step (Fig. 4, stage 406, Col. 14, lines 25-28, “An input based on the first predicted occupancy map can then be processed by the predictive model to generate a second predicted occupancy map for a second future time step.”).

With respect to claim 14, Millard discloses the computer-implemented method of claim 11, wherein the second input sequence further describe one or more features for each of the plurality of actors respectively at the first time step or the second time step, the one or more features comprising at least one of: 
a derivative of the location of the respective actor relative to the autonomous vehicle; 
a size of the respective actor (Col. 10, line 45 - Col. 11, line 12, i.e., “Motion models for dynamic obstacles may specify travel characteristics for the obstacle, such as expected velocities, accelerations, trajectories, directional changes, or a combination of these and/or other characteristics. The obstacle location prediction engine 218 may provide the motion model with information indicating a current location of the obstacle, a recent location of the obstacle at one or more preceding time steps, a current direction that the obstacle is moving or facing/oriented, a recent direction that the obstacle was moving or facing/oriented at one or more preceding time steps, an indication of whether the obstacle is traveling with other obstacles (e.g., whether a person riding a bicycle and a person walking are traveling together and are unlikely to separate, or whether the person riding the bicycle and the person walking are independent), a pose or posture of the obstacle, an operational status of the obstacle, or a combination of values for all or some of these parameters and/or other parameters.”); 
an orientation of the respective actor relative to the autonomous vehicle (Col. 10, line 45 - Col. 11, line 12, i.e., “Motion models for dynamic obstacles may specify travel characteristics for the obstacle, such as expected velocities, accelerations, trajectories, directional changes, or a combination of these and/or other characteristics. The obstacle location prediction engine 218 may provide the motion model with information indicating a current location of the obstacle, a recent location of the obstacle at one or more preceding time steps, a current direction that the obstacle is moving or facing/oriented, a recent direction that the obstacle was moving or facing/oriented at one or more preceding time steps, an indication of whether the obstacle is traveling with other obstacles (e.g., whether a person riding a bicycle and a person walking are traveling together and are unlikely to separate, or whether the person riding the bicycle and the person walking are independent), a pose or posture of the obstacle, an operational status of the obstacle, or a combination of values for all or some of these parameters and/or other parameters.”); or 
a center location of the respective actor.

With respect to claim 15, Millard discloses the computer-implemented method of claim 10, wherein: 
the first relative location embedding describes the respective locations of the plurality of actors with respect to the autonomous vehicle at the first time step (Col. 6, lines 4-30, “The mapping engine 214 is operable to receive sensor data 212 from the sensor subsystems 204 and to generate a virtual representation of at least a portion of the robot's environment from the sensor data 212.” and Col. 9, lines 7-9, “the robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step.”); and 
the second relative location embedding describes the respective locations of the plurality of actors with respect to the autonomous vehicle at the second time step (Col. 6, lines 4-30, “The mapping engine 214 is operable to receive sensor data 212 from the sensor subsystems 204 and to generate a virtual representation of at least a portion of the robot's environment from the sensor data 212.” and Col. 9, lines 7-9, “The robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step.”).

With respect to claim 17, Millard discloses the computer-implemented method of claim 10, wherein the prediction model comprises one or more multi-layer perceptrons (Col. 8, lines 55-65, “Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, e.g., the next hidden layer or the output layer. Each layer of the network generates an output from a received input in accordance with current values of a respective set of parameters. The predictive model 222 includes one or more convolutional layers followed by one or more recurrent layers. The recurrent layers are followed by transpose convolutional layers mirroring the convolutional layers at the input.”).

With respect to claim 18, Millard discloses a computer-implemented method for training one or more machine-learned models (Fig. 5 and Fig. 6), the method comprising: 
for a first iteration corresponding with a first time step: 
inputting, by a computing system comprising one or more computing devices (Fig. 6, computing device 600), a first relative location embedding that describes relative respective locations of a plurality of actors with respect to an autonomous vehicle into an interaction transformer model that is configured to receive a relative location embedding (Col. 6, lines 4-30, “The mapping engine 214 is operable to receive sensor data 212 from the sensor subsystems 204 and to generate a virtual representation of at least a portion of the robot's environment from the sensor data 212.” and Col. 9, lines 7-9, “the robot 202 and sensor subsystems 204 may periodically provide updated sensor data 212 to the mapping engine 214 at each passing time step.”), and in response to receipt of the relative location embedding, generate motion forecast data with respect to the plurality of actors (Col. 9, lines 9-11, “The mapping engine 214 may then process each iteration of sensor data 212 to generate an occupancy map 216 for a next time step.”); 
receiving, by the computing system and as an output of the interaction transformer model, first motion forecast data for a first plurality of actors at the first time step (Col. 7, line 66 - Col. 8, line 2, “The occupancy maps 216 that are provided as output from mapping engine 214 indicate current or recent locations of obstacles that were observed by the robot 202.”); 
inputting, by the computing system, the first motion forecast data into a prediction model, the prediction model (Col. 8, lines 45-47, “The obstacle location prediction engine 218 can include object recognizer 220 and predictive model 222.”) configured to receive the motion forecast data (Col. 8, line 66 - Col. 9, line 4, “the obstacle location prediction engine 218 receives a series of occupancy maps 216 from the mapping engine 214. Each occupancy map 216 indicates locations of obstacles in the robot's environment at a different one of a series of time steps (e.g., a current time step and one or more preceding time steps).”), and in response to receipt of the motion forecast data, generate respective trajectories for the plurality of actors for a current time step and respective projected trajectories for a subsequent time step (Col. 7, line 63 - Col. 8, line 4, “The obstacle location prediction engine 218 processes occupancy maps 216 from mapping engine 214 and outputs obstacle location prediction data 226. Whereas the occupancy maps 216 that are provided as output from mapping engine 214 indicate current or recent locations of obstacles that were observed by the robot 202, the obstacle location prediction data 226 indicates predictions of where the obstacles will be located at one or more future times.” and Col. 8, lines 18-22, “Obstacle location prediction data 226 indicates, for each obstacle represented in an occupancy map 216 at a current or recent time, a probability distribution of the object being located at one or more locations in the environment at one or more future times.”); and 
receiving, by the computing system and as an output of the prediction model (Col. 7, lines 63-66, “The obstacle location prediction engine 218 processes occupancy maps 216 from mapping engine 214 and outputs obstacle location prediction data 226.”), first respective trajectories of the plurality of actors for the first time step and respective projected trajectories for a second time step that is after the first time step (Col. 9, lines 11-16, “The obstacle location prediction engine 218 processes occupancy maps 216 from a set of the most recent time steps (e.g., a predetermined number n of the most recent time steps) to generate obstacle location predictions for one or more future time steps.”); and 
for a second iteration corresponding with the second time step: 
generating, by the computing system, a second relative location embedding for the second time step based on the respective projected trajectories for the second time step from the first iteration (Col. 9, lines 49-61, “the predictive model 222 may process preceding obstacle location predictions (e.g., predicted occupancy maps from preceding time steps) to generate obstacle location predictions for time steps further in the future. For example, the predictive model may initially process occupancy maps 216 from time steps n−10 through n, where n is the current time or most recent time step, one at a time and in sequence to generate a predicted occupancy map for future time step n+1. The predicted occupancy map for future time step n+1 may then be provided as input to the predictive model 222 to generate the next predicted occupancy map for future time step n+2.”); 
analyzing, by the computing system using the interaction transformer model, the second relative location embedding to produce (Col. 7, lines 55-56, “the mapping engine 214 analyzes the sensor data 212”), as an output of the interaction transformer model, second motion forecast data for the plurality of actors at the second time step (Col. 9, lines 9-11, “The mapping engine 214 may then process each iteration of sensor data 212 to generate an occupancy map 216 for a next time step.”); and 
determining, by the computing system using the prediction model, second respective trajectories of the plurality of actors for the second time step using the prediction model based on the second motion forecast data (Col. 9, lines 49-61, “the predictive model 222 may process preceding obstacle location predictions (e.g., predicted occupancy maps from preceding time steps) to generate obstacle location predictions for time steps further in the future. For example, the predictive model may initially process occupancy maps 216 from time steps n−10 through n, where n is the current time or most recent time step, one at a time and in sequence to generate a predicted occupancy map for future time step n+1. The predicted occupancy map for future time step n+1 may then be provided as input to the predictive model 222 to generate the next predicted occupancy map for future time step n+2.” and Col. 10, lines 26-28, “The obstacle location predictions can be provided in obstacle location prediction data 226.”); and 
adjusting, by the computing system, one or more parameters of the interaction transformer model and the prediction model based on the second respective trajectories of the plurality of actors (Col. 9, lines 62-66, i.e., “This feedback loop may continue for a pre-defined number of time steps, or until some other terminating condition is met. In some implementations, the predicted occupancy maps from preceding time steps are modified before being fed back as input to the predictive model 222.”; Col. 8, lines 51-65, “The predictive model 222 can be a machine-learning model such as a deep neural network. Neural networks are machine-learning models that employ one or more layers of nonlinear units to predict an output for a received input. Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, e.g., the next hidden layer or the output layer. Each layer of the network generates an output from a received input in accordance with current values of a respective set of parameters. In some implementations, the predictive model 222 includes one or more convolutional layers followed by one or more recurrent layers. In some implementations, the recurrent layers are followed by transpose convolutional layers mirroring the convolutional layers at the input.”; and Col. 13, lines 22-27, “The system 200 may re-analyze the environment and update the planned path on a frequent basis to ensure the robot continues to travel on an optimized trajectory even as conditions in the environment may change over time during the course of the robot's travel.”).

With respect to claim 19, Millard discloses the computer-implemented method of claim 18, wherein adjusting, by the computing system, one or more parameters of the interaction transformer model and the prediction model based on the second respective trajectories of the plurality of actors comprises adjusting, by the computing system, the one or more parameters of the interaction transformer model and the prediction model based on a loss function (Col. 15, lines 18-21, “The process 500 can employ machine-learning techniques to iteratively update values of internal parameters of the model such as weights of nonlinear units in a deep neural network.”) that describes a difference between respective ground truth trajectories of the plurality of actors and the second respective trajectories of the plurality of actors for the second time step (Col. 15, lines 24-57, i.e., “The training system compares the predicted occupancy map (output data of the predictive model) to the target occupancy map (locations of obstacles in the environment at a future time step following the time steps of the training occupancy maps. The training series of occupancy maps may be obtained for time steps n-m through n. The target occupancy map can then represent locations of obstacles at time step n+j.). The difference between the predicted and target occupancy maps represents an error in the prediction. Then, the system updates the values of the internal parameters of the predictive model based on the error.”).
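
For illustration only, the error-driven parameter update described in the cited passage (a predicted occupancy map is compared against a target occupancy map, and the parameters are updated based on the difference) can be sketched as follows. The mean-squared-error form of the loss, the one-parameter model `predicted = weight * inputs`, and the names `mse_loss`, `predict`, and `sgd_step` are assumptions made for this sketch. The cited passage does not specify a particular loss function.

```python
from typing import List

Grid = List[List[float]]  # occupancy values, one inner list per grid row


def flatten(grid: Grid) -> List[float]:
    """Return the grid cells as a single flat list."""
    return [cell for row in grid for cell in row]


def mse_loss(predicted: Grid, target: Grid) -> float:
    """Mean squared difference between predicted and target maps."""
    p, t = flatten(predicted), flatten(target)
    return sum((a - b) ** 2 for a, b in zip(p, t)) / len(p)


def predict(weight: float, inputs: Grid) -> Grid:
    """Toy one-parameter model (assumption): scale every input cell."""
    return [[weight * cell for cell in row] for row in inputs]


def sgd_step(weight: float, inputs: Grid, target: Grid, lr: float = 0.5) -> float:
    """One gradient-descent update of `weight` under the MSE loss."""
    x, y = flatten(inputs), flatten(target)
    # d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
    grad = sum(2.0 * (weight * a - b) * a for a, b in zip(x, y)) / len(x)
    return weight - lr * grad


if __name__ == "__main__":
    last_map = [[1.0, 0.0], [0.0, 1.0]]    # most recent training map
    target_map = [[1.0, 0.0], [0.0, 1.0]]  # ground-truth map at n+j
    w = 0.0
    before = mse_loss(predict(w, last_map), target_map)
    w = sgd_step(w, last_map, target_map)
    after = mse_loss(predict(w, last_map), target_map)
    print(before, after)  # the loss decreases after the update
```

A deep network trained by backpropagation follows the same pattern, with the gradient computed for every weight rather than for a single scalar.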

With respect to claim 20, Millard discloses the computer-implemented method of claim 18, wherein adjusting, by the computing system, one or more parameters of the interaction transformer model and the prediction model based on the second respective trajectories of the plurality of actors comprises training, in an end-to-end configuration, the interaction transformer and the prediction model (Col. 15, lines 57-59, “The values of the model's parameters are updated using machine-learning techniques such as backpropagation with gradient descent.”).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Millard (US 11,016,491) in view of Singh et al. (US 2020/0379461, hereafter “Singh”).

With respect to claim 7, Millard discloses the computing system of claim 1, wherein the interaction transformer model comprises…, but Millard does not expressly disclose an interaction model configured to receive the relative location embedding that describes the relative respective locations of the plurality of actors with respect to the autonomous vehicle, and in response to receipt of the relative location embedding, generate an attention embedding with respect to the plurality of actors; a recurrent model configured to receive the attention embedding, and in response to receipt of the attention embedding, generate the motion forecast data with respect to the plurality of actors. 
However, Singh, in the same field of endeavor, discloses an interaction model (Fig. 2, prediction and forecasting subsystem 123) configured to receive the relative location embedding that describes the relative respective locations of the plurality of actors with respect to the autonomous vehicle (¶ 29, “The prediction and forecasting subsystem 123 may predict the future locations, trajectories, and/or actions of the objects based at least in part on perception information (e.g., the state data for each object) received from the perception subsystem 122, the location information received from the location subsystem 121, the sensor data, and/or any other data that describes the past and/or current state of the objects, the autonomous vehicle 101, the surrounding environment, and/or their relationship(s).” and Fig. 3, step 302, e.g., ¶ 37), and in response to receipt of the relative location embedding, generate an attention embedding with respect to the plurality of actors (¶¶ 51-52, i.e., “The RNN can relationship between the inputs and outputs of a sequence of temporal signals with a plurality of nodes. Each node performs a relatively simple data transformation on a single dimension, i.e., an activation function. The activation function may take on a various forms including, linear functions, sigmoid functions and Gaussian functions. The nodes may also be recurrent, i.e., dependent from the input of output of any of the nodes from an earlier portion of a temporal sequence. Therefore, an output of the RNN can be dependent upon the output of a plurality of interconnected nodes rather than a single transformation. A deep neural network includes multiple hidden layers in the network hierarchy. A deep neural network comprises a plurality of nodes, including input nodes, hidden nodes, and output nodes. The nodes can be connected by edges, which can be weighted according to the strength of the edges. 
For example, an RNN may have any number of layers, nodes, trainable parameters (also known as weights) and/or recurrences.” and Fig. 3, step 304, e.g., ¶¶ 39-40); 
a recurrent model (Fig. 2, neural network 123(a), ¶ 29, “The neural network 123(a) can be implemented in two phases: an offline training phase and an operational phase.”) configured to receive the attention embedding, and in response to receipt of the attention embedding, generate the motion forecast data with respect to the plurality of actors (Fig. 3, step 306, ¶ 41, “The system may determine a reference path encoded in semantically rich vector maps for each object trajectory in the object data set.”; ¶ 42, i.e., “The reference path (e.g., centerline) may be a curvilinear line or it may be a series of linear segments.”; and Fig. 3, step 308, ¶¶ 46-47, i.e., “The system may transform the object trajectories of the object data set into a 2D curvilinear coordinate system using the reference paths. A trajectory V_i at timestep t may be represented as a collection of curvilinear coordinates (S_i^t, O_i^t) at different time stamps t = t_1, t_2, t_3 . . . t_i, and represents time series data. This data may be used to build and dynamically update models for performing predictions and forecasting. Semantic attributes M_i^t of the vector map may also be encoded in the trajectory V_i.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Singh into the invention of Millard for the benefit of avoiding collisions by assisting the vehicle’s system in understanding and recognizing its surrounding environment and its relationship thereto (Singh: ¶ 48).

With respect to claim 16, Millard discloses the computer-implemented method of claim 10, wherein the interaction transformer model comprises:…, but Millard does not expressly disclose an interaction model configured to receive the relative location embedding that describes the relative respective locations of the plurality of actors with respect to the autonomous vehicle, and in response to receipt of the relative location embedding, generate an attention embedding with respect to the plurality of actors; a recurrent model configured to receive the attention embedding, and in response to receipt of the attention embedding, generate the motion forecast data with respect to the plurality of actors. 
However, Singh, in the same field of endeavor, discloses an interaction model (Fig. 2, prediction and forecasting subsystem 123) configured to receive the relative location embedding that describes the relative respective locations of the plurality of actors with respect to the autonomous vehicle (¶ 29, “The prediction and forecasting subsystem 123 may predict the future locations, trajectories, and/or actions of the objects based at least in part on perception information (e.g., the state data for each object) received from the perception subsystem 122, the location information received from the location subsystem 121, the sensor data, and/or any other data that describes the past and/or current state of the objects, the autonomous vehicle 101, the surrounding environment, and/or their relationship(s).” and Fig. 3, step 302, e.g., ¶ 37), and in response to receipt of the relative location embedding, generate an attention embedding with respect to the plurality of actors (¶¶ 51-52, i.e., “The RNN can relationship between the inputs and outputs of a sequence of temporal signals with a plurality of nodes. Each node performs a relatively simple data transformation on a single dimension, i.e., an activation function. The activation function may take on a various forms including, linear functions, sigmoid functions and Gaussian functions. The nodes may also be recurrent, i.e., dependent from the input of output of any of the nodes from an earlier portion of a temporal sequence. Therefore, an output of the RNN can be dependent upon the output of a plurality of interconnected nodes rather than a single transformation. A deep neural network includes multiple hidden layers in the network hierarchy. A deep neural network comprises a plurality of nodes, including input nodes, hidden nodes, and output nodes. The nodes can be connected by edges, which can be weighted according to the strength of the edges. 
For example, an RNN may have any number of layers, nodes, trainable parameters (also known as weights) and/or recurrences.” and Fig. 3, step 304, e.g., ¶¶ 39-40); 
a recurrent model (Fig. 2, neural network 123(a), ¶ 29, “The neural network 123(a) can be implemented in two phases: an offline training phase and an operational phase.”) configured to receive the attention embedding, and in response to receipt of the attention embedding, generate the motion forecast data with respect to the plurality of actors (Fig. 3, step 306, ¶ 41, “The system may determine a reference path (¶ 42, i.e., “The reference path (e.g., centerline) may be a curvilinear line or it may be a series of linear segments.”) encoded in semantically rich vector maps for each object trajectory in the object data set.” and Fig. 3, step 308, ¶¶ 46-47, i.e., “The system may transform the object trajectories of the object data set into a 2D curvilinear coordinate system using the reference paths. A trajectory Vi at timestep t may be represented as a collection of curvilinear coordinates (Si t, Oi t) at different time stamps t=t1, t2, t3 . . . ti,  and represents time series data. This data may be used to build and dynamically update models for performing predictions and forecasting. Semantic attributes Mi t of the vector map may also be encoded in the trajectory Vi.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Singh into the invention of Millard for the benefit of avoiding collisions by assisting the vehicle’s system in understanding and recognizing its surrounding environment and its relationship thereto (Singh: ¶ 48).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
KEHL et al. (US 2019/0370606) teaches a training model that receives input data x and transforms it into an output y. The output y is input to a transform function, which may transform the output y into a format corresponding to the ground truth label y*. The output of the transform function is received at a loss function, which compares the transformed or non-transformed output y to the ground truth label y*. The error is the difference (e.g., loss) between the transformed or non-transformed output y and the ground truth label y*.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Kimia Kohankhaki, whose telephone number is (571) 272-5959. The examiner can normally be reached Monday - Thursday, 7:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Geepy Pe, can be reached at (571) 270-3703. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KIMIA KOHANKHAKI/
Examiner, Art Unit 3663

/Geepy Pe/
Supervisory Patent Examiner, Art Unit 3663
11/2/2022