DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
The drawings are objected to because elements 222-230 in Fig. 2A and elements 262-270 in Fig. 2B are unclear.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 8-14, 16-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over McGill, Jr. et al. (U.S. 2021/0302960 A1).

Claim 1, McGill teaches:
A method (McGill, Fig. 1) comprising: 
obtaining scene data (McGill, Paragraphs [0035-0036], The driving recommendation system 170 gathers information regarding a target agent as well as data gathered from raw sensor data in the ego vehicle, which are collectively interpreted as scene data.) characterizing an environment at a current time point (McGill, Paragraphs [0036-0037], The collected data regarding both the target agent and the ego vehicle are updated over time.  A first instance of the collected data is thus a current time point, e.g. a current status of the plurality of lane-level cells 340 (see McGill, Paragraph [0038]).); 
processing a first network input generated from the scene data using a first neural network (McGill, Fig. 4: 425-460 and 480, Paragraphs [0046-0047], The neural network 405 receives an encoded graph network 480 and sensor data 425-460, which are representative of first network input.) to generate an intermediate output (McGill, Fig. 4: 470, 475, Paragraphs [0046-0047], In response to the inputs, the neural network 405 generates decoded graph-network data 475 and a confidence metric 470, which collectively is an intermediate output.); 
obtaining an identification of a future time point that is after the current time point (McGill, Paragraphs [0038-0039], The prediction module 240 can predict the future status of the plurality of lane-level cells 340 over a predetermined time horizon, which represents a future time point that is dependent on the update rate of the driving recommendation system 170.); and 
generating an occupancy output, wherein the occupancy output comprises respective occupancy probabilities for each of a plurality of locations in the environment, wherein the respective occupancy probability for each location characterizes a likelihood that one or more agents will occupy the location at the future time point (McGill, Paragraphs [0046-0047], The prediction module 240, via the neural network(s) 405, generates a future status of the cells and a confidence metric 470.  The confidence metric 470 represents the confidence level, i.e. characterizes a likelihood, associated with the predictions of future status of the lane-level cells.  The future status of the cells represents whether the system predicts a detected road agent to occupy the given cell(s) in the future (see McGill, Paragraph [0033]).).
McGill does not explicitly teach:
Generating, from the intermediate output and the future time point, an occupancy output.
However, it would have been obvious to one of ordinary skill in the art for the prediction module 240, which includes the neural network(s) 405 (see McGill, Paragraph [0041]), to utilize the output of the neural network(s) 405, which includes graph decoder operation 475, to generate the prediction of future status of the cells.  Such a modification would be consistent with Fig. 4 of McGill, wherein the decoder operation 475 feeds into graph network 410 and ultimately into Recommendation Module 250.  Therefore, one of ordinary skill in the art would have an expectation of success and predictable results.

Claim 2, McGill further teaches:
The method of claim 1, wherein the scene data comprises, for each of one or more agents in the environment: 
a current location (McGill, Paragraph [0037], Location data includes the cells in which the road agents are located.) and current values for a predetermined set of motion parameters of the agent (McGill, Paragraph [0055], One example motion parameter is the velocity of the road agent.); and 
a previous location and previous values for the predetermined set of motion parameters of the agent for each of one or more previous time points (McGill, Paragraph [0055], The graph-update module 230 continues to update the current status of the plurality of lane-level cells 340.  Thus, as the current status of the cells is updated, the previously updated statuses are previous locations and previous values, respectively.).

Claim 3, McGill further teaches:
The method of claim 1, wherein the intermediate output is a machine-learned representation of the first network input (McGill, Fig. 4: 475, Paragraph [0046], The decoded graph-network 475 is a representation of the graph encoder operation 480.  The modules operate under machine learning algorithms (see McGill, Paragraph [0077]).  Thus, it would have been obvious to one of ordinary skill in the art for the output of the neural network(s) to be machine-learned.).

Claim 8, McGill further teaches:
The method of claim 1, wherein the intermediate output comprises, for each of the plurality of locations in the environment: 
a predicted enter time and a predicted exit time that define a future time interval in which a surrounding agent will occupy the location (McGill, Paragraph [0055], An example of a cell status includes the determination that a road agent will exit a particular cell and enter a different cell.  The time it takes the road agent to exit one cell and enter another cell, e.g. within a predetermined time horizon, establishes a future time interval.), and 
a first probability that represents a confidence that the predicted enter time and predicted exit time are accurate (McGill, Paragraph [0046], The confidence level is associated with the predictions of future status of the lane-level cells.  The status of lane-level cells includes moments when the road agent exits a particular cell and enters a different cell, which changes the statuses of both cells (see McGill, Paragraph [0055]).).

Claim 9, McGill further teaches:
The method of claim 8, wherein generating the occupancy output comprises, for each of the plurality of locations in the environment: 
determining whether the future time point is between the predicted enter time and the predicted exit time of the location (McGill, Paragraph [0055], In the example of a road agent exiting a particular cell and entering a different cell within a predetermined time horizon, the time horizon includes the future time point.); 
in response to determining that the future time point is between the predicted enter time and the predicted exit time of the location, generating the occupancy probability for the location to be equal to the first probability of the location; and 
in response to determining that the future time point is not between the predicted enter time and the predicted exit time of the location, generating the occupancy probability for the location to be zero (McGill, Paragraph [0046], It is noted that Applicant’s “in response” steps are interpreted as including either determining that the future time point is between the predicted enter time and the predicted exit time or that the future time point is not between the predicted enter time and the predicted exit time.  When it is determined, during the predetermined time horizon, that the road agent exits a particular cell and enters a different cell, a confidence level is determined.).
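For illustration only, the conditional logic recited in claim 9 can be sketched as follows. This is a minimal sketch of the claim language as quoted above, not an implementation disclosed by Applicant or by McGill. The names `LocationPrediction`, `occupancy_probability`, and `occupancy_output`, and the treatment of "between" as inclusive of both endpoints, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationPrediction:
    """Hypothetical per-location intermediate output (compare claim 8)."""
    enter_time: float         # predicted enter time
    exit_time: float          # predicted exit time
    first_probability: float  # confidence that the predicted interval is accurate


def occupancy_probability(pred: LocationPrediction, future_time: float) -> float:
    """Return the occupancy probability for one location at future_time.

    Assumption: "between" includes both the enter time and the exit time.
    """
    if pred.enter_time <= future_time <= pred.exit_time:
        # Future time point falls inside the predicted interval.
        return pred.first_probability
    # Future time point falls outside the predicted interval.
    return 0.0


def occupancy_output(preds, future_time):
    """Apply the per-location rule to every location in the environment."""
    return [occupancy_probability(p, future_time) for p in preds]
```

Under these assumptions, the same set of per-location predictions can be queried for any number of future time points without recomputing the predictions themselves.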

Claim 10, McGill further teaches:
The method of claim 1, wherein the identification of the future time point is obtained from a planner of a vehicle in the environment (McGill, Paragraphs [0038-0039], The prediction module 240 is functionally equivalent to a planner of a vehicle.).

Claim 11, McGill further teaches:
The method of claim 1, wherein the occupancy output comprises a two-dimensional array of data values, wherein each position in the array corresponds to a respective location in the environment (McGill, Fig. 4: 410, Paragraph [0046], Graph network 410 is functionally equivalent to a two-dimensional array of data values, wherein each section, i.e. elements 420-423, includes a plurality of nodes 415, and wherein each node 415 represents a portion of the roadway (see McGill, Paragraph [0045]). Each node 415 corresponds to lane-level cells 340, which individually have future statuses, i.e. data values, associated therewith (see McGill, Paragraph [0038]).), and wherein the data values each characterize the occupancy probability of the respective location (McGill, Paragraph [0038], Future status indicates whether or not the system predicts that each lane-level cell 340 will be occupied by a road agent.).
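For illustration only, an occupancy output of the kind recited in claim 11 can be sketched as a two-dimensional array in which each position corresponds to a location and each value is an occupancy probability. The function name `make_occupancy_grid`, the (row, column) indexing of locations, and the default value of 0.0 are assumptions made for the example and are not taken from McGill or from Applicant's specification.

```python
def make_occupancy_grid(rows, cols, probabilities):
    """Build a rows x cols array of occupancy probabilities.

    probabilities maps a hypothetical (row, col) location index to the
    probability that an agent occupies that location; locations that are
    not listed default to 0.0.
    """
    grid = [[0.0] * cols for _ in range(rows)]
    for (r, c), p in probabilities.items():
        grid[r][c] = p
    return grid
```

For example, `make_occupancy_grid(2, 3, {(0, 1): 0.9})` yields a 2-by-3 array in which only the position at row 0, column 1 holds a nonzero probability.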

Claim 12, McGill teaches:
A system (McGill, Figs. 1 and 2) comprising one or more computers (McGill, Paragraph [0025], The modules operate under computer-readable instructions executed by one or more processors, which are functionally equivalent to one or more computers.) and one or more storage devices storing instructions (McGill, Paragraph [0025]) that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: 
obtaining scene data (McGill, Paragraphs [0035-0036], The driving recommendation system 170 gathers information regarding a target agent as well as data gathered from raw sensor data in the ego vehicle, which are collectively interpreted as scene data.) characterizing an environment at a current time point (McGill, Paragraphs [0036-0037], The collected data regarding both the target agent and the ego vehicle are updated over time.  A first instance of the collected data is thus a current time point, e.g. a current status of the plurality of lane-level cells 340 (see McGill, Paragraph [0038]).); 
processing a first network input generated from the scene data using a first neural network (McGill, Fig. 4: 425-460 and 480, Paragraphs [0046-0047], The neural network 405 receives an encoded graph network 480 and sensor data 425-460, which are representative of first network input.) to generate an intermediate output (McGill, Fig. 4: 470, 475, Paragraphs [0046-0047], In response to the inputs, the neural network 405 generates decoded graph-network data 475 and a confidence metric 470, which collectively is an intermediate output.); 
obtaining an identification of a future time point that is after the current time point (McGill, Paragraphs [0038-0039], The prediction module 240 can predict the future status of the plurality of lane-level cells 340 over a predetermined time horizon, which represents a future time point that is dependent on the update rate of the driving recommendation system 170.); and 
generating an occupancy output, wherein the occupancy output comprises respective occupancy probabilities for each of a plurality of locations in the environment, wherein the respective occupancy probability for each location characterizes a likelihood that one or more agents will occupy the location at the future time point (McGill, Paragraphs [0046-0047], The prediction module 240, via the neural network(s) 405, generates a future status of the cells and a confidence metric 470.  The confidence metric 470 represents the confidence level, i.e. characterizes a likelihood, associated with the predictions of future status of the lane-level cells.  The future status of the cells represents whether the system predicts a detected road agent to occupy the given cell(s) in the future (see McGill, Paragraph [0033]).).
McGill does not explicitly teach:
Generating, from the intermediate output and the future time point, an occupancy output.
However, it would have been obvious to one of ordinary skill in the art for the prediction module 240, which includes the neural network(s) 405 (see McGill, Paragraph [0041]), to utilize the output of the neural network(s) 405, which includes graph decoder operation 475, to generate the prediction of future status of the cells.  Such a modification would be consistent with Fig. 4 of McGill, wherein the decoder operation 475 feeds into graph network 410 and ultimately into Recommendation Module 250.  Therefore, one of ordinary skill in the art would have an expectation of success and predictable results.

Claim 13, McGill further teaches:
The system of claim 12, wherein the scene data comprises, for each of one or more agents in the environment: 
a current location (McGill, Paragraph [0037], Location data includes the cells in which the road agents are located.) and current values for a predetermined set of motion parameters of the agent (McGill, Paragraph [0055], One example motion parameter is the velocity of the road agent.); and 
a previous location and previous values for the predetermined set of motion parameters of the agent for each of one or more previous time points (McGill, Paragraph [0055], The graph-update module 230 continues to update the current status of the plurality of lane-level cells 340.  Thus, as the current status of the cells is updated, the previously updated statuses are previous locations and previous values, respectively.).

Claim 14, McGill further teaches:
The system of claim 12, wherein the intermediate output is a machine-learned representation of the first network input (McGill, Fig. 4: 475, Paragraph [0046], The decoded graph-network 475 is a representation of the graph encoder operation 480.  The modules operate under machine learning algorithms (see McGill, Paragraph [0077]).  Thus, it would have been obvious to one of ordinary skill in the art for the output of the neural network(s) to be machine-learned.).

Claim 16, McGill further teaches:
The system of claim 12, wherein the intermediate output comprises, for each of the plurality of locations in the environment: 
a predicted enter time and a predicted exit time that define a future time interval in which a surrounding agent will occupy the location (McGill, Paragraph [0055], An example of a cell status includes the determination that a road agent will exit a particular cell and enter a different cell.  The time it takes the road agent to exit one cell and enter another cell, e.g. within a predetermined time horizon, establishes a future time interval.), and a first probability that represents a confidence that the predicted enter time and predicted exit time are accurate (McGill, Paragraph [0046], The confidence level is associated with the predictions of future status of the lane-level cells.  The status of lane-level cells includes moments when the road agent exits a particular cell and enters a different cell, which changes the statuses of both cells (see McGill, Paragraph [0055]).).

Claim 17, McGill teaches:
One or more non-transitory storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations (McGill, Paragraph [0025], It is noted that the Applicant has defined a tangible non-transitory storage medium away from an artificially-generated propagated signal in Paragraph [0123] of the Applicant’s specification.) comprising: 
obtaining scene data (McGill, Paragraphs [0035-0036], The driving recommendation system 170 gathers information regarding a target agent as well as data gathered from raw sensor data in the ego vehicle, which are collectively interpreted as scene data.) characterizing an environment at a current time point (McGill, Paragraphs [0036-0037], The collected data regarding both the target agent and the ego vehicle are updated over time.  A first instance of the collected data is thus a current time point, e.g. a current status of the plurality of lane-level cells 340 (see McGill, Paragraph [0038]).); 
processing a first network input generated from the scene data using a first neural network (McGill, Fig. 4: 425-460 and 480, Paragraphs [0046-0047], The neural network 405 receives an encoded graph network 480 and sensor data 425-460, which are representative of first network input.) to generate an intermediate output (McGill, Fig. 4: 470, 475, Paragraphs [0046-0047], In response to the inputs, the neural network 405 generates decoded graph-network data 475 and a confidence metric 470, which collectively is an intermediate output.); 
obtaining an identification of a future time point that is after the current time point (McGill, Paragraphs [0038-0039], The prediction module 240 can predict the future status of the plurality of lane-level cells 340 over a predetermined time horizon, which represents a future time point that is dependent on the update rate of the driving recommendation system 170.); and 
generating an occupancy output, wherein the occupancy output comprises respective occupancy probabilities for each of a plurality of locations in the environment, wherein the respective occupancy probability for each location characterizes a likelihood that one or more agents will occupy the location at the future time point (McGill, Paragraphs [0046-0047], The prediction module 240, via the neural network(s) 405, generates a future status of the cells and a confidence metric 470.  The confidence metric 470 represents the confidence level, i.e. characterizes a likelihood, associated with the predictions of future status of the lane-level cells.  The future status of the cells represents whether the system predicts a detected road agent to occupy the given cell(s) in the future (see McGill, Paragraph [0033]).).
McGill does not explicitly teach:
Generating, from the intermediate output and the future time point, an occupancy output.
However, it would have been obvious to one of ordinary skill in the art for the prediction module 240, which includes the neural network(s) 405 (see McGill, Paragraph [0041]), to utilize the output of the neural network(s) 405, which includes graph decoder operation 475, to generate the prediction of future status of the cells.  Such a modification would be consistent with Fig. 4 of McGill, wherein the decoder operation 475 feeds into graph network 410 and ultimately into Recommendation Module 250.  Therefore, one of ordinary skill in the art would have an expectation of success and predictable results.

Claim 18, McGill further teaches:
The system of claim 17, wherein the intermediate output is a machine-learned representation of the first network input (McGill, Fig. 4: 475, Paragraph [0046], The decoded graph-network 475 is a representation of the graph encoder operation 480.  The modules operate under machine learning algorithms (see McGill, Paragraph [0077]).  Thus, it would have been obvious to one of ordinary skill in the art for the output of the neural network(s) to be machine-learned.).

Claim 20, McGill further teaches:
The system of claim 17, wherein the intermediate output comprises, for each of the plurality of locations in the environment: 
a predicted enter time and a predicted exit time that define a future time interval in which a surrounding agent will occupy the location (McGill, Paragraph [0055], An example of a cell status includes the determination that a road agent will exit a particular cell and enter a different cell.  The time it takes the road agent to exit one cell and enter another cell, e.g. within a predetermined time horizon, establishes a future time interval.), and a first probability that represents a confidence that the predicted enter time and predicted exit time are accurate (McGill, Paragraph [0046], The confidence level is associated with the predictions of future status of the lane-level cells.  The status of lane-level cells includes moments when the road agent exits a particular cell and enters a different cell, which changes the statuses of both cells (see McGill, Paragraph [0055]).).

Claims 4-7, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over McGill, Jr. et al. (U.S. 2021/0302960 A1) in view of Hayward et al. (U.S. 2021/0287297 A1).

Claim 4, McGill teaches:
The method of claim 3, wherein generating the occupancy output comprises: 
processing a second network input (McGill, Fig. 4: 425-460) comprising the identification of the future time point using a second neural network to generate the occupancy output (McGill, Paragraph [0046], The neural network 405 can be a plurality of neural networks 405 operating together.  Thus, in the production of future status (McGill, Paragraphs [0046-0047]), the output is produced by the plurality of neural networks 405, which receive a plurality of different network inputs in addition to the future time point.).
McGill does not specifically teach:
Processing the intermediate output using the second neural network.
Hayward teaches:
Chaining neural networks together, wherein the output of the first neural network is fed as input to a second neural network (Hayward, Paragraph [0065]).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the neural networks in McGill by integrating the teaching of chaining neural networks, as taught by Hayward.
The motivation would be to train the neural networks by comparing known results with weighted results of the training neural networks to prevent overestimating or underestimating (see Hayward, Paragraph [0066]).  Thus, in the combination of McGill in view of Hayward, the neural networks 405 of McGill would be chained together, as taught by Hayward.
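For illustration only, the chained arrangement relied upon in the combination, in which the output of a first neural network is fed as input to a second neural network together with the future time point, can be sketched as follows. Plain Python functions stand in for trained networks. The function names and the fixed weights are hypothetical and are not taken from McGill or Hayward.

```python
import math


def first_network(scene_features):
    """Stand-in for a first network: scene features -> intermediate output."""
    # Hypothetical fixed weights; a real network would learn these.
    weights = [[0.5, -0.25], [0.1, 0.3]]
    return [
        math.tanh(sum(w * x for w, x in zip(row, scene_features)))
        for row in weights
    ]


def second_network(intermediate, future_time):
    """Stand-in for a second network: (intermediate, future time) -> probability."""
    # Hypothetical fixed weights: one per intermediate value, one for the time.
    weights = [0.7, -0.4, 0.05]
    z = sum(w * x for w, x in zip(weights, intermediate + [future_time]))
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing into (0, 1)


def chained_occupancy(scene_features, future_time):
    """Chain the two stand-ins: the first network's output feeds the second."""
    return second_network(first_network(scene_features), future_time)
```

In this sketch the intermediate output can be computed once and passed to `second_network` for several different future time points.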

Claim 5, McGill in view of Hayward further teaches:
The method of claim 4, wherein the second neural network has fewer parameters than the first neural network (Hayward, Paragraph [0065], In the combination of McGill in view of Hayward, the first neural network 405 of McGill receives more inputs (see McGill, Fig. 4: 425-460 and 480) than the chained second neural network, which receives a single input.).

Claim 6, McGill in view of Hayward further teaches:
The method of claim 4, wherein the first neural network and the second neural network are trained end-to-end (Hayward, Paragraphs [0065-0066], The neural networks are trained in a chained formation, which is functionally equivalent to being trained end-to-end.).

Claim 7, McGill in view of Hayward further teaches:
The method of claim 4, wherein: the second neural network has been trained on a training data set comprising a plurality of training examples having respective training future time points (McGill, Paragraph [0047], In the combination of McGill in view of Hayward, all neural networks 405 are trained.  A plurality of conditions, e.g. weather conditions, day of the week, day of the year, and temperature, are a plurality of training examples.  It would have been obvious to one of ordinary skill in the art for the examples of the day of the week and the day of the year to have respective future time points.); and 
the future time point is not included in the plurality of training future time points (McGill, Paragraph [0047], It would have been obvious to one of ordinary skill in the art for the future time points, i.e. the times at which the system predicts a change in status of the cells/nodes (see McGill, Paragraphs [0046-0047]), to be either the same as or different from the day(s) of the week or day(s) of the year.  Therefore, it is within the scope of the combination of McGill in view of Hayward to train the neural networks for time periods not predicted to include a change in status of the cells/nodes.).

Claim 15, McGill teaches:
The system of claim 14, wherein generating the occupancy output comprises: 
processing a second network input (McGill, Fig. 4: 425-460) comprising the identification of the future time point using a second neural network to generate the occupancy output (McGill, Paragraph [0046], The neural network 405 can be a plurality of neural networks 405 operating together.  Thus, in the production of future status (McGill, Paragraphs [0046-0047]), the output is produced by the plurality of neural networks 405, which receive a plurality of different network inputs in addition to the future time point.).
McGill does not specifically teach:
Processing the intermediate output using the second neural network.
Hayward teaches:
Chaining neural networks together, wherein the output of the first neural network is fed as input to a second neural network (Hayward, Paragraph [0065]).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the neural networks in McGill by integrating the teaching of chaining neural networks, as taught by Hayward.
The motivation would be to train the neural networks by comparing known results with weighted results of the training neural networks to prevent overestimating or underestimating (see Hayward, Paragraph [0066]).  Thus, in the combination of McGill in view of Hayward, the neural networks 405 of McGill would be chained together, as taught by Hayward.

Claim 19, McGill teaches:
The system of claim 18, wherein generating the occupancy output comprises: 
processing a second network input (McGill, Fig. 4: 425-460) comprising the identification of the future time point using a second neural network to generate the occupancy output (McGill, Paragraph [0046], The neural network 405 can be a plurality of neural networks 405 operating together.  Thus, in the production of future status (McGill, Paragraphs [0046-0047]), the output is produced by the plurality of neural networks 405, which receive a plurality of different network inputs in addition to the future time point.).
McGill does not specifically teach:
Processing the intermediate output using the second neural network.
Hayward teaches:
Chaining neural networks together, wherein the output of the first neural network is fed as input to a second neural network (Hayward, Paragraph [0065]).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the neural networks in McGill by integrating the teaching of chaining neural networks, as taught by Hayward.
The motivation would be to train the neural networks by comparing known results with weighted results of the training neural networks to prevent overestimating or underestimating (see Hayward, Paragraph [0066]).  Thus, in the combination of McGill in view of Hayward, the neural networks 405 of McGill would be chained together, as taught by Hayward.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES J YANG whose telephone number is (571) 270-5170. The examiner can normally be reached 10:00am-7:00pm M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian Zimmerman, can be reached at (571) 272-3059. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JAMES J YANG/
Primary Examiner, Art Unit 2683