DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to Applicant's Amendment and Remarks filed on 4/6/2022. This Action is made FINAL.
Claims 1-20 are pending for examination.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1 have been considered but are moot because the new ground of rejection relies on a new combination of paragraphs from the references cited in the previous Office Action and on the new reference MOHAJERIN (US20200148215A1).

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 6 and 16 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 7 and 14 of copending Application No. 16591518 (hereafter '518), in view of MOHAJERIN (US20200148215A1).
A table has been created below to compare claim 6 of the instant application and claim 7 of the '518 application side by side.
Instant Application
6. A method comprising:
	receiving first sensor data associated with a first sensor type;
	receiving second sensor data associated with a second sensor type;
	determining, based at least in part on the first sensor data and using a first machine-learned model, a first predicted occupancy map, wherein the first predicted occupancy map indicates whether a portion of an environment surrounding an autonomous vehicle is occupied or unoccupied at a future time;
	determining, based at least in part on the second sensor data and using a second machine-learned model, a second predicted occupancy map, wherein the second predicted occupancy map indicates whether the portion of the environment surrounding the autonomous vehicle is occupied or unoccupied at the future time;
	combining the first predicted occupancy map and the second predicted occupancy map into a data structure indicating whether the portion of the environment is occupied or unoccupied at the future time; and
	controlling the autonomous vehicle based at least in part on the data structure.

Copending Application
7. A method comprising:
	receiving first sensor data associated with a first sensor type;
	receiving second sensor data associated with a second sensor type;
	determining, based at least in part on the first sensor data, a first occupancy map wherein the first occupancy map indicates whether a portion of an environment surrounding an autonomous vehicle is occupied or unoccupied at a future time;
	determining, based at least in part on the second sensor data, a second occupancy map, wherein the second occupancy map indicates whether the portion is occupied or unoccupied at the future time;
	combining the first occupancy map and the second occupancy map into a data structure based at least in part on the first occupancy map and the second occupancy map, wherein the data structure indicates whether the portion of the environment is occupied or unoccupied at the future time; and
	controlling the autonomous vehicle based at least in part on the data structure.

As illustrated in the table above, claim 6 of the instant application is provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claim 7 of the '518 application. All matching elements of the claim limitations appear in bold while non-matching elements of the claim limitations are not bolded.
Although the claims at issue are not identical, they are not patentably distinct from each other because both inventions are directed to controlling an autonomous vehicle. Both inventions perform their function using equivalent components, such as first sensor data and second sensor data. Similarly, both inventions determine occupancy maps and combine the occupancy maps. Lastly, both inventions control the autonomous vehicle based on the combined data structure.
The copending application does not teach the limitations “determining…using a first machine-learned model, a first predicted occupancy map” and “determining…using a second machine-learned model, a second predicted occupancy map” recited in the instant application.
However, in the same field of endeavor, MOHAJERIN (US20200148215A1) teaches “determining…using a first machine-learned model, a first predicted occupancy map” and “determining…using a second machine-learned model, a second predicted occupancy map” (MOHAJERIN: Para 70 “the OGM prediction system 120 may include a machine learning module that implements a learned model that generates predicted OGMs from an input OGM”; i.e., the learned model could be applied to different inputs to generate the first and second predicted occupancy maps).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the method claimed in the '518 application with the features of “determining…using a first machine-learned model, a first predicted occupancy map” and “determining…using a second machine-learned model, a second predicted occupancy map” as disclosed by MOHAJERIN.
As per claim 16, it recites a non-transitory computer-readable medium having limitations similar to those of claim 6 and therefore is rejected on the same basis in view of claim 14 of the '518 application and MOHAJERIN.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6-12, and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Giorgio (US20210131823A1) in view of Yadmellat (US20210064040A1) and MOHAJERIN (US20200148215A1).

	In regards to claim 1, Giorgio teaches A system comprising:
	one or more processors (Giorgio: Para 245 “one or more of those stages may be integrated in the form of a multi-functional stage and/or circuit, e.g. in a single processor or DSP”); and
	a memory storing processor-executable instructions that, when executed by the one or more processors, cause the system to perform operations (Giorgio: Claim 17 “a computer program product loadable in the memory of at least one processing circuit and comprising software code portions for executing the steps of the method”) comprising:
	receiving spatial data associated with a first sensor type (Giorgio: Claim 41 “The plurality of sensors/detector devices S1, S2, Si includes sensors, known per se, such as LIDAR/radar sensor, camera, GPS antenna and sources using the GPS information such as (GPS/GIS) context map, etc., and it is coupled to the system 100. Specifically, in the example shown the sensors are a LIDAR, a camera and a GPS/GIS system producing a (GPS/GIS) context map on the basis of the detected position”; Para 52 “the second temporal fusion stage 302 receives the second dataset Y2 of sensor readings from a second sensor S2 and provides as output a second map G2”);
	receiving image data associated with an image sensor (Giorgio: Claim 41 “The plurality of sensors/detector devices S1, S2, Si includes sensors, known per se, such as LIDAR/radar sensor, camera, GPS antenna and sources using the GPS information such as (GPS/GIS) context map, etc., and it is coupled to the system 100. Specifically, in the example shown the sensors are a LIDAR, a camera and a GPS/GIS system producing a (GPS/GIS) context map on the basis of the detected position”; Para 52 “the first temporal fusion stage 301 receives the first dataset Y1 of sensor readings from a first sensor S1 and provides as output a first map G1”);
	determining, based at least in part on the spatial data [[and using a first machine-learned model,]] a first current occupancy map (Giorgio: Para 52 “the second temporal fusion stage 302 receives the second dataset Y2 of sensor readings from a second sensor S2 and provides as output a second map G2”; Para 108 “the second grid map G2 is the result of processing data from LiDAR sensors, providing information about estimated distance of static and/or dynamic obstacles”) [[and a first predicted occupancy map]],
	determining, based at least in part on the image data [[and using a second machine-learned model,]] a second current occupancy map (Giorgio: Fig. 4A Element a1; Para 52 “the first temporal fusion stage 301 receives the first dataset Y1 of sensor readings from a first sensor S1 and provides as output a first map G1”; Para 108 “the first occupancy grid map G1 is the result of processing data from a visual sensor, thus providing information both on static and dynamic objects and on the topology of the environment (e.g. road with lanes)”) [[and a second predicted occupancy map]];
	combining the first current occupancy map and the second current occupancy map into a data structure indicating whether the portion of the environment is occupied or unoccupied at the current time (Giorgio: Para 112 “Specifically, after respective sensor occupancy grid maps G1, G2, Gi are generated in the temporal fusion stage 30 via temporal fusion processing 30 i of sensor readings Y1, Y2, Yi, for every time step k, the grid maps in the set of sensor occupancy grid maps G1, G2, Gi are “fused” together in the set FF of fused maps which includes at least one “fused” occupancy grid map F, E”); 
Yet Giorgio does not teach determining… using a first machine-learned model …a first predicted occupancy map,
determining… using a second machine-learned model… a second predicted occupancy map, 
combining the first predicted occupancy map and the second predicted occupancy map into the data structure indicating whether the portion of the environment is occupied or unoccupied at a future time; and 
controlling an autonomous vehicle based at least in part on the data structure.
However, in the same field of endeavor, Yadmellat teaches combining the first predicted occupancy map and the second predicted occupancy map into the data structure indicating whether the portion of the environment is occupied or unoccupied at the future time (Yadmellat: Fig. 6C Element 650; Para 89 “Next, the pre-summation OGMs 675 a, 675 b, 675 c are processed in a summation operation to generate a final predicted 2D OGM 650, which is then sent to the trajectory generator module 360 to generate a final predictive trajectory 695. Comparing the predictive trajectory 695 to the non-predictive trajectory 690, it is readily clear that trajectory 695 is safer and leaves room for object 680 a to move ahead before merging into traffic”); and 
controlling an autonomous vehicle based at least in part on the data structure (Yadmellat: Fig. 6C Element 650; Para 89 “Next, the pre-summation OGMs 675 a, 675 b, 675 c are processed in a summation operation to generate a final predicted 2D OGM 650, which is then sent to the trajectory generator module 360 to generate a final predictive trajectory 695. Comparing the predictive trajectory 695 to the non-predictive trajectory 690, it is readily clear that trajectory 695 is safer and leaves room for object 680 a to move ahead before merging into traffic”; Para 3 “The generated trajectory is then fed to a vehicle control system to control the vehicle to follow the given trajectory”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the system of Giorgio with the features of combining the first predicted occupancy map and the second predicted occupancy map into the data structure indicating whether the portion of the environment is occupied or unoccupied at the future time; and controlling an autonomous vehicle based at least in part on the data structure, as disclosed by Yadmellat. One would be motivated to do so for the benefit of “generating a trajectory based on the single predicted OGM” (Yadmellat: Para 16).
Yet the combination of Giorgio and Yadmellat does not teach determining… using a first machine-learned model …a first predicted occupancy map,
determining… using a second machine-learned model… a second predicted occupancy map, 
However, in the same field of endeavor, MOHAJERIN teaches determining… using a first machine-learned model …a first predicted occupancy map (MOHAJERIN: Para 50 “The sensor system 110 communicates with the OGM prediction system 120 to provide observed OGMs. The sensor system 110 receives sensor data (e.g., observations) from the various sensing units, generates an observed OGM from the received sensor data (e.g., observations), and provides the observed OGM to the OGM system 120”; Para 51 “The RNN 122, MF extractor 124, OGM classifier 126, encoder 128 and decoder 129 operate together to generate predicted OGMs from observed OGMs”; Para 54 “The OGM prediction system 120 may repeatedly (e.g., in regular intervals) receive observed OGMs from the OGM generator 121 of the sensor system 110 and generate predicted OGMs in real-time or near real-time”; Para 69 “a machine learning-based system for OGM prediction (e.g., the OGM prediction system 120) that aims to address one or more of the drawbacks discussed above. The disclosed OGM prediction system 120 generates, at each time step in a defined time period, a predicted OGM (e.g., one predicted OGM) for the next time step in the defined time period based on an input OGM for a current time step in the defined time period. The input OGM for the current time step may be a historical observed OGM obtained from a set of historical observed OGMs (e.g., a set comprising about 10 observed OGMs generated by the OGM generator 121 from sensor data received from one or more of the sensing units 112, 114, 116, 118) or a previously-predicted OGM (e.g., a predicted OGM previously generated by the OGM prediction system 120 for the current time step in the defined time period)”; Para 70 “the OGM prediction system 120 may include a machine learning module that implements a learned model that generates predicted OGMs from an input OGM”),
determining… using a second machine-learned model… a second predicted occupancy map (MOHAJERIN: Para 50 “The sensor system 110 communicates with the OGM prediction system 120 to provide observed OGMs. The sensor system 110 receives sensor data (e.g., observations) from the various sensing units, generates an observed OGM from the received sensor data (e.g., observations), and provides the observed OGM to the OGM system 120”; Para 51 “The RNN 122, MF extractor 124, OGM classifier 126, encoder 128 and decoder 129 operate together to generate predicted OGMs from observed OGMs”; Para 54 “The OGM prediction system 120 may repeatedly (e.g., in regular intervals) receive observed OGMs from the OGM generator 121 of the sensor system 110 and generate predicted OGMs in real-time or near real-time”; Para 69 “a machine learning-based system for OGM prediction (e.g., the OGM prediction system 120) that aims to address one or more of the drawbacks discussed above. The disclosed OGM prediction system 120 generates, at each time step in a defined time period, a predicted OGM (e.g., one predicted OGM) for the next time step in the defined time period based on an input OGM for a current time step in the defined time period. The input OGM for the current time step may be a historical observed OGM obtained from a set of historical observed OGMs (e.g., a set comprising about 10 observed OGMs generated by the OGM generator 121 from sensor data received from one or more of the sensing units 112, 114, 116, 118) or a previously-predicted OGM (e.g., a predicted OGM previously generated by the OGM prediction system 120 for the current time step in the defined time period)”; Para 70 “the OGM prediction system 120 may include a machine learning module that implements a learned model that generates predicted OGMs from an input OGM”; i.e. the learned model could be applied to different input resulting in different predicted occupancy maps), 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the system of the combination of Giorgio and Yadmellat with the features of determining… using a first machine-learned model …a first predicted occupancy map, and determining… using a second machine-learned model… a second predicted occupancy map, as disclosed by MOHAJERIN. One would be motivated to do so because “Occupancy Grid Maps (OGMs) are commonly used to represent the environment surrounding an autonomous device. An OGM may be generated from sensor data received from sensors of the autonomous device (also referred to as observations) and represented as a grid of cells” (MOHAJERIN: Para 3).
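For illustration only, the combining step recited in claim 1 can be sketched in a few lines of Python. The sketch assumes each predicted occupancy map is a grid of per-cell occupancy probabilities for the same future time, and it fuses cells with a noisy-OR rule followed by a threshold. The function name fuse_predicted_maps, the noisy-OR rule, and the 0.5 threshold are hypothetical choices made for this sketch; none is taken from Giorgio, Yadmellat, MOHAJERIN, or the instant application.

```python
from typing import List

# Assumption: a predicted occupancy map is a 2D grid of probabilities in [0, 1],
# one value per cell, all referring to the same future time.
Grid = List[List[float]]


def fuse_predicted_maps(first: Grid, second: Grid, threshold: float = 0.5) -> List[List[bool]]:
    """Combine two predicted occupancy maps into one occupied/unoccupied grid.

    Hypothetical fusion rule (noisy-OR): a cell is treated as occupied if the
    probability that at least one map reports it occupied meets the threshold.
    """
    if len(first) != len(second) or any(len(a) != len(b) for a, b in zip(first, second)):
        raise ValueError("both maps must cover the same grid")

    fused: List[List[bool]] = []
    for row_a, row_b in zip(first, second):
        fused_row = []
        for p_a, p_b in zip(row_a, row_b):
            # Probability that the cell is occupied according to either map.
            p_occupied = 1.0 - (1.0 - p_a) * (1.0 - p_b)
            fused_row.append(p_occupied >= threshold)
        fused.append(fused_row)
    return fused


# Illustrative values only: one map derived from spatial data, one from image data.
spatial_prediction = [[0.1, 0.8], [0.2, 0.3]]
image_prediction = [[0.2, 0.1], [0.4, 0.3]]
data_structure = fuse_predicted_maps(spatial_prediction, image_prediction)
```

Under these assumptions a cell can be marked occupied even when neither map alone crosses the threshold, which is one reason a designer might prefer fusing probabilities over fusing already-thresholded maps.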

	In regards to claim 2, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches The system of claim 1, and Giorgio further teaches the operations further comprising:
	determining a first set of detections using a first perception system (Giorgio: Para 52 “the second temporal fusion stage 302 receives the second dataset Y2 of sensor readings from a second sensor S2 and provides as output a second map G2”; Para 108 “the second grid map G2 is the result of processing data from LiDAR sensors, providing information about estimated distance of static and/or dynamic obstacles”); and
wherein at least one of [[the first predicted occupancy map,]] the second predicted occupancy map, [[or the data structure]] is determined using a second perception system (Giorgio: Fig. 4A Element a1; Para 52 “the first temporal fusion stage 301 receives the first dataset Y1 of sensor readings from a first sensor S1 and provides as output a first map G1”; Para 108 “the first occupancy grid map G1 is the result of processing data from a visual sensor, thus providing information both on static and dynamic objects and on the topology of the environment (e.g. road with lanes)”).

In regards to claim 3, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches The system of claim 2, and MOHAJERIN further teaches the first machine-learned model comprises a first encoder and a first decoder (MOHAJERIN: Para 70 “OGM prediction system 120 includes encoder/decoder(s) so that large observed OGMs may be processed in a relatively short period of time (e.g., equal to or faster than the frequency for obtaining one frame of sensor data using typical sensors on an autonomous vehicle)”; Para 12 “a first encoder configured to receive, at each time step in a defined time period, an input OGM for a current time step in the defined time period and extract OGM features from the input OGM”); and
wherein the second machine-learned model comprises a second encoder and a second decoder (MOHAJERIN: Para 70 “OGM prediction system 120 includes encoder/decoder(s) so that large observed OGMs may be processed in a relatively short period of time (e.g., equal to or faster than the frequency for obtaining one frame of sensor data using typical sensors on an autonomous vehicle)”; Para 13 “a second encoder configured for extracting reference map features from a reference map, the reference map representing a priori information about the sensed environment”; i.e. it is well known in the art that each encoder would be matched with a decoder).

	In regards to claim 4, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches The system of claim 3, and MOHAJERIN further teaches the first encoder comprises five encoder blocks (MOHAJERIN: Fig. 6 Element 128a; i.e. the drawing shows multiple encoder blocks in one system), a first encoder block of the five encoder blocks comprising two convolutional layers followed by a first normalization layer (MOHAJERIN: Para 93 “the encoder 128 may be implemented as a series of convolution layers, followed by pooling operations.”);
	wherein the first decoder comprises five decoder blocks, wherein a first decoder block of the five decoder blocks comprises three convolutional layers and a second normalization layer (MOHAJERIN: Para 141 “the first encoder may include one or more convolution and pooling layers for extracting the OGM features. The machine learning-based system may also include: a decoder including one or more deconvolution layers corresponding to transpositions of the one or more convolution and pooling layers of the first encoder, wherein the decoder converts output from the recurrent neural network to the corrective term”; i.e. the number of decoders is matched with the number of encoders 128 a, 128 b, 128 c);
	wherein the second encoder comprises four encoder blocks, a first encoder block of the four encoder blocks comprising a ResNet block (MOHAJERIN: Fig. 6-7 Element 122; Para 81 “OGM prediction system 120 a, OGM predictions are generated based on features fk m extracted from the reference map Mk*, features fk o extracted from the OGM (observed or predicted) at each time step, and features fk p extracted from a motion-flow μk of the OGM at each time step. Generally, features may be extracted using any suitable neural network, for example a convolution neural network, as discussed further below”; Para 139 “The disclosed OGM prediction system 120 may be implemented using any suitable machine learning-based architecture, including any suitable neural network architectures”; i.e. ResNet is a type of neural network) and a pooling layer (MOHAJERIN: Para 93 “the encoder 128 may be implemented as a series of convolution layers, followed by pooling operations”); and
	wherein the second decoder comprises four decoder blocks, a first decoder block of the four decoder blocks comprising a ResNet block (MOHAJERIN: Fig. 6-7 Element 122; Para 81 “OGM prediction system 120 a, OGM predictions are generated based on features fk m extracted from the reference map Mk*, features fk o extracted from the OGM (observed or predicted) at each time step, and features fk p extracted from a motion-flow μk of the OGM at each time step. Generally, features may be extracted using any suitable neural network, for example a convolution neural network, as discussed further below”; Para 139 “The disclosed OGM prediction system 120 may be implemented using any suitable machine learning-based architecture, including any suitable neural network architectures”; i.e. ResNet is a type of neural network) and an upsampling layer (MOHAJERIN: Para 98 “The use of the encoder 128 together with the decoder 129 may be referred to as an encoder/decoder architecture. The encoder 128 may be considered to be performing downsampling of the features. The decoder 129 may be considered as performing upsampling of the output the RNN 122 (e.g., the predicted corrective term Δok), to increase the dimensionality of the output the RNN 122 (e.g., the predicted corrective term Δok) back to the original dimensions of the input OGM ok.”).

	In regards to claim 6, Giorgio teaches a method comprising:
	receiving first sensor data associated with a first sensor type (Giorgio: Claim 41 “The plurality of sensors/detector devices S1, S2, Si includes sensors, known per se, such as LIDAR/radar sensor, camera, GPS antenna and sources using the GPS information such as (GPS/GIS) context map, etc., and it is coupled to the system 100. Specifically, in the example shown the sensors are a LIDAR, a camera and a GPS/GIS system producing a (GPS/GIS) context map on the basis of the detected position”; Para 52 “the second temporal fusion stage 302 receives the second dataset Y2 of sensor readings from a second sensor S2 and provides as output a second map G2”);
	receiving second sensor data associated with a second sensor type (Giorgio: Claim 41 “The plurality of sensors/detector devices S1, S2, Si includes sensors, known per se, such as LIDAR/radar sensor, camera, GPS antenna and sources using the GPS information such as (GPS/GIS) context map, etc., and it is coupled to the system 100. Specifically, in the example shown the sensors are a LIDAR, a camera and a GPS/GIS system producing a (GPS/GIS) context map on the basis of the detected position”; Para 52 “the first temporal fusion stage 301 receives the first dataset Y1 of sensor readings from a first sensor S1 and provides as output a first map G1”); 
Yet Giorgio does not teach determining, based at least in part on the first sensor data and using a first machine-learned model, a first predicted occupancy map, wherein the first predicted occupancy map indicates whether a portion of an environment surrounding an autonomous vehicle is occupied or unoccupied at a future time;
determining, based at least in part on the second sensor data and using a second machine-learned model, a second predicted occupancy map, wherein the second predicted occupancy map indicates whether the portion of the environment surrounding the autonomous vehicle is occupied or unoccupied at the future time;
combining the first predicted occupancy map and the second predicted occupancy map into a data structure indicating whether the portion of the environment is occupied or unoccupied at the future time; and
controlling the autonomous vehicle based at least in part on the data structure.
	However, in the same field of endeavor, Yadmellat teaches … wherein the first predicted occupancy map (Yadmellat: Fig. 6C Element 670a-670c; Para 88 “FIG. 6C illustrates an example where a set of predicted OGMs Mt={Mt0, Mt1, Mt2} 410 are processed (e.g. filtered) by the OGM merger module 351 with weight maps 415 and area of interest maps 610 to generate a set of filtered predicted OGMs 670 a, 670 b, 670 c, which at each point in time t=t0, t1 or t2, takes into consideration where other moving objects 680 a, 680 b, 680 c might be”)  indicates whether a portion of an environment surrounding an autonomous vehicle is occupied or unoccupied at a future time (Yadmellat: Para 62 “Occupancy grid maps (OGMs) are commonly used to represent the environment surrounding the vehicle 100. An OGM may be represented as a grid cell. Each cell represents a physical space in the environment, and each cell contains a value representing the probability of each cell being occupied by an object, based on information or observations regarding the vehicle 100 and its surrounding environment. For example, the path planning system 130 may include a module, such as the OGM prediction module 340, for generating a predicted OGM”);
	… wherein the second predicted occupancy map (Yadmellat: Fig. 6C Element 675a-675c; Para 88 “The filtered predicted OGMs 670 a, 670 b, 670 c are then processed to generate pre-summation predicted OGMs 675 a, 675 b, 675 c, which only show the moving object(s) that appear in a relevant weight region 640 at a point in time”)  indicates whether the portion of the environment surrounding the autonomous vehicle is occupied or unoccupied at the future time (Yadmellat: Para 62 “Occupancy grid maps (OGMs) are commonly used to represent the environment surrounding the vehicle 100. An OGM may be represented as a grid cell. Each cell represents a physical space in the environment, and each cell contains a value representing the probability of each cell being occupied by an object, based on information or observations regarding the vehicle 100 and its surrounding environment. For example, the path planning system 130 may include a module, such as the OGM prediction module 340, for generating a predicted OGM”);
combining the first predicted occupancy map and the second predicted occupancy map into the data structure indicating whether the portion of the environment is occupied or unoccupied at the future time (Yadmellat: Fig. 6C Element 650; Para 89 “Next, the pre-summation OGMs 675 a, 675 b, 675 c are processed in a summation operation to generate a final predicted 2D OGM 650, which is then sent to the trajectory generator module 360 to generate a final predictive trajectory 695. Comparing the predictive trajectory 695 to the non-predictive trajectory 690, it is readily clear that trajectory 695 is safer and leaves room for object 680 a to move ahead before merging into traffic”); and 
controlling an autonomous vehicle based at least in part on the data structure (Yadmellat: Fig. 6C Element 650; Para 89 “Next, the pre-summation OGMs 675 a, 675 b, 675 c are processed in a summation operation to generate a final predicted 2D OGM 650, which is then sent to the trajectory generator module 360 to generate a final predictive trajectory 695. Comparing the predictive trajectory 695 to the non-predictive trajectory 690, it is readily clear that trajectory 695 is safer and leaves room for object 680 a to move ahead before merging into traffic”; Para 3 “The generated trajectory is then fed to a vehicle control system to control the vehicle to follow the given trajectory”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the method of Giorgio with the features of wherein the first predicted occupancy map indicates whether a portion of an environment surrounding an autonomous vehicle is occupied or unoccupied at a future time; wherein the second predicted occupancy map indicates whether the portion of the environment surrounding the autonomous vehicle is occupied or unoccupied at the future time; combining the first predicted occupancy map and the second predicted occupancy map into the data structure indicating whether the portion of the environment is occupied or unoccupied at the future time; and controlling an autonomous vehicle based at least in part on the data structure, as disclosed by Yadmellat. One would be motivated to do so for the benefit of “generating a trajectory based on the single predicted OGM” (Yadmellat: Para 16).
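As a rough illustration of the summation operation cited from Yadmellat Para 89, the following Python sketch adds several pre-summation OGMs cell by cell to produce a single predicted 2D OGM. The quoted passage does not specify how cell values are summed or bounded, so the element-wise sum, the clamp to 1.0, and the name sum_predicted_ogms are assumptions made for this sketch and do not describe Yadmellat's actual implementation.

```python
from typing import List, Sequence

# Assumption: each OGM is a 2D grid of non-negative per-cell occupancy values.
Grid = List[List[float]]


def sum_predicted_ogms(ogms: Sequence[Grid]) -> Grid:
    """Collapse per-time-step OGMs into one 2D OGM by clamped element-wise sum."""
    if not ogms:
        raise ValueError("at least one OGM is required")

    rows, cols = len(ogms[0]), len(ogms[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for ogm in ogms:
        if len(ogm) != rows or any(len(row) != cols for row in ogm):
            raise ValueError("all OGMs must share the same grid dimensions")
        for i in range(rows):
            for j in range(cols):
                # Hypothetical bound: keep each cell a valid probability-like value.
                total[i][j] = min(1.0, total[i][j] + ogm[i][j])
    return total


# Illustrative values only: three pre-summation OGMs for t0, t1, and t2.
pre_summation = [
    [[0.0, 0.5]],
    [[0.0, 0.25]],
    [[0.0, 0.5]],
]
final_ogm = sum_predicted_ogms(pre_summation)
```

A trajectory generator consuming final_ogm would then see every cell that is expected to be occupied at any of the summed time steps, which is consistent with the single predicted OGM described in the cited passage.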
Yet the combination of Giorgio and Yadmellat do not teach determining, based at least in part on the first sensor data and using a first machine-learned model, a first predicted occupancy map…
determining, based at least in part on the second sensor data and using a second machine-learned model, a second predicted occupancy map…
However, in the same field of endeavor, MOHAJERIN teaches determining, based at least in part on the first sensor data and using a first machine-learned model, a first predicted occupancy map (MOHAJERIN: Para 50 “The sensor system 110 communicates with the OGM prediction system 120 to provide observed OGMs. The sensor system 110 receives sensor data (e.g., observations) from the various sensing units, generates an observed OGM from the received sensor data (e.g., observations), and provides the observed OGM to the OGM system 120”; Para 51 “The RNN 122, MF extractor 124, OGM classifier 126, encoder 128 and decoder 129 operate together to generate predicted OGMs from observed OGMs”; Para 54 “The OGM prediction system 120 may repeatedly (e.g., in regular intervals) receive observed OGMs from the OGM generator 121 of the sensor system 110 and generate predicted OGMs in real-time or near real-time”; Para 69 “a machine learning-based system for OGM prediction (e.g., the OGM prediction system 120) that aims to address one or more of the drawbacks discussed above. The disclosed OGM prediction system 120 generates, at each time step in a defined time period, a predicted OGM (e.g., one predicted OGM) for the next time step in the defined time period based on an input OGM for a current time step in the defined time period. The input OGM for the current time step may be a historical observed OGM obtained from a set of historical observed OGMs (e.g., a set comprising about 10 observed OGMs generated by the OGM generator 121 from sensor data received from one or more of the sensing units 112, 114, 116, 118) or a previously-predicted OGM (e.g., a predicted OGM previously generated by the OGM prediction system 120 for the current time step in the defined time period)”; Para 70 “the OGM prediction system 120 may include a machine learning module that implements a learned model that generates predicted OGMs from an input OGM”),
determining, based at least in part on the second sensor data and using a second machine-learned model, a second predicted occupancy map (MOHAJERIN: Para 50 “The sensor system 110 communicates with the OGM prediction system 120 to provide observed OGMs. The sensor system 110 receives sensor data (e.g., observations) from the various sensing units, generates an observed OGM from the received sensor data (e.g., observations), and provides the observed OGM to the OGM system 120”; Para 51 “The RNN 122, MF extractor 124, OGM classifier 126, encoder 128 and decoder 129 operate together to generate predicted OGMs from observed OGMs”; Para 54 “The OGM prediction system 120 may repeatedly (e.g., in regular intervals) receive observed OGMs from the OGM generator 121 of the sensor system 110 and generate predicted OGMs in real-time or near real-time”; Para 69 “a machine learning-based system for OGM prediction (e.g., the OGM prediction system 120) that aims to address one or more of the drawbacks discussed above. The disclosed OGM prediction system 120 generates, at each time step in a defined time period, a predicted OGM (e.g., one predicted OGM) for the next time step in the defined time period based on an input OGM for a current time step in the defined time period. The input OGM for the current time step may be a historical observed OGM obtained from a set of historical observed OGMs (e.g., a set comprising about 10 observed OGMs generated by the OGM generator 121 from sensor data received from one or more of the sensing units 112, 114, 116, 118) or a previously-predicted OGM (e.g., a predicted OGM previously generated by the OGM prediction system 120 for the current time step in the defined time period)”; Para 70 “the OGM prediction system 120 may include a machine learning module that implements a learned model that generates predicted OGMs from an input OGM”; i.e. the learned model could be applied to different inputs, resulting in different predicted occupancy maps).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify the system of the combination of Giorgio and Yadmellat with the feature of determining, based at least in part on the first sensor data and using a first machine-learned model, a first predicted occupancy map… determining, based at least in part on the second sensor data and using a second machine-learned model, a second predicted occupancy map… disclosed by MOHAJERIN. One would be motivated to do so for the benefit of “Occupancy Grid Maps (OGMs) are commonly used to represent the environment surrounding an autonomous device. An OGM may be generated from sensor data received from sensors of the autonomous device (also referred to as observations) and represented as a grid of cells” (MOHAJERIN: Para 3).
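
For illustration only, the summation operation quoted from Yadmellat Para 89, in which several pre-summation OGMs are combined into one final predicted OGM, might be sketched as follows. This is a minimal sketch and is not taken from any cited reference: the function name `combine_predicted_ogms`, the representation of an OGM as nested lists of occupancy probabilities, the clamping of the sum to 1.0, and the 0.5 threshold are all assumptions.

```python
def combine_predicted_ogms(ogms, occupied_threshold=0.5):
    """Sum per-cell occupancy values from several predicted OGMs.

    Assumptions (not from the cited references): each OGM is a list of
    rows of occupancy probabilities in [0, 1], all OGMs share one shape,
    and the per-cell sum is clamped to 1.0.
    """
    if not ogms:
        raise ValueError("at least one OGM is required")
    rows, cols = len(ogms[0]), len(ogms[0][0])
    for ogm in ogms:
        if len(ogm) != rows or any(len(row) != cols for row in ogm):
            raise ValueError("all OGMs must have the same shape")

    # Cell-wise summation, clamped so the result stays a probability.
    combined = [
        [min(1.0, sum(ogm[r][c] for ogm in ogms)) for c in range(cols)]
        for r in range(rows)
    ]
    # Binary occupied/unoccupied view of the combined map.
    occupied = [[value >= occupied_threshold for value in row] for row in combined]
    return combined, occupied
```

Under these assumptions, a cell is reported as occupied when the evidence from the individual maps together reaches the threshold, even if no single map does so alone.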

	In regards to claim 7, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the method of claim 6, and Giorgio further teaches the method further comprising:
	determining a first detection using a first system (Giorgio: Para 52 “the second temporal fusion stage 302 receives the second dataset Y2 of sensor readings from a second sensor S2 and provides as output a second map G2”; Para 108 “the second grid map G2 is the result of processing data from LiDAR sensors, providing information about estimated distance of static and/or dynamic obstacles”); and
wherein at least one of [[the first predicted occupancy map,]] the second predicted occupancy map, [[or the data structure]] is determined using a second perception system (Giorgio: Fig. 4A Element a1; Para 52 “the first temporal fusion stage 301 receives the first dataset Y1 of sensor readings from a first sensor S1 and provides as output a first map G1”; Para 108 “the first occupancy grid map G1 is the result of processing data from a visual sensor, thus providing information both on static and dynamic objects and on the topology of the environment (e.g. road with lanes)”).

	In regards to claim 8, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the method of claim 6, and MOHAJERIN further teaches the first machine-learned model comprises a first encoder, wherein the first encoder further comprises five encoder blocks (MOHAJERIN: Fig. 6 Element 128a; i.e. the drawing shows multiple encoder blocks in one system), a first encoder block of the five encoder blocks comprising two convolutional layers followed by a first normalization layer (MOHAJERIN: Para 93 “the encoder 128 may be implemented as a series of convolution layers, followed by pooling operations.”).

In regards to claim 9, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the method of claim 8, and MOHAJERIN further teaches the first encoder further comprises a skip connection between a fourth encoder block of the five encoder blocks to a second encoder block of the five encoder blocks and a pooling layer (MOHAJERIN: Fig. 3; Para 93 “Encoders 128 a, 128 b, 128 c extract the features fk m, fk o, fk p from the reference map Mk*, the input OGM ok and the motion-flow μk, respectively. The encoders 128 a, 128 b, 128 c may be implemented using the encoder 128 shown in FIG. 1A, or may be implemented using separate encoder units. For simplicity, the present disclosure will refer to a single encoder 128, however this is not intended to be limiting. In the disclosed OGM prediction system 120 a, the encoder 128 may be implemented as a series of convolution layers, followed by pooling operations”; Para 103 “As shown in FIG. 3, during the prediction phase, the predicted OGM õk+1 is fed back (via the delay block 304 b) and used as the input OGM for the next time step”; i.e. the feedback indicates a skip connection while the pooling operations indicate a pooling layer).
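
For illustration only, the feedback described in the quoted MOHAJERIN Para 103, in which each predicted OGM is used as the input OGM for the next time step, might be sketched as below. This is a minimal sketch under stated assumptions: `rollout` and `shift_right` are hypothetical names, and `shift_right` is a toy placeholder for a learned one-step predictor, not the RNN-based model of the reference.

```python
def rollout(predict_next, observed_ogm, steps):
    """Apply a one-step predictor repeatedly, feeding each output back in."""
    predictions = []
    current = observed_ogm
    for _ in range(steps):
        # The previous prediction becomes the input for this time step.
        current = predict_next(current)
        predictions.append(current)
    return predictions


def shift_right(ogm):
    """Toy stand-in for a learned model: move occupancy one column right."""
    return [[0] + row[:-1] for row in ogm]


# One observed grid produces a sequence of predicted grids.
future_ogms = rollout(shift_right, [[1, 0, 0]], steps=2)
```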

In regards to claim 10, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the method of claim 6, and MOHAJERIN further teaches the first machine-learned model comprises a first decoder that further comprises five decoder blocks, wherein a first decoder block of the five decoder blocks comprises three convolutional layers and a second normalization layer (MOHAJERIN: Para 141 “the first encoder may include one or more convolution and pooling layers for extracting the OGM features. The machine learning-based system may also include: a decoder including one or more deconvolution layers corresponding to transpositions of the one or more convolution and pooling layers of the first encoder, wherein the decoder converts output from the recurrent neural network to the corrective term”; i.e. the number of decoders is matched with the number of encoders 128 a, 128 b, 128 c).

In regards to claim 11, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the method of claim 6, and MOHAJERIN further teaches the second machine-learned model comprises a second encoder that comprises four encoder blocks, a first encoder block of the four encoder blocks comprising a ResNet block (MOHAJERIN: Fig. 6-7 Element 122; Para 81 “OGM prediction system 120 a, OGM predictions are generated based on features fk m extracted from the reference map Mk*, features fk o extracted from the OGM (observed or predicted) at each time step, and features fk p extracted from a motion-flow μk of the OGM at each time step. Generally, features may be extracted using any suitable neural network, for example a convolution neural network, as discussed further below”; Para 139 “The disclosed OGM prediction system 120 may be implemented using any suitable machine learning-based architecture, including any suitable neural network architectures”; i.e. ResNet is a type of neural network) and a pooling layer (MOHAJERIN: Para 93 “the encoder 128 may be implemented as a series of convolution layers, followed by pooling operations”).

In regards to claim 12, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the method of claim 6, and MOHAJERIN further teaches the second machine-learned model comprises a second decoder that comprises four decoder blocks, a first decoder block of the four decoder blocks comprising a ResNet block (MOHAJERIN: Fig. 6-7 Element 122; Para 81 “OGM prediction system 120 a, OGM predictions are generated based on features fk m extracted from the reference map Mk*, features fk o extracted from the OGM (observed or predicted) at each time step, and features fk p extracted from a motion-flow μk of the OGM at each time step. Generally, features may be extracted using any suitable neural network, for example a convolution neural network, as discussed further below”; Para 139 “The disclosed OGM prediction system 120 may be implemented using any suitable machine learning-based architecture, including any suitable neural network architectures”; i.e. ResNet is a type of neural network) and an upsampling layer (MOHAJERIN: Para 98 “The use of the encoder 128 together with the decoder 129 may be referred to as an encoder/decoder architecture. The encoder 128 may be considered to be performing downsampling of the features. The decoder 129 may be considered as performing upsampling of the output the RNN 122 (e.g., the predicted corrective term Δok), to increase the dimensionality of the output the RNN 122 (e.g., the predicted corrective term Δok) back to the original dimensions of the input OGM ok.”).
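
For illustration only, the downsampling and upsampling roles that the quoted MOHAJERIN Para 98 attributes to the encoder and decoder might be sketched as below. This is a minimal sketch, not the architecture of any cited reference: the 2x2 max pooling, the nearest-neighbour upsampling, the function names, and the requirement of even grid dimensions are all assumptions chosen to show the output returning to the input dimensions.

```python
def max_pool_2x2(grid):
    """Downsample by taking the maximum of each 2x2 window (even dims assumed)."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def upsample_2x(grid):
    """Nearest-neighbour upsampling: repeat every cell twice in each direction."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # separate copy so rows are independent
    return out
```

Applying `upsample_2x(max_pool_2x2(grid))` to a grid with even dimensions yields a grid of the original size, although fine detail inside each 2x2 window is lost.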

In regards to claim 15, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the method of claim 6, and MOHAJERIN further teaches training the first machine-learned model using data associated with the first sensor type (MOHAJERIN: Para 49 “The various sensing units 112, 114, 116, 118 may collect information about the local external environment of the vehicle 100 (e.g., any immediately surrounding obstacles) as well as information from a wider vicinity (e.g., the radar unit 112 and LIDAR unit 114 may collect information from an area of up to 100 m radius or more around the vehicle 100) and provide sensor data indicative of the collected information to an OGM generator 121”; Para 50 “The sensor system 110 communicates with the OGM prediction system 120 to provide observed OGMs. The sensor system 110 receives sensor data (e.g., observations) from the various sensing units, generates an observed OGM from the received sensor data (e.g., observations)”; Para 102 “In order to generate the predicted OGM õk+1 that contains values representing occupancy probability at the next time step, the OGM classifier 126 is used. In some embodiments, the OGM classifier 126 is a machine learning module that includes another neural network (e.g., another RNN) that has been trained to learn to generate a predicted OGM õk+1 that contains values representing occupancy probability at the next time step from a corrected OGM ôk+1. The neural network of the OGM classifier 126 is trained to using training dataset comprising corrected OGMs ôk+1 to learn generate predicted OGMs õk+1. The training of the neural network using the training dataset involves adjusting the parameters (e.g., the weights and biases) of the neural network until a loss function is optimized”; Fig. 1A element 110; i.e. element 110 indicates different types of sensors, which generate different datasets forming the observed OGMs used to train the OGM prediction system); and
training the second machine-learned model using data associated with the second sensor type (MOHAJERIN: Para 49 “The various sensing units 112, 114, 116, 118 may collect information about the local external environment of the vehicle 100 (e.g., any immediately surrounding obstacles) as well as information from a wider vicinity (e.g., the radar unit 112 and LIDAR unit 114 may collect information from an area of up to 100 m radius or more around the vehicle 100) and provide sensor data indicative of the collected information to an OGM generator 121”; Para 50 “The sensor system 110 communicates with the OGM prediction system 120 to provide observed OGMs. The sensor system 110 receives sensor data (e.g., observations) from the various sensing units, generates an observed OGM from the received sensor data (e.g., observations)”; Para 102 “In order to generate the predicted OGM õk+1 that contains values representing occupancy probability at the next time step, the OGM classifier 126 is used. In some embodiments, the OGM classifier 126 is a machine learning module that includes another neural network (e.g., another RNN) that has been trained to learn to generate a predicted OGM õk+1 that contains values representing occupancy probability at the next time step from a corrected OGM ôk+1. The neural network of the OGM classifier 126 is trained to using training dataset comprising corrected OGMs ôk+1 to learn generate predicted OGMs õk+1. The training of the neural network using the training dataset involves adjusting the parameters (e.g., the weights and biases) of the neural network until a loss function is optimized”; Fig. 1A element 110; i.e. element 110 indicates different types of sensors, which generate different datasets forming the observed OGMs used to train the OGM prediction system).

As per claim 16, it recites a non-transitory computer-readable medium having limitations similar to those of claim 6 and therefore is rejected on the same basis. MOHAJERIN further teaches a non-transitory computer-readable medium storing processor-executable instructions (MOHAJERIN: Para 63 “The non-transitory memories(s) 180 may include a volatile or non-volatile memory (e.g., a flash memory, a random access memory (RAM), and/or a read-only memory (ROM)). The non-transitory memory(ies) 180 may store instructions for execution by the processing device(s) 172, such as to carry out examples described in the present disclosure”).

As per claim 17, it recites a non-transitory computer-readable medium having limitations similar to those of claim 7 and therefore is rejected on the same basis. 

As per claim 18, it recites a non-transitory computer-readable medium having limitations similar to those of claims 8 and 10 and therefore is rejected on the same basis.

As per claim 19, it recites a non-transitory computer-readable medium having limitations similar to those of claims 11 and 12 and therefore is rejected on the same basis.

Claims 5, 13-14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Giorgio (US20210131823A1) in view of Yadmellat (US20210064040A1) and MOHAJERIN (US20200148215A1), further in view of Parchami (US20190235520A1).

In regards to claim 5, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the system of claim 3.
Yet the combination of Giorgio, Yadmellat, and MOHAJERIN does not teach the second machine-learned model further comprises an orthographic feature transform layer configured to convert a pixel space to a top-down space, wherein the orthographic feature transform layer is positioned between the second encoder and the second decoder.
However, in the same field of endeavor, Parchami teaches the second machine-learned model further comprises an orthographic feature transform layer configured to convert a pixel space to a top-down space, wherein the orthographic feature transform layer is positioned between the second encoder and the second decoder (Parchami: Para 34 “Prediction images p2-p6 combine distance information with segmentation information to transform estimated results 412 from convolutional layer C10 and deconvolutional layers D2, D4, D6 and D8 into estimated cognitive maps 414 by orthographically projecting the estimated results 412 onto a 2D ground plane based on distance information to segmented objects and coloring the estimated cognitive map 414 based on information regarding object detection, pixel-wise segmentation, 3D object poses, and relative distances included in prediction images p2-p6”; Fig. 4 element 414; i.e. estimated cognitive map 414 is located between the convolutional layers C1-C10 (encoder) and deconvolutional layers D1-D10 (decoder)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify the system of the combination of Giorgio, Yadmellat, and MOHAJERIN with the feature of the second machine-learned model further comprises an orthographic feature transform layer configured to convert a pixel space to a top-down space, wherein the orthographic feature transform layer is positioned between the second encoder and the second decoder, as disclosed by Parchami. One would be motivated to do so for the benefit of “provide accurate and timely information regarding objects near or around a vehicle to support operation of the vehicle” (Parchami: Para 1).

In regards to claim 13, the combination of Giorgio, Yadmellat, and MOHAJERIN teaches the method of claim 6.
Yet the combination of Giorgio, Yadmellat, and MOHAJERIN does not teach the second machine-learned model further comprises an orthographic feature transform layer configured to convert a pixel space to a top-down space.
However, in the same field of endeavor, Parchami teaches the second machine-learned model further comprises an orthographic feature transform layer configured to convert a pixel space to a top-down space (Parchami: Para 34 “Prediction images p2-p6 combine distance information with segmentation information to transform estimated results 412 from convolutional layer C10 and deconvolutional layers D2, D4, D6 and D8 into estimated cognitive maps 414 by orthographically projecting the estimated results 412 onto a 2D ground plane based on distance information to segmented objects and coloring the estimated cognitive map 414 based on information regarding object detection, pixel-wise segmentation, 3D object poses, and relative distances included in prediction images p2-p6”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of the combination of Giorgio, Yadmellat, and MOHAJERIN with the feature of the second machine-learned model further comprises an orthographic feature transform layer configured to convert a pixel space to a top-down space, as disclosed by Parchami. One would be motivated to do so for the benefit of “provide accurate and timely information regarding objects near or around a vehicle to support operation of the vehicle” (Parchami: Para 1).
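
For illustration only, the conversion from a pixel space to a top-down space by projecting onto a 2D ground plane using distance information, as described in the quoted Parchami Para 34, might be sketched as below. This is a minimal sketch under stated assumptions and is not the method of any cited reference: the pinhole camera relation, the function name `project_to_top_down`, the parameters `focal_px`, `center_u`, and `cell_m`, and the grid layout (row 0 nearest the camera, the camera axis at the left edge of the middle column) are all hypothetical.

```python
def project_to_top_down(detections, focal_px, center_u, cell_m, grid_rows, grid_cols):
    """Place labelled image detections into a top-down grid.

    Each detection is (u, depth_m, label): the horizontal pixel coordinate,
    the estimated forward distance in metres, and a class label.
    Assumed pinhole relation: lateral_m = (u - center_u) * depth_m / focal_px.
    """
    grid = [[None] * grid_cols for _ in range(grid_rows)]
    for u, depth_m, label in detections:
        lateral_m = (u - center_u) * depth_m / focal_px
        row = int(depth_m // cell_m)                      # forward distance
        col = int(lateral_m // cell_m) + grid_cols // 2   # left/right offset
        # Detections that fall outside the grid are dropped.
        if 0 <= row < grid_rows and 0 <= col < grid_cols:
            grid[row][col] = label
    return grid
```

Under these assumptions, a detection at the image centre always lands in the middle column, and its row depends only on its estimated distance.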

In regards to claim 14, the combination of Giorgio, Yadmellat, MOHAJERIN, and Parchami teaches the method of claim 6, and Parchami further teaches the orthographic feature transform layer is positioned between a second encoder and a second decoder (Parchami: Fig. 4 element 414; i.e. estimated cognitive map 414 is located between the convolutional layers C1-C10 (encoder) and deconvolutional layers D1-D10 (decoder)).

As per claim 20, it recites a non-transitory computer-readable medium having limitations similar to those of claims 13 and 14 and therefore is rejected on the same basis.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to WENYUAN YANG whose telephone number is (571)272-5455. The examiner can normally be reached Monday - Thursday 9:00AM-5:00PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James J. Lee, can be reached at (571) 270-5965. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/W.Y./Examiner, Art Unit 3668
/JAMES J LEE/Supervisory Patent Examiner, Art Unit 3668