DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 6-8, 10, 12-18 and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Laddah et al (US PG-PUB No. 20220035376 A1).
-Regarding claim 6, Laddah discloses a method comprising (Abstract; FIGS. 1-9): receiving sensor data from a sensor associated with an environment including an object (FIG. 1, data 116, system 118; FIG. 6, step 602; [0023]); determining, based at least in part on the sensor data, spatial data representing the environment ([0007], “obtaining map data associated with the LIDAR data and the radar data”; [0025]; [0027]; [0029], “map data”; FIG. 1, data 122; FIG. 6, step 604); inputting the spatial data into a first portion of a machine learned (ML) model (FIG. 1; FIG. 3, data 306, map subnetwork 340; FIG. 4); receiving intermediate data from the first portion of the ML model (FIG. 3, subnetwork 340, feature 342; FIG. 4), wherein the intermediate data includes spatial feature data corresponding to a spatial feature encoded as being associated with the object ([0096], “generate a coordinate frame with various information, such as lane locations”); determining, based at least in part on the sensor data (FIG. 3, data 302, 304), secondary feature data corresponding to a feature associated with the object (FIG. 3, features 332, 312, output 320; FIG. 4; FIG. 6, step 608); inputting the spatial feature data and the secondary feature data into a second portion of the ML model (FIG. 1; FIG. 3, backbone 350, head 360; FIG. 4; FIG. 6, steps 610-612; [0098]); and determining a classification probability of the object based at least in part on data received from the second portion of the ML model (FIGS. 1, 3-4, 6; [0105], “probabilistic multi-hypothesis trajectory predictions for each actor/detected object”; [0118]; [0098]).
-Regarding claim 15, Laddah discloses one or more non-transitory computer-readable media storing instructions executable by a processor ([0066], “non-transitory”; [0132]; FIG. 9), wherein the instructions, when executed, cause the processor to perform operations ([0059]; [0066]; FIG. 9) comprising (Abstract; FIGS. 1-9): receiving sensor data from a sensor associated with an environment including an object (FIG. 1, data 116, system 118; FIG. 6, step 602; [0023]); determining, based at least in part on the sensor data, spatial data representing the environment ([0007], “obtaining map data associated with the LIDAR data and the radar data”; [0025]; [0027]; [0029], “map data”; FIG. 1, data 122; FIG. 6, step 604); inputting the spatial data into a first portion of a machine learned (ML) model (FIG. 1; FIG. 3, data 306, map subnetwork 340; FIG. 4); receiving intermediate data from the first portion of the ML model (FIG. 3, subnetwork 340, feature 342; FIG. 4), wherein the intermediate data includes spatial feature data corresponding to a spatial feature encoded as being associated with the object ([0096], “generate a coordinate frame with various information, such as lane locations”); determining, based at least in part on the sensor data (FIG. 3, data 302, 304), secondary feature data corresponding to a feature associated with the object (FIG. 3, features 332, 312, output 320; FIG. 4; FIG. 6, step 608); inputting the spatial feature data and the secondary feature data into a second portion of the ML model (FIG. 1; FIG. 3, backbone 350, head 360; FIG. 4; FIG. 6, steps 610-612; [0098]); and determining a classification probability of the object based at least in part on data received from the second portion of the ML model (FIGS. 1, 3-4, 6; [0105], “probabilistic multi-hypothesis trajectory predictions for each actor/detected object”; [0118]; [0098]).
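For illustration only, the data flow recited in claims 6 and 15 can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent Laddah's disclosed networks or any embodiment of the application: the function names (first_portion, second_portion, classify), the channel-averaging stand-in for the first portion, the single linear layer with softmax standing in for the second portion, and all numeric values are hypothetical.

```python
import math

def first_portion(grid, cells):
    """Hypothetical stand-in for the first model portion:
    average each channel of the top-down grid over the object's cells."""
    return [sum(channel[r][c] for r, c in cells) / len(cells) for channel in grid]

def second_portion(features, weights, biases):
    """Hypothetical stand-in for the second model portion:
    one linear layer followed by a numerically stable softmax."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(grid, cells, secondary, weights, biases):
    spatial = first_portion(grid, cells)   # intermediate (spatial feature) data
    fused = spatial + list(secondary)      # concatenate with secondary feature data
    return second_portion(fused, weights, biases)

# Toy spatial data: two channels over a 2x2 top-down grid (assumed values).
grid = [
    [[0, 1], [0, 1]],   # channel 0: occupancy
    [[1, 1], [0, 0]],   # channel 1: lane area
]
cells = [(0, 1), (1, 1)]    # grid cells associated with the object
secondary = [4.0, 0.5]      # assumed velocity and acceleration values
weights = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, -0.2, 0.0]]
biases = [0.0, 0.0]

probs = classify(grid, cells, secondary, weights, biases)
print(probs)  # one probability per hypothetical class; the values sum to 1
```

The sketch shows only the ordering of the recited steps: spatial data enters the first portion, the resulting intermediate data is combined with separately determined feature data, and the combined data enters the second portion, whose output yields the classification probability.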
-Regarding claims 7 and 16, Laddah discloses the method of claim 6 and the non-transitory computer-readable media of claim 15.
Laddah further discloses that the spatial feature comprises at least one of: map information associated with the environment (FIG. 1, map data 122; FIG. 3, map data 306; [0023]); a bounding box associated with the object (FIGS. 1, 3; [0073], “bounding shape”); or a size associated with the object (FIGS. 1, 3; [0073], “size/footprint”); and the feature comprises at least one of: a velocity associated with the object (FIGS. 1, 3; [0073], “current speed”); an acceleration associated with the object (FIGS. 1, 3; [0063], “a velocity, acceleration, a trajectory”); or a lighting state associated with the object.
-Regarding claims 8 and 17, Laddah discloses the method of claim 7 and the non-transitory computer-readable media of claim 16.
Laddah further discloses wherein the spatial data represents a top-down view of the environment (FIGS. 1, 3; [0037], “top-down”; [0086]; [0095]).
-Regarding claims 10 and 20, Laddah discloses the method of claim 6 and the non-transitory computer-readable media of claim 15.
Laddah further discloses wherein the first portion of the ML model comprises a Convolutional Neural Network (CNN) (FIG. 4), and the second portion of the ML model comprises a Deep Neural Network (DNN) (FIG. 4).
-Regarding claim 12, Laddah discloses the method of claim 6.
Laddah further discloses associating the sensor data with a three-dimensional voxel space representing the environment ([0032]); and wherein the spatial data represents a reduced-dimensionality representation of the three-dimensional voxel space ([0086]).
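For illustration only, the relationship between a three-dimensional voxel space and a reduced-dimensionality top-down representation can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Laddah or from the application: the function names (voxelize, top_down), the cubic cell size, the grid dimensions, and the sample points are all hypothetical.

```python
def voxelize(points, cell, dims):
    """Mark each voxel of an nx-by-ny-by-nz grid that contains at least one point.
    Points falling outside the grid are ignored."""
    nx, ny, nz = dims
    voxels = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for x, y, z in points:
        i, j, k = int(x // cell), int(y // cell), int(z // cell)
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
            voxels[i][j][k] = 1
    return voxels

def top_down(voxels):
    """Collapse the height axis: a 2D cell is occupied if any voxel in its column is."""
    return [[max(column) for column in row] for row in voxels]

# Assumed sensor returns as (x, y, z) coordinates in meters.
points = [(0.2, 0.3, 0.1), (0.4, 0.1, 1.7), (1.5, 1.2, 0.4), (5.0, 0.0, 0.0)]
voxels = voxelize(points, cell=1.0, dims=(2, 2, 2))
bev = top_down(voxels)
print(bev)  # → [[1, 0], [0, 1]]
```

Taking the maximum over each vertical column is only one possible reduction; other reductions, such as encoding height slices as separate channels, would equally yield a two-dimensional representation of the voxel space.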
-Regarding claim 13, Laddah discloses the method of claim 6.
Laddah further discloses receiving the sensor data from a sensor associated with an autonomous vehicle in the environment (FIG. 1; [0023]).
-Regarding claim 14, Laddah discloses the method of claim 6.
Laddah further discloses controlling an autonomous vehicle based at least in part on the classification probability (FIG. 1; [0041]; [0054]; [0062]; FIG. 6, steps 612-618; [0118]; [0121]).
-Regarding claim 18, Laddah discloses the non-transitory computer-readable media of claim 15.
Laddah further discloses wherein the spatial data includes: a first channel comprising a first spatial feature; and a second channel comprising a second spatial feature (FIG. 4, [0102], “have a different number of channels, features”; FIG. 3, [0090]; [0095], “multi-channel binary occupancy feature”; [0035]; [0093]).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-5 are rejected under 35 U.S.C. 103 as being unpatentable over Smolyanskiy et al (US PG-PUB No. 20210150230 A1) in view of Laddah et al (US PG-PUB No. 20220035376 A1).
-Regarding claim 1, Smolyanskiy discloses a system comprising (Abstract; Figures 1-17): one or more processors (Figure 17; Figures 16C-16D); and one or more non-transitory computer-readable media storing instructions executable by one or more processors (Figure 17; Figures 16C-16D; [0041]; [0099]), wherein the instructions, when executed, cause the system to perform operations comprising ([0041]; [0099]; [0148]; [0243]; Figure 17): receiving sensor data from a sensor associated with an environment including an object (Figure 4, data 402; Figure 6, input data 406; Figure 8; [0058]; Abstract; [0006]-[0007]; [0030], “LiDAR … objects …environment”; [0032]; Figure 10, B1002); determining, based at least in part on the sensor data (Figures 6, 8, 10), multi-channel image data representing a top-down view of the environment (Figure 3, [0005]; Figure 6, 630, [0059], “N channels”; [0071]-[0073], “multi-channel”; Figure 8, 830; Figure 10, B1008), the multi-channel image data including image data associated with a spatial feature corresponding to the object (Figure 3; [0037], “bounding boxes”; [0042]; [0057], “height maps”; [0105], “spatial dimension”; [0107]); inputting the multi-channel input data into a first portion of a machine learned (ML) model (Figure 4, model 408, object detection 416; Figure 6, trunk 650 (encoder), head 655; Figure 8, “2nd stage NN”); receiving, as an intermediate output (Figure 6, output of encoder of trunk 650), intermediate output data from the first portion of the ML model wherein the intermediate output data includes the spatial feature encoded as being associated with the object; determining, based at least in part on the sensor data ([0037], “bounding boxes”; [0042]; [0057], “height maps”; [0063]; [0105], “spatial dimension”; [0107]; Figures 6, 8), non-spatial feature data representing a non-spatial feature associated with the object (Figure 5, [0050], “sensor data 402 may be encoded 530 into a suitable representation … different features such as reflection characteristics”; [0053], “bearing, azimuth, elevation, range, intensity, reflectivity, SNR”; [0137]); inputting the intermediate output data and the non-spatial feature data into a second portion of the ML model; receiving output data from the second portion of the ML model (Figure 6, trunk 650 (decoder), class confidence head 655; [0070]); and determining a classification probability for the object based at least in part on the output data (Figures 4, 6, confidence data 410; [0071], “classification values (e.g., probability, score, or logit)”).
Smolyanskiy does not disclose inputting the intermediate output data and the non-spatial feature data into a second portion of the ML model. A person of ordinary skill in the art understands that multi-sensor feature fusion, or multi-feature fusion, is a commonly used method for two-dimensional (2D) or three-dimensional (3D) object detection in which feature maps based on sensor input (i.e., intermediate output data) are fused with other features (see Liang et al., CVPR 2019: Figures 2-3; Page 7352, Section 5, “multi-sensor fusion … temporal information”).
In the same field of endeavor, Laddah teaches systems and methods for generating one or more features for each of LIDAR data, transformed radar data, and map data, combining the one or more generated features to generate fused feature data, and generating prediction data based on the fused feature data (Laddah: Abstract; FIGS. 1-9; [0083], “top-down, … Birds Eye View (BEV)”).
Laddah further teaches receiving, as an intermediate output, intermediate output data from the first portion of the ML model (Laddah: FIG. 1, 123; FIG. 3, 310, 312, 332) wherein the intermediate output data includes the spatial feature encoded as being associated with the object (Laddah: [0034]-[0036]; [0094]-[0095]; FIG. 3), and inputting the intermediate output data and the non-spatial feature data into a second portion of the ML model (Laddah: FIG. 3, 350, 320; [0091], “feature vector … containing a position … velocity”; [0033], “velocity (expressed as a vector in the coordinate frame)”; [0034], “The temporal features can be learned on a plurality (e.g., sequence) of sweeps to learn other features (e.g., velocity) of objects.”; FIGS. 6, 1).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Smolyanskiy with the teaching of Laddah by fusing non-spatial feature data with the intermediate output data in order to provide significant improvements to both object detection and trajectory prediction or classification, using a model that can be trained end-to-end on previously acquired sensor data.
-Regarding claim 2, the combination further discloses wherein: the spatial feature comprises at least one of: map information associated with the environment; a bounding box associated with the object; or a size associated with the object (Smolyanskiy: Figure 3; [0037], “bounding boxes”; [0042]; [0057], “height maps”; [0105], “spatial dimension”; [0107]); and the non-spatial feature comprises at least one of: a velocity associated with the object; an acceleration associated with the object; or a lighting state associated with the object (Smolyanskiy: [0044], “Reflections and reflection characteristics may depend on the objects in the environment, speeds, materials…”; Figure 5, [0050], “sensor data 402 may be encoded 530 into a suitable representation … different features such as reflection characteristics”; [0053], “bearing, azimuth, elevation, range, intensity, reflectivity, SNR”; [0137]).
-Regarding claim 3, the combination further discloses wherein the classification probability comprises at least one of: an object type classification; an object behavior classification; an object gaze classification; an object trajectory classification; a lane change classification; or an emergency vehicle classification (Smolyanskiy: Figures 4, 6, confidence data 410; [0071], “classification values (e.g., probability, score, or logit)”; [0081]).
-Regarding claim 4, the combination further discloses determining a graphical reference corresponding to a corresponding location of the object within the multi-channel image data, wherein the spatial feature is encoded as being associated with the object based at least in part on the graphical reference (Smolyanskiy: [0037]; [0042]; [0073]; FIGS. 8, 15).
-Regarding claim 5, Smolyanskiy discloses wherein the first portion of the ML model comprises a Convolutional Neural Network (CNN) (Figure 6; [0058], “U-Net”).
Smolyanskiy does not disclose that the second portion of the ML model comprises a Deep Neural Network (DNN).
In the same field of endeavor, Laddah teaches systems and methods for generating one or more features for each of LIDAR data, transformed radar data, and map data, combining the one or more generated features to generate fused feature data, and generating prediction data based on the fused feature data (Laddah: Abstract; FIGS. 1-9; [0083], “top-down, … Birds Eye View (BEV)”).
Laddah further teaches that the second portion of the ML model comprises a Deep Neural Network (DNN) (Laddah: FIGS. 3-4; [0098]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Smolyanskiy with the teaching of Laddah by fusing non-spatial feature data with the intermediate output data in order to provide significant improvements to both object detection and trajectory prediction or classification, using a model that can be trained end-to-end on previously acquired sensor data.
Claim(s) 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Laddah et al (US PG-PUB No. 20220035376 A1) in view of Smolyanskiy et al (US PG-PUB No. 20210150230 A1).
-Regarding claims 9 and 19, Laddah discloses the method of claim 6 and the non-transitory computer-readable media of claim 15.
Laddah discloses determining, based on the spatial data, a feature vector associated with the object, wherein the intermediate data includes the feature vector ([0033], “a feature vector … containing a position”; [0036]; FIGS. 1, 3-4; [0091]-[0094]).
Laddah does not disclose determining a mask identifying a corresponding location of the object in the spatial data; and determining, based on the mask, a feature vector.
In the same field of endeavor, Smolyanskiy teaches a method using one or more deep neural networks (DNNs) to detect objects from sensor data of a three-dimensional (3D) environment (Smolyanskiy: Abstract; Figures 1-9). Smolyanskiy further teaches determining a mask identifying a corresponding location of the object in the spatial data, and determining, based on the mask, a feature vector (Figures 1-2, 6; [0037], “2D mask”; [0059], “masks for each class (channel)”; [0061], “spatial dimension”; [0065]; Figure 7, [0074]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Laddah with the teaching of Smolyanskiy by determining a mask identifying a corresponding location of the object in the spatial data in order to perform class and instance segmentation of images.
Allowable Subject Matter
Claim 11 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIAO LIU, whose telephone number is (571) 272-4539. The examiner can normally be reached Monday-Thursday and alternate Fridays, 8:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached at (571) 272-7882. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/XIAO LIU/Examiner, Art Unit 2664
/NANCY BITAR/Primary Examiner, Art Unit 2664