DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
Claims 1-20 are currently pending and have been examined. 
Information Disclosure Statement
The information disclosure statement(s) (IDS) submitted on 09/08/2021, 05/12/2022 and 08/15/2022 have been considered by the examiner and initialed copies are hereby attached.  
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1,5,7,8,11,12,14,15,19 and 20 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Douillard (US 10593042 B1).

Regarding claim 1, Douillard discloses 
A method (Figs. 1 and 2) comprising: 
generating, from sensor data comprising LiDAR data, a representation of a range image of an environment (Fig. 1, steps 102, 106 and 112, where sensor data set 102 includes LiDAR data obtained from a LiDAR sensor onboard a vehicle and is used to generate a two-dimensional image (i.e. range image); Col. 6, lines 3-16, “For example, the two-dimensional array 116 may correspond to a range of each LIDAR data point associated with the projection shape, the two-dimensional array 118 may correspond to the x-coordinate of each LIDAR data point associated with the projection shape, the two-dimensional array 120 may correspond to the y-coordinate of each LIDAR data point associated with the projection shape, and the two-dimensional array 122 may correspond to the z-coordinate of each LIDAR data point associated with the projection shape. In some instances, each of the two-dimensional arrays 116, 118, 120, and 122 may be referred to as a channel, with the two-dimensional arrays 116, 118, 120, and 120 collectively referred to as a multi-channel two-dimensional image.”); 
extracting, based at least on the representation of the range image (Col. 5, line 62 - Col 6, line 3, “At operation 112, the process can include converting the projection shape into a multi-channel two-dimensional image. In an example 114, the projection shape is converted into a plurality of two-dimensional arrays 116, 118, 120, and 122. In some instances, the two-dimensional arrays 116, 118, 120, and 122 may be considered to be individual “images”, with each image corresponding to an individual dimension of the LIDAR data stored in the cell 110 of the projection shape.”) and using one or more Neural Networks (NNs) (Fig. 1, the range image (i.e. two dimensional image generated at step 112) is input into a neural network at step 124 where classification data is extracted; Col. 6, lines 34-39, “In some instances, the multi-channel two-dimensional images may be input to one or more machine learning networks, such as a convolutional neural network, to perform deep learning operations on the data to perform tasks such as segmentation and/or classification.”), classification data representing one or more classifications of elements in the environment (Col. 6, lines 26-34, “At operation 124, the process may include performing segmentation and/or classification on the multi-channel two-dimensional image. An example 126 illustrates an output of one such segmentation operation, including segmentation information 128 associated with an object. In some instances, the segmentation information 128 may include a segmentation identification (ID) associated with each pixel or LIDAR data point, for example, with a particular segmentation ID defining a particular object.”); 
generating one or more bounding shapes of the elements based at least on the classification data (Fig. 2, bounding shape 218 is generated based on the classification data output from step 124; Col. 7, lines 30-37, “In some instances, the operation 214 may include receiving two-dimensional segmentation information and adding depth information to the segmentation information to capture depth information of the three-dimensional data. An example 216 illustrates segmentation information 218 applied to the three-dimensional data 220 corresponding to the object 206.”); and 
providing data representing the one or more bounding shapes to a control component of an autonomous vehicle (Col. 2, lines 12-14, “FIG. 12 depicts an example process 1200 for generating a trajectory for an autonomous vehicle based on image segmentation, as discussed herein”, where FIG. 12 depicts that the image segmentation (where the image segmentation here is referring to the segmentation performed in Fig. 10A which depicts the “data representing the one or more bounding shapes”) generates a trajectory for the autonomous vehicle and commands the autonomous vehicle to follow the trajectory; Col. 21, lines 3-14, “At operation 1208, the process may include generating a sequence of commands to command the autonomous vehicle to drive along the trajectory generated in operation 1206. In some instances, the trajectory generated in the operation 1206 may constrain the operation of the autonomous vehicle to operate within the free space segmented in the operation 1204, or to avoid objects identified and/or tracked by a planner system of the autonomous vehicle. Further, the commands generated in the operation 1208 can be relayed to a controller onboard an autonomous vehicle to control the autonomous vehicle to drive the trajectory.”).
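For illustration only, the multi-channel two-dimensional image described in the passages of Douillard cited above (one array each for range, x, y and z) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Douillard's implementation: the function name, the spherical projection, the field-of-view parameters and the nearest-return rule are all hypothetical choices made for the example.

```python
import math


def project_to_multichannel_image(points, height, width, fov_up_deg, fov_down_deg):
    """Project (x, y, z) LiDAR points into four 2D arrays: range, x, y, z.

    Assumption: a spherical projection shape; cells with no return hold None.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    # One 2D array ("channel") per stored dimension of the LiDAR data.
    channels = {
        name: [[None] * width for _ in range(height)]
        for name in ("range", "x", "y", "z")
    }
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # a point at the sensor origin has no direction
        azimuth = math.atan2(y, x)
        elevation = math.asin(max(-1.0, min(1.0, z / rng)))
        # Map the angles to pixel coordinates and clamp to the image bounds.
        col = int(0.5 * (1.0 - azimuth / math.pi) * width)
        row = int((fov_up - elevation) / fov * height)
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        # Assumption: when several points fall in one cell, keep the nearest.
        current = channels["range"][row][col]
        if current is None or rng < current:
            channels["range"][row][col] = rng
            channels["x"][row][col] = x
            channels["y"][row][col] = y
            channels["z"][row][col] = z
    return channels
```

In this sketch the "range" array plays the role of the range image, and the four arrays together correspond to the channels of the multi-channel two-dimensional image.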

Regarding claim 5, Douillard further discloses
The method of claim 1, further comprising generating the range image with multiple layers storing different reflection characteristics represented by the LiDAR data (Fig. 1, the multi-dimensional image (i.e. range image) is generated from multiple channel layers (i.e. layers 116,118,120,122), where each layer represents different characteristics, Col. 6, lines 3-16, “For example, the two-dimensional array 116 may correspond to a range of each LIDAR data point associated with the projection shape, the two-dimensional array 118 may correspond to the x-coordinate of each LIDAR data point associated with the projection shape, the two-dimensional array 120 may correspond to the y-coordinate of each LIDAR data point associated with the projection shape, and the two-dimensional array 122 may correspond to the z-coordinate of each LIDAR data point associated with the projection shape. In some instances, each of the two-dimensional arrays 116, 118, 120, and 122 may be referred to as a channel, with the two-dimensional arrays 116, 118, 120, and 120 collectively referred to as a multi-channel two-dimensional image.”), wherein the extracting of the classification data using the one or more NNs comprises feeding the multiple layers of the range image into separate channels of the one or more NNs (Col. 17, lines 37-43, “As may be understood in the context of this disclosure, the channels associated with time T.sub.0 may be input to a convolutional neural network for segmentation and/or classification, while the channels associated with time T.sub.1 may be separately input to the convolutional neural network for subsequent segmentation and/or classification.”).

Regarding claim 7, Douillard further discloses
The method of claim 1, further comprising: 
accumulating the LiDAR data from one or more LiDAR sensors of the autonomous vehicle accumulated over a period of time to generate accumulated LiDAR data (Fig. 9 depicts the accumulation of LiDAR data over time, Col. 17, lines 25-27, “FIG. 9 is an example 900 of combining data over time for incorporating temporal features into multi-channel data for image analysis.”); 
converting the accumulated LiDAR data to motion-compensated LiDAR data corresponding to a position of the autonomous vehicle at a particular time (Col. 17, lines 30-35, “In some instances, the channels 902, 904, 906, and 908 may correspond to the channels 808, 810, 812, and 814, respectively, at a time T.sub.0. That is, the channels 902, 904, 906, and 908 may represent two-dimensional data of an instant of time represented as T.sub.0.”; Col. 16, lines 37-51, “FIG. 7A illustrates an example 700 of an autonomous vehicle 702 using a LIDAR sensor 704 to capture LIDAR associated with one or more measurements of a building 706. For example, a vector 708 from the LIDAR sensor 704 to a point P 710 is captured as a measurement associated with the building 706. In some instances, the vector 708 may be associated with (x, y, z) coordinate information, a time of the measurement, a location of the autonomous vehicle 702, a distance of the vector 708 (e.g., a range), a surface normal vector 712 associated with point P 710, etc. As may be understood, the LIDAR sensor 704 may be capturing thousands or millions of points per second, at varying resolutions, frame rates, etc. In some instances, the operations illustrated in the example 700 may correspond to an operation of capturing three-dimensional LIDAR data.”); and 
projecting the motion-compensated LiDAR data into two-dimensional (2D) image-space to generate the range image (Col. 17, lines 25-30, “FIG. 9 is an example 900 of combining data over time for incorporating temporal features into multi-channel data for image analysis. For example, as described above, three-dimensional data can be projected onto a projection shape and unrolled into a multi-channel image, comprised of individual channels 902, 904, 906, and 908.”).

Regarding claim 8, the same cited section and rationale as claim 1 is applied. Douillard further discloses,  
A processor comprising one or more circuits (Col. 24, lines 14-16, “The processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).”, where the processor comprising the integrated circuits performs the claimed features) to: 
receive LiDAR data from one or more LiDAR sensors of a vehicle in an environment (Col. 7, lines 3-6, “As illustrated in example 204, the operation 202 can include capturing three-dimensional data of an object 206 using a LIDAR sensor 208 associated with a perception system of an autonomous vehicle 210.”; Col. 7, lines 53-56, “For example, turning to the example 204, the autonomous vehicle 210 is located relative to the object 206 (illustrated as another car) such that the LIDAR sensor 208 essentially captures a front view of the object 206.”, Col. 12, lines 64-65, “The sensor 406 may include a LIDAR sensor mounted on a roof of the vehicle 402,”); 

Regarding claim 11, the same cited section and rationale as claim 4 is applied, where the processor includes one or more circuits to perform the recited functions (see claim 8).  

Regarding claim 12, the same cited section and rationale as claim 5 is applied, where the processor includes one or more circuits to perform the recited functions (see claim 8).  

Regarding claim 14, the same cited section and rationale as claim 7 is applied, where the processor includes one or more circuits to perform the recited functions (see claim 8).   

Regarding claim 15, Douillard discloses 
A system (Col. 2, lines 36-38, “This disclosure describes methods, apparatuses, and systems for converting multi-dimensional data for image analysis.”) comprising: 
one or more processing units (Col. 13, lines 27-35, “FIGS. 1, 2, 5A, 5B, 8, and 11-13 illustrate example processes in accordance with embodiments of the disclosure. These processes are illustrated as logical flow graphs, each operation of which represents a sequence of operations that can be implemented in hardware, software, or a combination thereof. In the context of software, the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations.”); and 
one or more memory units (Col. 22, lines 30-33, “The storage 1404, the processor(s) 1406, the memory 1408, and the operating system 1410 may be communicatively coupled over a communication infrastructure 1412.”) storing instructions that, when executed by the one or more processing units, cause the one or more processing units to execute operations (Col. 27, lines 57-64, “Suitable processors for the execution of a program of instructions include, but are not limited to, general and special purpose microprocessors, and the sole processor or one of multiple processors or cores, of any kind of computer. A processor may receive and store instructions and data from a computerized data storage device such as a read-only memory, a random access memory, both, or any combination of the data storage devices described herein.”) comprising: 
generating, from LiDAR data, a representation of a projection image of an environment (Fig. 1, steps 102, 106 and 112, where sensor data set 102 includes LiDAR data obtained from a LiDAR sensor onboard a vehicle and is used to generate a projection shape (i.e. projection image); Col. 5, line 62 – Col. 6, line 12, “At operation 112, the process can include converting the projection shape into a multi-channel two-dimensional image. In an example 114, the projection shape is converted into a plurality of two-dimensional arrays 116, 118, 120, and 122. In some instances, the two-dimensional arrays 116, 118, 120, and 122 may be considered to be individual “images”, with each image corresponding to an individual dimension of the LIDAR data stored in the cell 110 of the projection shape. For example, the two-dimensional array 116 may correspond to a range of each LIDAR data point associated with the projection shape, the two-dimensional array 118 may correspond to the x-coordinate of each LIDAR data point associated with the projection shape, the two-dimensional array 120 may correspond to the y-coordinate of each LIDAR data point associated with the projection shape, and the two-dimensional array 122 may correspond to the z-coordinate of each LIDAR data point associated with the projection shape.”); 
generating, based at least on the representation of the projection image (Col. 5, line 62 - Col 6, line 3, “At operation 112, the process can include converting the projection shape into a multi-channel two-dimensional image. In an example 114, the projection shape is converted into a plurality of two-dimensional arrays 116, 118, 120, and 122. In some instances, the two-dimensional arrays 116, 118, 120, and 122 may be considered to be individual “images”, with each image corresponding to an individual dimension of the LIDAR data stored in the cell 110 of the projection shape.”) and using one or more neural networks (NNs) (Fig. 1, the data representing the projection image is input into a neural network at step 124 where classification data is extracted; Col. 6, lines 34-39, “In some instances, the multi-channel two-dimensional images may be input to one or more machine learning networks, such as a convolutional neural network, to perform deep learning operations on the data to perform tasks such as segmentation and/or classification.”), classification data representing one or more classifications of the environment (Col. 6, lines 26-34, “At operation 124, the process may include performing segmentation and/or classification on the multi-channel two-dimensional image. An example 126 illustrates an output of one such segmentation operation, including segmentation information 128 associated with an object. In some instances, the segmentation information 128 may include a segmentation identification (ID) associated with each pixel or LIDAR data point, for example, with a particular segmentation ID defining a particular object.”); and 
generating data representing one or more bounding shapes (Fig. 2, bounding shape 218 is generated based on the classification data output from step 124; Col. 7, lines 30-37, “In some instances, the operation 214 may include receiving two-dimensional segmentation information and adding depth information to the segmentation information to capture depth information of the three-dimensional data. An example 216 illustrates segmentation information 218 applied to the three-dimensional data 220 corresponding to the object 206.”) and class labels of one or more objects or scenery detected in the environment based at least on the classification data (Col. 7, lines 20-30, “In some instances, the segmentation information generated in the process 100 may include segmentation information associated with the two-dimensional representation of the three-dimensional data, in which case, the two-dimensional segmentation information may be converted to three-dimensional segmentation information. In some instances, the segmentation information may include a segmentation identification (ID) associated with each of the three-dimensional data points, which may be used in turn to determine which data points are to be associated with a particular object for segmentation.”, where a “segmentation ID” here is tantamount to a “class label”).

Regarding claim 19, the same cited section and rationale as claim 5 is applied.  

Regarding claim 20, Douillard further discloses
The system of claim 15, wherein the system is comprised in at least one of: 
a control system for an autonomous or semi-autonomous machine (Col. 21, lines 11-17, “Further, the commands generated in the operation 1208 can be relayed to a controller onboard an autonomous vehicle to control the autonomous vehicle to drive the trajectory. Although discussed in the context of an autonomous vehicle, the process 1200, and the techniques and systems described herein, can be applied to a variety systems utilizing machine vision.”); 
a perception system for an autonomous or semi-autonomous machine (Col. 2, lines 36-41, “This disclosure describes methods, apparatuses, and systems for converting multi-dimensional data for image analysis. In some examples, the multi-dimensional data may include data captured by a LIDAR system for use in conjunction with a perception system for an autonomous vehicle.”); 
a system for performing simulation operations (Col. 10, lines 5-14, “The computer system(s) 302 may further include simulated data that has been generated by a computer simulation algorithm, for use in part in testing. In some instances, the simulated data may include any type of simulated data, such as camera data, LIDAR data, Radar data, GPS data, etc. In some instances, computer system(s) 302 can modify, transform, and/or perform the converting operations described herein on the simulated data for verifying an operation and/or for training the machine learning algorithms, as described herein.”); 
a system for performing deep learning operations (Col. 6, lines 34-39, “In some instances, the multi-channel two-dimensional images may be input to one or more machine learning networks, such as a convolutional neural network, to perform deep learning operations on the data to perform tasks such as segmentation and/or classification.”); 
a system implemented using an edge device; 
a system implemented using a robot (Col. 25, lines 14-19, “In one or more embodiments, the computing device may be operatively coupled to any machine vision based system. For example, such machine based vision systems include but are not limited to manually operated, semi-autonomous, or fully autonomous industrial or agricultural robots, household robot, inspection system, security system, etc.”); 
a system incorporating one or more virtual machines (VMs) (Col. 28, lines 1-11, “The systems, modules, and methods described herein can be implemented using one or more virtual machines operating alone or in combination with one other. Any applicable virtualization solution can be used for encapsulating a physical computing machine platform into a virtual machine that is executed under the control of virtualization software running on a hardware computing platform or host. The virtual machine can have both virtual system hardware and guest operating system software.”); 
a system implemented at least partially in a data center; or 
a system implemented at least partially using cloud computing resources.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 2,6,9,13 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Douillard (US 10593042 B1) in view of Nezhadarya (US 10915793 B2). 

Regarding claim 2, Douillard discloses [Note: what Douillard fails to specifically disclose is shown in strike-through]
The method of claim 1, further comprising generating the representation of the range image by stacking the range image (Col. 9, lines 45-52, “The Radar module 308 may include one or more Radar sensors to capture range, angle, and/or velocity of objects in an environment. As may understood in the context of this disclosure, the Radar module 308 may capture data and may transmit datasets to the computer system(s) 302 for subsequent processing. For example, data from the Radar module 308 may be included as one or more channels of a multi-channel image.”) and (Col. 9, lines 35-44, “For example, the camera module 306 may include any color cameras, monochrome cameras, depth cameras, RGB-D cameras, stereo cameras, infrared (IR) cameras, ultraviolet (UV) camera, etc. As may understood in the context of this disclosure, the camera module 306 may capture data and may transmit datasets to the computer system(s) 302 for subsequent processing. For example, data from the camera module 306 may be included as one or more channels of a multi-channel image.”, where it is obvious that the multi-channel two-dimensional image includes multiple channels (i.e. “tensors”), one of which can hold RGB image data while another holds the range image data. It is implied that these tensors are “stacked” as shown in ), wherein the extracting of the classification data using the one or more NNs comprises feeding the tensor as an input into the one or more NNs (Fig. 1, where the multi-channel tensors are fed into the neural network of step 124 to extract classification data).

Nezhadarya discloses, 
further comprising generating the representation of the range image by stacking the range image and an RGB image of the environment into separate channels of a tensor (Fig. 5A discloses a combiner 502 that stacks RGB image data (i.e. 202) with range image data (i.e. 404) as separate channels of a tensor, where the tensor is then input into the neural network 124). 

It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify Douillard with Nezhadarya to incorporate the features of: 
further comprising generating the representation of the range image by stacking the range image and an RGB image of the environment into separate channels of a tensor. 
Douillard and Nezhadarya are considered analogous art to each other and to the claimed invention, as they disclose the classification and segmentation of data obtained through lidar sensors on board autonomous vehicles using machine learning models such as neural networks. Douillard is similar to the instant application as it specifically discloses the feature of receiving lidar data from lidar sensors onboard a vehicle and converting the data into a projection image. Douillard further discloses the features of applying the data used to create the projection image to a neural network. Douillard further discloses that the multi-dimensional image generated from the projection image includes multiple channels where one channel can be range data while another channel of the multi-channel image can be camera data. However, Douillard fails to specifically disclose the features of: 
further comprising generating the representation of the range image by stacking the range image and an RGB image of the environment into separate channels of a tensor. 
Although such features can be inferred from Douillard, Nezhadarya is relied upon to clearly disclose these features and to show that it is known in the art to perform such a process. These features are disclosed by Nezhadarya as shown in the citations above. It would have been obvious to stack different channel data (range image data and RGB image data) as separate channels of a tensor input into a neural network structure as disclosed by Nezhadarya. The addition of such features as disclosed by Nezhadarya into Douillard would lead to a more efficient autonomous vehicle system with more accurate detection and isolation of objects. 
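For illustration only, the claimed stacking of a range image and an RGB image into separate channels of a tensor can be sketched as follows. This is a minimal sketch and is not the implementation of either reference: the function name, the channel order (R, G, B, range) and the assumption that the two images are already registered pixel-for-pixel are hypothetical.

```python
def stack_range_and_rgb(range_image, rgb_image):
    """Stack an H x W range image and an H x W RGB image into a 4 x H x W tensor.

    Assumption: both images cover the same pixels, so rows and columns align.
    """
    height = len(range_image)
    width = len(range_image[0])
    if len(rgb_image) != height or any(len(row) != width for row in rgb_image):
        raise ValueError("range image and RGB image must share height and width")
    # Split the RGB image into one 2D array per color component.
    red = [[rgb_image[r][c][0] for c in range(width)] for r in range(height)]
    green = [[rgb_image[r][c][1] for c in range(width)] for r in range(height)]
    blue = [[rgb_image[r][c][2] for c in range(width)] for r in range(height)]
    # Copy the range image so the tensor does not alias the caller's data.
    depth = [list(row) for row in range_image]
    # Hypothetical channel order: R, G, B, range.
    return [red, green, blue, depth]
```

The resulting four-channel tensor is the kind of input that would be fed to the one or more NNs as a single stacked input.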



Regarding claim 6, Douillard discloses [Note: what Douillard fails to specifically disclose is shown in strike-through]
The method of claim 1, 



Nezhadarya discloses, 
wherein the one or more NNs include a common trunk (Fig. 5A, networks 212,214, 216 include a common trunk, the “2D CNN 124”; Col. 14, lines 35-40, “The 2D CNN 124 together with the 2D segmentation and regression unit 126 form a mask regional convolution neural network (Mask R-CNN) that include sub-networks which implement machine learning algorithms for object detection, classification, regression, and segmentation.”) connected to: 
a first stream of layers configured to predict the classification data representing the one or more classifications of the elements in the environment (Fig. 5A; Col. 9, lines 26-30, “The 2D segmentation and regression unit 126 in this example includes sub-networks including an objection classification sub-network 212 for labeling detected object with classes of interest (e.g., cars, trucks and pedestrians, in the context of autonomous vehicles)”); and 
a second stream of layers configured to regress a location or dimension of a corresponding one of the elements relative to each pixel (Fig. 5A; Col. 9, lines 30-32, “a 2D regression sub-network 214 for performing 2D bounding box regression and generating 2D bounding boxes”; Col. 9, lines 10-20, “The 2D CNN 124 may be any CNN or other deep neural network that can be trained to learn to detect objects in the 2D image data. The 2D CNN 124 may be trained on any data that is in a form accepted by the 2D CNN 124. For example, regional CNNs (RCNNs) have been used for object detection and segmentation. Example RCNNs for object detection include, for example, Fast RCNN, Faster RCNN, and Mask R-CNN. Mask R-CNN, in particular, has been used to provide object detection and segmentation on a pixel level.”, where “on a pixel level” indicates that the training is performed on elements relative to each pixel).

It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify Douillard with Nezhadarya to incorporate the features of: 
wherein the one or more NNs include a common trunk connected to: 
a first stream of layers configured to predict the classification data representing the one or more classifications of the elements in the environment; and 
a second stream of layers configured to regress a location or dimension of a corresponding one of the elements relative to each pixel.
Douillard and Nezhadarya are considered analogous art to each other and to the claimed invention, as they disclose the classification and segmentation of data obtained through lidar sensors on board autonomous vehicles using machine learning models such as neural networks. Douillard is similar to the instant application as it specifically discloses the feature of receiving lidar data from lidar sensors onboard a vehicle and converting the data into a projection image. Douillard further discloses the features of applying the data used to create the projection image to a neural network. The neural network uses that input to generate an output of detected objects that are isolated and/or identified. The neural network further generates a bounding shape around the at least one detected object of the one or more detected objects. However, Douillard fails to disclose the features of: 
wherein the one or more NNs include a common trunk connected to: 
a first stream of layers configured to predict the classification data representing the one or more classifications of the elements in the environment; and 
a second stream of layers configured to regress a location or dimension of a corresponding one of the elements relative to each pixel.
These features are disclosed by Nezhadarya as shown in the citations above. It would have been obvious to combine the neural network structure as disclosed by Nezhadarya with the neural network classification and segmentation model as disclosed by Douillard. The addition of such features as disclosed by Nezhadarya into Douillard would lead to a more efficient autonomous vehicle system with more accurate detection and isolation of objects. 

Regarding claim 9, the same cited section and rationale as claim 2 is applied, where the processor includes one or more circuits to perform the recited functions (see claim 8).  

Regarding claim 13, the same cited section and rationale as claim 6 is applied, where the processor includes one or more circuits to perform the recited functions (see claim 8).  

Regarding claim 16, the same cited section and rationale as claim 2 is applied.  

Claim(s) 3, 10 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Douillard (US 10593042 B1) in view of Venable (US 20180108134 A1). 

Regarding claim 3, Douillard discloses [Note: what Douillard fails to specifically disclose is shown in strike-through]
The method of claim 1, 

Venable discloses, 
further comprising generating the range image with a height in pixels corresponding to a number of horizontal scan lines of a LiDAR sensor that captured the LiDAR data (Paragraph 0085, “The system and method disclosed operates by generating a depth map of the aisle by combining all the vertical scans along the aisle into a pixel image where each column of the pixel image represents one scan, and each row of the image corresponds to a vertical distance from the ground up, and each pixel value associated with the pixel image represents a distance to an obstruction (the range) detected by the LIDAR.”).

It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify Douillard with Venable to incorporate the features of: 
further comprising generating the range image with a height in pixels corresponding to a number of horizontal scan lines of a LiDAR sensor that captured the LiDAR data. 
Douillard and Venable are considered analogous arts to each other and the claimed invention as they disclose the use of lidar data to create images of objects detected by the lidar.  Douillard is similar to the instant application as it specifically discloses the feature of receiving lidar data from lidar sensors onboard a vehicle and converting the data into a projection image. Douillard further discloses the features of applying the data used to create the projection image to a neural network. Douillard further discloses that the resolution of the projection shape may vary based on an angle of elevation of the lidar system (Col. 3, lines 25-29). However, Douillard fails to specifically disclose the features of: 
further comprising generating the range image with a height in pixels corresponding to a number of horizontal scan lines of a LiDAR sensor that captured the LiDAR data. 
These features are disclosed by Venable as shown in the citations above. It would be obvious to create a depth map where each row of the pixel image represents one scan (i.e. the height in pixels corresponding to a number of horizontal scan lines of a LiDAR sensor). The addition of such features as disclosed by Venable into Douillard would lead to the generation of higher-resolution images.
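
For illustration only, the recited feature of a range image whose height in pixels corresponds to the number of horizontal scan lines may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Douillard, Venable, or the instant application: the function name `build_range_image`, the per-point scan-line index `ring`, and the choice of azimuth for the column index are hypothetical. Each point is written to the row given by its scan line, so the image height equals the number of scan lines.

```python
import math


def build_range_image(points, num_scan_lines, width):
    """Build a range image with one row per horizontal scan line.

    points: iterable of (x, y, z, ring), where ring is an assumed
    scan-line index in the range 0 .. num_scan_lines - 1.
    Pixels that receive no return are left as None.
    """
    image = [[None] * width for _ in range(num_scan_lines)]
    for x, y, z, ring in points:
        azimuth = math.atan2(y, x)  # angle in (-pi, pi]
        # Map the azimuth onto a column index in 0 .. width - 1.
        col = int((azimuth + math.pi) / (2 * math.pi) * width) % width
        # The row is the scan line that produced the return.
        image[ring][col] = math.sqrt(x * x + y * y + z * z)
    return image
```

For example, a sensor with 64 scan lines would yield an image 64 pixels high under this sketch, regardless of the chosen width.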

Regarding claim 10, the same cited section and rationale as claim 3 is applied, where the processor includes one or more circuits to perform the recited functions (see claim 8).  

Regarding claim 17, the same cited section and rationale as claim 3 is applied.  

Claim(s) 4 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Douillard (US 10593042 B1) in view of Zou (US 20210109523 A1). 

Regarding claim 4, Douillard discloses [Note: what Douillard fails to specifically disclose is strike-through]
The method of claim 1, wherein the LiDAR data comprises a LiDAR point cloud (Col. 2, lines 56-59, “As mentioned above, the three-dimensional LIDAR data can include a three dimensional map or point cloud which may be represented as a plurality of vectors emanating from a light emitter and terminating at an object or surface.”), the method further comprising: 
generating the range image (Fig. 1, generating the multi-dimensional image in step 112) by projecting the LiDAR point cloud to bin points from the LiDAR point cloud into corresponding pixels of the range image (Fig. 8 depicts the projecting of the Lidar point cloud (i.e. Range, x, y, z) to a “bin” from the LiDAR point cloud (i.e. cell 804), where the point cloud here is tantamount/synonymous to a “pixel” as shown in Col. 3, line 62-Col. 4 line 10, “For example, the segmentation information may include a segmentation identifier, identification (ID), or tag associated with each point of the point cloud or pixel, and can be applied to the three-dimensional information to identify three-dimensional data associated with an object. As a non-limiting example, all LIDAR points associated with a single object may all have the same ID, whereas LIDAR points associated with different objects may have different IDs. After identifying the object in the three-dimensional data, the three-dimensional data can be converted to two-dimensional data by projecting the three-dimensional data onto a projection plane (also referred to as a rendering plane), which may include adapting or positioning a rendering perspective (e.g., the rendering plane) relative to the object.”)); and 


Zou discloses, 
for each of the pixels that bins multiple points from the LiDAR point cloud, storing a range value corresponding to a point of the multiple points having a shortest range (Paragraph 0064, “A range image computed from raw (unprocessed) received sensor data is used to capture the visibility information. For instance, this information can be stored as a matrix of values, where each value is associated with a point (pixel) in the range image... In the case of a lidar sensor, each pixel stored in the range image represents the maximum range the laser shot can see along a certain azimuth and inclination angle (view angle). For any 3D location whose visibility is being evaluated, the pixel at which the 3D location's laser shot falls into can be identified and the ranges (e.g., stored maximum visible range versus physical distance from the vehicle to the 3D location) can be compared. If the stored maximum visible range value is closer than the physical distance, then the 3D point is considered to be not visible, because there is a closer occlusion along this view angle. In contrast, if the stored maximum visible range value is at least the same as the physical distance, then the 3D point is considered to be visible (not occluded). A range image may be computed for each sensor in the vehicle's perception system.”).

It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify Douillard with Zou to incorporate the features of: 
for each of the pixels that bins multiple points from the LiDAR point cloud, storing a range value corresponding to a point of the multiple points having a shortest range.
Douillard and Zou are considered analogous arts to each other and the claimed invention as they disclose the use of lidar technology within autonomous vehicles to detect object characteristics surrounding the vehicle.  Douillard is similar to the instant application as it specifically discloses the feature of receiving lidar data from lidar sensors onboard a vehicle and converting the data into a projection image by projecting lidar point cloud data as a “pixel” to a corresponding cell. Douillard further discloses the features of applying the data used to create the projection image to a neural network. However, Douillard fails to disclose the features of: 
for each of the pixels that bins multiple points from the LiDAR point cloud, storing a range value corresponding to a point of the multiple points having a shortest range.
These features are disclosed by Zou as shown in the citations above. Zou discloses the features of generating a range image where the information in the range image is associated with a matrix of values for each point/pixel. Furthermore, Zou discloses the storing of the range distance from the object to the sensor and then comparing that data to determine whether the 3D point is considered visible (not occluded) or non-visible (occluded). This process is performed by “storing a range value corresponding to a point of the multiple points having a shortest range”. The values of the points that are visible (not occluded) are stored to compute the range image for each sensor in the vehicle’s perception system. Therefore, it would be obvious to combine the features as disclosed by Zou into the invention as disclosed by Douillard. The addition of such features as disclosed by Zou into Douillard would allow for the range image to be created based on more accurate data that is considered to be not occluded (i.e. visible data).  
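
For illustration only, the recited feature of binning points into pixels and storing the shortest range per pixel may be sketched as follows. This is a minimal sketch of the claimed limitation under stated assumptions and is not the implementation of Douillard or Zou: the function name `bin_shortest_range`, the spherical projection, and the elevation bounds `min_elev` and `max_elev` (in radians) are hypothetical. Where several points fall into the same pixel, only the smallest range is kept.

```python
import math


def bin_shortest_range(points, height, width, min_elev, max_elev):
    """Project (x, y, z) points into a height x width range image.

    Each pixel stores the shortest range among the points binned into it.
    Pixels with no points keep the value math.inf.
    """
    image = [[math.inf] * width for _ in range(height)]
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # a point at the sensor origin has no direction
        azimuth = math.atan2(y, x)
        elevation = math.asin(z / rng)
        col = int((azimuth + math.pi) / (2 * math.pi) * width) % width
        row = math.floor((max_elev - elevation) / (max_elev - min_elev) * height)
        if not 0 <= row < height:
            continue  # outside the assumed vertical field of view
        # Keep the closest return when multiple points share a pixel.
        image[row][col] = min(image[row][col], rng)
    return image
```

Keeping the minimum means that a nearer surface is retained over any farther point lying along the same view angle.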

Regarding claim 18, the same cited section and rationale as claim 4 is applied.  

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NAZRA N. WAHEED whose telephone number is (571)272-6713. The examiner can normally be reached M-F (8 AM - 4:30 PM).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vladimir Magloire can be reached on (571)270-5144. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NAZRA NUR WAHEED/Examiner, Art Unit 3648
/PETER M BYTHROW/Primary Examiner, Art Unit 3648