DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-20 are pending under this Office action.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9, 11-12, and 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kwon et al. (US 20200118281 A1) in view of Ecins et al. (US 20200110158 A1).
Regarding claim 1, Kwon teaches that a computer-implemented method to generate synthetic light detection and ranging (LiDAR) data, the method (See Kwon: Figs. 1 and 6-7, and [0053], “Referring now to FIGS. 6A, 6B and FIG. 7, FIGS. 6A and 6B are a flow chart of an example of a method 600 for generating a 3D model point cloud of an object in accordance with another embodiment of the present disclosure. FIG. 7 is a block schematic diagram illustrating portions of the exemplary method 600 in FIGS. 6A and 6B. In accordance with an embodiment, the method 600 is embodied in and performed by the system 100 in FIGS. 1A and 1B. For example, the set of functions 145 includes the method 600. As described herein, generating the 3D model point cloud includes generating the 3D model point cloud using heterogeneous 2D and 3D sensor fusion in that data from the 2D imaging sensor 106 is 
obtaining, by a computing system comprising one or more computing devices, a three- dimensional map of an environment (See Kwon: Figs. 1A-B, and [0034], “FIG. 1A is a block schematic diagram of an example of a system 100 for generating a 3D model point cloud 102 of an object 104 in accordance with an embodiment of the present disclosure. The 3D model point cloud 102 may also be referred to herein as simply the 3D model. The system 100 includes a two dimensional (2D) imaging sensor 106 for capturing a 2D image 108 of the object 104. An example of the 2D imaging sensor 106 is an electro-optical camera 109 that generated a high resolution 2D electro-optical image. Other examples of the 2D imaging sensor include any type device capable of generating a high resolution 2D image. The 2D image 108 includes a plurality of pixels 112 that provide a predetermined high resolution 114. The 2D image 108 also includes a 2D image plane 110 or is referenced in the 2D image plane 110 as illustrated in FIG. 1B”);
determining, by the computing system, a trajectory that describes a series of locations of a virtual object relative to the environment over time;
performing, by the computing system, ray casting on the three-dimensional map according to the trajectory to generate an initial three-dimensional point cloud that comprises a plurality of points, wherein at least a respective depth is associated with each of the plurality of 
processing, by the computing system using a machine-learned geometry network, the initial three-dimensional point cloud to predict a respective adjusted depth for one or more of the plurality of points (See Kwon: Figs. 6-7, and [0056], “In block 606, a determination is made whether the current viewpoint or location of the sensor platform is a first viewpoint or location. If the determination is made that this is the first viewpoint or location of the sensor platform, the method 600 advances to block 614. In block 614, the current 3D model point cloud 704 (FIG. 7) at the first viewpoint or first iteration is empty. As described in more detail herein for subsequent viewpoints or iterations a current 3D model point cloud 704 which is the updated 3D model point cloud 712 from a previous viewpoint of the sensor platform and the upsampled 3D point cloud 702 are merged 706 (FIG. 7) to create a new 3D model point cloud 708. In block 616, the new 3D model point cloud 708 is quantized 710 or subsampled to generate an updated 3D model point cloud 712 (M.sup.K)”); and
generating, by the computing system, an adjusted three-dimensional point cloud in which the one or more of the plurality of points have the respective adjusted depth predicted by the machine-learned geometry network (See Kwon: Figs. 6-7, and [0057], “In block 618, a determination is made whether all viewpoints or locations of the sensor platform have been completed. If not, the method 600 will advance to block 620. In block 620, the sensor platform 
However, Kwon fails to explicitly disclose determining, by the computing system, a trajectory that describes a series of locations of a virtual object relative to the environment over time; and processing by the computing system using a machine-learned geometry network.
Ecins, in contrast, teaches determining, by the computing system, a trajectory that describes a series of locations of a virtual object relative to the environment over time (See Ecins: Fig. 2, and [0052], “In at least one example, the localization component 220 can include functionality to receive data from the sensor system(s) 206 to determine a position of the vehicle 202. For example, the localization component 220 can include and/or request/receive a three-dimensional map of an environment and can continuously determine a location of the autonomous vehicle within the map. In some instances, the localization component 220 can utilize SLAM (simultaneous localization and mapping) or CLAMS (calibration, localization and mapping, simultaneously) to receive image data, LIDAR data, radar data, IMU data, GPS data, wheel encoder data, and the like to accurately determine a location of the autonomous vehicle. In some instances, the localization component 220 can provide data to various components of 
by the computing system using a machine-learned geometry network (See Ecins: Figs. 1-2, and [0076], “Further, in some examples, the semantic information can include a semantic classification of objects represented by polygons in the 3D map. For example, the semantic information can include semantic classifications including, but not limited to, ground, wall, road, curb, sidewalk, grass, tree, tree trunk/branch, foliage (e.g., leaves), building, wall, fire hydrant, mailbox, pole, post, pedestrian, bicyclist, animal (e.g., dog), and the like. In some instances, the semantic information can provide an indication of whether the polygon, object, or element represents a static object, dynamic object, etc. In some instances, the semantic information can include an object identifier to distinguish between different instances of the same semantic classification (e.g., tree #1, tree #2, etc.). The semantic information component 240 can use classical and/or machine learning (e.g., neural network) algorithms to receive data and output one or more detections, segmentations, and/or classifications of objects associated with the data”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kwon to include determining, by the computing system, a trajectory that describes a series of locations of a virtual object relative to the environment over time; and processing by the computing system using a machine-learned geometry network, as taught by Ecins, in order to increase the accuracy and utility of data representing an environment, such as a mesh (See Ecins: [0018], “The data validation techniques discussed herein can increase the accuracy and utility of data representing an environment, such as a 
Regarding claim 2, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Ecins teaches that the computer-implemented method of claim 1, further comprising:
generating, by the computing system, a respective intensity value for each of the plurality of points based at least in part on intensity data included in the three-dimensional map 
Regarding claim 3, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Ecins teaches that the computer-implemented method of claim 1, wherein performing, by the computing system, the ray casting to generate the initial three-dimensional point cloud comprises determining, by the computing system for each of a plurality of rays, a ray casting location and a ray casting direction based at least in part on the trajectory (See Ecins: Fig. 1, and [0019], “Although many examples are discussed in the context of voxels, the techniques can be implemented for other representations, such as meshes. For example, in 
Regarding claim 4, Kwon and Ecins teach all the features with respect to claim 3 as outlined above. Further, Ecins teaches that the computer-implemented method of claim 3, wherein performing, by the computing system, the ray casting to generate the initial three-dimensional point cloud comprises:
identifying, by the computing system for each of the plurality of rays, a closest surface element in the three-dimensional map to the ray casting location and along the ray casting direction (See Ecins: Fig. 1, and [0036], “A situation 124 represents a valid situation (e.g., no noisy-surface-normals error). Here, the computing device analyzes a row of voxels and determines that a first voxel that is associated with the ground identifier 118 and a second voxel that is associated with the ground identifier 118 are within a threshold distance of each other (e.g., adjacent voxels). Further, the computing device determines that an angle between a surface normal vector associated with the first voxel and a surface normal vector associated with the second voxel is less than a threshold amount (e.g., an angular difference of surface normal vectors is less than a threshold value)”); and
generating, by the computing system for each of the plurality of rays, one of the plurality of points with the respective depth based at least in part on a distance from the ray casting location to the closest surface element (See Ecins: Fig. 1, and [0037], “A situation 126 represents an invalid situation (e.g., a noisy-surface-normals error). Here, the computing device analyzes a row of voxels and determines that a first voxel that is associated with the ground 
Regarding claim 5, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Kwon teaches that the computer-implemented method of claim 1, further comprising feeding, by the computing system, the adjusted three-dimensional point cloud as LiDAR data input to an autonomy computing system of an autonomous vehicle to test a performance of the autonomy computing system of the autonomous vehicle in the environment (See Kwon: Figs. 1A-B, and [0036], “In accordance with an embodiment, the 2D imaging sensor 106 and the 3D imaging sensor 116 are associated with or mounted to a sensor platform 136. The sensor platform 136 is configured to move to different viewpoints 138 or locations to capture 2D images 108 and corresponding 3D images 118 at the different viewpoints 138 or location of the sensor platform 136. The multiple 2D images 108 and multiple 3D images 118 captured at different viewpoints 138 or locations of the sensor platform 136 are stored in a memory device 140. As described in more detail herein, the 2D image 108 data and the 3D image 118 data are combined or fused for each viewpoint 138 or location of the sensor platform 136 and the data for each of the viewpoints are combined or fused to generate the 3D model point cloud 102 or updated 3D model point cloud. In accordance with an embodiment, the sensor platform is a vehicle, such as a spacecraft or other type vehicle”).
Regarding claim 6, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Ecins teaches that the computer-implemented method of claim 1, wherein the machine-learned geometry network comprises a parametric continuous convolution network (See Ecins: Figs. 1-2, and [0059], “As described herein, an exemplary neural network is a biologically inspired algorithm which passes input data through a series of connected layers to produce an output. Each layer in a neural network can also comprise another neural network, or can comprise any number of layers (whether convolutional or not). As can be understood in the context of this disclosure, a neural network can utilize machine learning, which can refer to a broad class of such algorithms in which an output is generated based on learned parameters”).
Regarding claim 7, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Ecins teaches that the computer-implemented method of claim 1, wherein the machine-learned geometry network comprises a plurality of continuous fusion layers with residual connections between each adjacent layer (See Ecins: Fig. 2, and [0059], “As described herein, an exemplary neural network is a biologically inspired algorithm which passes input data through a series of connected layers to produce an output. Each layer in a neural network can also comprise another neural network, or can comprise any number of layers (whether convolutional or not). As can be understood in the context of this disclosure, a neural network can utilize machine learning, which can refer to a broad class of such algorithms in which an output is generated based on learned parameters”).
Regarding claim 8, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Ecins teaches that the computer-implemented method of claim 1, 
obtaining, by the computing system, a plurality of sets of real-world LiDAR data physically collected by one or more LiDAR systems in the environment (See Ecins: Fig. 1, and [0027], “In some instances, at the operation 106, the computing device can map individual points of a point cloud to individual voxels. In some instances, the LIDAR data can be associated with a voxel space that is fixed with respect to a global map, for example (e.g., in contrast to a voxel space fixed with respect to a moving vehicle). In some instances, the computing device can discard or omit voxels that do not include data, or that include a number of points below a threshold number, in order to create a sparse voxel space”);
removing, by the computing system, one or more moving objects from the plurality of sets of real-world LiDAR data (See Ecins: Fig. 3, and [0095], “Although the example process 300 illustrates the operations 302 and 316, in some examples the operations 302 and/or 316 (or any other operation of the process 300) may not be performed. For example, the computing device can process voxels that represent just grounds (e.g., the voxels have been processed to remove 
associating, by the computing system, the plurality of sets of real-world LiDAR data to a common coordinate system to generate an aggregate LIDAR point cloud (See Ecins: Fig. 1, and [0022], “At operation 102, a computing device can receive (e.g., obtain) data representing an environment. The data can include any form of depth data from one or more sensors, such as Light Detection and Ranging (LIDAR) data, Radar data, image data (as determined from multi-view geometry), depth sensor data (time of flight, structured light, etc.), etc. In some examples, the computing device receives (e.g., retrieves) data from a data store, such as a database. Here, the data store can store data overtime as the data is received from one or more vehicles or other devices within an environment. In some examples, the computing device receives data from one or more vehicles or other devices as the data is being captured (e.g., real-time), in a batched manner, in one or more log files received from a vehicle, or at any other time. In some examples, the computing device can receive a plurality of LIDAR datasets from a plurality of LIDAR sensors operated in connection with a perception system of an autonomous vehicle. In some examples, the computing device can combine or fuse data from two or more LIDAR sensors into a single LIDAR dataset (also referred to as a “meta spin”). In some examples, the computing device can extract a portion of the LIDAR data for processing, such as over a period of time. In some examples, the computing device can receive Radar data and associating the Radar data with the LIDAR data to generate a more detailed representation of an environment. Example 104 illustrates a LIDAR dataset, which can include LIDAR data (e.g., point clouds) 
converting, by the computing system, the aggregate LIDAR point cloud to a surface element-based three-dimensional mesh (See Ecins: Fig. 2, and [0072], “In some instances, the map generation component 238 can include functionality to receive log file(s) and/or generate a three-dimensional (3D) map based on the data in the log file(s) 104. For example, the map generation component 238 can receive one or more of LIDAR data, image sensor data, GPS data, IMU data, radar data, sonar data, etc. and can combine the data to generate a 3D map of the environment. With respect to LIDAR data, the map generation component 238 can receive a plurality of point clouds of data and can combine the data to represent an environment as captured by the vehicle 202. In some instances, the map generation component 238 can generate a mesh based on the sensor data included in a log file(s). Examples of techniques used to generate a mesh of an environment include, but are not limited to, marching cubes, screened Poisson surface reconstruction, Delaunay triangulation, tangent plane estimation, alpha shape algorithm, Cocone algorithm, PowerCrust algorithm, ball pivoting algorithm, surface interpolated methods, and the like. As can be understood, the map generation component 238 can generate a 3D map including a mesh, wherein the mesh includes a plurality of polygons that define the shape of objects in the environment. In some instances, the map generation component 238 can include functionality to divide portions of the mesh into tiles representing a discrete portion of the environment”).
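For illustration only, the "associating ... to a common coordinate system to generate an aggregate LIDAR point cloud" limitation of claim 8 can be sketched as follows. This sketch is not drawn from Kwon, Ecins, or the application. The function names `to_common_frame` and `aggregate_sweeps`, and the use of a 4x4 homogeneous sensor-to-world pose per sweep, are assumptions made solely to show one plausible reading of the limitation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
# Assumed representation: 4x4 row-major homogeneous transform (sensor -> common frame).
Pose = Sequence[Sequence[float]]


def to_common_frame(points: Sequence[Point], pose: Pose) -> List[Point]:
    """Apply the rotation and translation in `pose` to every point of one sweep."""
    transformed = []
    for x, y, z in points:
        transformed.append(tuple(
            pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
            for r in range(3)
        ))
    return transformed


def aggregate_sweeps(sweeps: Sequence[Sequence[Point]],
                     poses: Sequence[Pose]) -> List[Point]:
    """Concatenate several sweeps after moving each into the common frame."""
    if len(sweeps) != len(poses):
        raise ValueError("each sweep needs exactly one pose")
    cloud: List[Point] = []
    for points, pose in zip(sweeps, poses):
        cloud.extend(to_common_frame(points, pose))
    return cloud


# Two hypothetical sweeps: one taken at the origin, one taken 1 m along x.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
shifted = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
aggregate = aggregate_sweeps([[(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]],
                             [identity, shifted])
```

In this sketch, removal of moving objects and conversion to a surface element-based mesh are deliberately omitted; only the coordinate alignment and aggregation step is shown.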
Regarding claim 9, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Ecins teaches that the computer-implemented method of claim 1, 
Regarding claim 11, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Kwon and Ecins teach that a computing system (See Kwon: Figs. 1 and 6-7, and [0053], “Referring now to FIGS. 6A, 6B and FIG. 7, FIGS. 6A and 6B are a flow chart of 
one or more processors (See Kwon: Fig. 1, and [0037], “The system 100 also includes an image processing system 142. In at least one example, the memory device 140 is a component of the image processing system 142. The image processing system 142 includes a processor 144”);
a machine-learned geometry model (See Ecins: Figs. 1-2, and [0076], “Further, in some examples, the semantic information can include a semantic classification of objects represented by polygons in the 3D map. For example, the semantic information can include semantic classifications including, but not limited to, ground, wall, road, curb, sidewalk, grass, tree, tree trunk/branch, foliage (e.g., leaves), building, wall, fire hydrant, mailbox, pole, post, pedestrian, 
one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations (See Kwon: Figs. 6-7, and [0066], “The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the 
obtaining a ground truth three-dimensional point cloud collected by a physical LiDAR system  (See Kwon: Figs. 1A-B, and [0034], “FIG. 1A is a block schematic diagram of an example of a system 100 for generating a 3D model point cloud 102 of an object 104 in accordance with an embodiment of the present disclosure. The 3D model point cloud 102 may also be referred to herein as simply the 3D model. The system 100 includes a two dimensional (2D) imaging sensor 106 for capturing a 2D image 108 of the object 104. An example of the 2D imaging sensor 106 is an electro-optical camera 109 that generated a high resolution 2D electro-optical image. Other examples of the 2D imaging sensor include any type device capable of generating a high resolution 2D image. The 2D image 108 includes a plurality of pixels 112 that provide a predetermined high resolution 114. The 2D image 108 also includes a 2D image plane 110 or is referenced in the 2D image plane 110 as illustrated in FIG. 1B”) as the physical LiDAR system travelled along a trajectory through an environment; obtaining a three-dimensional map of the environment (See Ecins: Fig. 2, and [0052], “In at least one example, the localization component 220 can include functionality to receive data from the sensor system(s) 206 to determine a position of the vehicle 202. For example, the localization component 220 can include and/or request/receive a three-dimensional map of an environment and can continuously determine a location of the autonomous vehicle within the map. In some instances, the localization component 220 can utilize SLAM (simultaneous localization and mapping) or CLAMS (calibration, localization and mapping, simultaneously) to receive image data, LIDAR data, radar data, IMU data, GPS data, wheel encoder data, and the like to accurately determine a location 
performing ray casting on the three-dimensional map according to the trajectory to generate an initial three-dimensional point cloud that comprises a plurality of points, wherein at least a respective depth is associated with each of the plurality of points (See Kwon: Figs. 1A-B, and [0042], “FIG. 1B is an illustration of the exemplary system of FIG. 1A showing an example of an image plane and using heterogeneous 2D and 3D sensor fusion in accordance with an embodiment of the present disclosure. In order to synchronize or align the 3D image 118 or 3D point cloud 120 with the 2D image 108 by the processor 144 or image processing system 142, a calibration procedure between the 3D point cloud 120 and the 2D image 108 is performed by the processor 144 or image processing system 142 to determine the relative poses of the 3D point cloud 120 and the 2D image 108. Synchronizing or aligning the 3D image 118 or 3D point cloud 120 with the 2D image involve fusion or combining the data from the 2D imaging sensor 106 and the 3D imaging sensor 116 for the same viewpoint 138 or location of the sensor platform 136, which is also referred to as heterogeneous 2D and 3D sensor fusion. In one aspect, the image processing system 142 is configured to determine a feature point 160A of the object 104 within the 3D point cloud 120 as well as a corresponding pixel location 162 in the image plane 110 of the 2D image 108 which corresponds to the feature point 160A as shown in FIG. 1A. As can be seen, the feature point 160A corresponds to the pixel location 162 on the image plane 110 (e.g., the two dimensional projection of a 3D scene onto a two dimensional image captured by the 2D imaging sensor 106). In one aspect, the image processing system 142 
processing, by the machine-learned geometry model (See Ecins: Figs. 1-2, and [0076], “Further, in some examples, the semantic information can include a semantic classification of objects represented by polygons in the 3D map. For example, the semantic information can include semantic classifications including, but not limited to, ground, wall, road, curb, sidewalk, grass, tree, tree trunk/branch, foliage (e.g., leaves), building, wall, fire hydrant, mailbox, pole, post, pedestrian, bicyclist, animal (e.g., dog), and the like. In some instances, the semantic information can provide an indication of whether the polygon, object, or element represents a static object, dynamic object, etc. In some instances, the semantic information can include an object identifier to distinguish between different instances of the same semantic classification (e.g., tree #1, tree #2, etc.). The semantic information component 240 can use classical and/or machine learning (e.g., neural network) algorithms to receive data and output one or more detections, segmentations, and/or classifications of objects associated with the data”), the 
generating an adjusted three-dimensional point cloud in which the one or more of the plurality of points have the respective adjusted depth predicted by the machine-learned geometry network (See Kwon: Figs. 6-7, and [0057], “In block 618, a determination is made whether all viewpoints or locations of the sensor platform have been completed. If not, the method 600 will advance to block 620. In block 620, the sensor platform is moved to the next viewpoint or location and the method 600 will return to block 602 and the method 600 will proceed similar to that previously described. Accordingly, the process or method 600 is repeated until an updated 3D model point cloud 712 has been determined for all viewpoints or desired sensor platform locations. In block 602, a subsequent 2D image of the object is captured by the 2D imaging sensor at a current viewpoint or location of the sensor platform and a subsequent 3D image of the object is captured by the 3D imaging sensor at the current 
evaluating an objective function that compares the adjusted three-dimensional point cloud to the ground truth three-dimensional point cloud (See Ecins: Figs. 6-7, and [0140], “In some examples when a hole error is flagged, the computing device or other devices can perform further analysis to evaluate if a hole is valid or invalid. For example, a valid hole can correspond to a window, manhole, etc., while an invalid hole can correspond to data that is missing. To illustrate, the computing device can compare a size/shape of a hole to a predetermined size/shape that is associated with a window, manhole, etc. In at least some example, information from other sensor modalities may be used to make such a determination. As a non-limiting example, image data from an image sensor (e.g. a camera) may be used to verify that a detected hole corresponds to a region where no depth data is expected”); and
modifying one or more values of one or more parameters of the machine-learned geometry model based at least in part on the objective function (See Ecins: Figs. 1-2, and [0048], “In another example, the action can include recalibrating a sensor on a vehicle. For example, a parameter of a sensor system(s), such as a sensor system(s) 206 discussed below, can be adjusted so that data is captured differently. Several non-limiting examples include adjusting a focal point, light sensitivity, direction at which a sensor is aimed, pattern of light emitted, etc. Once a parameter is adjusted, a vehicle can be instructed to capture new data for an area that is associated with an error”).
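For illustration only, the "evaluating an objective function" and "modifying one or more values of one or more parameters" limitations of claim 11 can be sketched as a single gradient-descent update. This sketch is not drawn from Kwon, Ecins, or the application. It assumes a hypothetical two-parameter model (adjusted depth = scale * raw depth + bias) in place of a machine-learned geometry network, and a mean squared error between adjusted and ground truth depths as the objective; the names `objective`, `adjust`, and `train_step` are likewise hypothetical.

```python
from typing import List, Sequence, Tuple

Params = Tuple[float, float]  # (scale, bias) of the assumed toy depth model


def adjust(params: Params, raw_depths: Sequence[float]) -> List[float]:
    """Predict an adjusted depth for each ray-cast point."""
    scale, bias = params
    return [scale * d + bias for d in raw_depths]


def objective(adjusted: Sequence[float], truth: Sequence[float]) -> float:
    """Mean squared error between adjusted and ground truth depths."""
    return sum((a - t) ** 2 for a, t in zip(adjusted, truth)) / len(truth)


def train_step(params: Params, raw_depths: Sequence[float],
               truth: Sequence[float], lr: float = 0.01) -> Params:
    """One gradient-descent update of (scale, bias) on the objective above."""
    scale, bias = params
    n = len(raw_depths)
    residuals = [a - t for a, t in zip(adjust(params, raw_depths), truth)]
    grad_scale = 2.0 * sum(r * d for r, d in zip(residuals, raw_depths)) / n
    grad_bias = 2.0 * sum(residuals) / n
    return (scale - lr * grad_scale, bias - lr * grad_bias)


# Hypothetical data: ray-cast depths are uniformly 0.1 m short of ground truth.
raw = [1.0, 2.0, 3.0]
truth = [1.1, 2.1, 3.1]
params = (1.0, 0.0)
loss_before = objective(adjust(params, raw), truth)
params = train_step(params, raw, truth)
loss_after = objective(adjust(params, raw), truth)
```

Under these assumptions, a single update lowers the objective, which is the behavior the two limitations describe when read together.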
Regarding claim 12, Kwon and Ecins teach all the features with respect to claim 11 as outlined above. Further, Ecins teaches that the computing system of claim 11, wherein 
Regarding claim 14, Kwon and Ecins teach all the features with respect to claim 11 as outlined above. Further, Ecins teaches that the computing system of claim 11, wherein modifying the one or more values of the one or more parameters of the machine-learned geometry model based at least in part on the objective function comprises backpropagating the 
Regarding claim 15, Kwon and Ecins teach all the features with respect to claim 11 as outlined above. Further, Ecins teaches that the computing system of claim 11, wherein the machine-learned geometry network comprises a parametric continuous convolution network (See Ecins: Figs. 1-2, and [0059], “As described herein, an exemplary neural network is a biologically inspired algorithm which passes input data through a series of connected layers to produce an output. Each layer in a neural network can also comprise another neural network, or can comprise any number of layers (whether convolutional or not). As can be understood in the context of this disclosure, a neural network can utilize machine learning, which can refer to a broad class of such algorithms in which an output is generated based on learned parameters”).
Regarding claim 16, Kwon and Ecins teach all the features with respect to claim 11 as outlined above. Further, Ecins teaches that the computing system of claim 11, wherein the machine-learned geometry network comprises a plurality of continuous fusion layers with residual connections between each adjacent layer (See Ecins: Fig. 2, and [0059], “As described herein, an exemplary neural network is a biologically inspired algorithm which passes input data through a series of connected layers to produce an output. Each layer in a neural network can also comprise another neural network, or can comprise any number of layers (whether convolutional or not). As can be understood in the context of this disclosure, a neural network can utilize machine learning, which can refer to a broad class of such algorithms in which an output is generated based on learned parameters”).
Regarding claim 17, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. Further, Kwon and Ecins teach that one or more non-transitory computer-readable media that collectively store instructions that, when executed by one or more processors, cause the one or more processor to perform operations, the operations (See Kwon: Figs. 1 and 6-7, and [0053], “Referring now to FIGS. 6A, 6B and FIG. 7, FIGS. 6A and 6B are a flow chart of an example of a method 600 for generating a 3D model point cloud of an object in accordance with another embodiment of the present disclosure. FIG. 7 is a block schematic diagram illustrating portions of the exemplary method 600 in FIGS. 6A and 6B. In accordance with an embodiment, the method 600 is embodied in and performed by the system 100 in FIGS. 1A and 1B. For example, the set of functions 145 includes the method 600. As described herein, generating the 3D model point cloud includes generating the 3D model point cloud using heterogeneous 2D and 3D sensor fusion in that data from the 2D imaging sensor 106 is combined or fused with data from the 3D imaging sensor 116 for the same viewpoint 138 or location of the sensor platform 136 for each viewpoint 138 or location of the sensor platform 136”; and [0020], “In accordance with an embodiment and any of the previous embodiments, wherein the 2D imaging sensor includes an electro-optical camera to capture a 2D electro-optical image and wherein the 3D imaging sensor includes a 3D Light Detection and Ranging (LiDAR) imaging sensor”) comprising:
obtaining a three-dimensional map of an environment (See Kwon: Figs. 1A-B, and [0034], “FIG. 1A is a block schematic diagram of an example of a system 100 for generating a 3D model point cloud 102 of an object 104 in accordance with an embodiment of the present disclosure. The 3D model point cloud 102 may also be referred to herein as simply the 3D 
determining a trajectory that describes a series of locations of a virtual object relative to the environment over time (See Ecins: Fig. 2, and [0052], “In at least one example, the localization component 220 can include functionality to receive data from the sensor system(s) 206 to determine a position of the vehicle 202. For example, the localization component 220 can include and/or request/receive a three-dimensional map of an environment and can continuously determine a location of the autonomous vehicle within the map. In some instances, the localization component 220 can utilize SLAM (simultaneous localization and mapping) or CLAMS (calibration, localization and mapping, simultaneously) to receive image data, LIDAR data, radar data, IMU data, GPS data, wheel encoder data, and the like to accurately determine a location of the autonomous vehicle. In some instances, the localization component 220 can provide data to various components of the vehicle 202 to determine an initial position of an autonomous vehicle for generating a candidate trajectory, as discussed herein”);
performing ray casting on the three-dimensional map according to the trajectory to generate an initial three-dimensional point cloud that comprises a plurality of points, wherein 
processing, using a machine-learned geometry network (See Ecins: Figs. 1-2, and [0076], “Further, in some examples, the semantic information can include a semantic classification of objects represented by polygons in the 3D map. For example, the semantic information can include semantic classifications including, but not limited to, ground, wall, road, curb, sidewalk, grass, tree, tree trunk/branch, foliage (e.g., leaves), building, wall, fire hydrant, mailbox, pole, post, pedestrian, bicyclist, animal (e.g., dog), and the like. In some instances, the semantic information can provide an indication of whether the polygon, object, or element represents a static object, dynamic object, etc. In some instances, the semantic information can include an object identifier to distinguish between different instances of the same semantic classification (e.g., tree #1, tree #2, etc.). The semantic information component 240 can use classical and/or 
generating an adjusted three-dimensional point cloud in which the one or more of the plurality of points have the respective adjusted depth predicted by the machine-learned geometry network (See Kwon: Figs. 6-7, and [0057], “In block 618, a determination is made whether all viewpoints or locations of the sensor platform have been completed. If not, the method 600 will advance to block 620. In block 620, the sensor platform is moved to the next viewpoint or location and the method 600 will return to block 602 and the method 600 will proceed similar to that previously described. Accordingly, the process or method 600 is repeated until an updated 3D model point cloud 712 has been determined for all viewpoints or desired sensor platform locations. In block 602, a subsequent 2D image of the object is 
Regarding claim 18, Kwon and Ecins teach all the features with respect to claim 17 as outlined above. Further, Ecins teaches that the one or more non-transitory computer-readable media of claim 17, wherein the operations further comprise:
generating a respective intensity value for each of the plurality of points based at least in part on intensity data included in the three-dimensional map for locations within a radius of a respective location associated with such point in either the initial three-dimensional point cloud or the adjusted three-dimensional point cloud (See Ecins: Fig. 2, and [0062], “In at least one example, the sensor system(s) 206 can include LIDAR sensors, radar sensors, ultrasonic transducers, sonar sensors, location sensors (e.g., GPS, compass, etc.), inertial sensors (e.g., inertial measurement units (IMUs), accelerometers, magnetometers, gyroscopes, etc.), cameras (e.g., RGB, IR, intensity, depth, etc.), microphones, wheel encoders, environment sensors (e.g., temperature sensors, humidity sensors, light sensors, pressure sensors, etc.), etc. The sensor system(s) 206 can include multiple instances of each of these or other types of sensors. For instance, the LIDAR sensors can include individual LIDAR sensors located at the corners, front, back, sides, and/or top of the vehicle 202. As another example, the camera sensors can include multiple cameras disposed at various locations about the exterior and/or interior of the vehicle 202. The sensor system(s) 206 can provide input to the vehicle computing device 204. Additionally, and/or alternatively, the sensor system(s) 206 can send sensor data, via the one or 
Regarding claim 19, Kwon and Ecins teach all the features with respect to claim 17 as outlined above. Further, Ecins teaches that the one or more non-transitory computer-readable media of claim 17, wherein performing the ray casting to generate the initial three-dimensional point cloud comprises determining, for each of a plurality of rays, a ray casting location and a ray casting direction based at least in part on the trajectory (See Ecins: Fig. 1, and [0019], “Although many examples are discussed in the context of voxels, the techniques can be implemented for other representations, such as meshes. For example, in a mesh representation of an environment, the techniques can use ray casting (vertical/horizontal or near vertical/horizontal rays) to see if the ray intersects with multiple mesh polygons. If so, the techniques may determine an error (e.g., a multiple-surfaces error)”).
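For illustration only, the limitation of claim 19 (a ray casting location and a ray casting direction determined from the trajectory) may be sketched as follows. This sketch is not taken from Kwon, Ecins, or the application. The pose layout (x, y, z, yaw), the function name rays_for_pose, and the azimuth and elevation sampling are all hypothetical assumptions chosen to keep the example small.

```python
import math

def rays_for_pose(x, y, z, yaw, num_azimuth=8, elevations_deg=(-10.0, 0.0, 10.0)):
    """Return (origin, direction) pairs for one pose on the trajectory.

    Assumption: the ray casting location is the pose position, and the
    directions sweep a full circle of azimuths (offset by the pose yaw)
    at a few fixed elevation angles, loosely like a spinning sensor.
    """
    origin = (x, y, z)
    rays = []
    for i in range(num_azimuth):
        azimuth = yaw + 2.0 * math.pi * i / num_azimuth
        for elevation_deg in elevations_deg:
            elevation = math.radians(elevation_deg)
            # Unit direction vector from spherical angles.
            direction = (
                math.cos(elevation) * math.cos(azimuth),
                math.cos(elevation) * math.sin(azimuth),
                math.sin(elevation),
            )
            rays.append((origin, direction))
    return rays

# Hypothetical two-pose trajectory: (x, y, z, yaw in radians).
trajectory = [(0.0, 0.0, 1.5, 0.0), (1.0, 0.0, 1.5, 0.1)]
all_rays = [ray for pose in trajectory for ray in rays_for_pose(*pose)]
```

Each pose contributes its own ray origins, so the set of rays changes as the virtual object moves along the trajectory.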
Regarding claim 20, Kwon and Ecins teach all the features with respect to claim 17 as outlined above. Further, Ecins teaches that the one or more non-transitory computer-readable media of claim 17, wherein performing, by the computing system, the ray casting to generate the initial three-dimensional point cloud comprises:
identifying, by the computing system for each of the plurality of rays, a closest surface element in the three-dimensional map to the ray casting location and along the ray casting direction (See Ecins: Fig. 1, and [0036], “A situation 124 represents a valid situation (e.g., no noisy-surface-normals error). Here, the computing device analyzes a row of voxels and determines that a first voxel that is associated with the ground identifier 118 and a second voxel that is associated with the ground identifier 118 are within a threshold distance of each 
generating, by the computing system for each of the plurality of rays, one of the plurality of points with its respective depth based at least in part on a distance from the ray casting location to the closest surface element (See Ecins: Fig. 1, and [0037], “A situation 126 represents an invalid situation (e.g., a noisy-surface-normals error). Here, the computing device analyzes a row of voxels and determines that a first voxel that is associated with the ground identifier 118 and a second voxel that is associated with the ground identifier 118 are within a threshold distance of each other (e.g., adjacent voxels). Further, the computing device determines that an angle between a surface normal vector associated with the first voxel and a surface normal vector associated with the second voxel meets or exceeds a threshold amount. As one example, the situation 126 represents an uneven surface that is due to an error (e.g., a voxel includes inaccurate data)”).
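For illustration only, the two limitations of claim 20 (identifying the closest surface element along a ray and deriving a depth from the distance to it) may be sketched as follows. This sketch is not taken from Kwon, Ecins, or the application. It assumes, hypothetically, that each surface element is a flat disk given as (center, normal, radius) and that the ray direction is a unit vector, so the ray parameter t equals the distance.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def cast_ray(origin, direction, surfels):
    """Return the distance to the closest disk hit by the ray, or None.

    Assumptions: direction is unit length; each surfel is a hypothetical
    (center, normal, radius) tuple describing a flat disk.
    """
    best = None
    for center, normal, radius in surfels:
        denom = dot(direction, normal)
        if abs(denom) < 1e-9:
            continue  # Ray is parallel to the disk plane.
        offset = [c - o for c, o in zip(center, origin)]
        t = dot(offset, normal) / denom
        if t <= 0:
            continue  # Plane lies behind the ray casting location.
        hit = [o + t * d for o, d in zip(origin, direction)]
        # Keep the hit only if it falls inside the disk and is the nearest so far.
        if math.dist(hit, center) <= radius and (best is None or t < best):
            best = t
    return best

surfels = [
    ((5.0, 0.0, 0.0), (-1.0, 0.0, 0.0), 1.0),  # farther disk on the ray
    ((3.0, 0.0, 0.0), (-1.0, 0.0, 0.0), 1.0),  # closest disk on the ray
    ((2.0, 5.0, 0.0), (-1.0, 0.0, 0.0), 1.0),  # disk off to the side, missed
]
depth = cast_ray((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), surfels)
```

The returned value plays the role of the respective depth of the generated point. A ray that intersects no surface element returns None and would contribute no point.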


Claims 10 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kwon, etc. (US 20200118281 A1) in view of Ecins, etc. (US 20200110158 A1), further in view of Schulter, etc. (US 20190096125 A1).
Regarding claim 10, Kwon and Ecins teach all the features with respect to claim 1 as outlined above. However, Kwon fails to explicitly disclose that the computer-implemented 
However, Schulter teaches that the computer-implemented method of claim 1, wherein the machine-learned geometry model has been trained using an objective function that comprises an adversarial loss term that measures an ability of a discriminator network to select which of a synthetic three-dimensional point cloud generated using the machine-learned geometry model and a ground truth three-dimensional point cloud collected by a physical LiDAR system is real and which is synthetic (See Schulter: Figs. 5-6, and [0080], “Therefore, the error calculated by the adversarial loss unit 342 can be used to correct the representation of corresponding features in the background bird's eye view 108. Thus, the refinement loss module 340 returns the error from the adversarial loss unit 342 to the refinement network 310 to update the encoder parameters and decoder parameters using, e.g., gradient descent, among other backpropagation techniques”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kwon to have the computer-implemented method of claim 1, wherein the machine-learned geometry model has been trained using an objective function that comprises an adversarial loss term that measures an ability of a discriminator network to select which of a synthetic three-dimensional point cloud generated using the machine-learned geometry model and a ground truth three-dimensional point cloud collected 
Regarding claim 13, Kwon and Ecins teach all the features with respect to claim 11 as outlined above. Further, Schulter teaches that the computing system of claim 11, wherein evaluating the objective function comprises:
providing the adjusted three-dimensional point cloud and the ground truth three-dimensional point cloud to a discriminator model (See Schulter: Fig. 6, and [0085], “However, GPS signals may contain noise or inaccuracies, and angle estimates may be imperfect due to annotation noise and missing information. Therefore, the street map warp 320 can, alternatively, align the aerial view to the background bird's eye view 108 by matching semantics and geometry. For example, the street map warp 320 can include, e.g., a parametric spatial transformer 322 and a non-parametric warp 324”);
receiving, from the discriminator model, a selection of one of the adjusted three-dimensional point cloud and the ground truth three-dimensional point cloud as real (See Schulter: Fig. 6, and [0077], “To train the encoder parameters and the decoder parameters, a refinement loss module 340 incorporates information from one or more of, e.g., a simulator 330 and a street map warp 320 via an adversarial loss unit 342 and a reconstruction loss unit 344, respectively”); and
determining a loss value based at least in part on the selection received from the discriminator model (See Schulter: Fig. 6, and [0077], “To train the encoder parameters and the decoder parameters, a refinement loss module 340 incorporates information from one or more of, e.g., a simulator 330 and a street map warp 320 via an adversarial loss unit 342 and a 


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GORDON G LIU whose telephone number is (571)270-0382. The examiner can normally be reached Monday - Friday 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached at 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/GORDON G LIU/Primary Examiner, Art Unit 2612