Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/3/2018 was filed.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings are objected to because they have a line quality that is too light to be reproduced (the weight of all lines and letters must be heavy enough to permit adequate reproduction) (FIGs. 3 and 4) and text that is illegible (reference characters, sheet numbers, and view numbers must be plain and legible) (FIG. 2). See 37 CFR 1.84(l) and (p)(1). Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures.

Specification
The disclosure is objected to because of the following informalities:
“and/or r computing systems” should read “and/or computing systems” [0025]
“positive training samples (e.g., training samples that do not belong to any of the classes of the one or more objects)” should read “positive training samples (e.g., training samples that belong to the classes of the one or more objects)” [0037]
“400over” should read “400 over” [00104]
Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites

- generating, by the computing system, based at least in part on the sensor data, an input representation of the one or more objects, wherein the input representation comprises a temporal dimension and one or more spatial dimensions;
- determining, by the computing system, based at least in part on the input representation and a machine-learned model, at least one of one or more detected object classes of the one or more objects, one or more locations of the one or more objects over the one or more time intervals, or one or more predicted paths of the one or more objects; and
- generating, by the computing system, based at least in part on the input representation and the machine-learned model, output data comprising one or more bounding shapes corresponding to the one or more objects.
2A Prong 1: The limitation of generating, by the computing system, based at least in part on the sensor data, an input representation of the one or more objects, wherein the input representation comprises a temporal dimension and one or more spatial dimensions, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the “by the computing system” language, “generating” in the context of this claim encompasses the user mentally adding [...]
Similarly, the limitation of determining, by the computing system, based at least in part on the input representation and a machine-learned model, at least one of one or more detected object classes of the one or more objects, one or more locations of the one or more objects over the one or more time intervals, or one or more predicted paths of the one or more objects, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses the user mentally deciding the existence or trajectories of objects based on the information.
Similarly, the limitation of generating, by the computing system, based at least in part on the input representation and the machine-learned model, output data comprising one or more bounding shapes corresponding to the one or more objects, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, “generating” in the context of this claim encompasses the user mentally drawing bounding boxes around the objects.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites the receiving step. The receiving step is recited at a high level of generality and amounts to mere data receiving, which is a form of insignificant extra-solution activity. Further, the claim recites the additional element of using a computing system to perform the generating and determining steps. The computing system is recited at a high level of generality. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the receiving step was considered to be extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “receiving or transmitting data over a network” is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim). Accordingly, a conclusion that the claimed receiving step is well-understood, routine, conventional activity is supported under Berkheimer. The claim is not patent eligible.

Similarly, claims 15 and 18 are rejected under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without adding significantly more than the judicial exception.

Claim 2 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- generating, by the computing system, based at least in part on the sensor data, a plurality of voxels corresponding to the environment comprising the one or more objects, wherein a height dimension of the plurality of voxels is used as an input channel of the input representation, and wherein the input representation is based at least in part on the plurality of voxels corresponding to one or more portions of the environment occupied by the one or more objects.
2A Prong 1: The limitation of generating, by the computing system, based at least in part on the sensor data, a plurality of voxels corresponding to the environment comprising the one or more objects, wherein a height dimension of the plurality of voxels is used as an input channel of the input representation, and wherein the input representation is based at least in part on the plurality of voxels corresponding to one or more portions of the environment occupied by the one or more objects, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language “by the computing system”, “generating” in the context of this claim encompasses the user mentally drawing voxels at the locations of the objects based on the information.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element – a computing system. The computing system is recited at a high level of generality. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computing system to generate voxels amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- wherein the input representation comprises a tensor associated with a set of dimensions comprising the temporal dimension and the one or more spatial dimensions, the temporal dimension of the tensor associated with the one or more time intervals, and the one or more spatial dimensions of the tensor comprising a width dimension, a depth dimension, or a height dimension that is used as an input channel for the machine-learned model.
2A Prong 1: The limitation wherein the input representation comprises a tensor associated with a set of dimensions comprising the temporal dimension and the one or more spatial dimensions, the temporal dimension of the tensor associated with the one or more time intervals, and the one or more spatial dimensions of the tensor comprising a width dimension, a depth dimension, or a height dimension that is used as an input channel for the machine-learned model, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitation in the context of this claim encompasses the user mentally adding the timestamp and exterior measurement values to the collection.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, there is no additional element that integrates the abstract idea into a practical application or imposes any meaningful limit on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.

Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- wherein the input representation is input to a first convolution layer of a plurality of convolution layers of the machine-learned model, and wherein weights of a plurality of feature maps for each of the plurality of convolution layers are shared between the plurality of convolution layers. 
2A Prong 1: The limitation wherein the input representation is input to a first convolution layer of a plurality of convolution layers of the machine-learned model, and wherein weights of a plurality of feature maps for each of the plurality of convolution layers are shared between the plurality of convolution layers, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitation in the context of this claim encompasses the user mentally relaying information to the next step.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, there is no additional element that integrates the abstract idea into a practical application or imposes any meaningful limit on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.

Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- aggregating, by the computing system, temporal information to the tensor subsequent to aggregating spatial information associated with the one or more spatial dimensions to the tensor, wherein the temporal information is aggregated as the input representation is processed by the plurality of convolution layers, and wherein the temporal information is associated with the temporal dimension of the tensor.
2A Prong 1: The limitation of aggregating, by the computing system, temporal information to the tensor subsequent to aggregating spatial information associated with the one or more spatial dimensions to the tensor, wherein the temporal information is aggregated as the input representation is processed by the plurality of convolution layers, and wherein the temporal information is associated with the temporal dimension of the tensor, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language “by the computing system”, “aggregating” in the
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element – a computing system. The computing system is recited at a high level of generality. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computing system to perform the aggregating step amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Claim 6 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- wherein the temporal information associated with the temporal dimension is aggregated at the first convolution layer of the plurality of convolution layers.
2A Prong 1: The limitation wherein the temporal information associated with the temporal dimension is aggregated at the first convolution layer of the plurality of convolution layers, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitation in the context of this claim encompasses the user mentally relaying information to the next step.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, there is no additional element that integrates the abstract idea into a practical application or imposes any meaningful limit on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.

Claim 7 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites
- reducing, by the computing system, the one or more time intervals of the temporal dimension to one time interval by performing a one-dimensional convolution on the temporal information associated with the temporal dimension.
2A Prong 1: The limitation of reducing, by the computing system, the one or more time intervals of the temporal dimension to one time interval by performing a one-dimensional convolution on the temporal information associated with the temporal dimension, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language “by the computing system”, “reducing” in the context of this claim encompasses the user mentally multiplying the values and combining the results into fewer variables.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element – a computing system. The computing system is recited at a high level of generality. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computing system to perform the reducing step amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Claim 8 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- wherein the temporal information associated with the temporal dimension of the tensor is aggregated over two or more convolution layers of the plurality of convolution layers.
2A Prong 1: The limitation wherein the temporal information associated with the temporal dimension of the tensor is aggregated over two or more convolution layers of the plurality of convolution layers, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitation in the context of this claim encompasses the user mentally relaying information to the next step.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, there is no additional element that integrates the abstract idea into a practical application or imposes any meaningful limit on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.

Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- reducing, by the computing system, the one or more time intervals of the temporal dimension to one time interval by performing a two-dimensional convolution on the temporal information associated with the temporal dimension.
2A Prong 1: The limitation of reducing, by the computing system, the one or more time intervals of the temporal dimension to one time interval by performing a two-dimensional convolution on the temporal information associated with the temporal dimension, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language “by the computing system”, “reducing” in the
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element – a computing system. The computing system is recited at a high level of generality. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computing system to perform the reducing step amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- activating, by the computing system, based at least in part on the output data, one or more systems comprising mechanical systems, one or more electromechanical systems, or one or more electronic systems, associated with operation of the computing system, a manually operated vehicle, an autonomous vehicle, or one or more robotic systems.
2A Prong 1: The limitation of activating, by the computing system, based at least in part on the output data, one or more systems comprising mechanical systems, one or more electromechanical systems, or one or more electronic systems, associated with operation of the computing system, a manually operated vehicle, an autonomous vehicle, or one or more robotic systems, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language “by the computing system”, “activating” in the context of this claim encompasses the user mentally relaying the instructions to the next step.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
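2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element – a computing system. The computing system is recited at a high level of generality. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computing system to perform the activating step amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.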

Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- determining, by the computing system, one or more travelled paths of the one or more objects based at least in part on one or more locations of the one or more objects over a sequence of the one or more time intervals comprising a last time interval associated with a current time and the one or more time intervals prior to the current time, wherein the one or more predicted paths of the one or more objects is based at least in part on the one or more travelled paths.
2A Prong 1: The limitation of determining, by the computing system, one or more travelled paths of the one or more objects based at least in part on one or more locations of the one or more objects over a sequence of the one or more time intervals comprising a last time interval associated with a current time and the one or more time intervals prior to the current time, wherein the one or more predicted paths of the one or more objects is based at least in part on the one or more travelled paths, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language “by the computing system”, “determining” in the context of this claim encompasses the user mentally calculating the trajectory of the objects based on the information.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element – a computing system. The computing system is recited at a high level of generality. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computing system to perform the determining step amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- detecting, by the computing system, an object of the one or more objects that is at least partly occluded; and
- determining, by the computing system, based at least in part on the one or more travelled paths of the one or more objects, when the object of the one or more objects that is at least partly occluded was previously detected.
2A Prong 1: The limitation of detecting, by the computing system, an object of the one or more objects that is at least partly occluded, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language “by the computing system”, “detecting” in the context of this claim encompasses the user mentally comparing the locations of objects and deciding if there is an overlap. Similarly, the limitation of determining, by the computing system, based at least in part on the one or more travelled paths of the one or more objects, when the object of the one or more objects that is at least partly occluded was previously detected, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “by the computing system”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the language “by the computing system”, “determining” in the context of this claim encompasses the user mentally comparing the characteristics of the objects and deciding if they are the same.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element – a computing system. The computing system is recited at a high-level of generality. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computing system to perform the detecting and determining steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- wherein the one or more sensor outputs comprise one or more three-dimensional points corresponding to a plurality of surfaces of the one or more objects detected by the one or more sensors. 
2A Prong 1: The limitation wherein the one or more sensor outputs comprise one or more three-dimensional points corresponding to a plurality of surfaces of the one or more objects detected by the one or more sensors, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For 
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, nothing in the claim integrates the abstract idea into a practical application or imposes any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.

Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- wherein the sensor data is associated with a birds eye view vantage point, the one or more sensors comprising one or more light detection and ranging devices (LIDAR), one or more cameras, one or more radar devices, one or more sonar devices, or one or more thermal sensors. 
2A Prong 1: The limitation wherein the sensor data is associated with a birds eye view vantage point, the one or more sensors comprising one or more light detection and ranging devices (LIDAR), one or more cameras, one or more radar devices, one or more sonar devices, or one or more thermal sensors, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitation in the context of this claim encompasses the user mentally collecting the data from sensors.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, nothing in the claim integrates the abstract idea into a practical application or imposes any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.

Claim 16 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- generating the machine-learned model based at least in part on training data comprising a plurality of training objects associated with a plurality of classified features and a plurality of classified object labels, the plurality of classified features based at least in part on point cloud data comprising a plurality of three-dimensional points associated with one or more physical characteristics of the plurality of training objects.
2A Prong 1: The limitation of generating the machine-learned model based at least in part on training data comprising a plurality of training objects associated with a plurality of classified features and a plurality of classified object labels, the plurality of classified features based at least in part on point cloud data comprising a plurality of three-dimensional points associated with one or more physical characteristics of the plurality of training objects, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “generating” in the context of this claim encompasses the user mentally setting up the layers of neurons and assigning weights to them.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
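2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, nothing in the claim integrates the abstract idea into a practical application or imposes any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.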
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.

Claim 19 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- wherein the machine-learned model is based at least in part on one or more classification techniques comprising a convolutional neural network. 
2A Prong 1: The limitation wherein the machine-learned model is based at least in part on one or more classification techniques comprising a convolutional neural network, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitation in the context of this claim encompasses the user mentally setting up the rules for a model.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, nothing in the claim integrates the abstract idea into a practical application or imposes any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.

Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites 
- determining, based at least in part on the input representation and the machine-learned model, an amount of overlap between the one or more bounding shapes; and 
- determining that the object of the one or more objects associated with the one or more bounding shapes that satisfies the one or more overlap criteria is the same object over the one or more time intervals.
2A Prong 1: The limitation of determining, based at least in part on the input representation and the machine-learned model, an amount of overlap between the one or more bounding shapes, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses the user mentally comparing the locations of the bounding shapes. Similarly, the limitation of determining that the object of the one or more objects associated with the one or more bounding shapes that satisfies the one or more overlap criteria is the same object over the one or more time intervals, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses the user mentally comparing the characteristics of the objects and deciding whether they are the same.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim recites no additional element. Accordingly, nothing in the claim integrates the abstract idea into a practical application or imposes any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there is no additional element. The claim is not patent eligible.
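
For illustration only, the overlap determination recited in claim 20 can be sketched as an intersection-over-union comparison between two axis-aligned bounding boxes taken at different time intervals. The box layout (x_min, y_min, x_max, y_max), the function names, and the 0.5 threshold are assumptions made for this sketch; neither the claim nor the cited art specifies them.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is an assumed (x_min, y_min, x_max, y_max) tuple.
    """
    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def same_object(prev_box, curr_box, threshold=0.5):
    """Hypothetical overlap criterion: treat the boxes as the same object
    when their overlap meets the assumed threshold."""
    return iou(prev_box, curr_box) >= threshold
```

Under these assumptions, two identical boxes give an overlap of 1.0 and two disjoint boxes give 0.0.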

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1, 15 and 18 are rejected under 35 U.S.C. 102(a)(1) and 35 U.S.C. 102(a)(2) as anticipated by Butt et al. (US 2018/0157939 A1, hereinafter Butt).
Regarding Claim 1
Butt teaches:
A computer-implemented method of object detection, the computer-implemented method comprising: receiving, by a computing system comprising one or more computing devices, sensor data comprising information based at least in part on one or more sensor outputs associated with detection of an environment over one or more time intervals by one or more sensors, wherein the environment comprises one or more objects (Butt [Abstract] “The system comprises one or more processors and memory comprising computer program code stored on the memory and configured when executed by the one or more processors to cause the one or more processors to perform a method.”;  [0114] “The video is captured by the camera 108 over a period of time.” [0072] “Each video capture device 108 includes at least one image sensor 116 for capturing a plurality of images.” [0092] “The video analytics module 224 receives image data and analyzes the image data to determine properties or characteristics of the captured image or video and/or of objects found in the scene represented by the image or video.”; “processors and memory” reads on “a computing system”, “a period of time” reads on “time interval”; “objects found in the scene” reads on “the environment comprises one or more objects”)
- generating, by the computing system, based at least in part on the sensor data, an input representation of the one or more objects, wherein the input representation comprises a temporal dimension and one or more spatial dimensions; (Butt [0138] “The Data 710 in Object Profile 702 and Object Profile 704 has, for example, content including time stamp, frame number, resolution in pixels by width and height of the scene, segmentation mask of this frame by width and height in pixels and stride by row width in bytes, classification (person, vehicle, other), confidence by percent of the classification, box (bounding box surrounding the profiled object) by width and height in normalized sensor coordinates, image width and height in pixels as well as image stride (row width in bytes), segmentation mask of image, orientation, and x & y coordinates of the image box.”; “time stamp” reads on “temporal dimension” and “width and height” reads on “spatial dimension”)
(Butt [0009] “The implemented learning machine may be a second learning machine, and the identifying may be performed by a first learning machine implemented by the one or more processors.” [0145] “The temporal object classification module 912 may also maintains class (such as, for example, human, vehicle, or animal) information of an object over a period of time. … the temporal object classification module 912 determines the objects type based on its appearance in multiple frames. For example, gait analysis of the way a person walks can be useful to classify a person, or analysis of a person's legs can be useful to classify a cyclist. The temporal object classification module 912 may combine information regarding the trajectory of an object”; “object type” reads on “object classes” and “trajectory” reads on “the predicted paths”)
- generating, by the computing system, based at least in part on the input representation and the machine-learned model, output data comprising one or more bounding shapes corresponding to the one or more objects. (Butt [0106] “For example, the location metadata may be further used to generate a bounding box (such as, for example, when encoding video or playing back video) outlining the detected foreground visual object.”)

Regarding Claim 15


Regarding Claim 18
Claim 18 is a computing device claim corresponding to the method of claim 1, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 1. Note that Butt teaches a computing device ([0007]).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Adams et al. (US 2019/0049242 A1 hereinafter Adams).
Regarding Claim 2
Butt teaches all of the limitations of claim 1 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 1, further comprising: generating, by the computing system, based at least in part on the sensor data, a plurality of voxels corresponding to the environment comprising the one or more objects, wherein a height dimension of the plurality of voxels is used as an input channel of the input representation, and wherein the input representation is based at least in part on the plurality of voxels corresponding to one or more portions of the environment occupied by the one or more objects.
However, Adams teaches

- The computer-implemented method of claim 1, further comprising: generating, by the computing system, based at least in part on the sensor data, a plurality of voxels corresponding to the environment comprising the one or more objects, wherein a height dimension of the plurality of voxels is used as an input channel of the input representation, and wherein the input representation is based at least in part on the plurality of voxels corresponding to one or more portions of the environment occupied by the one or more objects. (Adams [0019-0020] “In some examples, various sensor data may be accumulated into a voxel space or array. Such a voxel space may be a three-dimensional representation comprising a plurality of voxels. As a non-limiting example, a voxel space may be a rectangular cuboid having a length, a width, and a height, comprising a plurality voxels, each having a similar shape. In some examples, the voxel space is representative of an environment such that an origin of the voxel space may be described by a position and/or orientation in an environment. Similarly, each voxel may be described by one or more of a position and orientation in an environment, or a coordinate relative to an origin of the voxel space”; “environment” reads on “object”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt with the voxel creation of Adams in order to achieve better representations of objects and efficient determination of overlaps between objects. (Adams, [0019])
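
For illustration only, the voxel accumulation described in the quoted Adams passage, together with the claimed use of a height dimension as an input channel, can be sketched as follows. The voxel size, the dictionary layout, and the function names are assumptions made for this sketch and are not taken from Adams or from the claims.

```python
import math


def voxelize(points, voxel_size=1.0):
    """Map (x, y, z) points to the set of occupied integer voxel indices."""
    occupied = set()
    for x, y, z in points:
        occupied.add((
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        ))
    return occupied


def height_channels(occupied, num_height_bins):
    """Arrange occupied voxels as {(ix, iy): [0 or 1 per height bin]},
    so that each height bin acts like one input channel of a 2D grid."""
    grid = {}
    for ix, iy, iz in occupied:
        if 0 <= iz < num_height_bins:
            # Create the per-cell channel list on first use, then mark the bin.
            grid.setdefault((ix, iy), [0] * num_height_bins)[iz] = 1
    return grid
```

In this sketch, cells with no occupied voxel are simply absent from the dictionary; a dense array would store zeros for them instead.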

Claims 3-4 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Mehrseresht (US 2019/0188866 A1).
Regarding Claim 3
Butt teaches all of the limitations of claim 1 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 1, wherein the input representation comprises a tensor associated with a set of dimensions comprising the temporal dimension and the one or more spatial dimensions, the temporal dimension of the tensor associated with the one or more time intervals, and the one or more spatial dimensions of the tensor comprising a width dimension, a depth dimension, or a height dimension that is used as an input channel for the machine-learned model.
	However, Mehrseresht teaches
(Mehrseresht [0108] “In one arrangement, an input 905 to the convolutional neural network is a segment of the sequence of spatial representations 125 containing spatial representation from 16 consecutive timestamps (16 frames). In the convolutional neural network 900 illustrated in FIG. 9, the convolution filters are c×3×3×3 tensors, where c is the number of channels in the previous layer. The convolutional neural network 900 has stride of (v.sub.1, v.sub.2, v.sub.3) indicating convolution with the stride of v.sub.3 over temporal dimension, and stride of v.sub.1 and v.sub.2 on the width and height of each frame of spatial representation (in a first convolution layer 910), or feature map of previous layers (in the subsequent convolution layers 911).”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt with the tensor data structure of Mehrseresht in order to achieve fine-grain detection of motions and efficient and accurate computation. (Mehrseresht, [0131])

Regarding Claim 4
Butt as modified by Mehrseresht teaches all of the limitations of claim 3 as cited above and Mehrseresht further teaches:
(Mehrseresht [0107] “Convolutional layers apply a convolution operation to the input, passing the result to the next layer. The receptive field of convolution units are often small e.g., 3 by 3, and convolution units in the same layer have the same weights. Convolution units in the same layer having the same weights is commonly referred to as weight sharing. In other words, convolutional nodes in the same layer share weights. Units in fully connected layers have connections to all units in the previous layer.”)
	Same motivation to combine Butt and Mehrseresht as claim 3.

Regarding Claim 19
Butt teaches all of the limitations of claim 18 as cited above but does not distinctly disclose:
- The computing device of claim 18, wherein the machine-learned model is based at least in part on one or more classification techniques comprising a convolutional neural network. (Mehrseresht [0114] “Training of the convolutional neural network adjusts the weights by minimizing the loss. Sigmoid cross-entropy (also called Softmax) loss is commonly used for classification.”)
	Same motivation to combine Butt and Mehrseresht as claim 3.


Claims 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Mehrseresht further in view of Banka et al. (US 2012/0197856 A1 hereinafter Banka).
Regarding Claim 5
	Butt as modified by Mehrseresht teaches all of the limitations of claim 4 as cited above and Mehrseresht further teaches: 
- the plurality of convolution layers (Mehrseresht [Fig. 9])
Butt as modified by Mehrseresht does not distinctly disclose:
- The computer-implemented method of claim 4, further comprising: aggregating, by the computing system, temporal information to the tensor subsequent to aggregating spatial information associated with the one or more spatial dimensions to the tensor, wherein the temporal information is aggregated as the input representation is processed by [the plurality of convolution layers], and wherein the temporal information is associated with the temporal dimension of the tensor.
	However, Banka teaches:
- The computer-implemented method of claim 4, further comprising: aggregating, by the computing system, temporal information to the tensor subsequent to aggregating spatial information associated with the one or more spatial dimensions to the tensor, wherein the temporal information is aggregated as the input representation is processed by [the plurality of convolution layers], and wherein the temporal information is associated with the temporal dimension of the tensor. (Banka [0028] “In particular embodiments, an aggregator nodes 16 may aggregate sensor data using both spatial and temporal factors. An aggregator node 16 may collect data from one or more sensor nodes 12 based both the spacial proximity of sensor nodes 12 and on the time-series of the sensor data. In particular embodiments, complex sensor data with multidimensional and temporal characteristics may be aggregated using multilinear algebraic techniques (such as, for example, tensor decomposition) and aggregator node 16 may only transmit key coefficients to indexer nodes 26.”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt and Mehrseresht with the data aggregation of Banka in order to achieve efficient data processing in the CNN layer. (Banka, [0063])

Regarding Claim 6
Butt as modified by Mehrseresht and Banka teaches all of the limitations of claim 5 as cited above and Mehrseresht further teaches:
- The computer-implemented method of claim 5, wherein the temporal information associated with the temporal dimension is aggregated at the first convolution layer of the plurality of convolution layers.
(Mehrseresht [Fig. 9] [0108] “In one arrangement, an input 905 to the convolutional neural network is a segment of the sequence of spatial representations 125 containing spatial representation from 16 consecutive timestamps (16 frames). In the convolutional neural network 900 illustrated in FIG. 9, the convolution filters are c×3×3×3 tensors, where c is the number of channels in the previous layer. The convolutional neural network 900 has stride of (v.sub.1, v.sub.2, v.sub.3) indicating convolution with the stride of v.sub.3 over temporal dimension, and stride of v.sub.1 and v.sub.2 on the width and height of each frame of spatial representation (in a first convolution layer 910), or feature map of previous layers (in the subsequent convolution layers 911).”; Fig. 9 discloses the input of the convolutional layer and “timestamp” reads on “temporal dimension”)

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Mehrseresht further in view of Banka further in view of Chen et al. (CN 105910827 A hereinafter Chen).

Regarding Claim 7
Butt as modified by Mehrseresht and Banka teaches all of the limitations of claim 6 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 6, wherein aggregating the temporal information comprises: reducing, by the computing system, the one or more time intervals of the temporal dimension to one time interval by performing a one-dimensional convolution on the temporal information associated with the temporal dimension.
	However, Chen teaches
- The computer-implemented method of claim 6, wherein aggregating the temporal information comprises: reducing, by the computing system, the one or more time intervals of the temporal dimension to one time interval by performing a one-dimensional convolution on the temporal information associated with the temporal dimension. (Chen [0015-0016] “Step 21. Convolution feature learning: Construct a convolution-pooling model, use a filter to perform convolution operations on the one-dimensional motor vibration signal, reduce the dimension of the feature map while ensuring the feature position is unchanged, and pull the pooled feature map into a one-dimensional vector as the final learning Fault characteristics”; Chen discloses one-dimensional convolution reduction)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt as modified by Mehrseresht and Banka with the one-dimensional convolution of Chen in order to reduce the burden of computation in the CNN and maintain the time-domain invariance. (Chen, [0098])
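
For illustration only, the reduction of several time intervals to one time interval recited in claim 7 can be sketched as a one-dimensional convolution whose kernel spans the entire temporal dimension, so that exactly one output remains per feature. The kernel weights and function names are assumptions made for this sketch; it is not presented as the method of Chen or of the applicant. As is conventional for convolution layers in neural networks, the kernel is applied without flipping.

```python
def conv1d_valid(signal, kernel):
    """One-dimensional 'valid' convolution (no padding, kernel not flipped)."""
    n, k = len(signal), len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def collapse_time(features_over_time, kernel):
    """Reduce T per-interval feature vectors to a single feature vector.

    features_over_time: list of T equal-length feature lists (one per interval).
    kernel: T assumed temporal weights; a kernel as long as the temporal
    dimension leaves exactly one output per feature.
    """
    if len(kernel) != len(features_over_time):
        raise ValueError("kernel length must equal the number of time intervals")
    num_features = len(features_over_time[0])
    collapsed = []
    for f in range(num_features):
        # Gather the values of feature f across all time intervals.
        series = [step[f] for step in features_over_time]
        collapsed.append(conv1d_valid(series, kernel)[0])
    return collapsed
```

A shorter kernel would leave more than one time interval in the output, which corresponds to aggregating the temporal information gradually rather than in a single layer.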
	
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Mehrseresht further in view of Banka further in view of Karpathy et al. (“Large-scale Video Classification with Convolutional Neural Networks” hereinafter Karpathy) 

Regarding Claim 8
Butt as modified by Mehrseresht and Banka teaches all of the limitations of claim 5 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 5, wherein the temporal information associated with the temporal dimension of the tensor is aggregated over two or more convolution layers of the plurality of convolution layers.
	However, Karpathy teaches

- The computer-implemented method of claim 5, wherein the temporal information associated with the temporal dimension of the tensor is aggregated over two or more convolution layers of the plurality of convolution layers. (Karpathy [Figure 1] discloses late fusion and shows that the video frames are aggregated over two convolutional layers. Video frames include temporal information.)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt as modified by Mehrseresht and Banka with the tensor aggregation of Karpathy in order to achieve significant performance improvement in CNN training. (Karpathy, [Abstract])

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Mehrseresht further in view of Banka further in view of Karpathy further in view of Liu et al. (US 2019/0130569 A1 hereinafter Liu).
Regarding Claim 9
Butt as modified by Mehrseresht, Banka and Karpathy teaches all of the limitations of claim 8 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 8, wherein aggregating the temporal information comprises:  reducing, by the computing system, the one or more time intervals of the temporal dimension to one time interval by performing a two-dimensional convolution on the temporal information associated with the temporal dimension.
	However, Liu teaches
- The computer-implemented method of claim 8, wherein aggregating the temporal information comprises: reducing, by the computing system, the one or more time intervals of the temporal dimension to one time interval by performing a two-dimensional convolution on the temporal information associated with the temporal dimension. (Liu [0063] “Each unit layer of the encoder network 604 may include a 2D convolution layer 614 with a set of 2D filters, batch normalization (BN) layer 616, rectified-linear unit (ReLU) activation layer 618, followed by a max-pooling layer (the pooling layer 620) for reduction of data dimensions. The unit layer may be repeated multiple times to achieve sufficient data compression.”; Liu discloses two-dimensional convolution layer for reduction of data dimensions.)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt as modified by Mehrseresht, Banka and Karpathy with the two-dimensional convolution of Liu in order to provide robust and accurate results. (Liu, [0005])

Claims 10-11, 13-14 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Vallespi-Gonzalez et al. (US 2018/0348346 A1 hereinafter Vallespi).
Regarding Claim 10
Butt teaches all of the limitations of claim 18 as cited above but does not distinctly disclose:
- The computing device of claim 18, wherein the machine-learned model is based at least in part on one or more classification techniques comprising a convolutional neural network.
	However Vallespi teaches
- The computing device of claim 18, wherein the machine-learned model is based at least in part on one or more classification techniques comprising a convolutional neural network. (Vallespi [0118] “In FIG. 13, cell classification and segmentation graph 880 depicts a first group of cells 882 of LIDAR data, a second group of cells 884 of LIDAR data, and a third group of cells 886 of LIDAR data determined in an environment surrounding autonomous vehicle 888.”; cell classification and segmentation )
	Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt with the vehicle controller of Vallespi in order to generate an appropriate motion plan for the vehicle and improve safety. (Vallespi, [0004])

Regarding Claim 11
Butt teaches all of the limitations of claim 1 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 1, further comprising: determining, by the computing system, one or more travelled paths of the one or more objects based at least in part on one or more locations of the one or more objects over a sequence of the one or more time intervals comprising a last time interval associated with a current time and the one or more time intervals prior to the current time, wherein the one or more predicted paths of the one or more objects is based at least in part on the one or more travelled paths.
	However Vallespi teaches
- The computer-implemented method of claim 1, further comprising: determining, by the computing system, one or more travelled paths of the one or more objects based at least in part on one or more locations of the one or more objects over a sequence of the one or more time intervals comprising a last time interval associated with a current time and the one or more time intervals prior to the current time, wherein the one or more predicted paths of the one or more objects is based at least in part on the one or more travelled paths. (Vallespi
[0051-0053] “The sensor data can include information that describes the location (e.g., in three-dimensional space relative to the autonomous vehicle) of points that correspond to objects within the surrounding environment of the autonomous vehicle (e.g., at one or more times). In particular, in some implementations, the perception system can determine, for each object, state data that describes a current state of such object. As examples, the state data for each object can describe an estimate of the object's current location (also referred to as position); current speed; current heading (which may also be referred to together as velocity); current acceleration; current orientation; size/footprint (e.g., as represented by a bounding shape such as a bounding polygon or polyhedron); class of characterization (e.g., vehicle versus pedestrian versus bicycle versus other); yaw rate; and/or other state information. In some implementations, the perception system can determine state data for each object over a number of iterations. In particular, the perception system can update the state data for each object at each iteration. Thus, the perception system can detect and track objects (e.g., vehicles, bicycles, pedestrians, etc.) that are proximate to the autonomous vehicle over time, and thereby produce a presentation of the world around an autonomous vehicle along with its state (e.g., a presentation of the objects of interest within a scene at the current time along with the states of the objects). The prediction system can receive the state data from the perception system and predict one or more future locations and/or moving paths for each object based on such state data.”; “one or more time” implies current time and prior times. Vallespi discloses the state data of objects and it includes location, time data and predicted path based on the travelled path.)
Same motivation to combine Butt and Vallespi as claim 10.

Regarding Claim 13
Butt teaches all of the limitations of claim 1 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 1, wherein the one or more sensor outputs comprise one or more three-dimensional points corresponding to a plurality of surfaces of the one or more objects detected by the one or more sensors.
However Vallespi teaches
- The computer-implemented method of claim 1, wherein the one or more sensor outputs comprise one or more three-dimensional points corresponding to a plurality of surfaces of the one or more objects detected by the one or more sensors. (Vallespi [0032] “In some embodiments, LIDAR data includes a three-dimensional point cloud of LIDAR data points received from around the periphery of an autonomous vehicle.”)
Same motivation to combine Butt and Vallespi as claim 10.

Regarding Claim 14
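Butt teaches all of the limitations of claim 1 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 1, wherein the sensor data is associated with a birds eye view vantage point, the one or more sensors comprising one or more light detection and ranging devices (LIDAR), one or more cameras, one or more radar devices, one or more sonar devices, or one or more thermal sensors.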
However Vallespi teaches
- The computer-implemented method of claim 1, wherein the sensor data is associated with a birds eye view vantage point, the one or more sensors comprising one or more light detection and ranging devices (LIDAR), one or more cameras, one or more radar devices, one or more sonar devices, or one or more thermal sensors. (Vallespi [0035] “For example, a second representation of LIDAR data can correspond to a top-view representation. In contrast to the range-view representation of LIDAR data described above, a top-view representation can correspond to a representation of LIDAR data as viewed from a bird's eye or plan view relative to an autonomous vehicle and/or ground surface. A top-view representation of LIDAR data is generally from a vantage point that is substantially perpendicular to the vantage point of a range-view representation of the same data.”)
Same motivation to combine Butt and Vallespi as claim 10.

Regarding Claim 16
Butt teaches all of the limitations of claim 15 as cited above but does not distinctly disclose:
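- The one or more tangible non-transitory computer-readable media of claim 15, further comprising: generating the machine-learned model based at least in part on training data comprising a plurality of training objects associated with a plurality of classified features and a plurality of classified object labels, the plurality of classified features based at least in part on point cloud data comprising a plurality of three-dimensional points associated with one or more physical characteristics of the plurality of training objects.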
	However Vallespi teaches
- The one or more tangible non-transitory computer-readable media of claim 15, further comprising: generating the machine-learned model based at least in part on training data comprising a plurality of training objects associated with a plurality of classified features and a plurality of classified object labels, the plurality of classified features based at least in part on point cloud data comprising a plurality of three-dimensional points associated with one or more physical characteristics of the plurality of training objects. (Vallespi [0150] “The detector training dataset 992 can further include a second portion of data corresponding to labels identifying corresponding objects detected within each portion of input sensor data. In some implementations, the labels can include at least a bounding shape corresponding to each detected object of interest. In some implementations, the labels can additionally include a classification for each object of interest from a predetermined set of objects including one or more of a pedestrian, a vehicle, or a bicycle.” [0063] “In some embodiments, LIDAR data 12 can include a three-dimensional point cloud of LIDAR data points received from around the periphery of an autonomous vehicle.”)
	Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt with the 

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Vallespi further in view of Li et al. (WO 2010042068 A1 hereinafter Li).
Regarding Claim 12
Butt as modified by Vallespi teaches all of the limitations of claim 11 as cited above but does not distinctly disclose:
- The computer-implemented method of claim 11, further comprising: detecting, by the computing system, an object of the one or more objects that is at least partly occluded; and determining, by the computing system, based at least in part on the one or more travelled paths of the one or more objects, when the object of the one or more objects that is at least partly occluded was previously detected.
	However, Li teaches
- The computer-implemented method of claim 11, further comprising: detecting, by the computing system, an object of the one or more objects that is at least partly occluded; and determining, by the computing system, based at least in part on the one or more travelled paths of the one or more objects, when the object of the one or more objects that is at least partly occluded was previously detected. (Li [pp 19:14-25] “In step 108 (Figure 1), the position of the object in the current frame is estimated based on motion history in the preceding frame, particularly in instances where the person is occluded. Let X-Z plane be the ground plane, where X-direction is aligned with the image plane and Z-direction is aligned with optical axis of a camera. The probability of being occluded can be estimated according to the object's position on the ground plane. Figures 7a-7c show possible events for two human objects HA and HB. In Figure 7a, human objects HA and HB are completely visible to a camera 730. In Figure 7b, human object HB is partially occluded by human object HA who is nearer to the camera 730 (i.e. ZA < ZB), while in Figure 7c, human object HB is completed occluded by human object HA. ... Further, if a person is occluded by more than one person, the persons occluding him are classified into two groups, i.e. on the left and right sides. The maximum probabilities for both groups are selected and the final probability value is the sum of the two maximum values.”; Li discloses how to determine the occlusion based on the probability, and the object labeling implies that the object was previously detected.)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt and Vallespi with the occlusion detection of Li in order to achieve accurate object detection and high performance in motion tracking. (Li, [pp3])

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Vallespi further in view of Weston et al. (US 2017/0193390 A1 hereinafter Weston).

Regarding Claim 17
Butt as modified by Vallespi teaches all of the limitations of claim 16 as cited above and Vallespi further teaches:
- The one or more tangible non-transitory computer-readable media of claim 16, further comprising: 
(Vallespi [0041] “In some examples, the classification for each cell can include a probability score associated with each classification indicating the likelihood that such cell includes one or more particular classes of objects of interest.”)
Butt as modified by Vallespi does not distinctly disclose:
- training the machine-learned model using the training data comprising a plurality of predefined portions of a training environment, wherein each of the plurality of predefined portions of the training environment is associated with at least one of a plurality of negative training samples or at least one of a plurality of positive training samples associated with a corresponding ground truth sample; and  
- ranking the plurality of negative training samples based at least in part on the score for the respective one of the plurality of predefined portions of the training environment, wherein a weighting of a filter of the machine-learned model is based at least in part on a predetermined portion of the plurality of the negative samples associated with the lowest scores.
	However Weston teaches
- training the machine-learned model using the training data comprising a plurality of predefined portions of a training environment, wherein each of the plurality of predefined portions of the training environment is associated with at least one of a plurality of negative training samples or at least one of a plurality of positive training samples associated with a corresponding ground truth sample; and (Weston [0058] “Thus, the deep-learning model, may be trained so that each of the entities of the second set of entities (i.e., negative samples) has a lower similarity score than the target entity (i.e., a positive sample). In other words, the deep-learning model may be trained so that each of the entities of the second set of entities is ranked lower than the target entity.”)
- ranking the plurality of negative training samples based at least in part on the score for the respective one of the plurality of predefined portions of the training environment, wherein a weighting of a filter of the machine-learned model is based at least in part on a predetermined portion of the plurality of the negative samples associated with the lowest scores. (Weston [0058] “In particular embodiments, social-networking system 160 may assign rankings to each of the target entity and the second set of entities, and the one or more weights of the deep-learning model may be updated further based on the rankings. … In other words, the deep-learning model may be trained so that each of the entities of the second set of entities is ranked lower than the target entity. Social-networking system 160 may determine that one or more of the entities of the second set of entities are ranked above or have higher similarity scores than the target entity (i.e., have corresponding embeddings that are more proximate to the user embedding in the embedding space), and social-networking system 160 may update vector representations of one or more entities of the second set of entities.”)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt and Vallespi with the negative training of Weston in order to improve mapping results by updating the weights. (Weston [0059])

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Butt in view of Boettger et al. (“Measuring the Accuracy of Object Detectors and Trackers” hereinafter Boettger).
Regarding Claim 20
Butt teaches all of the limitations of claim 18 as cited above but does not distinctly disclose:
The computing device of claim 18, further comprising: 
- determining, based at least in part on the input representation and the machine-learned model, an amount of overlap between the one or more bounding shapes; and 
- responsive to the amount of overlap between the one or more bounding shapes satisfying one or more overlap criteria, determining that the object of the one or more objects associated with the one or more bounding shapes that satisfies the one or more overlap criteria is the same object over the one or more time intervals.
However, Boettger teaches
- determining, based at least in part on the input representation and the machine-learned model, an amount of overlap between the one or more bounding shapes; and (Boettger [Abstract] “we introduce the relative Intersection over Union (rIoU) accuracy measure. The measure normalizes the IoU with the optimal box for the segmentation to generate an accuracy measure that ranges between 0 and 1 and allows a more precise measurement of accuracies.”; “measure that ranges between 0 and 1” reads on “amount of overlap”)
- responsive to the amount of overlap between the one or more bounding shapes satisfying one or more overlap criteria, determining that the object of the one or more objects associated with the one or more bounding shapes that satisfies the one or more overlap criteria is the same object over the one or more time intervals. (Boettger [Figure 4]; Boettger discloses the IoU value change over time. It is obvious that the criteria can be set up and the determination can be made based on the criteria.)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the appearance search system of Butt with the relative intersection over union of Boettger in order to achieve more accurate detection and tracking of objects. (Boettger, [Abstract])
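For illustration only, the overlap determination discussed above can be sketched as follows. This is a minimal sketch of the plain intersection over union (IoU), not the relative IoU (rIoU) of Boettger and not the applicant's claimed method. The box layout (x_min, y_min, x_max, y_max), the function names, and the 0.5 threshold are assumptions introduced solely for this example.

```python
from typing import Tuple

# Hypothetical axis-aligned box layout: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the plain intersection over union of two boxes, in [0, 1]."""
    # Width and height of the overlapping region, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes are treated as having no overlap.
    return inter / union if union > 0 else 0.0


def is_same_object(prev_box: Box, curr_box: Box, threshold: float = 0.5) -> bool:
    """Assumed overlap criterion: boxes in consecutive time intervals are
    treated as the same object when their IoU meets the threshold."""
    return iou(prev_box, curr_box) >= threshold
```

Under these assumptions, a box that shifts only slightly between two time intervals yields an IoU near 1 and satisfies the criterion, while two disjoint boxes yield an IoU of 0 and do not.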

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
a.	Xu et al. US-20190096086-A1
b.	Laddha et al. US-20180348374-A1
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SUNG WON LEE whose telephone number is 571-272-8508.  The examiner can normally be reached on Mon-Fri 0730-1730.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

	

/SUNG W LEE/
Examiner, Art Unit 2123

/ALEXEY SHMATOV/
Supervisory Patent Examiner, Art Unit 2123