DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 6-9, 13-16, and 20-23 are rejected under 35 U.S.C. 103 as being unpatentable over Daniels (US-PAT-NO: 11010640) in view of Fridman (WO 2018/015811), in view of Diessner (PGPUB: 20140313339), and further in view of Breed (PGPUB: 20080294315).

Regarding claims 1, 8, and 15, Daniels teaches an apparatus for scene filtering using motion estimation, the apparatus configured to perform steps comprising:
filtering, from the camera data, the one or more pixels; and 
training, based on the filtered camera data including the filtered one or more pixels, a neural network (see Col. 4, lines 5-14, determining the set of training data comprises determining a filtered set of training data from the set of vehicle data and determining the set of training data from the filtered set of training data. In some embodiments, the filtered set of training data comprises the set of training data filtered based at least in part on whether image quality is above an image quality threshold. Image quality can be measured as having the correct number of video frames, no corruption in the video (e.g., a video decoder can successfully decode the video); see Fig. 4C, Col. 12, lines 5-10, vehicle event data is filtered for quality level, for example, vehicle event data is filtered for quality level such that video events have sufficient number of pixels (e.g., 512×512, 256×256, 299×299, 224×224, etc.). In some embodiments, the video quality level is determined by a pixel size of the frame; see Fig. 4C, Cols. 8 and 9, lines 65-67 and 1-2; see Col. 10, lines 56-67, the training data set is received into the model. For example, the event data is preprocessed (e.g., filtering, cropping, decimating, etc.) and presented to the inputs of the model as well as the desired labels. In 412, the model is trained. For example, the model is trained using a subset of the training data and then tested and that training is repeated using another subset of the data until the model error difference between iterations is less than a percentage improvement (e.g., 10%, 5%, 2%, 1%, etc.) or a predefined number of iterations is reached. In 413, the model is saved. For example, the model is stored in a database of the vehicle data server).
However, Daniels does not expressly teach identifying, in camera data from an autonomous vehicle, based on motion relative to the autonomous vehicle, one or more pixels.
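For illustration only, the quality-based filtering of training data described in the cited passages of Daniels can be sketched as follows. This is a minimal sketch and not Daniels' implementation: the names `VideoEvent`, `passes_quality`, and `filter_events`, the record fields, and the 224-pixel minimum side length are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VideoEvent:
    """Hypothetical record describing one vehicle video event."""
    width: int            # frame width in pixels
    height: int           # frame height in pixels
    frame_count: int      # number of frames actually present
    expected_frames: int  # number of frames the event should contain
    decodable: bool       # True if a video decoder can decode the video


def passes_quality(event: VideoEvent, min_side: int = 224) -> bool:
    """Return True if the event meets the assumed image quality threshold."""
    has_all_frames = event.frame_count == event.expected_frames
    big_enough = event.width >= min_side and event.height >= min_side
    return event.decodable and has_all_frames and big_enough


def filter_events(events: List[VideoEvent], min_side: int = 224) -> List[VideoEvent]:
    """Keep only the events that pass the quality check."""
    return [e for e in events if passes_quality(e, min_side)]
```

Under these assumptions, an event is retained for training only if it decodes, has the expected number of frames, and has a sufficient pixel size.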
Fridman teaches that the ego motion of the camera (and hence the vehicle body) may be estimated based on an optical flow analysis of the captured images. An optical flow analysis of a sequence of images identifies movement of pixels from the sequence of images, and based on the identified movement, determines motions of the vehicle. The ego motion may be integrated over time and along the road segment to reconstruct a trajectory associated with the road segment that the vehicle has followed (see paragraph 250).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Daniels in view of Fridman so that the ego motion of the camera (and hence the vehicle body) is estimated based on an optical flow analysis of the captured images, in which the analysis identifies movement of pixels from the sequence of images and, based on the identified movement, determines motions of the vehicle, thereby teaching identifying, in camera data from an autonomous vehicle, based on motion relative to the autonomous vehicle, one or more pixels. Therefore, the teaching, suggestion, or motivation in the prior art would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention.
However, the combination does not expressly teach blacking out the one or more pixels or zeroing out an error function applied to the one or more pixels.
Diessner teaches that all pixels above a defined intensity are set to white, while all other pixels are set to black. The white image regions are thus grown or enlarged so that they merge together into an object or objects. This is accomplished using morphology operations (dilate/erode) with a disk as a structure element. This operation produces the object image. Each white image region represents a moving object. Additional filtering may be desired or required to remove noise and to produce a stable output (see Fig. 7, paragraph 62).
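For illustration only, the thresholding and region-growing operations described in the cited passage of Diessner can be sketched as follows. This is a minimal sketch under stated assumptions, not Diessner's implementation: the function names, the intensity level, and the use of a radius-1 disk (a pixel plus its four direct neighbors) as the structure element are hypothetical, and only the dilate step of the dilate/erode pair is shown.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale intensities, 0-255


def threshold(image: Image, level: int) -> Image:
    """Set pixels above `level` to white (255) and all others to black (0)."""
    return [[255 if p > level else 0 for p in row] for row in image]


def dilate(binary: Image) -> Image:
    """Grow white regions with a radius-1 disk structure element.

    A pixel becomes white if it, or any of its four direct neighbors,
    is white in the input image.
    """
    h, w = len(binary), len(binary[0])
    offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 255:
                    out[y][x] = 255
                    break
    return out
```

In this sketch, two white pixels separated by a single black pixel merge into one white region after one dilation, which corresponds to the merging of white image regions into an object.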
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination in view of Diessner so that all pixels above a defined intensity are set to white while all other pixels are set to black, and the white image regions are grown or enlarged so that they merge together into an object or objects, thereby teaching blacking out the one or more pixels or zeroing out an error function applied to the one or more pixels. Therefore, combining the elements of the prior art according to known methods and techniques, such as setting pixels to black, would yield predictable results.
However, the combination does not expressly teach selecting, from a plurality of pixels in camera data from an autonomous vehicle, based on motion relative to the autonomous vehicle, one or more pixels.
Breed teaches taking a picture using adjacent pixels with different radiation blocking filters. Four such pixel types are used allowing Nayar to essentially obtain 4 separate pictures with one snap of the shutter. Software then selects which of the four pixels to use for each part of the image so that the dark areas receive one exposure and somewhat brighter areas another exposure and so on. The brightest pixel receives all of the incident light, the next brightest filters half of the light, the next brightest half again and the dullest pixel half again. Other ratios could be used as could more levels of pixels, e.g., eight instead of four (see Fig. 21 and 22, paragraph 323).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination in view of Breed so that software selects which of the four pixels to use for each part of the image, such that the dark areas receive one exposure and somewhat brighter areas another exposure and so on, where the brightest pixel receives all of the incident light, the next brightest filters half of the light, the next brightest half again and the dullest pixel half again, as selecting, from a plurality of pixels in camera data from an autonomous vehicle, based on one or more pixels. Therefore, combining the elements of the prior art according to known methods and techniques, such as selecting and filtering pixels, would yield predictable results.

Regarding claims 2, 9, and 16, the combination teaches the apparatus of claim 8, wherein identifying the one or more pixels comprises
identifying, as the one or more pixels, one or more pixels associated with stationary motion and/or one or more pixels associated with motion away from the autonomous vehicle (see Fridman, Fig. 3, paragraph 139).

Regarding claims 6, 13, and 20, the combination teaches the apparatus of claim 8, wherein the steps further comprise providing the trained neural network to one or more autonomous vehicles (see Daniels, Fig. 3, Col. 8, lines 65-67, model trainer 306 builds a model by training a machine learning model, a neural network, or any other appropriate model. Model trainer 306 builds the model utilizing vehicle event data stored in vehicle event data 314 and associated event data labels stored in event data labels 316).

Regarding claims 7 and 14, the combination teaches the apparatus of claim 8, wherein the trained neural network is configured to determine, based on sensor data, one or more autonomous vehicle control operations (see Fridman, Figs. 34-35, paragraphs 461-463, process 3500 may include additional operations or steps. For example, process 3500 may further include adjusting a steering system of the vehicle based on the autonomous steering action).

Regarding claims 21, 22, and 23, the combination teaches the method of claim 1, wherein training the machine learning model comprises providing, to the machine learning model, the filtered sensor data and one or more control operations of the autonomous vehicle corresponding to the filtered sensor data (see Fridman, Fig. 21, paragraphs 97 and 346, memory units may include various databases and image processing software, as well as a trained system, such as a neural network, or a deep neural network; FIG. 21 illustrates a block diagram of memory 2015, which may store computer code or instructions for performing one or more operations for generating a road navigation model for use in autonomous vehicle navigation. As shown in FIG. 21, memory 2015 may store one or more modules for performing the operations for processing vehicle navigation information. For example, memory 2015 may include a model generating module 2105 and a model distributing module 2110).


Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Daniels (US-PAT-NO: 11010640) in view of Fridman (WO 2018/015811), in view of Diessner (PGPUB: 20140313339), in view of Breed (PGPUB: 20080294315), and further in view of Kraft (PGPUB: 20170140245).

Regarding claims 3, 10, and 17, with respect to the apparatus of claim 8, the combination does not expressly teach wherein identifying the one or more pixels comprises:
identifying, for each pixel of the camera data, a corresponding motion vector; and
determining, based on each pixel and the corresponding motion vector, whether to include a respective pixel in the identified one or more pixels.
Kraft teaches:
identifying, for each pixel of the camera data, a corresponding motion vector (see Fig. 4, paragraph 62, the feature vector 410 shown in FIG. 4 may have at least one feature representing a relative position of a blob within an image. In one example embodiment, in which the pixel blob identification module 103 identifies blobs, a feature 410a may represent whether a blob is located between the pair of edges corresponding to the vehicular path; this feature teaches the machine learning model 106 that the blob may represent a moving vehicle on a vehicular path); and
determining, based on each pixel and the corresponding motion vector, whether to include a respective pixel in the identified one or more pixels (see Fig. 2, 7, and 10, paragraph 87, a vehicular path 1001 and two pixel blobs 1002 and 1003 that correspond to a moving vehicle identified by the application of the process described above with reference to FIG. 7. The traffic analysis module 107 determines a directional vector for the blobs corresponding to each moving vehicle by determining a difference between a centroid of each of the corresponding blobs).
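For illustration only, the directional vector that Kraft describes as a difference between blob centroids can be sketched as follows. This is a minimal sketch, not Kraft's implementation: the function names are hypothetical, and it is an assumption of the example that the two blobs are pixel-coordinate lists for the same vehicle taken from an earlier and a later image.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]  # (x, y) pixel coordinates


def centroid(blob: List[Pixel]) -> Tuple[float, float]:
    """Return the mean (x, y) position of the pixels in a blob."""
    n = len(blob)
    return (sum(x for x, _ in blob) / n, sum(y for _, y in blob) / n)


def direction_vector(blob_earlier: List[Pixel],
                     blob_later: List[Pixel]) -> Tuple[float, float]:
    """Return the centroid of the later blob minus the centroid of the earlier blob."""
    cx0, cy0 = centroid(blob_earlier)
    cx1, cy1 = centroid(blob_later)
    return (cx1 - cx0, cy1 - cy0)
```

Under these assumptions, a nonzero vector indicates the direction in which the blob moved between the two images.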
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination in view of Kraft so that the pixel blob identification module 103 identifies blobs, a feature 410a represents whether a blob is located between the pair of edges corresponding to the vehicular path, and the traffic analysis module 107 determines a directional vector for the blobs corresponding to each moving vehicle by determining a difference between a centroid of each of the corresponding blobs. Therefore, combining the elements of the prior art according to known methods and techniques, such as pixel-blob feature vectors and determining a directional vector, would yield predictable results.


Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Daniels (US-PAT-NO: 11010640) in view of Fridman (WO 2018/015811), in view of Diessner (PGPUB: 20140313339), in view of Kraft (PGPUB: 20170140245), in view of Breed (PGPUB: 20080294315), and further in view of Reda (PGPUB: 20190297326).

Regarding claims 4, 11, and 18, with respect to the apparatus of claim 10, the combination does not expressly teach wherein identifying, for each pixel of the camera data, the corresponding motion vector comprises providing the video data to another neural network.
Reda teaches receiving an input that includes the sequence of video frames and at least one optical flow; processing the input by layers of a first neural network to generate a set of parameters; and generating the predicted video frame based on the set of parameters. Each optical flow maps pixels from a particular video frame to motion vectors that identify corresponding pixels in a corresponding video frame in the sequence of video frames. The set of parameters includes a displacement vector and at least one convolution kernel for each pixel of a plurality of pixels in the predicted video frame (see paragraphs 7 and 9).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination in view of Reda so that the input is processed by layers of a first neural network to generate a set of parameters and the predicted video frame is generated based on the set of parameters, where each optical flow maps pixels from a particular video frame to motion vectors that identify corresponding pixels in a corresponding video frame in the sequence of video frames, thereby teaching wherein identifying, for each pixel of the camera data, the corresponding motion vector comprises providing the video data to another neural network. Therefore, combining the elements of the prior art according to known methods and techniques, such as mapping pixels from a particular video frame to motion vectors and using a neural network, would yield predictable results.
	
Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 8, and 15 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIN JIA whose telephone number is (571)270-5536. The examiner can normally be reached 9:00 am-7:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella, can be reached at (571)272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/XIN JIA/Primary Examiner, Art Unit 2667