Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 08/25/2022 have been fully considered but they are not persuasive. 
In regards to claim 1, Applicant argues that the newly amended limitations of claim 1, in particular “determining, based at least in part on the localizing, one or more locations within the sensor data representation corresponding to one or more features represented by the map data; generating ground truth data indicative of the one or more locations within the sensor data representation; and updating one or more parameters of a neural network using the sensor data and the ground truth data” are not taught by Viswanathan (US 20180293466). Applicant argues that Viswanathan instead teaches, in the previously cited paragraphs, only that map information/data comprises one or more static features that are linked with a ground truth actual pose corresponding to the static features, and that a captured image may be compared to a map projection by a trained deep-net or neural network to determine similarity between the captured image and the map projection. Applicant further argues that, at best, Viswanathan teaches using a captured image and map projection to determine a pose of the vehicle, and that the only ground truth the cited portions describe is the ground truth actual pose corresponding to the static features. Applicant further argues that in [0046] Viswanathan teaches determining a ground truth pose of an image, which is still not equivalent to the required ground truth indication of one or more locations within the image corresponding to the features of the map projection. Still further, Applicant argues that the previous Office Action states that the ground truth of Viswanathan is only the actual pose and not one or more locations.
However, Viswanathan teaches map information contains ground truth data corresponding to static features corresponding to the pose ([0031]) and a neural network may perform comparisons between map and image, which is trained with correct images, incorrect images, and map data and thereby parameters of the neural network are updated ([0032]). The ground truth information corresponds to the estimated or observed pose, which is observed in sensor data ([0042]). Additionally a map projection and captured image are compared ([0039]), where the map contains static features ([0031]) which are expected to be nearby the vehicle based on the location and pose of the vehicle ([0038]). The captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment ([0041]). This determines pixel locations of features within the sensor data and compares them with map features. Following this, these are fed into a neural network to further train the network, thereby updating the parameters of the network. 
Because the static features correspond to ground truth features at the pose within the map data, the sensor data and the map are compared, and pixel locations of features within the images are determined and mapped as they are interpreted, the locations of the ground truth static features from the map data are determined in the sensor data. Further, the purpose of the comparison of the map and sensor data is to determine their similarity. One of ordinary skill in the art would have recognized that checking similarity includes checking locations of features.
It was not asserted in the previous Office Action that the only ground truth of Viswanathan was the pose, but rather that the pose was one source of ground truth and that the static features associated with the pose were another. Likewise, the claim requires only that the ground truth data is indicative of the one or more locations within the sensor data representation, and as such, a pose indicating the locations, for example by a high similarity with map data, reads upon this claim.
As such, Viswanathan teaches determining locations within sensor data representing corresponding features of map data, generating ground truth data indicative of the location, whether that be a pose or a feature location, and updating parameters of a neural network, and therefore this argument is unpersuasive. 
Applicant argues independent claim 16 is allowable for the same reasons as claim 1 above. 
This argument is unpersuasive for the same reasons as given above. 
In regards to independent claim 10, Applicant argues NPL Bojarski does not remedy the same deficiencies as argued above and therefore claim 10 is allowable. 
However, Bojarski is not required to remedy any challenged deficiency and therefore this argument is unpersuasive for the same reasons as given above.
Applicant argues the dependent claims are allowable by virtue of their dependency on an allowable base claim. 
This argument is unpersuasive for the same reasons as given above. 
Additionally, claim 5 is indicated as having been amended, but no amendment appears to have been performed, and therefore this claim has been interpreted not to have been amended. The Examiner believes the Applicant intended this amendment to remove the “HD” before “map data”, and as such this has been interpreted to be a minor informality and a corresponding objection has been given below.

Claim Objections
Claim 5 is objected to because of the following informalities: the claim recites “the HD map data”; however, under the current amended claim language, no HD map data has been previously introduced, and as such this has been interpreted to read “the map data”, which has been previously introduced. Appropriate correction is required.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 7-9, 16, 17, and 20-22 are rejected under 35 U.S.C. 102(a) as being anticipated by Viswanathan (US 20180293466).
In regards to claim 1, Viswanathan teaches a method comprising: (Fig 3, 6)
receiving map data corresponding to a region including a location of a dynamic actor at a time; ([0031] map may provide representation of environment. [0030] estimates pose by matching image with map data localized using GPS, GNSS, or IMU.)
localizing the dynamic actor with respect to the map data; ([0030] pose estimated by matching map data localized using GPS, GNSS, or IMU.) 
receiving sensor data representative of a sensor data representation, the sensor data generated using a sensor of the dynamic actor at the time; ([0030] image is captured by image capturing device onboard vehicle.)
determining, based at least in part on the localizing, one or more locations within the sensor data representation corresponding to one or more features represented by the map data; ([0039] map projection and captured image may be compared, [0031] where the map contains static features [0038] which are expected to be nearby the vehicle based on the location and pose of the vehicle. [0041] captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment. This determines pixel locations of features within the sensor data and compares them with map features.)
generating ground truth data indicative of the one or more locations within the sensor data representation; ([0031] map information contains ground truth data corresponding to static features corresponding to the pose. [0042] ground truth information corresponds to the estimated or observed pose, which is observed in sensor data. [0041] captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment. This generates ground truth information in the form of comparing the ground truth data from the features of the map with the features of the sensor data.) and 
updating one or more parameters of a neural network using the sensor data and the ground truth data. ([0032] neural network may perform comparisons between map and image, which is trained with correct images, incorrect images, and map data and thereby parameters of the neural network are updated.)

In regards to claim 2, Viswanathan teaches the method of claim 1, wherein the generating the ground truth data comprises: 
transforming the map data to a coordinate system of the dynamic actor; ([0039], [0042] two dimensional map projection is generated based on estimated pose. This transforms map coordinates to vehicle coordinates.) and 
generating, within the coordinate system, one or more labels or one or more annotations corresponding to the one or more locations, ([0081] node data records are stored representing identifiers of junctions and road segments, which can be matched with map information. Identifiers are annotations corresponding to features, where features are at least road segments and junctions.)
wherein the ground truth data is representative of the one or more labels or the one or more annotations. ([0081] node data records are stored representing identifiers of junctions and road segments, which can be matched with map information. [0031] map information contains ground truth data corresponding to static features.)

In regards to claim 3, Viswanathan teaches the method of claim 1, further comprising determining the region including the location based at least in part on at least one of global navigation satellite system (GNSS) data, global positioning system (GPS) data, or differential GPS (DGPS) data generated using one or more location-based sensors of the dynamic actor. ([0030] pose estimated by matching map data localized using GPS, GNSS, or IMU.)

In regards to claim 4, Viswanathan teaches the method of claim 1, wherein the updating the one or more parameters of the neural network comprises updating the one or more parameters of the neural network such that the neural network performs inferencing for at least one of: object detection, feature detection, road feature detection, wait condition detection or classification, or future trajectory generation. ([0032] neural network is trained to compare captured image and map data. [0040] neural network describes similarity between captured image and map projection, which itself is training for feature detection and road feature detection by comparing the presence of indications of those features in the image to their presence in the map data. Even without explicitly identifying the features, their presence is still detected.)

In regards to claim 7, Viswanathan teaches the method of claim 1, wherein the sensor data includes at least one of image data, LIDAR data, RADAR data, SONAR data, or ultrasonic data. ([0030] image captured by image capturing device.)

In regards to claim 8, Viswanathan teaches the method of claim 1, further comprising: 
transforming the map data from a first coordinate system of the map data to a second coordinate system of the sensor data, ([0039], [0042] two dimensional map projection is generated based on estimated pose. This transforms map coordinates to vehicle coordinates. Expected view of static features may be provided. The expected view is the sensor-space coordinates.)
wherein the ground truth data corresponds to the second coordinate system. ([0039], [0042] two dimensional map projection is generated based on estimated pose. This transforms map coordinates to vehicle coordinates. [0031] map information contains ground truth data corresponding to static features.)

In regards to claim 9, Viswanathan teaches the method of claim 8, wherein the first coordinate system is a three-dimensional (3D) world-space coordinate system and the second coordinate system is a two-dimensional (2D) image-space coordinate system. ([0042] 2D projection of map data is generated. The projection must be from a higher dimension, and 3D is the only logical option.)
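
For illustration only, the kind of 3D world-space to 2D image-space transformation recited in claims 8 and 9 can be sketched as follows. This is a minimal sketch assuming an idealized pinhole camera and a planar (yaw-only) pose. The function name project_point, the intrinsic parameters fx, fy, cx, cy, and the example feature coordinates are hypothetical and are not taken from Viswanathan or from the application.

```python
import math


def project_point(point_world, cam_pos, yaw, fx, fy, cx, cy):
    """Project a 3D world point to 2D pixel coordinates.

    Hypothetical model: world axes are x east, y north, z up; the camera
    looks along the heading `yaw` (radians, counter-clockwise from +x).
    Returns (u, v) in pixels, or None if the point is not in front of
    the camera.
    """
    # Translate the point into a frame centred on the camera.
    dx = point_world[0] - cam_pos[0]
    dy = point_world[1] - cam_pos[1]
    dz = point_world[2] - cam_pos[2]

    # Rotate about the vertical axis so "forward" follows the heading.
    forward = dx * math.cos(yaw) + dy * math.sin(yaw)
    left = -dx * math.sin(yaw) + dy * math.cos(yaw)
    up = dz

    if forward <= 0.0:
        return None  # behind the image plane, so not visible

    # Perspective divide; image u grows to the right, v grows downward.
    u = cx - fx * left / forward
    v = cy - fy * up / forward
    return (u, v)


if __name__ == "__main__":
    # Hypothetical static map features (for example, sign posts) in world space.
    features = [(10.0, 0.0, 0.0), (10.0, -1.0, 0.0), (-5.0, 0.0, 0.0)]
    for feature in features:
        pixel = project_point(feature, (0.0, 0.0, 0.0), 0.0,
                              1000.0, 1000.0, 640.0, 360.0)
        print(feature, "->", pixel)
```

Under these assumptions, each map feature in front of the camera is assigned a pixel location in the image, and a feature behind the camera yields no location.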

In regards to claim 16, Viswanathan teaches a system comprising: (Fig 1, 2A, 2B)
one or more processing units to: ([0036] processor 22, memory 24. [0075] memory stores instructions.)
receive sensor data representative of a sensor data representation; ([0036] image capturing device captures images. All sensor data is inherently representative of the data it represents as a fundamental property of linguistics, let alone a property of sensor data.)
localize a dynamic actor with respect to a map; ([0036] one or more location sensors 36. [0031] map may provide representation of environment. [0030] estimates pose by matching image with map data localized using GPS, GNSS, or IMU.)
determine, based at least in part on the localization, one or more locations within the sensor data representation corresponding to one or more features represented by the map; ([0036] pose metric network 34 compares map and image data. [0042] ground truth information corresponds to the estimated or observed pose, which is observed in sensor data. [0039] map projection and captured image may be compared, [0031] where the map contains static features [0038] which are expected to be nearby the vehicle based on the location and pose of the vehicle. [0041] captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment. This determines pixel locations of features within the sensor data and compares them with map features.)
generate, ground truth data indicative of the one or more locations within the sensor data; ([0036] pose metric network 34 compares map and image data. [0031] map information contains ground truth data corresponding to static features. [0042] ground truth information corresponds to the estimated or observed pose, which is observed in sensor data. [0041] captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment. This generates ground truth information in the form of comparing the ground truth data from the features of the map with the features of the sensor data.) and 
update one or more parameters of a neural network using the sensor data and the ground truth data. ([0032] neural network may perform comparisons between map and image, which is trained with correct images, incorrect images, and map data and thereby parameters of the neural network are updated. [0035] remote apparatus 10 configured to train neural network using sensor and map data.)

In regards to claim 17, Viswanathan teaches the system of claim 16, wherein the one or more features include one or more objects, one or more road features, or one or more wait conditions. ([0036] pose metric network 34 compares map and image data. [0031] map information contains ground truth data corresponding to static features. Objects and road features are static features and must be included; were these not included, the map would cease to function.)

In regards to claim 20, Viswanathan teaches the system of claim 16, wherein the one or more processing units are further to orient the map with respect a location and orientation of the dynamic actor. ([0031] map may provide representation of environment. [0030] estimates pose by matching image with map data localized using GPS, GNSS, or IMU and heading of vehicle. Map is always oriented with respect to an orientation and location of the dynamic actor, merely by existing and is further oriented by assessing heading of the vehicle.)

In regards to claim 21, Viswanathan teaches the method of claim 1, wherein: 
the sensor data is image data and the sensor data representation is an image represented by the image data; ([0030] image is captured by image capturing device onboard vehicle.) and 
the one or more locations comprise one or more pixels of the image. ([0041] captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment. This determines pixel locations of features within the sensor data.)

In regards to claim 22, Viswanathan teaches the method of claim 1, further comprising: 
determining an orientation of the dynamic actor with respect to the map data, ([0038], [0042] pose of the vehicle including heading and location on map may be determined.)
wherein the determining the one or more locations within the sensor data representation corresponding to the one or more features represented by the map data is further based at least in part on the orientation. ([0038], [0042] pose of the vehicle including heading and location on map may be determined which provides expected view of features.)

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5, 6, and 10-15 are rejected under 35 U.S.C. 103 as being unpatentable over Viswanathan in view of Non-patent Literature Mariusz Bojarski, End to End Learning for Self-Driving Cars (“Bojarski”).
In regards to claim 5, Viswanathan teaches the method of claim 1.
Viswanathan does not teach: further comprising: 
receiving data representative of a trajectory of the dynamic actor through at least a portion of a field of view represented by the sensor data, and
adjusting the trajectory based at least in part on the HD map data.
However, Bojarski teaches monitoring steering commands associated with images from a vehicle, which are themselves data representative of a trajectory (page 2), in the same way that a series of geographic points forms a trajectory. Bojarski further teaches assessing the deviation from a lane center, which is also representative of a trajectory, and adjusting control back towards the lane center (page 2). Images from a vehicle have a field of view associated with them.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the application to modify the vehicle control method of Viswanathan, by incorporating the teachings of Bojarski, such that steering commands associated with images from a vehicle are monitored and the vehicle is controlled based on the determined lane position. 
The motivation to do so is that, as acknowledged by Bojarski, this allows for improved steering commands for the vehicle (page 2).

In regards to claim 6, Bojarski teaches the neural network is trained to recover from mistakes, such that it recovers to the center (Page 2).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the application to modify the vehicle control method of Viswanathan, as already modified by Bojarski, by further incorporating the teachings of Bojarski, such that the vehicle trajectory is controlled to shift towards a center of a lane of travel.
The motivation to do so is that, as acknowledged by Bojarski, this allows the vehicle to recover from mistakes (page 2), which one of ordinary skill in the art would have recognized improves safety. 

In regards to claim 10, Viswanathan teaches a method comprising: (Fig 3, 6)
receiving map data representative of a map; ([0031] map may provide representation of environment. [0030] estimates pose by matching image with map data localized using GPS, GNSS, or IMU.)
receiving sensor data generated using one or more first sensors of a dynamic actor; ([0030] image is captured by image capturing device onboard vehicle.)
localizing the dynamic actor with respect to the map based at least in part on the first sensor data; ([0030] pose estimated by matching map data localized using GPS, GNSS, or IMU with sensor data.)
receiving image data generated using one or more sensors of the dynamic actor, the image data representative of an image; ([0030] image is captured by image capturing device onboard vehicle.)
determining, based at least in part on the localizing, one or more pixels within the image corresponding to features represented by the map; ([0039] two dimensional map projection and captured image may be compared, where the two dimensional map projection is generated based on estimated pose and [0031] where the map contains static features [0038] which are expected to be nearby the vehicle based on the location and pose of the vehicle. [0041] captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment. This determines pixel locations of features within the sensor data and compares them with map features.) and 
generating, ground truth data indicative of the one or more pixels within the image, the ground truth data for updating one or more parameters of a neural network. ([0039], [0042] two dimensional map projection is generated based on estimated pose, and ground truth information corresponds to the estimated or observed pose, which is observed in sensor data. This transforms map coordinates to vehicle coordinates. [0031] map information contains ground truth data corresponding to static features and pose. [0032] neural network may perform comparisons between map and image, which is trained with correct images, incorrect images, and map data and thereby parameters of the neural network are updated. [0041] captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment. This generates ground truth information in the form of comparing the ground truth data from the features of the map with the features of the sensor data.)
Viswanathan does not teach:
receiving image data generated using one or more second sensors of the dynamic actor, the image data representative of an image; 
However, Bojarski teaches an image is initially fed into the system, then processing repeats and feeds a new image to a convolutional neural network (“CNN”) to determine a ground truth center line, synchronized with steering commands (page 6). Three cameras are used to take images (page 2) and the cameras are shifted with different fields of view, where data from each camera is fed to the CNN (page 3).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the application to modify the vehicle control method of Viswanathan, by incorporating the teachings of Bojarski, such that sensor data from additional cameras is received associated with steering commands, and this information is both fed into and used to further update a neural network identifying ground truth.
The motivation to do so is that, as acknowledged by Bojarski, this allows determination of ground truth which in turn allows better steering control (page 6). 

In regards to claim 11, Viswanathan, as modified by Bojarski, teaches the method of claim 10, further comprising:
correlating, based at least in part on the localizing, the map data with the image data, the correlating including transforming the map data to a coordinate space oriented with respect to the dynamic actor, ([0039], [0042] two dimensional map projection is generated based on estimated pose, where the pose includes location and orientation. This transforms map coordinates to vehicle coordinates and as the image is taken with a known heading and location, associates positions of static features from map data with a representation of the environment and the pose.)
wherein the determining the one or more pixels within the image corresponding to the one or more features represented by the map is based at least in part on the correlating. ([0041] captured image is analyzed and pixels representing features and their locations are mapped to, for example, a top down image of the environment. This identifies the pixel locations of the static features and is thereby based on the localization as well.)

In regards to claim 12, Viswanathan, as modified by Bojarski, teaches the method of claim 11, wherein the correlating further includes, after the transforming, converting the map data from world-space coordinates to sensor-space coordinates corresponding to the image data. ([0039], [0042] two dimensional map projection is generated based on estimated pose. Expected view of static features may be provided. The expected view is the sensor-space coordinates.)

In regards to claim 13, Viswanathan, as modified by Bojarski, teaches the method of claim 10, wherein the sensor data is representative of one or more additional features within a field of view of the one or more first sensors, and the localizing the dynamic actor includes correlating the one or more additional features with one or more corresponding features represented by the map. ([0031] map may provide representation of environment including static features. As this is multiple features, then at least two features must be represented, including a first feature and any additional features. [0030] estimates pose by matching image with map data localized using GPS, GNSS, or IMU, where image is captured by first device and has a field of view and equally includes a number of features.)

In regards to claim 14, Bojarski teaches using three cameras, where the cameras are offset and each is fed into the CNN (page 2, 3). Bojarski teaches monitoring steering commands associated with images from a vehicle, which are themselves data representative of a trajectory (page 2), in the same way that a series of geographic points forms a trajectory. Bojarski further teaches assessing the deviation from a lane center, which is also representative of a trajectory, and adjusting control back towards the lane center (page 2).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the application to modify the vehicle control method of Viswanathan, as already modified by Bojarski, by further incorporating the teachings of Bojarski, such that data from a third sensor is received that is associated with steering commands of the vehicle and the vehicle is further controlled based on map data. 
The motivation to do so is the same as acknowledged by Bojarski in regards to claim 1 above.

In regards to claim 15, Viswanathan, as modified by Bojarski, teaches the method of claim 10, wherein the ground truth data and outputs of the neural network are generated in three-dimensional (3D) world-space coordinates. ([0042] 2D projection of map data is generated, which includes ground truth data. This must be projected from a higher dimension, and 3D is the only logical option. It would be illogical for steering commands not to be in 3D, as the world is 3D; any lesser dimension would be lacking and any more would be excessive.)

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Yang et al. (US 20170364083) discloses using a decision-making model to determine a trajectory for a vehicle.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHIAS S WEISFELD whose telephone number is (571)272-7258. The examiner can normally be reached Monday-Thursday 7:00 AM - 4:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Elaine Gort, can be reached at (571) 272-6781. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/M.S.W./Examiner, Art Unit 3661  

/Elaine Gort/Supervisory Patent Examiner, Art Unit 3661