DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 7-11, 14-18, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Achour (US 20180348343 A1) in view of Kiran ("Real-time Dynamic Object Detection for Autonomous Driving using Prior 3D-Maps").

	Regarding claim 1, Achour teaches
	A computer-implemented method, comprising: 
receiving a plurality of frame sets generated while navigating a local environment, each frame set comprising a point cloud representation of three-dimensional (3D) points; see at least [0023]-[0024] and FIG. 1, where systems and methods are used in an autonomous vehicle to detect and identify targets in the vehicle’s path and surrounding environment. The targets may include a plurality of roads, walls, buildings, vehicles, pedestrians, animals, etc. “iMTM radar is a ‘digital eye’ with true 3D vision and capable of a human-like interpretation of the world”; “data pre-processing module 112 processes the radar data to encode it into a point cloud for the iMTM interface module”; “As a vehicle travels, there are different FoV (field of view) snapshots or slices, such as from a nearfield to a far-field slice.”
tracking instances classified as dynamic across the plurality of frame sets using a tracking algorithm; see at least [0032]-[0035], where “[f]or example, the target identification and decision module 114 may detect a cyclist on the path of the vehicle” and “iMTM interface module 104 also includes a multi-object tracker 118 to track the identified targets over time, such as, for example, with the use of a Kalman filter. The multi-object tracker 118 matches candidate targets identified by the target identification and decision module 114 with targets it has detected in previous time windows”;
assigning a single instance ID to tracked instances classified as dynamic across the plurality of frame sets; see at least [0032]-[0035] and [0070], where tracking information provided by the multi-object tracker 118 and the micro-doppler signal provided by the micro-doppler module 116 are combined to produce an output containing the type of target identified, their location, their velocity, and so on. This information from iMTM radar system 100 is then sent to a sensor fusion module (described in more detail below with reference to FIG. 12) in the vehicle, where it is processed together with information from other sensors;
estimating a bounding box for each of the instances in each of the plurality of frame sets; see at least [0081] and [0085], where “[r]etraining may be done using a combination of synthesized data and real sensor data. Real sensor data may be labeled with labels 910, which are, for example, bounding boxes placed around known items in view in each multi-dimensional slice of the radar data”; “the dataset may be prepared by generating beams in the radar system in the k directions in a road-like environment, recording the reflections from known targets, and labeling the data with bounding boxes around the targets”; and
employing the instances as ground truth data in a training of one or more deep learning classifiers. See at least [0059]-[0065] and [0109]. More specifically, in [0064], “[a]utoencoders directly learn features from unlabeled data in an unsupervised mode (i.e., by first encoding and then decoding inputs). Using autoencoder 608 in the data pre-processing module 112 improves the performance of the target identification and decision module.” In [0109], “the iMTM interface module 1606 may use AI, machine learning, deep learning, an expert system, and/or other technology to improve performance of the iMTM radar system for target detection and identification.”
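
For illustration only, the tracking, instance ID, and bounding box limitations discussed above can be sketched in Python as follows. The sketch is not taken from Achour: it substitutes a greedy nearest-centroid association for the Kalman filter mentioned in [0032]-[0035], and the names track, centroid, bounding_box, and max_jump are hypothetical.

```python
from itertools import count
from math import dist


def bounding_box(points):
    """Axis-aligned box (min corner, max corner) around a cluster of 3D points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def centroid(points):
    """Mean position of a cluster of 3D points."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def track(frames, max_jump=2.0):
    """Give each dynamic cluster a persistent ID across frames.

    frames: list of frames; each frame is a list of clusters (lists of 3D points).
    max_jump: hypothetical gating distance between centroids in consecutive frames.
    Returns one list per frame of (instance_id, bounding_box) tuples.
    """
    next_id = count(1)
    previous = {}  # instance_id -> centroid seen in the previous frame
    labelled = []
    for clusters in frames:
        current, out = {}, []
        for pts in clusters:
            c = centroid(pts)
            # Greedy nearest-neighbour match against previous tracks not yet claimed.
            candidates = [(dist(c, pc), tid) for tid, pc in previous.items()
                          if tid not in current]
            best = min(candidates, default=None)
            tid = best[1] if best and best[0] <= max_jump else next(next_id)
            current[tid] = c
            out.append((tid, bounding_box(pts)))
        previous = current
        labelled.append(out)
    return labelled
```

A cluster that moves less than max_jump between consecutive frames keeps its ID; any other cluster is given a new one.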

Achour teaches all of the elements of the current invention as stated above except 
receiving an occupancy map (OMap) representation of the local environment, the OMap representation comprising points depicting static objects in the local environment, the OMap representation further comprising the ground and navigable boundaries within the local environment; 
for each of the plurality of frame sets: generating, using the OMap representation, one or more instances each comprising a spatial cluster of neighborhood 3D points generated from a 3D sensor scan of the local environment, and classifying each of the instances as dynamic or static based on the OMap representation by applying a deep learning algorithm to the instance; 
Kiran teaches it is known to provide the above elements. See at least page 1 “Abstract” where prior 3D maps are used to provide a static background model (local environment) and are used to detect dynamic objects in the local environment. Also see at least page 3 “Accurate 3D Model representation” where the static background environment (including driving road surfaces, buildings planes/facades, lamps, roundabouts, traffic signs, etc.) represented by the prior 3D maps includes stationary objects (such as road surface and buildings) and non-stationary static objects (such as static objects that appeared or disappeared between mapping and re-localization steps).
Also see at least FIGS. 4-5 and page 11 “4.2 Clustering and Road Extraction”, describing the clustering process in which “initial frame-based clustering is achieved using the HDBSCAN algorithm.” After each step of the frame-based clustering, the labelled point cloud is saved back for each frame. In FIG. 4, the semantic segmentation labels are mapped to the 3D point cloud from the Lidar. Also see at least page 7 “4. Clustering and Classification” where the classification step is usually performed over the clusters to obtain a class for each cluster, such as vehicles, buildings, vegetation, etc. Page 4 “Large Scale HD Maps” teaches that the dynamic objects are obtained via background subtraction.
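
For illustration only, background subtraction against a prior static map followed by spatial clustering can be sketched in Python as follows. The sketch is not Kiran's implementation: it replaces the HDBSCAN algorithm with a simple grouping of touching voxels, and the voxel size and the names voxel, subtract_background, and cluster are hypothetical.

```python
from collections import deque


def voxel(point, size=0.5):
    """Map a 3D point to the integer index of the voxel containing it."""
    return tuple(int(c // size) for c in point)


def subtract_background(scan, static_map, size=0.5):
    """Keep only scan points whose voxel is not occupied in the prior map."""
    occupied = {voxel(p, size) for p in static_map}
    return [p for p in scan if voxel(p, size) not in occupied]


def cluster(points, size=0.5):
    """Group points whose voxels touch (26-neighbourhood) into clusters."""
    cells = {}
    for p in points:
        cells.setdefault(voxel(p, size), []).append(p)
    offsets = [(dx, dy, dz)
               for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
    seen, clusters = set(), []
    for start in cells:
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        # Breadth-first search over occupied neighbouring voxels.
        while queue:
            cx, cy, cz = queue.popleft()
            members.extend(cells[(cx, cy, cz)])
            for dx, dy, dz in offsets:
                nb = (cx + dx, cy + dy, cz + dz)
                if nb in cells and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(members)
    return clusters
```

Points that survive subtract_background are candidates for dynamic objects; cluster then groups them into instances.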

It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have modified Achour to incorporate the teachings of Kiran and provide the computer-implemented method wherein an occupancy map (OMap) which represents the local environment is received, the OMap representation comprising points depicting static objects, the ground and navigable boundaries within the local environment; generating, using the OMap representation, one or more instances each comprising a spatial cluster of neighborhood 3D points generated from a 3D sensor scan of the local environment; and classifying each of the instances as dynamic or static based on the OMap representation by applying a deep learning algorithm to the instance. In doing so, “[p]re-curated maps of their environments can be further used to improve the robustness and completeness.” [page 1 – Introduction] Also, real-time dynamic object detection algorithms leverage previously mapped Lidar point clouds to reduce processing. [page 1 – Abstract]


Regarding claim 2, Achour in view of Kiran teaches
The computer-implemented method of claim 1, wherein the classifying of instances as dynamic enables the OMap representation to be cleaned to remove points depicting dynamic objects in the local environment. See at least Achour [0036], [0071] and [0074] where planar identification includes filtering out points due to targets with a non-zero velocity (dynamic objects) relative to a road since they do not correspond to a fixed planar surface.
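
For illustration only, filtering out points with a non-zero velocity relative to the road can be sketched in Python as follows. The per-point speed field, the tolerance value, and the name clean_map are hypothetical and are not taken from Achour.

```python
def clean_map(points, velocity_tolerance=0.1):
    """Drop map points whose speed relative to the road exceeds the tolerance.

    points: iterable of (x, y, z, speed) tuples; the speed field and the
    tolerance are hypothetical and chosen only for this illustration.
    Returns the remaining (x, y, z) points, treated here as static.
    """
    return [(x, y, z) for x, y, z, speed in points
            if abs(speed) <= velocity_tolerance]
```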

Regarding claim 3, Achour in view of Kiran teaches
The computer-implemented method of claim 1, further comprising: determining, using the OMap representation, if the spatial cluster of neighborhood 3D points corresponding to a particular instance of the instances classified as dynamic have corresponding occupied points in the OMap representation; and in response to determining that the spatial cluster of neighborhood 3D points corresponding to the particular instance have corresponding occupied points in the OMap representation, establishing that the classification of the particular instance as dynamic by the deep learning algorithm is a false positive. See at least Achour [0065]-[0071] where spatial clusters of neighborhood 3D points are classified as dynamic in the OMap representation. “Candidate planar surfaces are compared to a confidence brightness threshold to indicate when there truly is a significant planar surface in the field of view. Candidate surfaces below a certain confidence brightness threshold are discarded and therefore detecting that the classified dynamic object was a false positive.”

Regarding claim 4, Achour in view of Kiran teaches
The computer-implemented method of claim 3, further comprising: in response to establishing that the classification of the particular instance as dynamic by the deep learning algorithm is a false positive, reclassifying the particular instance as static. See at least Achour [0059]-[0066], [0074]-[0081], [0090], where the systems and methods include an iMTM radar for detecting and identifying targets in the vehicle’s path and surrounding environment such as static objects (roads, walls, buildings, road center medians, etc.) and dynamic objects (vehicles, pedestrians, bystanders, cyclists, etc.). The iMTM interface module 104 includes a multi-object tracker 118 to track the targets over time and the detected targets are matched with targets it has detected in previous time windows. Identified objects may be classified using autoencoders which are capable of learning features from unlabeled data in an unsupervised mode. Dynamic objects are reclassified as static objects when the identified targets have a zero velocity.
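
For illustration only, the false positive determination and reclassification recited in the claims can be sketched in Python as follows. The sketch reflects the claim language rather than any cited reference; the voxel size, the min_overlap threshold, the dictionary layout of an instance, and the names voxel, occupied_fraction, and review_dynamic are hypothetical.

```python
def voxel(point, size=0.5):
    """Map a 3D point to the integer index of the voxel containing it."""
    return tuple(int(c // size) for c in point)


def occupied_fraction(cluster, omap_voxels, size=0.5):
    """Share of a cluster's points that fall in voxels occupied in the OMap."""
    hits = sum(voxel(p, size) in omap_voxels for p in cluster)
    return hits / len(cluster)


def review_dynamic(instances, omap_points, size=0.5, min_overlap=0.5):
    """Flag dynamic instances that coincide with static OMap points.

    An instance labelled "dynamic" whose points mostly coincide with occupied
    OMap voxels is treated as a false positive and relabelled "static".
    """
    omap_voxels = {voxel(p, size) for p in omap_points}
    reviewed = []
    for inst in instances:
        label = inst["label"]
        false_positive = (
            label == "dynamic"
            and occupied_fraction(inst["points"], omap_voxels, size) >= min_overlap
        )
        reviewed.append({
            **inst,
            "label": "static" if false_positive else label,
            "false_positive": false_positive,
        })
    return reviewed
```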


Regarding claim 7, Achour in view of Kiran teaches
The computer-implemented method of claim 1, wherein the ground truth data comprises image data and/or 3D point cloud data. See at least Achour [0059]-[0065], [0070], and [0077]-[0079] where the ground truth data comprises encoding 4D radar data into a point cloud.


Regarding claim 8, Achour in view of Kiran teaches
One or more non-transitory computer-readable storage media storing instructions that in response to being executed by one or more processors, cause a computer system to perform operations, the operations comprising: receiving a plurality of frame sets generated while navigating a local environment, each frame set comprising a point cloud representation of three-dimensional (3D) points; receiving an occupancy map (OMap) representation of the local environment, the OMap representation comprising points depicting static objects in the local environment, the OMap representation further comprising the ground and navigable boundaries within the local environment; for each of the plurality of frame sets: generating, using the OMap representation, one or more instances each comprising a spatial cluster of neighborhood 3D points generated from a 3D sensor scan of the local environment, and classifying each of the instances as dynamic or static based on the OMap representation by applying a deep learning algorithm to the instance; tracking instances classified as dynamic across the plurality of frame sets using a tracking algorithm; assigning a single instance ID to tracked instances classified as dynamic across the plurality of frame sets; estimating a bounding box for each of the instances in each of the plurality of frame sets; and employing the instances as ground truth data in a training of one or more deep learning classifiers. See preceding logic for claim 1.

Regarding claim 9, Achour in view of Kiran teaches
The one or more non-transitory computer-readable storage media of claim 8, wherein the classifying of instances as dynamic enables the OMap representation to be cleaned to remove points depicting dynamic objects in the local environment.  See preceding logic for claim 2.

Regarding claim 10, Achour in view of Kiran teaches
The one or more non-transitory computer-readable storage media of claim 8, wherein the operations further comprise: determining, using the OMap representation, if the spatial cluster of neighborhood 3D points corresponding to a particular instance of the instances classified as dynamic have corresponding occupied points in the OMap representation; and in response to determining that the spatial cluster of neighborhood 3D points corresponding to the particular instance have corresponding occupied points in the OMap representation, establishing that the classification of the particular instance as dynamic by the deep learning algorithm is a false positive. See preceding logic for claim 3.


Regarding claim 11, Achour in view of Kiran teaches
The one or more non-transitory computer-readable storage media of claim 10, wherein the operations further comprise: in response to establishing that the classification of the particular instance as dynamic by the deep learning algorithm is a false positive, reclassifying the particular instance as static. See preceding logic for claim 4.


Regarding claim 14, Achour in view of Kiran teaches
The one or more non-transitory computer-readable storage media of claim 8, wherein the ground truth data comprises image data and/or 3D point cloud data. See preceding logic for claim 7.

Regarding claim 15, Achour in view of Kiran teaches
A computer system comprising: one or more processors; and one or more non-transitory computer-readable media storing instructions that in response to being executed by the one or more processors, cause the computer system to perform operations, the operations comprising: receiving a plurality of frame sets generated while navigating a local environment, each frame set comprising a point cloud representation of three-dimensional (3D) points; receiving an occupancy map (OMap) representation of the local environment, the OMap representation comprising points depicting static objects in the local environment, the OMap representation further comprising the ground and navigable boundaries within the local environment; for each of the plurality of frame sets: generating, using the OMap representation, one or more instances each comprising a spatial cluster of neighborhood 3D points generated from a 3D sensor scan of the local environment, and classifying each of the instances as dynamic or static based on the OMap representation by applying a deep learning algorithm to the instance; tracking instances classified as dynamic across the plurality of frame sets using a tracking algorithm; assigning a single instance ID to tracked instances classified as dynamic across the plurality of frame sets; estimating a bounding box for each of the instances in each of the plurality of frame sets; and employing the instances as ground truth data in a training of one or more deep learning classifiers. See preceding logic for claim 1.

Regarding claim 16, Achour in view of Kiran teaches
The computer system of claim 15, wherein the classifying of instances as dynamic enables the OMap representation to be cleaned to remove points depicting dynamic objects in the local environment. See preceding logic for claim 2.

Regarding claim 17, Achour in view of Kiran teaches
The computer system of claim 15, wherein the operations further comprise: determining, using the OMap representation, if the spatial cluster of neighborhood 3D points corresponding to a particular instance of the instances classified as dynamic have corresponding occupied points in the OMap representation; and in response to determining that the spatial cluster of neighborhood 3D points corresponding to the particular instance have corresponding occupied points in the OMap representation, establishing that the classification of the particular instance as dynamic by the deep learning algorithm is a false positive. See preceding logic for claim 3.

Regarding claim 18, Achour in view of Kiran teaches
The computer system of claim 17, wherein the operations further comprise: in response to establishing that the classification of the particular instance as dynamic by the deep learning algorithm is a false positive, reclassifying the particular instance as static. See preceding logic for claim 4.

Regarding claim 21, Achour in view of Kiran teaches
The computer system of claim 15, wherein the ground truth data comprises image data and/or 3D point cloud data. See preceding logic for claim 7.


Claims 5-6, 12-13, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Achour in view of Kiran and further in view of Levinson (US 20200098394 A1).

Regarding claim 5, Achour in view of Kiran teaches
The computer-implemented method of claim 1. Achour in view of Kiran teaches all of the elements of the current invention as stated above except
further comprising: determining, using the OMap representation, if a particular instance of the instances classified as static is actually dynamic; and in response to determining that the particular instance classified as static is actually dynamic, establishing that the classification of the particular instance as static by the deep learning algorithm is a false negative. 
Levinson (US 20200098394 A1) teaches it is known to provide the above elements. See at least [0089]-[0101] where the system may indicate an object within the vicinity of the vehicle as static or dynamic by associating confidence levels with the identification of an object. If the confidence level falls below a threshold confidence level, the system may send a verification request to further verify the object. The object detection system 118 may determine that one or more objects included in a group (static objects) is actually a dynamic object and therefore a false negative.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have modified Achour in view of Kiran to incorporate the teachings of Levinson and provide the method wherein a particular instance classified as static is determined to be actually dynamic, and the classification of that instance as static by the deep learning algorithm is established as a false negative. In doing so, the system is capable of “identifying errors associated with individual sensor modalities by identifying respective groups of objects using data generated by the individual sensor modalities and comparing the respective groups of objects to output a perception system. Such comparisons may be used to identify sensors that are malfunctioning, that need repair, and/or that need calibration.” ([0013])

Regarding claim 6, Achour in view of Kiran and Levinson teaches
The computer-implemented method of claim 5, further comprising: in response to establishing that the classification of the particular instance as static by the deep learning algorithm is a false negative, reclassifying the particular instance as dynamic. See at least Levinson [0112]-[0115] and FIG. 6-7 where the vehicle computing device may determine whether one or more objects are absent or misclassified in at least one of the groups of objects indicating a false negative. If there is an error detected, the system initiates a response using the response system 124 and generates an action using the response system, such as reclassifying the particular instance as dynamic instead of static.
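
For illustration only, a confidence-gated verification that flags a false negative and reclassifies the instance as dynamic can be sketched in Python as follows. The sketch is not Levinson's implementation; the threshold value, the verify callback, the dictionary layout of an instance, and the name review_static are hypothetical.

```python
def review_static(instances, verify, threshold=0.6):
    """Re-check low-confidence static labels and relabel false negatives.

    instances: dicts with "label" and "confidence" keys (hypothetical layout).
    verify: hypothetical callback returning "static" or "dynamic" for an
        instance, standing in for a second, independent check.
    threshold: hypothetical confidence below which verification is requested.
    """
    reviewed = []
    for inst in instances:
        low_confidence = inst["label"] == "static" and inst["confidence"] < threshold
        # A static label contradicted by the second check is a false negative.
        false_negative = low_confidence and verify(inst) == "dynamic"
        reviewed.append({
            **inst,
            "label": "dynamic" if false_negative else inst["label"],
            "false_negative": false_negative,
        })
    return reviewed
```

In this sketch only instances below the confidence threshold are sent for verification, so a confidently labelled static instance is left unchanged.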

Regarding claim 12, Achour in view of Kiran and Levinson teaches
The one or more non-transitory computer-readable storage media of claim 8, wherein the operations further comprise: determining, using the OMap representation, if a particular instance of the instances classified as static is actually dynamic; and in response to determining that the particular instance classified as static is actually dynamic, establishing that the classification of the particular instance as static by the deep learning algorithm is a false negative. See preceding logic for claim 5.

Regarding claim 13, Achour in view of Kiran and Levinson teaches
The one or more non-transitory computer-readable storage media of claim 12, wherein the operations further comprise: in response to establishing that the classification of the particular instance as static by the deep learning algorithm is a false negative, reclassifying the particular instance as dynamic. See preceding logic for claim 6.


Regarding claim 19, Achour in view of Kiran and Levinson teaches
The computer system of claim 15, wherein the operations further comprise: determining, using the OMap representation, if a particular instance of the instances classified as static is actually dynamic; and in response to determining that the particular instance classified as static is actually dynamic, establishing that the classification of the particular instance as static by the deep learning algorithm is a false negative. See preceding logic for claim 5.

Regarding claim 20, Achour in view of Kiran and Levinson teaches
The computer system of claim 19, wherein the operations further comprise: in response to establishing that the classification of the particular instance as static by the deep learning algorithm is a false negative, reclassifying the particular instance as dynamic. See preceding logic for claim 6.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brittany Renee Peko whose telephone number is (408)918-7506. The examiner can normally be reached Monday - Thursday 7:30-5:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Elaine Gort can be reached on (571)272-6781. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/B.R.P./05/21/2022             Examiner, Art Unit 3661
/Elaine Gort/             Supervisory Patent Examiner, Art Unit 3661