DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Response to Amendment
The amendment filed January 20th, 2022 has been entered. Claims 1, 6, 10-11, 16, and 20 remain pending in the application. Applicant’s amendments to the claims have overcome each and every 112(b) rejection previously set forth in the Non-Final Office Action mailed December 27th, 2021.
Response to Arguments
Applicant’s arguments with respect to claims 1 and 11 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
In response to applicant’s argument that claim 1 is patentable over the cited prior art because a partial limitation of claim 6 was allowable and was incorporated into claim 1: claim 6 depends on claim 3, which depends on claim 2, which in turn depends on claim 1, so claim 6 is the combination of claims 1, 2, 3, and 6. Incorporating only a partial limitation of claim 6 into claim 1 changes the scope of claim 1, which requires further search and consideration. The amendment of claim 1 necessitated the new ground of rejection.
In response to applicant’s argument that claim 11 is patentable over the cited prior art because a partial limitation of claim 16 was allowable and was incorporated into claim 11: claim 16 depends on claim 13, which depends on claim 12, which in turn depends on claim 11, so claim 16 is the combination of claims 11, 12, 13, and 16. Incorporating only a partial limitation of claim 16 into claim 11 changes the scope of claim 11, because the amended claim is now the combination of claim 11 and part of claim 16, which requires further search and consideration. The amendment of claim 11 necessitated the new ground of rejection.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Liang et al. "Deep Continuous Fusion for Multi-Sensor 3D Object Detection", hereinafter Liang in view of Magistri et al. (US 20210366144 A1), hereinafter Magistri.

Regarding claim 1, Liang teaches a processor-implemented method (Liang, Fig. 1, architecture of Liang’s model) for generating a bounding box (Liang, Fig. 4) for an object in proximity to a vehicle (Liang, Section 3, the architecture allows to generate the final detection results in BEV space, as desired by the autonomous driving application), the method comprising: 
receiving a three-dimensional (3D) point cloud (Liang, Fig. 1, LIDAR stream, Section 3.1, a set of LIDAR points) representative of an environment; 
receiving a two-dimensional (2D) image (Liang, Fig. 1, Camera stream, Section 3.1, input camera image) of the environment; 
processing the 3D point cloud (Liang, Fig. 1, LIDAR stream, Section 3.1, a set of LIDAR points) to identify an object cluster of 3D data points (Liang, Fig. 2, cluster of 3D data points can be seen on Fig. 2) for a 3D object (Liang, Section 3, one of the stream extract features from LIDAR BEV) in the 3D point cloud (Liang, Fig. 1, LIDAR stream, Section 3.1, a set of LIDAR points); 
processing the 2D image (Liang, Fig. 1, Camera stream, Section 3.1, input camera image) to detect a 2D object (Liang, Section 3, one stream extract image features) in the 2D image (Liang, Fig. 1, Camera stream, Section 3.1, input camera image) and generate information (Liang, Section 3, extract information from the nearest corresponding image features for each point in BEV space) regarding the 2D object (Liang, Section 3, one stream extract image features) from the 2D image (Liang, Fig. 1, Camera stream, Section 3.1, input camera image); and 
when the 3D object (Liang, Section 3, one of the stream extract features from LIDAR BEV) and the 2D object (Liang, Section 3, one stream extract image features) correspond to the same object (Liang, Section 3, corresponding image features for each point in BEV space, Fig.2, project the 3D Lidar points onto the camera image plane to retrieve corresponding image features) in the environment: 
generating (Liang, Section 3, fuse information from the LIDAR which is the 3D data points and information from the image, Section 3.2, the multi-sensor detection network has two streams: the image feature network and the BEV network. We use four continuous fusion layers to fuse multiple scales of image features into BEV network from lower level to higher level. After the fusion, a 1 × 1 convolutional layer is computed over the final BEV layer to generate the detection output and a Non-Maximum Suppression (NMS) layer follows to generate the final object boxes based on the output map) a bird's eye view (BEV) bounding box (Liang, Fig. 4 shows BEV detection or bounding boxes) for the object (Liang, Section 3, corresponding image features for each point in BEV space, Fig.2, project the 3D Lidar points onto the camera image plane to retrieve corresponding image features) based on the object cluster of 3D data points (Liang, Fig. 2, cluster of 3D data points can be seen on Fig. 2) and the information from the 2D image (Liang, Section 3, extract information from the nearest corresponding image features for each point in BEV space).
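
For illustration only, the Non-Maximum Suppression (NMS) step that the cited passage of Liang relies on to produce the final object boxes can be sketched as follows. This is a minimal sketch under stated assumptions, not Liang's implementation: the boxes are simplified to axis-aligned BEV rectangles, and the `Box` layout, the function names `iou` and `nms`, and the threshold value of 0.5 are hypothetical.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) on the BEV plane.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the boxes that are kept, highest score first.

    A box is dropped when it overlaps an already-kept, higher-scoring box
    by more than iou_threshold (an assumed value, not taken from Liang).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In this sketch, two heavily overlapping detections of the same object collapse to the higher-scoring one, while a distant detection is retained.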

Liang does not explicitly disclose wherein the information from the 2D object comprises an image heading $h_{image}$ of the object, and an image heading uncertainty $\sigma_{image}^{2}$ associated with the image heading $h_{image}$ of the object.
	However, Magistri teaches wherein the information from the 2D object (para. 0010, “the autonomous driving system may analyze image data obtained by a camera to identify an object located in a path of the vehicle”) comprises an image heading $h_{image}$ of the object (para. 0010, “determine a direction that the object is facing”, para. 0019, “The direction that the object is facing may be referred to as an object viewpoint.”, as seen in Fig. 4D of the application, heading of the object is the direction it is facing), and an image heading uncertainty $\sigma_{image}^{2}$ (para. 0012, “Because the autonomous driving system may determine to perform different actions depending on the direction that the object is facing, the autonomous driving system may ensure that an accuracy or confidence level associated with determining the direction that the object is facing satisfies a threshold.”, as per the specification para. 0067, the heading uncertainty indicates a confidence score) associated with the image heading $h_{image}$ of the object (para. 0010, “determine a direction that the object is facing”, para. 0019, “The direction that the object is facing may be referred to as an object viewpoint.”, as seen in Fig. 4D of the application, heading of the object is the direction it is facing).
Liang and Magistri are both considered to be analogous to the claimed invention because they are in the same field of object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by Liang to incorporate the teachings of Magistri wherein the information from the 2D object comprises an image heading $h_{image}$ of the object, and an image heading uncertainty $\sigma_{image}^{2}$ associated with the image heading $h_{image}$ of the object. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because the viewpoint system of Magistri may utilize fewer computing resources (e.g., processor resources, memory resources, communication resources, and/or the like) than prior systems used to determine a direction that an object is facing (Magistri, para. 0014). 

Regarding claim 11, Liang teaches a processing system (Liang, Abstract, 3D object detector, Fig. 1) for generating a bounding box (Liang, Fig. 4) for an object in proximity to a vehicle (Liang, Section 3, the architecture allows to generate the final detection results in BEV space, as desired by the autonomous driving application), the processing system comprising: 
a processing unit (Liang, Section 4.1, the processing unit is implicit in view of the experiment done using the 3D object detector); and 
a memory (Liang, Section 3.1, GPU memory, Section 4.1, train the model with a batch size of 16 on 4 GPUs) coupled to the processing unit (Liang, Section 4.1, the processing unit is implicit in view of the experiment done using the 3D object detector), the memory storing machine-executable instructions that (Liang, Section 4), when executed by the processing unit (Liang, Section 4.1, the processing unit is implicit in view of the experiment done using the 3D object detector), cause the processing system (Liang, Abstract, 3D object detector, Fig. 1) to: 
receive a 3D point cloud (Liang, Fig. 1, LIDAR stream, Section 3.1, a set of LIDAR points) representative of an environment; 
receive a 2D image (Liang, Fig. 1, Camera stream, Section 3.1, input camera image) of the environment; 
process the 3D point cloud (Liang, Fig. 1, LIDAR stream, Section 3.1, a set of LIDAR points) to identify a cluster of data points (Liang, Fig. 2, cluster of 3D data points can be seen on Fig. 2) for a 3D object (Liang, Section 3, one of the stream extract features from LIDAR BEV) in the 3D point cloud (Liang, Fig. 1, LIDAR stream, Section 3.1, a set of LIDAR points); 
process the 2D image (Liang, Fig. 1, Camera stream, Section 3.1, input camera image) to detect a 2D object (Liang, Section 3, one stream extract image features) in the 2D image (Liang, Fig. 1, Camera stream, Section 3.1, input camera image) and generate information (Liang, Section 3, extract information from the nearest corresponding image features for each point in BEV space) regarding the 2D object (Liang, Section 3, one stream extract image features) from the 2D image (Liang, Fig. 1, Camera stream, Section 3.1, input camera image); and
when the 3D object (Liang, Section 3, one of the stream extract features from LIDAR BEV) and the 2D object (Liang, Section 3, one stream extract image features) correspond to the same object (Liang, Section 3, corresponding image features for each point in BEV space, Fig.2, project the 3D Lidar points onto the camera image plane to retrieve corresponding image features) in the environment: 
generate (Liang, Section 3, fuse information from the LIDAR which is the 3D data points and information from the image, Section 3.2, the multi-sensor detection network has two streams: the image feature network and the BEV network. We use four continuous fusion layers to fuse multiple scales of image features into BEV network from lower level to higher level. After the fusion, a 1 × 1 convolutional layer is computed over the final BEV layer to generate the detection output and a Non-Maximum Suppression (NMS) layer follows to generate the final object boxes based on the output map) a bird's eye view (BEV) bounding box (Liang, Fig. 4 shows BEV detection or bounding boxes) for the object (Liang, Section 3, corresponding image features for each point in BEV space, Fig.2, project the 3D Lidar points onto the camera image plane to retrieve corresponding image features) based on the object cluster of 3D data points (Liang, Fig. 2, cluster of 3D data points can be seen on Fig. 2) and the information from the 2D image (Liang, Section 3, extract information from the nearest corresponding image features for each point in BEV space).

Liang does not explicitly disclose wherein the information from the 2D object comprises an image heading $h_{image}$ of the object, and an image heading uncertainty $\sigma_{image}^{2}$ associated with the image heading $h_{image}$ of the object.
	However, Magistri teaches wherein the information from the 2D object (para. 0010, “the autonomous driving system may analyze image data obtained by a camera to identify an object located in a path of the vehicle”) comprises an image heading $h_{image}$ of the object (para. 0010, “determine a direction that the object is facing”, para. 0019, “The direction that the object is facing may be referred to as an object viewpoint.”, as seen in Fig. 4D of the application, heading of the object is the direction it is facing), and an image heading uncertainty $\sigma_{image}^{2}$ (para. 0012, “Because the autonomous driving system may determine to perform different actions depending on the direction that the object is facing, the autonomous driving system may ensure that an accuracy or confidence level associated with determining the direction that the object is facing satisfies a threshold.”, as per the specification para. 0067, the heading uncertainty indicates a confidence score) associated with the image heading $h_{image}$ of the object (para. 0010, “determine a direction that the object is facing”, para. 0019, “The direction that the object is facing may be referred to as an object viewpoint.”, as seen in Fig. 4D of the application, heading of the object is the direction it is facing).
Liang and Magistri are both considered to be analogous to the claimed invention because they are in the same field of object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the processing system as taught by Liang to incorporate the teachings of Magistri wherein the information from the 2D object comprises an image heading $h_{image}$ of the object, and an image heading uncertainty $\sigma_{image}^{2}$ associated with the image heading $h_{image}$ of the object. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because the viewpoint system of Magistri may utilize fewer computing resources (e.g., processor resources, memory resources, communication resources, and/or the like) than prior systems used to determine a direction that an object is facing (Magistri, para. 0014). 


Claims 2 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Liang in view of Magistri in further view of Liu et al. "Fast Dynamic Vehicle Detection in Road Scenarios Based on Pose Estimation with Convex-Hull Model", hereinafter Liu.

Regarding claim 2, the combination of Liang in view of Magistri teaches the method of claim 1 (Liang, Fig. 1, architecture of Liang’s model), wherein generating the BEV bounding box (Liang, Section 3.2, and Fig. 4) comprises: 
mapping (Liang, Fig. 2, project the 3D points onto the camera image plane, Section 3.1, for each target pixel in the dense map, we find its nearest K LIDAR points over the 2D BEV plane using Euclidean distance and we extract the corresponding image features by projecting the source LIDAR point onto the image plane) the object cluster of 3D data points (Liang, Fig. 2, cluster of 3D data points can be seen on Fig. 2) to a cluster of 2D data points (Liang, Fig. 2, projected 3D data points become 2d data points) on a 2D plane in a bird's eye view (BEV) (Liang, Fig. 2, project the 3D points onto the camera image plane, Section 3.1, for each target pixel in the dense map, we find its nearest K LIDAR points over the 2D BEV plane using Euclidean distance and we extract the corresponding image features by projecting the source LIDAR point onto the image plane) and in a vehicle coordinate system (Liang, Fig. 5, the final fusion and BEV bounding box are in vehicle coordinate system) of the vehicle; 
determining and storing a group of BEV polygon points (Liang, Fig.2, shows BEV polygon points on the image plane) on the 2D plane in the BEV (Liang, Fig. 2, project the 3D points onto the camera image plane, Section 3.1, for each target pixel in the dense map, we find its nearest K LIDAR points over the 2D BEV plane using Euclidean distance and we extract the corresponding image features by projecting the source LIDAR point onto the image plane); and 
generating (Liang, Section 3, fuse information from the LIDAR which is the 3D data points and information from the image, Section 3.2, the multi-sensor detection network has two streams: the image feature network and the BEV network. We use four continuous fusion layers to fuse multiple scales of image features into BEV network from lower level to higher level. After the fusion, a 1 × 1 convolutional layer is computed over the final BEV layer to generate the detection output and a Non-Maximum Suppression (NMS) layer follows to generate the final object boxes based on the output map) the BEV bounding box (Liang, Fig. 4, BEV bounding box) based on the cluster of 2D data points on the 2D plane (Liang, Fig. 2), the group of BEV polygon points (Liang, Fig. 2), and the information from the 2D image (Liang, Section 3, fuse information from LIDAR and images, Section 3, the fusion layer before generating the detection box maps the BEV LIDAR to the image plane and create cluster of 2D data points and group of BEV polygon points which can be seen on Fig. 2, also as per Section 2 ).

The combination of Liang in view of Magistri does not expressly disclose wherein the group of BEV polygon points forms a convex hull enclosing the cluster of 2D data points on the 2D plane.
	However, Liu teaches determining and storing a group of BEV polygon points (Liu, Fig. 1a and Fig.1c) on the 2D plane (Liu, Section 2.1, 3D point-clouds are projected on the x-y plane) in the BEV (Liu, Section 2, (Cx, Cy) is the center position of the target in bird-view projection) wherein the group of BEV polygon points (Liu, Fig. 1a and Fig.1c ) forms a convex hull (Liu, Fig. 1b and Fig. 1d) enclosing the cluster of 2D data points on the 2D plane (Liu, Fig. 1, the convex hull enclose the cluster of 2D data points on the x-y plane or 2D plane).
Liu is considered to be analogous to the claimed invention because it is in the same field of object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by the combination of Liang in view of Magistri to incorporate the teachings of Liu wherein the group of BEV polygon points forms a convex hull enclosing the cluster of 2D data points on the 2D plane. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because most of the Lidar point-clouds come from the object contour, a point-cloud convex-hull model with a multifactor objective function is proposed to fit the point-cloud set (Liu, Section 2.1). Also, the proposed pose-estimation method requires low computation and is robust to the environment, which reduces the amount of computation while maintaining good detection efficiency (Liu, Section 1.2). 
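
For illustration only, the two operations discussed above (projecting the object cluster of 3D data points onto the x-y plane, then determining polygon points that form a convex hull enclosing the projected points) can be sketched as follows. This is a minimal sketch under stated assumptions: it uses Andrew's monotone chain algorithm, which is a swapped-in standard technique and not necessarily the hull construction used by Liu, and the names `project_to_bev` and `convex_hull` are hypothetical.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def project_to_bev(cluster: List[Point3D]) -> List[Point2D]:
    """Drop the height coordinate so each 3D point lands on the x-y plane."""
    return [(x, y) for x, y, _z in cluster]


def _cross(o: Point2D, a: Point2D, b: Point2D) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point2D]) -> List[Point2D]:
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point2D] = []
    for p in pts:
        # Pop while the last two hull points and p do not make a left turn.
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point2D] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]
```

In this sketch an interior point of the projected cluster is never returned as a polygon point, so the returned vertices enclose every point of the cluster on the 2D plane.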

Regarding claim 12, the combination of Liang in view of Magistri teaches the system of claim 11 (Liang, Abstract, 3D object detector, Fig. 1), wherein in order to generate the BEV bounding box (Liang, Fig. 4), the instructions (Liang, Section 3.1, GPU memory, Section 4.1, train the model with a batch size of 16 on 4 GPUs), when executed by the processing unit (Liang, Section 4.1, the processing unit is implicit in view of the experiment done using the 3D object detector), cause the processing system to: 
map (Liang, Fig. 2, project the 3D points onto the camera image plane, Section 3.1, for each target pixel in the dense map, we find its nearest K LIDAR points over the 2D BEV plane using Euclidean distance and we extract the corresponding image features by projecting the source LIDAR point onto the image plane) the object cluster of 3D data points (Liang, Fig. 2, cluster of 3D data points can be seen on Fig. 2) to a cluster of 2D data points (Liang, Fig. 2, projected 3D data points become 2d data points) on a 2D plane in a bird's eye view (BEV) (Liang, Fig. 2, project the 3D points onto the camera image plane, Section 3.1, for each target pixel in the dense map, we find its nearest K LIDAR points over the 2D BEV plane using Euclidean distance and we extract the corresponding image features by projecting the source LIDAR point onto the image plane) and in a vehicle coordinate system (Liang, Fig. 5, the final fusion and BEV bounding box are in vehicle coordinate system) of the vehicle; 
determine and store a group of BEV polygon points (Liang, Fig.2, shows BEV polygon points on the image plane) on the 2D plane in the BEV (Liang, Fig. 2, project the 3D points onto the camera image plane, Section 3.1, for each target pixel in the dense map, we find its nearest K LIDAR points over the 2D BEV plane using Euclidean distance and we extract the corresponding image features by projecting the source LIDAR point onto the image plane); and 
generate (Liang, Section 3, fuse information from the LIDAR which is the 3D data points and information from the image, Section 3.2, the multi-sensor detection network has two streams: the image feature network and the BEV network. We use four continuous fusion layers to fuse multiple scales of image features into BEV network from lower level to higher level. After the fusion, a 1 × 1 convolutional layer is computed over the final BEV layer to generate the detection output and a Non-Maximum Suppression (NMS) layer follows to generate the final object boxes based on the output map) the BEV bounding box (Liang, Fig. 4, BEV bounding box) based on the cluster of 2D data points on the 2D plane (Liang, Fig. 2), the group of BEV polygon points (Liang, Fig. 2), and the information from the 2D image (Liang, Section 3, fuse information from LIDAR and images, Section 3, the fusion layer before generating the detection box maps the BEV LIDAR to the image plane and create cluster of 2D data points and group of BEV polygon points which can be seen on Fig. 2, also as per Section 2 ).

The combination of Liang in view of Magistri does not expressly disclose wherein the group of BEV polygon points forms a convex hull enclosing the cluster of 2D data points on the 2D plane.
	However, Liu teaches determine and store a group of BEV polygon points (Liu, Fig. 1a and Fig.1c) on the 2D plane (Liu, Section 2.1, 3D point-clouds are projected on the x-y plane) in the BEV (Liu, Section 2, (Cx, Cy) is the center position of the target in bird-view projection) wherein the group of BEV polygon points (Liu, Fig. 1a and Fig.1c ) forms a convex hull (Liu, Fig. 1b and Fig. 1d) enclosing the cluster of 2D data points on the 2D plane (Liu, Fig. 1, the convex hull enclose the cluster of 2D data points on the x-y plane or 2D plane).
Liu is considered to be analogous to the claimed invention because it is in the same field of object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the processing system as taught by the combination of Liang in view of Magistri to incorporate the teachings of Liu wherein the group of BEV polygon points forms a convex hull enclosing the cluster of 2D data points on the 2D plane. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because most of the Lidar point-clouds come from the object contour, a point-cloud convex-hull model with a multifactor objective function is proposed to fit the point-cloud set (Liu, Section 2.1). Also, the proposed pose-estimation method requires low computation and is robust to the environment, which reduces the amount of computation while maintaining good detection efficiency (Liu, Section 1.2). 

Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Liang in view of Magistri in further view of Liu, and in further view of Gong et al. "A Frustum based probabilistic framework for 3D object detection by fusion of LiDAR and camera data" (the prior art cited in IDS).

Regarding claim 3, the combination of Liang in view of Magistri in further view of Liu teaches the method of claim 2 (Liang, Fig. 1, architecture of Liang’s model), wherein generating the BEV bounding box (Liang, Fig. 4) further comprises: 
determining a center $p_{center}$ (Liu, Section 2, the target pose in this work is denoted by (Cx, Cy, θ), in which (Cx, Cy) is the center position of the target in bird-view projection) of the cluster of 2D data points (Liu, Fig. 1) on the 2D plane (Liu, Section 2.1, 3D point-clouds are projected on the x-y plane); 
determining an estimated heading $h_{ob}$ of the object (Liu, input of Algorithm 1 includes the point-clouds of object P and its approximate direction vector Dy that is obtained by the positions of the associated objects, heading means the direction it is moving to); 
determining a plurality of selected polygon points (Liu, Section 2.1, the selected K points surround all the point-clouds in set P) from the group of BEV polygon points (Liu, Fig. 1); 
determining a plurality of candidate bounding boxes (Liu, Section 2.1, after all the points in set K have been applied to construct the bounding box, an objective function is proposed to select the optimal parameters and the parameters of the bounding box with the smallest objective function value are selected as the optimal parameters, and the corresponding bounding box is defined as the fittest box, Liu teaches of selecting a fittest box which implies a plurality of candidate boxes are created before the fittest box is defined), wherein each candidate bounding box is determined based on a respective selected polygon point (Liu, Section 2.1, points in set K have been applied to construct each bounding box as seen on Fig. 2) from the plurality of selected polygon points (Liu, Section 2.1, the selected K points surround all the point-clouds in set P); 
selecting a final bounding box (Liu, Fig.1b and Fig.1d, fittest bounding box) to be the BEV bounding box from the plurality of candidate bounding boxes (Liu, Section 2.1, after all the points in set K have been applied to construct the bounding box, an objective function is proposed to select the optimal parameters and the parameters of the bounding box with the smallest objective function value are selected as the optimal parameters, and the corresponding bounding box is defined as the fittest box, Liu teaches of selecting a fittest box which implies a plurality of candidate boxes are created before the fittest box is defined), wherein the final bounding box (Liu, Fig.1b and Fig.1d, fittest bounding box) is one of the candidate bounding boxes that covers the most number of data points (Liu, in Fig. 1b, the bounding box has all the data points inside it which implies that the most number of data points is covered by the bounding box) from the cluster of 2D data points (Liu, Fig. 1) on the 2D plane (Liu, Section 2.1, 3D point-clouds are projected on the x-y plane).
Liu is considered to be analogous to the claimed invention because it is in the same field of object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by the combination of Liang in view of Magistri in further view of Liu to incorporate the teachings of Liu of determining a center $p_{center}$ on the 2D plane, determining an estimated heading $h_{ob}$ of the object, determining a plurality of selected polygon points from the group of BEV polygon points, determining a plurality of candidate bounding boxes, wherein each candidate bounding box is determined based on a respective selected polygon point from the plurality of selected polygon points, selecting a final bounding box to be the BEV bounding box from the plurality of candidate bounding boxes, wherein the final bounding box is one of the candidate bounding boxes that covers the most number of data points from the cluster of 2D data points on the 2D plane. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because the proposed pose-estimation method requires low computation and is robust to the environment, which reduces the amount of computation while maintaining good detection efficiency (Liu, Section 1.2). 
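
For illustration only, the claimed selection criterion (choosing, from a plurality of candidate bounding boxes, the one that covers the most data points of the 2D cluster) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Liu's objective function: the candidates are assumed to be already constructed and expressed as axis-aligned rectangles in a heading-aligned frame, and the names `count_covered` and `select_final_box` are hypothetical.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
# Hypothetical box layout: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def count_covered(box: Box, points: List[Point2D]) -> int:
    """Number of points lying inside or on the border of the box."""
    x_min, y_min, x_max, y_max = box
    return sum(1 for x, y in points if x_min <= x <= x_max and y_min <= y <= y_max)


def select_final_box(candidates: List[Box], points: List[Point2D]) -> Box:
    """Pick the candidate covering the most points (first one wins on ties)."""
    if not candidates:
        raise ValueError("at least one candidate bounding box is required")
    return max(candidates, key=lambda box: count_covered(box, points))
```

How each candidate is built from its selected polygon point is left outside this sketch, since the passage above does not fix that construction.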

The combination of Liang in view of Magistri in further view of Liu does not expressly disclose rotating the cluster of 2D data points on the 2D plane around the center pcenter based on the estimated heading hob and rotating the BEV bounding box based on the value of hob around the center pcenter of the cluster of 2D data points on the 2D plane.
	However, Gong teaches rotating the cluster of 2D data points on the 2D plane around the center pcenter based on the estimated heading hob (Gong, Fig. 5, rough transformed model. The combination of Liang in view of Liu teaches determining the center of the cluster of 2D data points on the 2D plane, while Gong, as seen in Fig. 5, teaches a center of the object in BEV and the forward direction, or heading. As seen in Fig. 5c, the data points of the object are rotated around the center based on the forward direction or heading of the object) and rotating the BEV bounding box based on the value of hob around the center pcenter of the cluster of 2D data points on the 2D plane (Gong, Fig. 5c, the BEV bounding box is also rotated around the center based on the forward direction or heading of the object. The combination of Liang in view of Liu teaches determining the center of the cluster of 2D data points on the 2D plane).
Gong is considered to be analogous to the claimed invention because it is in the same field of object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by the combination of Liang in view of Magistri in further view of Liu to incorporate the teachings of Gong of rotating the cluster of data points on the 2D plane around the center pcenter based on the estimated heading hob and rotating the BEV bounding box based on the value of hob around the center pcenter of the cluster of data points on the 2D plane. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been that applying the classical PCA (principal component analysis), a technique that uses an orthogonal transformation (a rotation), to roughly estimate the orientation of the 3D, accomplishes three things: (i) isolating noise, (ii) eliminating effects of rotation, and (iii) separating out the redundant degrees of freedom (Gong, Section 3.4b).
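
For illustration only, the rotating limitations discussed above (rotating the cluster of 2D data points around the center based on the estimated heading, and rotating the bounding box around the same center based on the same value) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Gong or the applicant's implementation: the sign convention (rotating by the negative heading to align the cluster with the x-axis, then rotating the box back by the positive heading) and the names `rotate_about` and `oriented_box` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotate_about(points: List[Point], center: Point, angle_rad: float) -> List[Point]:
    """Rotate points counter-clockwise by angle_rad around center."""
    cx, cy = center
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + cos_a * dx - sin_a * dy, cy + sin_a * dx + cos_a * dy))
    return rotated


def oriented_box(points: List[Point], center: Point, heading_rad: float) -> List[Point]:
    """Fit a box aligned with the heading and return its four corners."""
    # Assumption: rotating by -heading aligns the object's heading with the x-axis.
    aligned = rotate_about(points, center, -heading_rad)
    xs = [p[0] for p in aligned]
    ys = [p[1] for p in aligned]
    corners = [
        (min(xs), min(ys)),
        (max(xs), min(ys)),
        (max(xs), max(ys)),
        (min(xs), max(ys)),
    ]
    # Rotate the axis-aligned box back by +heading around the same center.
    return rotate_about(corners, center, heading_rad)


# Illustrative data only: a cluster elongated along the 45-degree diagonal.
cluster = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (1.5, 0.5), (0.5, 1.5)]
center = (1.0, 1.0)
box_corners = oriented_box(cluster, center, math.pi / 4)
```

Because a rotation preserves distances, the box returned here keeps the extent measured in the aligned frame while its orientation follows the heading.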

Regarding claim 13, the combination of Liang in view of Magistri in further view of Liu teaches the system of claim 12 (Liang, Abstract, 3D object detector, Fig. 1), wherein in order to generate the BEV bounding box (Liang, Fig. 4), the instructions (Liang, Section 3.1, GPU memory; Section 4.1, train the model with a batch size of 16 on 4 GPUs), when executed by the processing unit (Liang, Section 4.1, the processing unit is implicit in view of the experiment done using the 3D object detector), cause the processing system to:
determine a center pcenter (Liu, Section 2, the target pose in this work is denoted by (Cx, Cy, θ), in which (Cx, Cy) is the center position of the target in bird-view projection) of the cluster of 2D data points (Liu, Fig. 1) on the 2D plane (Liu, Section 2.1, 3D point-clouds are projected on the x-y plane); 
determine an estimated heading hob of the object (Liu, input of Algorithm 1 includes the point-clouds of object P and its approximate direction vector Dy that is obtained by the positions of the associated objects, heading means the direction it is moving to); 
determine a plurality of selected polygon points (Liu, Section 2.1, the selected K points surround all the point-clouds in set P) from the group of BEV polygon points (Liu, Fig. 1); 
determine a plurality of candidate bounding boxes (Liu, Section 2.1: after all the points in set K have been applied to construct the bounding box, an objective function is proposed to select the optimal parameters; the parameters of the bounding box with the smallest objective function value are selected as the optimal parameters, and the corresponding bounding box is defined as the fittest box. Liu teaches selecting a fittest box, which implies that a plurality of candidate boxes are created before the fittest box is defined), wherein each candidate bounding box is determined based on a respective selected polygon point (Liu, Section 2.1, points in set K have been applied to construct each bounding box as seen in Fig. 2) from the plurality of selected polygon points (Liu, Section 2.1, the selected K points surround all the point-clouds in set P).
Liu is considered to be analogous to the claimed invention because it is in the same field of object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the processing system as taught by the combination of Liang in view of Magistri in further view of Liu to incorporate the teachings of Liu of determining a center pcenter on the 2D plane, determining an estimated heading hob of the object, determining a plurality of selected polygon points from the group of BEV polygon points, determining a plurality of candidate bounding boxes, wherein each candidate bounding box is determined based on a respective selected polygon point from the plurality of selected polygon points, selecting a final bounding box to be the BEV bounding box from the plurality of candidate bounding boxes, wherein the final bounding box is one of the candidate bounding boxes that covers the most number of data points from the cluster of 2D data points on the 2D plane. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been that the proposed pose-estimation method requires low computation and is robust to the environment, which reduces the amount of computation while maintaining good detection efficiency (Liu, Section 1.2).

The combination of Liang in view of Magistri in further view of Liu does not expressly disclose rotating the cluster of 2D data points on the 2D plane around the center pcenter based on the estimated heading hob and rotating the BEV bounding box based on the value of hob around the center pcenter of the cluster of 2D data points on the 2D plane.
	However, Gong teaches rotating the cluster of 2D data points on the 2D plane around the center pcenter based on the estimated heading hob (Gong, Fig. 5, rough transformed model. The combination of Liang in view of Liu teaches determining the center of the cluster of 2D data points on the 2D plane, while Gong, as seen in Fig. 5, teaches a center of the object in BEV and the forward direction, or heading. As seen in Fig. 5c, the data points of the object are rotated around the center based on the forward direction or heading of the object) and rotating the BEV bounding box based on the value of hob around the center pcenter of the cluster of 2D data points on the 2D plane (Gong, Fig. 5c, the BEV bounding box is also rotated around the center based on the forward direction or heading of the object. The combination of Liang in view of Liu teaches determining the center of the cluster of 2D data points on the 2D plane).
Gong is considered to be analogous to the claimed invention because it is in the same field of object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the processing system as taught by the combination of Liang in view of Magistri in further view of Liu to incorporate the teachings of Gong of rotating the cluster of data points on the 2D plane around the center pcenter based on the estimated heading hob and rotating the BEV bounding box based on the value of hob around the center pcenter of the cluster of data points on the 2D plane. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been that applying the classical PCA (principal component analysis), a technique that uses an orthogonal transformation (a rotation), to roughly estimate the orientation of the 3D, accomplishes three things: (i) isolating noise, (ii) eliminating effects of rotation, and (iii) separating out the redundant degrees of freedom (Gong, Section 3.4b).
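
For illustration only, the rough orientation estimate by classical PCA referred to in the motivation above can be sketched for a 2D point set as follows. This is a minimal sketch under stated assumptions and does not reproduce the procedure of Gong, Section 3.4b: it works on 2D points rather than a 3D model, uses the closed-form principal-axis angle of the 2x2 covariance matrix, and the name `principal_axis_angle` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def principal_axis_angle(points: List[Point]) -> float:
    """Angle (radians) of the dominant principal axis of a 2D point set."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix of the centered points.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed form for the direction of the eigenvector with the largest eigenvalue.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


# Illustrative data only: points lying on the 45-degree diagonal.
diagonal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
angle = principal_axis_angle(diagonal)
```

The returned angle identifies an axis, not a direction of travel: an axis at angle a is the same axis as one at a + 180 degrees, so a separate cue would be needed to resolve forward from backward.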

Allowable Subject Matter
Claims 4-9 and 14-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter: 

Regarding claim 4, it would be allowable for disclosing generating four rectangle boxes of a pre-determined size for each respective polygon point of the plurality of selected polygon points.
	The combination of Liang in view of Magistri in further view of Liu does not expressly disclose generating four rectangle boxes of a pre-determined size for each respective polygon point of the plurality of selected polygon points.
	Liang teaches generating two anchor boxes with fixed size (Liang, Section 3.2), but the boxes are not generated for each respective polygon point of the plurality of selected polygon points. Liu teaches selecting polygon points out of the BEV polygon points and the selected points are applied to construct the bounding boxes (Liu, Fig. 1, Fig. 2, and Section 2.1), but Liu does not expressly disclose generating four rectangle boxes of a pre-determined size. Therefore claim 4 would be allowable for disclosing generating four rectangle boxes of a pre-determined size for each respective polygon point of the plurality of selected polygon points.
	Claim 5 would be allowable because it is dependent on claim 4.

Regarding claim 6, it would be allowable for disclosing wherein the information from the 2D image includes: a class label associated with the object, a classification score associated with the class label, a size of the object, an image heading himage of the object, and an image heading uncertainty $\sigma_{image}^{2}$ associated with the image heading himage of the object, determining that the 3D object and the 2D object correspond to the same object in the environment based on the class label associated with the object, the classification score associated with the class label, and the size of the object, and receiving or determining a tracked heading htrack of the object and a tracked heading uncertainty $\sigma_{track}^{2}$ associated with the tracked heading htrack of the object.
	The combination of Liang in view of Magistri in further view of Liu does not expressly disclose determining that the 3D object and the 2D object correspond to the same object in the environment based on the class label associated with the object, the classification score associated with the class label, and the size of the object, and receiving or determining a tracked heading htrack of the object and a tracked heading uncertainty $\sigma_{track}^{2}$ associated with the tracked heading htrack of the object.
	Liang teaches extracting information from the 2D image (Liang, Section 3, extract information from the nearest corresponding image features for each point in BEV space). Liang also teaches a class label associated with the object, a classification score associated with the class label, and a size of the object (Liang, Section 3.2). Liu teaches an image heading himage of the object or the direction of the object (Liu, input of Algorithm 1 includes the point-clouds of object P and its approximate direction vector Dy that is obtained by the positions of the associated objects, heading means the direction it is moving to). However, the combination of Liang in view of Magistri in further view of Liu does not expressly disclose determining that the 3D object and the 2D object correspond to the same object in the environment based on the class label associated with the object, the classification score associated with the class label, and the size of the object, and receiving or determining a tracked heading htrack of the object and a tracked heading uncertainty $\sigma_{track}^{2}$ associated with the tracked heading htrack of the object. Therefore claim 6 would be allowable.
	Claims 7-9 would be allowable because they are dependent on claim 6. Claim 10 would be allowable because it is dependent on claim 9.

Regarding claim 14, it would be allowable for disclosing generating four rectangle boxes of a pre-determined size for each respective polygon point of the plurality of selected polygon points.
	The combination of Liang in view of Magistri in further view of Liu does not expressly disclose generating four rectangle boxes of a pre-determined size for each respective polygon point of the plurality of selected polygon points.
	Liang teaches generating two anchor boxes with fixed size (Liang, Section 3.2), but the boxes are not generated for each respective polygon point of the plurality of selected polygon points. Liu teaches selecting polygon points out of the BEV polygon points and the selected points are applied to construct the bounding boxes (Liu, Fig. 1, Fig. 2, and Section 2.1), but Liu does not expressly disclose generating four rectangle boxes of a pre-determined size. Therefore claim 14 would be allowable for disclosing generating four rectangle boxes of a pre-determined size for each respective polygon point of the plurality of selected polygon points.
	Claim 15 would be allowable because it is dependent on claim 14.

Regarding claim 16, it would be allowable for disclosing wherein the information from the 2D image includes: a class label associated with the object, a classification score associated with the class label, a size of the object, an image heading himage of the object, and an image heading uncertainty $\sigma_{image}^{2}$ associated with the image heading himage of the object, determining that the 3D object and the 2D object correspond to the same object in the environment based on the class label associated with the object, the classification score associated with the class label, and the size of the object, and receiving or determining a tracked heading htrack of the object and a tracked heading uncertainty $\sigma_{track}^{2}$ associated with the tracked heading htrack of the object.
	The combination of Liang in view of Magistri in further view of Liu does not expressly disclose determining that the 3D object and the 2D object correspond to the same object in the environment based on the class label associated with the object, the classification score associated with the class label, and the size of the object, and receiving or determining a tracked heading htrack of the object and a tracked heading uncertainty $\sigma_{track}^{2}$ associated with the tracked heading htrack of the object.
	Liang teaches extracting information from the 2D image (Liang, Section 3, extract information from the nearest corresponding image features for each point in BEV space). Liang also teaches a class label associated with the object, a classification score associated with the class label, and a size of the object (Liang, Section 3.2). Liu teaches an image heading himage of the object or the direction of the object (Liu, input of Algorithm 1 includes the point-clouds of object P and its approximate direction vector Dy that is obtained by the positions of the associated objects, heading means the direction it is moving to). However, the combination of Liang in view of Magistri in further view of Liu does not expressly disclose determining that the 3D object and the 2D object correspond to the same object in the environment based on the class label associated with the object, the classification score associated with the class label, and the size of the object, and receiving or determining a tracked heading htrack of the object and a tracked heading uncertainty $\sigma_{track}^{2}$ associated with the tracked heading htrack of the object. Therefore claim 16 would be allowable.
	Claims 17-19 would be allowable because they are dependent on claim 16. Claim 20 would be allowable because it is dependent on claim 19.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DENISE G ALFONSO whose telephone number is (571)272-1360. The examiner can normally be reached Monday - Friday 7:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached on 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DENISE G ALFONSO/Examiner, Art Unit 2663              

/CLAIRE X WANG/Supervisory Patent Examiner, Art Unit 2663