DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 2/4/2021 has been entered.
 
Response to Arguments
Applicant's arguments filed 2/4/2021 have been fully considered but they are not persuasive.
Applicant argues

[Image in original: media_image1.png]

Examiner’s Response
The Examiner is unable to find support for the limitation as claimed. The relevant text is copied below. The term “interpolate” does not appear in paragraphs 19 and 21. It does appear in paragraph 22, which states that the method “interpolates the 2D image generated in step 106 according to the distance between a camera and the point cloud”. Paragraph 22 does not recite an object in real space, as now amended.
Additionally, there is no evidence that the camera includes a LIDAR. One can infer that the LIDAR device is at a location similar to that of the “shooting place, camera or user”, but that does not mean that the LIDAR is included with the camera. Furthermore, modifying a dependent claim does not cure deficiencies in the independent claim.
[0019] According to the vehicle detection process 10, in step 104, the 3D image with point clouds is acquired. The point clouds in the 3D image may represent a plurality of objects. In an embodiment, the 3D image may be obtained by a light detection and ranging (LIDAR) system and normalized in XYZ axes. After the 3D image is acquired, a ground removal for the 3D image is utilized for removing the ground in the 3D image with a random sample consensus (RANSAC) method or device. Before determining the objects from the point clouds, the point clouds of the 3D image may be filtered into a plurality of object-candidate point clouds according to distances between each point. More specifically, the object-candidate point clouds are filtered by a K-D tree search method, which clusters the point clouds based on distances between each point. In an example, when a distance between two points is less than 0.2 meter, the K-D tree search method clusters the two points as a group, and filters the groups based on a dimension, e.g. length, width and height, into the object-candidate point cloud. Therefore, a preliminary filtering of the object-candidate point cloud of the vehicle detection is determined. 

[0021] To improve the accuracy of the classification and determination of vehicle detection, in step 106, the vehicle detection method maps the 3D image onto a two-dimensional image. In an embodiment, since the vehicle is not certainly right in front of a shooting place, a camera or a user, the 3D image is rotated according to an angle and a distance between a camera and a first point cloud of the object-candidate point clouds and is mapped onto the 2D image by flattening. 
[0022] In addition, when the distance between the vehicle and the shooting place, camera or user is too long, the generated point clouds of the 3D image are sparse. Thus, in step 108, the vehicle detection method 10 interpolates the 2D image generated in step 106 according to the distance between a camera and the point cloud. In an embodiment, the sparser of the point cloud of the 2D image, the more points to be interpolated thereon. As shown in FIG. 2, which is a schematic diagram of an interpolation of the 2D image according to an embodiment of the present disclosure, a point clouds PC1 is too sparse to be recognized for the vehicle detection, and a point clouds PC2 is generated based on step 108. Notably, before interpolating the point cloud of the 2D image, a contour of the point clouds is determined, and relative 2D features may be extracted by histogram of oriented gradient (HOG), local binary patterns (LBP) or Haar-like method. 
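
For illustration only, the distance-based grouping described in quoted paragraph [0019] can be sketched as follows. This is a minimal sketch under stated assumptions, not the Applicant's implementation: it substitutes a brute-force pairwise comparison with union-find for the K-D tree search named in the specification, and the names `cluster_points` and `filter_by_size`, as well as any size limits passed in, are hypothetical. Only the 0.2 meter threshold comes from the quoted text.

```python
from math import dist


def cluster_points(points, max_gap=0.2):
    """Group 3D points so that points closer than max_gap (meters) share a cluster.

    Brute-force stand-in for the K-D tree search named in paragraph [0019].
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) < max_gap:
                parent[find(i)] = find(j)

    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)
    return list(clusters.values())


def filter_by_size(clusters, max_size):
    """Keep clusters whose bounding-box extents fit within max_size (hypothetical limits)."""
    kept = []
    for cluster in clusters:
        extents = [max(axis) - min(axis) for axis in zip(*cluster)]
        if all(e <= limit for e, limit in zip(extents, max_size)):
            kept.append(cluster)
    return kept
```

Note that points join a cluster transitively: two points farther apart than 0.2 meter still end up together if a chain of closer points connects them.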

Applicant argues

[Images in original: media_image2.png, media_image3.png]

Examiner’s Response
There is no evidence that the camera includes a LIDAR. One can infer that the LIDAR device is at a location similar to that of the “shooting place, camera or user”, but that does not mean that the LIDAR is included with the camera.
The Examiner agrees that “a group point clouds may correspond to an object in the real space”. However, a LIDAR device is unable to measure “the center, the front end, the rear end” points; that is physically impossible. LIDAR works by reflecting a laser off the closest points of the object, so it has no way of reaching the center or rear end points. The Examiner also agrees that “those skilled in the art may know how to determine a distance between the camera and the group point clouds (i.e. the object in the real space).” Is it then Applicant’s position that the distance is any distance, measured by any known method?
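
To make the ambiguity concrete, the following minimal sketch computes two different candidate "distances" from the same set of LIDAR returns. The function names and the sample coordinates are hypothetical and are not taken from the specification or from Applicant's remarks. The centroid here is the centroid of the measured surface returns only, which is not the same as the physical center of the object.

```python
from math import dist


def nearest_return_distance(sensor, cloud):
    """Distance from the sensor to the closest measured return in the cloud."""
    return min(dist(sensor, p) for p in cloud)


def centroid_distance(sensor, cloud):
    """Distance from the sensor to the centroid of the measured returns.

    This is the centroid of the returns, not the true center of the object.
    """
    n = len(cloud)
    centroid = tuple(sum(coords) / n for coords in zip(*cloud))
    return dist(sensor, centroid)


sensor = (0.0, 0.0, 0.0)
cloud = [(10.0, -1.0, 0.0), (10.0, 1.0, 0.0), (12.0, -1.0, 0.0), (12.0, 1.0, 0.0)]
print(nearest_return_distance(sensor, cloud), centroid_distance(sensor, cloud))
```

The two values differ for the same point cloud, so a claim that relies on "a distance" without saying which one is measured leaves the result dependent on an unstated choice.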

Applicant argues

[Image in original: media_image4.png]

Examiner’s Response
Applicant’s response does not answer the question: how does this distance value fit into the interpolation? The claim(s) must be described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention. Applicant’s cited text describes what happens before the interpolation, but it does not explain “interpolation according to a distance …”.
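
Purely as an illustration of the kind of relationship that is at issue, the sketch below shows one hypothetical way a distance value could control an interpolation. Nothing here is found in the specification: the rule in `inserts_per_gap` (one extra sample per 10 meters of range), the parameter `step_m`, and the linear interpolation in `densify` are all assumptions made for discussion, and many other rules would be equally consistent with the phrase "according to the distance".

```python
def inserts_per_gap(distance_m, step_m=10.0):
    """Hypothetical rule: one extra interpolated sample per step_m meters of range."""
    return int(distance_m // step_m)


def densify(samples, k):
    """Linearly insert k evenly spaced values between each pair of neighboring samples."""
    if k <= 0 or len(samples) < 2:
        return list(samples)
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        for i in range(1, k + 1):
            t = i / (k + 1)
            out.append(a + t * (b - a))
    out.append(samples[-1])
    return out


# A farther object (sparser returns) receives more interpolated samples under this rule.
row = [0.0, 4.0, 8.0]
near = densify(row, inserts_per_gap(5.0))   # no samples inserted
far = densify(row, inserts_per_gap(15.0))   # one sample inserted per gap
```

The sketch shows that some explicit mapping from distance to interpolation behavior has to be chosen before the step can be carried out.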

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
Claim 1 recites “interpolating the 2D image according to a distance between a camera and an object in real space represented by a first point cloud of the plurality of point clouds, wherein the 2D image is interpolated according to the distance between the camera and the object in the real space;”. The Examiner is unable to find support for this limitation as claimed. The term “interpolate” does not appear in paragraphs 19 and 21. It does appear in paragraph 22, which states that the method “interpolates the 2D image generated in step 106 according to the distance between a camera and the point cloud”. Paragraph 22 does not recite an object in real space, as now amended.
Claim 5 recites “image is acquired by a light detection and ranging (LIDAR) of the camera”. There is no evidence that the camera includes a LIDAR. One can infer that the LIDAR device is at a location similar to that of the “shooting place, camera or user”, but that does not mean that the LIDAR is included with the camera.
Claim 10 is rejected for reasons similar to those given for claim 1 above.
Claim 14 is rejected for reasons similar to those given for claim 5 above.

Claims 2-9 and 11-18 are rejected as dependent on a rejected claim.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 6 and 15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 6 recites “wherein the 3D image is rotated according to an angle between the camera and the first point cloud and is mapped onto the 2D image by flattening”. A camera is located in real space, whereas a point cloud is a data construct. It is unclear what this angle is and how it is determined.
Claim 15 is rejected for reasons similar to those given for claim 6 above.
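
As an illustration of why the angle needs a definition, the sketch below adopts one hypothetical reading: the angle is the azimuth from the sensor position to the centroid of the point cloud, the cloud is rotated about the vertical axis so that the centroid lies straight ahead, and "flattening" drops the depth coordinate. None of these choices is recited in the claim or stated in the specification; the names `azimuth_to_cloud` and `rotate_and_flatten` are assumptions, and other readings (for example an elevation angle, or an angle to the nearest point) would give different results.

```python
from math import atan2, cos, sin


def azimuth_to_cloud(sensor, cloud):
    """Hypothetical angle: azimuth from the sensor to the centroid of the cloud."""
    n = len(cloud)
    cx = sum(p[0] for p in cloud) / n - sensor[0]
    cy = sum(p[1] for p in cloud) / n - sensor[1]
    return atan2(cy, cx)


def rotate_and_flatten(sensor, cloud):
    """Rotate about the vertical axis so the centroid lies on the depth axis,
    then drop depth and keep (lateral, height) as 2D coordinates."""
    theta = azimuth_to_cloud(sensor, cloud)
    c, s = cos(-theta), sin(-theta)
    flat = []
    for x, y, z in cloud:
        dx, dy = x - sensor[0], y - sensor[1]
        lateral = s * dx + c * dy  # y-coordinate after rotating by -theta
        flat.append((lateral, z - sensor[2]))
    return flat
```

Under this reading the lateral coordinates of the flattened points are centered on zero, which is one way of compensating for an object that is not directly in front of the sensor.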


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 10, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Kidono (Pedestrian Recognition Using High-definition LIDAR) in view of “An Investigation of Interpolation Techniques to Generate 2D Intensity Image From LIDAR Data”, hereafter referred to as Ashraf.
Regarding claim 1, Kidono discloses a vehicle detection method, comprising:
acquiring a three-dimensional (3D) image with a plurality of point clouds;  (IV.A, “1) Data acquisition: A 3D point cloud is acquired from LIDAR.”)
mapping the 3D image onto a two-dimensional (2D) image; (IV.A, “2) Segmentation: The acquired 3D point cloud is divided into two classes, ground plane and objects, by using an occupancy grid map [18]. All of the 3D points are projected onto the 2D occupancy grid, which is parallel to the ground plane”)
Kidono does not explicitly disclose “interpolating the 2D image according to a distance between a camera and an object in real space represented by a first point cloud of the plurality of point clouds, wherein the 2D image is interpolated according to the distance between the camera and the object in the real space;”.
Ashraf discloses “interpolating the 2D image according to a distance between a camera and an object in real space represented by a first point cloud of the plurality of point clouds, wherein the 2D image is interpolated according to the distance between the camera and the object in the real space;” (Ashraf, pg. 6 left column, “Step 5: 2D image generated from LIDAR point cloud transformation is of very poor quality. Data interpolation is applied with the objective to fill in the gaps and improve the resolution of the image. Selected interpolation techniques are applied to the low resolution image to get the higher resolution image.” Note: both the intensity and distance images are interpolated, the distance image being the distance of the LIDAR device to the object; see, for example, the abstract and conclusion.)

The suggestion/motivation for doing so would have been to improve the resolution of the image.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 

transforming the 3D image into a plurality of voxels, and extracting a plurality of 3D deep features and a plurality of 2D deep features according to the plurality of voxels; and (IV.B, “Table II lists all nine features used in the proposed method. The set of feature values of each candidate Cj forms a vector fj = (f1,..., f9).”, “f1: Number of points included the cluster (1); f2: The minimum distance to the cluster (1); f3: 3D covariance matrix of the cluster (6); f4: The normalized moment of inertia tensor (6); f5: 2D covariance matrix in 3 zones, which are the upper half, and the left and right lower halves (9); f6: The normalized 2D histogram for the main plane. 14×7 bins (98); f7: The normalized 2D histogram for the secondary plane. 9×5 bins (45); f8: Slice feature for the cluster (20); f9: Distribution of the reflection intensity, which is composed of the mean, the standard deviation and the normalized 1D histogram”, where the number of points is a 3D feature and the 2D histogram is a 2D feature. A 3D image inherently has voxels.)
determining a detection result according to a classification of the plurality of 3D deep features and the plurality of 2D deep features. (IV.D, “4) Classification: A feature vector is computed from the 3D point cloud of each candidate and evaluated to classify the candidate into a pedestrian or not. The proposed method applies SVM with a radial basis function (RBF) kernel to learn the classifier.”)
(inherent to LIDAR, see for example Fig. 3 of Kidono)

Regarding claim 5, Kidono in view of Ashraf discloses the vehicle detection method of claim 1, wherein the 3D image is acquired by a light detection and ranging (LIDAR) of the camera and the 3D data is normalized. (Kidono, IV.A, “1) Data acquisition: A 3D point cloud is acquired from LIDAR.”)

Claim 10 is rejected under similar grounds as claim 1.
Claim 14 is rejected under similar grounds as claim 5.



Claims 2, 3, 11, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Kidono in view of Ashraf in view of Fan (PGPub 2016/0154999).
Regarding claim 2, Kidono in view of Ashraf discloses the vehicle detection method of claim 1,
but does not expressly disclose “performing a ground removal for the 3D image after acquiring the 3D image; and filtering the plurality of point clouds into a plurality of object-candidate point clouds”.
Fan discloses this limitation. (Fan, “[0058] In the method of FIG. 2, a three-dimensional (3D) point cloud about at least one object of interest is obtained (200) as an input for the process. Ground and/or building objects are detected (202) from 3D point cloud data using an unsupervised segmentation method, and the detected ground and/or building objects are removed from the 3D point cloud data. Then, from the remaining 3D point cloud data, one or more vertical objects are detected (206) using a supervised segmentation method.”)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to apply the ground removal shown in Fan to the 3D image of Kidono.
The suggestion/motivation for doing so would have been to remove extraneous data.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Kidono with Fan to obtain the invention as specified in claim 2.

Regarding claim 3, Kidono in view of Ashraf in view of Fan discloses the vehicle detection method of claim 2,
wherein the ground removal is performed by a random sample consensus (RANSAC) method. (Fan, “[0065] The tile is examined in grid cells of a predetermined size, e.g. 25 cm × 25 cm. The minimal-z-value (MZV) points within a multitude of 25 cm × 25 cm grid cells are searched at different locations. For each grid cell, neighboring points that are within a z-distance threshold from the MZV point are retained as candidate ground points. Subsequently, an estimation method, for example a RANSAC (RANdom SAmple Consensus) method, is adopted to fit a plane p to candidate ground points that are collected from all cells. Finally, 3D points that are within a predetermined distance (such as d2 in FIG. 3b) from the fitted plane p are considered as ground points of each tile.”)
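
The plane-fitting step quoted from Fan can be sketched as below. This is a minimal sketch under stated assumptions and not Fan's implementation: it omits the grid-cell MZV preselection and runs RANSAC directly on all points, and the function names, the 0.05 threshold, the iteration count, and the fixed seed are all hypothetical. Points are expected to be tuples so that they can be compared and hashed.

```python
import random


def plane_from_points(p, q, r):
    """Return (a, b, c, d) with unit normal for the plane through p, q, r,
    or None if the three points are (nearly) collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    a, b, c = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-12:
        return None
    a, b, c = a / norm, b / norm, c / norm
    return a, b, c, -(a * p[0] + b * p[1] + c * p[2])


def ransac_ground(points, threshold=0.05, iterations=200, seed=0):
    """Return the largest set of points lying within threshold of a sampled plane."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c * p[2] + d) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    return best


def remove_ground(points, **kwargs):
    """Drop the points that RANSAC assigns to the dominant plane."""
    ground = set(ransac_ground(points, **kwargs))
    return [p for p in points if p not in ground]
```

The sketch treats the plane with the most inliers as the ground, which holds only when ground returns dominate the scene; that is an assumption of the example, not a statement from either reference.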

Claim 11 is rejected under similar grounds as claim 2.
Claim 12 is rejected under similar grounds as claim 3.

Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kidono in view of Ashraf in view of Fan in view of Crouch (PGPub 2019/0370614).
Regarding claim 4, Kidono in view of Ashraf in view of Fan discloses the vehicle detection method of claim 2,
but does not expressly disclose “wherein the plurality of object-candidate point clouds are filtered by a K-D tree search method”.
Crouch discloses filtering by a K-D tree search method. (Crouch, “[0089] …In one embodiment, a dimensionality reduction step is performed in about real-time or near real-time, on the spin image 680 from a high dimension (e.g. number of bins 686) to a lower dimension (e.g. N-1) using projection vectors acquired during LDA. In an embodiment, a k-d tree and NN search is performed in about real-time or near-real time, to assign the object to membership in a first class. In an embodiment, the number (N) of the set of known classes and consequently the number of the set of first classification statistics is limited to a maximum threshold. In one example embodiment, N is less than or equal to 10. In another example embodiment, N is less than or equal to 100. In one embodiment, step 709 is performed in about real-time or near real-time as a result of the dimensional reduction of the spin image 680 vector and resulting k-d tree NN search in the reduced dimensional space.”)
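
For reference, a k-d tree nearest-neighbor (NN) search of the general kind named in the Crouch passage can be sketched as follows. This is a textbook-style sketch, not code from Crouch or from the application; the dictionary-based node layout and the names `build_kdtree` and `nearest` are assumptions made for the example.

```python
from math import dist


def build_kdtree(points, depth=0):
    """Recursively split points on alternating axes at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kdtree(pts[:mid], depth + 1),
        "right": build_kdtree(pts[mid + 1:], depth + 1),
    }


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or dist(target, node["point"]) < dist(target, best):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Search the far side only if the splitting plane is closer than the best match.
    if abs(diff) < dist(target, best):
        best = nearest(far, target, best)
    return best
```

The pruning test in the last step is what allows the search to skip most of the tree, which is consistent with the near real-time use described in the quoted paragraph.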

The suggestion/motivation for doing so would have been to properly segment the image.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Kidono in view of Fan with Crouch to obtain the invention as specified in claim 4.

Claim 13 is rejected under similar grounds as claim 4.

Claims 8, 9, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kidono in view of Ashraf in view of Owechko (U.S. Patent No. 8,488,877).
Regarding claim 8, Kidono discloses the vehicle detection method of claim 1,
but does not expressly disclose “wherein the plurality of 3D deep features are extracted by a 3D convolutional neural network and the plurality of 2D deep features are extracted by a 2D neural network.”
Owechko discloses this limitation. (Owechko, Col. 5, “The present invention includes the combination of: a. Incorporating the relationships between objects (object taxonomy) in a data-driven "just-in-time" processing flow for context-based recognition of hierarchical objects; b. Grammar-based cueing and recognition of geometric objects using implicit geometry representations of the 3D data and geometric token-based finite state machines; c. Area delimitation and 2D saliency recognition using bio-inspired bottom-up visual attention and gist mechanisms; d. Statistical 3D object classifiers based on machine learning of geometric token feature vectors e. 2D and 3D object statistical object classifiers based on convolutional neural networks and prelearning of a set of relevant object features from unlabeled data that are shared by multiple objects; f. Fast local search using cognitive swarm optimization methods; g. Feedback between bottom-up cueing and top-down recognition modules for maximizing recognition rates and minimizing error rates; and h. Executive layer for handling input and output, visualization, construction of the scene map, coordination of recognition processes according to the object taxonomy, and context-based recognition and false alarm rejection.”)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to use the neural network of Owechko to calculate the features of Kidono.
The suggestion/motivation for doing so would have been to determine features.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Kidono with Owechko to obtain the invention as specified in claim 8.

Regarding claim 9, Kidono discloses the vehicle detection method of claim 1,
But does not expressly disclose “wherein an input layer of the 3D convolutional neural network is a voxel with a dimension of 30*30*30, and a kernel quantity of a convolutional layer of the 3D convolutional neural network is 30*30 with a kernel size of 5*5*5”.
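
For context on the recited dimensions, the short sketch below applies the standard convolution output-size arithmetic to a 30*30*30 voxel input and a 5*5*5 kernel. The stride of 1 and the zero padding are assumptions, since the claim does not recite them, and the function names are hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv3d_output_shape(input_shape, kernel_shape, stride=1, padding=0):
    """Output shape of a 3D convolution, applying the same stride and padding per axis."""
    return tuple(conv_output_size(i, k, stride, padding)
                 for i, k in zip(input_shape, kernel_shape))


print(conv3d_output_shape((30, 30, 30), (5, 5, 5)))             # stride 1, no padding
print(conv3d_output_shape((30, 30, 30), (5, 5, 5), padding=2))  # size-preserving padding
```

Under the stated assumptions each 5*5*5 kernel produces a 26*26*26 feature map from the 30*30*30 input; with a padding of 2 the output keeps the input size.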

Therefore, it would have been obvious to one of ordinary skill in this art to modify Kidono with Owechko to obtain the invention as specified in claim 9.

Claim 17 is rejected under similar grounds as claim 8.
Claim 18 is rejected under similar grounds as claim 9.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GANDHI THIRUGNANAM whose telephone number is (571)270-3261.  The examiner can normally be reached on M-F 8:30-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz can be reached on 571-272-3638.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-






/GANDHI THIRUGNANAM/Primary Examiner, Art Unit 2662