DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendment filed September 27th, 2022 has been entered. Claims 1, 6, 10, and 15 remain pending in the application.

Response to Arguments
Applicant's arguments filed 9/27/2022 have been fully considered but they are not persuasive.

Applicant argues that Wu fails to teach the claimed features of “a depth sensing device including a plurality of lasers” and “the number of layers of the multi-layer grid map is the same number as the number of the lasers” recited in claim 1. Examiner respectfully disagrees because Wu discloses in Section II.A using a LiDAR, specifically a VLP-16 LiDAR, which has 16 laser bins. Wu also discloses in Section II.A, “we have created 16 grid maps, called bin map, and each bin map has an index corresponding to the order of the laser.” Therefore, Wu teaches the limitations “a depth sensing device including a plurality of lasers” and “the number of layers of the multi-layer grid map is the same number as the number of the lasers” recited in claim 1.

Applicant argues that Wu fails to teach the claimed features of “the object is cut from the multi-layer grid map to determine which one grid belong to the object through continuous grid values, and however, if the continuous grids have different heights from each other” recited in claim 1. Examiner respectfully disagrees because Wu discloses in Section II.A using the 16 bin maps, the level map, and the fine map to split the pedestrian from other objects. Wu also discloses in Section II.A that “if the difference of neighbor cell's value in level map and this cell's value in fine map is higher than 2, a laser bin may scan through between this two cells. Then the neighbor occupied cell will be deleted for separation from object to object. After the object's cells splitting, the connected-component labeling is used to group the cells, which are connected together. The labeling map is called blob map. The final step is to collect the object's points, which appears in the corresponding occupied cells in the blob map.” This means that if the heights of the neighbor cells differ, the laser is scanning through, or penetrating, between them, and the neighbor occupied cell is deleted to separate one object from another. The object is therefore cut from the multi-layer grid map to determine which grids belong to the object, and finally all the cells that belong to the same group or object are collected to form a blob map. Therefore, Wu teaches the limitations “the object is cut from the multi-layer grid map to determine which one grid belong to the object through continuous grid values, and however, if the continuous grids have different heights from each other” recited in claim 1.
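
For illustration only, the cell-splitting and grouping steps that Wu describes in Section II.A can be sketched as follows. This is a minimal sketch under stated assumptions and is not Wu's implementation: the data layout (a dictionary from a cell position to the set of occupied laser indices), the function names, and the use of 4-connected neighbors are hypothetical choices made here. Only the rule itself (delete the neighbor occupied cell when its level-map value exceeds this cell's fine-map value by more than 2, then group the remaining cells by connected-component labeling) is taken from the quoted passage.

```python
from collections import deque

GAP = 2  # Wu, Section II.A: split when the difference is "higher than 2"


def level_and_fine(bins):
    """Return (level, fine) for one cell.

    bins is the set of laser indices occupying the cell (assumed layout).
    level is the lowest occupied index; fine is the highest index that is
    continuously occupied starting from level.
    """
    level = min(bins)
    fine = level
    while fine + 1 in bins:
        fine += 1
    return level, fine


def neighbors(cell):
    """4-connected neighbors of a (row, col) cell (an assumption of this sketch)."""
    r, c = cell
    return ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))


def split_and_label(cells):
    """Delete neighbor cells separated by a height gap, then label blobs."""
    spans = {rc: level_and_fine(b) for rc, b in cells.items() if b}
    deleted = set()
    for cell, (_, fine) in spans.items():
        for nb in neighbors(cell):
            # Neighbor's lowest index lies more than GAP above this cell's
            # highest continuous index: a laser may pass between them.
            if nb in spans and spans[nb][0] - fine > GAP:
                deleted.add(nb)
    kept = set(spans) - deleted
    labels, next_label = {}, 0
    for start in sorted(kept):
        if start in labels:
            continue
        next_label += 1
        labels[start] = next_label
        queue = deque([start])
        while queue:  # breadth-first connected-component labeling
            for nb in neighbors(queue.popleft()):
                if nb in kept and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
    return labels
```

In this sketch, a cell occupied only by high laser indices (for example, a roof over a person) is removed when an adjacent cell is occupied only by low indices, so the cells on either side receive different blob labels.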

Applicant argues that Wu and Oh fail to teach the claimed features of “when one grid in the multi-layer grid map at the same location has a breakpoint in different grid maps (no value), this means that this position and height of the breakpoint are penetrated by the laser, and the objects at both ends are not the same objects, and breakpoint is defined as a passage of two objects of different heights at this position with more than two lasers passing apart” recited in claim 1. Examiner respectfully disagrees because Wu teaches objects having different heights in different grid maps in Section II.A, “after that, we construct a differential map by assigning the cell to 1 if the changing of height in that cell is over 10 cm. The 16 bin maps, level map, and fine map will help to split the pedestrian with other objects, such as a person and bus stop. The way to overcome this problem is to examine the cell in fine map with the cell in level map. If the difference of neighbor cell's value in level map and this cell's value in fine map is higher than 2, a laser bin may scan through between this two cells. Then the neighbor occupied cell will be deleted for separation from object to object.” As taught by Wu, the objects are different when the difference between the neighbor cell's value in the level map and this cell's value in the fine map is higher than 2, because a laser bin may scan through, or penetrate, between the two cells. Oh also teaches a multi-layer grid map having a breakpoint, or no value, in different grid maps, as shown in Figure 1 and Figure 4, where the heights of the objects can differ. Oh teaches that the occupied cells contain particles, which are reflectances of the LiDAR beam, and that the free cells are penetrated by the beams, which means the free cells between occupied cells are a passage between different objects. Oh also teaches mapping the (X, Y, Z) coordinates of the particles in Section III, which means height is also part of the grid map.
Also, both Oh and Wu use a LiDAR, which is a depth sensing device. Therefore, Oh and Wu teach the claimed features of “when one grid in the multi-layer grid map at the same location has a breakpoint in different grid maps (no value), this means that this position and height of the breakpoint are penetrated by the laser, and the objects at both ends are not the same objects, and breakpoint is defined as a passage of two objects of different heights at this position with more than two lasers passing apart” recited in claim 1.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 6, 10, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al., "LiDAR/camera sensor fusion technology for pedestrian detection" (2017), hereinafter referred to as Wu, in view of Chen et al., "Monocular 3D Object Detection for Autonomous Driving" (2016), hereinafter referred to as Chen, and in further view of Sun et al., "Deep learning based pedestrian detection" (2018), hereinafter referred to as Sun, and in further view of Oh et al., "Fast Occupancy Grid Filtering Using Grid Cell Clusters From LIDAR and Stereo Vision Sensor Data" (2016), hereinafter referred to as Oh.

Regarding claim 1, Wu teaches a method (Fig. 1, proposed lidar/camera sensor fusion design) for identifying a pedestrian (Abstract, research of Lidar/camera sensor fusion technology for pedestrian detection to ensure extremely high detection accuracy), comprising: 
capturing (Fig. 1, camera) an original image (Fig. 1, camera frame catching), detecting  a pedestrian (Fig. 1, Section II, use two sensors to find out the pedestrian region proposals in the first classification, Section II C, PVA-Lite classification, detect only the classes that are needed in the applications, that is two classes, i.e. pedestrian and background by using PVA-lite which is a simpler model of PVANET that is used in many modern CNN detection) in the original image (Fig. 1, camera frame catching), and obtaining a 2D (two-dimensional) pedestrian feature image (Section II, the proposed PVA-lite also gives the candidates in camera-based method,  Section II C, PVA-lite has a base network that is composed of three traditional convolutional layers and follow by ten Inception layers, then using the concept of hyper-net to combine different abstraction levels of features. Finally using the same architecture of Faster R-CNN to do the detection job, and the fully-connected layer is only connected to the application target class. The invention uses CNN for the detection which inherently will output the candidates as a 2D feature image since the input to the CNN is a 2D image) from the original image (Fig. 1, camera frame catching); 
obtaining a 3D (three-dimensional) data (Fig. 1, LiDAR and LiDAR packets catching), and performing a 3D identification process (Fig. 1, 3D processing & 3D classification) for the 3D data (Fig. 1, LiDAR packets catching) to obtain a 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method, Section II B, 3D classification, Feature extraction is needed to be done before the 3D SVM. Support vector machines, use a decision plane to divided different classes in feature space, and our features are extracting from 3D point cloud, the combine of feature extraction and the SVM, we call it as 3D SVM instead. Fig. 4 shows an illustration of the feature we extracted from an object's point cloud) with a pedestrian feature (Fig. 4) by a depth sensing device including a plurality of lasers (Fig. 1, LiDAR is a depth sensing device that includes a plurality of lasers, Section II.A, “The VLP-16 LiDAR is adopted in the proposed design, which has 16 laser bins”); 
projecting the 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method) onto a 2D pedestrian feature plane image (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. Then the LiDAR's region proposals is projected into image plane and build the ROI for pedestrian detection); and 
matching (Fig. 1, object matching) the 2D pedestrian feature image (Section II, the proposed PVA-lite also gives the candidates in camera-based method) and the 2D pedestrian feature plane image (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. At the same time, the proposed PVA-lite also gives the candidates in camera-based method. Then the LiDAR's region proposals is projected into image plane and build the ROI for pedestrian detection) to obtain a matching image (Section II D, a list of proposal ROI from LiDAR is given, and there will also be a list of proposal ROI from camera by using PVA-lite. Some of ROI from both lists belong to same pedestrian, so an object-matching algorithm to combine or eliminate the ROI from both lists is needed); 
wherein the original image (Fig. 1, camera frame catching) and the 3D data (Fig. 1, LiDAR packets catching) are simultaneity obtained (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. At the same time, the proposed PVA-lite also gives the candidates in camera-based method. The 3D data obtained by the LiDAR and the 2D image obtained by camera are processed for candidate detection at the same time which makes it implicit that both the 3D data and 2D image are obtained simultaneously).
wherein when the 2D pedestrian feature image (Section II, the proposed PVA-lite also gives the candidates in camera-based method) is not obtained (Section II, “for the occluded pedestrians, the proposed PVA-lite detector might be missed, but LiDAR doesn't. The candidates of occluded objects will go through the second classification process.” If the PVA-lite detector miss detecting the occluded pedestrians, that means the 2D pedestrian feature image is not obtained.), the obtained original image (Fig. 1, camera frame catching) is processed for a second 3D identification (Fig. 1 and Fig. 7, proposed second classification, Section II E, the candidate ROIs from camera will recollect the 3D point cloud appearing within the ROI of candidates. Then, it performs the feature extraction and 3D classification again, Section II, “for the occluded pedestrians, the proposed PVA-lite detector might be missed, but LiDAR doesn't. The candidates of occluded objects will go through the second classification process”.) to obtain a first interest area (Section III, 3D SVM is used to classify the point cloud that appears within PVA-lite candidates, Section II E, when the whole steps are finished, the final pedestrian detection result based on 2D and 3D classification is given), wherein the first interest area (Section III, 3D SVM is used to classify the point cloud that appears within PVA-lite candidates) is identified to obtain the 2D pedestrian feature image (Section II E, when the whole steps are finished, the final pedestrian detection result based on 2D and 3D classification is given), and
wherein the 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method )  is not obtained (Section II, “if a person is lying on the wall, segmentation will treat him as a non-pedestrian object, but this will be detected by camera, this kind of candidates will also go through the second classification”. If the 3D classification treats the pedestrian object as non-pedestrian object then that means the 3D pedestrian feature map is not obtained), the obtained 3D data (Fig. 1, LiDAR packets catching) is processed for a second detection (Fig. 1 and Fig. 7, proposed second classification, Section II E, the candidate ROIs from LiDAR will crop the image within the ROI, and is classified by PVA-lite to see if it is pedestrian, Section II, “if a person is lying on the wall, segmentation will treat him as a non-pedestrian object, but this will be detected by camera, this kind of candidates will also go through the second classification.”) to obtain a second interest area (Section III, use PVA-lite to classify the simple verification candidate ROIs and enhance the occlusion objects detection rate by classifying LiDAR simple verification results by using PVA-lite again, Section II E, when the whole steps are finished, the final pedestrian detection result based on 2D and 3D classification is given.), wherein the second interest area (Section III, use PVA-lite to classify the simple verification candidate ROIs and enhance the occlusion objects detection rate by classifying LiDAR simple verification results by using PVA-lite again) is identified to obtain the 3D pedestrian feature map (Section II E, when the whole steps are finished, the final pedestrian detection result based on 2D and 3D classification is given), and
wherein when the 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method) is projected to the 2D pedestrian plane image (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. Then the LiDAR's region proposals is projected into image plane and build the ROI for pedestrian detection), the 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method) is changed from a spherical coordinate to a Cartesian coordinate (It is inherent that if the 3D feature map which is in spherical coordinate is projected onto a 2D plane image which is in Cartesian coordinate, the coordinate of the 3D feature map will be in the coordinate of the 2D image plane which is Cartesian coordinate),
wherein when the 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method) is projected to the 2D pedestrian feature plane image (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. Then the LiDAR's region proposals is projected into image plane and build the ROI for pedestrian detection), detection points detected by the lasers (Fig. 1, LiDAR is a depth sensing device that includes a plurality of lasers, Section II.A, “The VLP-16 LiDAR is adopted in the proposed design, which has 16 laser bins”) are projected onto a multi-layer grid map (Section II A, a 2D grid map is a big map with many little cells in it, each cell has a cover of ground of N*N cm², the more the cell's number is, the bigger the ground covered. The detected-points in the point cloud, will project into this map), the number of layers of the multi-layer grid map is the same as the number of the lasers (Section II.A, the VLP-16 LiDAR has 16 laser bins and 16 grid maps have been created, called bin map, and each bin map has an index corresponding to the order of laser), each object in the multi-layer grid map (Section II A, Fig. 3) is identified as the highest (Section II A, fine map will record the highest index of bin maps in each cell) and lowest points (Section II A, level map will record the lowest index of bin maps in each cell) of the each object for adjustment (Section II A, 16 grid maps are created, called bin map, and each bin map has an index corresponding to the order of laser. The cell in each map covers the ground of 5*5 cm². Each bin projects its height value divided by a cell's height of 10 cm to the corresponding cell. When the 16 bin maps are done, a new map called level map will record the lowest index of bin maps in each cell. Another map, called fine map, will record the highest index of each bin map, which is continuous occupied due to the value of level map), the object is cut from the multi-layer grid map (Section II A, Fig. 3) to determine which one grid belong to the object through continuous grid values (Section II A, the 16 bin maps, level map, and fine map will help to split the pedestrian with other objects, such as a person and bus stop. The way to overcome this problem is to examine the cell in fine map with the cell in level map. As seen in Section II A, the map is continuously occupied due to the value of level map.), and however, if the continuous grids have different heights from each other (Fig. 3, continuous grids have different heights from each other as seen on the fine map and level map), the object will be cut and adjusted if the lowest point of the grid around the adjacent continuous grid is higher than the highest point of the continuous grid (Section II A, if the difference of neighbor cell's value in level map and this cell's value in fine map is higher than 2, a laser bin may scan through between this two cells. Then the neighbor occupied cell will be deleted for separation from object to object. When the difference between the cell's value in the fine map, which is the highest point, and the neighbor cell's value in the level map, which is the lowest point, is higher than 2, the lowest point of the neighbor's cell is higher than the highest point of the other cell, and the neighbor occupied cell will be deleted for separation from the object),
wherein in the multi-layer grid map (Section II A, Fig. 3), whether the values in the continuous grid are all 1 (meaning there is a value) to determine whether an object under test (Section II A, if the cell is nonzero value, it’s occupied, after constructing the fine map and level map, the differential map is constructed by assigning the cell to 1 if the changing of height in that cell is over 10 cm), and if the highest point of some of the continuous grids in the continuous grid in the object to be tested is smaller than the lowest point of the surrounding grid (Section II A, if the difference of neighbor cell's value in level map and this cell's value in fine map is higher than 2, a laser bin may scan through between this two cells. Then the neighbor occupied cell will be deleted for separation from object to object. When the difference between the cell's value in the fine map, which is the highest point, and the neighbor cell's value in the level map, which is the lowest point, is higher than 2, the lowest point of the neighbor's cell is higher than the highest point of the other cell, and the neighbor occupied cell will be deleted for separation from the object), the object grid will be adjusted for cutting (Section II A, if the difference of neighbor cell's value in level map and this cell's value in fine map is higher than 2, a laser bin may scan through between this two cells. Then the neighbor occupied cell will be deleted for separation from object to object), and the object to be tested will be post-processed to select the object to be tested that matches the pedestrian range (Section II A, Fig. 2, after the object's cells splitting, the connected-component labeling is used to group the cells, which are connected together. The labeling map is called blob map. The final step is to collect the object's points, which appears in the corresponding occupied cells in the blob map), and
wherein the multi-layer grid map (Section II A, Fig. 3) is further used to distinguish between obstacles of the different heights points (Section II A, “the difference of neighbor cell's value in level map and this cell's value in fine map”, the level map record the lowest index of each bin map or lowest values in the continuous grid and fine map record the highest index of each bin map or the highest values in the continuous grid).
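
For illustration only, the change from a spherical coordinate to a Cartesian coordinate and the subsequent projection onto an image plane, as referenced in the mapping of claim 1 above, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Wu: the angle convention (azimuth measured in the horizontal plane from the x axis, elevation measured from that plane), the pinhole camera model, the assumption that the point is already expressed in a camera frame with z pointing forward, and the parameter names fx, fy, cx, and cy are all hypothetical.

```python
import math


def spherical_to_cartesian(r, azimuth_deg, elevation_deg):
    """Convert a range/azimuth/elevation return to (x, y, z).

    Assumed convention: azimuth from the x axis in the horizontal plane,
    elevation from the horizontal plane. A real sensor may differ.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = r * math.cos(el) * math.cos(az)
    y = r * math.cos(el) * math.sin(az)
    z = r * math.sin(el)
    return x, y, z


def project_pinhole(point, fx, fy, cx, cy):
    """Project a camera-frame point (z forward) to pixel coordinates.

    fx, fy, cx, cy are hypothetical camera intrinsics. Points at or behind
    the camera (z <= 0) cannot be projected and yield None.
    """
    x, y, z = point
    if z <= 0:
        return None
    return fx * x / z + cx, fy * y / z + cy
```

In practice an extrinsic transform from the LiDAR frame to the camera frame would sit between the two functions; it is omitted here because neither the claim mapping nor the cited passages specify one.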

Wu does not explicitly disclose obtain the 2D pedestrian feature image according to a default threshold.
	However, Chen discloses obtain the 2D pedestrian feature image (Chen, see Section 3.2, Chen uses an SVM to create candidate proposals for the object detection, as for Wu, he also uses SVM to do the second classification for the obtained original image as seen in Fig. 1) according to a default threshold (Chen, see Section 4 and Fig. 4, Chen uses a threshold to evaluate the candidate proposals. Chen uses an overlap threshold of 0.5 for pedestrians and 0.7 for cars).
	Wu and Chen are both considered to be analogous to the claimed invention because they are in the same field of object detection. It would have been obvious to one of ordinary skill in the art at the time of filing to use the default threshold of Chen on the method of Wu, because it is predictable that doing so would make the method of Wu obtain high-quality object detections due to the different thresholds for different classes (Chen, Section 5), and it would make the method more robust to ground plane errors (Chen, Section 3.4).
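
For illustration only, a per-class overlap threshold of the kind cited from Chen (0.5 for pedestrians and 0.7 for cars) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Chen's code: it assumes that "overlap" means intersection over union (IoU) of axis-aligned boxes given as (x1, y1, x2, y2), that a candidate is accepted when the overlap is greater than or equal to the threshold, and the function and variable names are hypothetical.

```python
# Threshold values as cited from Chen; the dictionary itself is hypothetical.
OVERLAP_THRESHOLD = {"pedestrian": 0.5, "car": 0.7}


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def passes_threshold(candidate, reference, cls):
    """True when the candidate overlaps the reference by at least the class threshold."""
    return iou(candidate, reference) >= OVERLAP_THRESHOLD[cls]
```

The sketch shows why a class-specific default threshold changes which candidates survive: the same pair of boxes can be accepted as a pedestrian and rejected as a car.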

Wu does not explicitly disclose obtain the 3D pedestrian feature map according to another default threshold.
	However, Sun discloses obtain the 3D pedestrian feature map (Sun, see Abstract and Fig. 1, Sun uses a PVANet to generate feature maps which are used to generate pedestrian candidates, as for Wu, he uses PVA-lite, see Fig. 1, to do the second classification of the obtained 3D data, PVA-lite is based on PVANet as seen in Section II C of Wu) according to another default threshold (Sun, see Section 3.3, Sun uses a threshold of 0.7 for the pedestrian detection).
	Wu and Sun are both considered to be analogous to the claimed invention because they are in the same field of object detection. It would have been obvious to one of ordinary skill in the art at the time of filing to use the default threshold of Sun on the method of Wu, because it is predictable that doing so would make sure that the sample is correctly classified (Sun, Section 2.3).

Wu does not explicitly disclose when one grid in the multi-layer grid map at the same location has a breakpoint in different grid maps (no value), this means that this position and height of the breakpoint are penetrated by the laser, and the objects at both ends are not the same object, and the breakpoint is defined as a passage of two objects of different heights at this position with more than two lasers passing apart.
	However, Oh discloses when one grid in the multi-layer grid map (Section I para. 5, “Grid Cell Cluster (GCC), clusters grid cells by projecting 3D data onto the superpixels of the image data.”, Abstract, “a GCC consisting of several grid cells”, Fig. 4 shows the creation of the grid map, Section I, “we propose an efficient and effective manner of representing and predicting grid states using a multi-layer LIDAR and a stereo vision sensor.”, Section II, “a particle in a LIDAR sensor is the reflectance of a beam”) at the same location has a breakpoint (Fig. 1, the same location has a breakpoint or space, it can be seen that some of the cells in the same row are blank cells or free cells, the red cell defines that it is occupied, Fig. 4 shows the creation of the grid map clearly, in Fig. 4.d, there are breakpoints between the data in the grid map) in different grid maps (no value) (Section II.A, “In our work, the (X, Y, Z) axes denote a progression from left to right, from bottom to up, and from near to far with regard to the egovehicle, respectively. Here, θ_k denotes the sensor observations at time k. We use a 2D grid map as an occupancy grid map on the (X, Z) axis, where 3D measurements from LIDAR and stereo camera are binned. Following this, to filter the occupancy grid cells, we represent the states of each grid cell c (c = 1, ..., N) at time k, where p(W_k^c | θ_k) contains dynamic states p(x_k^c | θ_k) and the occupancy state p(o_k^c | θ_k). From the represented grid states, the state of each grid cell at time k+1 is predicted as p(W_{k+1}^c | θ_{1:k}).”, Section II A para. 2, “The occupancy state of grid cell c at time k, p(o_k^c | θ_{1:k}) can be occupied or free as o_k^c ∈ {O_k^c, F_k^c}”, so the cell can have value/occupied or no value/free which can be clearly seen in Fig. 1), this means that this position and height of the breakpoint are penetrated by the laser (Section II.A, “The occupancy probability of grid cell c is measured as the ratio of the number of occupied particles (∑_{i=1}^{N_k^p} γ_k^{c,i}) to the possible number of particles in the cell c (Γ_max^c)” and “The free probability that of grid cell c is free can be defined as follows: p(F_k^c | θ_{1:k}) = 1 − p(O_k^c | θ_{1:k})”, which means the cell is occupied if the laser touches a particle, otherwise it is free because the laser just goes through the same position and no particle is sensed by the laser so it is a breakpoint or it is being penetrated by the laser or beam, in Section II, “a particle in a LIDAR sensor is the reflectance of a beam”, so if it is not a particle then the beam is not breaking through it, this can be clearly seen in Fig. 1 and Fig. 4), and the objects at both ends are not the same object (Fig. 1, the objects are different when there are free cells in between the same location), and the breakpoint is defined as a passage of two objects of different heights at this position with more than two lasers passing apart (Wu teaches that each map is defined by different height and “if the difference of neighbor cell’s value in level map and this cell’s value in fine map is higher than 2, a laser bin may scan through between two cells” (Wu, Section II.A) and Oh teaches in Fig. 1 that the two different objects have at least two free cells in between each object, the red cell defines that it is occupied, as seen in Fig. 4, the objects are of different heights so Oh teaches that the breakpoint of free cells defines a passage or space between two objects of different heights and as seen in Fig. 1 and Fig. 4 where the grid cells are from the multi-layer LiDAR has two lasers passing apart between the objects).
	Wu and Oh are both considered to be analogous to the claimed invention because they are in the same field of multi-sensor object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by Wu to incorporate the teachings of Oh that when one grid in the multi-layer grid map at the same location has a breakpoint in different grid maps (no value), this means that this position and height of the breakpoint are penetrated by the laser, and the objects at both ends are not the same object, and the breakpoint is defined as a passage of two objects of different heights at this position with more than two lasers passing apart. The motivation for the proposed modification would have been that “the execution time for predictions and updates is faster” and “the predictions are more accurate” (Oh, Section I, Paragraph 5).
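
For illustration only, the occupancy and free probabilities quoted from Oh, and one possible reading of the claimed breakpoint, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Oh or from the application: the function names are hypothetical, the per-location input is assumed to be a list with one 0/1 value per laser layer, and "more than two lasers passing apart" is read here as a run of more than two empty layers lying between two occupied layers.

```python
def occupancy_probability(occupied_particles, max_particles):
    """Ratio of occupied particles to the possible number of particles in a cell."""
    if max_particles <= 0:
        raise ValueError("max_particles must be positive")
    return min(occupied_particles / max_particles, 1.0)


def free_probability(occupied_particles, max_particles):
    """Free probability as one minus the occupancy probability."""
    return 1.0 - occupancy_probability(occupied_particles, max_particles)


def find_breakpoints(layers, min_gap=2):
    """Return (first_empty, last_empty) layer index pairs for each breakpoint.

    layers holds one 0/1 value per laser layer at a single grid location.
    A breakpoint is a run of more than min_gap empty layers that lies
    between two occupied layers (an interpretation made for this sketch).
    """
    occupied = [i for i, value in enumerate(layers) if value]
    gaps = []
    for lower, upper in zip(occupied, occupied[1:]):
        if upper - lower - 1 > min_gap:
            gaps.append((lower + 1, upper - 1))
    return gaps
```

Under this reading, empty layers below the lowest occupied layer or above the highest occupied layer are not breakpoints, because there is no second object on the far side of the gap.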

Regarding claim 6, the combination of Wu in view of Chen and in further view of Sun and in further view of Oh teaches the method (Wu, Fig. 1, proposed lidar/camera sensor fusion design) according to claim 1, wherein the pedestrian (Wu, Fig. 1, Section II, use two sensors to find out the pedestrian region proposals in the first classification, Section II C, PVA-Lite classification, detect only the classes that are needed in the applications, that is two classes, i.e. pedestrian and background by using PVA-lite which is a simpler model of PVANET that is used in many modern CNN detection) in the original image (Wu, Fig. 1, camera frame catching) is detected by a depth learning technology (Wu, Fig. 1, Section II C, deep learning PVA-lite), and the 3D data (Fig. 1, LiDAR packets catching) is processed for a second detection (Wu, Fig. 1, Section II B, 3D classification and 3D classification).


Regarding claim 10, Wu teaches a system (Fig. 1 and Fig. 5, mechanism of the proposed sensor fusion design) for identifying a pedestrian (Abstract, research of Lidar/camera sensor fusion technology for pedestrian detection to ensure extremely high detection accuracy), comprising: 
an image capturing device (Section III, Logitech C920 camera, Fig. 1, camera), capturing (Fig. 1, camera) an original image (Fig. 1, camera frame catching), detecting a pedestrian (Fig. 1, Section II, use two sensors to find out the pedestrian region proposals in the first classification, Section II C, PVA-Lite classification, detect only the classes that are needed in the applications, that is two classes, i.e. pedestrian and background by using PVA-lite which is a simpler model of PVANET that is used in many modern CNN detection) in the original image (Fig. 1, camera frame catching), and obtaining a 2D (two-dimensional) pedestrian feature image (Section II, the proposed PVA-lite also gives the candidates in camera-based method,  Section II C, PVA-lite has a base network that is composed of three traditional convolutional layers and follow by ten Inception layers, then using the concept of hyper-net to combine different abstraction levels of features. Finally using the same architecture of Faster R-CNN to do the detection job, and the fully-connected layer is only connected to the application target class. The invention uses CNN for the detection which inherently will output the candidates as a 2D feature image since the input to the CNN is a 2D image) from the original image (Fig. 1, camera frame catching); 
a depth sensing device (Section III, VLP-16 LiDAR, Fig. 1, LiDAR), including a plurality of lasers (Section III, VLP-16 LiDAR has 16 lasers) obtaining a 3D (three-dimensional) information (Fig. 1, LiDAR and LiDAR packets catching), and performing a 3D identification process (Fig. 1, 3D processing & 3D classification) for the 3D information (Fig. 1, LiDAR packets catching) to obtain a 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method, Section II B, 3D classification, Feature extraction is needed to be done before the 3D SVM. Support vector machines, use a decision plane to divided different classes in feature space, and our features are extracting from 3D point cloud, the combine of feature extraction and the SVM, we call it as 3D SVM instead. Fig. 4 shows an illustration of the feature we extracted from an object's point cloud) with the pedestrian feature (Fig. 4); 
and a matching device (Section III, proposed algorithm is running on the Intel Core i7-5930K with 1.50GHz with 12GB RAM. and NVidia Titan X), matching (Fig. 1, object matching) the 2D pedestrian feature image (Section II, the proposed PVA-lite also gives the candidates in camera-based method) and the 2D pedestrian feature plane image (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. At the same time, the proposed PVA-lite also gives the candidates in camera-based method. Then the LiDAR's region proposals is projected into image plane and build the ROI for pedestrian detection) to obtain a matching image (Section II D, a list of proposal ROI from LiDAR is given, and there will also be a list of proposal ROI from camera by using PVA-lite. Some of ROI from both lists belong to same pedestrian, so an object-matching algorithm to combine or eliminate the ROI from both lists is needed); 
wherein the image capturing device (Section III, Logitech C920 camera, Fig. 1, camera) and the depth sensing device (Section III, VLP-16 LiDAR, Fig. 1, LiDAR) respectively and simultaneity (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. At the same time, the proposed PVA-lite also gives the candidates in camera-based method. The 3D data obtained by the LiDAR and the 2D image obtained by camera are processed for candidate detection at the same time which makes it implicit that both the 3D data and 2D image are obtained simultaneously) obtain the original image (Fig. 1, camera frame catching) and the 3D data (Fig. 1, LiDAR packets catching).
wherein the image capturing device (Section III, Logitech C920 camera, Fig. 1, camera) does not obtain (Section II, “for the occluded pedestrians, the proposed PVA-lite detector might be missed, but LiDAR doesn't. The candidates of occluded objects will go through the second classification process.” If the PVA-lite detector miss detecting the occluded pedestrians, that means the 2D pedestrian feature image is not obtained.) the 2D pedestrian feature image (Section II, the proposed PVA-lite also gives the candidates in camera-based method), the obtained original image (Fig. 1, camera frame catching) is transmitted (Fig. 1 and Fig. 7) to the depth sensing device (Section III, VLP-16 LiDAR, Fig. 1, LiDAR) to perform the second identification process (Fig. 1 and Fig. 7, proposed second classification, Section II E, the candidate ROIs from camera will recollect the 3D point cloud appearing within the ROI of candidates. Then, it performs the feature extraction and 3D classification again, Section II, “for the occluded pedestrians, the proposed PVA-lite detector might be missed, but LiDAR doesn't. The candidates of occluded objects will go through the second classification process.”) to obtain a first interest area (Section III, 3D SVM is used to classify the point cloud that appears within PVA-lite candidates, Section II E, when the whole steps are finished, the final pedestrian detection result based on 2D and 3D classification is given), wherein the depth sensing device (Section III, VLP-16 LiDAR, Fig. 1, LiDAR) identifies the first interest area (Section III, 3D SVM is used to classify the point cloud that appears within PVA-lite candidates) to obtain the 2D pedestrian feature image (Section II E, when the whole steps are finished, the final pedestrian detection result based on 2D and 3D classification is given), and
wherein when the depth sensing device (Section III, VLP-16 LiDAR, Fig. 1, LiDAR) does not obtain (Section II, “if a person is lying on the wall, segmentation will treat him as a non-pedestrian object, but this will be detected by camera, this kind of candidates will also go through the second classification”. If the 3D classification treats the pedestrian object as non-pedestrian object then that means the 3D pedestrian feature map is not obtained) the 3D pedestrian feature image (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method), the 3D data (Fig. 1, LiDAR packets catching) is transmitted (Fig. 1 and Fig. 7) to the image capturing device (Section III, Logitech C920 camera, Fig. 1, camera) for a second detection (Fig. 1 and Fig. 7, proposed second classification, Section II E, the candidate ROIs from LiDAR will crop the image within the ROI, and is classified by PVA-lite to see if it is pedestrian, Section II, “if a person is lying on the wall, segmentation will treat him as a non-pedestrian object, but this will be detected by camera, this kind of candidates will also go through the second classification”) to obtain a second interest area (Section III, use PVA-lite to classify the simple verification candidate ROIs and enhance the occlusion objects detection rate by classifying LiDAR simple verification results by using PVA-lite again, Section II E, when the whole steps are finished, the final pedestrian detection result based on 2D and 3D classification is given), wherein the image capturing device (Section III, Logitech C920 camera, Fig. 1, camera) identifies the second interest area (Section III, use PVA-lite to classify the simple verification candidate ROIs and enhance the occlusion objects detection rate by classifying LiDAR simple verification results by using PVA-lite again) to obtain the 3D pedestrian feature map (Section II E, when the whole steps are finished, the final pedestrian detection result based on 2D and 3D classification is given), and
wherein when the match device (Section III, proposed algorithm is running on the Intel Core i7-5930K with 1.50GHz with 12GB RAM. and NVidia Titan X) projects the 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method) to the 2D pedestrian plane image (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. At the same time, the proposed PVA-lite also gives the candidates in camera-based method. Then the LiDAR's region proposals is projected into image plane and build the ROI for pedestrian detection), the 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method ) is changed from a spherical coordinate to a Cartesian coordinate (It is inherent that if the 3D feature map which is in spherical coordinate is projected onto a 2D plane image which is in Cartesian coordinate, the coordinate of the 3D feature map will be in the coordinate of the 2D image plane which is Cartesian coordinate),
wherein when the 3D pedestrian feature map (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method) is projected to the 2D pedestrian feature plane image (Section II, a proposal segmentation and features give the potential candidates in LiDAR-based method. Then the LiDAR's region proposals is projected into image plane and build the ROI for pedestrian detection), detection points detected by the lasers (Fig. 1, LiDAR is a depth sensing device that includes a plurality of lasers, Section II.A, “The VLP-16 LiDAR is adopted in the proposed design, which has 16 laser bins”) are projected onto a multi-layer grid map (Section II A, a 2D grid map is a big map with many little cells in it, each cell has a cover of ground of N*N cm², the more the cell's number is, the bigger the ground covered. The detected-points in the point cloud, will project into this map), the number of layers of the multi-layer grid map is the same as the number of the lasers (Section II.A, the VLP-16 LiDAR has 16 laser bins and 16 grid maps have been created, called bin maps, and each bin map has an index corresponding to the order of the laser), each object in the multi-layer grid map (Section II A, Fig. 3) is identified as the highest (Section II A, fine map will record the highest index of bin maps in each cell) and lowest points (Section II A, level map will record the lowest index of bin maps in each cell) of the each object for adjustment (Section II A, 16 grid maps are created, called bin map, and each bin map has an index corresponding to the order of laser. The cell in each map covers the ground of 5*5 cm². Each bin projects its height value divided by a cell's height of 10 cm to the corresponding cell. When the 16 bin maps are done, a new map called level map will record the lowest index of bin maps in each cell. Another map, called fine map, will record the highest index of each bin map, which is continuous occupied due to the value of level map), the object is cut from the multi-layer grid map (Section II A, Fig. 3) to determine which one grid belong to the object through continuous grid values (Section II A, the 16 bin maps, level map, and fine map will help to split the pedestrian with other objects, such as a person and bus stop. The way to overcome this problem is to examine the cell in fine map with the cell in level map. As seen in Section II A, the map is continuously occupied due to the value of level map.), and however, if the continuous grids have different heights from each other (Fig. 3, continuous grids have different heights from each other as seen on the fine map and level map), the object will be cut and adjusted if the lowest point of the grid around the adjacent continuous grid is higher than the highest point of the continuous grid (Section II A, if the difference of neighbor cell's value in level map and this cell's value in fine map is higher than 2, a laser bin may scan through between this two cells. Then the neighbor occupied cell will be deleted for separation from object to object. The difference between the cell's value in the fine map, which is the highest point, and the neighbor cell's value in the level map, which is the lowest point, is higher than 2, which means the lowest point of the neighbor cell is higher than the highest point of the other cell, so the neighbor occupied cell will be deleted for separation from the object),
wherein in the multi-layer grid map (Section II A, Fig. 3), whether the values in the continuous grid are all 1 (meaning there is a value) to determine whether an object under test (Section II A, if the cell is nonzero value, it’s occupied, after constructing the fine map and level map, the differential map is constructed by assigning the cell to 1 if the changing of height in that cell is over 10 cm), and if the highest point of some of the continuous grids in the continuous grid in the object to be tested is smaller than the lowest point of the surrounding grid (Section II A, if the difference of neighbor cell's value in level map and this cell's value in fine map is higher than 2, a laser bin may scan through between this two cells. Then the neighbor occupied cell will be deleted for separation from object to object. The difference between the cell's value in the fine map, which is the highest point, and the neighbor cell's value in the level map, which is the lowest point, is higher than 2, which means the lowest point of the neighbor cell is higher than the highest point of the other cell, so the neighbor occupied cell will be deleted for separation from the object), the object grid will be adjusted for cutting (Section II A, if the difference of neighbor cell's value in level map and this cell's value in fine map is higher than 2, a laser bin may scan through between this two cells. Then the neighbor occupied cell will be deleted for separation from object to object), and the object to be tested will be post-processed to select the object to be tested that matches the pedestrian range (Section II A, Fig. 2, after the object's cells splitting, the connected-component labeling is used to group the cells, which are connected together. The labeling map is called blob map. The final step is to collect the object's points, which appears in the corresponding occupied cells in the blob map), and
wherein the multi-layer grid map (Section II A, Fig. 3) is further used to distinguish between obstacles of the different heights points (Section II A, “the difference of neighbor cell's value in level map and this cell's value in fine map”, the level map record the lowest index of each bin map or lowest values in the continuous grid and fine map record the highest index of each bin map or the highest values in the continuous grid).
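
For illustration only, the grid-map segmentation that the mapping above attributes to Wu, Section II.A, together with the spherical-to-Cartesian conversion, can be sketched in Python as follows. This is a simplified sketch under stated assumptions and is not Wu's implementation: the function names, the axis convention of the conversion, the use of the laser index as the layer value (rather than height divided by 10 cm), and the 4-neighbor connectivity are all assumptions introduced here. Only the 5 cm cell size, the level map (lowest index), the fine map (highest continuously occupied index), the "higher than 2" deletion rule, and the final connected-component grouping come from the cited passages.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 0.05   # 5 cm x 5 cm ground cell, per the cited passage
GAP_THRESHOLD = 2    # "higher than 2" rule quoted from Wu, Section II.A


def spherical_to_cartesian(r, azimuth, elevation):
    """Convert one return (range, angles in radians) to x, y, z.

    Axis convention is an assumption: azimuth is measured from the x axis.
    """
    x = r * math.cos(elevation) * math.cos(azimuth)
    y = r * math.cos(elevation) * math.sin(azimuth)
    z = r * math.sin(elevation)
    return x, y, z


def build_layers(returns):
    """Map each ground cell to the set of laser indices that hit it.

    `returns` is an iterable of (laser_index, r, azimuth, elevation).
    """
    layers = defaultdict(set)
    for laser_index, r, azimuth, elevation in returns:
        x, y, _ = spherical_to_cartesian(r, azimuth, elevation)
        cell = (math.floor(x / CELL_SIZE_M), math.floor(y / CELL_SIZE_M))
        layers[cell].add(laser_index)
    return layers


def level_and_fine(layers):
    """Level map: lowest index per cell. Fine map: highest index that is
    continuously occupied starting from the level-map value."""
    level, fine = {}, {}
    for cell, hits in layers.items():
        lowest = min(hits)
        highest = lowest
        while highest + 1 in hits:
            highest += 1
        level[cell], fine[cell] = lowest, highest
    return level, fine


def neighbors(cell):
    ix, iy = cell
    return ((ix + 1, iy), (ix - 1, iy), (ix, iy + 1), (ix, iy - 1))


def split_cells(level, fine):
    """Delete a neighbor whose lowest layer lies more than GAP_THRESHOLD
    above this cell's highest continuous layer; return the kept cells."""
    deleted = set()
    for cell, top in fine.items():
        for other in neighbors(cell):
            if other in level and level[other] - top > GAP_THRESHOLD:
                deleted.add(other)
    return set(level) - deleted


def label_blobs(cells):
    """Group the kept cells into 4-connected components (a "blob map")."""
    blobs, seen = [], set()
    for start in sorted(cells):
        if start in seen:
            continue
        seen.add(start)
        stack, blob = [start], set()
        while stack:
            cell = stack.pop()
            blob.add(cell)
            for other in neighbors(cell):
                if other in cells and other not in seen:
                    seen.add(other)
                    stack.append(other)
        blobs.append(blob)
    return blobs
```

In this sketch a cell whose only returns come from layers well above its neighbor's top layer (for example, a bus-stop roof next to a person) is removed before grouping, so the two objects fall into separate blobs.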

Wu does not explicitly disclose obtaining the 2D pedestrian feature image according to a default threshold.
	However, Chen discloses obtaining the 2D pedestrian feature image (Chen, see Section 3.2: Chen uses an SVM to create candidate proposals for object detection; Wu likewise uses an SVM for the second classification of the obtained original image, as seen in Fig. 1 of Wu) according to a default threshold (Chen, see Section 4 and Fig. 4: Chen uses a threshold to evaluate the candidate proposals, namely an overlap threshold of 0.5 for pedestrians and 0.7 for cars).
	Wu and Chen are both considered to be analogous to the claimed invention because they are in the same field of object detection. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the default threshold of Chen in the system of Wu, because doing so would predictably allow the method of Wu to obtain high-quality object detections owing to the different thresholds for different classes (Chen, Section 5) and would make it more robust to ground plane errors (Chen, Section 3.4).
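
For illustration only, a class-specific overlap threshold of the kind cited from Chen can be sketched in Python as follows. The sketch assumes that "overlap" means intersection-over-union of axis-aligned boxes given as (x1, y1, x2, y2), which this action does not state; the function names and the box format are likewise assumptions introduced here. Only the values 0.5 for pedestrians and 0.7 for cars come from the cited passage.

```python
# Class-specific overlap thresholds as cited from Chen, Section 4.
OVERLAP_THRESHOLD = {"pedestrian": 0.5, "car": 0.7}


def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def passes_threshold(proposal, reference, object_class):
    """True if the proposal overlaps the reference box by at least the
    threshold configured for the given class."""
    return iou(proposal, reference) >= OVERLAP_THRESHOLD[object_class]
```

With these values, a proposal that overlaps a reference box by 0.6 would be accepted for the pedestrian class and rejected for the car class.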

Wu does not explicitly disclose obtaining the 3D pedestrian feature map according to another default threshold.
	However, Sun discloses obtaining the 3D pedestrian feature map (Sun, see Abstract and Fig. 1: Sun uses a PVANet to generate feature maps, which are used to generate pedestrian candidates; Wu uses PVA-lite, see Fig. 1 of Wu, for the second classification of the obtained 3D data, and PVA-lite is based on PVANet as seen in Section II C of Wu) according to another default threshold (Sun, see Section 3.3: Sun uses a threshold of 0.7 for the pedestrian detection).
	Wu and Sun are both considered to be analogous to the claimed invention because they are in the same field of object detection. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the default threshold of Sun in the system of Wu, because doing so would predictably ensure that the sample is correctly classified (Sun, Section 2.3).

Wu does not explicitly disclose when one grid in the multi-layer grid map at the same location has a breakpoint in different grid maps (no value) , this means that this position and height of the breakpoint are penetrated by the laser, and the objects at both ends are not the same object, and the breakpoint is defined as a passage of two objects of different heights at this position with more than two lasers passing apart.
	However, Oh discloses when one grid in the multi-layer grid map (Section I para. 5, “Grid Cell Cluster (GCC), clusters grid cells by projecting 3D data onto the superpixels of the image data.”, Abstract, “a GCC consisting of several grid cells”, Fig. 4 shows the creation of the grid map, Section I, “we propose an efficient and effective manner of representing and predicting grid states using a multi-layer LIDAR and a stereo vision sensor.”, Section II, “a particle in a LIDAR sensor is the reflectance of a beam”) at the same location has a breakpoint (Fig. 1, the same location has a breakpoint or space: some of the cells in the same row are blank or free cells, and a red cell indicates that it is occupied; Fig. 4 shows the creation of the grid map clearly, and in Fig. 4.d there are breakpoints between the data in the grid map) in different grid maps (no value) (Section II.A, “In our work, the (X, Y, Z) axes denote a progression from left to right, from bottom to up, and from near to far with regard to the egovehicle, respectively. Here, θ_k denotes the sensor observations at time k. We use a 2D grid map as an occupancy grid map on the (X, Z) axis, where 3D measurements from LIDAR and stereo camera are binned. Following this, to filter the occupancy grid cells, we represent the states of each grid cell c (c = 1, ..., N) at time k, where p(W_k^c | θ_k) contains dynamic states p(x_k^c | θ_k) and the occupancy state p(o_k^c | θ_k). From the represented grid states, the state of each grid cell at time k+1 is predicted as p(W_{k+1}^c | θ_{1:k}).”, Section II A para. 2, “The occupancy state of grid cell c at time k, p(o_k^c | θ_{1:k}) can be occupied or free as o_k^c ∈ {O_k^c, F_k^c}”, so the cell can have a value (occupied) or no value (free), which can be clearly seen in Fig. 1), this means that this position and height of the breakpoint are penetrated by the laser (Section II.A, “The occupancy probability of grid cell c is measured as the ratio of the number of occupied particles (∑_{i=1}^{N_p^k} γ_k^{c,i}) to the possible number of particles in the cell c (Γ_max^c)” and “The free probability that of grid cell c is free can be defined as follows: p(F_k^c | θ_{1:k}) = 1 − p(O_k^c | θ_{1:k})”, which means the cell is occupied if the laser touches a particle; otherwise it is free because the laser just goes through the same position and no particle is sensed by the laser, so it is a breakpoint or it is being penetrated by the laser or beam; in Section II, “a particle in a LIDAR sensor is the reflectance of a beam”, so if it is not a particle then the beam is not breaking through it; this can be clearly seen in Fig. 1 and Fig. 4), and the objects at both ends are not the same object (Fig. 1, the objects are different when there are free cells in between the same location), and the breakpoint is defined as a passage of two objects of different heights at this position with more than two lasers passing apart (Wu teaches that each map is defined by a different height and “if the difference of neighbor cell’s value in level map and this cell’s value in fine map is higher than 2, a laser bin may scan through between two cells” (Wu, Section II.A), and Oh teaches in Fig. 1 that the two different objects have at least two free cells in between each object, a red cell indicating that it is occupied; as seen in Fig. 4, the objects are of different heights, so Oh teaches that the breakpoint of free cells defines a passage or space between two objects of different heights, and, as seen in Fig. 1 and Fig. 4, the grid cells from the multi-layer LiDAR have two lasers passing apart between the objects).
	Wu and Oh are both considered to be analogous to the claimed invention because they are in the same field of multi-sensor object detection. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as taught by Wu to incorporate the teachings of Oh, wherein when one grid in the multi-layer grid map at the same location has a breakpoint in different grid maps (no value), this means that this position and height of the breakpoint are penetrated by the laser, and the objects at both ends are not the same object, and the breakpoint is defined as a passage of two objects of different heights at this position with more than two lasers passing apart. The motivation for the proposed modification would have been that “the execution time for predictions and updates is faster” and “the predictions are more accurate” (Oh, Section I, Paragraph 5).
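
For illustration only, the occupancy ratio quoted from Oh and the notion of a breakpoint between occupied layers at one grid location can be sketched in Python as follows. This is not Oh's or Wu's implementation: the function names, the representation of one grid location as a list of 0/1 values ordered by layer, and the reading of "more than two lasers" as at least three consecutive empty layers are assumptions introduced here.

```python
def occupancy_probability(occupied_particles, max_particles):
    """Ratio of occupied particles to the possible number of particles
    in a cell, as in the passage quoted from Oh, Section II.A."""
    if max_particles <= 0:
        raise ValueError("max_particles must be positive")
    return min(occupied_particles / max_particles, 1.0)


def free_probability(occupied_particles, max_particles):
    """Free probability defined as 1 minus the occupancy probability."""
    return 1.0 - occupancy_probability(occupied_particles, max_particles)


def find_breakpoints(column, min_empty_layers=3):
    """Return (first, last) layer indices of each run of empty layers that
    lies between two occupied layers at one grid location.

    `column` holds one 0/1 value per layer, lowest layer first.
    `min_empty_layers=3` is an assumed reading of "more than two lasers".
    """
    occupied = [i for i, value in enumerate(column) if value]
    gaps = []
    for lower, upper in zip(occupied, occupied[1:]):
        empty = upper - lower - 1
        if empty >= min_empty_layers:
            gaps.append((lower + 1, upper - 1))
    return gaps
```

Under these assumptions, a location whose layers read 1, 1, 0, 0, 0, 1, 1 has one breakpoint spanning layers 2 through 4, so the returns below and above it would be attributed to different objects.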

Regarding claim 15, the combination of Wu in view of Chen and in further view of Sun and in further view of Oh teaches the system (Wu, Fig. 1 and Fig. 5, mechanism of the proposed sensor fusion design) according to claim 10, wherein the image capturing device (Wu, Section III, Logitech C920 camera, Fig. 1, camera) detects the pedestrian (Wu, Fig. 1, Section II, use two sensors to find out the pedestrian region proposals in the first classification, Section II C, PVA-Lite classification, detect only the classes that are needed in the applications, that is two classes, i.e. pedestrian and background by using PVA-lite which is a simpler model of PVANET that is used in many modern CNN detection) in the original image (Wu, Fig. 1, camera frame catching) by a depth learn technology (Wu, Fig. 1, Section II C, deep learning PVA-lite).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DENISE G ALFONSO whose telephone number is (571)272-1360. The examiner can normally be reached Monday - Friday 7:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang can be reached on 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DENISE G ALFONSO/Examiner, Art Unit 2663       

/CLAIRE X WANG/Supervisory Patent Examiner, Art Unit 2663