DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Original claims 1-35 are pending in the instant application.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on July 14, 2020, is being considered by the examiner.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.



Claim(s) 1, 10-12, 14-16, 18, 27-29 and 31-33 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 4, 5, 9-10, 12, 17 and 20 of copending Application No. 16/876,699 in view of ‘Wirges’ (“Self-Supervised Flow Estimation using Geometric Regularization with Applications to Camera Image and Grid Map Sequences,” 2019).  See the table below, which identifies corresponding claims of the instant application and the copending application.
The claims of the copending application disclose all elements of the claims of the instant application, except for using the flow estimation to perform self-supervised learning.
However, Wirges does teach using flow estimations to perform self-supervised learning (Sec. III teaches loss functions used to perform self-supervised learning; Sec. V teaches application of this self-supervised learning technique to neural-network-based flow estimations on point cloud data).
Wirges teaches that many flow estimation techniques “need labeled training data, either simulated or annotated by humans” (Sec. I, 4th par.), but this has drawbacks including the suboptimal realism of simulated data and the cumbersome and tedious nature of manual annotation (Sec. I, 4th par.).  Wirges teaches that its techniques are able to avoid these disadvantages by learning flow estimators without the need of annotated training data (Sec. I, 5th par.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the claimed inventions of the copending application with the self-supervised learning of Wirges in order to improve the inventions with the reasonable expectation that this would result in claimed inventions that were able to train their flow estimators without a need for labeled training data, thereby avoiding drawbacks associated with a lack of realism and/or a cumbersome and tedious annotation process.
Therefore, the claims of the instant application are not patentably distinct from the claims of the copending application.
This is a provisional nonstatutory double patenting rejection.

Corresponding Claims
Claim of Instant Application No. 16/876,751    Claim of Co-Pending Application No. 16/876,669
1                                              1
10                                             4
11                                             5
12                                             9
14                                             10
15                                             12
16                                             17
18                                             20
27                                             4
28                                             5
29                                             9
31                                             10
32                                             12
33                                             17



Claim Objections
Claim(s) 2, 7, 18-19, 24 and 35 is/are objected to because of the following informalities:
In claim 2, line 1 on page 58, the term “BeV” is undefined.  The claim should be amended to either recite “bird’s-eye-view” or define the term BeV.
Claim 19 should be corrected in the same manner as claim 2.
In claim 7, line 2, “from to” should be “from” (i.e. delete “to”).
Claim 24 should be corrected in the same manner as claim 7.
In claim 18, insert “and” at the end of the third line.
In claim 35, second-to-last line, “performing” should be “perform”.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim(s) 4, 12-16, 21, and 29-33 is/are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 4 recites the limitation "the cost function" in line 2.  There is insufficient antecedent basis for this limitation in the claim.
Claim 21 recites a similar limitation and is indefinite for substantially the same reasons as claim 4.


Claim 12 recites the limitation "the pseudo image" in line 1.  There is insufficient antecedent basis for this limitation in the claim.
Claim 29 recites a similar limitation and is indefinite for substantially the same reason as claim 12.  Claims 13-14 and 30-31 are also indefinite at least because they include the limitations of claim 12 or claim 29.

Claim 15 recites “compute a single mean velocity and co-variance for each obstacle cluster” (last line).  The meaning of this limitation is unclear.  There is no prior recitation of any obstacle clusters in the claim.  In fact, there are no prior recitations of any obstacle or any cluster.  This makes the scope of the claim indefinite at least because it is unclear what the obstacle clusters are, how many there are, and how they relate to the other elements of the claimed invention, such as the object.
Claim 32 is also indefinite for substantially the same reasons as claim 15.

Claim 16 recites “The method of claim 1, wherein the method is performed using three or more sets of 3D point cloud data of the scene, including” (first two lines) followed by a series of steps.  It is unclear how the steps recited in claim 16 relate to the steps recited in claim 1, and this ambiguity renders the claim indefinite.
Some of the steps listed in claim 16 do not correspond to the steps listed in claim 1.  For example, claim 16 recites “aligning all of the point cloud data sets into the same coordinate frame” (lines 2-3), but this step is not recited in claim 1.  This appears to indicate that claim 16 is adding steps to the method of claim 1.
However, some of the steps listed in claim 16 substantially correspond to steps listed in claim 1.  For example, claim 1 recites “encoding data of the point cloud data sets using a pillar feature network to extract two-dimensional (2D) bird’s-eye-view embeddings for each of the point cloud data” and claim 16 similarly recites “encoding data of each of the point cloud data sets using a pillar feature network to extract two-dimensional (2D) bird’s-eye-view embeddings for each of the point cloud data sets”.  In another example, both claims recite “performing a 2D optical flow estimation to estimate the velocity of the object”.  On the one hand, this substantial repetition appears to indicate that claim 16 is referring back to claim 1.  On the other hand, the use of the word “a” (i.e. “a pillar feature network” and “a 2D optical flow estimation”) suggests that new steps are being introduced to the claim.
Claim 16 is indefinite because it is unclear how its scope relates to the scope of claim 1, and what steps are being added by claim 16 with respect to claim 1.
Claim 33 is also indefinite for substantially the same reasons as claim 16.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 35 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  
Claim 35 is directed to a “system” comprising “a pillar feature network” and “a feature pyramid network”, which are both interpreted to be neural networks in light of the specification.  A neural network does not necessarily have any physical or tangible form.  For example, a neural network can exist as a set of rules (i.e. a form of information) or be embodied as a computer program.  Claim 35 does not contain any structural recitations beyond these two networks.
“Non-limiting examples of claims that are not directed to any of the statutory categories include: … Products that do not have a physical or tangible form, such as information (often referred to as "data per se") or a computer program per se (often referred to as "software per se") when claimed as a product without any structural recitations.”  MPEP 2106.03, Subsection I.
Claim 35 is not patent-eligible under 35 U.S.C. 101 because its scope includes embodiments that are products without a physical or tangible form, such as information or a computer program, which are not directed to any of the statutory categories of inventions.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.


The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-3, 10-14, 18-20, and 27-31 is/are rejected under 35 U.S.C. 103 as being unpatentable over ‘Wirges’ (“Self-Supervised Flow Estimation using Geometric Regularization with Applications to Camera Image and Grid Map Sequences,” 2019) in view of ‘Lang’ (“PointPillars: Fast Encoders for Object Detection from Point Clouds,” 2019).
Regarding claim 1, Wirges teaches a method for determining velocity of an object associated with a three-dimensional (3D) scene (e.g. Fig. 1, flow associated with 3D scene is estimated; also see e.g. Fig. 3, velocity of a moving vehicle object is determined), the method comprising:
receiving two sets of 3D point cloud data of the scene from two consecutive point cloud sweeps (e.g. Sec. III, 1st par., two consecutive frames are received; e.g. Sec. V., 1st par., the input frames may be lidar measurements – i.e. point cloud sweeps); 
encoding data of the point cloud data sets using a pillar feature network to (see Note Regarding Pillar Network below) extract two-dimensional (2D) bird's-eye-view embeddings for each of the point cloud data, wherein first 2D bird's-eye-view embeddings for a first of the two point cloud data sets comprises pillar features for the first point cloud data set and second 2D bird's-eye-view embeddings for a second of the two point cloud data sets comprises pillar features for the second point cloud data set (Sec. V, 1st par., the lidar measurements are converted to first and second grid map embeddings; Sec. V, 1st par., the grid maps are “top-view” – i.e. bird’s eye view; Sec. V, 1st par., and Sec. V.A.b, the grid map stores multiple features of the lidar points lying above each discrete cell along the ground surface – i.e. lying within a “pillar” extending up from the ground surface within a cell’s boundaries; For at least these reasons, the features of the grid map can be considered “pillar features”; see Note Regarding Pillar Network below); 
performing a 2D optical flow estimation using an optical flow network to estimate the velocity of the object (Fig. 3, Sec. V, 2nd par., velocity is estimated for each cell, including for cells corresponding to a vehicle object; Sec. V.A, 1st par., the velocity estimate is from FlowNetC or PWCNet optical flow networks); and 
using the flow estimation to perform self-supervised learning (Sec. III, self-supervised learning is performed; e.g. Sec. III.B, eq. 8, motion consistency loss term is based on the flow estimation $f$; Also see Sec. V.A.c, eq. 12).

Note Regarding Pillar Network.  Wirges forms its grid maps by partitioning the ground plane into a regular grid (Sec. V, 1st par.; Sec. V.A.b) and then performing a hand-crafted feature encoding method on the points in each grid cell (Sec. V, 1st par., hand-crafted features are encoded, such as number of surface reflections, minimum and maximum height above ground, average reflection intensity, etc.).
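For illustration only, the kind of hand-crafted, per-cell encoding described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Wirges: the function name `encode_grid_map`, the `cell_size` parameter, the dictionary layout, and the use of a simple point count as a stand-in for the number of surface reflections are all hypothetical, and the z coordinate is assumed to already be a height above ground.

```python
from collections import defaultdict

def encode_grid_map(points, cell_size=1.0):
    """Hand-crafted top-view encoding of (x, y, z, intensity) points.

    Returns {(col, row): features}; z is assumed to be height above ground.
    """
    cells = defaultdict(list)
    for x, y, z, intensity in points:
        # Discretize the ground plane; each cell collects the points above it.
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append((z, intensity))

    grid = {}
    for key, members in cells.items():
        heights = [z for z, _ in members]
        intensities = [i for _, i in members]
        # Fixed (non-learned) statistics computed per cell.
        grid[key] = {
            "count": len(members),
            "min_height": min(heights),
            "max_height": max(heights),
            "mean_intensity": sum(intensities) / len(intensities),
        }
    return grid

points = [(0.2, 0.3, 0.1, 0.5), (0.8, 0.9, 1.5, 0.7), (2.4, 0.1, 0.4, 0.2)]
grid_map = encode_grid_map(points)
```

Every feature in this sketch is a fixed statistic chosen by the designer, which is the property that distinguishes this style of encoding from a learned one.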
Wirges does not teach encoding data of the point cloud data sets using a pillar feature network to extract the 2D bird’s-eye-view embeddings for each of the point cloud data.
However, Lang does teach techniques for encoding data of a point cloud data set using a pillar feature network (e.g. Fig. 2, Pillar Feature Net), which extracts a 2D bird’s-eye-view embedding for each point cloud (Sec. 2.1, point cloud is separated into set of pillars, which are then input to a simplified version of PointNet to extract features, which are then scattered back to original pillar locations to generate a pseudo image, which is a 2D bird’s-eye-view embedding for the point cloud), the 2D bird’s-eye-view embedding comprising pillar features for the point cloud data set (Sec. 2.1, last two pars., pseudo image includes $C$ features for each pillar).
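For illustration only, the pillar-then-scatter pipeline summarized above can be sketched in a heavily simplified form. This is a sketch under stated assumptions, not Lang's network: the helper names, the fixed `weights` and `bias` values, and the grid dimensions are hypothetical, a single linear layer with ReLU stands in for the simplified PointNet, and the point augmentation and normalization layers of the actual network are omitted.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def linear(weights, bias, point):
    # weights has C rows, each with one entry per point coordinate.
    return [sum(w * p for w, p in zip(row, point)) + b
            for row, b in zip(weights, bias)]

def pillar_pseudo_image(points, weights, bias, cell_size, width, height):
    """Group points into pillars, embed each point, max-pool per pillar,
    then scatter the C pillar features to a (C, height, width) grid."""
    channels = len(weights)
    pillars = {}
    for point in points:
        col, row = int(point[0] // cell_size), int(point[1] // cell_size)
        if 0 <= col < width and 0 <= row < height:
            pillars.setdefault((row, col), []).append(point)

    # Empty pillars keep an all-zero feature vector.
    image = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for (row, col), members in pillars.items():
        embedded = [relu(linear(weights, bias, p)) for p in members]
        for c in range(channels):
            image[c][row][col] = max(e[c] for e in embedded)
    return image

# Hypothetical fixed weights (C = 2 channels over x, y, z); in a real
# pillar feature network these values would be learned during training.
weights = [[0.0, 0.0, 1.0], [1.0, -1.0, 0.0]]
bias = [0.0, 0.0]
points = [(0.2, 0.3, 0.1), (0.8, 0.9, 1.5), (1.4, 0.1, 0.4)]
pseudo_image = pillar_pseudo_image(points, weights, bias, 1.0, width=2, height=2)
```

The structural difference from a fixed encoder is that the per-point embedding is parameterized, so the features stored in each cell of the pseudo image can be adjusted by training rather than specified by hand.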
Lang recognizes that the grid partitioning and hand-crafted feature encoding approach used by Wirges (see above) is a known workaround to the problem of sparsity in bird’s-eye-view lidar point clouds (Pg. 2, first two pars., especially 2nd sentence of 2nd par.).  However, Lang teaches that this known workaround used by Wirges “may be sub-optimal since the hard-coded feature extraction may not generalize to new configurations without significant engineering efforts” (Pg. 2, 2nd par.).  Lang teaches that its PointPillars approach is advantageous because “by learning features instead of relying on fixed encoders, PointPillars can leverage the full information represented by the point cloud” (Pg. 2, 3rd par.).  Indeed, Lang’s results demonstrate that “learning the feature encoding is strictly superior to fixed encoders across all resolution” (Sec. 7.4, 2nd par.; Table 4).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the method of Wirges with the pillar feature network encoding of Lang in order to improve the method with the reasonable expectation that this would result in a method that used learned features instead of fixed encoders, and therefore could generalize to new configurations without significant engineering efforts, leverage the full information represented by a point cloud, and/or achieve superior performance.  This technique for improving the method of Wirges was within the ordinary ability of one of ordinary skill in the art based on the teachings of Lang.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Wirges and Lang in order to obtain the invention as specified in claim 1.

Regarding claim 2, Wirges in view of Lang teaches the method of claim 1, and Wirges further teaches that using the flow estimation to perform self-supervised learning comprises minimizing a distance between the first and second bird's-eye-view embeddings (Sec. III, the self-supervised learning comprises minimizing loss $L$; e.g. Sec. III.A, eq. 4, the loss includes a data consistency term that measures a distance between pixel values of the first and second pseudo images, which are the first and second bird’s-eye-view embeddings) and learning to predict a BeV flow estimator that is consistent with motion of the pillar features without needing data labels (Sec. III, the self-supervised training learns to estimate flow by minimizing loss function $L$, which does not require any data labels; Sec. V.A, 1st par., the BeV flow estimator is FlowNetC or PWCNet).

Regarding claim 3, Wirges in view of Lang teaches the method of claim 1, and Wirges further teaches that the 2D optical flow estimation comprises performing a forward flow estimate for flow from the first 2D bird's-eye-view embeddings to the second 2D bird's-eye-view embeddings and a reverse flow estimate for flow from the second 2D bird's-eye-view embeddings to the first 2D bird's-eye-view embeddings (Sec. III, eq. 3 and surrounding text, both forward (i.e. $2 \leftarrow 1$) and backward/reverse (i.e. $1 \leftarrow 2$) flows are estimated in order to compute loss function).
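For illustration only, a label-free loss that combines a forward term and a backward term can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce eq. 3 or eq. 4 of Wirges: the function names are hypothetical, flows are restricted to integer cell displacements so that no interpolation is needed, and the regularization terms of the reference are omitted.

```python
def data_loss(source, target, flow):
    """Mean absolute difference between source cells and the target cells
    they map to under an integer flow field (flow[r][c] = (dr, dc))."""
    rows, cols = len(source), len(source[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            dr, dc = flow[r][c]
            tr, tc = r + dr, c + dc
            if 0 <= tr < rows and 0 <= tc < cols:  # skip cells that leave the grid
                total += abs(source[r][c] - target[tr][tc])
                count += 1
    return total / count if count else 0.0

def bidirectional_loss(frame1, frame2, flow_fwd, flow_bwd):
    # Forward term uses the flow from frame 1 to frame 2; backward term the reverse.
    return data_loss(frame1, frame2, flow_fwd) + data_loss(frame2, frame1, flow_bwd)

# A single bright cell moves one column to the right between the two frames.
frame1 = [[0, 9, 0], [0, 0, 0]]
frame2 = [[0, 0, 9], [0, 0, 0]]
right = [[(0, 1)] * 3 for _ in range(2)]
left = [[(0, -1)] * 3 for _ in range(2)]
zero = [[(0, 0)] * 3 for _ in range(2)]

good = bidirectional_loss(frame1, frame2, right, left)  # flows match the motion
bad = bidirectional_loss(frame1, frame2, zero, zero)    # flows ignore the motion
```

In this toy case the loss is 0.0 when the forward and backward flows match the actual motion and 6.0 when both flows are zero, so minimizing it rewards correct flow without any annotated ground truth.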

Regarding claim 10, Wirges in view of Lang teaches the method of claim 1.
Wirges teaches extracting pillar features from each of a first point cloud data set and a second point cloud data set (see mapping of claim 1 and e.g. Sec. V, 1st par.), wherein the first point cloud data set represents the scene at a time t - 1 and the second point cloud data set represents the scene at a time t subsequent to the time t - 1 (e.g. Sec. V, 2nd par., “two subsequent frames” of data are used – i.e. one frame is at time t - 1 and another is at a subsequent time t; Also see Sec. III, 1st par.).
As explained above with respect to claim 1, Wirges has been modified to extract pillar features from the first point cloud data set and separately from the second point cloud data set using the pillar feature network taught by Lang.  
Therefore, in the method of Wirges in view of Lang as applied above, receiving two sets of 3D point cloud data of the scene comprises receiving the first point cloud data set by a first pillar feature network and receiving a second point cloud data set by a second pillar feature network, wherein the first point cloud data set represents the scene at a time t - 1 and the second point cloud data set represents the scene at a time t subsequent to the time t - 1.

Regarding claim 11, Wirges in view of Lang teaches the method of claim 1, and Lang further teaches that encoding data of the point cloud data sets comprises voxelizing the point cloud data sets to render surfaces in the data sets onto a grid of discretized volume elements in a 3D space to create a set of pillars (Sec. 2.1, especially 2nd par.).

Regarding claim 12, Wirges in view of Lang teaches the method of claim 1, and Wirges further teaches warping the pseudo image of the first point cloud data set to align the pseudo image of the first point cloud data set with the pseudo image of the second point cloud data set (Sec. III.B, $\hat{f}_{\mathrm{motion},\,2 \leftarrow 1}$ warps pseudo image of first point cloud – i.e. frame 1 – to align it to pseudo image of the second point cloud – i.e. frame 2; Also see Sec. V.A.c).

Regarding claim 13, Wirges in view of Lang teaches the method of claim 12, and Wirges further teaches warping the pseudo image of the second point cloud data set to align the pseudo image of the second point cloud data set with the pseudo image of the first point cloud data set (Sec. III.B, $\hat{f}_{\mathrm{motion}}$ is determined; As explained with respect to claim 12, this includes determining $\hat{f}_{\mathrm{motion},\,2 \leftarrow 1}$ for warping from frame 1 to frame 2; Sec. III, eq. 3 and surrounding text, analogous calculations are performed to calculate loss in backward direction, i.e. from frame 2 to frame 1; The backward loss would include determining $\hat{f}_{\mathrm{motion},\,1 \leftarrow 2}$ for warping from frame 2 to frame 1 – i.e. warping the pseudo image of the second point cloud data set to align it with the pseudo image of the first point cloud data set).

Regarding claim 14, Wirges in view of Lang teaches the method of claim 12, and Wirges further teaches that the 2D optical flow estimation further comprises computing a cost function of the warped pseudo image of the first point cloud data set and the pseudo image of the second point cloud data set, by identifying displacement of a feature from the first image to the second image (Sec. III.B, eq. 8, motion consistency portion of cost function identifies feature displacement using $\hat{f}_{\mathrm{motion}}$).

Regarding claim 18, Examiner notes that the claim recites a system, comprising: a non-transitory memory configured to store instructions; at least one processor configured to execute the instructions and to perform the operations: a method that is substantially the same as the method of claim 1.
Wirges in view of Lang teaches the method of claim 1 (see above).
While Wirges certainly implies the use of a computer to perform its method (e.g. Sec. V.A, 1st par.), Wirges does not explicitly teach implementing its method as a system, comprising: a non-transitory memory configured to store instructions; at least one processor configured to execute the instructions and to perform the operations of the method.
Lang also does not explicitly teach these features.
However, Examiner takes Official Notice that it is old and well-known in the art of image analysis to implement a method as a system, comprising: a non-transitory memory configured to store instructions; at least one processor configured to execute the instructions and to perform the operations of the method.  Such computer implementation advantageously allows an image processing method to be performed quickly and efficiently.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to implement the method of Wirges in view of Lang as applied above as a system, comprising: a non-transitory memory configured to store instructions; at least one processor configured to execute the instructions and to perform the operations of the method, with the reasonable expectation that this would result in a method that could be executed quickly and efficiently.  
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Wirges and Lang in order to obtain the invention as specified in claim 18.	

Regarding claim 19, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 2.  Wirges in view of Lang teaches the invention of claim 2.  Accordingly, claim 19 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang for substantially the same reasons as claim 2.

Regarding claim 20, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 3.  Wirges in view of Lang teaches the invention of claim 3.  Accordingly, claim 20 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang for substantially the same reasons as claim 3.

Regarding claim 27, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 10.  Wirges in view of Lang teaches the invention of claim 10.  Accordingly, claim 27 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang for substantially the same reasons as claim 10.

Regarding claim 28, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 11.  Wirges in view of Lang teaches the invention of claim 11.  Accordingly, claim 28 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang for substantially the same reasons as claim 11.

Regarding claim 29, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 12.  Wirges in view of Lang teaches the invention of claim 12.  Accordingly, claim 29 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang for substantially the same reasons as claim 12.

Regarding claim 30, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 13.  Wirges in view of Lang teaches the invention of claim 13.  Accordingly, claim 30 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang for substantially the same reasons as claim 13.

Regarding claim 31, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 14.  Wirges in view of Lang teaches the invention of claim 14.  Accordingly, claim 31 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang for substantially the same reasons as claim 14.


Claim(s) 4 and 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang as applied above, and further in view of ‘Sun’ (“PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume,” 2018).
Regarding claim 4, Wirges in view of Lang teaches the method of claim 1.
Wirges further teaches that self-supervised learning comprises minimizing the cost function for a forward and backward flow (Sec. III, eq. 3).
Wirges teaches learning a PWCNet (Sec. V.A, 1st par.), but does not teach details of this network.  In particular, Wirges does not teach that, when learning a PWCNet, the cost function is minimized for each of a plurality of hierarchical resolutions for a feature pyramid.
Lang also does not explicitly teach this feature.
However, Sun does teach details of the PWCNet (e.g. Fig. 3), including that it is learned by minimizing a cost function for each of a plurality of hierarchical resolutions for a feature pyramid (e.g. Pg. 8936, last par., images are used to generate an $L$-level pyramid storing a hierarchy of resolutions; e.g. Pg. 8938, eqs. 3 and 4, loss $L$ is a sum of losses for each of $L$ pyramid levels).
Minimizing a cost function for each level of a feature pyramid, as taught by Sun, advantageously allows performance of a neural network to be improved over the variety of scales represented by the pyramid.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the method of Wirges in view of Lang as applied above with the pyramid cost function minimization of Sun in order to improve the method with the reasonable expectation that this would result in a method that used an approach that was demonstrated by Sun to work with a PWCNet, and that allowed performance of the PWCNet to be improved over the variety of scales represented by the pyramid.  This technique for improving the method of Wirges in view of Lang was within the ordinary ability of one of ordinary skill in the art based on the teachings of Sun.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Wirges, Lang, and Sun to obtain the invention as specified in claim 4.	
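For illustration only, the per-level cost minimization discussed above for claim 4 can be sketched as a weighted sum of per-level costs. This is a minimal sketch under stated assumptions, not the implementation of Sun or Wirges: the names `level_cost` and `pyramid_cost`, the per-level weights, and the use of a Euclidean distance per cell are hypothetical choices made for the example, and any regularization term is omitted.

```python
import math


def level_cost(pred_flow, ref_flow):
    """Cost for one pyramid level: sum of Euclidean distances between
    predicted and reference 2D flow vectors (assumed cost form)."""
    return sum(
        math.hypot(pu - ru, pv - rv)
        for (pu, pv), (ru, rv) in zip(pred_flow, ref_flow)
    )


def pyramid_cost(pred_levels, ref_levels, level_weights):
    """Total cost: weighted sum of the per-level costs over all levels.

    pred_levels / ref_levels hold one list of (u, v) flow vectors per
    pyramid level; level_weights holds one hypothetical weight per level.
    """
    if not (len(pred_levels) == len(ref_levels) == len(level_weights)):
        raise ValueError("one prediction, reference, and weight per level")
    return sum(
        w * level_cost(p, r)
        for w, p, r in zip(level_weights, pred_levels, ref_levels)
    )


# Two-level example: a coarse level with one cell, a finer level with two.
preds = [[(1.0, 0.0)], [(0.0, 0.0), (3.0, 4.0)]]
refs = [[(0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]]
total = pyramid_cost(preds, refs, level_weights=[0.5, 1.0])
```

Because every level contributes its own term, reducing the total cost requires reducing the cost at each resolution of the pyramid rather than only at the final output resolution.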

Regarding claim 21, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 4.  Wirges in view of Lang and Sun teaches the invention of claim 4.  Accordingly, claim 21 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang and Sun for substantially the same reasons as claim 4.


Claims 17 and 34 are rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang as applied above, and further in view of ‘Wirges-18’ (“Evidential Occupancy Grid Map Augmentation using Deep Learning,” 2018).
Regarding claim 17, Wirges in view of Lang teaches the method of claim 1.
Wirges points to Wirges-18 (i.e. “Wirges et al. [24]”) as teaching details about its grid mapping process (Sec. V, 1st par.).  Wirges itself does not explicitly teach filtering the point cloud datasets using a ground height map, wherein the filtering comprises comparing data point heights against ground height and discarding a data point whose point height is not greater than the ground height at the point's location.
Lang also does not explicitly teach this feature.
However, Wirges-18 does teach, as part of a grid mapping process, filtering the point cloud datasets using a ground height map, wherein the filtering comprises comparing data point heights against ground height and discarding a data point whose point height is not greater than the ground height at the point's location (Sec. III.B, last sentence).
Wirges-18 teaches that points with heights below the ground height “are likely a result of multipath propagation” (Sec. III.B, last sentence) – i.e. noise.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the method of Wirges in view of Lang with the ground height filtering of Wirges-18 in order to improve the method with the reasonable expectation that this would result in a method that used the techniques specifically suggested by Wirges, and that removed noisy points caused by multipath propagation.  This technique for improving the method of Wirges in view of Lang was within the ordinary ability of one of ordinary skill in the art based on the teachings of Wirges-18.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Wirges, Lang and Wirges-18 to obtain the invention as specified in claim 17.	
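For illustration only, the ground height filtering discussed above for claim 17 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Wirges-18: the function name `filter_by_ground_height`, the dictionary-based ground height map keyed by grid cell, and the choice to keep points in cells with no ground estimate are all hypothetical.

```python
import math


def filter_by_ground_height(points, ground_height, cell_size):
    """Keep only points whose height exceeds the ground height at their location.

    points        -- iterable of (x, y, z) tuples
    ground_height -- dict mapping (ix, iy) grid cells to ground height (assumed layout)
    cell_size     -- edge length of one square grid cell
    """
    kept = []
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        ground = ground_height.get(cell)
        # Assumption: a point in a cell without a ground estimate is kept.
        if ground is None or z > ground:
            kept.append((x, y, z))
    return kept


ground_map = {(0, 0): 0.0, (1, 0): 0.1}
sweep = [
    (0.2, 0.3, 0.5),   # above ground: kept
    (0.4, 0.1, -0.2),  # below ground: discarded
    (1.5, 0.5, 0.1),   # equal to ground height: discarded (not greater)
    (1.6, 0.2, 0.3),   # above ground: kept
]
filtered = filter_by_ground_height(sweep, ground_map, cell_size=1.0)
```

The strict comparison `z > ground` mirrors the claim language, under which a point whose height is not greater than the ground height at its location is discarded.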

Regarding claim 34, Examiner notes that the claim recites limitations that are substantially the same as limitations recited in claim 17.  Wirges in view of Lang and Wirges-18 teaches the invention of claim 17.  Accordingly, claim 34 is also rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang and Wirges-18 for substantially the same reasons as claim 17.


Claim 35 is rejected under 35 U.S.C. 103 as being unpatentable over Wirges in view of Lang and Sun.
Regarding claim 35, Wirges teaches a system for determining velocity of an object associated with a three-dimensional (3D) scene, the system comprising:
a pillar feature network (see Note Regarding Pillar Feature Network below) to receive two sets of 3D point cloud data of the scene from two consecutive point cloud sweeps (e.g. Sec. III, 1st par., two consecutive frames are received; e.g. Sec. V, 1st par., the input frames may be lidar measurements – i.e. point cloud sweeps), and to encode data of the point cloud data to extract two-dimensional (2D) bird's-eye-view embeddings for each of the point cloud data sets in the form of pseudo images, wherein the 2D bird's-eye-view embeddings for a first of the two point cloud data sets comprises pillar features for the first point cloud data set and the 2D bird's-eye-view embeddings for a second of the two point cloud data sets comprises pillar features for the second point cloud data set (Sec. V, 1st par., the lidar measurements are converted to first and second grid map embeddings; Sec. V, 1st par., the grid maps are “top-view” – i.e. bird’s eye view; Sec. V, 1st par., and Sec. V.A.b, the grid map stores multiple features of the lidar points lying above each discrete cell along the ground surface – i.e. lying within a “pillar” extending up from the ground surface within a cell’s boundaries; for at least these reasons, the features of the grid map can be considered “pillar features”; see Note Regarding Pillar Feature Network below); and 
a feature pyramid network to encode the pillar features and performing a 2D optical flow estimation to estimate the velocity of the object (see Note Regarding Feature Pyramid Network below; Sec. V.A, 1st par., either FlowNetC or PWCNet is applied to grid map pseudo images – i.e. encodes the pillar features – and performs a 2D optical flow estimation; Fig. 3, Sec. V, 2nd par., velocity is estimated for each cell, including for cells corresponding to a vehicle object).

Note Regarding Pillar Feature Network.  Wirges forms its grid maps by partitioning the ground plane into a regular grid (Sec. V, 1st par.; Sec. V.A.b) and then performing a hand-crafted feature encoding method on the points in each grid cell (Sec. V, 1st par., hand-crafted features are encoded, such as number of surface reflections, minimum and maximum height above ground, average reflection intensity, etc.).
Wirges does not teach using a pillar feature network to encode data of the point cloud data sets to extract the 2D bird’s-eye-view embeddings for each of the point cloud data sets.
However, Lang does teach techniques for encoding data of a point cloud data set using a pillar feature network (e.g. Fig. 2, Pillar Feature Net), which extracts a 2D bird’s-eye-view embedding for each point cloud (Sec. 2.1, point cloud is separated into set of pillars, which are then input to a simplified version of PointNet to extract features, which are then scattered back to original pillar locations to generate a pseudo image, which is a 2D bird’s-eye-view embedding for the point cloud), the 2D bird’s-eye-view embedding comprising pillar features for the point cloud data set (Sec. 2.1, last two pars., pseudo image includes C features for each pillar).
Lang recognizes that the grid partitioning and hand-crafted feature encoding approach used by Wirges (see above) is a known workaround to the problem of sparsity in bird’s-eye-view lidar point clouds (Pg. 2, first two pars., especially 2nd sentence of 2nd par.).  However, Lang teaches that this known workaround used by Wirges “may be sub-optimal since the hard-coded feature extraction may not generalize to new configurations without significant engineering efforts” (Pg. 2, 2nd par.).  Lang teaches that its PointPillars approach is advantageous because “by learning features instead of relying on fixed encoders, PointPillars can leverage the full information represented by the point cloud” (Pg. 2, 3rd par.).  Indeed, Lang’s results demonstrate that “learning the feature encoding is strictly superior to fixed encoders across all resolution” (Sec. 7.4, 2nd par.; Table 4).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the system of Wirges with the pillar feature network of Lang in order to improve the system with the reasonable expectation that this would result in a system that used learned features instead of fixed encoders, and therefore could generalize to new configurations without significant engineering efforts, leverage the full information represented by a point cloud, and/or achieve superior performance.  This technique for improving the system of Wirges was within the ordinary ability of one of ordinary skill in the art based on the teachings of Lang.
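For illustration only, the pillar encoding discussed in the note above (grouping points into pillars, extracting per-point features, pooling per pillar, and scattering back to pillar locations to form a pseudo image) can be sketched as follows. This is a minimal sketch under stated assumptions, not Lang's implementation: the function name `pillar_pseudo_image` and the fixed `weights` matrix are hypothetical, the fixed matrix merely stands in for a layer that would be learned, and steps such as point feature augmentation, normalization, and per-pillar point limits are omitted.

```python
def pillar_pseudo_image(points, cell_size, grid_w, grid_h, weights):
    """Build a grid_h x grid_w pseudo image with C features per cell.

    points  -- iterable of (x, y, z) tuples
    weights -- C rows of 3 coefficients; a stand-in for a learned linear layer
    """
    num_channels = len(weights)

    # 1. Group points into vertical pillars by their ground-plane cell.
    pillars = {}
    for x, y, z in points:
        ix, iy = int(x // cell_size), int(y // cell_size)
        if 0 <= ix < grid_w and 0 <= iy < grid_h:
            pillars.setdefault((ix, iy), []).append((x, y, z))

    # Empty cells keep an all-zero feature vector.
    image = [[[0.0] * num_channels for _ in range(grid_w)] for _ in range(grid_h)]

    for (ix, iy), pts in pillars.items():
        # 2. Per-point features: linear map followed by ReLU.
        feats = [
            [max(0.0, sum(w * v for w, v in zip(row, p))) for row in weights]
            for p in pts
        ]
        # 3. Max-pool over the points of the pillar, one value per channel.
        pooled = [max(f[c] for f in feats) for c in range(num_channels)]
        # 4. Scatter the pillar feature back to its grid location.
        image[iy][ix] = pooled
    return image


cloud = [(0.5, 0.5, 1.0), (0.2, 0.8, 2.0), (1.5, 0.5, -1.0)]
image = pillar_pseudo_image(
    cloud, cell_size=1.0, grid_w=2, grid_h=2,
    weights=[[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],
)
```

In this sketch each occupied cell of the pseudo image carries C pooled features for its pillar (here C = 2), while unoccupied cells remain zero.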



Note Regarding Feature Pyramid Network.  Wirges teaches using a PWCNet to process pseudo images (i.e. pillar features) and produce a 2D optical flow estimation (Sec. V.A, 1st par.), but does not teach details of this network.  In particular, Wirges does not teach that PWCNet is a feature pyramid network that encodes its input.
Lang also does not teach these features.
However, Sun does teach details of the PWCNet (e.g. Fig. 3), including that it is a feature pyramid network (e.g. Sec. 3, Feature pyramid extractor), and that it encodes its input (e.g. Sec. 3, Feature pyramid extractor, input images are encoded as L-level pyramid feature representation).
Wirges specifically teaches that its system can be implemented with the PWCNet taught by Sun (Sec. V.A, 1st par.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the system of Wirges in view of Lang as applied above with the PWCNet of Sun with the reasonable expectation that this would result in a system that used a feature pyramid network known to be compatible with the techniques of Wirges.  This technique for improving the system of Wirges in view of Lang was within the ordinary ability of one of ordinary skill in the art based on the teachings of Wirges and Sun.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Wirges, Lang, and Sun to obtain the invention as specified in claim 35.	


Allowable Subject Matter
Claims 5-9 and 22-26 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
‘Murashkin’ (US 2020/0377105 A1)
Uses voxel encoding and neural networks to estimate object velocity – e.g. Figs. 5, 7 and 8
‘Filatov’ (“Any Motion Detector: Learning Class-agnostic Scene Dynamics from a Sequence of LiDAR Point Clouds,” 2020)
Applies temporal aggregation to voxel features in order to calculate velocities in a bird’s eye view grid – e.g. Fig. 2 and Sec. IV.C
Uses supervised learning – Sec. V.A
‘Wu’ (“MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird’s Eye View Maps,” 2020)
Uses spatio-temporal pyramid network to perform temporal feature aggregation in neural network that calculates velocities for cells of bird’s eye view grid – see e.g. Figs. 1-3, Sec. 3.3
Uses supervised learning – e.g. Sec. 3.5

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEOFFREY E SUMMERS whose telephone number is (571)272-9915. The examiner can normally be reached Monday-Friday, 7:00 AM to 3:30 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached at (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GEOFFREY E SUMMERS/Examiner, Art Unit 2669