DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in China on 11/06/2020. It is noted, however, that applicant has not filed a certified copy of the CN202011228919.8 application as required by 37 CFR 1.55.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over He et al., US 10,713,794 B1, in view of Qi et al., US 2019/0147245 A1.
Regarding claim 1, He teaches An apparatus (col. 20 lines 35-38; a computing system may use one or more machine-learning models to generate a number of object proposals corresponding to objects detected within an image.) comprising: a memory configured to store image data of an input image (col. 41 lines 35-36; memory 1704 includes main memory for storing instructions (i.e., image data).); and a processor configured to detect one or more objects in said input image using a quantized multi-stage object detection network (col. 29 lines 12-13; a machine-learning model may be trained to detect object instances depicted in an image.), wherein quantization of said quantized multi-stage object detection network comprises (i) generating quantized image data by performing a first data range analysis on said image data of said input image (col. 20 lines 35-38; generate a number of object proposals corresponding to objects detected within an image.), (ii) generating a feature map and proposal bounding boxes by applying a region proposal network (RPN) to said quantized image data (col. 2 lines 12-18; a region proposal network may generate n number of RoIs, each of which defines a portion of the input image's feature map. RoIAlign may extract a smaller feature map from each RoI, and that extracted feature map may be used for training the classification model, bounding box model, and segmentation model.), (iii) performing a region of interest pooling operation on said feature map and a plurality of ground truth boxes corresponding to the proposal bounding boxes generated by the RPN (col. 28 lines 44-48; Each refinement module 860 may invert the effects of pooling in first pass 820 in order to double the resolution of the input object-proposal encoding (i.e., output from the immediately preceding layer in second pass 840).), and (iv) generating quantized region of interest pooling results by performing a second data range analysis on results from said region of interest pooling operation (col. 28 lines 52-59; Each refinement module R.sup.i may be trained to merge the object-proposal encoding and the matching features in order to generate a new upsampled object encoding M.sup.i+1. Thus, M.sup.i+1=R.sup.i(M.sup.i,F.sup.i). In particular embodiments, multiple refinement modules 860 are stacked in second pass 840. As an example and not by way of limitation, there may be one refinement module 860 for each layer in the first pass 820 (i.e., every pooling layer).).
He fails to teach the following recited limitation.  However, Qi teaches (v) applying a region-based convolutional neural network (RCNN) to the quantized region of interest pooling results (par. 0116; Most existing works convert 3D point cloud to images by projection or to volumetric grids by quantization and then apply convolutional networks.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine He’s teachings with Qi’s teachings in order to efficiently and accurately analyze the 3D data created by such sensors to perform object detection, classification, and localization (Qi, par. 0002).

Regarding claims 2 and 13, He and Qi teach all the limitations in claims 1 and 12.  He further teaches wherein said ground truth boxes are projections of said proposal bounding boxes on said input image (col. 31 lines 65-67; Each RoI candidate, which may have ground-truth labels of the correct classification, detection, and/or segmentation information.).

Regarding claim 3, He and Qi teach all the limitations in claim 1.  He further teaches wherein said second data range analysis applies a technique applied by said first data range analysis (col. 33 lines 19-24; Since rounding is being performed, any value of x in the range of 33 to 39, for example, would result in the edge being snapped to grid 2. A similar operation may be performed alternatively or additionally in the y coordinate (e.g., [y/16]). These quantizations introduce misalignments between the RoI and the extracted features.).
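
For illustration only, the coordinate rounding described in the quoted passage can be sketched as follows. The stride of 16 and the range 33 to 39 come from the quoted example; the function names snap_to_grid and misalignment are hypothetical, and round-half-up is an assumption, since the passage does not say how ties are handled.

```python
import math

STRIDE = 16  # taken from the quoted [y/16] example


def snap_to_grid(coord, stride=STRIDE):
    """Round a pixel coordinate to the nearest grid index (ties round up; an assumption)."""
    return math.floor(coord / stride + 0.5)


def misalignment(coord, stride=STRIDE):
    """Pixel offset between the original coordinate and the grid line it was snapped to."""
    return coord - snap_to_grid(coord, stride) * stride


# Every x from 33 to 39 lands on grid index 2, but with a different leftover offset.
grid_indices = [snap_to_grid(x) for x in range(33, 40)]
offsets = [misalignment(x) for x in range(33, 40)]
```

The non-zero values in offsets are the misalignments the passage refers to: the snapped grid line sits at pixel 32, while the original edge lies between 1 and 7 pixels away.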

Regarding claims 4 and 14, He and Qi teach all the limitations in claims 1 and 12.  He further teaches wherein training of said RPN and said RCNN comprises (i) generating a feature map and proposal bounding boxes by applying said RPN to said image data of said input image (col. 29 lines 63-65; the output of the backbone may also be used by a region proposal network (RPN) to identify any number of RoIs (e.g., 930) that map to regions in the feature map 920.) and (ii) performing said region of interest pooling operation on said feature map and said proposal bounding boxes generated by the RPN (col. 30 lines 2-7; The smaller regional feature map may have a fixed, predetermined size (e.g., n×n or n×m). The RoI feature map 950 may then be used by different branches in parallel to (1) detect an object 960 (via the use of a bounding box), (2) classify the object 970 (e.g., as a person, dog, etc.), (3) segment the object 980 (e.g., via a segmentation mask). In particular embodiments, each of these three tasks may be performed using one or more neural network layers, such as convolutional neural networks.).
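
For illustration only, reducing a region of a feature map to a fixed, predetermined size (e.g., n×n) can be sketched as follows. Max pooling over integer bins is an assumption chosen for simplicity and is not asserted to be the method of the cited reference; the function names roi_max_pool and ceil_div and the (y0, x0, y1, x1) box layout are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Pool one region of a 2D feature map (list of rows) to out_h x out_w by taking maxima."""
    y0, x0, y1, x1 = roi  # half-open box: rows y0..y1-1, columns x0..x1-1
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Row range of output bin i; neighbouring bins may overlap when h is not a multiple of out_h.
        r_start = y0 + (i * h) // out_h
        r_end = y0 + ceil_div((i + 1) * h, out_h)
        row = []
        for j in range(out_w):
            c_start = x0 + (j * w) // out_w
            c_end = x0 + ceil_div((j + 1) * w, out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled


# A 4x4 feature map holding the values 0..15 in row-major order.
fmap = [[4 * r + c for c in range(4)] for r in range(4)]
full = roi_max_pool(fmap, (0, 0, 4, 4), 2, 2)
partial = roi_max_pool(fmap, (1, 1, 4, 4), 2, 2)
```

Regions of different sizes (4×4 and 3×3 above) both yield a 2×2 output, which is what allows the later branches to consume a fixed-size input.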

Regarding claims 5 and 15, He and Qi teach all the limitations in claims 1 and 12.  He further teaches wherein the RPN and the RCNN are stored in said processor as directed acyclic graphs and corresponding weights (col. 30 lines 8-10).

Regarding claims 6 and 16, He and Qi teach all the limitations in claims 1 and 12.  He further teaches wherein the RPN and the RCNN share one or more convolution layers (col. 30 lines 16-19).

Regarding claims 7 and 17, He and Qi teach all the limitations in claims 1 and 12.  He further teaches wherein said processor is further configured to generate a pooling result for a region of interest by cropping and resampling a corresponding portion of a feature map to which an object detection proposal is assigned (col. 23 lines 48-54; the second convolutional neural network 530 may be trained to generate an object proposal 430 for a patch of an image, and the third convolutional neural network 540 may be trained to generate a scalar object score 440 (e.g., representing a likelihood or confidence that the patch contains a full object).).

Regarding claims 8 and 18, He and Qi teach all the limitations in claims 7 and 17.  He further teaches wherein said resampling comprises a warping operation (col. 39 lines 19-22; RoIAlign is compared with RoIWarp (with bilinear sampling), which still quantizes the RoI and thereby loses alignment with the input.).

Regarding claims 9 and 19, He and Qi teach all the limitations in claims 1 and 12.  He further teaches wherein said processor configures bilinear interpolation hardware to resize features of the region of interest for use as an input to subsequent sub-networks on hardware (col. 34 lines 30-35; RoIAlign may compute the values at each of the sample points 1241-1244 using bilinear interpolation. For example, to compute the value of each sampling point, RoIAlign may use bilinear interpretation from the nearby grid points on the feature map 1200.).
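
For illustration only, computing the value at a fractional sample point from the nearby grid points by bilinear interpolation can be sketched as follows. The function name bilinear_sample, the list-of-rows layout, and the example values are hypothetical, and the sketch assumes the sample point lies inside the grid.

```python
import math


def bilinear_sample(feature_map, y, x):
    """Interpolate a 2D grid (list of rows) at fractional position (y, x).

    Assumes 0 <= y <= height - 1 and 0 <= x <= width - 1.
    """
    h, w = len(feature_map), len(feature_map[0])
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    # Clamp the far neighbours so points on the last row or column stay in range.
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


grid = [[0.0, 10.0],
        [20.0, 30.0]]
center = bilinear_sample(grid, 0.5, 0.5)      # equal weight on all four grid points
off_center = bilinear_sample(grid, 0.25, 0.75)
```

Because the sample position is used as a real number and never rounded, no grid snapping occurs at this step.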

Regarding claims 10 and 20, He and Qi teach all the limitations in claims 1 and 12.  He further teaches wherein said feature map comprises a three-dimensional array having dimensions corresponding to a depth, a height, and a width of said feature map (col. 24 lines 19-25; The classification layer may consist of h×w pixel classifiers (h×w denoting the height and width dimensions), each responsible for indicating whether a given pixel belongs to the object in the center of the patch. Each pixel classifier in the output plane may be able to utilize information contained in the entire feature map, and thus have a complete view of the object.).

Regarding claim 11, He and Qi teach all the limitations in claim 1.  He further teaches wherein said memory and said processor are part of at least one of a computer vision system or an autonomous vehicle (col. 29 lines 8-11; machine-learning models and various optimization techniques that enable computing systems to perform tasks related to computer vision.).

Regarding claim 12, He teaches A method of object detection (col. 1 lines 14-16; computer vision and more specifically to object detection and segmentation in images) comprising: storing image data of an input image in a memory (col. 41 lines 35-36; memory 1704 includes main memory for storing instructions (i.e., image data).); and detecting one or more objects in said input image using a quantized multi-stage object detection network (col. 29 lines 12-13; a machine-learning model may be trained to detect object instances depicted in an image.), wherein quantization of said quantized multi-stage object detection network is performed by (i) generating quantized image data by performing a first data range analysis on said image data of said input image (col. 20 lines 35-38; generate a number of object proposals corresponding to objects detected within an image.), (ii) generating a feature map and proposal bounding boxes by applying a region proposal network (RPN) to said quantized image data (col. 2 lines 12-18; a region proposal network may generate n number of RoIs, each of which defines a portion of the input image's feature map. RoIAlign may extract a smaller feature map from each RoI, and that extracted feature map may be used for training the classification model, bounding box model, and segmentation model.), (iii) performing a region of interest pooling operation on said feature map and a plurality of ground truth boxes corresponding to the proposal bounding boxes generated by the RPN (col. 28 lines 44-48; Each refinement module 860 may invert the effects of pooling in first pass 820 in order to double the resolution of the input object-proposal encoding (i.e., output from the immediately preceding layer in second pass 840).), and (iv) generating quantized region of interest pooling results by performing a second data range analysis on results from said region of interest pooling operation (col. 28 lines 52-59; Each refinement module R.sup.i may be trained to merge the object-proposal encoding and the matching features in order to generate a new upsampled object encoding M.sup.i+1. Thus, M.sup.i+1=R.sup.i(M.sup.i,F.sup.i). In particular embodiments, multiple refinement modules 860 are stacked in second pass 840. As an example and not by way of limitation, there may be one refinement module 860 for each layer in the first pass 820 (i.e., every pooling layer).).
He fails to teach the following recited limitation.  However, Qi teaches (v) applying a region-based convolutional neural network (RCNN) to the quantized region of interest pooling results (par. 0116; Most existing works convert 3D point cloud to images by projection or to volumetric grids by quantization and then apply convolutional networks.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine He’s teachings with Qi’s teachings in order to efficiently and accurately analyze the 3D data created by such sensors to perform object detection, classification, and localization (Qi, par. 0002).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AYODEJI O AYOTUNDE whose telephone number is (571)270-7983. The examiner can normally be reached Monday - Friday, 7:00am-3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Yuwen Pan, can be reached at 571-272-7855. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AYODEJI O AYOTUNDE/Primary Examiner, Art Unit 2649