Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-2, 4-5 and 7 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by MLOD: A multi-view 3D object detection based on robust feature fusion method, Jian Deng et al., arXiv:1909.04163v1, 9 Sept 2019, Pages 1-6 (hereafter NPL1).

1. Regarding claim 1, NPL1 discloses a vehicle information detection method (the proposed method of fig 2 on page 2), comprising:
determining a bird's-eye view of a target vehicle based on an image of the target vehicle (fig 2 shows the BEV input; the examiner notes that, as seen in fig 5 on page 5, the detected cars are shown in green, and the BEV image of fig 6 contains cars; therefore the BEV input is based on the image input of the car (i.e., an image of the target vehicle, being a car as seen in fig 5 and fig 6), meeting the above claim limitations);
performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain first feature information corresponding to the image of the target vehicle (fig 2 shows the feature extractor connected to the image input and outputting the Image features, meeting the limitation of obtaining the first feature information of the target vehicle image (i.e., for the car as seen in fig 5 and fig 6)) and second feature information corresponding to the bird's-eye view of the target vehicle (fig 2 shows the feature extractor connected to the BEV input and outputting the BEV features, meeting the limitation of obtaining the second feature information of the target vehicle (i.e., for the car as seen in fig 5 and fig 6)); and
determining three-dimensional information of the target vehicle based on the first feature information and the second feature information (the abstract, fig 2 and page 2 section C disclose that MV3D and AVOD are two-stage detectors. The multi-view based methods (i.e., the current method) merge features from the BEV map (i.e., the second feature information) and the RGB image features (i.e., the first feature information) to predict 3D bounding boxes (i.e., determining three-dimensional information of the target vehicle) of the object (car), as seen by the green bounding box (i.e., three-dimensional information) in fig 5, meeting the above claim limitations).

2. Regarding claim 2, NPL1 discloses the method of claim 1, wherein the determining the bird's-eye view of the target vehicle based on the image of the target vehicle comprises: performing depth estimation on the image of the target vehicle to obtain a depth map of the target vehicle; and determining the bird's-eye view of the target vehicle according to the depth map of the target vehicle (page 2 fig 2 “Depth map input into the system”, followed by the “foreground mask layer”, which receives input from the BEV features (i.e., of the car or target vehicle as seen in figs 5 and 6); page 1 col 2 discloses “We propose a foreground mask layer, which exploits the projected depth map in front view to select the foreground image features within a 3D bounding box proposal”; also, the Introduction section on page 1 discloses that “in this work” (that is, the proposed method) 3D point cloud data (i.e., the depth map) is represented in the form of a bird's eye view (BEV) map, meeting the above limitations).

3. Regarding claim 4, NPL1 discloses the method of claim 1, wherein the performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain the first feature information corresponding to the image of the target vehicle and the second feature information corresponding to the bird's-eye view of the target vehicle, comprises: performing feature extraction on the image of the target vehicle based on a first feature extraction model, to obtain the first feature information; and performing feature extraction on the bird's-eye view of the target vehicle based on a second feature extraction model, to obtain the second feature information, wherein the first feature extraction model and the second feature extraction model have the same network structure (page 1 col 1 Introduction section and page 2 fig 2 disclose that this work (i.e., the proposed method) applies convolutional neural networks to the BEV map and the RGB image, the outputs of the CNNs (i.e., the first feature extraction model and the second feature extraction model, where the first model and the second model have the same network structure (i.e., both are CNNs)) being the BEV features (second feature information) and the Image features (first feature information), meeting the above claim limitations).

4. Regarding claim 5, NPL1 discloses the method of claim 2, wherein the performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain the first feature information corresponding to the image of the target vehicle and the second feature information corresponding to the bird's-eye view of the target vehicle, comprises: performing feature extraction on the image of the target vehicle based on a first feature extraction model, to obtain the first feature information; and performing feature extraction on the bird's-eye view of the target vehicle based on a second feature extraction model, to obtain the second feature information, wherein the first feature extraction model and the second feature extraction model have the same network structure (page 1 col 1 Introduction section and page 2 fig 2 disclose that this work (i.e., the proposed method) applies convolutional neural networks to the BEV map and the RGB image, the outputs of the CNNs (i.e., the first feature extraction model and the second feature extraction model, where the first model and the second model have the same network structure (i.e., both are CNNs)) being the BEV features (second feature information) and the Image features (first feature information), meeting the above claim limitations).
 
5. Regarding claim 7, NPL1 discloses the method of claim 1, wherein the determining the three-dimensional information of the target vehicle based on the first feature information and the second feature information comprises: fusing the first feature information and the second feature information to obtain third feature information (the title, abstract and fig 2 disclose feature fusion: as seen in fig 2, the BEV features (second feature) are fused with the Image features (first feature), and the result of the fusion is the fused feature output (i.e., the third feature information); page 1 discloses that these methods apply convolutional neural networks (CNN) to the BEV map and RGB image data, and use the resulting fused features to detect objects); and obtaining the three-dimensional information of the target vehicle based on the third feature information (the abstract, fig 2 and page 2 section C disclose that MV3D and AVOD are two-stage detectors. The multi-view based methods (i.e., the current method) merge features (i.e., the result of the merging or fusion is the third feature information) from the BEV map (i.e., the second feature information) and the RGB image features (i.e., the first feature information) to predict 3D bounding boxes (i.e., determining three-dimensional information of the target vehicle) of the object (car), as seen by the green bounding box (i.e., three-dimensional information) in fig 5, meeting the above claim limitations).
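
For illustration only, the step recited in claim 2 (determining a bird's-eye view from a depth map) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from NPL1 or from the application: the pinhole intrinsics fx, fy, cx, cy, the function names depth_to_points and points_to_bev, the ranges and the cell size are all hypothetical, and the BEV here is a plain occupancy-count grid.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(depth: List[List[float]],
                    fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Back-project each pixel (u, v) with depth z into camera-frame (x, y, z).

    Assumes a pinhole camera model; the intrinsics are hypothetical inputs.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # treat non-positive depth as invalid and skip it
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def points_to_bev(points: List[Point],
                  x_range: Tuple[float, float],
                  z_range: Tuple[float, float],
                  cell: float) -> List[List[int]]:
    """Bin points on the ground plane (x lateral, z forward) into a count grid."""
    cols = max(1, int((x_range[1] - x_range[0]) / cell))
    rows = max(1, int((z_range[1] - z_range[0]) / cell))
    grid = [[0] * cols for _ in range(rows)]
    for x, _, z in points:
        if x_range[0] <= x < x_range[1] and z_range[0] <= z < z_range[1]:
            # Clamp to guard against floating-point rounding at the far edge.
            c = min(int((x - x_range[0]) / cell), cols - 1)
            r = min(int((z - z_range[0]) / cell), rows - 1)
            grid[r][c] += 1
    return grid


# Example: a tiny 2x2 depth map (metres); 0.0 marks a pixel with no depth.
depth_map = [[2.0, 0.0], [2.0, 4.0]]
pts = depth_to_points(depth_map, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
bev = points_to_bev(pts, x_range=(-1.0, 5.0), z_range=(0.0, 5.0), cell=1.0)
```

In this sketch the height coordinate y is discarded when binning, so each BEV cell simply counts the back-projected points that fall above it.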
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 8-9, 11-12, 14-16 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over NPL1 (single-reference 103 rejection). NPL1 discloses an electronic device (i.e., the autonomous vehicle disclosed on page 1 and such a system in the Introduction section and in fig 2). From the above disclosure and the method/structure of fig 2, the limitations “at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform operations” would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention.
6. Regarding claim 8, NPL1 discloses an electronic device (i.e., the autonomous vehicle disclosed on page 1 and such a system in the Introduction section and in fig 2), comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform operations (from the above disclosure and the method/structure of fig 2, the processor and memory limitations would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention) of:
determining a bird's-eye view of a target vehicle based on an image of the target vehicle (fig 2 shows the BEV input; the examiner notes that, as seen in fig 5 on page 5, the detected cars are shown in green, and the BEV image of fig 6 contains cars; therefore the BEV input is based on the image input of the car (i.e., an image of the target vehicle, being a car as seen in fig 5 and fig 6), meeting the above claim limitations);
performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain first feature information corresponding to the image of the target vehicle (fig 2 shows the feature extractor connected to the image input and outputting the Image features, meeting the limitation of obtaining the first feature information of the target vehicle image (i.e., for the car as seen in fig 5 and fig 6)) and second feature information corresponding to the bird's-eye view of the target vehicle (fig 2 shows the feature extractor connected to the BEV input and outputting the BEV features, meeting the limitation of obtaining the second feature information of the target vehicle (i.e., for the car as seen in fig 5 and fig 6)); and
determining three-dimensional information of the target vehicle based on the first feature information and the second feature information (the abstract, fig 2 and page 2 section C disclose that MV3D and AVOD are two-stage detectors. The multi-view based methods (i.e., the current method) merge features from the BEV map (i.e., the second feature information) and the RGB image features (i.e., the first feature information) to predict 3D bounding boxes (i.e., determining three-dimensional information of the target vehicle) of the object (car), as seen by the green bounding box (i.e., three-dimensional information) in fig 5, meeting the above claim limitations). Before the effective filing date of the claimed invention, a processor and a memory storing instructions executed by the processor, as claimed in claim 8, would have been obvious to one of ordinary skill in the art. The suggestion/motivation would be an accurate (fast and efficient; see section B, page 5) detector providing better classification and localization results (i.e., 3D object detection; see the conclusion on page 6).
  
7. Regarding claim 9, NPL1 discloses the electronic device of claim 8, wherein the determining the bird's-eye view of the target vehicle based on the image of the target vehicle comprises: performing depth estimation on the image of the target vehicle to obtain a depth map of the target vehicle; and determining the bird's-eye view of the target vehicle according to the depth map of the target vehicle (page 2 fig 2 “Depth map input into the system”, followed by the “foreground mask layer”, which receives input from the BEV features (i.e., of the car or target vehicle as seen in figs 5 and 6); page 1 col 2 discloses “We propose a foreground mask layer, which exploits the projected depth map in front view to select the foreground image features within a 3D bounding box proposal”; also, the Introduction section on page 1 discloses that “in this work” (that is, the proposed method) 3D point cloud data (i.e., the depth map) is represented in the form of a bird's eye view (BEV) map, meeting the above limitations).

8. Regarding claim 11, NPL1 discloses the electronic device of claim 8, wherein the performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain the first feature information corresponding to the image of the target vehicle and the second feature information corresponding to the bird's-eye view of the target vehicle, comprises: performing feature extraction on the image of the target vehicle based on a first feature extraction model, to obtain the first feature information; and performing feature extraction on the bird's-eye view of the target vehicle based on a second feature extraction model, to obtain the second feature information, wherein the first feature extraction model and the second feature extraction model have the same network structure (page 1 col 1 Introduction section and page 2 fig 2 disclose that this work (i.e., the proposed method) applies convolutional neural networks to the BEV map and the RGB image, the outputs of the CNNs (i.e., the first feature extraction model and the second feature extraction model, where the first model and the second model have the same network structure (i.e., both are CNNs)) being the BEV features (second feature information) and the Image features (first feature information), meeting the above claim limitations).
  
9. Regarding claim 12, NPL1 discloses the electronic device of claim 9, wherein the performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain the first feature information corresponding to the image of the target vehicle and the second feature information corresponding to the bird's-eye view of the target vehicle, comprises: performing feature extraction on the image of the target vehicle based on a first feature extraction model, to obtain the first feature information; and performing feature extraction on the bird's-eye view of the target vehicle based on a second feature extraction model, to obtain the second feature information, wherein the first feature extraction model and the second feature extraction model have the same network structure (page 1 col 1 Introduction section and page 2 fig 2 disclose that this work (i.e., the proposed method) applies convolutional neural networks to the BEV map and the RGB image, the outputs of the CNNs (i.e., the first feature extraction model and the second feature extraction model, where the first model and the second model have the same network structure (i.e., both are CNNs)) being the BEV features (second feature information) and the Image features (first feature information), meeting the above claim limitations).
  
10. Regarding claim 14, NPL1 discloses the electronic device of claim 8, wherein the determining the three-dimensional information of the target vehicle based on the first feature information and the second feature information comprises: fusing the first feature information and the second feature information to obtain third feature information (the title, abstract and fig 2 disclose feature fusion: as seen in fig 2, the BEV features (second feature) are fused with the Image features (first feature), and the result of the fusion is the fused feature output (i.e., the third feature information); page 1 discloses that these methods apply convolutional neural networks (CNN) to the BEV map and RGB image data, and use the resulting fused features to detect objects); and obtaining the three-dimensional information of the target vehicle based on the third feature information (the abstract, fig 2 and page 2 section C disclose that MV3D and AVOD are two-stage detectors. The multi-view based methods (i.e., the current method) merge features (i.e., the result of the merging or fusion is the third feature information) from the BEV map (i.e., the second feature information) and the RGB image features (i.e., the first feature information) to predict 3D bounding boxes (i.e., determining three-dimensional information of the target vehicle) of the object (car), as seen by the green bounding box (i.e., three-dimensional information) in fig 5, meeting the above claim limitations).
11. Regarding claim 15, a non-transitory computer-readable storage medium having computer instructions stored thereon, wherein the computer instructions cause a computer to perform operations, would have been obvious in view of the electronic device of NPL1 (i.e., the autonomous vehicle disclosed on page 1 and such a system in the Introduction section and in fig 2); from the above disclosure and the method/structure of fig 2, these limitations would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention. The operations are:
determining a bird's-eye view of a target vehicle based on an image of the target vehicle (fig 2 shows the BEV input; the examiner notes that, as seen in fig 5 on page 5, the detected cars are shown in green, and the BEV image of fig 6 contains cars; therefore the BEV input is based on the image input of the car (i.e., an image of the target vehicle, being a car as seen in fig 5 and fig 6), meeting the above claim limitations);
performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain first feature information corresponding to the image of the target vehicle (fig 2 shows the feature extractor connected to the image input and outputting the Image features, meeting the limitation of obtaining the first feature information of the target vehicle image (i.e., for the car as seen in fig 5 and fig 6)) and second feature information corresponding to the bird's-eye view of the target vehicle (fig 2 shows the feature extractor connected to the BEV input and outputting the BEV features, meeting the limitation of obtaining the second feature information of the target vehicle (i.e., for the car as seen in fig 5 and fig 6)); and
determining three-dimensional information of the target vehicle based on the first feature information and the second feature information (the abstract, fig 2 and page 2 section C disclose that MV3D and AVOD are two-stage detectors. The multi-view based methods (i.e., the current method) merge features from the BEV map (i.e., the second feature information) and the RGB image features (i.e., the first feature information) to predict 3D bounding boxes (i.e., determining three-dimensional information of the target vehicle) of the object (car), as seen by the green bounding box (i.e., three-dimensional information) in fig 5, meeting the above claim limitations). Before the effective filing date of the claimed invention, a processor and a memory storing instructions executed by the processor, as claimed in claim 8, would have been obvious to one of ordinary skill in the art. The suggestion/motivation would be an accurate (fast and efficient; see section B, page 5) detector providing better classification and localization results (i.e., 3D object detection; see the conclusion on page 6).
  
12. Regarding claim 16, NPL1 discloses the non-transitory computer-readable storage medium of claim 15, wherein the determining the bird's-eye view of the target vehicle based on the image of the target vehicle comprises: performing depth estimation on the image of the target vehicle to obtain a depth map of the target vehicle; and determining the bird's-eye view of the target vehicle according to the depth map of the target vehicle (page 2 fig 2 “Depth map input into the system”, followed by the “foreground mask layer”, which receives input from the BEV features (i.e., of the car or target vehicle as seen in figs 5 and 6); page 1 col 2 discloses “We propose a foreground mask layer, which exploits the projected depth map in front view to select the foreground image features within a 3D bounding box proposal”; also, the Introduction section on page 1 discloses that “in this work” (that is, the proposed method) 3D point cloud data (i.e., the depth map) is represented in the form of a bird's eye view (BEV) map, meeting the above limitations).

13. Regarding claim 18, NPL1 discloses the non-transitory computer-readable storage medium of claim 15, wherein the performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain the first feature information corresponding to the image of the target vehicle and the second feature information corresponding to the bird's-eye view of the target vehicle, comprises: performing feature extraction on the image of the target vehicle based on a first feature extraction model, to obtain the first feature information; and performing feature extraction on the bird's-eye view of the target vehicle based on a second feature extraction model, to obtain the second feature information, wherein the first feature extraction model and the second feature extraction model have the same network structure (page 1 col 1 Introduction section and page 2 fig 2 disclose that this work (i.e., the proposed method) applies convolutional neural networks to the BEV map and the RGB image, the outputs of the CNNs (i.e., the first feature extraction model and the second feature extraction model, where the first model and the second model have the same network structure (i.e., both are CNNs)) being the BEV features (second feature information) and the Image features (first feature information), meeting the above claim limitations).

14. Regarding claim 19, NPL1 discloses the non-transitory computer-readable storage medium of claim 16, wherein the performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain the first feature information corresponding to the image of the target vehicle and the second feature information corresponding to the bird's-eye view of the target vehicle, comprises: performing feature extraction on the image of the target vehicle based on a first feature extraction model, to obtain the first feature information; and performing feature extraction on the bird's-eye view of the target vehicle based on a second feature extraction model, to obtain the second feature information, wherein the first feature extraction model and the second feature extraction model have the same network structure (page 1 col 1 Introduction section and page 2 fig 2 disclose that this work (i.e., the proposed method) applies convolutional neural networks to the BEV map and the RGB image, the outputs of the CNNs (i.e., the first feature extraction model and the second feature extraction model, where the first model and the second model have the same network structure (i.e., both are CNNs)) being the BEV features (second feature information) and the Image features (first feature information), meeting the above claim limitations).



Allowable Subject Matter
Claims 3, 6, 10, 13, 17 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Examiner's Note: Examiner has cited figures and paragraphs in the references as applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. The applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner. Examiner has also cited references in the PTO-892 that are not relied on but are relevant and pertinent to the applicant’s disclosure, and that may also read (as anticipatory or obvious) on the claims and claimed limitations. Applicant is advised to consider these references in preparing the response/amendments in order to expedite the prosecution.

As cited in the PTO-892, Joint 3D proposal generation and object detection from view aggregation, Jason Ku et al, IEEE Oct 2018, Pages 5750-5757 (hereafter NPL2) is also an anticipatory reference regarding claims 1, 4, 7, 8, 11, 14, 15 and 18.

Regarding claims 1, 8 and 15, NPL2 discloses a vehicle information detection method/device/non-transitory computer-readable medium (the proposed method of fig 2 and the proposed neural network structure of fig 2, allowing high computational speed and low memory footprint (see page 2 col 1)), comprising:
determining a bird's-eye view of a target vehicle based on an image of the target vehicle (fig 2 shows the BEV input; the examiner notes that, as seen in fig 2, the BEV input shows the target vehicle or car, determined based on the image input (i.e., an image of the target vehicle, being a car as seen in the input image), meeting the above claim limitations; also, page 5757 col 1 discloses the “car class”, i.e., the target vehicle);
performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain first feature information corresponding to the image of the target vehicle (fig 2 shows the feature extractor connected to the image input and outputting the Image feature maps, meeting the limitation of obtaining the first feature information of the target vehicle image) and second feature information corresponding to the bird's-eye view of the target vehicle (fig 2 shows the feature extractor connected to the BEV input and outputting the BEV feature maps, meeting the limitation of obtaining the second feature information of the target vehicle); and
determining three-dimensional information of the target vehicle based on the first feature information and the second feature information (abstract and fig 2: the output “Detected objects” is based on the fusion of the Image feature maps (i.e., the first feature information) and the BEV feature maps (i.e., the second feature information), and the detected 3D object (i.e., the target vehicle) is shown by the green bounding box (i.e., three-dimensional information) in fig 6, top right, meeting the above claim limitations).

Regarding claims 4, 11 and 18, NPL2 discloses the method/device and non-transitory medium of claims 1, 8 and 15, wherein the performing feature extraction on the image of the target vehicle and the bird's-eye view respectively, to obtain the first feature information corresponding to the image of the target vehicle and the second feature information corresponding to the bird's-eye view of the target vehicle, comprises: performing feature extraction on the image of the target vehicle based on a first feature extraction model, to obtain the first feature information; and performing feature extraction on the bird's-eye view of the target vehicle based on a second feature extraction model, to obtain the second feature information, wherein the first feature extraction model and the second feature extraction model have the same network structure (fig 2 shows the two feature extraction models, and page 5752 section B discloses the feature extractor of fig 3, made up of convolutional neural networks (i.e., the same structure), meeting the above claim limitations).

Regarding claims 7 and 14, NPL2 discloses the method/device of claims 1 and 8, wherein the determining the three-dimensional information of the target vehicle based on the first feature information and the second feature information comprises: fusing the first feature information and the second feature information to obtain third feature information; and obtaining the three-dimensional information of the target vehicle based on the third feature information (fig 2 shows the first feature map, the second feature map and the fusion of both, resulting in the third feature map, and the detected objects resulting from the fusion, meeting the above claim limitations).

Examiner further notes that claims 2, 5, 9, 12, 16 and 19 would be obvious over the combination of NPL2 in view of NPL1.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAYESH PATEL, whose telephone number is (571)270-1227. The examiner can normally be reached IFW Mon-Fri.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached at 571-272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

JAYESH PATEL
Primary Examiner
Art Unit 2669



/JAYESH A PATEL/Primary Examiner, Art Unit 2669