DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Preliminary Amendment
This Office Action is responsive to the preliminary amendment filed on 06/24/2020. Claims 1-25 have been canceled. Claims 26-50 have been added. An Office Action on the merits follows below.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/24/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 26, 39 and 45 are rejected under 35 U.S.C. 102(a)(1) and/or (a)(2) as being anticipated by Cho et al. (US 20110069155 A1).
Regarding Claim 26 (New): Cho discloses an electronic processing system (Refer to para [032]; “FIG. 3 illustrates a motion detection apparatus, according to one or more embodiments.”) comprising: a processor (Refer to para [074]; “the processing element could include a processor or a computer processor, and processing elements may be distributed and/or included in a single device.”) memory communicatively coupled to the processor (Refer to para [073]; “In addition to the above described embodiments, embodiments can also be implemented through computer readable code/instructions in/on a non-transitory medium, e.g., a computer readable medium, to control at least one processing device, such as a processor or computer, to implement any above described embodiment. The medium can correspond to any defined, measurable, and tangible structure permitting the storing and/or transmission of the computer readable code.”) and logic communicatively coupled to the processor to (Refer to para [074]; “The media may also include, e.g., in combination with the computer readable code, data files, data structures, and the like. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD ROM disks and DVDs; magneto-optical media such as optical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory (ROM), random access memory (RAM), flash memory, and the like.”): pre-process an image to subtract the background from the image (Refer to para [035]; “In an embodiment, the object image may include only an object by removing a background from any image including the background and the object.”) and perform object detection on the pre-processed image with the background subtracted (Refer to para [039]; “Thus, since the motion detection apparatus 300, according to an embodiment, acquires the first and second object images including only the object without a background at certain time intervals, sets the motion detection area around the face of each object image, and detects the motion of the object through the image change amount in the motion detection area, the motion detection apparatus 300 can detect the motion of the object with a limited amount of computation irrespective of a change of the background.”).
Regarding Claim 39 (New): Cho discloses a method (Refer to para [014]; “a method of detecting a motion, including acquiring object images using distance information for an object included in images obtained from at least two cameras, the object image including only the object without a background, setting a motion detection area in the acquired object image, and detecting a motion of the object based on an amount of an image change in the motion detection area between the acquired object images.”) comprising: pre-processing an image to subtract the background from the image (Refer to para [035]; “In an embodiment, the object image may include only an object by removing a background from any image including the background and the object.”) and performing object detection on the pre-processed image with the background subtracted  (Refer to para [039]; “Thus, since the motion detection apparatus 300, according to an embodiment, acquires the first and second object images including only the object without a background at certain time intervals, sets the motion detection area around the face of each object image, and detects the motion of the object through the image change amount in the motion detection area, the motion detection apparatus 300 can detect the motion of the object with a limited amount of computation irrespective of a change of the background.”).

Regarding Claim 45 (New): Cho discloses at least one computer readable medium, comprising a set of instructions, which when executed by a computing device (Refer to para [073]; “In addition to the above described embodiments, embodiments can also be implemented through computer readable code/instructions in/on a non-transitory medium, e.g., a computer readable medium, to control at least one processing device, such as a processor or computer, to implement any above described embodiment. The medium can correspond to any defined, measurable, and tangible structure permitting the storing and/or transmission of the computer readable code.”) cause the computing device to: pre-process an image to subtract the background from the image (Refer to para [035]; “In an embodiment, the object image may include only an object by removing a background from any .

Claims 26, 27, 39, 40 and 45 are rejected under 35 U.S.C. 102(a)(1) and/or (a)(2) as being anticipated by Chan et al. (US 20170301109 A1).

Regarding Claim 26: (New) Chan discloses an electronic processing system (Refer to para [042]; “FIG. 1 illustrates a small autonomous system 100 in accordance with various embodiments of the present disclosure. The system 100 can include a chassis 105, an imaging system 120, and a vision-based guidance system 150.”) comprising: a processor (“at least one processor 157”) memory communicatively coupled to the processor (Refer to para [148]; “The vision-based guidance system 150 can include at least one processor 157 and at least one memory 151.”) and logic communicatively coupled to the processor to (Refer to para [050]; “The vision-based navigation system 150 also includes configurable and/or programmable processor 157 and associated core(s) 304, and in some embodiments includes one or more additional configurable and/or programmable processor(s) and associated core(s) (for example, in the case of computer systems having multiple processors/cores), for executing computer-readable and computer-executable instructions or software stored in the memory 151 and other programs for implementing exemplary embodiments of the present disclosure. Each processor 157 may each be a single core processor or a multiple core processor.”):

Regarding Claim 27: (New) Chan discloses an image pre-processor to subtract the background from the image to provide the pre-processed image (Refer to para [057]; “the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors.”) and a convolutional neural network to detect an object in the pre-processed image 

Regarding Claim 39: (New) Chan discloses a method (Refer to para [035]; “The systems and methods described include a robust object reacquisition methodology.”) comprising: pre-processing an image to subtract the background from the image (Refer to para [057]; “the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors.”) and performing object detection on the pre-processed image with the background subtracted (Refer to para [057-058]; “the vision-based guidance system 150 can automatically detect the object of interest in the sequence of images. In some embodiments, the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors. In other embodiments, the vision-based guidance system 150 can utilize a convolutional neural network to detect the object of interest in one or more images. As described below in greater detail with reference to FIGS. 7A-9, the vision processing module 154 can create or update an object model 159 using machine learning techniques. In some embodiments, the vision processing module 154 can include an image or video analytics engine suitable to acquire or receive a sequence of images from the imaging system and process the sequence of images to detect, track, and identify objects therein. Information about the detected objects can be used to make determinations of object size and location relative to the system 100 or determinations as to whether the object has been lost or whether the object has been satisfactorily identified. These values can be input into the long-term planning module 152 and the gimbal controller module 155 of the vision-based guidance system 150 so that they can coordinate motion of the chassis 105 or imaging system 120 relative to the object of interest. The gimbal controller module 155 is described in greater detail below.”).

Regarding Claim 40: (New) Chan discloses performing the object detection on the pre-processed image with a convolutional neural network model (Refer to para [057]; “the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors… the vision-based guidance system 150 can utilize a convolutional neural network to detect the object of interest in one or more images.”).

Regarding Claim 45: (New) Chan discloses at least one computer readable medium, comprising a set of instructions, which when executed by a computing device (Refer to para [050]; “The vision-based navigation system 150 also includes configurable and/or programmable processor 157 and associated core(s) 304, and in some embodiments includes one or more additional configurable and/or programmable processor(s) and associated core(s) (for example, in the case of computer systems having multiple processors/cores), for executing computer-readable and computer-executable instructions or software stored in the memory 151 and other programs for implementing exemplary embodiments of the present disclosure. Each processor 157 may each be a single core processor or a multiple core processor.”) cause the computing device to: pre-process an image to subtract the background from the image  (Refer to para [057]; “the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors.”) and perform object detection on the pre-processed image with the background subtracted (Refer to para [057-058]; “the vision-based guidance system 150 can automatically detect the object of interest in the sequence of images. In some embodiments, the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors. In other embodiments, the vision-based guidance system 150 can utilize a convolutional neural network to detect the object of interest in one or more images. As described below in greater detail with reference to FIGS. 7A-9, the vision processing module 154 can create or update an object model 159 using machine learning techniques. In some embodiments, the vision .

Claim 32 is rejected under 35 U.S.C. 102(a)(1) and/or (a)(2) as being anticipated by Nakamura et al. (US 20140071251 A1).

Regarding Claim 32: Nakamura discloses a semiconductor package apparatus (Refer to para [224]; “The image processing unit 15 is a semiconductor integrated circuit embedded within the display device.”) comprising: one or more substrates (Refer to para [224]; “The image processing unit 15 may also be a system Large-Scale Integration (hereinafter, LSI) package mounted on a high-density substrate. The system LSI may be realized as a plurality of individual bare chips mounted on a high-density substrate, or may be a multi-chip module in which a plurality of bare chips are packaged so as to have the appearance of a single LSI.”) and logic coupled to the one or more substrates, wherein the logic is at least partly implemented in one or more of configurable logic and fixed-functionality hardware logic (Refer to para [233]; “The image processing device, integrated circuit, and image processing program pertaining to the present invention are applicable to enabling extraction of three-dimensional positioning information relating to a specific object from a video using few calculations, and further applicable to the development of a remote control system for a camera-.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 27 and 46 are rejected under 35 U.S.C. 103 as being unpatentable over Cho in combination with Chan.

Regarding Claim 27: (New) Cho discloses all the claimed elements as rejected above. Cho does not expressly disclose a convolutional neural network to detect an object in a pre-processed image.

Chan teaches “automated surveillance of objects of interest, which involves multiple phases of a response chain including object detection, tracking, identification, and engagement.”

Chan discloses an image pre-processor (Refer to para [048]; “The processor 157 can execute instructions stored in the memory 151 to perform tracking or navigation tasks in accordance with the embodiments disclosed herein.”) to subtract the background from the image to provide the pre-processed image (Refer to para [057]; “the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors.”) and a convolutional neural network to detect an object in the pre-processed image (Refer to para [057]; “the vision-based guidance system 150 can utilize a convolutional neural network to detect the object of interest in one or more images.”).

The suggestion/motivation for combining the teachings of Cho and Chan would have been in order to “employ a tracking algorithm to identify properties of an object of interest through a sequence of images. The tracking algorithm can operate based on one or more different concepts including, but not limited to, background-subtraction, optical flow/motion-based, complex appearance model-based, part-based, key-point-based, and discriminative learning. In particular, discriminative learning methods (i.e., machine learning) are appealing from the standpoint the models generated thereby are relatively persistent and robust in the sense that they have the ability to maintain lock on the object of interest in the presence of clutter and view obstruction. Object models trained through discriminative training methods can provide a good basis for reacquiring an object of interest upon track loss in the presence of clutter because of the discrimination characteristics of the object model 159.” (at para [162], Chan).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho and Chan in order to obtain the specified claimed elements of Claim 27. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding Claim 46: (New) Cho discloses all the claimed elements as rejected above. Cho does not expressly disclose a convolutional neural network to detect an object in a pre-processed image.

Chan discloses an image pre-processor (Refer to para [048]; “The processor 157 can execute instructions stored in the memory 151 to perform tracking or navigation tasks in accordance with the embodiments disclosed herein.”) to subtract the background from the image to provide the pre-processed image (Refer to para [057]; “the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors.”) and a convolutional neural network to detect an object in the pre-processed image (Refer to para [057]; “the vision-based guidance system 150 can utilize a convolutional neural network to detect the object of interest in one or more images.”).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Cho by adding a “vision-based guidance system in order to utilize a convolutional neural network to detect the object of interest in one or more images…” as taught by Chan as rejected above. 

The suggestion/motivation for combining the teachings of Cho and Chan would have been in order to “employ a tracking algorithm to identify properties of an object of interest through a sequence of images. The tracking algorithm can operate based on one or more different concepts including, but not limited to, background-subtraction, optical flow/motion-based, complex appearance model-based, part-based, key-point-based, and discriminative learning. In particular, discriminative learning methods (i.e., machine learning) are appealing from the standpoint the models generated thereby are relatively persistent and robust in the sense that they have the ability to maintain lock on the object of interest in the presence of clutter and view obstruction. Object models trained 

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho and Chan in order to obtain the specified claimed elements of Claim 46. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Claim 28 is rejected under 35 U.S.C. 103 as being unpatentable over Cho in combination with Stemle (US 20180043229 A1).

Regarding Claim 28: Cho discloses all the claimed elements as rejected above. Cho does not expressly disclose performing object detection on both a foreground image and a background image.

Stemle teaches a system and method including a computer vision system and one or more throwing targets each having an outlay of one or more distinct colored zones.

Stemle teaches “the image capturing system 117, the computer system 119, including all hardware, software, databases, processors, wires, inputs, outputs, wireless network interfaces, and other electronics and computer programs may be referred to collectively as part of the "computer vision system."” wherein the processor is capable to pre-process the image to split the image into a foreground image with the background subtracted and a background image; perform object detection on both the foreground image and the background image; and combine results of the object detection on the foreground with results of the object detection on the background (Refer to para 
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Cho by adding a background subtraction computer vision algorithm as taught by Stemle.

The suggestion/motivation for combining the teachings of Cho and Stemle would have been in order to “…(FIGS. 15 and 16) demonstrate exemplar results of background subtraction based on MoG for video with a static background detecting a moving baseball 112 with no false positives.” (at para [117], Stemle).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho and Stemle in order to obtain the specified claimed elements of Claim 28. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over Cho in combination with Stemle and further in view of Schulter (US 20190096125 A1).

Regarding Claim 29 (New): Cho in combination with Stemle discloses all the claimed elements as rejected above. Cho in combination with Stemle does not expressly disclose determining object occlusion for the background image based on two or more image frames.

Schulter teaches generating bird eye view representations of environments and more particularly generating occlusion-aware bird eye view representations of complex road scenes. 

More specifically, Schulter teaches logic (Refer to para [023]; “Embodiments described herein may be entirely hardware, entirely software or including both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.”) to determine object occlusion for the background image based on two or more image frames (Refer to para [030]; “An image from the image capture device 130 can be analyzed by a computing system 100 to provide a historical or real-time bird's eye view map of the road scene to a user. To provide such a map, the computer system 100 receives the perspective view image and infers objects occluded by the foreground objects. By inferring the occluded objects, the computing system 100 can localize both foreground and background objects with a high degree of fidelity.”) and fill in the background image based on the determined object occlusion (Refer to para [031-033]; “Accordingly, at least one embodiment of the computing system 100 includes a computer processing device 100 with an object detector 402. The object detector 402 access the image and detects foreground objects such as, e.g., the tree, 140, the car 160 and the street lamp 150. To detect the foreground objects, the object detector 402 includes a neural network, such as, e.g., a convolutional neural network or pyramid scene parsing (PSP) network, that performs semantic segmentation on the image. Concurrently with the object detector 402, a depth predictor 404 included with the computer processing device 110 determines depth measurements for each foreground object. To determine the depth measurements, the depth 

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Cho and Stemle as taught by Schulter.

The suggestion/motivation for combining the teachings of Cho, Stemle and Schulter would have been in order to “establish both features and depth values, similar to the depth predictor 404, a mapping 

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho, Stemle and Schulter in order to obtain the specified claimed elements of Claim 29. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Claims 41, 42, 47 and 48 are rejected under 35 U.S.C. 103 as being unpatentable over Cho in combination with Schulter (US 20190096125 A1).

Regarding Claim 41: Cho discloses all the claimed elements as rejected above. Cho does not expressly disclose combining results of the object detection on the foreground with results of the object detection on the background.

Schulter teaches generating bird eye view representations of environments and more particularly generating occlusion-aware bird eye view representations of complex road scenes. 

The suggestion/motivation for combining the teachings of Cho and Schulter would have been in order to “establish both features and depth values, similar to the depth predictor 404, a mapping system 300 including, e.g., a background mapping system, can establish coordinates for each background object to localize the background objects in 3D space, such as, e.g., by generating a 3D point cloud. The 3D point cloud can be converted to a bird's eye view by eliminating an elevation component form the 3D point cloud, projecting the points onto a horizontal plane. Thus, a 2D, top-down map of the background objects is created…refining the bird's eye view generated by the mapping system 300 leveraging, e.g., street maps such as, e.g., OpenStreet Map data, or by simulating road shapes, among other refining techniques to ensure that road locations and shapes are correct within the bird's eye view.” (at para [034], Schulter).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho and Schulter in order to obtain the specified claimed elements of Claim 41. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding Claim 42 (New): Schulter teaches generating bird eye view representations of environments and more particularly generating occlusion-aware bird eye view representations of complex road scenes. 

Regarding Claim 47: Cho discloses all the claimed elements as rejected above. Cho does not expressly disclose combining results of the object detection on the foreground with results of the object detection on the background.

Schulter teaches generating bird eye view representations of environments and more particularly generating occlusion-aware bird eye view representations of complex road scenes. 

More specifically, Schulter teaches pre-processing the image to split the image into a foreground image with the background subtracted and a background image (Refer to para [004]; “The method includes identifying foreground objects and background objects in an input image by using a semantic segmentation network to extract foreground features corresponding to the foreground objects and background features corresponding to the background objects. The foreground objects are masked from the input image with a mask. Occluded objects are inferred by predicting semantic features in masked areas of the masked image with a semantic in-painting network according to contextual information related to the identified background features visible in the masked image.”).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Cho by adding an image processing application such as the object detector 402, which detects the foreground objects and includes a neural network, e.g., a convolutional neural network, as taught by Schulter.

The suggestion/motivation for combining the teachings of Cho and Schulter would have been in order to “establish both features and depth values, similar to the depth predictor 404, a mapping 

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho and Schulter in order to obtain the specified claimed elements of Claim 47. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding Claim 48: Schulter teaches generating bird eye view representations of environments and more particularly generating occlusion-aware bird eye view representations of complex road scenes. 

More specifically, Schulter teaches cause the computing device to: determine object occlusion for the background image based on two or more image frames (Refer to para [030]; “An image from the image capture device 130 can be analyzed by a computing system 100 to provide a historical or real-time bird's eye view map of the road scene to a user. To provide such a map, the computer system 100 receives the perspective view image and infers objects occluded by the foreground objects. By inferring the occluded objects, the computing system 100 can localize both foreground and background objects with a high degree of fidelity.”) and fill in the background image based on the .

Claim 33 is rejected under 35 U.S.C. 103 as being unpatentable over Nakamura et al. (US 20140071251 A1) in combination with Chan et al. (US 20170301109 A1).
Regarding Claim 33: (New) Nakamura discloses all the claimed elements as rejected above. Nakamura does not expressly disclose a convolutional neural network to detect an object in a pre-processed image.

Chan teaches “automated surveillance of objects of interest, which involves multiple phases of a response chain including object detection, tracking, identification, and engagement.”

Chan discloses an image pre-processor to subtract the background from the image to provide the pre-processed image (Refer to para [057]; “the vision-based guidance system 150 can apply background subtraction performed on registered images or can use specialized object detectors.”) and a convolutional neural network to detect an object in the pre-processed image (Refer to para [057]; “the vision-based guidance system 150 can utilize a convolutional neural network to detect the object of interest in one or more images.”).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Nakamura by adding a “vision-based guidance system in order to utilize a convolutional neural network to detect the object of interest in one or more images…” as taught by Chan as rejected above. 
The suggestion/motivation for combining the teachings of Nakamura and Chan would have been in order to “employ a tracking algorithm to identify properties of an object of interest through a 

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Nakamura and Chan in order to obtain the specified claimed elements of Claim 33. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Claims 34 and 35 are rejected under 35 U.S.C. 103 as being unpatentable over Nakamura in combination with Schulter (US 20190096125 A1).

Regarding Claim 34: Nakamura discloses all the claimed elements as rejected above. Nakamura does not expressly disclose combining results of the object detection on the foreground with results of the object detection on the background.

Schulter teaches generating bird's eye view representations of environments and, more particularly, generating occlusion-aware bird's eye view representations of complex road scenes.




The suggestion/motivation for combining the teachings of Nakamura and Schulter would have been in order to “establish both features and depth values, similar to the depth predictor 404, a mapping system 300 including, e.g., a background mapping system, can establish coordinates for each background object to localize the background objects in 3D space, such as, e.g., by generating a 3D point cloud. The 3D point cloud can be converted to a bird's eye view by eliminating an elevation component form the 3D point cloud, projecting the points onto a horizontal plane. Thus, a 2D, top-down map of the background objects is created…refining the bird's eye view generated by the mapping system 300 leveraging, e.g., street maps such as, e.g., OpenStreet Map data, or by simulating road shapes, among other refining techniques to ensure that road locations and shapes are correct within the bird's eye view.”(at para [034], Schulter).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Nakamura and Schulter in order to obtain the specified claimed elements of Claim 34. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding Claim 35: Nakamura discloses all the claimed elements as rejected above. Nakamura does not expressly disclose determining object occlusion for the background image based on two or more image frames.



More specifically, Schulter teaches logic (Refer to para [023]; “Embodiments described herein may be entirely hardware, entirely software or including both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.”) to determine object occlusion for the background image based on two or more image frames (Refer to para [030]; “An image from the image capture device 130 can be analyzed by a computing system 100 to provide a historical or real-time bird's eye view map of the road scene to a user. To provide such a map, the computer system 100 receives the perspective view image and infers objects occluded by the foreground objects. By inferring the occluded objects, the computing system 100 can localize both foreground and background objects with a high degree of fidelity.”) and fill in the background image based on the determined object occlusion (Refer to para [031-033]; “Accordingly, at least one embodiment of the computing system 100 includes a computer processing device 100 with an object detector 402. The object detector 402 access the image and detects foreground objects such as, e.g., the tree, 140, the car 160 and the street lamp 150. To detect the foreground objects, the object detector 402 includes a neural network, such as, e.g., a convolutional neural network or pyramid scene parsing (PSP) network, that performs semantic segmentation on the image. Concurrently with the object detector 402, a depth predictor 404 included with the computer processing device 110 determines depth measurements for each foreground object. To determine the depth measurements, the depth predictor 404 can establish a depth map according to, e.g., a stereoscopic image, a neural network for predicting depths such as, e.g., a fully convolutional residual network, or other depth determination technique. 
The depth map can be applied to the foreground objects extracted by the object detector 402 to determine 3D dimensional coordinates for each foreground object. The computer processing 

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Nakamura by adding an image processing application, such as an object detector 402 that detects the foreground objects and includes a neural network, e.g., a convolutional neural network, as taught by Schulter and discussed above.

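The suggestion/motivation for combining the teachings of Nakamura and Schulter would have been in order to “establish both features and depth values, similar to the depth predictor 404, a mapping system 300 including, e.g., a background mapping system, can establish coordinates for each background object to localize the background objects in 3D space, such as, e.g., by generating a 3D point cloud. The 3D point cloud can be converted to a bird's eye view by eliminating an elevation component form the 3D point cloud, projecting the points onto a horizontal plane. Thus, a 2D, top-down map of the background objects is created…refining the bird's eye view generated by the mapping system 300 leveraging, e.g., street maps such as, e.g., OpenStreet Map data, or by simulating road shapes, among other refining techniques to ensure that road locations and shapes are correct within the bird's eye view.” (at para [034], Schulter).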

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Nakamura and Schulter in order to obtain the specified claimed elements of Claim 35. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Claims 31, 44 and 50 are rejected under 35 U.S.C. 103 as being unpatentable over Cho in combination with Liu et al. “SSD: Single Shot MultiBox Detector” Springer International Publishing AG 2016 B. Leibe et al. (Eds.): ECCV 2016, Part I, LNCS 9905, pp. 21–37.

Regarding Claim 31: Cho discloses all the claimed elements as rejected above. Cho does not expressly disclose object detection on a pre-processed image with a single shot detector.

Liu teaches a deep network based object detector that does not resample pixels or features for bounding box hypotheses and is as accurate as approaches that do.
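More specifically, Liu is capable of performing the object detection on the pre-processed image with a single shot detector model (Refer to page 22, para [003]; “– We introduce SSD, a single-shot detector for multiple categories that is faster than the previous state-of-the-art for single shot detectors (YOLO), and significantly more accurate, in fact as accurate as slower techniques that perform explicit region proposals and pooling (including Faster R-CNN). – The core of SSD is predicting category scores and box offsets for a fixed set of default bounding boxes using small convolutional filters applied to feature maps.”).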

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Cho by adding an image processing technique such as a single shot detector, as taught by Liu and discussed above.

The suggestion/motivation for combining the teachings of Cho and Liu would have been in order to enhance “a fast single-shot object detector for multiple categories. A key feature of our model is the use of multi-scale convolutional bounding box outputs attached to multiple feature maps at the top of the network. This representation allows us to efficiently model the space of possible box shapes.” (page 35, Liu Section 5; Conclusion).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho and Liu in order to obtain the specified claimed elements of Claim 31. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding Claim 44: Cho discloses all the claimed elements as rejected above. Cho does not expressly disclose object detection on a pre-processed image with a single shot detector.

Liu teaches a deep network based object detector that does not resample pixels or features for bounding box hypotheses and is as accurate as approaches that do.



Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Cho by adding an image processing technique such as a single shot detector, as taught by Liu and discussed above.

The suggestion/motivation for combining the teachings of Cho and Liu would have been in order to enhance “a fast single-shot object detector for multiple categories. A key feature of our model is the use of multi-scale convolutional bounding box outputs attached to multiple feature maps at the top of the network. This representation allows us to efficiently model the space of possible box shapes.” (page 35, Liu Section 5; Conclusion).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho and Liu in order to obtain the specified claimed elements of Claim 44. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding Claim 50: Cho discloses all the claimed elements as rejected above. Cho does not expressly disclose object detection on a pre-processed image with a single shot detector.


More specifically, Liu is capable of performing the object detection on the pre-processed image with a single shot detector model (Refer to page 22, para [003]; “– We introduce SSD, a single-shot detector for multiple categories that is faster than the previous state-of-the-art for single shot detectors (YOLO), and significantly more accurate, in fact as accurate as slower techniques that perform explicit region proposals and pooling (including Faster R-CNN). – The core of SSD is predicting category scores and box offsets for a fixed set of default bounding boxes using small convolutional filters applied to feature maps.”).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Cho by adding an image processing technique such as a single shot detector, as taught by Liu and discussed above.

The suggestion/motivation for combining the teachings of Cho and Liu would have been in order to enhance “a fast single-shot object detector for multiple categories. A key feature of our model is the use of multi-scale convolutional bounding box outputs attached to multiple feature maps at the top of the network. This representation allows us to efficiently model the space of possible box shapes.” (page 35, Liu Section 5; Conclusion).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Cho and Liu in order to obtain the specified claimed elements of Claim 50. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.
Claim 37 is rejected under 35 U.S.C. 103 as being unpatentable over Nakamura et al. (US 20140071251 A1) in combination with Liu et al. “SSD: Single Shot MultiBox Detector” Springer International Publishing AG 2016 B. Leibe et al. (Eds.): ECCV 2016, Part I, LNCS 9905, pp. 21–37.

Regarding Claim 37: Nakamura discloses all the claimed elements as rejected above. Nakamura does not expressly disclose object detection on a pre-processed image with a single shot detector.

Liu teaches a deep network based object detector that does not resample pixels or features for bounding box hypotheses and is as accurate as approaches that do.

More specifically, Liu is capable of performing the object detection on the pre-processed image with a single shot detector model (Refer to page 22, para [003]; “– We introduce SSD, a single-shot detector for multiple categories that is faster than the previous state-of-the-art for single shot detectors (YOLO), and significantly more accurate, in fact as accurate as slower techniques that perform explicit region proposals and pooling (including Faster R-CNN). – The core of SSD is predicting category scores and box offsets for a fixed set of default bounding boxes using small convolutional filters applied to feature maps.”).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Nakamura by adding an image processing technique such as a single shot detector, as taught by Liu and discussed above.
The suggestion/motivation for combining the teachings of Nakamura and Liu would have been in order to enhance “a fast single-shot object detector for multiple categories. A key feature of our model is the use of multi-scale convolutional bounding box outputs attached to multiple feature maps at the top of the network. This representation allows us to efficiently model the space of possible box shapes.” (page 35, Liu Section 5; Conclusion).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Nakamura and Liu in order to obtain the specified claimed elements of Claim 37. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.
Allowable Subject Matter
Claims 30, 36, 38, 43 and 49 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 20200045243 A1		US 20210209752 A1		US 9105104 B2
US 20210031507 A1
Kellerman (US 20190325263 A1)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MIA M THOMAS whose telephone number is (571)270-1583. The examiner can normally be reached M-Th 8:30am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward (Ed) Urban can be reached on 571-272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


MIA M. THOMAS
Primary Examiner
Art Unit 2665


