DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-7, 11 and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (US 20210406674 A1) in view of Amato et al. (US 20210097354 A1).

Regarding claims 1 and 11, Wu et al. disclose a system and method comprising: a first sensor of a first type configured to sense objects around a vehicle (collecting, by a sensor device, sensor data associated with an environment around an automobile, [0056]) and to capture first data about the objects in a frame (“FIG. 5D illustrates a hybrid fusion architecture with a subset of sensors 550 having the same modality (e.g., camera 1 and camera 2) having overlapping fields of view (FOV1 and FOV2)”, [0040]); a second sensor of a second type configured to sense the objects around the vehicle and to capture second data about the objects in the frame (“a sensor having a different modality (e.g., LiDAR) and field of view”, [0040]); and a controller configured to: down-sample the first and second data to generate down-sampled first and second data having a lower resolution than the first and second data (“Embodiments provide convolution and downsampling layer processing (e.g., pooling) in association with the digital signal processors associated with edge sensors of, for example, an automobile, including camera sensors, radar sensors, and lidar sensors”, [0015], convolution and downsampling layers e.g., pooling layer of a convolutional neural network, [0041]); identify a first set of the objects by processing the down-sampled first and second data having the lower resolution (automatically detect objects, classify the objects, and determine distances between the objects and the vehicle, [0023], identify objects around the vehicle, [0024], radar point cloud or detection clusters, camera image pixel maps, and lidar point clouds are processed by a fusion neural network to produce surrounding-view perception information which can include object boundary detection and classification, [0025], each feature map extracted is transported to a network gateway that can intelligently combine different feature maps from different edge nodes e.g. for a fusion network which utilizes lidar and cameras for object detection, [0027]) [as can be seen from Figs. 5A-5D, object detection takes place after the camera and lidar preprocessing/pooling].

Wu et al. do not explicitly disclose identify a second set of the objects by selectively processing the first and second data from the frame.

Amato et al. teach a first sensor of a first type configured to sense objects around a vehicle (initial image that depicts a scene including a vehicle and a human individual, [0038]) and a controller configured to: down-sample the first and second data to generate down-sampled first and second data having a lower resolution than the first and second data (an initial image can be one that is down-sampled from an original image, [0038], the scene 500 can be one depicted by a digital image (e.g., down-sampled version of an original image captured by a digital security camera), [0042]); identify a first set of the objects by processing the down-sampled first and second data having the lower resolution (“In FIG. 3, image 302 can represent an initial image that depicts a scene including a vehicle and a human individual. As used herein, an initial image can be one that is down-sampled from an original image that was captured by a digital image capture device. Within the image 302, the object detection process can detect for (e.g., search for) one or more human individuals, which can serve as one or more anchor objects within the scene”, [0038], the scene 500 can be one depicted by a digital image e.g., down-sampled version of an original image, an object detection system described herein can start by detecting human individuals (object O3) in the given image, [0042]); and identify a second set of the objects by selectively processing the first and second data from the frame (object detection process can search the determined one or more regions, such as the region 310, for an object at a larger relative size or higher resolution or both, the object detection process can detect for one or more target objects (i.e., one or more human hands) in the determined one or more regions at the higher resolution (i.e., the image 304). Accordingly, the object detection process can detect human hands 320 and 322 in the image 304, [0038], If a human individual is detected, then hands (object O2) can be searched in regions determined relative to a detected human individual. As described herein, the regions can be determined by predicting a localization of a second anchor object (object O2) relative to detected human individual (object O3), where the determined region can be explored at a higher resolution than the current resolution of the region. Once one or more hands (object O2) are detected, a suitcase (object O1) can be detected in one or more regions determined relative to the detected hands (object O2). Again, the one or more regions (determined relative to the hands) can be explored at a higher resolution than the current resolution of those one or more regions. In this way, the carried suitcase will not be searched through the whole digital image, [0042]) [first set of objects = anchor objects found in the down-sampled version; second set of objects = hands/suitcase found only in the area of the full-resolution image near the anchor objects].
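
For illustration only, the coarse-to-fine strategy attributed to Amato et al. above can be sketched in a few lines of Python. The sketch is not taken from either reference. The function names, the average-pooling down-sampler, and the intensity thresholds are hypothetical stand-ins for a trained detector, and the search region is assumed to be simply the full-resolution footprint of each coarse anchor cell.

```python
from typing import List, Tuple

Image = List[List[float]]


def downsample(image: Image, factor: int) -> Image:
    """Average-pool the image over non-overlapping factor x factor blocks."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def detect_anchors(lowres: Image, threshold: float) -> List[Tuple[int, int]]:
    """Hypothetical coarse detector: a cell is an anchor if its mean reaches threshold."""
    return [(r, c)
            for r, row in enumerate(lowres)
            for c, value in enumerate(row)
            if value >= threshold]


def search_regions(image: Image, anchors: List[Tuple[int, int]],
                   factor: int, threshold: float) -> List[Tuple[int, int]]:
    """Search only the full-resolution pixels that lie under each anchor cell."""
    hits = []
    for r, c in anchors:
        for i in range(r * factor, (r + 1) * factor):
            for j in range(c * factor, (c + 1) * factor):
                if image[i][j] >= threshold:
                    hits.append((i, j))
    return hits
```

In this sketch the first set of objects corresponds to the anchors found in the down-sampled data, and the second set corresponds to the hits found by selectively processing only part of the original frame.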

Wu et al. disclose all other aspects of the claim, including down-sampling data from two sensors (image sensors and lidar sensors). Amato et al. is therefore relied upon to teach a refined search process applied to the down-sampled camera, radar, and lidar data provided by Wu et al.

Wu et al. and Amato et al. are in the same art of object detection using sensors (Wu et al., abstract; Amato et al., abstract, [0074]). The combination of Amato et al. with Wu et al. will enable selective processing of portions of the higher-resolution data. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the processing of Amato et al. with the invention of Wu et al. because this processing was known at the time of filing, the combination would have had predictable results, and Amato et al. indicate that “Various embodiments described herein can improve a computing device's ability to perform object detection by reducing processing time used to perform the object detection (e.g., through a coarse-to-fine strategy) and by increasing the object detection rate achieved by the computing device (e.g., through exploiting a higher resolution representation and a larger relative object size). In doing so, various embodiments can improve the robustness and efficiency in visual object recognition tasks” ([0016]), thereby indicating how the accuracy and efficiency of the hybrid detection scheme described by Wu et al. may be improved by the method of Amato et al.

Regarding claims 5 and 15, Wu et al. and Amato et al. disclose the system and method of claims 1 and 11. Wu et al. further indicate the controller is configured to navigate the vehicle based on the identified first and second sets of the objects (sensor network used in an automobile 100. In an autonomous driving system, [0020], Autonomous driving perception, including bounding box detections, classification, semantic segmentation, and instance segmentation, relies on individual deep neural nets that consume primitive sensory data and then the individual outputs are fused at an object level for decision-making. This is known as late fusion. Alternatively, an early fusion system can improve reliability by employing a single large neural network that consumes primitive sensor data and outputs a joint perception result. A typical sensor primitive can include, for example, detection/peak clusters for radar, image pixel maps for cameras, and 3D point clouds for lidar, [0028]).

Regarding claims 6 and 16, Wu et al. and Amato et al. disclose the system and method of claims 1 and 11. Wu et al. further indicate the first data is three-dimensional and the second data is two- or three-dimensional (including camera sensors, radar sensors, and lidar sensors, [0015], [0020], newer imaging radar sensors can map out surroundings in a three-dimensional point cloud in high resolution, [0021], lidar allows creating 3D images of detected objects and mapping the surroundings, [0022], radar point cloud or detection clusters, camera image pixel maps, and lidar point clouds are processed, [0025], pixel mapping appropriate to the sensor type (e.g., projected radar/LiDAR point cloud or voxelized radar/LiDAR detections), [0047]).

Regarding claims 7 and 17, Wu et al. and Amato et al. disclose the system and method of claims 1 and 11. Wu et al. further indicate the first sensor is a Lidar sensor and the second sensor is a camera (including camera sensors, radar sensors, and lidar sensors, [0015], [0020], newer imaging radar sensors can map out surroundings in a three-dimensional point cloud in high resolution, [0021], lidar allows creating 3D images of detected objects and mapping the surroundings, [0022], radar point cloud or detection clusters, camera image pixel maps, and lidar point clouds are processed, [0025], pixel mapping appropriate to the sensor type (e.g., projected radar/LiDAR point cloud or voxelized radar/LiDAR detections), [0047]).

Claims 2, 3, 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (US 20210406674 A1) and Amato et al. (US 20210097354 A1) as applied to claims 1 and 11 above, further in view of Clymer et al. (US 20210406602 A1).

Regarding claims 2 and 12, Wu et al. and Amato et al. disclose the system and method of claims 1 and 11. Wu et al. and Amato et al. further indicate the controller is configured to: detect the first set of the objects based on the processing of the downsampled second data (Wu et al., downsampling layer, [0015], detect objects, classify the objects, and determine distances between the objects and the vehicle, [0023], fusion network which utilizes lidar and cameras for object detection, [0027]; Amato et al., down-sampled from an original image, [0038], down-sampled version of an original image captured by a digital security camera, [0042]); generate proposals regarding identities of the objects based on the processing of the down-sampled first data (Wu et al., different feature maps from different edge nodes, [0027]; Amato et al., anchor objects, [0038], [0042]); and confirm identities of the detected first set of the objects based on a first set of the proposals (Amato et al., one or more regions (determined relative to the hands) can be explored at a higher resolution than the current resolution of those one or more regions. In this way, the carried suitcase will not be searched through the whole digital image, [0042]). 

However, to the extent that Amato et al. search for a second set of objects (hands, suitcase) rather than confirm the identity of an already detected object (for instance, confirming that it is a human being), another reference is provided herein.

Clymer et al. teach detect the first set of the objects based on the processing of the downsampled second data (a lower resolution to identify objects, abstract, each image was simply down-sampled to a lower resolution, [0086], each image is down-sampled to 512×512 pixels, [0088]); generate proposals regarding identities of the objects based on the processing of the down-sampled first data (identify objects, abstract, bounding box locations for regions of interest within the image, without distinguishing between classes, [0088]); and confirm identities of the detected first set of the objects based on a first set of the proposals (second deep-learning model to analyze the objects at a higher resolution to classify the objects, abstract, each extracted patch was analyzed at the full original resolution for the classification stage, [0086], outputs from the object detection stage were used as the inputs into the classification stage, [0090]).
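
For illustration only, the two-stage flow attributed to Clymer et al. above (class-agnostic bounding boxes from down-sampled data, followed by classification of each patch at the original resolution) can be sketched in Python. The sketch is not taken from the reference. The names `scale_box`, `crop`, `classify_patch` and `two_stage`, the box layout, and the peak-intensity classifier are hypothetical placeholders for trained models.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]
Box = Tuple[int, int, int, int]  # (row0, col0, row1, col1), end-exclusive


def scale_box(box: Box, factor: int) -> Box:
    """Map a box found on the down-sampled image back to full-resolution coordinates."""
    r0, c0, r1, c1 = box
    return (r0 * factor, c0 * factor, r1 * factor, c1 * factor)


def crop(image: Image, box: Box) -> Image:
    """Extract the full-resolution patch covered by the box."""
    r0, c0, r1, c1 = box
    return [row[c0:c1] for row in image[r0:r1]]


def classify_patch(patch: Image) -> str:
    """Hypothetical classifier: label a patch by its peak intensity."""
    peak = max(max(row) for row in patch)
    return "target" if peak >= 0.9 else "background"


def two_stage(image: Image, lowres_boxes: List[Box], factor: int,
              classifier: Callable[[Image], str] = classify_patch
              ) -> List[Tuple[Box, str]]:
    """Treat each low-resolution box as a proposal and confirm it at full resolution."""
    results = []
    for box in lowres_boxes:
        full_box = scale_box(box, factor)
        results.append((full_box, classifier(crop(image, full_box))))
    return results
```

Under these assumptions, each low-resolution box plays the role of a proposal, and the label returned by the classifier plays the role of the confirmed identity.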

Although Clymer et al. do not use the term “object proposal”, the first stage of Clymer et al. detects an object and establishes a bounding box without classifying the object. This detection is interpreted as the “object proposal” whose identity is established in the subsequent confirmation step.

Wu et al., Amato et al. and Clymer et al. are in the same art of object detection (Wu et al., abstract; Amato et al., abstract; Clymer et al., abstract). The combination of Clymer et al. with Wu et al. and Amato et al. will enable the confirming of object identities. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the confirmation of Clymer et al. with the invention of Wu et al. and Amato et al. because this confirmation was known at the time of filing, the combination would have had predictable results, and Clymer et al. indicate that it will minimize false positives while maintaining a high level of diagnostic accuracy ([0013]). Although this benefit is described for medical applications, Clymer et al. also state that “The described invention, as mentioned above, can be generalized to different applications” ([0081]), indicating that it would have been obvious to combine Clymer et al. with the traffic and person detection device of Wu et al. and Amato et al. to reduce false positives in this detection as well.

Regarding claims 3 and 13, Wu et al. and Amato et al. and Clymer et al. disclose the system and method of claims 2 and 12. Wu et al. and Amato et al. and Clymer et al. further indicate process a second set of the proposals using corresponding data from the first and second data from the frame (Amato et al., objects O1-O3, [0042] [first object is person and second/third set of objects is the hand and suitcase]; Clymer et al., identify object, abstract, bounding box location, [0088]) and identify the second set of the objects based on the processing of the second set of the proposals using the corresponding data from the first and second data from the frame (Amato et al., one or more regions can be explored at a higher resolution than the current resolution of those one or more regions, [0042]; Clymer et al., analyze the objects at a higher resolution to classify the objects, abstract, classification stage, [0086], [0090]).

Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (US 20210406674 A1) and Amato et al. (US 20210097354 A1) as applied to claims 1 and 11 above, further in view of Lucas et al. (US 20150043782 A1).

Regarding claims 4 and 14, Wu et al. and Amato et al. disclose the system and method of claims 1 and 11. Wu et al. further imply the controller is configured to display the identified first and second sets of the objects on a display in the vehicle (“FIG. 1 is a simplified block diagram illustrating an example of a sensor network used in an automobile 100. In an autonomous driving system, for example, data from multiple different types of sensors is used to construct a 360° perception of the environment around the vehicle. Typical sensor types include radar, camera, LiDAR, ultrasound, and a GPS/inertial measurement unit. FIG. 1 illustrates a simplified example of distribution of such sensors throughout the vehicle”, [0020]). However, as the references do not specify that this output is displayed inside the vehicle, another reference is provided herein.

Lucas et al. teach display the identified first and second sets of the objects on a display in the vehicle (a method of detecting obstacles surrounding a vehicle and displaying the obstacles and data related to the obstacles on a digital display inside the vehicle, [0006], detecting obstacles surrounding a vehicle and displaying the obstacles and data related to the obstacles on a digital display inside the vehicle, [0031]).

Wu et al. and Amato et al. and Lucas et al. are in the same art of object detection (Wu et al., abstract; Amato et al., abstract; Lucas et al., abstract). The combination of Lucas et al. with Wu et al. and Amato et al. will enable the display of obstacle data within the car. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the display of Lucas et al. with the invention of Wu et al. and Amato et al. because this display was known at the time of filing, the combination would have had predictable results, and Lucas et al. indicate that this will effectively increase the ability of the driver of the vehicle to view their surroundings, thus improving safety for themselves and those around them ([0004]), which is applicable to the driving application described by Wu et al.

Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (US 20210406674 A1) and Amato et al. (US 20210097354 A1) and Clymer et al. (US 20210406602 A1) as applied to claims 2 and 12 above, further in view of Hwang et al. (US 10906558 B1).

Regarding claims 8 and 18, Wu et al. and Amato et al. and Clymer et al. disclose the system and method of claims 2 and 12. Wu et al. and Amato et al. and Clymer et al. do not disclose the proposals include N1 proposals regarding first objects within a first range of the vehicle and N2 proposals regarding second objects within a second range of the vehicle that is beyond the first range, where N1 and N2 are integers greater than 1, and N1 > N2.

Hwang et al. teach proposals include N1 proposals regarding first objects within a first range of the vehicle and N2 proposals regarding second objects within a second range of the vehicle that is beyond the first range, where N1 and N2 are integers greater than 1, and N1 > N2 (“FIG. 1A illustrates an exemplary environment in which the mechanisms of managing interactions between an autonomous vehicle and objects can be performed, according to some embodiments. The exemplary environment 100 includes a vehicle 102 and multiple objects that are in the vicinity of the vehicle 102 while the vehicle 102 is moving on a highway 106. The objects include the vehicles 104A-E. At time 103, the vehicle 102 is in motion on a highway towards a destination. The vehicle 102 follows a route to reach the destination. While in some scenarios, the highway 106 may include two sections 106A and 106B, in which vehicles move in two opposing directions (e.g., vehicles (e.g., vehicles 104A-E and vehicle 102) moving in a first direction in section 106A and vehicles (e.g., vehicles 105A-B) moving in a second direction that is opposite to the first direction in section 106B), in other scenarios, the highway 106 includes only a single section (e.g., section 106A) where the vehicles move in the same direction. While in some scenarios the vehicle 102 can be located in a portion 106A of the highway that includes an exit 108, in other scenarios, the vehicle is located in a portion of the highway that does not include an exit. When there is an exit, in some scenarios the vehicle 104A may be present when the vehicle 102 is approaching the exit, while in other scenarios, the vehicle 104A is not present. The embodiments described in further detail below will consider the vehicles 104A-E and the vehicles 105A-B and the presence of the exit 108 as part of the portion of the highway in which the vehicle 102 is located”, col. 3, lines 20-65 [N1 = 5 other vehicles, 104A-E, in zone closest to the car, 106A, N2 = 2 other vehicles, 105A-B, in zone furthest from the car], The sensors include one or more cameras, lidar devices, and radars that are mounted on the vehicle 102, where each one of these sensors is associated with a field of view, col. 5, lines 15-25, identify one or more objects that are located in the vicinity of vehicle 102, object detection is performed to recognize and localize multiple objects in the scene captured by the sensors (e.g., camera, lidar, radar). In some embodiments, objects can be recognized by estimating a classification probability and are localized within a bounding box, col. 14, lines 30-50, perception systems 410 includes a combination of cameras and lidars, a combination of cameras and radars, or a combination of lidars and radars, col. 20, lines 25-40,


[Embedded figure: media_image1.png, greyscale]).

Wu et al. and Amato et al. and Hwang et al. are in the same art of object detection (Wu et al., abstract; Amato et al., abstract; Hwang et al., abstract). The combination of Hwang et al. with Wu et al. and Amato et al. and Clymer et al. will enable the detection of objects within ranges. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the ranges of Hwang et al. with the invention of Wu et al. and Amato et al. and Clymer et al. because this was known at the time of filing, the combination would have had predictable results, and Hwang et al. indicate that “Autonomous vehicles are a maturing technology with the potential to reshape mobility by enhancing the safety, accessibility, efficiency, and convenience of automotive transportation. Multiple critical tasks need to be executed by a self-driving vehicle to ensure a safe motion of the vehicle in its environment. These critical tasks include motion planning for the vehicle through a dynamic environment shared with other objects (such as vehicles, pedestrians, buildings, etc.)” (col. 1, lines 10-20), thus indicating an increase in the safety of driving applications such as those described by Wu et al. and a commercial and ethical benefit to combining these detection techniques.

Allowable Subject Matter

Claims 9, 10, 19 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following art is cited as relevant but not sufficient to disclose, teach or fairly suggest the subject matter of these limitations in combination with the independent claims:

US 10509987 B1: (a) if at least one training image is acquired, (i) instructing one or more convolutional layers to generate at least one first feature map by applying one or more convolution operations to at least one first manipulated image corresponding to the training image, (ii) instructing an RPN to generate one or more first object proposals corresponding to each of one or more first objects in the first manipulated image by using the first feature map, (iii) instructing a pooling layer to apply one or more pooling operations to each region, corresponding to each of the first object proposals, on the first feature map, to thereby generate at least one first pooled feature map, and (iv) instructing an FC layer to apply at least one fully connected operation to the first pooled feature map, to thereby generate first object detection information corresponding to the first objects; (b) (i) instructing the target object estimating network to search for a (k−1)-th target region, corresponding to an area, where at least one target object is estimated as located, on a (k−1)-th manipulated image, by referring to one or more (k−1)-th object proposals on the (k−1)-th manipulated image, (ii) if a k-th manipulated image is acquired which corresponds to the (k−1)-th target region on the training image or its one or more resized images, instructing the convolutional layers to apply the convolution operations to the k-th manipulated image, to thereby generate a k-th feature map, (iii) instructing the RPN to generate one or more k-th object proposals corresponding to each of k-th objects on the k-th manipulated image by referring to the k-th feature map, (iv) instructing the pooling layer to apply the pooling operations to each region, corresponding to each of the k-th object proposals, on the k-th feature map, to thereby generate at least one k-th pooled feature map, and (v) instructing the FC layer to apply the fully connected operation to the k-th pooled feature map, to thereby generate k-th object detection information corresponding to the k-th objects, by increasing k from 2 to n; and (c) (i) instructing the target object merging network to generate merged object proposals by merging the first object proposals to the n-th object proposals, and generate merged object detection information by merging the first object detection information to the n-th object detection information, and (ii) instructing an FC loss layer to generate one or more FC losses by referring to the merged object detection information and its corresponding GT, to thereby learn at least part of parameters of the FC layer and the convolutional layers by backpropagating the FC losses, and a learning device, a testing method, and a testing device using the same.

US 20190286932 A1: In at least one embodiment, determining the relevancy of the overlap between the object location proposal and each of the one or more center boxes includes: determining an intersection over union between the center box and the object location proposal; determining a score based on an amount of overlap between the center box and the object location proposal; determining a score based on an amount of overlap between the object location proposal and the center box; and determining the relevancy of the overlap between the object location proposal and the center box based on the determined intersection over union, the determined score based on the amount of overlap between the center box and the object location proposal, and the determined score based on the amount of overlap between the object location proposal and the center box.

US 20210134002 A1: To select the best 3D proposals, a multi-objective problem is formulated to increase a total confidence score while reducing the total intersection over unions (IOUs) between 3D bounding box pairs. In addition, IOUs between 3D bounding box pairs may be referred to herein as “a physical collision” or “a physical overlap constraint.” Instead of simply trusting the high-confidence proposals, the 3D object perception module 318 is configured to select less-confidence proposals to realize less physical collision.

US 20170236290 A1: Conditional random fields (CRFs) provide a natural framework to incorporate all mutual spatiotemporal relationships between proposals, as well as the initial proposal confidences. By using a CRF, the system can connect proposals in one frame with the proposals in the other frames, with pairwise edges being provided between the proposals in the whole video sequence. The pairwise potential for a proposal provides a confidence of a foreground estimate or a background estimate for the proposal. The confidence is based on a linear combination of features of the proposal and another proposal (making up a pair of object proposals) from the refined set of object proposals across the plurality of video frames in a video sequence. For instance, the pairwise potentials can ensure that proposals that are located in a similar position and/or have similar properties have similar foreground or background assignments.

US 10713794 B1: The system may then output, for each patch input 410, an object proposal 430 (e.g., a binary map that identifies the pixels in the image that correspond to the object) and a score 440 (e.g., a scalar quantity predicting whether there is an object in the patch or not). As an example and not by way of limitation, first-pass output 830 may be a semantically-meaningful feature map with multiple channels. As an example and not by way of limitation, first-pass output 830 may include object-proposal encodings (i.e., object proposals 430). The second pass 840 of system 800 is then used in order to extract pixel-level information (i.e., high-resolution information) for image patches 810 in addition to the object-level information obtained in first pass 820. Layer 848 may take as input the output 830 of the first pass. After an input image patch is processed by first pass 820 and second pass 840, system 800 will have generated object-level information (e.g., general identification of a cat in an image) and the pixel-level information (e.g., identification of the edges of the cat). In particular embodiments, system 800 refines it by successively integrating information from earlier layers

US 20210027098 A1: A proposal refinement sub-module of the multi-label classification module is employed to generate locations of objects (e.g., an updated, refined, and/or calibrated object proposal) and assign initial pixel-wise labels to at least a portion of the pixels included in the generated object proposals. As discussed below, the multi-label classification module may generate an object score (e.g., a likelihood and/or confidence score) for each object proposal
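
For reference, the intersection over union (IoU) measure mentioned in the excerpts from US 20190286932 A1 and US 20210134002 A1 above can be computed for two axis-aligned 2D boxes as in the following minimal Python sketch. The box layout `(x_min, y_min, x_max, y_max)` and the function names are assumptions made for illustration and are not drawn from the cited documents.

```python
from typing import Tuple

Box2D = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def area(box: Box2D) -> float:
    """Area of an axis-aligned box; degenerate boxes have zero area."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box2D, b: Box2D) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    union = area(a) + area(b) - intersection
    return intersection / union if union > 0 else 0.0
```

An IoU of 1 indicates identical boxes and an IoU of 0 indicates no overlap.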


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: 

Radecki et al. “All Weather Perception: Joint Data Association, Tracking, and Classification for Autonomous Ground Vehicles”: Experimental evaluation demonstrates robust all-weather data association, tracking, and classification where camera, lidar, and radar sensors complement each other inside the joint probabilistic perception algorithm

[Embedded figure: media_image2.png, greyscale]



US 20200226377 A1: The one or more processors 102 may include an application processor 214, an image processor 216, a communication processor 218, or any other suitable processing device. Similarly, image acquisition devices 104 may include any number of image acquisition devices and components depending on the requirements of a particular application. Image acquisition devices 104 may include one or more image capture devices (e.g., cameras, charge coupling devices (CCDs), or any other type of image sensor). The safety system 200 may also include a data interface communicatively connecting the one or more processors 102 to the one or more image acquisition devices 104. For example, a first data interface may include any wired and/or wireless first link 220, or first links 220 for transmitting image data acquired by the one or more image acquisition devices 104 to the one or more processors 102, e.g., to the image processor 216. Flow 700 may include one or more processors analyzing and/or processing (block 708) the C1, C2, C3, Ep, and En images via the channel inputs to calculate object detection data. This may include, for instance, applying a suitably trained neural network algorithm to the combined image data. As an example, this may include processing frames of the static camera images including the static-based camera sensor data, and processing the encoded information included in the event camera images, each being received via the respective channel inputs, to determine a location and type of one or more objects included in the scene. Again, the training may be performed in accordance with similar types of road scenes in various conditions, such that the CNN may appropriately convolve, downsample, and combine the C1, C2, C3, Ep, and En images to detect the location of objects in the overall road scene as a bounding box together with the classified probability of the type of object in the road scene.

US 9916681 B2: capturing an image representative of a real-world environment; defining, by a processor, a reference point in the image; determining a first set of data representative of an object in the image at the reference point; determining a second set of data representative of a surface of the object; identifying a first subset of data in the second set of data representative a potential shadow on the surface of the object, wherein the potential shadow indicates a shadow may be located on at least a portion of the surface of the object based on a feature of the second set of data; identifying a second subset of data in the second set of data representative of a light source in the image based on the second set of data; determining an anticipated shadow on the surface of the object from the light source, wherein the anticipated shadow is a shadow that may be located on the surface of the object based on an illumination pattern of the light source; generating notional sensory content to augment the image of the real-world environment at the reference point; and determining that the potential shadow matches the anticipated shadow; and in response to the potential shadow matching the anticipated shadow, applying the potential shadow onto the notional sensory content.

US 11062454 B1: receiving first sensor data associated with a first type of sensor, the first sensor data representing a portion of an environment surrounding an autonomous vehicle; receiving second sensor data associated with a second type of sensor, the second sensor data representing a same portion or different portion of the environment as the portion represented by the first sensor data; receiving an object detection, wherein the object detection identifies an object in one or more images; determining, based at least in part on the object detection, a first subset of the first sensor data and a second subset of the second sensor data; inputting the first subset of the first sensor data into a first subnetwork; inputting the second subset of the second sensor data into a second subnetwork; receiving a first output from the first subnetwork and a second output from the second subnetwork; combining, as a combined output, the first output and the second output; inputting a first portion of the combined output into a third subnetwork and a second portion of the combined output into a fourth subnetwork; and receiving a first map from the third subnetwork and a second map from the fourth subnetwork, wherein: the first map indicates at least a first probability that a first point of the first sensor data is associated with the object, and the second map indicates at least a second probability that a second point of the second sensor data is associated with the object.

US 20200064483 A1: In some embodiments, the one or more different types of sensors in the second set of sensors are selected from the group consisting of stereo cameras, lidar units, and ultrasonic sensors. In some embodiments, the second set of sensors comprise a plurality of stereo cameras and a plurality of lidar units. In some embodiments, the plurality of stereo cameras are configured to capture color image and depth data. In some embodiments, data collected by the plurality of stereo cameras and data collected by the plurality of lidar units from the second set of sensors are fused together to generate a set of RGB-D data that is representative of a 3D color map of a region in proximity to or surrounding the vehicle. In some embodiments, the RGB-D data is usable to detect the presence and type of obstacles in a region in proximity to or surrounding the vehicle. In some embodiments, the RGB-D data is fused with data from other types of sensors from the first and/or second sets of sensors to extract more details about a region in proximity to or surrounding the vehicle. In some embodiments, data collected by the plurality of stereo cameras is used for obstacle detection and for generating a first set of obstacle information, wherein data collected by the plurality of lidar units is used for obstacle detection and for generating a second set of obstacle information, and wherein the first and second sets of obstacle information are fused together to generate an environmental map of a region in proximity to or surrounding the vehicle. In some embodiments, different weight values are assigned to the first and second sets of obstacle information depending on a visibility factor of a region in proximity to or surrounding the vehicle. In some embodiments, the visibility factor is determined based on the data collected by the plurality of stereo cameras. 
In some embodiments, the first set of obstacle information is assigned a higher weight value than the second set of obstacle information when the visibility factor is above a predetermined threshold. In some embodiments, the first set of obstacle information is assigned a lower weight value than the second set of obstacle information when the visibility factor is below the predetermined threshold.
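
For illustration only, and not taken from US 20200064483 A1, the visibility-dependent weighting described in the passage above could be sketched as follows. The function name, the weight value of 0.7, the threshold of 0.5, the equal weighting when the visibility factor exactly equals the threshold, and the representation of each set of obstacle information as a list of per-cell confidence values are all assumptions made for this sketch; the reference does not specify them.

```python
from typing import List


def fuse_obstacle_info(
    camera_conf: List[float],
    lidar_conf: List[float],
    visibility: float,
    threshold: float = 0.5,    # hypothetical predetermined threshold
    high_weight: float = 0.7,  # hypothetical "higher" weight value
) -> List[float]:
    """Blend two sets of per-cell obstacle confidences by visibility.

    The first set (stereo cameras) gets the higher weight when the
    visibility factor is above the threshold, and the lower weight when
    it is below. Equal weighting at the threshold is an assumption.
    """
    if len(camera_conf) != len(lidar_conf):
        raise ValueError("both obstacle sets must cover the same cells")
    if not 0.5 < high_weight < 1.0:
        raise ValueError("high_weight must be between 0.5 and 1.0")

    if visibility > threshold:
        w_camera = high_weight
    elif visibility < threshold:
        w_camera = 1.0 - high_weight
    else:
        w_camera = 0.5
    w_lidar = 1.0 - w_camera

    # Weighted sum per cell; the weights always add up to 1.
    return [w_camera * c + w_lidar * l for c, l in zip(camera_conf, lidar_conf)]


if __name__ == "__main__":
    cameras = [1.0, 0.0]
    lidars = [0.0, 1.0]
    print(fuse_obstacle_info(cameras, lidars, visibility=0.9))  # camera-dominant
    print(fuse_obstacle_info(cameras, lidars, visibility=0.1))  # lidar-dominant
```

In this sketch, good visibility makes the fused result follow the camera-derived values more closely, while poor visibility shifts it toward the lidar-derived values.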

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M ENTEZARI HAUSMANN whose telephone number is (571)270-5084. The examiner can normally be reached 10-7 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VINCENT M RUDOLPH, can be reached at (571)272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHELLE M ENTEZARI/
Primary Examiner, Art Unit 2661