DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-2, 6-10, 14-15, 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kwon et al (US PG PUB 2020/0218979).
Regarding Claim 1, Kwon et al teach a depth system (system on chip 2104; Fig 21C and ¶ [0201]), comprising: one or more processors (CPU 2106, GPU 2108, processors 2110; Fig 21C and ¶ [0201]); a memory communicably coupled to the one or more processors (memory is part of the SoC 2104 to unify memory of the CPU 2106 and GPU 2108; Fig 21C and ¶ [0197], [0207]) and storing: a network module including instructions that, when executed by the one or more processors, cause the one or more processors (data store 2116 of SoC 2104 stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0197], [0225]-[0226]) to: generate depth features from sensor data according to whether the sensor data includes sparse depth data (sensor data 102 is generated from a combination of sensors including camera, Lidar sensor and radar sensor and used to predict object distance 106 (predicting depth data from sparse Lidar depth data using sampling 1406); Fig 14 and ¶ [0143]-[0144], [0150]-[0153], [0156]), selectively inject the depth features into a depth model (sampling 1406 (increasing spatial resolution by predicting and incorporating select object depth features into the depth map) is used to improve the machine learning model 104 for the predicted depth map; Figs 14, 17 and ¶ [0156]-[0157]), generate a depth map from at least a monocular image using the depth model that is guided by the depth features when injected (a front-facing monocular camera of vehicle 2100 is used for image generation in sensor data 102 for predicting depth information using the machine learning model 104; Fig 14 and ¶ [0143], [0146]) and provide the depth map as depth estimates of objects represented in the monocular image (predicted depth maps of objects are generated that correspond to the input images at the same spatial resolution; Fig 14 ¶ [0153], [0156]).  
Regarding Claim 2, Kwon et al teach the depth system of claim 1 (as described above), wherein the network module includes instructions (data store 2116 of SoC 2104 stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0197], [0225]-[0226]) to generate the depth features including instructions to use a sparse auxiliary network that is a convolutional encoder to generate the depth features from the sparse depth data (a convolutional neural network is the machine learning model 104 used to predict the object 116 (depth feature generation) that undergoes sampling 1406 to predict object features by increasing resolution through increasing free-space distance 1408 points; Figs 14, 17 and ¶ [0087], [0143], [0150], [0156]-[0157]), and wherein the sparse depth data is part of the sensor data that is acquired from a range sensor (sparse Lidar data is part of the sensor data 102 used for generating the predicted depth maps using the machine learning model 104 and is acquired from a Lidar sensor 2160; Figs 14, 17, 21C and ¶ [0156], [0225]).
Regarding Claim 6, Kwon et al teach the depth system of claim 1 (as described above), wherein the network module includes instructions (the Reduced Instruction Set Computer (RISC) of accelerators 2114 in SoC 2104 stores instructions for image sensors; Fig 21C and ¶ [0214]-[0215]) to acquire the sensor data including at least the monocular image from at least one sensor of a device (a front-facing monocular camera 2170 of vehicle 2100 is used for image generation in sensor data 102; Figs 21B, 21C and ¶ [0146], [0192]-[0193]), wherein the network module includes instructions (data store 2116 of SoC 2104 stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0225]-[0226]) to generate the depth features including instructions to determine whether the sensor data includes sparse depth data in addition to the monocular image (the machine learning model 104 is used to predict object distance 106 and the output depth map data undergoes sampling 1406 for increasing depth data to improve resolution as it relates to the input image; Figs 14, 17 and ¶ [0146], [0150], [0156]-[0157]), and activating a sparse auxiliary network to generate the depth features from the sparse depth data when the sparse depth data is present (after the machine learning model 104 identifies an object 116 (depth feature generation) the predicted depth map then undergoes sampling 1406 to better predict object features by increasing resolution through increasing free-space distance 1408 points; Figs 14, 17 and ¶ [0143], [0150], [0156]-[0157]).
Regarding Claim 7, Kwon et al teach the depth system of claim 1 (as described above), wherein providing the depth map includes controlling a device to navigate through a surrounding environment according to the depth map that identifies distances to objects in the surrounding environment (the vehicle 2100 includes controllers 2136 that may include the SoC 2104 to operate the vehicle in response to sensor data that includes identification of distances to objects in the environment, with the SoC depth map data incorporated into controlling the vehicle through the network interface 2124; Fig 21C and ¶ [0184]-[0185], [0201]).  
Regarding Claim 8, Kwon et al teach the depth system of claim 1 (as described above), wherein the depth system is integrated within a device for autonomously controlling a vehicle (the vehicle 2100 includes the SoC 2104 for depth perception and use for autonomous driving; Fig 21C and ¶ [0201], [0214], [0222]).

Regarding Claim 9, Kwon et al teach a non-transitory computer-readable medium including instructions that when executed by one or more processors cause the one or more processors (data store 2116 of SoC 2104 is understood as a non-transitory computer-readable medium that stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0197], [0225]-[0226]) to: generate depth features from sensor data according to whether the sensor data includes sparse depth data (sensor data 102 is generated from a combination of sensors including camera, Lidar sensor and radar sensor and used to predict object distance 106 (predicting depth data from sparse Lidar depth data using sampling 1406); Fig 14 and ¶ [0143]-[0144], [0150]-[0153], [0156]), selectively inject the depth features into a depth model (sampling 1406 (increasing spatial resolution by predicting and incorporating select object depth features into the depth map) is used to improve the machine learning model 104 for the predicted depth map; Figs 14, 17 and ¶ [0156]-[0157]), generate a depth map from at least a monocular image using the depth model that is guided by the depth features when injected (a front-facing monocular camera of vehicle 2100 is used for image generation in sensor data 102 for predicting depth information using the machine learning model 104; Fig 14 and ¶ [0143], [0146]) and provide the depth map as depth estimates of objects represented in the monocular image (predicted depth maps of objects are generated that correspond to the input images at the same spatial resolution; Fig 14 ¶ [0153], [0156]).  
Regarding Claim 10, Kwon et al teach the non-transitory computer-readable medium of claim 9 (as described above), wherein the instructions to generate the depth features include instructions (data store 2116 of SoC 2104 stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0197], [0225]-[0226]) to generate the depth features including instructions to use a sparse auxiliary network that is a convolutional encoder to generate the depth features from the sparse depth data (a convolutional neural network is the machine learning model 104 used to predict the object 116 (depth feature generation) that undergoes sampling 1406 to predict object features by increasing resolution through increasing free-space distance 1408 points; Figs 14, 17 and ¶ [0087], [0143], [0150], [0156]-[0157]), and wherein the sparse depth data is part of the sensor data that is acquired from a range sensor (sparse Lidar data is part of the sensor data 102 used for generating the predicted depth maps using the machine learning model 104 and is acquired from a Lidar sensor 2160; Figs 14, 17, 21C and ¶ [0156], [0225]).

Regarding Claim 14, Kwon et al teach a method (process 1400 of using system on chip 2104 for depth detection of objects; Fig 14 and ¶ [0143]), comprising: generating depth features from sensor data according to whether the sensor data includes sparse depth data (sensor data 102 is generated from a combination of sensors including camera, Lidar sensor and radar sensor and used to predict object distance 106 (predicting depth data from sparse Lidar depth data using sampling 1406); Fig 14 and ¶ [0143]-[0144], [0150]-[0153], [0156]); selectively injecting the depth features into a depth model (sampling 1406 (increasing spatial resolution by predicting and incorporating select object depth features into the depth map) is used to improve the machine learning model 104 for the predicted depth map; Figs 14, 17 and ¶ [0156]-[0157]); generating a depth map from at least a monocular image using the depth model that is guided by the depth features when injected (a front-facing monocular camera of vehicle 2100 is used for image generation in sensor data 102 for predicting depth information using the machine learning model 104; Fig 14 and ¶ [0143], [0146]); and providing the depth map as depth estimates of objects represented in the monocular image (predicted depth maps of objects are generated that correspond to the input images at the same spatial resolution; Fig 14 ¶ [0153], [0156]).
Regarding Claim 15, Kwon et al teach the method of claim 14 (as described above), wherein generating the depth features includes using a sparse auxiliary network that is a convolutional encoder to generate the depth features from the sparse depth data (a convolutional neural network is the machine learning model 104 used to predict the object 116 (depth feature generation) that undergoes sampling 1406 to predict object features by increasing resolution through increasing free-space distance 1408 points; Figs 14, 17 and ¶ [0087], [0143], [0150], [0156]-[0157]), and wherein the sparse depth data is part of the sensor data that is acquired from a range sensor (sparse Lidar data is part of the sensor data 102 used for generating the predicted depth maps using the machine learning model 104 and is acquired from a Lidar sensor 2160; Figs 14, 17, 21C and ¶ [0156], [0225]).
Regarding Claim 19, Kwon et al teach the method of claim 14 (as described above), further comprising: acquiring the sensor data including at least the monocular image from at least one sensor of a device (a front-facing monocular camera 2170 of vehicle 2100 is used for image generation in sensor data 102; Figs 21B, 21C and ¶ [0146], [0192]-[0193]), wherein generating the depth features includes determining whether the sensor data includes sparse depth data in addition to the monocular image (the machine learning model 104 is used to predict object distance 106 and the output depth map data undergoes sampling 1406 for increasing depth data to improve resolution as it relates to the input image; Figs 14, 17 and ¶ [0146], [0150], [0156]-[0157]), and activating a sparse auxiliary network to generate the depth features from the sparse depth data when the sparse depth data is present (after the machine learning model 104 identifies an object 116 (depth feature generation) the predicted depth map then undergoes sampling 1406 to better predict object features by increasing resolution through increasing free-space distance 1408 points; Figs 14, 17 and ¶ [0143], [0150], [0156]-[0157]).
Regarding Claim 20, Kwon et al teach the method of claim 14 (as described above), wherein providing the depth map includes controlling a device to navigate through a surrounding environment according to the depth map that identifies distances to objects in the surrounding environment (the vehicle 2100 includes controllers 2136 that may include the SoC 2104 to operate the vehicle in response to sensor data that includes identification of distances to objects in the environment, with the SoC depth map data incorporated into controlling the vehicle through the network interface 2124; Fig 21C and ¶ [0184]-[0185], [0201]).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3-4, 11-12, 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Kwon et al (US PG PUB 2020/0218979) in view of Popov et al (CN 112825134).
Regarding Claim 3, Kwon et al teach the depth system of claim 1 (as described above), wherein the network module includes instructions (data store 2116 of SoC 2104 stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0225]-[0226]) to selectively inject the depth features including instructions to, in response to determining that the sensor data includes the sparse depth data, inject the depth features into the depth model (sampling 1406 (increasing spatial resolution by predicting and incorporating select object depth features into the depth map) is used to improve the machine learning model 104 for the predicted depth map; Figs 14, 17 and ¶ [0156]-[0157]).
Kwon et al does not teach to inject the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model.
Popov et al is analogous art pertinent to the problem solved in this application including to inject the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model (the machine learning model 108 includes using an encoder and decoder for the depth feature extractor 310 of the Radar data 106 and upsampling the feature map to improve the resolution (injecting depth features); Figs 1, 3 and ¶ [0050]).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of this application to combine the teachings of Kwon et al and Popov et al including to inject the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model. Use of an encoder and decoder allows for the contraction and expansion of data during the convolutional neural network processing with a skip connection allowing for improving the feature map resolution, as recognized by Popov et al (¶ [0050]).
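For illustration only, the limitation at issue (injecting depth features by concatenating them with encoder image features before the decoder, and only when sparse depth data is present) can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Kwon et al, Popov et al, or the application: the function names `inject_depth_features` and `decode`, the use of plain lists as feature vectors, and the averaging decoder are all hypothetical stand-ins.

```python
from typing import List, Optional

Vector = List[float]


def inject_depth_features(image_features: Vector,
                          depth_features: Optional[Vector]) -> Vector:
    """Concatenate depth features onto image features only when they exist.

    Hypothetical helper: when no sparse depth data was available, the image
    features pass through unchanged.
    """
    if depth_features is None:
        return list(image_features)
    return list(image_features) + list(depth_features)


def decode(features: Vector) -> float:
    """Stand-in decoder (assumption): averages the features into one value."""
    return sum(features) / len(features)


# Image-only path versus the path with sparse depth features injected.
image_only = inject_depth_features([1.0, 3.0], None)
with_sparse = inject_depth_features([1.0, 3.0], [5.0, 7.0])
```

In the sketch, the decoder receives a longer feature vector only on the path where depth features were injected, which is the selective behavior the claim language describes.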
Regarding Claim 4, Kwon et al in view of Popov et al teach the depth system of claim 3 (as described above), wherein Popov et al teaches the network module includes instructions (the SoC 1304 uses the accelerator 1314 for executing the CNN image processing for object detection; Fig 13C and ¶ [0158]-[0159]) to inject the depth features including instructions to apply learned weights to the depth features and the image features (the neural network can use weights for the detected objects; Fig 13C and ¶ [0149]-[0150], [0164]) prior to concatenating via skip connections of the depth model (a skip connection 312 is used in the depth feature extractor 310 of the machine learning model 108; Figs 1, 3 and ¶ [0050]).

Regarding Claim 11, Kwon et al teach the non-transitory computer-readable medium of claim 9 (as described above), wherein the instructions (data store 2116 of SoC 2104 is understood as a non-transitory computer-readable medium that stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0197], [0225]-[0226]) to selectively inject the depth features include instructions to, in response to determining that the sensor data includes the sparse depth data, inject the depth features into the depth model (sampling 1406 (increasing spatial resolution by predicting and incorporating select object depth features into the depth map) is used to improve the machine learning model 104 for the predicted depth map; Figs 14, 17 and ¶ [0156]-[0157]).
Kwon et al does not teach to inject the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model.
Popov et al is analogous art pertinent to the problem solved in this application including to inject the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model (the machine learning model 108 includes using an encoder and decoder for the depth feature extractor 310 of the Radar data 106 and upsampling the feature map to improve the resolution (injecting depth features); Figs 1, 3 and ¶ [0050]).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of this application to combine the teachings of Kwon et al and Popov et al including to inject the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model. Use of an encoder and decoder allows for the contraction and expansion of data during the convolutional neural network processing with a skip connection allowing for improving the feature map resolution, as recognized by Popov et al (¶ [0050]).
Regarding Claim 12, Kwon et al in view of Popov et al teach the non-transitory computer-readable medium of claim 11 (as described above), wherein Popov et al teaches the instructions (the SoC 1304 uses the accelerator 1314 for executing the CNN image processing for object detection; Fig 13C and ¶ [0158]-[0159]) to inject the depth features including instructions to apply learned weights to the depth features and the image features (the neural network can use weights for the detected objects; Fig 13C and ¶ [0149]-[0150], [0164]) prior to concatenating via skip connections of the depth model (a skip connection 312 is used in the depth feature extractor 310 of the machine learning model 108; Figs 1, 3 and ¶ [0050]).

Regarding Claim 16, Kwon et al teach the method of claim 14 (as described above), wherein selectively injecting the depth features includes, in response to determining that the sensor data includes the sparse depth data, injecting the depth features into the depth model (sampling 1406 (increasing spatial resolution by predicting and incorporating select object depth features into the depth map) is used to improve the machine learning model 104 for the predicted depth map; Figs 14, 17 and ¶ [0156]-[0157]).
Kwon et al does not teach injecting the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model.
Popov et al is analogous art pertinent to the problem solved in this application including injecting the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model (the machine learning model 108 includes using an encoder and decoder for the depth feature extractor 310 of the Radar data 106 and upsampling the feature map to improve the resolution (injecting depth features); Figs 1, 3 and ¶ [0050]).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of this application to combine the teachings of Kwon et al and Popov et al including injecting the depth features into the depth model by concatenating the depth features with image features from an encoder of the depth model and provide concatenated features into a decoder of the depth model. Use of an encoder and decoder allows for the contraction and expansion of data during the convolutional neural network processing with a skip connection allowing for improving the feature map resolution, as recognized by Popov et al (¶ [0050]).
Regarding Claim 17, Kwon et al in view of Popov et al teach the method of claim 16 (as described above), wherein Popov et al teaches injecting the depth features includes applying learned weights to the depth features and the image features (the neural network can use weights for the detected objects; Fig 13C and ¶ [0149]-[0150], [0164]) prior to concatenating via skip connections of the depth model (a skip connection 312 is used in the depth feature extractor 310 of the machine learning model 108; Figs 1, 3 and ¶ [0050]).  

Claims 5, 13, 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kwon et al (US PG PUB 2020/0218979) in view of Yang et al (US PG PUB 2019/0387209).
Regarding Claim 5, Kwon et al teach the depth system of claim 1 (as described above), wherein the network module includes instructions (data store 2116 of SoC 2104 stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0225]-[0226]) to generate the depth map including instructions to apply the depth model to the monocular image (a front-facing monocular camera of vehicle 2100 is used for image generation in sensor data 102 for predicting depth information using the machine learning model 104; Fig 14 and ¶ [0143], [0146]).
	Kwon et al does not teach to apply the depth model to the monocular image by using an encoder of the depth model to encode image features and to use a decoder of the depth model to decode the depth features into the depth map, and wherein the network module includes instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder.  
	Yang et al is analogous art pertinent to the problem solved in this application including to apply the depth model to the monocular image by using an encoder of the depth model to encode image features (a monocular camera 105 is used to produce a single image that is input in the encoder of the encoder-decoder 202 of a convolutional neural network for additive residual signals for the depth map 217; Figs 1, 2 and ¶ [0024], [0031]-[0033]), and to use a decoder of the depth model to decode the depth features into the depth map (the decoder of the encoder-decoder 202 of the convolutional neural network is used to upproject feature maps; ¶ [0033]), and wherein the network module includes instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder (the decoder architecture 202 will use skip-connections to enable high-resolution results of the reconstruction and the resolution for each layer is performed such that the resolutions are separate for each layer; ¶ [0033]-[0035]).
	It would have been obvious to one of ordinary skill in the art to combine the teachings of Kwon et al with Yang et al including applying the depth model to the monocular image by using an encoder of the depth model to encode image features and to use a decoder of the depth model to decode the depth features into the depth map, and wherein the network module includes instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder.  By using an encoder-decoder with skip connections between the encoder and decoder for each of the convolution layers, the decoder can recover high-resolution results with fine-grained details for each layer, thereby improving all resolutions within the depth image, as recognized by Yang et al (¶ [0033]).
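For illustration only, decoding at separate spatial resolutions, with each decoder layer combining the output of the previous decoder layer and a skip connection from the encoder, can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Yang et al, Kwon et al, or the application: the one-dimensional signal, the pairwise-average `downsample`, the repeat-based `upsample`, and the averaging used to merge the skip connection are hypothetical simplifications of a convolutional encoder-decoder.

```python
from typing import List, Tuple

Vector = List[float]


def downsample(x: Vector) -> Vector:
    """Halve the resolution by averaging neighbouring pairs (assumes even length)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]


def upsample(x: Vector) -> Vector:
    """Double the resolution by repeating every value."""
    return [v for v in x for _ in range(2)]


def encode(signal: Vector, levels: int) -> Tuple[Vector, List[Vector]]:
    """Return the bottleneck plus the features kept at each resolution (skips)."""
    skips: List[Vector] = []
    x = list(signal)
    for _ in range(levels):
        skips.append(x)
        x = downsample(x)
    return x, skips


def decode(bottleneck: Vector, skips: List[Vector]) -> Vector:
    """Each layer merges the previous decoder output with the matching skip."""
    x = bottleneck
    for skip in reversed(skips):
        up = upsample(x)
        x = [(u + s) / 2 for u, s in zip(up, skip)]
    return x


bottleneck, skips = encode([1.0, 3.0, 5.0, 7.0], levels=2)
restored = decode(bottleneck, skips)
```

In the sketch, the skip connections carry resolution-specific detail that the bottleneck alone has lost, so the decoder output returns to the input resolution.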

Regarding Claim 13, Kwon et al teach the non-transitory computer-readable medium of claim 9 (as described above), wherein the instructions (data store 2116 of SoC 2104 stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0225]-[0226]) to generate the depth map including instructions to apply the depth model to the monocular image (a front-facing monocular camera of vehicle 2100 is used for image generation in sensor data 102 for predicting depth information using the machine learning model 104; Fig 14 and ¶ [0143], [0146]).
	Kwon et al does not teach to apply the depth model to the monocular image by using an encoder of the depth model to encode image features and to use a decoder of the depth model to decode the depth features into the depth map, and wherein the instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder.  
	Yang et al is analogous art pertinent to the problem solved in this application including to apply the depth model to the monocular image by using an encoder of the depth model to encode image features (a monocular camera 105 is used to produce a single image that is input in the encoder of the encoder-decoder 202 of a convolutional neural network for additive residual signals for the depth map 217; Figs 1, 2 and ¶ [0024], [0031]-[0033]), and to use a decoder of the depth model to decode the depth features into the depth map (the decoder of the encoder-decoder 202 of the convolutional neural network is used to upproject feature maps; ¶ [0033]), and wherein the instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder (the decoder architecture 202 will use skip-connections to enable high-resolution results of the reconstruction and the resolution for each layer is performed such that the resolutions are separate for each layer; ¶ [0033]-[0035]).
	It would have been obvious to one of ordinary skill in the art to combine the teachings of Kwon et al with Yang et al including applying the depth model to the monocular image by using an encoder of the depth model to encode image features and to use a decoder of the depth model to decode the depth features into the depth map, and wherein the instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder.  By using an encoder-decoder with skip connections between the encoder and decoder for each of the convolution layers, the decoder can recover high-resolution results with fine-grained details for each layer, thereby improving all resolutions within the depth image, as recognized by Yang et al (¶ [0033]).

Regarding Claim 18, Kwon et al teach the method of claim 14 (as described above), wherein the network module includes instructions (data store 2116 of SoC 2104 stores neural networks including instructions for object detection, executed on processor; Fig 21C and ¶ [0225]-[0226]) to generate the depth map including instructions to apply the depth model to the monocular image (a front-facing monocular camera of vehicle 2100 is used for image generation in sensor data 102 for predicting depth information using the machine learning model 104; Fig 14 and ¶ [0143], [0146]).
	Kwon et al does not teach to apply the depth model to the monocular image by using an encoder of the depth model to encode image features and to use a decoder of the depth model to decode the depth features into the depth map, and wherein the network module includes instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder.  
	Yang et al is analogous art pertinent to the problem solved in this application including to apply the depth model to the monocular image by using an encoder of the depth model to encode image features (a monocular camera 105 is used to produce a single image that is input in the encoder of the encoder-decoder 202 of a convolutional neural network for additive residual signals for the depth map 217; Figs 1, 2 and ¶ [0024], [0031]-[0033]), and to use a decoder of the depth model to decode the depth features into the depth map (the decoder of the encoder-decoder 202 of the convolutional neural network is used to upproject feature maps; ¶ [0033]), and wherein the network module includes instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder (the decoder architecture 202 will use skip-connections to enable high-resolution results of the reconstruction and the resolution for each layer is performed such that the resolutions are separate for each layer; ¶ [0033]-[0035]).
	It would have been obvious to one of ordinary skill in the art to combine the teachings of Kwon et al with Yang et al including applying the depth model to the monocular image by using an encoder of the depth model to encode image features and to use a decoder of the depth model to decode the depth features into the depth map, and wherein the network module includes instructions to decode the image features at separate spatial resolutions as provided by skip connections between the encoder and the decoder in combination with an output of a previous layer of the decoder.  By using an encoder-decoder with skip connections between the encoder and decoder for each of the convolution layers, the decoder can recover high-resolution results with fine-grained details for each layer, thereby improving all resolutions within the depth image, as recognized by Yang et al (¶ [0033]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Redford et al (WO 2020/188121) teaches a method and system for estimating depth of structures using a combination of images with lidar data.
	Moloney et al (CN 110383340) teaches generation of data from sparse volume data generated from depth sensor to improve the perception of objects observed from an autonomous vehicle.
	Smolyanskiy et al (US PG PUB 2019/0295282) teaches a system for depth estimation using convolution layers and depth data analysis including monocular images for training the depth detection for sparse depth data.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KATHLEEN M BROUGHTON whose telephone number is (571)270-7380. The examiner can normally be reached Monday-Friday 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella, can be reached at 571-272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KATHLEEN M BROUGHTON/Examiner, Art Unit 2667

/MATTHEW C BELLA/Supervisory Patent Examiner, Art Unit 2667