DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant previously filed claims 1-34. Claim 10 has been cancelled. Claims 1, 11, 12, 14, 18, 19, 27-30, and 33 have been amended. Accordingly, claims 1-9 and 11-34 are pending in the current application.
Response to Arguments
Applicant's arguments filed 08/11/2021 have been fully considered but they are not persuasive.
Applicant argues that Paluri in view of Graziosi et al. fails to teach “low resolution depth images correspond to a first low resolution image” or “that the asserted low resolution depth images are obtained using a generation model based on a neural network”. However, Examiner respectfully disagrees. In Paragraph 33, Paluri teaches “During either the training phase or the real-time phase, the feature extraction module 130 extracts features from color images captured by the color sensors 110 and low resolution depth images captured by the depth sensor 115. Additionally, the feature extraction module 130 may also extract features from a three dimensional map generated by the scene reconstruction module 140. Specifically, the features are variables that are relevant for upsampling the depth images captured by the depth sensor 115. In one embodiment, the features extracted by the feature extraction module 130 can include points, edges, surfaces, or objects identified in each image (e.g., color image or depth image). Here, the feature extraction module 130 may perform an image or object recognition algorithms on the color or depth images in order to identify the edges, surfaces of objects. The extracted features may also include a specific color, intensity of a color, shapes, textures, texts within the images, and the like. In various embodiments, [...]” Paluri clearly and unambiguously teaches utilizing both color and infrared depth images in a neural network model to extract features, and to convolve, pool, and classify said features to reduce the total number of features, which results in a reduced data set that serves as an input to the machine learning model.
Graziosi is relied upon to teach generating a first low-resolution image having a resolution lower than a resolution of the input image, which it teaches in Paragraph 41 where it states, “Before the depth encoding 212, the high resolution depth image 202 is down-sampled 211 to reduce the resolution of the depth image. The input depth image can already be a low resolution depth image. Nevertheless, the depth image still needs to be up-sampled for view synthesis.”
Applicant further argues that Paluri in view of Graziosi et al. fails to teach “the input image comprises a color image or an infrared image”. However, examiner respectfully disagrees. In Paragraph 15, Paluri teaches “Generally, each of the color sensors 110, the depth sensor 115, and the scene mapping device 120 captures information regarding a scene and provides the captured information to the computing device 150.” Paluri further describes color images in Paragraph 16, “Referring to the specific components of the system 100, a color sensor 110 of the system 100 can be configured to capture an intensity of a light of a particular color (e.g., a range of wavelengths corresponding to a color of light). An example of a color sensor 110 is a charge-coupled device (CCD). In various embodiments, the system 100 may include multiple color sensors 110 to capture a wide range of color wavelengths. As depicted in FIG. 1, there may be three color sensors 110 that are configured to capture red, green, and blue (RGB) color light, respectively”. Further, in Paragraph 18, Paluri clarifies that the depth sensor is an infrared sensor, “A depth sensor 115 may be configured to capture an image with depth information corresponding to a scene. As an example, a depth sensor 115 may include both a projector, such as an infrared (IR) projector, and a camera that detects the emission of the projector, such as an IR camera.” Thus the input image(s) rely on both a color image and an infrared image to arrive at the end result.
In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Applicant further argues, regarding Claim 18, that Paluri in view of Graziosi et al. fails to teach that “depth images comprise depth information of different degrees of precision”. However, Examiner respectfully disagrees. Graziosi is relied upon to teach the downsampling of depth images, which it teaches in Paragraph 41 where it states, “Before the depth encoding 212, the high resolution depth image 202 is down-sampled 211 to reduce the resolution of the depth image. The input depth image can already be a low resolution depth image. Nevertheless, the depth image still needs to be up-sampled for view synthesis.” This clearly defines depth information of different degrees of precision, because a down-sampled depth image has a reduced resolution, which is considered a lower degree of precision than that of the original depth image.
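For illustration only, the resolution reduction discussed above can be sketched in a few lines of Python. This is a minimal sketch and not code from Graziosi or from the application: the function name `downsample`, the `factor` parameter, and the choice of block averaging are assumptions made here, since Paragraph 41 of Graziosi as quoted does not specify a particular down-sampling filter.

```python
from typing import List

Image = List[List[float]]  # hypothetical image type: rows of depth values


def downsample(image: Image, factor: int = 2) -> Image:
    """Reduce resolution by averaging non-overlapping factor x factor blocks.

    Block averaging is an assumed filter chosen for illustration only.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    result: Image = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            # Collect the values of one block and replace them by their mean.
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


depth = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
low_res = downsample(depth)  # 4x4 input becomes a 2x2 output
```

Each output value summarizes four input values, so fine detail present in the original depth image is no longer individually represented in the down-sampled image.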
Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Applicant is reminded that although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
In light of the above remarks, the claims are rejected using the same art as before.
Claim Rejections - 35 USC § 103
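The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.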
Claims 1-9 and 11-34 are rejected under 35 U.S.C. 103 as being unpatentable over Paluri (US 20190197667 A1) in view of Graziosi et al. (US 20120269458 A1).
Regarding Claim 1, Paluri teaches a method with depth image generation (Paragraph 4), comprising: 
receiving an input image (Paragraph 4; Paragraphs 15-18); 
generating, from the input image, a first reduced set of data (Paragraph 33, “Here, the feature extraction module 130 may perform an image or object recognition algorithms on the color or depth images in order to identify the edges, surfaces of objects. The extracted features may also include a specific color, intensity of a color, shapes, textures, texts within the images, and the like. In various embodiments, the feature extraction module 130 can generate an ordered list of the features for each image, hereafter referred to as the feature vector for the image. In one embodiment, the feature extraction module 130 applies dimensionality reduction (e.g., via linear discriminant analysis (LDA), principle component analysis (PCA), or the like) to reduce the amount of data in the feature vectors for an image to a smaller, more representative set of data. During the training phase, the feature extraction module 130 can provide a feature vector for each image (e.g., each color image and each depth image) to the model training module 145 to be provided as inputs to train a machine learning model.”; this passage discloses extracting features from a high resolution color image and representing said features in a reduced set of data; Paragraph 34);
acquiring a first depth residual image corresponding to the input image by using a first generation model based on a first neural network (Paragraph 5, “In one embodiment, the color information and low resolution depth information are features extracted from the color images and low resolution depth images, respectively. The additional high resolution depth information of the training data can serve as the ground truth for training the machine learning model. In one embodiment, the high resolution depth information are features extracted from a three dimensional reconstruction of a scene that was generated using data captured by a scene mapping device of the system.”; Paragraphs 33-34);
generating a first low-resolution depth image corresponding to the first reduced set of data by using a second generation model based on a second neural network (Paragraphs 5-6; Paragraph 33; Paragraph 34, “In some embodiments, the processes described in reference to the feature extraction module 130 may be performed by the machine learning model itself. As an example, if the machine learning model is a convolutional neural network (CNN), the CNN can extract features from color images captured by the color sensors 110 and low resolution depth images captured by the depth sensor 115. The CNN convolves the pixels of the low resolution depth images or the color images with a patch (e.g., an N×N patch) to identify convolved features. In some embodiments, these identified convolved features can be further pooled and classified to reduce the total number of features in a feature vector.”); and
generating a target depth image corresponding to the input image, based on the first depth residual image and the first low-resolution depth image (Paragraphs 6-7; Paragraphs 33-36),
wherein the input image comprises a color image or an infrared image (Paragraphs 3-6; Paragraph 15, “Generally, each of the color sensors 110, the depth sensor 115, and the scene mapping device 120 captures information regarding a scene and provides the captured information to the computing device 150.”; Paragraph 16, “Referring to the specific components of the system 100, a color sensor 110 of the system 100 can be configured to capture an intensity of a light of a particular color (e.g., a range of wavelengths corresponding to a color of light). An example of a color sensor 110 is a charge-coupled device (CCD). In various embodiments, the system 100 may include multiple color sensors 110 to capture a wide range of color wavelengths. As depicted in FIG. 1, there may be three color sensors 110 that are configured to capture red, green, and blue (RGB) color light, respectively”; Paragraph 18, “A depth sensor 115 may be configured to capture an image with depth information corresponding to a scene. As an example, a depth sensor 115 may include both a projector, such as an infrared (IR) projector, and a camera that detects the emission of the projector, such as an IR camera.”).
However, Paluri does not explicitly teach generating a first low-resolution image having a resolution lower than a resolution of the input image.
Graziosi et al., however, teaches generating, from the input image, a first low-resolution image having a resolution lower than a resolution of the input image (Paragraph 41).
It would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention to have modified the depth image generation as taught in Paluri, to include the lower resolution depth images as shown in Graziosi et al. above in order to reduce the bit rate substantially (See Paluri Paragraph 9).
Regarding Claim 2, Paluri and Graziosi et al. teach the method of claim 1. Paluri further teaches that the generating of the target depth image comprises: upsampling the first low-resolution depth image to a resolution of the input image; and generating the target depth image by combining depth information of the upsampled first low-resolution depth image and depth information of the first depth residual image (Paragraphs 6-7).
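For illustration only, the upsample-and-combine step recited in claim 2 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Paluri, Graziosi, or the application: the function names `upsample_nearest` and `combine_depth`, the nearest-neighbor interpolation, and the `weight` parameter are all hypothetical choices. In the cited references and the claims, the two depth images would be produced by neural network models; here they are supplied as plain lists.

```python
from typing import List

Image = List[List[float]]  # hypothetical image type: rows of depth values


def upsample_nearest(image: Image, factor: int) -> Image:
    """Enlarge an image by repeating each value factor times per axis.

    Nearest-neighbor interpolation is an assumption made for illustration.
    """
    return [[value for value in row for _ in range(factor)]
            for row in image for _ in range(factor)]


def combine_depth(low_res_depth: Image, residual: Image,
                  weight: float = 1.0) -> Image:
    """Upsample the low-resolution depth image to the residual's size and
    add the residual pixel by pixel (a plain sum when weight is 1.0)."""
    factor = len(residual) // len(low_res_depth)
    upsampled = upsample_nearest(low_res_depth, factor)
    if (len(upsampled) != len(residual)
            or len(upsampled[0]) != len(residual[0])):
        raise ValueError("residual size must be a multiple of the low-res size")
    return [[u + weight * r for u, r in zip(u_row, r_row)]
            for u_row, r_row in zip(upsampled, residual)]


low_res_depth = [[1.0, 2.0],
                 [3.0, 4.0]]
residual = [[0.5] * 4 for _ in range(4)]
target = combine_depth(low_res_depth, residual)
```

With `weight` left at 1.0 the combination is a pixel-wise summation; any other value gives one simple form of weighted sum.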
Regarding Claim 3, Paluri and Graziosi et al. teach the method of claim 1. Paluri further teaches that the generating of the first low-resolution depth image comprises: acquiring a second depth residual image corresponding to the first low-resolution image using the second generation model; acquiring a second low-resolution depth image corresponding to the second low-resolution image using a third neural network-based third generation model; and generating the first low-resolution depth image (Paragraphs 5-7; Paragraphs 12-13; Paragraphs 33-34; Paragraphs 40-47).
However, Paluri does not explicitly teach generating a second low-resolution image having a resolution lower than the resolution of the first low-resolution image.
Graziosi et al., however, teaches generating a second low-resolution image having a resolution lower than the resolution of the first low-resolution image (Paragraphs 40-44).
It would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention to have modified the depth image generation as taught in Paluri, to include the lower resolution depth images as shown in Graziosi et al. above in order to reduce the bit rate substantially (See Paluri Paragraph 9).
Regarding Claim 4, Paluri and Graziosi et al. teach the method of claim 3, however, Paluri does not explicitly teach that the generating of the second low-resolution image comprises downsampling the first low-resolution image to generate the second low-resolution image.
Graziosi et al., however, teaches the generating of the second low-resolution image comprises downsampling the first low-resolution image to generate the second low-resolution image (Paragraphs 40-44).
It would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention to have modified the depth image generation as taught in Paluri, to include the lower resolution depth images as shown in Graziosi et al. above in order to reduce the bit rate substantially (See Paluri Paragraph 9).

Regarding Claim 5, Paluri and Graziosi et al. teach the method of claim 3. Paluri further teaches that the generating of the first low-resolution depth image comprises: upsampling the second low-resolution depth image to a resolution of the second depth residual image; and generating the first low-resolution depth image [...] (Paragraphs 40-47).

Regarding Claim 6, Paluri and Graziosi et al. teach the method of claim 3, however Paluri does not explicitly teach that a resolution of the second low-resolution depth image is lower than a resolution of the first low-resolution depth image.
Graziosi et al., however, teaches a resolution of the second low-resolution depth image is lower than a resolution of the first low-resolution depth image (Paragraphs 40-44).
It would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention to have modified the depth image generation as taught in Paluri, to include the lower resolution depth images as shown in Graziosi et al. above in order to reduce the bit rate substantially (See Paluri Paragraph 9).
Regarding Claim 7, Paluri and Graziosi et al. teach the method of claim 3, however Paluri does not explicitly teach that the second depth residual image comprises depth information of a high-frequency component in comparison to the second low-resolution depth image.
Graziosi et al., however, teaches that a second depth residual image comprises depth information of a high-frequency component in comparison to the second low-resolution depth image (Paragraphs 40-44).
It would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention to have modified the depth image generation as taught in Paluri, to include the lower resolution depth images as shown in Graziosi et al. above in order to reduce the bit rate substantially (See Paluri Paragraph 9).
Regarding Claim 8, Paluri and Graziosi et al. teach the method of claim 1, Paluri further teaches that the first low-resolution depth image comprises depth information of a low-frequency component in comparison to the first depth residual image (Paragraphs 40-47).
Regarding Claim 9, Paluri and Graziosi et al. teach the method of claim 1, however, Paluri does not explicitly teach that the generating of the first low-resolution image comprises downsampling the input image to generate the first low-resolution image.
Graziosi, however, teaches that the generating of the first low-resolution image comprises downsampling the input image to generate the first low-resolution image (Paragraphs 40-44).
It would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention to have modified the depth image generation as taught in Paluri, to include the lower resolution depth images as shown in Graziosi et al. above in order to reduce the bit rate substantially (See Paluri Paragraph 9).
Regarding Claim 11, Paluri and Graziosi et al. teach the method of claim 1, Paluri further teaches that the input image comprises the color image and an input depth image, and wherein, in the acquiring of the first depth residual image, the first generation model uses a pixel value of the color image and a pixel value of the input depth image as inputs, and outputs a pixel value of the first depth residual image (Paragraphs 4-5).
Regarding Claim 12, Paluri and Graziosi et al. teach the method of claim 1, wherein the input image comprises the infrared image and an input depth image, and wherein, in the acquiring of the first depth residual image, the first generation model uses a pixel value of the infrared image and a pixel value of the input depth image as inputs, and outputs a pixel value of the first depth residual image (Paragraphs 4-5; Paragraphs 18-23).
Regarding Claim 13, Paluri and Graziosi et al. teach the method of claim 1. Paluri further teaches a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1 (Paragraph 69).
Method claims 14-16 have limitations similar to those rejected in claims 1 and 3-6 above, and are rejected for the same reasons of obviousness as used above.
Regarding Claim 17, Paluri and Graziosi et al. teach the method of claim 14. Paluri further teaches that the generation model comprises a single neural network model (Paragraph 40).
Regarding Claim 18, Paluri teaches a method with depth image generation (Abstract), the method comprising: 
receiving an input image (Paragraph 4); 
acquiring, from the input image, intermediate depth images having a same size using a generation model that is based on a neural network that uses the input image as an input (Paragraphs 40-55); and 
generating a target depth image by combining the acquired intermediate depth images, wherein the intermediate [...] (Paragraphs 40-55),
and wherein the input image comprises a color image or an infrared image (Paragraphs 3-6; Paragraph 15, “Generally, each of the color sensors 110, the depth sensor 115, and the scene mapping device 120 captures information regarding a scene and provides the captured information to the computing device 150.”; Paragraph 16, “Referring to the specific components of the system 100, a color sensor 110 of the system 100 can be configured to capture an intensity of a light of a particular color (e.g., a range of wavelengths corresponding to a color of light). An example of a color sensor 110 is a charge-coupled device (CCD). In various embodiments, the system 100 may include multiple color sensors 110 to capture a wide range of color wavelengths. As depicted in FIG. 1, there may be three color sensors 110 that are configured to capture red, green, and blue (RGB) color light, respectively”; Paragraph 18, “A depth sensor 115 may be configured to capture an image with depth information corresponding to a scene. As an example, a depth sensor 115 may include both a projector, such as an infrared (IR) projector, and a camera that detects the emission of the projector, such as an IR camera.”).
However, Paluri does not explicitly teach that depth images comprise depth information of different degrees of precision.
Graziosi et al., however, teaches that depth images comprise depth information of different degrees of precision (Paragraph 41).
It would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention to have modified the depth image generation as taught in Paluri, to include the lower resolution depth images as shown in Graziosi et al. above in order to reduce the bit rate substantially (See Paluri Paragraph 9).
Apparatus claims 19, 20, 22, 23, and 25-29 are drawn to the apparatus corresponding to the method claimed in claims 1-3, 5-7, and 9-12 above and are rejected for the same reasons of obviousness as used above. Paluri further teaches an apparatus with depth image generation, comprising: a processor (Paragraph 69).
Regarding Claim 21, Paluri and Graziosi et al. teach the apparatus of claim 20. Paluri further teaches that the combining of the depth information of the upsampled first low-resolution depth image and the depth information of the first depth residual image comprises calculating a weighted sum or a summation of depth values of pixel positions corresponding to each other in the first depth residual image and the upsampled first low-resolution depth image (Paragraphs 40-55).
Regarding Claim 24, Paluri and Graziosi et al. teach the apparatus of claim 23. Paluri further teaches that the combining of the depth information of the upsampled second low-resolution depth image and the depth information of the second depth residual image comprises calculating a weighted [...] (Paragraphs 40-55).
Apparatus claims 30-32 are drawn to the apparatus corresponding to the method and have otherwise similar limitations as those claimed in claims 6, 14, and 15 above and are rejected for the same reasons of obviousness as used above. Paluri further teaches an apparatus with depth image generation, comprising: a processor (Paragraph 69).
Apparatus claims 33-34 are drawn to the apparatus corresponding to the method and have limitations otherwise similar to those of claims 18, 20, and 21 above, and are rejected for the same reasons of obviousness as used above. Paluri further teaches an apparatus with depth image generation, comprising: a processor (Paragraph 69).
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARHAN MAHMUD whose telephone number is (571)272-7712.  The examiner can normally be reached on 10-7.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Joseph Ustaris, can be reached at (571)272-7383. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/FARHAN MAHMUD/
Primary Examiner, Art Unit 2483