DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This office action is made in response to Applicant’s remarks filed on 4/4/2022. Claims 1, 13, and 18 have been amended. Claims 1-20 are pending. 

Response to Arguments
Applicant’s amendments regarding Examiner's rejections under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, have been entered. These rejections are accordingly withdrawn.
Applicant’s arguments with respect to Examiner's rejections under 35 USC 102 and 103 have been considered but are not persuasive. Therefore, these rejections are maintained.
Regarding claim 1, Applicant asserts that the cited prior art does not teach the language of the claim because Ros Sanchez does not teach, "wherein the VAE-GAN comprises a shared latent space for generating each of the reconstructed pose vector data, the reconstructed depth map, and reconstructed images" (Remarks at pg. 11). Examiner, however, respectfully disagrees.
Namely, Ros Sanchez teaches wherein the VAE-GAN comprises a shared latent space (e.g. at least latent space 220, see e.g. at least p. 5-11, 32-33, Fig. 2, and related text) for generating each of the reconstructed pose vector data, the reconstructed depth map, and reconstructed images (id., generating characteristic maps, position vectors, and images based on the camera image using the shared latent space).

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-6, 8-11, 13-15, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ros Sanchez (US 2019/0354804 A1).

Regarding claim 1, Ros Sanchez discloses a method (see e.g. at least Abstract, Fig. 4-5, and related text) comprising:
receiving an image from a camera of a vehicle (e.g. at least camera 626, see e.g. at least p. 3, 77, Fig. 6, and related text);
providing the image to a variational autoencoder generative adversarial network (VAE-GAN) (e.g. at least VAE-GAN, generator 200, see e.g. at least p. 5-11, 32-33, Fig. 2, and related text);
receiving from the VAE-GAN reconstructed pose vector data based on the image and a reconstructed depth map based on the image (see e.g. at least p. 5-11, Fig. 2, and related text, generating encoded vectors based on the image directed to the latent space, fed to a decoder and mapping the characteristics associated with different locations in the latent space); and
calculating simultaneous localization and mapping for the vehicle based on the reconstructed pose vector data and the reconstructed depth map (id.);
wherein the VAE-GAN comprises a shared latent space (e.g. at least latent space 220, see e.g. at least p. 5-11, 32-33, Fig. 2, and related text) for generating each of the reconstructed pose vector data, the reconstructed depth map, and reconstructed images (id., generating characteristic maps, position vectors, and images based on the camera image using the shared latent space).
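
For illustration only, the shared-latent-space arrangement recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Ros Sanchez or of the application: the dimensions, the function names (make_linear, reconstruct, and the encoder and decoder names), and the fixed random linear maps standing in for trained networks are all hypothetical, and the variational sampling step and the GAN discriminators are omitted. It shows only that a single encoding of the camera image feeds every decoder.

```python
import random

LATENT_DIM = 8   # hypothetical size of the shared latent vector
IMAGE_SIZE = 16  # hypothetical flattened image length (e.g. a 4x4 patch)
POSE_DIM = 6     # hypothetical pose vector length


def make_linear(n_in, n_out, seed):
    """Build a fixed random linear map as a stand-in for a trained network."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

    def apply(vec):
        if len(vec) != n_in:
            raise ValueError(f"expected {n_in} values, got {len(vec)}")
        return [sum(w * x for w, x in zip(row, vec)) for row in weights]

    return apply


# One encoder writes into the latent space; three decoders read from it.
encode_image = make_linear(IMAGE_SIZE, LATENT_DIM, seed=0)
decode_pose = make_linear(LATENT_DIM, POSE_DIM, seed=1)
decode_depth = make_linear(LATENT_DIM, IMAGE_SIZE, seed=2)
decode_image = make_linear(LATENT_DIM, IMAGE_SIZE, seed=3)


def reconstruct(image):
    """Encode once, then decode every output from the same latent vector."""
    z = encode_image(image)
    return {
        "latent": z,
        "pose": decode_pose(z),
        "depth": decode_depth(z),
        "image": decode_image(z),
    }


# Hypothetical camera image: a flat list of pixel intensities.
outputs = reconstruct([0.5] * IMAGE_SIZE)
```

In this sketch the pose vector, the depth map, and the reconstructed image are each a function of the same latent vector, which is the sense in which the latent space is shared.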

Regarding claim 2, Ros Sanchez teaches that training the VAE-GAN (see e.g. at least p. 5-11, 32-33, Fig. 2, and related text) comprises:
providing a training image to an image encoder of the VAE-GAN, wherein the image encoder is configured to map the training image to a compressed latent representation of the training image (see e.g. at least Fig. 4, and related text);
providing training pose vector data based on the training image to a pose encoder of the VAE-GAN, wherein the pose encoder is configured to map the training pose vector data to a compressed latent representation of the training pose vector data (see e.g. at least p. 5-11, Fig. 5, and related text); and
providing a training depth map based on the training image to a depth encoder of the VAE-GAN, wherein the depth encoder is configured to map the training depth map to a compressed latent representation of the training depth map (id., see also e.g. at least p. 7, 22, 24, 33-35).

Regarding claim 3, Ros Sanchez teaches that the VAE-GAN is trained utilizing a plurality of inputs in tandem, such that each of:
the image encoder (e.g. at least VAE) and a corresponding image decoder (e.g. at least GAN, see e.g. at least p. 33);
the pose encoder and a corresponding pose decoder (id.); and
the depth encoder and a corresponding depth decoder are trained in tandem utilizing the latent space of the VAE-GAN (id.).

Regarding claim 4, Ros Sanchez teaches that each of the training image, the training pose vector data, and the training depth map share the latent space of the VAE-GAN (see e.g. at least p. 5-11, Fig. 4, and related text).

Regarding claim 5, Ros Sanchez teaches that the VAE-GAN comprises an encoded latent space vector that is applicable to each of the training image, the training pose vector data, and the training depth map (see e.g. at least p. 5-11, Fig. 4, and related text).

Regarding claim 6, Ros Sanchez teaches that determining a training pose vector data comprises:
receiving a plurality of stereo images forming a stereo image sequence (see e.g. at least p. 72, Fig. 6, and related text); and
calculating pose vector data for successive images of the stereo image sequence using stereo visual odometry (id.);
wherein the training image provided to the VAE-GAN comprises a single image of a stereo image pair of the stereo image sequence (id.).
Additionally, Pirchheim teaches limitations not expressly disclosed by Ros Sanchez including namely: calculating six Degree of Freedom pose vector data for successive images of the image sequence using stereo visual odometry (see e.g. at least p. 3, 5-8, 28).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Ros Sanchez by calculating six Degree of Freedom pose vector data for successive images of the stereo image sequence using stereo visual odometry as taught by the combination of Ros Sanchez and Pirchheim in order to provide improved tracking and mapping in a reliable manner (Pirchheim: p. 4).

Regarding claim 8, Ros Sanchez teaches that the VAE-GAN comprises an encoder opposite to a decoder, and wherein the decoder comprises a generative adversarial network (GAN) configured to generate an output, wherein the GAN comprises a GAN generator and a GAN discriminator (see e.g. at least p. 33, Fig. 2, and related text).

Regarding claim 9, Ros Sanchez teaches that the VAE-GAN (see e.g. at least p. 33) comprises:
a trained image encoder configured to receive the image (see e.g. at least p. 5-11, 32-33, 77, Fig. 2, 4, and related text);
a trained pose decoder comprising a GAN configured to generate the reconstructed pose vector data based on the image (see e.g. at least p. 5-11, Fig. 2, and related text); and
a trained depth decoder comprising a GAN configured to generate the reconstructed depth map based on the image (id., see also e.g. at least p. 7, 22, 24, 33-35).

Regarding claim 10, Ros Sanchez teaches that the VAE-GAN comprises:
an image encoder configured to map the image to a compressed latent representation (see e.g. at least p. 7, 22, 24, 33-35);
a pose decoder comprising a GAN generator adversarial to a GAN discriminator (see e.g. at least p. 5-11, Fig. 2, and related text);
a depth decoder comprising a GAN generator adversarial to a GAN discriminator (id., see also e.g. at least p. 7, 22, 24, 33-35); and
a latent space, wherein the latent space is common to each of the image encoder, the pose decoder, and the depth decoder (id., see also e.g. at least Fig. 4, and related text).

Regarding claim 11, Ros Sanchez teaches that the latent space of the VAE-GAN comprises an encoded latent space vector utilized for each of the image encoder, the pose decoder, and the depth decoder (see e.g. at least p. 5-11, Fig. 4, and related text).

Regarding claim 13, Ros Sanchez teaches non-transitory computer-readable storage media storing instructions that, when executed by one or more processors (e.g. at least memory, processors, see e.g. at least p. 9, 29), cause the one or more processors to:
receive an image from a camera of a vehicle (e.g. at least camera 626, see e.g. at least p. 3, 77, Fig. 6, and related text);
provide the image to a variational autoencoder generative adversarial network (VAE-GAN) (e.g. at least VAE-GAN, generator 200, see e.g. at least p. 5-11, 32-33, Fig. 2, and related text);
receive from the VAE-GAN reconstructed pose vector data and a reconstructed depth map based on the image (see e.g. at least p. 5-11, Fig. 2, and related text, generating encoded vectors directed to the latent space, fed to a decoder and mapping the characteristics associated with different locations in the latent space); and
calculate simultaneous localization and mapping for the vehicle based on the reconstructed pose vector data and the reconstructed depth map (id.);
wherein the VAE-GAN comprises a latent space for generating each of the reconstructed pose vector data, the reconstructed depth map, and reconstructed images (e.g. at least latent space 220, see e.g. at least p. 5-11, 32-33, Fig. 2, and related text, generating characteristic maps, position vectors, and images based on the camera image using the shared latent space).

Regarding claim 14, Ros Sanchez teaches that the instructions further cause the one or more processors to train the VAE-GAN, wherein training the VAE-GAN (see e.g. at least p. 5-11, 29, 32-33, Fig. 2, and related text) comprises:
providing a training image to an image encoder of the VAE-GAN, wherein the image encoder is configured to map the training image to a compressed latent representation in the latent space (see e.g. at least Fig. 4, and related text);
providing training pose vector data based on the training image to a pose encoder of the VAE-GAN, wherein the pose encoder is configured to map the training pose vector data to a compressed latent representation in the latent space (see e.g. at least p. 5-11, Fig. 5, and related text); and
providing a training depth map based on the training image to a depth encoder of the VAE-GAN, wherein the depth encoder is configured to map the training depth map to a compressed latent representation in the latent space (id., see also e.g. at least p. 7, 22, 24, 33-35).

Regarding claim 15, Ros Sanchez teaches that the instructions cause the one or more processors to train the VAE-GAN utilizing a plurality of inputs in tandem, such that each of:
the image encoder and a corresponding image decoder (see e.g. at least p. 33);
the pose encoder and a corresponding pose decoder (id.); and
the depth encoder and a corresponding depth decoder are trained in tandem such that each of the training image, the training pose vector data, and the training depth map share the latent space of the VAE-GAN (id.).

Regarding claim 17, Ros Sanchez teaches that the VAE-GAN comprises an encoder opposite to a decoder, and wherein the decoder comprises a generative adversarial network (GAN) configured to generate an output, wherein the GAN comprises a GAN generator and a GAN discriminator (see e.g. at least p. 5-11, 32-33, 77, Fig. 2, 4, and related text).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 7, 12, 16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ros Sanchez (US 2019/0354804 A1) in view of Pirchheim (US 2019/0354804 A1).

Regarding claim 7, Pirchheim teaches limitations not expressly disclosed by Ros Sanchez including namely: that the camera of the vehicle comprises a monocular camera configured to capture a sequence of images of an environment of the vehicle, and wherein the image comprises a red-green-blue (RGB) image (see e.g. at least Abstract).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Ros Sanchez such that the camera of the vehicle comprises a monocular camera configured to capture a sequence of images of an environment of the vehicle, and wherein the image comprises a red-green-blue (RGB) image, as taught by Pirchheim, in order to provide improved tracking and mapping in a reliable manner (Pirchheim: p. 4).

Regarding claim 12, Pirchheim teaches limitations not expressly disclosed by Ros Sanchez including namely: that the reconstructed pose vector data comprises six Degree of Freedom pose data pertaining to the camera of the vehicle (see e.g. at least Abstract, p. 3, 27, 32, Fig. 2, 5, and related text).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Ros Sanchez such that the reconstructed pose vector data comprises six Degree of Freedom pose data pertaining to the camera of the vehicle, as taught by Pirchheim, in order to provide improved tracking and mapping in a reliable manner (Pirchheim: p. 4).

Regarding claim 16, Ros Sanchez teaches that the instructions further cause the one or more processors to calculate the training pose vector data based on the training image, wherein calculating the training pose vector data comprises:
receiving a plurality of stereo images forming a stereo image sequence (see e.g. at least p. 72, Fig. 6, and related text); and
calculating pose vector data for successive images of the stereo image sequence using stereo visual odometry (id.);
wherein the training image provided to the VAE-GAN comprises a single image of a stereo image pair of the stereo image sequence (id.).
Additionally, Pirchheim teaches limitations not expressly disclosed by Ros Sanchez including namely: calculating six Degree of Freedom pose vector data for successive images of the image sequence using stereo visual odometry (see e.g. at least p. 3, 5-8, 28).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Ros Sanchez by calculating six Degree of Freedom pose vector data for successive images of the stereo image sequence using stereo visual odometry as taught by the combination of Pirchheim and Ros Sanchez in order to provide improved tracking and mapping in a reliable manner (Pirchheim: p. 4).

Regarding claim 18, Ros Sanchez discloses a system for simultaneous localization and mapping of a vehicle in an environment (see e.g. at least Abstract), the system comprising:
a camera of a vehicle (e.g. at least camera 626, see e.g. at least p. 3, 77, Fig. 6, and related text);
a vehicle controller in communication with the camera, wherein the vehicle controller comprises non-transitory computer readable storage media storing instructions (e.g. at least memory, processors, see e.g. at least p. 9, 29) that, when executed by one or more processors, cause the one or more processors to:
receive an image from the camera of the vehicle (e.g. at least camera 626, see e.g. at least p. 3, 77, Fig. 6, and related text);
provide the image to a variational autoencoder generative adversarial network (VAE-GAN) (e.g. at least VAE-GAN, generator 200, see e.g. at least p. 5-11, 32-33, Fig. 2, and related text);
receive from the VAE-GAN reconstructed pose vector data based on the image (see e.g. at least p. 5-11, Fig. 5, and related text);
receive from the VAE-GAN a reconstructed depth map based on the image (see e.g. at least p. 10-11, 24, 35, 42, 50, 70, 82-83, Fig. 2, and related text, generating encoded vectors directed to the latent space, fed to a decoder and mapping the characteristics associated with different locations in the latent space); and
calculate simultaneous localization and mapping for the vehicle based on one or more of the reconstructed pose vector data and the reconstructed depth map (id.);
wherein the VAE-GAN comprises a latent space for generating each of the reconstructed pose vector data, the reconstructed depth map, and reconstructed images (e.g. at least latent space 220, see e.g. at least p. 5-11, 32-33, Fig. 2, and related text, generating characteristic maps, position vectors, and images based on the camera image using the shared latent space).
Additionally, Pirchheim teaches limitations not expressly disclosed by Ros Sanchez including namely: a monocular camera of a vehicle (see e.g. at least Abstract).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Ros Sanchez by providing a monocular camera of a vehicle and a vehicle controller in communication with the monocular camera, wherein the vehicle controller comprises non-transitory computer readable storage media storing instructions that, when executed by one or more processors, cause the one or more processors to receive an image from the monocular camera of the vehicle, as taught by the combination of Ros Sanchez and Pirchheim, in order to provide improved tracking and mapping in a reliable manner (Pirchheim: p. 4).

Regarding claim 19, modified Ros Sanchez teaches that the VAE-GAN is trained and that training the VAE-GAN comprises:
providing a training image to an image encoder of the VAE-GAN, wherein the image encoder is configured to map the training image to a compressed latent representation of the training image (Ros Sanchez: see e.g. at least Fig. 4, and related text);
providing training pose vector data based on the training image to a pose encoder of the VAE-GAN, wherein the pose encoder is configured to map the training pose vector data to a compressed latent representation of the training pose vector data (Ros Sanchez: see e.g. at least p. 5-11, Fig. 5, and related text); and
providing a training depth map based on the training image to a depth encoder of the VAE-GAN, wherein the depth encoder is configured to map the training depth map to a compressed latent representation of the training depth map (Ros Sanchez: id., see also e.g. at least p. 7, 22, 24, 33-35).

Regarding claim 20, modified Ros Sanchez teaches that the VAE-GAN comprises:
an image encoder configured to map the image to a compressed latent representation (Ros Sanchez: see e.g. at least p. 7, 22, 24, 33-35);
a pose decoder comprising a GAN generator adversarial to a GAN discriminator (Ros Sanchez: see e.g. at least p. 5-11, Fig. 2, and related text);
a depth decoder comprising a GAN generator adversarial to a GAN discriminator (Ros Sanchez: id., see also e.g. at least p. 7, 22, 24, 33-35); and
a latent space, wherein the latent space is common to each of the image encoder, the pose decoder, and the depth decoder (Ros Sanchez: id., see also e.g. at least Fig. 4, and related text).

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
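
As a rough aid to reading the timing rule above, the date arithmetic can be sketched as follows. This is a sketch under stated assumptions and is not an authoritative deadline calculator: the function names (add_months, shortened_period_end) and the example dates are hypothetical, the mailing date of this action is not stated here, "within TWO MONTHS" is assumed to include the two-month date itself, month ends are clamped (for example, 30 November plus three months is taken as the last day of February), and weekends, federal holidays, and extension fees are not considered.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the same day-of-month `months` later, clamped to the month end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_end(mailed, first_reply=None, advisory_mailed=None):
    """End of the shortened statutory period, capped at six months from mailing."""
    three_month = add_months(mailed, 3)
    six_month = add_months(mailed, 6)
    end = three_month
    # Assumption: a reply filed on the two-month date counts as "within".
    replied_early = first_reply is not None and first_reply <= add_months(mailed, 2)
    if replied_early and advisory_mailed is not None and advisory_mailed > three_month:
        # The period runs to the advisory action's mailing date instead.
        end = advisory_mailed
    return min(end, six_month)


# Hypothetical dates, chosen only to exercise both branches of the rule.
example_mailed = date(2022, 5, 10)
default_end = shortened_period_end(example_mailed)
extended_end = shortened_period_end(
    example_mailed,
    first_reply=date(2022, 6, 20),
    advisory_mailed=date(2022, 8, 25),
)
```

With these hypothetical inputs, default_end falls three months after the mailing date, while extended_end moves to the advisory action's mailing date because the first reply was filed within two months.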
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES J HAN whose telephone number is (571) 270-3980.  The examiner can normally be reached on M-Th and every other F (7:30 AM - 5 PM).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Christian Chace, can be reached on 571-272-4190. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHARLES J HAN/
Primary Examiner, Art Unit 3662