Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 2, 9, and 12-16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Suzuki et al., "Joint Multimodal Learning with Deep Generative Models," https://arxiv.org/abs/1611.01891, November 2016, hereinafter "Suzuki."
 	Consider Claims 1 and 19, Suzuki teaches a system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement: an image encoder neural network having a plurality of image encoder parameters (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE); this is a well understood function of variational autoencoders even based on the Applicant’s admission in the background of the original disclosure), wherein the image encoder neural network is configured to: receive an input image; and process the input image in accordance with the image encoder parameters to generate an image encoder output that parameterizes a distribution over possible values for each of a plurality of generative visual factors of variation (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE); this is a well understood function of variational autoencoders even based on the Applicant’s admission in the background of the original disclosure); an image decoder neural network having a plurality of image decoder parameters, wherein the image decoder neural network is configured to: receive an image decoder input comprising a respective value for each of the plurality of visual factors; and process the image decoder input in accordance with the image decoder parameters to generate an output image defined by the values for the visual factors in the image decoder input (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE); this is a well understood function of variational autoencoders even based on the Applicant’s admission in the background of the original disclosure); a symbol encoder neural network having a plurality of symbol encoder parameters, wherein the symbol encoder neural network is configured to: receive a symbol input comprising one or more symbols from a vocabulary of symbols; and process the symbol input in accordance with the symbol encoder parameters to generate a symbol encoder output that parameterizes a distribution over possible values for each of the plurality of generative visual factors of variation (i.e., the text modality, which represents a joint function of the neural network) (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE)); a symbol decoder neural network having a plurality of symbol decoder parameters, wherein the symbol decoder neural network is configured to: receive a symbol decoder input comprising a respective value for each of the plurality of generative visual factors; and process the symbol decoder input in accordance with the symbol decoder parameters to generate a symbol output that includes one or more symbols from the vocabulary of symbols (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE)); and a subsystem configured to: receive a new symbol input comprising one or more symbols from the vocabulary; and generate a new output image that depicts concepts referred to by the new symbol input, comprising: processing the new (i.e., a system formed out of two autoencoder networks, one image autoencoder and one symbol autoencoder, which are coupled by two further subsystems that allow decoding of an image from symbol encoder outputs and vice versa) (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE)).
 	Consider Claim 2, Suzuki teaches wherein the subsystem is further configured to: receive a new input image; and generate a new symbol output that includes one or more symbols that refer to concepts depicted in the new input image, comprising: processing the new input image using the image encoder neural network to generate a new image encoder output for the new input image; sampling, from the distribution parameterized by the new image encoder output, a respective value for each of the plurality of visual factors; and processing a new symbol decoder input comprising the respective values for the visual factors using the symbol decoder neural network to generate the new symbol output (i.e., see the function of the JMVAE illustrated in at least figure 2).
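For illustration only, the processing sequence recited in Claim 2 (encode the image, sample a value for each factor, decode the sampled values into symbols) may be sketched as follows. This is a minimal sketch and is not taken from Suzuki or from the application: the linear stand-in "networks," the dimensions, the diagonal Gaussian parameterization, the thresholding of symbol probabilities, and every name below are assumptions made solely to show the data flow.

```python
import numpy as np

# All sizes below are hypothetical and chosen only for illustration.
NUM_FACTORS = 4   # number of latent visual factors
IMAGE_DIM = 16    # length of a flattened input image
VOCAB_SIZE = 5    # number of symbols in the vocabulary

# Randomly initialised linear maps standing in for trained networks (assumption).
_init = np.random.default_rng(0)
W_ENC = 0.1 * _init.normal(size=(IMAGE_DIM, 2 * NUM_FACTORS))
W_DEC = 0.1 * _init.normal(size=(NUM_FACTORS, VOCAB_SIZE))


def image_encoder(image):
    """Return the mean and log-variance of a diagonal Gaussian over the factors."""
    out = image @ W_ENC
    return out[:NUM_FACTORS], out[NUM_FACTORS:]


def sample_factors(mean, log_var, rng):
    """Draw one value per factor: z = mean + std * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mean.shape)
    return mean + np.exp(0.5 * log_var) * eps


def symbol_decoder(factors):
    """Map factor values to an independent probability for each symbol."""
    logits = factors @ W_DEC
    return 1.0 / (1.0 + np.exp(-logits))


def image_to_symbols(image, rng, threshold=0.5):
    """Encode the image, sample the factors, and decode to symbol indices."""
    mean, log_var = image_encoder(image)
    z = sample_factors(mean, log_var, rng)
    probs = symbol_decoder(z)
    return [i for i, p in enumerate(probs) if p > threshold]
```

The reverse direction (symbol input to output image) would follow the same three steps with the symbol encoder and the image decoder substituted.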
 	Consider Claim 9, Suzuki teaches wherein the symbol encoder neural network and the symbol decoder neural networks comprise feedforward neural networks (e.g., see at least figure 2).


 	Consider Claim 12, Suzuki teaches a method of training a symbol encoder neural network and a symbol decoder neural network to determine trained values of symbol encoder parameters of the symbol encoder neural network and symbol decoder parameters of the symbol decoder neural network, the method comprising: receiving a training symbol input and a training image that matches the training symbol input (i.e., the text modality, which represents a joint function of the neural network) (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE)); processing the training image using the image encoder neural network in accordance with first values of the image encoder parameters to generate a training image encoder output for the training image (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE)); processing the training symbol input using the symbol encoder neural network in accordance with current values of the symbol encoder parameters to determine a training symbol encoder output for the training symbol input (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE)); sampling, from the distribution parameterized by the training symbol encoder output, a respective value for each of the plurality of visual factors (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE)); processing a training symbol decoder input comprising the respective values for the visual factors using the symbol decoder neural network in accordance with current values of the symbol decoder parameters to generate a training symbol output (e.g., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE)); determining a gradient with respect to the symbol encoder parameters and the symbol decoder parameters of an objective function that includes (i) a variational autoencoder (VAE) objective and (ii) a term that encourages alignment between the training symbol encoder output and the training image (i.e., see at least figure 2, which illustrates a Joint Multimodal variational autoencoder (JMVAE), and training of the autoencoder network using the JMVAE-kl variant in at least Fig. 2(b) right, section 3.3).
 	Consider Claim 13, Suzuki teaches further comprising: training the image encoder neural network to generate disentangled representations of the factors to determine the first values of the image encoder parameters (i.e., this limitation is met based on at least an equivalent training of the autoencoder network using the JMVAE-kl variant; see at least Fig. 2(b) right, section 3.3).
 	Consider Claim 14, Suzuki teaches wherein training the image encoder neural network to generate disentangled representations of the factors to determine the first values of the image encoder parameters comprises: training the image encoder neural network and the image decoder neural network jointly using a β-VAE training technique (i.e., this limitation is met based on at least an equivalent training of the autoencoder network using the JMVAE-kl variant; see at least Fig. 2(b) right, section 3.3).
 	Consider Claim 15, Suzuki teaches wherein training the image encoder neural network to generate disentangled representations of the factors to determine the first values of the image encoder parameters comprises: training the image encoder neural network and the image decoder neural network jointly using a β-VAE training technique that replaces a pixel level log-likelihood with a loss in a high-level feature space of a denoising autoencoder (i.e., this limitation is met based on at least an equivalent training of the autoencoder network using the JMVAE-kl variant; see at least Fig. 2(b) right, section 3.3).

 	Consider Claim 16, Suzuki teaches wherein the term that encourages alignment is a KL divergence between (i) the training image encoder output and (ii) the training symbol encoder output (i.e., this limitation is met based on at least an equivalent training of the autoencoder network using the JMVAE-kl variant; see at least Fig. 2(b) right, section 3.3).
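For illustration only, a KL divergence between two encoder outputs of the kind recited in Claim 16 may be computed in closed form when each output parameterizes a diagonal Gaussian. The sketch below rests on assumptions not stated in the claim or confirmed here for Suzuki: that each encoder output is a (mean, log-variance) pair, that the divergence is taken from the image encoder distribution to the symbol encoder distribution, and that the function and argument names are hypothetical.

```python
import numpy as np


def kl_diag_gaussians(mean_p, log_var_p, mean_q, log_var_q):
    """KL( N(mean_p, var_p) || N(mean_q, var_q) ) for diagonal Gaussians.

    Per dimension: 0.5 * [ log(var_q / var_p)
                           + (var_p + (mean_p - mean_q)^2) / var_q - 1 ],
    summed over all dimensions.
    """
    var_p = np.exp(log_var_p)
    var_q = np.exp(log_var_q)
    per_dim = log_var_q - log_var_p + (var_p + (mean_p - mean_q) ** 2) / var_q - 1.0
    return 0.5 * float(np.sum(per_dim))


# Hypothetical encoder outputs for one matched image/symbol training pair.
image_mean, image_log_var = np.array([0.0, 1.0]), np.array([0.0, 0.0])
symbol_mean, symbol_log_var = np.array([0.5, 1.0]), np.array([0.0, 0.2])

alignment_term = kl_diag_gaussians(
    image_mean, image_log_var, symbol_mean, symbol_log_var
)
```

The term is zero when the two distributions coincide and grows as they diverge, which is why adding it to a training objective encourages the two encoders to agree. KL divergence is not symmetric, so the direction chosen here is part of the assumption.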
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Suzuki et al., "Joint Multimodal Learning with Deep Generative Models," https://arxiv.org/abs/1611.01891, November 2016, hereinafter "Suzuki," in view of well-known art.
 	Consider Claim 10, Suzuki teaches the claimed invention except wherein the symbol encoder neural network and the symbol decoder neural networks comprise recurrent neural networks.
 	However, the Examiner takes Official Notice that feedforward and recurrent neural networks are notoriously well known in the art. A feedforward neural network is an artificial neural network in which connections between the nodes do not form a cycle. As such, it differs from its descendant, the recurrent neural network. The feedforward neural network was the first and simplest type of artificial neural network devised.
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to try a recurrent neural network as a matter of alternative design choice.
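For illustration only, the structural distinction noted above (no cycle in a feedforward network, a fed-back hidden state in a recurrent network) may be sketched as follows. This is a generic textbook-style sketch under stated assumptions, not an implementation from Suzuki or the application: the two-layer feedforward pass, the simple Elman-style recurrence, the tanh nonlinearity, and all names are hypothetical.

```python
import numpy as np


def feedforward(x, w_hidden, w_out):
    """Feedforward pass: input -> hidden -> output, with no cycle."""
    hidden = np.tanh(x @ w_hidden)
    return hidden @ w_out


def recurrent(sequence, w_in, w_rec):
    """Recurrent pass: the hidden state h is fed back at every step."""
    h = np.zeros(w_rec.shape[0])
    for x in sequence:
        # The previous h re-enters the computation, forming a cycle over time.
        h = np.tanh(x @ w_in + h @ w_rec)
    return h
```

Because the recurrent hidden state carries information from earlier inputs, the final state generally depends on the order of the input sequence, whereas the feedforward output for a given input depends on that input alone.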
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Suzuki et al., "Joint Multimodal Learning with Deep Generative Models," https://arxiv.org/abs/1611.01891, November 2016, hereinafter "Suzuki," in view of well-known art.
 	Consider Claim 11, Suzuki teaches the claimed invention except wherein the image encoder neural network has been trained to generate disentangled representations of the plurality of factors.
(e.g., see at least the introduction).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to train the image encoder neural network to generate disentangled representations of the plurality of factors for the purpose of improved learning.
Allowable Subject Matter
Claims 3-8 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES TERRELL SHEDRICK whose telephone number is (571)272-8621.  The examiner can normally be reached on 8A-5P.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lester G. Kincaid, can be reached at 571-272-7922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/CHARLES T SHEDRICK/
Primary Examiner, Art Unit 2646