DETAILED ACTION
This action is in response to the claims filed October 30, 2018. Claims 1-20 are pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 19 and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because “a computer readable medium” and a “processing unit” are being interpreted in light of the specification as signal per se and software per se, respectively. The specification states that a computer readable medium (¶0141 “including  or other storage media”) includes both transitory media and non-transitory media. Further, “processing unit” includes but is not limited to a software module (¶0025). Examiner suggests replacing “computer-readable medium” with “non-transitory computer-readable medium.”

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 7-9 and 15-18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding Claim 7/15
	The functions $\mathrm{dec}_{\Psi}$ and $\mathrm{enc}_{\phi}$ are not explicitly defined. For examination purposes, examiner interprets them to correspond to the functions represented by the composite decoder and encoder respectively.
Regarding Claim 8/9/16/17/18
	The functions denoted with an f are not defined. For examination purposes, examiner interprets them to correspond to generic functions defining an aspect of the claimed neural networks.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 9, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. “Unregularized Auto-Encoder with Generative Adversarial Networks for Image Generation”, hereinafter Wang, in view of Zhang et al. “Adversarial Feature Matching for Text Generation”, hereinafter Zhang, further in view of Elaffendi et al. “Text Encoding for Deep Learning Neural Networks: A Reversible Base 64 (Tetrasexagesimal) Integer Transformation (RIT64) Alternative to One Hot Encoding with Applications to Arabic Morphology”, hereinafter Elaffendi, and further in view of Wang et al. “Generative Image Modeling Using Style and Structure Adversarial Networks”, hereinafter Wang2.

Regarding Claims 1 and 19
Wang teaches, A method for training a latent generative adversarial network (GAN), the method comprising: (Abstract “we propose a new Auto-Encoder Generative Adversarial Networks (AEGAN) … we map the random vector into the encoded latent space by adversarial training based on GAN”) receiving, at an encoder neural network, representation of a real [features] and outputting, by the encoder neural network, a latent representation of the real [features], generated from the representation of the real [features]; receiving at a decoder neural network, the latent representation of the real [features], and outputting, by the decoder neural network, a reconstructed representation of the real [features] generated from the latent representation of the real [features] receiving, at the decoder neural network, random noise data or artificial code generated by a generator neural network of the GAN from the random noise data, and outputting, by the decoder neural network, a representation of artificial (Figure 1 AEGAN
    [Image: media_image1.png]
Examiner notes that x and h correspond to the real features and latent representation of real features. The inputs and outputs of the D1 and D2 discriminators correspond to a hybrid discriminator as a whole.) and outputting, by the hybrid discriminator neural network, a probability indicating whether the [reconstructed] representation of artificial [features] and the artificial code received by the hybrid discriminator neural network is similar to the [reconstructed features] and the latent representation of the real [features]. (Figure 1 and Section 3.3 ¶02 “and the discriminator D1 estimates the probability that a latent vector came from Auto-Encoder rather than G1” Examiner notes that, because the combination of D1 and D2 corresponds to the hybrid discriminator neural network, D1 outputting a probability corresponds to the hybrid discriminator neural network outputting a probability.) A (Section 4 and Algorithm 1. Examiner notes that the algorithm presented by Wang is implemented in the experiments section. In order to perform the experiments described, a computer or device comprising processing units and computer readable storage including instructions, such as Algorithm 1, must necessarily be utilized.)
Wang does not explicitly teach, a one-hot representation of a real text, a latent representation of the real text, the real text comprising a sequence of words, a reconstructed softmax representation of the real text generated from the latent representation of the real text, the reconstructed softmax representation of the real text comprising a soft-text that is a continuous representation of the real text, softmax representation of artificial text generated from the artificial code, receiving, at a discriminator neural network a combination of the soft-text and the latent representation of the real text and a combination of the softmax representation of artificial text and the artificial code
Zhang, however, when addressing issues related to mapping a latent representation to a softmax representation to be discriminated by a discriminator, teaches a latent representation of the real text, a reconstructed softmax representation of the real text generated from the latent representation of the real text, (Section 4 ¶002-¶003 “feature vectors encoded from real sentences [latent representation of real text]…The feature vector is then fed into a 900-200-2 fully connected network for the discriminator… with sigmoid activation units connecting the intermediate layers and softmax/tanh units for the top layer of discriminator/encoder” Examiner notes that Zhang teaches using softmax units to map latent text to a softmax representation of the real text.) the reconstructed softmax representation of the real text comprising a soft-text that is a continuous representation of the real text, (Examiner notes that, by definition, the softmax operator outputs a continuous representation of its input.) a softmax representation of artificial text generated from the artificial code (Figure 2 Top. Examiner notes that, just as the discriminator generates a softmax representation in the feature layer, it also generates f̃, synthetic features.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a discriminator using a softmax representation of latent features to discriminate between real and false text in a generative adversarial network as taught by Zhang into the disclosed invention of Wang.
One of ordinary skill in the art would have been motivated to make this modification because doing so “delivers superior performance compared to related approaches produce realistic sentences, and that the learned latent representation space can “smoothly” encode plausible sentences” (Zhang Conclusion).
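For illustration only, and not drawn from Zhang or Wang, the following minimal Python sketch shows why a softmax output is a continuous representation: every vocabulary entry receives a real-valued weight strictly between 0 and 1, and the weights sum to 1. The function name, the four-entry vocabulary size, and the score values are assumptions chosen for the example.

```python
import math

def softmax(scores):
    """Map real-valued scores to a probability vector (illustrative helper)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical decoder scores for one word position over a 4-word vocabulary.
scores = [2.0, 1.0, 0.1, -1.0]
soft_word = softmax(scores)

# Each entry is a continuous weight, unlike a one-hot vector of 0s and a single 1.
print(soft_word)
```

Under these assumptions, the largest score maps to the largest weight, but no entry is exactly 0 or 1.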
Wang/Zhang does not explicitly teach, 
a one-hot representation of a real text, the real text comprising a sequence of words, receiving, at a discriminator neural network a combination of the soft-text and the latent representation of the real text and a combination of the softmax representation of artificial text and the artificial code 
Elaffendi, however, when addressing encoding text usable in neural networks, teaches a one-hot representation of a real text, the real text comprising a sequence of words; (Introduction “One Hot Encoding approaches represent each word in the vocabulary by a numerical positional vector whose elements are all zeros, except for the position of the word in the vocabulary list” Examiner notes that a text with a string or sequence of words in a corresponding vocabulary can be encoded as described, in the context of neural networks for text processing.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate one-hot encoding for a real text string as taught by Elaffendi into the disclosed invention of Wang/Zhang.
One of ordinary skill in the art would have been motivated to make this modification because “One Hot Encoding (OHE) is currently the norm in text encoding for deep learning neural models” (Elaffendi Abstract).
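For illustration only, the one-hot encoding described in the quoted passage of Elaffendi can be sketched as follows. The function name, the vocabulary, and the sample text are hypothetical and are not taken from the reference.

```python
def one_hot_encode(words, vocabulary):
    """Encode each word as a vector of zeros with a single 1 at the word's vocabulary position."""
    position = {word: i for i, word in enumerate(vocabulary)}
    encoded = []
    for word in words:
        vector = [0] * len(vocabulary)
        vector[position[word]] = 1
        encoded.append(vector)
    return encoded

# Hypothetical vocabulary and a real text comprising a sequence of words.
vocabulary = ["the", "cat", "sat", "down"]
text = ["the", "cat", "sat"]
encoded = one_hot_encode(text, vocabulary)
print(encoded)
```

Each word of the sequence yields one vector, so a text of three words over a four-word vocabulary yields three vectors of length four.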
Wang/Zhang/Elaffendi does not explicitly teach, receiving, at a discriminator neural network a combination of the soft-text and the latent representation of the real text and a combination of the softmax representation of artificial text and the artificial code 
Wang2, however, when addressing issues related to receiving multiple real and generated inputs for a discriminator neural network, teaches receiving, at a discriminator neural network a combination of the soft-[feature] and the latent representation of the real [features] and a combination of the softmax representation of artificial [features] and the artificial code (Figure 4
    [Image: media_image2.png]
 Section 5.3 ¶02 “the features extracted from both RGB and depth are concatenated together as inputs for SVM classifier [discriminator]” Examiner notes that the RGB real and generated images correspond to the softmax representations, and the depth maps correspond to the latent feature representations of the real and generated code. These inputs are combined in the input box.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a concatenation layer to allow for the combination of input [image/text] features to be received by a neural network as taught by Wang2 into the disclosed invention of Wang/Zhang/Elaffendi in order to allow for a combination of the soft-text features and softmax representation of the artificial text.
One of ordinary skill in the art would have been motivated to make this modification in order to provide additional feature details to improve the performance of the discriminator, as “our model is 8.2% better than DCGAN and 3.7%” (Wang2 Section 5.3 ¶02).

Regarding Claim 2
	Wang/Zhang/Elaffendi/Wang2 teach Claim 1.
Further Wang2 teaches, wherein the combination of the soft-text and the latent representation of the real text comprises a concatenation of the soft-text and the latent representation of the real text, and the combination of the soft-max  (Figure. 4 
    [Image: media_image2.png]
 Section 5.3 ¶02 “the features extracted from both RGB and depth are concatenated together as inputs for SVM classifier [discriminator]” Examiner notes that the RGB real and generated images correspond to the softmax representations, and the depth maps correspond to the latent feature representations of the real and generated code. These inputs are concatenated in the input box for the discriminator.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a concatenation layer to allow for the combination of inputs to be received by a neural network as taught by Wang2 into the disclosed invention of Wang/Zhang/Elaffendi.
One of ordinary skill in the art would have been motivated to make this modification in order to provide additional feature details to improve the performance of the discriminator (Wang2 Section 5.3 ¶02).
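For illustration only, the concatenation discussed for Claim 2 can be sketched as joining a flattened soft-text with a latent vector to form a single discriminator input. This is a sketch under stated assumptions: the function name and all numeric values are hypothetical, and neither Wang2 nor the application is asserted to use this exact layout.

```python
def concatenate(representation, latent_code):
    """Flatten a per-position representation and append the latent code (illustrative layout)."""
    flattened = [value for position in representation for value in position]
    return flattened + list(latent_code)

# Hypothetical soft-text: two word positions over a 3-word vocabulary.
soft_text = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
# Hypothetical latent representation of the same text.
latent = [0.5, -0.3]

discriminator_input = concatenate(soft_text, latent)
print(discriminator_input)
```

The same helper would apply unchanged to the pair formed by the softmax representation of artificial text and the artificial code, so that both pairs reach the discriminator in the same format.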


Regarding Claim 3
	Wang/Zhang/Elaffendi/Wang2 teach Claim 1.
Further Wang teaches, calculating a reconstruction loss based on a difference between the one-hot representation of the real text and the soft-text output from the decoder neural network; (Section 3.3 ¶03 “Images from training set are first be encoded into latent space by the encoder: hi = Enc(xi), where i ∈ {0, 1, ...,n}, xi represents the image sampled from training set, hi is the latent vector of xi, and n is the size of mini-batches. Together with the decoder, the whole Auto-Encoder networks are trained by minimizing the squared cost function [reconstruction loss]:

    [Equation image: media_image3.png]
” Examiner notes that, as stated previously, when modified with Zhang the decoder output undergoes the soft-max operation, such that Dec(h) corresponds to softmax(Dec(h)).) and updating parameters of the encoder neural network and parameters of the decoder neural network based on the reconstruction loss. (Section 4 Algorithm 1 The Auto-Encoder Generative Adversarial Networks training procedure “Update Enc and Dec by descending: …Update D2 by descending: …Update D1 by descending: …Update G1 by descending:” Examiner notes that updating includes updating the encoder and decoder based in part on the reconstruction loss.)
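For illustration only, a reconstruction loss based on the difference between a one-hot representation and a soft-text can be sketched as a mean squared difference. The averaging over all entries, the function name, and the sample values are assumptions made for this example; the exact normalization used by Wang or recited in the claims is not reproduced here.

```python
def reconstruction_loss(one_hot_text, soft_text):
    """Mean squared difference between one-hot targets and soft-text outputs (assumed form)."""
    total = 0.0
    count = 0
    for target, predicted in zip(one_hot_text, soft_text):
        for t, p in zip(target, predicted):
            total += (t - p) ** 2
            count += 1
    return total / count

# Hypothetical one-hot text (two words, 3-word vocabulary) and decoder soft-text.
one_hot_text = [[1, 0, 0], [0, 1, 0]]
soft_text = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]

loss = reconstruction_loss(one_hot_text, soft_text)
print(loss)
```

A perfect reconstruction, in which the soft-text equals the one-hot representation, gives a loss of zero under this assumed form.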

Regarding Claim 4
	Wang/Zhang/Elaffendi/Wang2 teach Claim 1.
Further Wang as modified by Wang2 teaches, calculating a discriminator loss based on the soft-text, the artificial code, the soft-max representation of artificial text, and the latent representation of the real text; (Section 3.3 ¶04 “Then, both kinds of latent vectors are fed into the Decoder to get the generated images x˜i and the reconstructed images xi′. Discriminator D2 [first discriminator] is trained to distinguish them from each other…
    [Equation image: media_image4.png]
” Section 3.3 ¶04 “the discriminator D1, which takes a real latent vector hi or a generated one ˜hi as input, is trained to classify inputs into two classes (real or fake)…
    [Equation image: media_image5.png]
” Examiner notes that the hybrid discriminator loss must be a composition of the losses of the two component networks merged together. Thus the composite loss is based on the stated elements.) and updating parameters of the hybrid discriminator neural network, parameters of the encoder neural network, and parameters the decoder neural network based on the discriminator loss. (Section 4 Algorithm 1 The Auto-Encoder Generative Adversarial Networks training procedure “Update Enc and Dec by descending: … Update D2 by descending: … Update D1 by descending: … Update G1 by descending:” Examiner notes that updating includes updating the parameters of the discriminators and encoder based in part on the composite discriminator loss.)

Regarding Claim 5
	Wang/Zhang/Elaffendi/Wang2 teach Claim 1
Further Wang teaches, calculating a generator loss that maximizes the probability output by the hybrid discriminator neural network (Section 3.3 ¶03 “while G1 is trained to "fool" both D1 and D2… The discriminators loss LD1, LD2 and the generator loss LG1 are defined as follows” Examiner notes that a generator trained to fool the composite discriminator, that outputs a probability as stated in the rejection of claim 1, by calculation of generator loss, would be maximizing the probability that the discriminator chooses the artificial data generated by the generator.) and updating parameters of the generator neural (Section 4 Algorithm 1 The Auto-Encoder Generative Adversarial Networks training procedure “Update Enc and Dec by descending: … Update D2 by descending: … Update D1 by descending: … Update G1 by descending:” Examiner notes that updating includes updating the parameters of the decoder and generator based in part on the generator loss)
Further Wang2 teaches, based on a concatenation of the soft-max representation of artificial text and the artificial code; (Examiner notes that the concatenation of the text/artificial text and latent representation of text/code taught by Wang2 as addressed in the rejection of claim 1, when combined with the teaching of the prior art presented results in a composite hybrid discriminator neural network whose parameters are based on a concatenation.)

Regarding Claim 6
	Wang/Zhang/Elaffendi/Wang2 teach Claim 1
	Wang teaches, and repeating the receiving the one-hot representation of the real text, the receiving the artificial code, the receiving the latent representation of the real text, the receiving the combination of the soft-text and the latent representation of the real text and the combination of the soft-max representation of artificial text and the artificial code, and the outputting the probability. (Algorithm 1 The Auto-Encoder Generative Adversarial Networks training procedure. “while (Enc,Dec,G1,D1,D2) not converged do…sample... from the training set…” Examiner notes that the process of optimizing or training entails repeatedly receiving and outputting the respective vectors into the component neural networks.) determining that the combination of the soft-text and the latent representation of the real text and the combination of the soft-max representation of artificial text and the artificial code can be discriminated by the hybrid discriminator neural network; (Further, the process of determining whether discrimination can be done by the hybrid discriminator neural network, under the broadest reasonable interpretation, is performed while training the model, because the probability output by the discriminator is a proxy for the ability of the discriminator to discriminate between the artificial code and real text.)

Regarding Claim 7
	Wang/Zhang/Elaffendi/Wang2 teach Claim 1
Further Wang teaches, 

    [Claim equation image: media_image6.png]
 (Section 3.3 ¶03 “Images from training set are first be encoded into latent space by the encoder: hi = Enc(xi), where i ∈ {0, 1, ...,n}, xi represents the image sampled from training set, hi is the latent vector of xi, and n is the size of mini-batches. Together with the decoder, the whole Auto-Encoder networks are trained by minimizing the squared cost function [reconstruction loss]:

    [Equation image: media_image3.png]
” Examiner notes that, as stated previously, when modified with Zhang the decoder output undergoes the soft-max operation. Further, “training” entails not only calculating but also updating. Furthermore, the coefficient 1/nHW is only a scalar multiple that would have been obvious for a person having ordinary skill in the art (PHOSITA) to remove, as this scalar does not affect the error calculation.)
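For illustration only, the following sketch shows the general property relied on above: multiplying a squared-error loss by a positive constant rescales the loss value but leaves the minimizing parameter unchanged. The data, the one-parameter model, the grid search, and the particular values of n, H, and W are assumptions chosen for the example and are not taken from Wang.

```python
def squared_error(theta, data):
    """Sum of squared differences between each data point and a single parameter."""
    return sum((x - theta) ** 2 for x in data)

# Hypothetical data and a coarse grid of candidate parameter values from 0.00 to 3.00.
data = [1.0, 2.0, 3.0]
candidates = [i / 100 for i in range(0, 301)]

# Hypothetical normalizing coefficient 1/(n*H*W) with n=3, H=2, W=2.
scale = 1.0 / (3 * 2 * 2)

best_plain = min(candidates, key=lambda t: squared_error(t, data))
best_scaled = min(candidates, key=lambda t: scale * squared_error(t, data))

print(best_plain, best_scaled)
```

Both searches select the same parameter value, which is the sense in which a positive scalar multiple does not change the outcome of minimizing the loss.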

Regarding Claim 9
	Wang/Zhang/Elaffendi/Wang2 teach Claim 1
Further Wang teaches, 
    [Claim equation image: media_image7.png]
 (Section 3.3 ¶04 “while G1 is trained to "fool" both D1 and D2… 
    [Equation image: media_image8.png]
” Section 4 ¶02 “We experimented various values of the hyperparameter λ, and found that λ = 1 works well in all reported experiments” Examiner notes that the expected value of the discriminators' output necessarily depends on a function of not only the synthetic features (x and h) but also the real features (x and h), even though they are not expressly depicted in the equation. Thus, E[log D1] and E[log D2] are equivalent to the terms presented in the claim. Furthermore, λ is simply a hyperparameter for learning rate, whose sign, positive or negative, defines the loss as a maximization problem or minimization problem.)


Claims 10-15, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. “Unregularized Auto-Encoder with Generative Adversarial Networks for Image Generation”, hereinafter Wang, in view of Zhang et al. “Adversarial Feature Matching for Text Generation”, hereinafter Zhang, and further in view of Elaffendi et al. “Text Encoding for Deep Learning Neural Networks: A Reversible Base 64 (Tetrasexagesimal) Integer Transformation (RIT64) Alternative to One Hot Encoding with Applications to Arabic Morphology”, hereinafter Elaffendi.

Regarding Claim 10 and 20
Wang teaches, A method for training a generative adversarial network (GAN) executing on one or more processing units for real [feature] generation, the method comprising (Abstract “we propose a new Auto-Encoder Generative Adversarial Networks (AEGAN) … we map the random vector into the encoded latent space by adversarial training based on GAN” Examiner notes that training implicitly involves execution by at least one processing unit.) receiving, at an encoder neural network, a real [features] and outputting, by the encoder neural network, a latent representation of the real [features] generated from the real [features] receiving, at a decoder neural network, the latent representation of the real [features], and outputting, by the decoder neural network, a reconstructed representation of the real [features] generated from the latent representation of the real [features]; receiving, at the decoder neural network, random noise data or artificial code generated by a generator neural network of the (Figure 1 AEGAN
    [Image: media_image1.png]
Examiner notes that x and h correspond to the real features and latent representation of real features.) and outputting, by the second discriminator neural network, a second probability indicating whether the artificial code or the random noise data received by the second discriminator neural network is similar to the latent representation of the real [features]. (Figure 1 and Section 3.3 ¶02 “and the discriminator D1 estimates the probability that a latent vector came from Auto-Encoder rather than G1” Examiner notes that the first discriminator and second discriminator correspond to D2 and D1 respectively.) (Section 3.3 ¶02 “After that, the discriminator D2 is used to distinguish them from real images” Examiner notes that D2 receives softmax representations of the real and artificial [features] output by the decoder. Given that D1 is capable of distinguishing via a probability estimate, it would have been obvious for D2 to distinguish via a probability output.) A device comprising: one or more processing units; a computer readable storage medium storing programming for execution by the one or more processing units, the programming including instructions for: (Section 4 and Algorithm 1. Examiner notes that the algorithm presented by Wang is implemented in the experiments section. In order to perform the experiments described, a computer or device comprising processing units and computer readable storage including instructions, such as Algorithm 1, must necessarily be utilized.)
Wang does not explicitly teach, a one-hot representation of a real text, a latent representation of the real text, the real text comprising a sequence of words, a reconstructed softmax representation of the real text generated from the latent representation of the real text, the reconstructed softmax representation of the real text comprising a soft-text that is a continuous representation of the real text, 
Zhang, however, when addressing issues related to mapping a latent representation to a softmax representation to be discriminated by a discriminator, teaches a latent representation of the real text, a reconstructed softmax representation of the real text generated from the latent representation of the real text, (Section 4 ¶002-¶003 “feature vectors encoded from real sentences [latent representation of real text]…The feature vector is then fed into a 900-200-2 fully connected network for the discriminator… with sigmoid activation units connecting the intermediate layers and softmax/tanh units for the top layer of discriminator/encoder” Examiner notes that Zhang teaches using softmax units to map latent text to a softmax representation of the real text.) the reconstructed softmax representation of the real text comprising a soft-text that is a continuous representation of the real text, (Examiner notes that, by definition, the softmax operator outputs a continuous representation of its input.) a softmax representation of artificial text generated from the artificial code (Figure 2 Top. Examiner notes that, just as the discriminator generates a softmax representation in the feature layer, it also generates f̃, synthetic features.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a discriminator using a softmax representation of latent features to discriminate between real and false text in a generative adversarial network as taught by Zhang into the disclosed invention of Wang.
One of ordinary skill in the art would have been motivated to make this modification because doing so “delivers superior performance compared to related approaches produce realistic sentences, and that the learned latent representation space can “smoothly” encode plausible sentences” (Zhang Conclusion).
Wang/Zhang does not explicitly teach, a one-hot representation of a real text, the real text comprising a sequence of words; 
Elaffendi however, when addressing encoding text usable in neural networks teaches, one-hot representation of the real text, the real text comprising a sequence of words (Introduction “One Hot Encoding approaches represent each word in the vocabulary by a numerical positional vector whose elements are all zeros, except for the position of the word in the vocabulary list” Examiner notes that a text with a string or sequence of words in a corresponding vocabulary can be encoded as described, in the context of neural networks for text processing.)
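It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate one-hot encoding for a real text string as taught by Elaffendi into the disclosed invention of Wang/Zhang.
One of ordinary skill in the art would have been motivated to make this modification because “One Hot Encoding (OHE) is currently the norm in text encoding for deep learning neural models” (Elaffendi Abstract).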

Regarding Claim 11
	Wang/Zhang/Elaffendi teach Claim 10
Further Wang teaches, calculating a reconstruction loss based on a difference between the one-hot representation of the real text and the soft-text output from the decoder neural network; (Section 3.3 ¶03 “Images from training set are first be encoded into latent space by the encoder: hi = Enc(xi), where i ∈ {0, 1, ...,n}, xi represents the image sampled from training set, hi is the latent vector of xi, and n is the size of mini-batches. Together with the decoder, the whole Auto-Encoder networks are trained by minimizing the squared cost function [reconstruction loss]:

    [Equation image: media_image3.png]
” Examiner notes that, as stated previously, when modified with Zhang the decoder output undergoes the soft-max operation, such that Dec(h) corresponds to softmax(Dec(h)).) and updating parameters of the encoder neural network and parameters of the decoder neural network based on the reconstruction loss. (Section 4 Algorithm 1 The Auto-Encoder Generative Adversarial Networks training procedure “Update Enc and Dec by descending: … Update D2 by descending: … Update D1 by descending: … Update G1 by descending:” Examiner notes that updating includes updating the encoder and decoder based in part on the reconstruction loss.)

Regarding Claim 12
	Wang/Zhang/Elaffendi teach Claim 10
Further Wang as modified by Zhang teaches, calculating a first discriminator loss for the first discriminator neural network based on the soft-text and the soft-max representation of artificial text; (Section 3.3 ¶04 “Then, both kinds of latent vectors are fed into the Decoder to get the generated images x˜i and the reconstructed images xi′. Discriminator D2 [first discriminator] is trained to distinguish them from each other…
    [Equation image: media_image4.png]
” Examiner notes that xi undergoes the soft-max operation before being input into the discriminator.) and updating parameters of the first discriminator neural network and parameters of the decoder neural network based on the first discriminator loss. (Section 4 Algorithm 1 The Auto-Encoder Generative Adversarial Networks training procedure “Update Enc and Dec by descending: … Update D2 by descending: … Update D1 by descending: … Update G1 by descending:” Examiner notes that updating includes updating the parameters of the 1st discriminator and encoder based in part on the 1st discriminator loss.)

Regarding Claim 13
	Wang/Zhang/Elaffendi teach Claim 10
Further Wang as modified by Zhang teaches, calculating a second discriminator loss for the second discriminator neural network based on the artificial code or the random noise data and the latent representation of the real text; (Section 3.3 ¶04 “the discriminator D1, which takes a real latent vector hi or a generated one ˜hi as input, is trained to classify inputs into two classes (real or fake)…
    [Equation image: media_image5.png]
”) and updating parameters of the second discriminator neural network and parameters the encoder neural network based on the second discriminator loss when input to the (Section 4 Algorithm 1 The Auto-Encoder Generative Adversarial Networks training procedure “Update Enc and Dec by descending:… Update D2 by descending: …Update D1 by descending:…Update G1 by descending:” Examiner notes that updating includes updating the parameters of the 2nd discriminator and encoder based in part on the 2nd discriminator loss)

Regarding Claim 14
	Wang/Zhang/Elaffendi teach Claim 10
Further Wang teaches, calculating a generator loss that maximizes the first probability and the second probability; (Section 3.3 ¶04 “while G1 is trained to "fool" both D1 and D2… 
[image: media_image8.png]
” Examiner notes that, in the context of discriminators that output probabilities of being fooled, a generator network that is trained to fool the discriminators would be one that maximizes those probabilities.) and updating parameters of the generator neural network and parameters the decoder neural network based on the generator loss. (Section 4, Algorithm 1, The Auto-Encoder Generative Adversarial Networks training procedure: “Update Enc and Dec by descending: … Update D2 by descending: … Update D1 by descending: … Update G1 by descending:” Examiner notes that updating includes updating the parameters of the decoder and generator based in part on the generator loss.)
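For illustration only, the relationship between "fooling" the discriminators and maximizing their output probabilities can be sketched as below. This is a generic log-probability form assumed for the example, not a reproduction of Wang's equation, which appears above only as an image. The function name, the weight `lam`, and the sample probabilities are hypothetical.

```python
import math


def generator_loss(d1_fake_probs, d2_fake_probs, lam=1.0):
    """Negative mean log-probability that generated samples are judged real.

    d1_fake_probs / d2_fake_probs: outputs of two discriminators on generated
    samples (assumed to lie in (0, 1]). Minimizing this loss is the same as
    maximizing those probabilities.
    """
    mean_log_d1 = sum(math.log(p) for p in d1_fake_probs) / len(d1_fake_probs)
    mean_log_d2 = sum(math.log(p) for p in d2_fake_probs) / len(d2_fake_probs)
    return -(mean_log_d1 + lam * mean_log_d2)


# A generator that rarely fools the discriminators vs. one that usually does.
weak = generator_loss([0.1, 0.2], [0.1, 0.3])
strong = generator_loss([0.8, 0.9], [0.7, 0.9])
```

Under these assumptions the loss falls as the discriminator probabilities rise, so descending the loss corresponds to maximizing both probabilities.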

Regarding Claim 15
	Wang/Zhang/Elaffendi teach Claim 10
Further Wang teaches, 

[image: media_image9.png]
 (Section 3.3 ¶03 “Images from training set are first be encoded into latent space by the encoder: hi = Enc(xi), where i ∈ {0, 1, ...,n}, xi represents the image sampled from training set, hi is the latent vector of xi, and n is the size of mini-batches. Together with the decoder, the whole Auto-Encoder networks are trained by minimizing the squared cost function [reconstruction loss]:

[image: media_image3.png]
” Examiner notes, as stated previously, that when modified with Zhang the decoder output undergoes the soft-max operation. Further, “training” entails not only calculating but also updating. Furthermore, the coefficient 1/nHW is only a scalar multiple, and it would have been obvious to a PHOSITA to remove it, as this scalar does not affect the error calculation.)
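For illustration only, the point about the scalar coefficient can be checked numerically: multiplying a squared cost by a positive constant changes its value but not the parameter that minimizes it. The one-parameter model, the data, and the candidate grid below are hypothetical and are not taken from Wang.

```python
def squared_cost(w, xs, ys, scale=1.0):
    """Scaled sum of squared errors for the one-parameter model y = w * x."""
    return scale * sum((y - w * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
candidates = [i / 10 for i in range(0, 41)]  # w from 0.0 to 4.0 in steps of 0.1

# Minimizer without the coefficient, and with a 1/n coefficient.
best_unscaled = min(candidates, key=lambda w: squared_cost(w, xs, ys))
best_scaled = min(
    candidates, key=lambda w: squared_cost(w, xs, ys, scale=1.0 / len(xs))
)
```

Both searches select the same parameter value, which is the sense in which a positive scalar multiple leaves the minimization unchanged.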

Regarding Claim 18
	Wang/Zhang/Elaffendi teach Claim 10
Further Wang teaches, 
[image: media_image10.png]
 (Section 3.3 ¶04 “while G1 is trained to "fool" both D1 and D2… 
[image: media_image8.png]
” Section 4 ¶02 “We experimented various values of the hyperparameter λ, and found that λ = 1 works well in all reported experiments” Examiner notes that the expected value of the discriminators' output necessarily depends on a function of not only the synthetic features (x̃ and h̃), corresponding to (x̂ and ĉ), but also the real features (x and h), corresponding to (x̃ and c). Thus, E[log D1] and E[log D2] are equivalent to what is presented in the claim. Furthermore, λ is simply a hyperparameter for learning rate, whose sign, positive or negative, defines the loss as a maximization problem or minimization problem. However, Examiner notes that the second alternative equation is not taught by any of the art references in combination because Examiner interprets f_{w2} to be the mapping of discriminator inputs to outputs. When combined, the art does not teach a discriminator taking random noise, z, as input.)

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. “Unregularized Auto-Encoder with Generative Adversarial Networks for Image Generation”. Further in view of Zhang et al. “Adversarial Feature Matching for Text Generation” hereinafter Zhang. Further still in view of Elaffendi et al. “Text Encoding for Deep Learning Neural Networks: A Reversible Base 64 (Tetrasexagesimal) Integer Transformation (RIT64) Alternative to One Hot Encoding with Applications to Arabic Morphology” hereinafter Elaffendi. Further still in view of Wang2 et al. “Deep Fusion Generative Adversarial Networks for Text-to-Image Synthesis” hereinafter Wang2. Further still in view of Gulrajani et al. “Improved Training of Wasserstein GANs” hereinafter Gulrajani.

Regarding Claim 8
Wang/Zhang/Elaffendi/Wang2 teach Claim 1
Wang/Zhang/Elaffendi/Wang2 do not explicitly teach,

[image: media_image11.png]
Gulrajani however, when addressing issues related to improving the training stability for generative models in GANs teaches, 
[image: media_image12.png]
 (Section 4 ¶01 “To circumvent tractability issues, we enforce a soft version of the constraint with a penalty on the gradient norm for random samples x̂ ∼ Px̂. Our new objective is: …
[image: media_image13.png]
 ” Examiner notes that this gradient penalty corresponds to the gradient policy presented by the claim. Furthermore, the critic loss is described in Section 2.2: “where D is the set of 1-Lipschitz functions and Pg is once again the model distribution implicitly defined by x̃ = G(z)” Thus the functions in the red boxes correspond to each other because they both define the expectation of the discriminator with artificial inputs, while the functions in the green boxes define the expectation of the discriminator with the real inputs.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a modified adversarial critic loss function that includes a gradient penalty, as taught by Gulrajani, into the disclosed invention of Wang/Zhang/Elaffendi/Wang2.
One of ordinary skill in the art would have been motivated to make this modification in order to demonstrate “strong modeling performance and stability across a variety of architectures…. our work opens the path for stronger modeling performance on large-scale image datasets and language… adapting our penalty term to the standard GAN objective function, where it might stabilize training by encouraging the discriminator to learn smoother decision boundaries” (Gulrajani Conclusion).
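For illustration only, a critic loss with a gradient penalty of the kind Gulrajani describes, E[D(x̃)] − E[D(x)] + λ·E[(‖∇D(x̂)‖₂ − 1)²], can be sketched as below. The linear critic, the sample points, and the value of `lam` are assumptions made so that the gradient can be written in closed form; this is not the claimed equation and not a reproduction of any reference's implementation.

```python
import math
import random


def critic(w, x):
    """Hypothetical linear critic D(x) = w . x (assumption for this sketch)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def critic_grad_norm(w, x_hat):
    """L2 norm of dD/dx at x_hat; for a linear critic this is ||w|| everywhere."""
    return math.sqrt(sum(wi * wi for wi in w))


def critic_loss_gp(w, real, fake, lam=10.0, rng=None):
    """Mean D(fake) - mean D(real) + lam * mean (||grad D(x_hat)|| - 1)^2."""
    rng = rng or random.Random(0)
    n = len(real)
    gap = (sum(critic(w, f) for f in fake) - sum(critic(w, r) for r in real)) / n
    penalty = 0.0
    for r, f in zip(real, fake):
        eps = rng.random()  # x_hat lies on the line between a real and a fake sample
        x_hat = [eps * ri + (1.0 - eps) * fi for ri, fi in zip(r, f)]
        penalty += (critic_grad_norm(w, x_hat) - 1.0) ** 2
    return gap + lam * penalty / n


# Hypothetical real and artificial samples.
real = [[1.0, 0.0], [0.9, 0.1]]
fake = [[0.0, 1.0], [0.1, 0.9]]
unit = critic_loss_gp([0.6, 0.8], real, fake)   # ||w|| = 1: penalty vanishes
steep = critic_loss_gp([6.0, 8.0], real, fake)  # ||w|| = 10: penalty dominates
```

In this sketch the penalty term is zero when the critic's gradient norm equals one and grows quadratically as the norm departs from one, which is how the penalty softly enforces the 1-Lipschitz constraint quoted from Section 2.2.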

Claims 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. “Unregularized Auto-Encoder with Generative Adversarial Networks for Image Generation”. Further in view of Zhang et al. “Adversarial Feature Matching for Text Generation” hereinafter Zhang. Further still in view of Elaffendi et al. “Text Encoding for Deep Learning Neural Networks: A Reversible Base 64 (Tetrasexagesimal) Integer Transformation (RIT64) Alternative to One Hot Encoding with Applications to Arabic Morphology” hereinafter Elaffendi. Further still in view of Gulrajani et al. “Improved Training of Wasserstein GANs” hereinafter Gulrajani.

Regarding Claim 16
	Wang/Zhang/Elaffendi teach Claim 10
Wang/Zhang/Elaffendi do not explicitly teach,

[image: media_image14.png]
Gulrajani however, when addressing issues related to improving the training stability for generative models in GANs teaches, 
[image: media_image15.png]
 (Section 4 ¶01 “To circumvent tractability issues, we enforce a soft version of the constraint with a penalty on the gradient norm for random samples x̂ ∼ Px̂. Our new objective is: …
[image: media_image13.png]
 ” Examiner notes that this gradient penalty corresponds to the gradient policy presented by the claim. Furthermore, the critic loss is described in Section 2.2: “where D is the set of 1-Lipschitz functions and Pg is once again the model distribution implicitly defined by x̃ = G(z)” Thus the functions in the red boxes correspond to each other because they both define the expectation of the discriminator with artificial inputs, while the functions in the green boxes define the expectation of the discriminator with the real inputs.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a modified adversarial critic loss function that includes a gradient penalty, as taught by Gulrajani, into the disclosed invention of Wang/Zhang/Elaffendi.
One of ordinary skill in the art would have been motivated to make this modification in order to demonstrate “strong modeling performance and stability across a variety of architectures…. our work opens the path for stronger modeling performance on large-scale image datasets and language… adapting our penalty term to the standard GAN objective function, where it might stabilize training by encouraging the discriminator to learn smoother decision boundaries” (Gulrajani Conclusion).

Regarding Claim 17
	Wang/Zhang/Elaffendi teach Claim 10
Wang/Zhang/Elaffendi do not explicitly teach,

[image: media_image16.png]

Gulrajani however, when addressing issues related to improving the training stability for generative models in GANs teaches, 
[image: media_image17.png]
 (Section 4 ¶01 “To circumvent tractability issues, we enforce a soft version of the constraint with a penalty on the gradient norm for random samples x̂ ∼ Px̂. Our new objective is: …
[image: media_image13.png]
 ” Examiner notes that this gradient penalty corresponds to the gradient policy presented by the claim. The gradient policies for both presented alternative equations are understood to represent the same function, f_{w2}, where both c̅ and c̅1 correspond to x̂. Furthermore, the critic loss is described in Section 2.2: “where D is the set of 1-Lipschitz functions and Pg is once again the model distribution implicitly defined by x̃ = G(z)” Thus the functions in the green boxes correspond to each other because they each define the expectation of the discriminator with artificial inputs, whether latent or reconstructed, while the functions in the red boxes define the expectation of the discriminator with the real inputs, whether latent or reconstructed. In this case the set of 1-Lipschitz functions includes the mapping of c and ĉ through the discriminator, such that ĉ corresponds to x̃ and c corresponds to x. However, Examiner notes that the second alternative equation is not taught by any of the art references in combination because Examiner interprets f_{w2} to be the mapping of discriminator inputs to outputs. When combined, the art does not teach a discriminator taking random noise, z, as input.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a modified adversarial critic loss function that includes a gradient penalty, as taught by Gulrajani, into the disclosed invention of Wang/Zhang/Elaffendi.
One of ordinary skill in the art would have been motivated to make this modification in order to demonstrate “strong modeling performance and stability across a variety of architectures…. our work opens the path for stronger modeling performance on large-scale image datasets and language… adapting our penalty term to the standard GAN objective function, where it might stabilize training by encouraging the discriminator to learn smoother decision boundaries” (Gulrajani Conclusion).

Conclusion
Prior art
Kusner et al., “GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution,” further discusses modifying GANs that deal with discrete sequences by applying the Gumbel-softmax distribution to produce a continuous approximation of discrete sequences.
Makhzani et al., “Adversarial Autoencoder,” further discusses architectures that integrate autoencoders with GANs, including outputting a probabilistic discrimination between latent representations and generative outputs.
Spinks et al., “Generating Continuous Representations of Medical Texts,” further discusses an adversarially regularized autoencoder-inspired GAN, wherein a generator mimics the latent representation of discrete text strings.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached on Monday-Friday 7:30 am – 4:00 pm (EST).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at telephone number 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://portal.uspto.gov/external/portal. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

/J.R.G./
Examiner, Art Unit 2122                
/BRIAN M SMITH/Primary Examiner, Art Unit 2122