DETAILED ACTION
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This communication is in response to the Applicant’s submission filed 27 December 2018, where:
Claims 1-20 are pending.
Claims 1-20 are rejected.
Information Disclosure Statement
3.	Information disclosure statements were submitted on 27 December 2018, 18 June 2020, 18 October 2021, and 11 February 2022. The submissions comply with the provisions of 37 CFR 1.97. Accordingly, the Examiner considered the information disclosure statements.
Drawings
4.	The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: 
Reference “802” of Fig. 8,
Reference “814” of Fig. 8, 
Reference “818” of Fig. 8,
Reference “820” of Fig. 8, and
Reference “822” of Fig. 8.
Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Claim Rejections - 35 U.S.C. § 103 
5.	The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
6.	The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. § 103 are summarized as follows:
1. 	Determining the scope and contents of the prior art.
2. 	Ascertaining the differences between the prior art and the claims at issue.
3. 	Resolving the level of ordinary skill in the pertinent art.
4. 	Considering objective evidence present in the application indicating obviousness or nonobviousness.
7.	Claims 1, 2, 4, 6, 8, 9, 11, 13, 15, 16, 18, and 20 are rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20180197071 to Dong et al. [hereinafter Dong] in view of Zheng et al., “Generative adversarial network based telecom fraud detection at the receiving bank,” Neural Networks (March 2018) [hereinafter Zheng].
Regarding claims 1, 8, and 15, Dong teaches [a] system for training a neural network classifier (see at least Dong ¶ 0002) of claim 1, [a] method (see at least Dong ¶ 0004) of claim 8, and [a] non-transitory machine-readable medium (see at least Dong ¶ 0097) having stored thereon machine-readable instructions executable to cause a machine to perform operations (see at least Dong ¶ 0097) of claim 15, comprising:
a non-transitory memory; and one or more hardware processors coupled to the non-transitory memory (Dong ¶ 0032 teaches at least one processor coupled to a memory) and configured to read instructions from the non-transitory memory to cause the system to perform operations (Dong ¶ 0032 teaches stored instructions in memory 210 or in storage 220 may include those enabling the processor 205 to execute one or more aspects of the systems and methods described herein) comprising:
performing a first training process, using a first training dataset (Dong ¶ 0003 teaches the neural network learn[s] how to provide an output for new input data by generalizing the information the neural network learns in the training stage from the training data (that is, performing a first training process, using a first training dataset)) . . . , on a neural network system including an autoencoder including an encoder and a decoder to generate a trained autoencoder (Dong ¶ 0062 teaches an auto-encoder neural network 465, which is an unsupervised learning algorithm that applies backpropagation, for setting the target values to be equal to the inputs. The auto-encoder¹ 465 may be a feedforward, non-recurrent neural network having an input layer (that is, an encoder), an output layer (that is, a decoder) and one or more hidden layers connecting them (that is, a neural network system including an autoencoder). . . . [T]he auto-encoder 465 may be a denoising encoder, a sparse encoder, a variational encoder, a contractive encoder, or any other type of auto-encoder),
wherein a trained encoder of the trained autoencoder is configured to receive a first plurality of input data in an N-dimensional data space and generate a first plurality of latent variables in an M-dimensional latent space, wherein M is an integer less than N (Dong Fig. 6 teaches (Examiner annotations in dashed text-boxes) generating a prediction using a multi-dimension CNN:

[Image media_image1.png: Dong Fig. 6, with Examiner annotations]

Dong ¶ 0057 teaches that the factor classifier 455 accesses raw data from the data repositories 410-430 and categorizes the parameters to be used for training a neural network, such as 3D CNN (that is, a second training dataset). . . . [T]he factor classifier 455 classifies the parameters into 3 categories: main factors 455A, low dimensional auxiliary factors 455B, and high dimensional auxiliary factors 455C (that is, the “factors” are a first plurality of input data in an N-dimensional data space that the trained autoencoder is configured to receive)) and generate a first plurality of latent variables in an M-dimensional latent space, wherein M is an integer less than N (Dong ¶ 0062 recites the auto-encoder uses regression error from back-propagation (645) of the 3D CNN for the compression of the high-dimensional parameters (that is, a number of “high-dimensional parameters” are an N-dimensional data space), by identifying, and abandoning irrelevant components according to the regression errors (that is, the “abandoning irrelevant components” is to reduce the N-dimensional data space to generate a first plurality of latent variables in an M-dimensional latent space, wherein M is an integer less than N)
[Examiner notes that “abandoning irrelevant components” is akin to the “ignoring noise” of the Applicant’s Specification, which recites “The autoencoder 218 may learn to compress an input data xi 222 (e.g., an input transaction) into a latent variable (also referred to as a latent code or a latent representation), denoted as En(xi) in a latent space 212. In an example, the input transaction 222 may have N attributes (e.g., transaction time, transaction type, payor, payee, transaction history, etc.), and as such, is in an N dimensional space. The latent space 212 may have M dimensions, where M is less than N. The decoder 204 may uncompress that latent representation En(xi) into a reconstructed data 224 (denoted as De(En(xi))) that closely matches the input data xi 222. As such, the autoencoder 218 engages in dimensionality reduction, for example by learning how to ignore noise. A reconstruction loss function may be used by the autoencoder 218 to generate a reconstruction error” (Specification ¶ 0020)]);
performing a sampling process to the first plurality of latent variables to generate a first plurality of latent variable samples (Dong ¶ 0058 & Fig. 6 teaches at 615 the number of factors identified (that is, “number of factors identified” is performing a sampling process to the first plurality of latent variables) as low dimensional factors are reduced. In one or more examples, an administrator or another user may identify the factors to be used, thereby reducing the number of low dimensional factors used during training; Dong ¶ 0059 & Table 1 teaches a feature vector that depicts a list of parameters that is to be reduced (that is, to generate a first plurality of latent variable samples), such as a list of the low dimensional factors; Dong ¶ 0062 & Fig. 6 teaches the factor classifier 455 uses an auto-encoder neural network 465, which is an unsupervised learning algorithm that applies backpropagation, for setting the target values to be equal to the inputs (that is, the “latent variables” are the factors 455A-C, which is performing a sampling process to the first plurality of latent variables)
[Examiner notes the Specification recites “The size of the first plurality of latent space samples 506 may be determined based on a performance requirement (e.g., accuracy requirement, training time requirement) for fraud detection” (Specification ¶ 0028). Thus, the plain and ordinary meaning of the term “latent space samples” is an amount or quantity of sample labels, not a value or content of the “latent space sample”]);
generating, using a trained decoder of the trained autoencoder, a second training dataset in the N-dimensional data space (Dong, Fig. 6 teaches TAZ Cubes 610 (that is, second training dataset); Dong ¶ 0057 teaches that the factor classifier 455 accesses raw data from the data repositories 410-430 and categorizes the parameters to be used for training a neural network, such as 3D CNN (that is, a second training dataset)) using the first plurality of latent variable samples (Dong ¶ 0063 teaches using the parameters identified by the auto-encoder (S3), and the deep machine learning (S2), generates the TAZ timeslot-cubes 610, at 635. The TAZ timeslot-cubes 610 are used to train the 3D CNN (that is, the “TAZ timeslot-cubes” are a second training dataset in the N-dimensional data space) . . . . [T]he backpropagation error during the training is used for the auto-encoder, at S3 (that is, using the first plurality of latent variable samples)); and
performing a second training process, using the second training dataset, on a first classifier including a first classifier neural network model (Dong ¶ 0063 teaches [t]he TAZ timeslot-cubes 610 (that is, performing a second training process, using the second training dataset) are used to train the 3D CNN (that is, the “3D CNN” is performing a second training process . . . on a first classifier including a first classifier neural network model to generate a trained classifier)) to generate a trained classifier (Dong ¶ 0054 teaches the neural networks 485 includes a 3D convolutional neural network (CNN). The forecasting server 140 uses the 3D CNN for determining spatial neighborhood correlations (that is, “for determining spatial neighborhood correlations” is to generate a trained classifier). The forecasting server trains the 3D CNN using TAZ cubes as input data) . . . .
Though Dong teaches the feature of an autoencoder that generates training data for a classifier neural network model, Dong does not explicitly teach transactional data where a first training dataset “including a plurality of transactions,” and performing a second training process “for providing transaction classification of a first transaction.”
But Zheng teaches transactional data where a first training dataset “including a plurality of transactions,” (Zheng, Table 1) and performing a second training process “for providing transaction classification of a first transaction” (Zheng, Fig. 1 teaches an architecture of the GAN for fraud detection:

[Image media_image2.png: Zheng Fig. 1, architecture of the GAN for fraud detection]

Zheng, left column of p. 80, “3. An adversarial deep denoising autoencoder for fraud detection,” first full paragraph, teaches construct[ing] a GAN based on a deep denoising autoencoder architecture, which consists of a two-hidden-layer encoder, a corresponding decoder, and two Gaussian mixture models denoted by GMM1 and GMM2, as shown in Fig. 2 (that is, performing a second training process “for providing transaction classification of a first transaction”)).
Dong and Zheng are from the same or similar field of endeavor. Dong teaches the feature of an autoencoder that generates training data for a classifier neural network model. Zheng teaches a GAN including an autoencoder for transaction fraud detection. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the teachings of Dong relating to an autoencoder (AE) and classifier prediction with the GAN of Zheng for transactional settings.
The motivation for doing so is to outperform a set of well-known fraud classification methods and, in application, to reduce financial losses caused by fraudulent transactions (Zheng, Abstract).
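For illustration only, the data flow of the limitations mapped above (encoding N-dimensional inputs into M-dimensional latent variables, sampling the latent variables, and decoding the samples into a second dataset in the N-dimensional data space) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Dong, Zheng, or the Applicant’s Specification: the encoder and decoder are fixed, hand-written maps standing in for a trained autoencoder, both training processes are omitted, the Gaussian jitter used for sampling is an assumed rule, and every name and value (encode, decode, sample_latents, N = 4, M = 2) is hypothetical.

```python
import random

N, M = 4, 2  # data-space and latent-space sizes with M < N (illustrative values)

def encode(x):
    """Hypothetical stand-in encoder: N-dim input -> M-dim latent variable."""
    return [(x[0] + x[1]) / 2.0, (x[2] + x[3]) / 2.0]

def decode(z):
    """Hypothetical stand-in decoder: M-dim latent variable -> N-dim data."""
    return [z[0], z[0], z[1], z[1]]

def reconstruction_error(x):
    """Squared error between the input x and its reconstruction De(En(x))."""
    x_hat = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def sample_latents(latents, k, rng, noise=0.05):
    """Draw k latent samples: pick a latent variable at random, add Gaussian jitter.

    The jitter rule is an assumption made for this sketch only.
    """
    samples = []
    for _ in range(k):
        z = rng.choice(latents)
        samples.append([v + rng.gauss(0.0, noise) for v in z])
    return samples

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# First dataset: three made-up 4-attribute records.
first_dataset = [
    [1.0, 1.0, 0.0, 0.0],
    [0.9, 1.1, 0.1, -0.1],
    [0.0, 0.0, 1.0, 1.0],
]

latents = [encode(x) for x in first_dataset]            # N-dim -> M-dim
latent_samples = sample_latents(latents, k=5, rng=rng)  # sampling in latent space
second_dataset = [decode(z) for z in latent_samples]    # M-dim -> N-dim
```

In this sketch the second dataset has the same dimensionality as the first (N) even though it was produced entirely from M-dimensional latent samples. A classifier would then be trained on second_dataset; that step is left out here because the record does not say how labels are carried through the sampling.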
Regarding claims 2, 9, and 16, the combination of Dong and Zheng teaches all of the limitations of claims 1, 8, and 15, respectively, as described in detail above. 
Dong teaches -
wherein the neural network system includes a second classifier (Dong ¶ 0059 teaches the factor classifier 455 uses the data structures to perform deep learning to determine the combination of parameters to use for predicting the travel demand (that is, “the factor classifier” is a second classifier)) including a second classifier neural network model for transaction classification (Dong, Fig. 8, teaches the factor classifier is a “neural network (deep learning) for selecting factors”:

[Image media_image3.png: Dong Fig. 8]

Dong ¶ 0061 teaches the deep learning for reducing the number of factors uses tensors during the deep learning; Dong ¶ 0059 teaches the factor classifier 455 uses the data structures to perform deep learning to determine the combination of parameters to use for predicting the travel demand (that is, “the factor classifier” is a second classifier including a second classifier neural network model for transaction classification)),
wherein the performing the first training process includes training the second classifier, using the first training dataset, to generate a trained second classifier (Dong, Fig. 8, teaches initializing data for training multiple neural networks of which one is the factor classifier (that is, training the second classifier, using the first training dataset, to generate a trained second classifier)); and
wherein the performing the sampling process to the first plurality of latent variables to generate a plurality of latent variable samples includes:
performing a first sub-sampling process, using a first sampler, to generate a second plurality of latent variable samples from the first plurality of latent variables (see at least Dong ¶ 0057 & Fig. 6 (“Factors 455”) above);
* * *
Though Dong teaches an autoencoder may be a variational encoder or any other type of auto-encoder, Dong does not explicitly teach “a plurality of class probabilities” -
* * *
providing, using the trained second classifier, a plurality of class probabilities corresponding to the second plurality of latent variable samples respectively; and
performing a second sub-sampling process, using a second sampler, to generate the first plurality of latent variable samples from the second plurality of latent variable samples based on the plurality of class probabilities.
Zheng, which teaches a variational encoder, further teaches -
* * *
providing, using the trained second classifier, a plurality of class probabilities corresponding to the second plurality of latent variable samples respectively (Zheng, Fig. 2, teaches (Examiner annotations in text-box):

[Image media_image4.png: Zheng Fig. 2, with Examiner annotations]

Zheng, right column of p. 79, “3. An adversarial deep denoising autoencoder for fraud detection,” first paragraph, teaches a deep neural network to extract latent representations that can support much more effective classification than raw input features; Zheng, right column of p. 80, “3. An adversarial deep denoising autoencoder for fraud detection,” first paragraph, teaches that GMM1 acts as the discriminator D. That is, the encoder accepts an input vector x representing a transfer (the input features of which are summarized in Table 1) and transforms it into a latent vector z, and GMM1 calculates from z a possibility Φ1(z) of the transfer being a real normal transfer from the data distribution, i.e., both fake transfers from the generator and fraudulent transfers are regarded as negative samples (that is, providing, using the trained second classifier, a plurality of class probabilities corresponding to the second plurality of latent variable samples respectively)); and
performing a second sub-sampling process, using a second sampler, to generate the first plurality of latent variable samples from the second plurality of latent variable samples based on the plurality of class probabilities (Zheng, Fig. 2 & left column of p. 81, “3. An adversarial deep denoising autoencoder for fraud detection,” first paragraph, teaches [o]nce both [Discriminator] and [Generator] are trained, GMM2 is then trained . . . where x+ and X- are the empirical distributions of positive samples . . . and negative samples (that is, performing a second sub-sampling process, using a second sampler, to generate the first plurality of latent variable samples from the second plurality of latent variable samples based on the plurality of class probabilities)).
Regarding claims 4, 11, and 18, the combination of Dong and Zheng teaches all of the limitations of claims 2, 9, and 16, respectively, as described in detail above. 
Zheng teaches -
wherein the neural network system includes a fraudulent transaction generative adversarial network (GAN)² (Zheng, Abstract, teaches new generative adversarial network (GAN) based model to calculate for each large transfer a probability that it is fraudulent) including a fraudulent transaction generator and a fraud discriminator (Zheng, Abstract, teaches inference model uses a deep denoising autoencoder to effectively learn the complex probabilistic relationship among the input features, and employs adversarial training that establishes a minimax game between a discriminator (that is, a fraud discriminator) and a generator (that is, a fraudulent transaction generator) to accurately discriminate between positive samples and negative samples in the data distribution),
wherein the fraud transaction generator including the decoder (Zheng, right column of p. 80, “3. An adversarial deep denoising autoencoder for fraud detection,” second paragraph, teaches [t]he decoder [of the autoencoder] acts as the generator G (that is, the autoencoder of the generator includes a decoder, such that the fraud transaction generator including the decoder)), and
wherein the performing the first training process includes training the fraudulent transaction GAN using a fraud-sensitive weighted adversarial loss function based on class probabilities of transactions provided by the second classifier (Zheng, right column of p. 79, “3. An adversarial deep denoising autoencoder for fraud detection,” second paragraph, teaches θ and θ′ are vectors of weight and bias parameters of the encoder and the decoder, respectively. Autoencoder training consists in minimizing the reconstruction error (that is, training the fraudulent transaction GAN using a fraud-sensitive weighted adversarial loss function based on class probabilities of transactions provided by the second classifier); Zheng, left column of p. 81, “3. An adversarial deep denoising autoencoder for fraud detection,” first full paragraph, teaches construct[ing] a GAN based on a deep denoising autoencoder architecture, which consists of a two-hidden-layer encoder, a corresponding decoder, and two Gaussian mixture models denoted by GMM1 and GMM2).
Regarding claims 6, 13, and 20, the combination of Dong and Zheng teaches all of the limitations of claims 2, 9, and 16, respectively, as described above in detail.
Dong teaches -
wherein the second classifier (Dong ¶ 0059 & Fig. 6 teaches the factor classifier 455 (that is, the second classifier is different)) is different from the first classifier (Dong ¶ 0063 & Fig. 6 teaches a “3D CNN,” which is the first classifier).
8.	Claims 3, 10, and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20180197071 to Dong et al. [hereinafter Dong] in view of Zheng et al., “Generative adversarial network based telecom fraud detection at the receiving bank,” Neural Networks (March 2018) [hereinafter Zheng] and Zhai et al., “Semisupervised Autoencoder for Sentiment Analysis,” AAAI (2016) [hereinafter Zhai].
Regarding claims 3, 10, and 17, the combination of Dong and Zheng teaches all of the limitations of claims 2, 9, and 16, respectively, as described above in detail. 
Though Dong and Zheng teach the feature of an autoencoder that generates training data for a classifier neural network model in a GAN for transaction fraud detection, the combination of Dong and Zheng does not explicitly teach -
wherein the second sub-sampling process includes an adaptive bootstrap sampling process.
But Zhai teaches -
wherein the second sub-sampling process includes an adaptive bootstrap sampling process (Zhai, right column of p. 1396, “Results,” second paragraph, teaches [a]s the training of the autoencoder part of [the Semisupervised Bregman Divergence Autoencoder (SBDAE)] does not require the availability of labels, we also try incorporating unlabeled data after learning the linear classifier in SBDAE. As shown in Table 2:

[Image media_image5.png: Zhai Table 2]

doing so further improves the performance over using labeled data only. This justifies that it is possible to bootstrap from a relatively small amount of labeled data and learn better representations with more unlabeled data with SBDAE.
[Examiner note: like the “reduction” of Dong, the Specification recites “an adaptive bootstrap sampling method. By using the output of the trained classifier 208 in the sampling process, the second plurality of latent space samples 510 may be used to generate a training dataset with less noise than the first training dataset. In some examples, by using the output of the trained classifier 208 in the sampling process, more latent space samples corresponding to fraudulent transactions near a decision boundary of the classifier 208 may be generated in the second set of training data [(that is, “bootstrap”)]. A decision boundary is the region of a problem space in which the output label of a classifier is ambiguous” (Specification ¶ 0030)]).
Dong, Zheng, and Zhai are from the same or similar field of endeavor. Dong teaches the feature of an autoencoder that generates training data for a classifier neural network model. Zheng teaches a GAN including an autoencoder for transaction fraud detection. Zhai teaches improving classification effectiveness for a desired classification task. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the teachings of the combination of Dong and Zheng relating to an AE and classifier prediction in transactional settings of a GAN with the model training improvement of Zhai.
The motivation for doing so is to take advantage of an unlabeled dataset and obtain improved performance from the model (Zhai, Abstract).
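For illustration only, one possible reading of a sub-sampling process that uses classifier output to favor samples near a decision boundary (as described in the passage quoted from Specification ¶ 0030) can be sketched as a weighted bootstrap. This is a minimal sketch under stated assumptions: the weighting rule (closeness of the class probability to 0.5, with a small floor) is hypothetical and is not taken from the Specification, Dong, Zheng, or Zhai, and the function names and sample values are made up.

```python
import random

def boundary_weights(class_probs, floor=0.05):
    """Weight each sample by how close its class probability is to 0.5.

    Hypothetical rule: weight = 1 - |2p - 1|, never below `floor`.
    """
    return [max(floor, 1.0 - abs(2.0 * p - 1.0)) for p in class_probs]

def weighted_bootstrap(samples, class_probs, k, rng):
    """Resample k items with replacement, favoring ambiguous class probabilities."""
    weights = boundary_weights(class_probs)
    return rng.choices(samples, weights=weights, k=k)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Made-up latent samples (labels only) and made-up class probabilities.
latent_samples = ["z0", "z1", "z2", "z3"]
class_probs = [0.02, 0.48, 0.55, 0.99]

resampled = weighted_bootstrap(latent_samples, class_probs, k=1000, rng=rng)
```

Under this assumed rule, z1 and z2 (class probabilities near 0.5) dominate the resampled set, while z0 and z3 (confidently classified) appear only rarely.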
9.	Claims 5, 12, and 19 are rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20180197071 to Dong et al. [hereinafter Dong] in view of Zheng et al., “Generative adversarial network based telecom fraud detection at the receiving bank,” Neural Networks (March 2018) [hereinafter Zheng] and Chen et al., “Credit Card Fraud Detection Using Sparse Autoencoder and Generative Adversarial Network,” IEEE (November 2018) [hereinafter Chen].
Regarding claims 5, 12, and 19, the combination of Dong and Zheng teaches all of the limitations of claims 2, 9, and 16, respectively, as described in detail above. 
Though Dong and Zheng teach the feature of an autoencoder that generates training data for a classifier neural network model based on a transactional setting, the combination of Dong and Zheng does not explicitly teach -
wherein the second classifier is trained using a cross-entropy loss function.
But Chen teaches -
wherein the second classifier is trained using a cross-entropy loss function (Chen, right column of p. 1055, “III. Autoencoder and GAN - A. Autoencoder,” third paragraph, teaches [w]here Φ is the encoder and ψ is the decoder, and L means the error between input and output which usually use mean-square error method or cross-entropy method (that is, trained using a cross-entropy function)).
Dong, Zheng and Chen are from the same or similar field of endeavor. Dong teaches the feature of an autoencoder that generates training data for a classifier neural network model. Zheng teaches a GAN including an autoencoder for transaction fraud detection. Chen teaches combin[ing] the SAE and the discriminator of GAN and apply[ing] them to detect whether a transaction is genuine or fraud. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the teachings of the combination of Dong and Zheng relating to an AE and classifier prediction in transactional settings with the cross-entropy learning of Chen.
The motivation for doing so is to address skewed datasets with little transactional fraud data (Chen, Abstract).
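For illustration only, a cross-entropy loss of the kind referred to above can be sketched for the two-class case (for example, genuine versus fraudulent). This is a minimal sketch of the standard binary cross-entropy formula, not code from Chen or from the Applicant’s Specification; the function name, the clipping constant eps, and the example values are hypothetical.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities.

    Each probability is clipped to [eps, 1 - eps] so that log() is never
    evaluated at zero.
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# Made-up labels (1 = fraudulent, 0 = genuine) and predicted probabilities.
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
uncertain = [0.5, 0.5, 0.5, 0.5]

loss_confident = binary_cross_entropy(labels, confident)
loss_uncertain = binary_cross_entropy(labels, uncertain)
```

The loss is lower when the predicted probabilities agree with the labels, which is the quantity a classifier trained with this loss would seek to minimize.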
10.	Claims 7 and 14 are rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20180197071 to Dong et al. [hereinafter Dong] in view of Zheng et al., “Generative adversarial network based telecom fraud detection at the receiving bank,” Neural Networks (March 2018) [hereinafter Zheng] and Saifuddin Hitawala, “Comparative Study on Generative Adversarial Networks,” University of Waterloo (2018) [hereinafter Hitawala].
Regarding claims 7 and 14, the combination of Dong and Zheng teaches all of the limitations of claims 1 and 8, respectively, as described in detail above.
Though Dong and Zheng teach the feature of an autoencoder that generates training data for a classifier neural network model based on a transactional setting, the combination of Dong and Zheng does not explicitly teach -
wherein the neural network system includes a prior distribution generative adversarial network (GAN) including a generator and a prior distribution discriminator,
wherein the generator including the encoder, and
wherein the performing the first training process includes training the prior distribution GAN using a predetermined prior distribution.
But Hitawala teaches -
wherein the neural network system includes a prior distribution generative adversarial network (GAN) including a generator and a prior distribution discriminator (Hitawala, Fig. 2, teaches a generative adversarial network (GAN) (Examiner annotations in text-box):

[Image media_image6.png: Hitawala Fig. 2, with Examiner annotations]

Hitawala, right column of p. 2, “2. Background: Generative Adversarial Networks,” first paragraph, teaches Generative Adversarial Networks (Goodfellow et al., 2014) consist of a pair of models called the generator and discriminator; Hitawala, right column of p. 4, “3.4 Adversarial Autoencoders,” second paragraph, teaches to [l]et p(z) be the prior distribution we want to impose, q(z|x) be the encoding distribution and p(x|z) be the decoding distribution (that is, the “decoding distribution” is a prior distribution discriminator)),
wherein the generator including the encoder (Hitawala, right column of p. 4, “3.4 Adversarial Autoencoders,” last paragraph, teaches generator of the adversarial network is also the encoder of the autoencoder q(z|x) (that is, wherein the generator including the encoder)), and
wherein the performing the first training process includes training the prior distribution GAN using a predetermined prior distribution (Hitawala, Fig. 5 teaches an adversarial autoencoder including latent data:

[Image media_image7.png: Hitawala Fig. 5, adversarial autoencoder including latent data]

Hitawala, right column of p. 4, “3.4 Adversarial Autoencoders,” first paragraph, teaches that [a]n adversarial autoencoder . . . is trained with dual objectives - a traditional reconstruction error criteria, and an adversarial training criterion that matches the aggregated posterior distribution of the latent representation to an arbitrary prior distribution. After training, the encoder learns to convert the data distribution to the prior distribution, while the decoder learns a deep generative model that maps the imposed prior to the data distribution (that is, training the prior distribution GAN using a predetermined prior distribution)).
Dong, Zheng and Hitawala are from the same or similar field of endeavor. Dong teaches the feature of an autoencoder that generates training data for a classifier neural network model. Zheng teaches a GAN including an autoencoder for transaction fraud detection. Hitawala teaches a GAN using predetermined prior distributions. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the teachings of the combination of Dong and Zheng relating to an AE and classifier prediction in transactional settings with the prior distribution GAN of Hitawala. 
The motivation for doing so is that later versions of adversarial networks are more robust and have many more applications compared to the original version of the Generative Adversarial Networks, which can prove to be useful in image classification, recognition, capturing and generation in a variety of ways (Hitawala, left column of p. 8, “5. Conclusion and Future Work,” first paragraph).
Conclusion
11.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
(Roy et al., "A Robust System for Noisy Image Classification Combining Denoising Autoencoder and Convolutional Neural Network," IJACSA (January 2018)) teaches utilizing denoising autoencoder (DAE) to restore original images from noisy images and then Convolutional Neural Network (CNN) is used for classification.
(US Patent 9704054 to Tappen et al.) teaches that in unsupervised learning of an identity function, such as that which is typically performed by a sparse autoencoder, target output of the training set is the input, and the neural network is trained to recognize the input as such. Sparse autoencoders employ backpropagation in order to train the autoencoders to recognize an approximation of an identity function for an input, or to otherwise approximate the input.
12.	Any inquiry concerning this communication or earlier communications from the Examiner should be directed to KEVIN L. SMITH whose telephone number is (571) 272-5964. Normally, the Examiner is available on Monday-Thursday 0730-1730. 
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s supervisor, KAKALI CHAKI, can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/K.L.S./
Examiner, Art Unit 2122
/BRIAN M SMITH/
Primary Examiner, Art Unit 2122

¹ “The basic idea behind autoencoders is to encode information (as in compress, not encrypt) automatically, hence the name. The entire network always resembles an hourglass like shape, with smaller hidden layers than the input and output layers. AEs are also always symmetrical around the middle layer(s) (one or two depending on an even or odd amount of layers). The smallest layer(s) is|are almost always in the middle, the place where the information is most compressed (the chokepoint of the network). Everything up to the middle is called the encoding part, everything after the middle the decoding and the middle (surprise) the code. One can train them using backpropagation by feeding input and setting the error to be the difference between the input and what came out. AEs can be built symmetrically when it comes to weights as well, so the encoding weights are the same as the decoding weights.” <https://www.asimovinstitute.org/neural-network-zoo/>
² Generative adversarial networks (GAN) are from a different breed of networks, they are twins: two networks working together. GANs consist of any two networks (although often a combination of FFs and CNNs), with one tasked to generate content and the other has to judge content. The discriminating network receives either training data or generated content from the generative network. How well the discriminating network was able to correctly predict the data source is then used as part of the error for the generating network. This creates a form of competition where the discriminator is getting better at distinguishing real data from generated data and the generator is learning to become less predictable to the discriminator. This works well in part because even quite complex noise-like patterns are eventually predictable but generated content similar in features to the input data is harder to learn to distinguish.