DETAILED ACTION
Claims 1-21 are pending and have been examined.
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA . 
Specification
The disclosure is objected to because of the following informalities:
[0015] and [0025] recite “at least one of the generator and the discriminator is a neutral network.” “a neutral network” should be “a neural network.”
[0069] recites “where x is the input to a neuron of the rectifier neutral network.” “the rectifier neutral network” should be “the rectifier neural network.”
  
Appropriate correction is required.

Claim Objections
Claims 1, 2, 9, 12, 19 and 21 are objected to because of the following informalities: 
In claim 1, line 6, “wherein the one or more program are stored in the memory” should be “wherein the one or more programs are stored in memory.” “program” should be in the plural form, and “the memory” appears for the first time in line 6, so it should be “memory” or, alternatively, “the non-transitory computer-readable medium.”
In claims 2 and 12, line 2, “a neutral network” should be “a neural network.” 
In claims 9 and 19, line 1, “the immediate pre-nonlinearity activities” should be “immediate pre-nonlinearity activities” because the term appears for the first time in line 1.
In claim 21, line 4, “wherein the one or more program are stored in the memory” should be “wherein the one or more programs are stored in memory.” “program” should be in the plural form.

 Appropriate correction is required.



Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 5-6 and 15-16 are rejected under 35 U.S.C. 112(b)  or pre-AIA  35 U.S.C. 112, second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA  the applicant regards as the invention.
In claims 5 and 15, the definition of the variable d in the equations is missing. The examiner suggests amending the claims to include the definition given in the specification at [0080]: d is the number of hidden units in the given layer.
Claims 6 and 16 are also rejected due to their dependency on a rejected claim.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-2, 7-8, 10, 11-12, 17-18 and 20 are rejected under 35 U.S.C. 102 (a)(1) as being anticipated by Salimans ("Improved Techniques for Training GANs").

In regard to claims 1 and 11, Salimans teaches: An electronic device for improved neural network training comprising: a processor; a non-transitory computer-readable medium storing data representative of a generative adversarial network (GAN) to learn from unlabeled data by engaging a generator and a discriminator; and (Salimans, 6 Experiments, 6.4 ImageNet "We extensively modified a publicly available implementation of DCGANs2 using TensorFlow [28] to achieve high performance, using a multi-GPU implementation."; Salimans indicates that they implement their method using TensorFlow on a computer, where a processor / GPU, a non-transitory computer-readable medium / memory, programs / instructions are inherent; Section 2 "One of the primary goals of this work is to improve the effectiveness of generative adversarial networks for semi-supervised learning (improving the performance of a supervised task, in this case, classification, by learning on additional unlabeled examples) [learn from unlabeled data].")
one or more programs, wherein the one or more program are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: (see above)
receiving a plurality of training cases; (Salimans, 3.2 Minibatch discrimination, "... The task of the discriminator is thus effectively still to classify single examples as real data or generated data …6 Experiments, "We performed semi-supervised experiments on MNIST, CIFAR-10 and SVHN, and sample generation experiments on MNIST, CIFAR-10, SVHN and ImageNet."; examples from the generator (generated data) and real data, or images from MNIST, CIFAR-10, SVHN are also examples of training cases.)
training the generative adversarial network, based on the plurality of training cases, to classify the training cases as real or fake; and (Salimans, section 1 "The training signal for G is provided by a discriminator network D(x) that is trained to distinguish samples from the generator distribution pmodel(x) from real data"; 3.2 Minibatch discrimination, "The task of the discriminator is thus effectively still to classify single examples as real data [real data] or generated data [fake data]…")
executing a regularizer to configure the discriminator to allocate a model capacity evenly. (Salimans, 3.2 Minibatch discrimination, "One of the main failure modes for GAN is for the generator to collapse to a parameter setting where it always emits the same point. When collapse to a single mode is imminent, the gradient of the discriminator may point in similar directions for many similar points... An obvious strategy to avoid this type of failure is to allow the discriminator to look at multiple data examples in combination, and perform what we call minibatch discrimination [regularization]... The concept of minibatch discrimination is quite general: any discriminator model that looks at multiple examples in combination, rather than in isolation... Let f(xi) ∈ R^A denote a vector of features for input xi, produced by some intermediate layer in the discriminator... The output o(xi) for this minibatch layer for a sample xi is then defined as the sum of the cb(xi; xj)’s to all other samples... "; minibatch discrimination is a regularizer to allocate the discriminator capacity evenly. A model capacity is based on the parameter setting, and minibatch discrimination can avoid the problem of same-point parameter setting, i.e., it can help allocate a model capacity evenly; see related references Berthelot ("a heuristic regularizer") and Arora ("discriminator capacity").)
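
For illustration only, the minibatch layer output o(xi) quoted above can be sketched as follows. This is a minimal sketch and not the implementation of Salimans: the function names project and minibatch_features, the nested-list tensor T, and the use of exp(-L1 distance) for cb(xi; xj) are assumptions made for the example, and the sum runs over all other samples as stated in the quoted passage.

```python
import math


def project(features, T):
    """Multiply a length-A feature vector by an A x B x C tensor, giving a B x C matrix."""
    A, B, C = len(T), len(T[0]), len(T[0][0])
    return [
        [sum(features[a] * T[a][b][c] for a in range(A)) for c in range(C)]
        for b in range(B)
    ]


def minibatch_features(batch, T):
    """Return o(x_i) for every sample in the batch (one value per kernel b)."""
    M = [project(f, T) for f in batch]
    B = len(T[0])
    outputs = []
    for i in range(len(batch)):
        o_i = []
        for b in range(B):
            total = 0.0
            for j in range(len(batch)):
                if j == i:
                    continue  # sum over all *other* samples only
                l1 = sum(abs(p - q) for p, q in zip(M[i][b], M[j][b]))
                total += math.exp(-l1)  # assumed similarity c_b(x_i, x_j)
            o_i.append(total)
        outputs.append(o_i)
    return outputs
```

In this sketch, a batch of n identical feature vectors yields the largest possible output (n - 1 per kernel), while well-separated samples yield outputs near zero, which is the kind of signal a discriminator could use to detect collapse to a single point.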

Claim 11 recites substantially the same limitations as claim 1; therefore, the rejection applied to claim 1 also applies to claim 11.

In regard to claims 2 and 12, Salimans teaches: wherein at least one of the generator and the discriminator is a neutral network. (Salimans, section 1 "The goal of GANs is to train a generator network G(z; θ(G)) that produces samples from the data distribution, pdata(x), by transforming vectors of noise z as x = G(z; θ(G)). The training signal for G is provided by a discriminator network D(x)…"; in practice generator and discriminator are often neural network models, also see related reference Arora (“where Gu is a function — which is often a neural network in practice)... Suppose the generator and discriminator are both k-layer neural networks”))

In regard to claims 7 and 17, Salimans teaches: wherein the plurality of training cases transmitted to the discriminator comprise real data and fake data. (Salimans, section 1 "The training signal for G is provided by a discriminator network D(x) that is trained to distinguish samples from the generator distribution pmodel(x) from real data"; 3.2 Minibatch discrimination, "The task of the discriminator is thus effectively still to classify single examples as real data [real data] or generated data [fake data]…")
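
As a sketch of how a discriminator can be scored on a batch containing both real and generated cases, the standard GAN discriminator objective can be written as a binary cross-entropy. The function name discriminator_loss and the example probabilities are hypothetical; the code is not taken from Salimans or from the application.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for a discriminator.

    d_real: outputs D(x) in (0, 1) for real training cases.
    d_fake: outputs D(G(z)) in (0, 1) for generated (fake) cases.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical outputs: a confident discriminator and an undecided one.
confident = discriminator_loss([0.99, 0.98], [0.01, 0.02])
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

The loss is lower when real cases receive probabilities near 1 and generated cases receive probabilities near 0, i.e., when the discriminator classifies the two kinds of training cases correctly.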

In regard to claims 8 and 18, Salimans teaches: wherein the plurality of training cases transmitted to the discriminator comprise interloped real and fake data. (Salimans, section 5 Semi-supervised learning "Assuming half of our data set consists of real data and half of it is generated (this is arbitrary), our loss function for training the classifier then becomes…")

In regard to claims 10 and 20, Salimans teaches: wherein the regularizer is applied on generated data and random interpolation inbetween real and generated fake data. (Salimans, section 3.2 "The concept of minibatch discrimination… we have restricted our experiments to models that explicitly aim to identify generator samples that are particularly close together… The output o(xi) for this minibatch layer for a sample xi is then defined as the sum of the cb(xi; xj)’s to all other samples... We compute these minibatch features separately for samples from the generator and from the training data."; section 5 Semi-supervised learning "Assuming half of our data set consists of real data and half of it is generated (this is arbitrary)..."; Because, in GAN training, a mix of real data and fake data [arbitrary / random] is provided to the discriminator, the minibatch discrimination [regularizer] is applied on the fake data and random interpolation inbetween real and fake data.)

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 3-4, 9, 13-14 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Salimans in view of Ioffe ("Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift").

In regard to claims 3 and 13, reference is made to the rejection of claims 2 and 12 respectively, and further, Salimans does not teach, but Ioffe teaches: wherein the discriminator is a rectifier network having an activation function defined as: f(x) = x^+ = max(0, x), where x is input to a neuron of the rectifier network. (Ioffe, Section 1 "In practice, the saturation problem and the resulting vanishing gradients are usually addressed by using Rectified Linear Units... ReLU(x) = max(x, 0)..."; section 3.2 "Batch Normalization can be applied to any set of activations in the network. Here, we focus on transforms that consist of an affine transformation followed by an elementwise nonlinearity: z = g(Wu + b) where W and b are learned parameters of the model, and g(.) is the nonlinearity such as sigmoid or ReLU.")

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the discriminator of Salimans to include ReLU of Ioffe in the model. Doing so would solve the saturation problem and the resulting vanishing gradients. (Ioffe, Section 1 "In practice, the saturation problem and the resulting vanishing gradients are usually addressed by using Rectified Linear Units... ReLU(x) = max(x, 0)")
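
The saturation point in the quoted Ioffe passage can be illustrated with a minimal sketch comparing the gradient of ReLU(x) = max(x, 0) with that of a sigmoid at a large input. The function names relu, relu_grad, and sigmoid_grad are chosen for the example and do not appear in Ioffe.

```python
import math


def relu(x):
    """Rectifier activation: f(x) = max(0, x)."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU; it stays at 1 for any positive input."""
    return 1.0 if x > 0 else 0.0


def sigmoid_grad(x):
    """Derivative of the logistic sigmoid; it shrinks toward 0 as |x| grows."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


# At a large positive input the sigmoid gradient nearly vanishes,
# while the ReLU gradient does not.
large_input = 10.0
gradients = (relu_grad(large_input), sigmoid_grad(large_input))
```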

In regard to claims 4 and 14, reference is made to the rejection of claims 3 and 13 respectively, and further, Salimans does not teach, but Ioffe teaches: wherein the discriminator is configured to compute a piecewise linear function. (Ioffe, Section 1 "In practice, the saturation problem and the resulting vanishing gradients are usually addressed by using Rectified Linear Units... ReLU(x) = max(x, 0)..."; ReLU and Maxout are examples of piecewise linear functions, e.g., ReLU consists of the linear pieces y = 0 (for x < 0) and y = x (for x ≥ 0).)

The rationale for combining the teachings of Salimans and Ioffe is the same as set forth in the rejection of claims 3 and 13 respectively.
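
As an illustration of why a network built from ReLU units computes a piecewise linear function, the following sketch evaluates a small one-input network and locates the points where its slope changes. The network tiny_relu_net, its integer weights, and the helper kinks are hypothetical and chosen only so that the arithmetic is exact; the second-difference check detects slope changes only at the grid points supplied.

```python
def relu(x):
    """Rectifier activation: f(x) = max(0, x)."""
    return max(0, x)


def tiny_relu_net(x):
    """Hypothetical network: two ReLU hidden units plus a linear skip term."""
    return 2 * relu(x - 1) - 3 * relu(x + 2) + x


def kinks(f, xs):
    """Grid points where the second difference is nonzero, i.e., the slope changes."""
    return [x for x in xs if f(x + 1) - 2 * f(x) + f(x - 1) != 0]


# The function is linear everywhere on the grid except at the two ReLU thresholds.
breakpoints = kinks(tiny_relu_net, range(-5, 6))
```

Between the breakpoints the second difference is zero, so the function is a straight line on each interval, which is what "piecewise linear" means here.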

In regard to claims 9 and 19, reference is made to the rejection of claims 8 and 18 respectively, and further, Salimans does not teach, but Ioffe teaches: wherein the regularizer is applied to the immediate pre-nonlinearity activities on one or more layers of the discriminator model. (Ioffe, section 3.2 "Batch Normalization can be applied to any set of activations in the network. Here, we focus on transforms that consist of an affine transformation followed by an elementwise nonlinearity: z = g(Wu + b) where W and b are learned parameters of the model, and g(.) is the nonlinearity such as sigmoid or ReLU. This formulation covers both fully-connected and convolutional layers. We add the BN transform immediately before the nonlinearity, by normalizing x = Wu + b."; batch normalization / regularizer is right before ReLU.)

The rationale for combining the teachings of Salimans and Ioffe is the same as set forth in the rejection of claims 3 and 13 respectively.
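
The ordering described in the quoted Ioffe passage, normalizing x = Wu + b immediately before the nonlinearity, can be sketched for a single scalar unit as follows. This is a simplified sketch under stated assumptions: the names batch_norm and bn_then_relu, the scalar weight w and bias b, and the default gamma, beta, and eps values are illustrative and are not taken from Ioffe.

```python
import math


def batch_norm(pre_activations, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one unit's pre-activations across the mini-batch."""
    n = len(pre_activations)
    mean = sum(pre_activations) / n
    var = sum((x - mean) ** 2 for x in pre_activations) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in pre_activations]


def relu(x):
    """Rectifier activation: f(x) = max(0, x)."""
    return max(0.0, x)


def bn_then_relu(inputs, w, b):
    """Compute z = g(BN(w*u + b)) for each input u in the mini-batch."""
    pre = [w * u + b for u in inputs]    # x = Wu + b (scalar case)
    normalized = batch_norm(pre)         # applied immediately before the nonlinearity
    return [relu(x) for x in normalized]


outputs = bn_then_relu([1.0, 2.0, 3.0, 4.0], w=2.0, b=0.5)
```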

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Salimans in view of Zhang ("StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks").

In regard to claim 21, Salimans teaches: An electronic device comprising: one or more processors; memory; and one or more programs, wherein the one or more program are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: (Salimans, 6 Experiments, 6.4 ImageNet "We extensively modified a publicly available implementation of DCGANs2 using TensorFlow [28] to achieve high performance, using a multi-GPU implementation."; Salimans indicates that they implement their method using TensorFlow on a computer, where processors, memory and programs are inherent.)
… trained using a regularizer to configure a discriminator to evenly use its model capacity; and (Salimans, 3.2 Minibatch discrimination, "One of the main failure modes for GAN is for the generator to collapse to a parameter setting where it always emits the same point. When collapse to a single mode is imminent, the gradient of the discriminator may point in similar directions for many similar points... An obvious strategy to avoid this type of failure is to allow the discriminator to look at multiple data examples in combination, and perform what we call minibatch discrimination [regularization]... The concept of minibatch discrimination is quite general: any discriminator model that looks at multiple examples in combination, rather than in isolation... Let f(xi) ∈ R^A denote a vector of features for input xi, produced by some intermediate layer in the discriminator... The output o(xi) for this minibatch layer for a sample xi is then defined as the sum of the cb(xi; xj)’s to all other samples... "; minibatch discrimination is a regularizer to allocate the discriminator capacity evenly. A model capacity is based on the parameter setting, and minibatch discrimination can avoid the problem of same-point parameter setting, i.e., it can help allocate a model capacity evenly; see related references Berthelot ("a heuristic regularizer") and Arora ("discriminator capacity").)

Salimans does not teach, but Zhang teaches: receiving a text string; (Zhang, abstract "In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) to generate 256 x 256 photo-realistic images conditioned on text descriptions."; section 3 see Figure 2 "Figure 2. The architecture of the proposed StackGAN. The Stage-I generator draws a low-resolution image by sketching rough shape and basic colors of the object from the given text [receiving a text string] and painting the background from a random noise vector. Conditioned on Stage-I results, the Stage-II generator corrects defects and adds compelling details into Stage-I results, yielding a more realistic high-resolution image.")
processing the text string using a generative adversarial network... (Zhang, section 3 see Figure 2 "Figure 2. The architecture of the proposed StackGAN [using a GAN]. The Stage-I generator draws a low-resolution image by sketching rough shape and basic colors of the object from the given text [text string] and painting the background from a random noise vector.")
generating an image based on the processed text string. (Zhang, abstract "In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) to generate 256 x 256 photo-realistic images conditioned on text descriptions."; section 3 see Figure 2 "Figure 2... the Stage-II generator corrects defects and adds compelling details into Stage-I results, yielding a more realistic high-resolution image.")
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have applied the GAN model of Salimans to the applications of Zhang. Doing so would allow the model to generate photo-realistic images from text descriptions. (Zhang, abstract "Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications... Extensive experiments and comparisons with state-of-the-arts on benchmark datasets demonstrate that the proposed method achieves significant improvements on generating photo-realistic images conditioned on text descriptions.")
Allowable Subject Matter
Claims 5, 6, 15 and 16 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
The closest prior art references for claims 5 and 15 are Courbariaux ("Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1") and Zhao ("Energy-based Generative Adversarial Network"). Courbariaux teaches batch normalization and activations constrained to ±1, but does not teach the concept of an average of hidden units of the square of activation functions across the mini-batch. Zhao teaches a repelling regularizer that may be related to the second term, but does not teach the number of hidden units and |s_i^T s_j|.
The closest prior art reference for claims 6 and 16 is Raghu ("On the Expressive Power of Deep Neural Networks"). Raghu teaches that different local linear regions are closely related to different activation patterns.
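
The relationship between activation patterns and linear regions noted for Raghu can be illustrated with a small sketch that records which ReLU units are active for each input and counts the distinct patterns. The function names activation_pattern and count_patterns and the example weights and biases are hypothetical and are not taken from Raghu or from the application.

```python
def activation_pattern(x, weights, biases):
    """Tuple of booleans: which hidden ReLU units are active for scalar input x."""
    return tuple(w * x + b > 0 for w, b in zip(weights, biases))


def count_patterns(xs, weights, biases):
    """Number of distinct activation patterns observed over the inputs xs."""
    return len({activation_pattern(x, weights, biases) for x in xs})


# Two hidden units with thresholds at x = 1 and x = -2 split the input line
# into three intervals, each with its own activation pattern.
num_patterns = count_patterns(range(-5, 6), weights=[1, 1], biases=[-1, 2])
```

Within each pattern the set of active units is fixed, so the network reduces to a single linear function there; each distinct pattern therefore corresponds to one linear region in this one-dimensional example.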
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Berthelot ("BEGAN: Boundary Equilibrium Generative Adversarial Networks") teaches that a heuristic regularizer can be batch discrimination or a repelling regularizer.
Arora ("Generalization and Equilibrium in Generative Adversarial Nets (GANs)") teaches the concept of model capacity.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SU-TING CHUANG whose telephone number is (408)918-7519.  The examiner can normally be reached on Monday - Thursday 8-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on (571)272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/S.C./Examiner, Art Unit 2122                 

/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122