DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Amendments
	Per Applicant’s request, claims 1, 3-4, 12, 17, and 19 are amended. Claims 1-20 are pending and have been considered.

Drawings
The drawings were received on 10/28/2021.  These drawings are acceptable and have been entered.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
CLAIM 1
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
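generating, from a first set of visible values, a set of hidden values;
generating a second set of visible values based on the generated set of hidden values; 
computing a set of likelihood gradients based on at least one of the first set of visible values and the generated set of visible values;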
computing a set of adversarial gradients using an adversarial model based on at least one of the set of hidden values and the set of visible values;
computing a set of compound gradients based on the set of likelihood gradients and the set of adversarial gradients; and 
updating based on the set of compound gradients.
All of the limitations are mathematical computations. Accordingly, the claim recites an abstract idea.

Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
a RBM (a neural network)
a hidden layer of the RBM
a visible layer of the RBM
an architecture that is the same as that of the RBM
An RBM neural network, its layers, and an architecture that is the same as that of the RBM are not meaningful limitations because the claim does not recite positive functionalities of the RBM, nor does the claim recite improvements in neural network technology. See MPEP 2106.05(e). These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).
The claim is not patent eligible.

CLAIM 2 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim narrows the judicial exceptions. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
RBM (neural network)
a composite layer composed of a plurality of sub-layers for different data types
An RBM neural network and its layers are not meaningful limitations because the claim does not recite positive functionalities of the RBM, nor does the claim recite improvements in neural network technology. See MPEP 2106.05(e). These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).
The claim is not patent eligible.

CLAIM 3 incorporates the rejection of claim 2.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 2 are incorporated. The claim narrows the judicial exceptions. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
the plurality of sub-layers comprises at least one of a Bernoulli layer, an Ising layer, a one-hot layer, a von Mises-Fisher layer, a Gaussian layer, a ReLU layer, a clipped ReLU layer, a student-t layer, an ordinal layer, an exponential layer, and a composite layer.
An RBM’s layers are not meaningful limitations because the claim does not recite positive functionalities of the RBM, nor does the claim recite improvements in neural network technology. See MPEP 2106.05(e). These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h). 
The claim is not patent eligible.

CLAIM 4 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim narrows the judicial exceptions. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
a deep Boltzmann machine (neural network)
a plurality of hidden layers
A DBM neural network and its layers are not meaningful limitations because the claim does not recite positive functionalities of the DBM, nor does the claim recite improvements in neural network technology. See MPEP 2106.05(e). These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).
The claim is not patent eligible.

CLAIM 5 incorporates the rejection of claim 4. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 4 are incorporated. The claim recites the following limitations:
sampling
	 
stacking into a vector
generating weights [of the DBM] (generating the DBM is interpreted as generating the weights for the DBM, given the BRI)
All of the computing limitations are mathematical computations. Accordingly, the claim recites an abstract idea.

Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following elements:
first RBM (neural network) and its visible and hidden layers
second RBM (neural network) and its visible layer
training a second RBM, wherein the vector is a visible layer of the second RBM; and 
DBM (neural network)
A first RBM neural network, a second RBM neural network, a DBM neural network, and their layers are not meaningful limitations because the claim does not recite positive functionalities of the RBMs and DBM, nor does the claim recite improvements in neural network technology. See MPEP 2106.05(e). These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h). Training a neural network amounts to an insignificant extra-solution activity. See MPEP 2106.05(g). 
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. A first RBM neural network, a second RBM neural network, a DBM neural network, and their layers amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h). Additionally, training an RBM is well-understood, routine, conventional activity. Berkheimer analysis: The instant specification states in ¶ [0097]: “The traditional algorithms for training RBMs maximize the log-likelihood, which is equivalent to minimizing the forward Kullback-Liebler (KL) divergence”. 
The claim is not patent eligible.

CLAIM 6 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
generate a time progression of a disease; and
treating the patient based on the generated time progression.
	Generating a time progression is a mathematical computation. Treating the patient is following rules or instructions, which falls under the sub-grouping of “managing personal behavior or relationships or interactions between people”. Moreover, using the RBM and treating the patient do not amount to a particular treatment or prophylaxis under Step 2A Prong 2.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
receiving a phenotype vector for a patient; 
using the RBM (neural network)
Receiving a phenotype vector for a patient is mere data-gathering, which is an insignificant extra-solution activity. See MPEP 2106.05(g). Using the RBM merely indicates a field of use. MPEP 2106.05(h) states: 
Examples of limitations that the courts have described as merely indicating a field of use or technological environment in which to apply a judicial exception include:
vi. Limiting the abstract idea of collecting information, analyzing it, and displaying certain results of the collection and analysis to data related to the electric power grid, because limiting application of the abstract idea to power-grid monitoring is simply an attempt to limit the use of the abstract idea to a particular technological environment, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016);
Limiting application of the abstract idea of generating time progression to an RBM is simply an attempt to limit the use of the abstract idea to a particular technological environment.

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. Additionally, receiving a phenotype vector for a patient is well-understood, routine, conventional activity of receiving or transmitting data over a network. See MPEP 2106.05(d)(II)(i). 
The claim is not patent eligible.

CLAIM 7 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim narrows the judicial exceptions. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
the visible layer and the hidden layer are for a first time instance
a second hidden layer that incorporates data from a different second time instance.
The layers are not meaningful limitations because the claim does not recite positive functionalities of the RBM, nor does the claim recite improvements in neural network technology. See MPEP 2106.05(e). These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h). 
The claim is not patent eligible.

CLAIM 8 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim narrows the judicial exceptions. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
the visible layer is a composite layer comprising data for a plurality of different time instances.
The layers are not meaningful limitations because the claim does not recite positive functionalities of the RBM, nor does the claim recite improvements in neural network technology. See MPEP 2106.05(e). This additional element amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. These additional elements amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use of machine learning, as discussed in MPEP 2106.05(h). 
The claim is not patent eligible.

CLAIM 9 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim narrows the judicial exceptions. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 10 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim narrows the judicial exceptions. Accordingly, the claim recites an abstract idea.
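Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.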
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 11 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. Further, the claim recites:
drawing data samples based on authentic data; 
drawing fantasy samples based from the RBM; and 
All of the limitations are mathematical computations. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
training the adversarial model based on the adversarial model's ability to distinguish between the data samples and the fantasy samples.
Training the adversarial model is insignificant extra-solution activity. See MPEP 2106.05(g). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. Additionally, training an adversarial model is well-understood, routine, conventional activity. Berkheimer analysis: Goodfellow et 

CLAIM 12 incorporates the rejection of claim 11.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 11 are incorporated. Further, the claim recites:
measuring a probability that a particular sample is drawn from either the authentic data or the RBM.
The limitation is a mathematical computation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 13 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim narrows the judicial exceptions. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 14 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitation:
generate a set of samples of a target population
The limitation is a mathematical computation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
using the RBM (neural network)
Using the RBM merely indicates a field of use. Limiting application of the abstract idea of generating a set of samples to an RBM is simply an attempt to limit the use of the abstract idea to a particular technological environment. See MPEP 2106.05(h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 15 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitation further limiting the computing a set of likelihood gradients:
computing a convex combination of a Monte Carlo estimate and a mean field estimate.
This limitation is a mathematical computation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.
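For reference, the convex combination recited in claim 15 can be written in the following general form. This is only a notational sketch: the symbols $g_{\mathrm{MC}}$ (Monte Carlo estimate), $g_{\mathrm{MF}}$ (mean field estimate), and the mixing weight $\alpha$ are assumed names for illustration and do not appear in the claim.

```latex
g_{\mathrm{likelihood}} \;=\; \alpha \, g_{\mathrm{MC}} \;+\; (1 - \alpha)\, g_{\mathrm{MF}},
\qquad 0 \le \alpha \le 1
```

With $\alpha = 1$ the expression reduces to the Monte Carlo estimate alone, and with $\alpha = 0$ to the mean field estimate alone.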

CLAIM 16 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitation further limiting the computing a set of likelihood gradients:
initializing a plurality of samples; 
initializing an inverse temperature for each sample of the plurality of samples; 

updating the sample using Gibbs sampling.
All of the limitations are mathematical computations. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 17 recites a product, and it is rejected under 35 U.S.C. § 101 for the same reasons as claim 1. Claim 17 recites the additional elements of “a non-transitory machine readable medium”, “processor instructions”, and “a processor” which are generally linking the abstract idea to the particular technological environment of machine learning, and they are not an improvement to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h). The claim is not patent eligible.

CLAIM 18 incorporates the rejections of claim 17. Claim 18 is rejected under 35 U.S.C. § 101 for the same reasons as claim 2. The claim is not patent eligible.

CLAIM 19 incorporates the rejections of claim 17. Claim 19 is rejected under 35 U.S.C. § 101 for the same reasons as claim 4. The claim is not patent eligible.

CLAIM 20 incorporates the rejection of claim 19. Claim 20 is rejected under 35 U.S.C. § 101 for the same reasons as claim 5. The claim is not patent eligible.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claims 1, 9-10, 13, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”).
	
	Regarding CLAIM 1, Hinton teaches: A method for training a restricted Boltzmann machine (RBM), wherein the method comprises: 
generating, from a first set of visible values, a set of hidden values in a hidden layer of a RBM; (Hinton p. 5, § 3.1 teaches: “The hidden units should have stochastic binary states”, where a set of hidden values in a hidden layer corresponds to Hinton’s binary states of the hidden units. Equation 10 teaches that the hidden values of units $h_j$ are generated from a first set of visible values of units $v_i$.)
generating a second set of visible values in a visible layer of the RBM based on the generated set of hidden values; (Hinton p. 6, § 3.2 teaches the following, where a second set of visible values is the updated $v_i$:
[Image: media_image1.png, excerpt from Hinton p. 6, § 3.2])
computing a set of likelihood gradients based on at least one of the first set of visible values and the generated set of visible values; (Hinton p. 4, equation 5 is a gradient of a log probability of visible units:
[Image: media_image2.png, Hinton p. 4, equation 5])
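The three Hinton steps mapped above (sampling hidden states from visible values, reconstructing visible values from those hidden states, and estimating the gradient of the log probability) can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference's or the Applicant's implementation: the function names, the pure-Python list layout `W[i][j]` (visible unit `i`, hidden unit `j`), and the use of a single reconstruction step to stand in for the model expectation are all choices made for this example.

```python
import math
import random


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, b, rng):
    """Hidden probabilities and stochastic binary states given visible values v."""
    probs = [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(b))]
    states = [1 if rng.random() < p else 0 for p in probs]
    return probs, states


def sample_visible(h, W, a, rng):
    """Visible probabilities and stochastic binary states given hidden values h."""
    probs = [sigmoid(a[i] + sum(h[j] * W[i][j] for j in range(len(h))))
             for i in range(len(a))]
    states = [1 if rng.random() < p else 0 for p in probs]
    return probs, states


def cd1_weight_gradient(v0, W, a, b, rng):
    """One-step estimate of <v_i h_j>_data - <v_i h_j>_reconstruction.

    Assumption for this sketch: the model expectation in the log-probability
    gradient is approximated by a single reconstruction.
    """
    _, h0 = sample_hidden(v0, W, b, rng)    # hidden values from first visible values
    _, v1 = sample_visible(h0, W, a, rng)   # second set of visible values
    ph1, _ = sample_hidden(v1, W, b, rng)   # hidden probabilities for the reconstruction
    grad = [[v0[i] * h0[j] - v1[i] * ph1[j] for j in range(len(b))]
            for i in range(len(a))]
    return h0, v1, grad
```

A call such as `cd1_weight_gradient([1, 0, 1], W, a, b, random.Random(0))` returns the sampled hidden states, the reconstructed visible states, and a gradient estimate with one entry per weight.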

Regarding the adversarial model recited in Claim 1, Hinton on p. 16, § 16 discloses “The first [way of using RBMs for discrimination] is to use the hidden features learned by the RBM as the inputs for some standard discriminative method.”
Hinton is in the same field of endeavor as the claimed invention, namely Restricted Boltzmann Machines.
However, Hinton does not explicitly teach: computing a set of adversarial gradients using an adversarial model, wherein the adversarial model includes an architecture that is the same as that of the RBM, based on at least one of the set of hidden values and the set of visible values; computing a set of compound gradients based on the set of likelihood gradients and the set of adversarial gradients; and updating the RBM based on the set of compound gradients.
But Goodfellow teaches: computing a set of adversarial gradients using an adversarial model… based on at least one of the set of hidden values and the set of visible values (Adversarial gradients and adversarial model are interpreted as discriminative gradients and discriminative model in light of specification ¶ [0111], line 3: “the discriminator (or adversary)”. In Algorithm 1, Goodfellow computes $\nabla_{\theta_d}$ based on the set of visible values from the generator (p. 1: “discriminative model… learns to determine whether a sample is from the model distribution or the data distribution”).)
wherein the adversarial model includes an architecture that is the same as that of the RBM (The architecture of the RBM comprises a visible layer and a hidden layer. The broadest reasonable interpretation of this limitation is that the adversarial model (i.e., the discriminative model) includes a visible layer and a hidden layer. Goodfellow, p. 1, Abstract, lines 1-4 teaches: “We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G” (emphasis added). Goodfellow, p. 2, § 3, lines 4-6 teaches: “We also define a second multilayer perceptron $D(x; \theta_d)$ that outputs a single scalar. $D(x)$ represents the probability that $x$ came from the data rather than $p_g$” where $p_g$ is the generator’s distribution. Goodfellow’s multilayer perceptron $D(x; \theta_d)$ includes at least a visible input layer, a hidden layer, and a visible output layer. The claim limitations have been met.)
computing a set of compound gradients based on the set of likelihood gradients and the set of adversarial gradients; and (A set of compound gradients is interpreted as a set where some of the compound gradients are based on the set of likelihood gradients and other compound gradients are based on the set of adversarial gradients. For any step k in Algorithm 1, part of the set of compound gradients is identical to the discriminator gradient and another part of the set is identical to the generator gradient.)
updating… based on the set of compound gradients. (Goodfellow’s Algorithm 1 contains a step “Update the generator by descending its stochastic gradient.” This amounts to updating a generator model based on the set of compound gradients, which includes a set of likelihood gradients.)
(Goodfellow, page 7, § 6)	

Regarding CLAIM 9, the Hinton/Goodfellow combination teaches: The method of claim 1, 
Hinton teaches: wherein computing the set of likelihood gradients comprises performing Gibbs sampling. (Hinton p. 4, second-to-last paragraph teaches getting an unbiased sample of $\langle v_i h_j \rangle_{\text{model}}$ by starting at any random state of the visible units and performing alternating Gibbs sampling.)
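
The alternating Gibbs sampling referred to above can be illustrated with a minimal sketch. It assumes a binary RBM with logistic units; the function names (`sample_hidden`, `sample_visible`, `gibbs_chain`) and the list-of-lists weight layout `W[i][j]` are hypothetical choices made for illustration and are not taken from Hinton or from the application.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_hidden(v, W, b_h, rng):
    """Sample binary hidden states h_j given visible states v_i."""
    h = []
    for j in range(len(b_h)):
        activation = b_h[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        h.append(1 if rng.random() < sigmoid(activation) else 0)
    return h

def sample_visible(h, W, b_v, rng):
    """Sample binary visible states v_i given hidden states h_j."""
    v = []
    for i in range(len(b_v)):
        activation = b_v[i] + sum(h[j] * W[i][j] for j in range(len(h)))
        v.append(1 if rng.random() < sigmoid(activation) else 0)
    return v

def gibbs_chain(v0, W, b_v, b_h, steps, rng):
    """Alternate hidden and visible updates, starting from visible state v0."""
    v = list(v0)
    h = sample_hidden(v, W, b_h, rng)
    for _ in range(steps):
        v = sample_visible(h, W, b_v, rng)
        h = sample_hidden(v, W, b_h, rng)
    return v, h
```

The product of the final `v[i]` and `h[j]` is one sample contributing to an estimate of the model expectation; in practice many chains or many steps would be averaged.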

	Regarding CLAIM 10, the Hinton/Goodfellow combination teaches: The method of claim 1, 
However, Hinton does not explicitly teach: wherein the set of compound gradients are weighted averages of the set of likelihood gradients and the set of adversarial gradients.
But Goodfellow teaches: wherein the set of compound gradients are weighted averages of the set of likelihood gradients and the set of adversarial gradients. (Interpreted as separate weighted averages of the set of likelihood gradients and weighted averages of the set of adversarial gradients. In Algorithm 1, the discriminator and generator gradients are each weighted averages with weights of 1/m)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Goodfellow’s system into 

Regarding CLAIM 13, the Hinton/Goodfellow combination teaches: The method of claim 1, 
However, Hinton does not explicitly teach:  wherein the adversarial model is one of a fully-connected classifier, a logistic regression model, a nearest neighbor classifier, and a random forest.
But Goodfellow teaches: wherein the adversarial model is one of a fully-connected classifier, (multilayer perceptron – p. 2 § 3). 
a logistic regression model, a nearest neighbor classifier, and a random forest. (Under the broadest reasonable interpretation, the claim requires at least one model type from the list. Examiner is not required to cite prior art to these limitations because they are recited as alternatives to “a fully-connected classifier.”)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Goodfellow’s system into Hinton/Goodfellow’s system by modeling the adversary/discriminator with a multilayer perceptron, where “the advantages are that Markov chains are never needed, only backprop is used to obtain gradients, no inference is needed during learning, and a wide variety of functions can be incorporated into the model.” (Goodfellow, page 7, § 6)	

Regarding CLAIM 17, Hinton teaches: generating, from a first set of visible values, a set of hidden values in a hidden layer of a RBM; (Hinton p. 5, § 3.1 teaches: “The hidden units should have stochastic binary states”, where a set of hidden values in a hidden layer corresponds to Hinton’s binary states $h_j$, which are generated from a first set of visible values of units $v_i$ in equation 10.)
generating a second set of visible values in a visible layer of the RBM based on the generated set of hidden values; (Hinton p. 6, § 3.2 teaches the following, where a second set of visible values is the updated visible states of $v_i$:
[Image: media_image1.png])
computing a set of likelihood gradients based on at least one of the first set of visible values and the generated set of visible values; (Hinton p. 4, equation 5 is a gradient of a log probability of visible units:
[Image: media_image2.png])
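
For reference, the log-probability gradient cited above is commonly written as shown below. This is a transcription of the standard form given in the cited Hinton reference, not a reproduction of the image; $w_{ij}$ is assumed to denote the weight between visible unit $i$ and hidden unit $j$, and the angle brackets denote expectations under the indicated distribution.

```latex
\frac{\partial \log p(\mathbf{v})}{\partial w_{ij}}
  = \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}
```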

Regarding the adversarial model recited in Claim 1, Hinton on p. 16, § 16 discloses “The first [way of using RBMs for discrimination] is to use the hidden features learned by the RBM as the inputs for some standard discriminative method.”
However, Hinton does not explicitly teach: A non-transitory machine readable medium containing processor instructions for training a restricted Boltzmann machine (RBM), wherein execution of the instructions by a processor causes the processor to perform a process that comprises: computing a set of adversarial gradients using an adversarial model, wherein the adversarial model includes an architecture that is the same as that of the RBM, based on at least one of the set of hidden values and the set of visible values; computing a set of compound gradients based on the set of likelihood gradients and the set of adversarial gradients; and updating the RBM based on the set of compound gradients.
But Goodfellow teaches: A non-transitory machine readable medium containing processor instructions for training a restricted Boltzmann machine (RBM), wherein execution of the instructions by a processor causes the processor to perform a process that comprises: (Experiments section is evidence of non-transitory machine readable medium, processor instructions, and processor)
computing a set of adversarial gradients using an adversarial model… based on at least one of the set of hidden values and the set of visible values (Adversarial gradients and adversarial model are interpreted as discriminative gradients and discriminative model in light of specification ¶ [0111], line 3: “the discriminator (or adversary)”.  In Algorithm 1, Goodfellow computes the discriminator gradient $\nabla_{\theta_d}$ based on the set of visible values from the generator (p. 1: “discriminative model… learns to determine whether a sample is from the model distribution or the data distribution”).)
wherein the adversarial model includes an architecture that is the same as that of the RBM (The architecture of the RBM comprises a visible layer and a hidden layer. The broadest reasonable interpretation of this limitation is that the adversarial model (i.e., the discriminative model) includes a visible layer and a hidden layer. Goodfellow, p. 1, Abstract, lines 1-4 teaches: “We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G” (emphasis added). Goodfellow, p. 2, § 3, lines 4-6 teaches: “We also define a second multilayer perceptron $D(x; \theta_d)$ that outputs a single scalar. $D(x)$ represents the probability that $x$ came from the data rather than $p_g$” where $p_g$ is the generator’s distribution. Goodfellow’s multilayer perceptron $D(x; \theta_d)$ includes at least a visible input layer, a hidden layer, and a visible output layer. The claim limitations have been met.)
computing a set of compound gradients based on the set of likelihood gradients and the set of adversarial gradients; and (A set of compound gradients is interpreted as a set where some compound gradients are based on the set of likelihood gradients and other compound gradients are based on the set of adversarial gradients. For any step k in Algorithm 1, the set of compound gradients is identical to the discriminator gradient and generator gradient.)
updating… based on the set of compound gradients. (Goodfellow’s Algorithm 1 contains a step “Update the generator by descending its stochastic gradient.” This amounts to updating a generator model based on the set of compound gradients, which includes a set of likelihood gradients.)
	Goodfellow is in the same field of endeavor as the claimed invention, namely, generative adversarial nets. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Goodfellow’s system into Hinton’s system by computing adversarial gradients and updating Hinton’s generator RBM by descending its stochastic gradient, with a motivation to perform generative adversarial training, where “the advantages are that Markov chains are never needed, only backprop is used to obtain gradients, no inference is needed during learning, and a wide variety of functions can be incorporated into the model.” (Goodfellow, page 7, § 6)	

	Claims 2-3 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”) and Tran et al. (“Mixed-Variate Restricted Boltzmann Machines”).

	Regarding CLAIM 2 the combination of Hinton and Goodfellow teaches: The method of claim 1, 
	However, neither Hinton nor Goodfellow explicitly teaches: wherein the visible layer of the RBM comprises a composite layer composed of a plurality of sub-layers for different data types.
But Tran teaches: wherein the visible layer of the RBM comprises a composite layer composed of a plurality of sub-layers for different data types. (Page 214 § 2 teaches: “In this section we present Mixed-Variate Restricted Boltzmann Machines (MV.RBM) for jointly modelling variables of multiple modalities and types.” Further, §2.1 teaches:

[Image: media_image3.png]

category-ranked.
A plurality of sub-layers for different data types are the visible units containing visible variables of one of the data types listed above.)
	Tran is in the same field of endeavor as the claimed invention, namely Mixed-Variate Restricted Boltzmann Machines. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Tran’s system into the combination of Hinton and Goodfellow’s system by letting the visible units be different modalities, with a motivation to simultaneously model variables of multiple types and modalities. (Tran Abstract: “Modern datasets are becoming heterogeneous. To this end, we present in this paper Mixed-Variate Restricted Boltzmann Machines for simultaneously modelling variables of multiple types and modalities”)
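
The notion of a visible layer composed of sub-layers for different data types can be illustrated with a short sketch. It shows only how typed inputs might be laid out side by side in one visible vector; it does not implement Tran’s type-specific energy terms. All names, the example record, and the normalization constants are hypothetical.

```python
def encode_binary(value):
    """Binary sub-layer: a single unit in {0, 1}."""
    return [1.0 if value else 0.0]

def encode_gaussian(value, mean, std):
    """Gaussian sub-layer: a single standardized real-valued unit."""
    return [(value - mean) / std]

def encode_categorical(value, categories):
    """Categorical sub-layer: one-hot units, one per category."""
    return [1.0 if value == c else 0.0 for c in categories]

def composite_visible(record, schema):
    """Concatenate typed sub-layers into one visible vector.

    Returns the vector and, for each field, the (start, stop) range it occupies.
    """
    vector, layout = [], {}
    for name, encoder in schema:
        units = encoder(record[name])
        layout[name] = (len(vector), len(vector) + len(units))
        vector.extend(units)
    return vector, layout

# Hypothetical schema and record, for illustration only.
schema = [
    ("smoker", encode_binary),
    ("age", lambda a: encode_gaussian(a, mean=50.0, std=10.0)),
    ("region", lambda r: encode_categorical(r, ["A", "B", "C"])),
]
visible, layout = composite_visible(
    {"smoker": True, "age": 60.0, "region": "B"}, schema
)
```

Keeping the per-field ranges in `layout` lets each sub-layer be addressed separately, which is what a type-specific treatment of each group of visible units would require.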

Regarding CLAIM 3 the combination of Hinton, Goodfellow, and Tran teaches: The method of claim 2, 
However, neither Hinton nor Goodfellow explicitly teaches: wherein the plurality of sub-layers comprises at least one of a Bernoulli layer, an Ising layer, a one-hot layer, a von Mises-Fisher layer, a Gaussian layer, a ReLU layer, a clipped ReLU layer, a student-t layer, an ordinal layer, an exponential layer, and a composite layer.
But Tran teaches: wherein the plurality of sub-layers comprises at least one of a Bernoulli layer, (Interpreted as a binary variable data model, p. 216 table)
a Gaussian layer, (Gaussian variable data model, p. 216 table)
an ordinal layer, (Ordinal variables, §2.2.1)
a composite layer (Interpreted as a categorical data model, p. 216 table, or multicategorical variable, §2.2.1)
an Ising layer, a one-hot layer, a von Mises-Fisher layer, … a ReLU layer, a clipped ReLU layer, a student-t layer, … , an exponential layer. (Under the broadest reasonable interpretation, the claim requires at least one type of layer from the list. Examiner is not required to cite prior art teaching these limitations because they are listed as alternatives.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Tran’s system into the combination of Hinton, Goodfellow, and Tran’s system by letting the visible units be different modalities, with a motivation to simultaneously model variables of multiple types and modalities. (Tran Abstract: “Modern datasets are becoming heterogeneous. To this end, we present in this paper Mixed-Variate Restricted Boltzmann Machines for simultaneously modelling variables of multiple types and modalities”)

Regarding CLAIM 18 the combination of Hinton and Goodfellow teaches: The non-transitory machine readable medium of claim 17, 
However, neither Hinton nor Goodfellow explicitly teaches: wherein the visible layer of the RBM comprises a composite layer composed of a plurality of sub-layers for different data types.
But Tran teaches: wherein the visible layer of the RBM comprises a composite layer composed of a plurality of sub-layers for different data types. (Page 214 § 2 teaches: “In this section we present Mixed-Variate Restricted Boltzmann Machines (MV.RBM) for jointly modelling variables of multiple modalities and types.” Further, §2.1 teaches:

[Image: media_image3.png]

category-ranked.
A plurality of sub-layers for different data types are the visible units containing visible variables of one of the data types listed above.)
	Tran is in the same field of endeavor as the claimed invention, namely Mixed-Variate Restricted Boltzmann Machines. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Tran’s system into the combination of Hinton and Goodfellow’s system by letting the visible units be different modalities, with a motivation to simultaneously model variables of multiple types and modalities. (Tran Abstract: “Modern datasets are becoming heterogeneous. To this end, we present in this paper Mixed-Variate Restricted Boltzmann Machines for simultaneously modelling variables of multiple types and modalities”)

Claims 4-5 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”) and Salakhutdinov et al. (“Deep Boltzmann Machines”). Fig. 2 from p. 451 is shown below with annotations.


[Image: media_image4.png]

Salakhutdinov Fig. 2 (annotated)
Regarding CLAIM 4 the Hinton/Goodfellow combination teaches: The method of claim 1, 
However, neither Hinton nor Goodfellow explicitly teaches: wherein the RBM is in a deep Boltzmann machine (DBM), wherein the hidden layer is one of a plurality of hidden layers.
But Salakhutdinov teaches: wherein the RBM is in a deep Boltzmann machine (DBM), wherein the hidden layer is one of a plurality of hidden layers. (Salakhutdinov Fig. 2 shows the RBM with layers (v – h1) is contained within the DBM on the right. This is labeled above as First RBM. The visible layer in the original RBM was duplicated for pretraining according to p. 451. The hidden layer h1 is one of a plurality of hidden layers h1 and h2.) 
Salakhutdinov is in the same field of endeavor as the claimed invention, namely Deep Boltzmann Machines. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Salakhutdinov’s system into Hinton/Goodfellow’s system by including the RBM in a DBM, with motivations of learning internal representations that become increasingly complex; high-level representations can be built from a large supply of unlabeled sensory inputs; and DBMs can better propagate uncertainty about, and hence deal more robustly with, ambiguous inputs. (Salakhutdinov page 450 § 3)

Regarding CLAIM 5 the combination of Hinton, Goodfellow, and Salakhutdinov teaches: The method of claim 4, 
However, neither Hinton nor Goodfellow explicitly teaches: wherein the RBM is a first RBM and the hidden layer is a first hidden layer of the plurality of hidden layers, wherein the method further comprises: sampling the hidden layer from the first RBM; stacking the visible layer and the hidden layer from the first RBM into a vector; training a second RBM, wherein the vector is a visible layer of the second RBM; and generating the DBM by copying weights from the first and second RBMs to the DBM.
But Salakhutdinov teaches: wherein the RBM is a first RBM and the hidden layer is a first hidden layer of the plurality of hidden layers, (Salakhutdinov Fig. 2 shows the RBM with layers (v – h1) is contained within the DBM on the right. This is labeled above as First RBM. The visible layer in the original RBM was duplicated for pretraining according to p. 451. The hidden layer h1 is one of a plurality of hidden layers h1 and h2.) 
wherein the method further comprises: sampling the hidden layer from the first RBM; (inferring probabilities of h1 on p. 451, col. 2)
stacking the visible layer and the hidden layer from the first RBM into a vector; (The First RBM in Fig. 2 shows hidden layer h1 stacked on top of visible layer v. A vector is interpreted as the vector of data in the hidden layer h1, and the visible layer data flows into that vector.) 
training a second RBM, wherein the vector is a visible layer of the second RBM; and (In the second RBM from Fig. 2, h1 becomes a visible layer.)
generating the DBM by copying weights from the first and second RBMs to the DBM. (In Fig. 2, the first and second RBMs are composed into the DBM).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Salakhutdinov’s system into Hinton/Goodfellow/Salakhutdinov’s system by building the DBM from multiple RBMs, with a motivation of learning internal representations that become increasingly complex; high-level representations can be built from a large supply of unlabeled sensory inputs; and DBMs can better propagate uncertainty about, and hence deal more robustly with, ambiguous inputs. (Salakhutdinov page 450 § 3)
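
The layer-wise construction mapped above can be sketched as follows. This is a simplified illustration under stated assumptions: biases are omitted, training uses one step of contrastive divergence, and the input duplication and weight adjustments described by Salakhutdinov for pretraining are not modeled. Following the interpretation given in the claim mapping, the inferred first hidden layer serves as the visible layer of the second RBM. The names `train_rbm` and `build_dbm` are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, W):
    """P(h_j = 1 | v) for a bias-free RBM with weights W[i][j]."""
    return [sigmoid(sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(W[0]))]

def visible_probs(h, W):
    """P(v_i = 1 | h) for the same RBM."""
    return [sigmoid(sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(W))]

def train_rbm(data, n_hidden, epochs, lr, rng):
    """Train a bias-free RBM with one-step contrastive divergence."""
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
         for _ in range(n_visible)]
    for _ in range(epochs):
        for v0 in data:
            p_h0 = hidden_probs(v0, W)
            h0 = [1.0 if rng.random() < p else 0.0 for p in p_h0]
            v1 = visible_probs(h0, W)          # reconstruction
            p_h1 = hidden_probs(v1, W)
            for i in range(n_visible):
                for j in range(n_hidden):
                    W[i][j] += lr * (v0[i] * p_h0[j] - v1[i] * p_h1[j])
    return W

def build_dbm(data, n_h1, n_h2, rng):
    """Train two RBMs in sequence and copy their weights into a DBM."""
    W1 = train_rbm(data, n_h1, epochs=5, lr=0.1, rng=rng)
    h1_data = [hidden_probs(v, W1) for v in data]   # inferred first hidden layer
    W2 = train_rbm(h1_data, n_h2, epochs=5, lr=0.1, rng=rng)
    return {"W1": [row[:] for row in W1], "W2": [row[:] for row in W2]}
```

The returned dictionary holds copies of both weight matrices, so later fine-tuning of the composed model would not alter the individually trained RBMs.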

Regarding CLAIM 19, the Hinton/Goodfellow combination teaches: The non-transitory machine readable medium of claim 17, 
However, neither Hinton nor Goodfellow explicitly teaches: wherein the RBM is in a deep Boltzmann machine (DBM), wherein the hidden layer is one of a plurality of hidden layers.
But Salakhutdinov teaches: wherein the RBM is in a deep Boltzmann machine (DBM), wherein the hidden layer is one of a plurality of hidden layers. (Salakhutdinov Fig. 2 shows the RBM with layers (v – h1) is contained within the DBM on the right. This is labeled above as First RBM. The visible layer in the original RBM was duplicated for pretraining according to p. 451. The hidden layer h1 is one of a plurality of hidden layers h1 and h2.)
Salakhutdinov is in the same field of endeavor as the claimed invention, namely Deep Boltzmann Machines. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Salakhutdinov’s system into Hinton/Goodfellow’s system by including the RBM in a DBM, with motivations of learning internal representations that become increasingly complex; high-level representations can be built from a large supply of unlabeled sensory inputs; and DBMs can better propagate uncertainty about, and hence deal more robustly with, ambiguous inputs. (Salakhutdinov page 450 § 3)

Regarding CLAIM 20 the combination of Hinton, Goodfellow, and Salakhutdinov teaches: The non-transitory machine readable medium of claim 19, 
However, neither Hinton nor Goodfellow explicitly teaches: wherein the RBM is a first RBM and the hidden layer is a first hidden layer of the plurality of hidden layers, wherein the method further comprises: sampling the hidden layer from the first RBM; stacking the visible layer and the hidden layer from the first RBM into a vector; training a second RBM, wherein the vector is a visible layer of the second RBM; and generating the DBM by copying weights from the first and second RBMs to the DBM.
But Salakhutdinov teaches: wherein the RBM is a first RBM and the hidden layer is a first hidden layer of the plurality of hidden layers, (Salakhutdinov Fig. 2 shows the RBM with layers (v – h1) is contained within the DBM on the right. This is labeled above as First RBM. The visible layer in the original RBM was duplicated for pretraining according to p. 451. The hidden layer h1 is one of a plurality of hidden layers h1 and h2.)
wherein the method further comprises: sampling the hidden layer from the first RBM; (inferring probabilities of h1 on p. 451, col. 2)
stacking the visible layer and the hidden layer from the first RBM into a vector; (The First RBM in Fig. 2 shows hidden layer h1 stacked on top of visible layer v. A vector is interpreted as the vector of data in the hidden layer h1, and the visible layer data flows into that vector.) 
training a second RBM, wherein the vector is a visible layer of the second RBM; and (In the second RBM from Fig. 2, h1 becomes a visible layer.)
generating the DBM by copying weights from the first and second RBMs to the DBM. (In Fig. 2, the first and second RBMs are composed into the DBM).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Salakhutdinov’s system into Hinton/Goodfellow/Salakhutdinov’s system by building the DBM from multiple RBMs, with a motivation of learning internal representations that become increasingly complex; high-level representations can be built from a large supply of unlabeled sensory inputs; and DBMs can better propagate uncertainty about, and hence deal more robustly with, ambiguous inputs. (Salakhutdinov page 450 § 3)

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”) and Nguyen et al. (“Latent Patient Profile Modelling and Applications with Mixed-Variate Restricted Boltzmann Machine”).

Regarding CLAIM 6 the Hinton/Goodfellow combination teaches: The method of claim 1, 
However, neither Hinton nor Goodfellow explicitly teaches: receiving a phenotype vector for a patient; using the RBM to generate a time progression of a disease; and treating the patient based on the generated time progression.
But Nguyen teaches: further comprising: receiving a phenotype vector for a patient; (Nguyen p. 127, Fig. 1 shows the visible layer contains the gender, age, and region of birth for a diabetic patient.)
using the RBM to generate a time progression of a disease; and (Nguyen p. 127, middle paragraph states: “the model also enables a certain degree of disease prediction, i.e., we want to guess which diagnoses will be positive for the patient in the future… More specifically, subset of diagnoses at 
treating the patient based on the generated time progression. (Nguyen p. 123, § 1, second paragraph teaches that diabetic patients should be identified by the model and given care plans: “To provide high quality healthcare, care plans are issued to patients to manage them within the community, taking steps in advance so that these people are not hospitalised. Thus, it is imperative to identify groups of patients with similar characteristics so that they can be covered by a coherent care plan.” Examiner interprets this as treating the patient based on the generated time progression.)
Nguyen is in the same field of endeavor as the claimed invention, namely, Mixed-Variate Restricted Boltzmann Machines for modeling diseases. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Nguyen’s system into Hinton/Goodfellow’s system by using the RBM to predict progression of diabetes, with a motivation to diagnose and treat patients before they must be hospitalized. (Nguyen p. 123: “To provide high quality healthcare, care plans are issued to patients to manage them within the community, taking steps in advance so that these people are not hospitalised. Thus, it is imperative to identify groups of patients with similar characteristics so that they can be covered by a coherent care plan.”)

Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”) and Zhang et al. (“Predictive Deep Boltzmann Machine for Multiperiod Wind Speed Forecasting”).

Regarding CLAIM 7 the Hinton/Goodfellow combination teaches: The method of claim 1, 
However, neither Hinton nor Goodfellow explicitly teaches: wherein the visible layer and the hidden layer are for a first time instance, wherein the hidden layer is further connected to a second hidden layer that incorporates data from a different second time instance.
But Zhang teaches: wherein the visible layer and the hidden layer are for a first time instance, (Zhang p. 1417:
[Image: media_image5.png]
The first time instance x1 propagates through the visible layer and the hidden layer in PDBM Fig. 4 on p. 1419.)
wherein the hidden layer is further connected to a second hidden layer that incorporates data from a different second time instance. (The second time instance x2 propagates to the second hidden layer in PDBM Fig. 4 on p. 1419.)
	Zhang is in the same field of endeavor as the claimed invention, namely Deep Boltzmann Machines for time series. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Zhang’s system into Hinton/Goodfellow’s system by propagating first and second time instances through the first and second hidden layers of the DBM, as described in the claim mapping, with a motivation to forecast the wind speed for managing operations in wind power plants (Zhang Abstract, lines 1-2).

Regarding CLAIM 8 the Hinton/Goodfellow combination teaches: The method of claim 1, 
However, neither Hinton nor Goodfellow explicitly teaches: wherein the visible layer is a composite layer comprising data for a plurality of different time instances.
But Zhang teaches: wherein the visible layer is a composite layer comprising data for a plurality of different time instances. (Zhang p. 1417, §II-B, lines 3-6 teaches: “For multiperiod time series prediction, it [the wind speed data sequence] is to predict the value of x_{t+τ} by using the previous M data, where t is the index of the time series and τ denotes the forecast horizon.” The wind speed data sequence X is the visible layer as shown in Fig. 4 on p. 1419.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Zhang’s system into the combination of Hinton and Goodfellow’s system as described in the claim mapping with a motivation to forecast the wind speed for managing operations in wind power plants (Zhang Abstract, lines 1-2).
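For illustration only, the passage quoted from Zhang (predicting x_{t+τ} from the previous M data) can be sketched as a sliding-window construction in which each window of M consecutive values plays the role of a composite visible layer spanning several time instances. The function name `make_windows`, the sample values, and the choice of M = 3 and τ = 2 are hypothetical and are not taken from Zhang or from the application.

```python
from typing import List, Tuple


def make_windows(series: List[float], m: int, tau: int) -> List[Tuple[List[float], float]]:
    """Pair each window of m consecutive values with the value tau steps
    after the last element of that window."""
    pairs = []
    # t is the index of the most recent value inside the window.
    for t in range(m - 1, len(series) - tau):
        window = series[t - m + 1 : t + 1]  # the previous m data points
        target = series[t + tau]            # the value at forecast horizon tau
        pairs.append((window, target))
    return pairs


# Hypothetical wind speed readings, one per time instance.
wind_speed = [5.0, 5.5, 6.1, 5.8, 6.4, 7.0]
pairs = make_windows(wind_speed, m=3, tau=2)
```

Each window here holds data for three different time instances, which is the sense in which the input is a composite of a plurality of time instances.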

Claims 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”) and Dutt et al. (“Generative Adversarial Networks (GAN) Review”). 

Regarding CLAIM 11 the Hinton/Goodfellow combination teaches: The method of claim 1
However, neither Hinton nor Goodfellow explicitly teaches: drawing data samples based on authentic data; drawing fantasy samples based from the RBM; and training the adversarial model based on the adversarial model's ability to distinguish between the data samples and the fantasy samples.
	But Dutt teaches: further comprising training the adversarial model by: drawing data samples based on authentic data; (Dutt’s Fig. 2 at the top of p. 2 teaches real data are input to the discriminator.)
drawing fantasy samples based from the RBM; and (Fig. 2 teaches a mimicked signal from a generator is input to the discriminator. Dutt further teaches that the generator can be an RBM at p. 3, end of col. 1: “both the generator as well as discriminator networks can be any networks like RBMs”)
training the adversarial model based on the adversarial model's ability to distinguish between the data samples and the fantasy samples. (Dutt teaches on p. 2, middle of col. 1: [Image media_image6.png: excerpt from Dutt p. 2, col. 1.] Training is illustrated in Fig. 2 by the Matching Score and feedback into the generator.)
	Dutt is in the same field of endeavor as the claimed invention, namely GANs. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Dutt’s system into Hinton/Goodfellow’s system by performing GAN training with Hinton/Goodfellow’s generator and adversarial/discriminator models where “the advantages are that Markov chains are never needed, only backprop is used to obtain gradients, no inference is needed during learning, and a wide variety of functions can be incorporated into the model.” (Goodfellow, page 7, § 6)

Regarding CLAIM 12, the combination of Hinton, Goodfellow, and Dutt teaches: The method of claim 11, 
However, neither Hinton nor Dutt explicitly teaches: wherein training the adversarial model comprises measuring a probability that a particular sample is drawn from either the authentic data or the RBM.
But Goodfellow teaches: wherein training the adversarial model comprises measuring a probability that a particular sample is drawn from either the authentic data or the RBM. (Goodfellow, page 2, § 3: “D(x) represents the probability that x came from the data rather than p_g.”)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Goodfellow’s system into Hinton/Goodfellow/Dutt’s system by determining with the adversary/discriminator whether the sample is from authentic or fantasy data, with a motivation to train the GAN.
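For illustration only, the quantity D(x) quoted from Goodfellow, and its use in training a discriminator to tell data samples from fantasy samples, can be sketched as follows. This sketch assumes a single-layer logistic discriminator for brevity, whereas Goodfellow describes a multilayer perceptron; the function names, weights, and sample values are hypothetical.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def discriminator_prob(x: List[float], weights: List[float], bias: float) -> float:
    """D(x): estimated probability that x came from the authentic data
    rather than from the generator."""
    score = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(score)


def discriminator_loss(data_samples: List[List[float]],
                       fantasy_samples: List[List[float]],
                       weights: List[float], bias: float) -> float:
    """Binary cross-entropy: small when D is near 1 on data samples
    and near 0 on fantasy samples."""
    eps = 1e-12  # guards against log(0)
    loss = 0.0
    for x in data_samples:
        loss -= math.log(discriminator_prob(x, weights, bias) + eps)
    for x in fantasy_samples:
        loss -= math.log(1.0 - discriminator_prob(x, weights, bias) + eps)
    return loss / (len(data_samples) + len(fantasy_samples))


# Hypothetical samples: one authentic, one drawn from the generator.
data = [[1.0, 1.0]]
fantasy = [[0.0, 0.0]]
good = discriminator_loss(data, fantasy, weights=[2.0, 2.0], bias=-2.0)
bad = discriminator_loss(data, fantasy, weights=[-2.0, -2.0], bias=2.0)
```

A discriminator that separates the two sample sets yields the lower loss (`good` is smaller than `bad`), which is the sense in which training depends on the ability to distinguish the samples.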

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”) and Nguyen, Phung et al. (“Supervised Restricted Boltzmann Machines”), hereinafter Phung.

Regarding CLAIM 14 the Hinton/Goodfellow combination teaches: The method of claim 1 
However, neither Hinton nor Goodfellow explicitly teaches: further comprising using the RBM to generate a set of samples of a target population.
But Phung teaches: further comprising using the RBM to generate a set of samples of a target population. (Phung, p. 3, § 3, lines 5-7 state: “The model now consists of two components: an RBM with undirected connections of joint distribution p (v, h) that has generative capability of representing data”. In the experiments on p. 8, § 4.3, lines 1-4, the supervised RBM is used to predict relative location for CT slices, where the dataset consists of CT images from 74 patients. Under the broadest reasonable interpretation, the generative part of the model generates a set of samples of a target population (the patients).)
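Phung is in the same field of endeavor as the claimed invention, namely using RBMs for healthcare applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Phung’s system into Hinton/Goodfellow’s system as described in the claim mapping, with a motivation to achieve better predictive performance than baseline methods. (Phung Abstract: “Extensive experiments on real-world datasets show that our sRBM achieves better predictive performance than baseline methods.”)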

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”) and Cho et al. (“Gaussian-Bernoulli deep Boltzmann machine”).

Regarding CLAIM 15, the Hinton/Goodfellow combination teaches: The method of claim 1, wherein computing a set of likelihood gradients comprises
	However, neither Hinton nor Goodfellow explicitly teaches: computing a convex combination of a Monte Carlo estimate and a mean field estimate.
	But Cho teaches: computing a convex combination of a Monte Carlo estimate and (At the end of p. 3, the formula [Image media_image7.png: formula from Cho, end of p. 3] is a convex combination of a Monte Carlo estimate. The first sentence of p. 3, § 3.2 states that the second term of the gradient of equation (2) can be computed using Markov-chain Monte Carlo sampling. Equation (2) is at the top of p. 3.)
[computing] a mean field estimate. (According to p. 3, § 3.1, lines 3-4, the first term of equation (2) is computed with mean field approximation.)
	Cho is in the same field of endeavor as the claimed invention, namely DBMs. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Cho’s system into Hinton/Goodfellow’s system by computing the terms above, with a motivation to compute the first term of the gradient of Cho’s equation (2). (Cho, p. 3, § 3.1, lines 1-4: “Computing the first term of (2) is straightforward for restricted Boltzmann machines because in that model the hidden neurons are independent of each other given the visible neurons. However, this does not apply to GDBM and therefore one needs to use some sort of approximation”)
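For illustration only, a convex combination of two gradient estimates is a weighted average whose weights are non-negative and sum to one. The sketch below assumes both estimates are given as equal-length lists and that a single mixing weight `alpha` is used; the function name and the numeric values are hypothetical and are not taken from Cho or from the application.

```python
from typing import List


def convex_combination(mc_estimate: List[float],
                       mf_estimate: List[float],
                       alpha: float) -> List[float]:
    """Return alpha * mc + (1 - alpha) * mf, elementwise, with 0 <= alpha <= 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    if len(mc_estimate) != len(mf_estimate):
        raise ValueError("estimates must have the same length")
    return [alpha * mc + (1.0 - alpha) * mf
            for mc, mf in zip(mc_estimate, mf_estimate)]


# Hypothetical Monte Carlo and mean field estimates of the same gradient.
combined = convex_combination([1.0, 0.0], [0.0, 1.0], alpha=0.25)
```

Setting `alpha` to 1 or 0 recovers the pure Monte Carlo or pure mean field estimate, respectively.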

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Hinton ("A Practical Guide to Training Restricted Boltzmann Machines") in view of Goodfellow et al. (“Generative Adversarial Nets”), Li et al. (“Temperature based Restricted Boltzmann Machines”), and Chatterjee et al. (“Explaining Complex Distributions with Simple Models”).

Regarding CLAIM 16 the Hinton/Goodfellow combination teaches: The method of claim 1, wherein computing a set of likelihood gradients comprises: 
However, neither Hinton nor Goodfellow explicitly teaches: initializing a plurality of samples; initializing an inverse temperature for each sample of the plurality of samples; for each sample of the plurality of samples: updating the inverse temperature by sampling from an autocorrelated Gamma distribution; and updating the sample using Gibbs sampling.
But Li teaches: initializing a plurality of samples; (On page 2, in equation (1), each visible unit vector v corresponds to a sample. The term n_v from equation (1) is the number of units in the visible vector. The combinations of values of the units form a plurality of samples.)
initializing an inverse temperature for each sample of the plurality of samples; (On page 2, equation (2) is the joint probability distribution of visible and hidden vectors, and it has an inverse temperature factor (1/T) in the exponential.)
for each sample of the plurality of samples: 
updating the inverse temperature (On p. 6, last paragraph, lines 6-7 state that the experiments used different temperatures T/T0 including 0.1, 0.2, 0.5, 0.8, 1.0, 1.2, 1.5, and 2.0, where T/T0 = 1 (line 4) corresponds to the standard RBM.)
updating the sample using Gibbs sampling. (Page 3, § “Contrastive divergence for pre-training a TRBM” states, “In the pre-training stage, the contrastive divergence algorithm performs MCMC/Gibbs sampling and is used inside a gradient descent procedure to compute weight update” (emphasis added). On page 4, following equation 10, Li teaches: “where v_j^k is the k-th Gibbs sampling of v_j.”)
	Li is in the same field of endeavor as the claimed invention, namely, temperature-based Restricted Boltzmann Machines. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Li’s system into Hinton/Goodfellow’s system by introducing a temperature parameter to the joint probability distribution of visible and hidden vectors, with a motivation to control the selectivity of the firing neurons in the hidden layers. (Li, Abstract: “In this work, we propose temperature based restricted Boltzmann machines (TRBMs) which reveals that temperature is an essential parameter controlling the selectivity of the firing neurons in the hidden layers.”)
	However, neither Hinton, Goodfellow, nor Li explicitly teaches: [updating the inverse temperature] by sampling from an autocorrelated Gamma distribution; and
But Chatterjee teaches: [updating the inverse temperature] by sampling from an autocorrelated Gamma distribution; and (p. 4 § 1.1.2 teaches a Gamma distribution by equation (1.21) which is dependent on temperature T. Any sequence of stationary random variables has an autocorrelation, so the sequence is an autocorrelated Gamma distribution.)
(Equation 1.21).
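For illustration only, the sequence of steps recited in claim 16 (initialize samples, initialize an inverse temperature per sample, then repeatedly update each inverse temperature and each sample) can be sketched as below. The autocorrelated Gamma update shown is an assumed autoregressive Gamma construction (a Poisson-mixed Gamma draw whose stationary mean is 1); it is not taken from Li, Chatterjee, or the application. The weights, biases, and the parameters `rho` and `delta` are likewise hypothetical.

```python
import math
import random
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Knuth's method; adequate for the small rates used in this sketch."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def update_inverse_temperature(beta: float, rho: float, delta: float,
                               rng: random.Random) -> float:
    """One step of an assumed autoregressive Gamma process: the new value is
    Gamma distributed and correlated with the previous value through rho."""
    scale = (1.0 - rho) / delta               # chosen so the stationary mean is 1
    z = sample_poisson(rho * beta / scale, rng)
    return rng.gammavariate(delta + z, scale)  # second argument is the scale


def gibbs_step(v: List[int], beta: float, W: List[List[float]],
               a: List[float], b: List[float],
               rng: random.Random) -> Tuple[List[int], List[int]]:
    """One Gibbs sweep of a binary RBM at inverse temperature beta."""
    n_v, n_h = len(a), len(b)
    h = [1 if rng.random() < sigmoid(beta * (b[j] + sum(v[i] * W[i][j] for i in range(n_v)))) else 0
         for j in range(n_h)]
    v_new = [1 if rng.random() < sigmoid(beta * (a[i] + sum(W[i][j] * h[j] for j in range(n_h)))) else 0
             for i in range(n_v)]
    return v_new, h


rng = random.Random(0)
W = [[0.5, -0.3], [0.2, 0.8], [-0.6, 0.1]]  # hypothetical 3 visible x 2 hidden weights
a = [0.0, 0.1, -0.1]                        # hypothetical visible biases
b = [0.2, -0.2]                             # hypothetical hidden biases

samples = [[rng.randint(0, 1) for _ in a] for _ in range(4)]  # initialize samples
betas = [1.0 for _ in samples]              # initialize one inverse temperature per sample

for _ in range(10):
    for s in range(len(samples)):
        betas[s] = update_inverse_temperature(betas[s], rho=0.5, delta=5.0, rng=rng)
        samples[s], _ = gibbs_step(samples[s], betas[s], W, a, b, rng)
```

With `rho` set to 0 the inverse temperatures would be independent Gamma draws; a positive `rho` is what makes successive values autocorrelated in this sketch.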

Response to Arguments
	Examiner will respond to Applicant’s remarks, claim amendments, replacement drawings, and specification amendments all filed 10/28/2021.
	On page 11, Applicant provides a summary of amendments and a summary of the interview held on October 4, 2021. On pages 11-13, Applicant provides a summary of the office action dated July 7, 2021.

Objections to the Drawing (Remarks p. 13): The objection to Fig. 15 is withdrawn due to the amendments to specification paragraph [0110]. The objections to Figs. 6, 11, 13, 14, and 16 are withdrawn due to the replacement drawings. The objection to reference sign 345 in specification paragraph [0049] is withdrawn due to the amendment to this paragraph. The objections regarding reference signs 800 from Fig. 8 and 1600 from Fig. 16 are withdrawn due to the amendments to specification paragraphs [0083] and [0128], respectively.

Claim Rejections under 35 U.S.C. § 112 (Remarks p. 14): The rejections of claims 3-5, 12, and 19-20 are withdrawn due to the claim amendments.

Claim Rejections under 35 U.S.C. § 101 (Remarks pp. 14-17): Applicant's arguments have been fully considered but they are not persuasive. 
Applicant’s Argument # 1 (Remarks p. 15)

[Image media_image8.png: excerpt of Applicant’s remarks, p. 15]

Examiner’s Response #1
Claim 1 recites mathematical computations, which are a type of judicial exception. Improvements to judicial exceptions are still judicial exceptions. 
Applicant cites portions of the specification on pp. 15-16.

Applicant’s Argument # 2 (Remarks p. 16)

[Image media_image9.png: excerpt of Applicant’s remarks, p. 16]

Examiner’s Response # 2
	Claims are given their broadest reasonable interpretation, and Examiner should not read the specification into the claims. Claim 1 does not explicitly recite a process to train models to generate more accurate samples than those generated by models trained using prior methods. 
The claim rejections are maintained.

Claim Rejections under 35 U.S.C. § 103 (Remarks pp. 17-16): Applicant's arguments have been fully considered but they are not persuasive. As explained by the 35 U.S.C.  § 103 rejection of claim 1 in this office action, the architecture of a restricted Boltzmann machine is a model with a visible layer and a hidden layer. The broadest reasonable interpretation of the amendment to claim 1, “wherein the adversarial model includes an architecture that is the same as that of the RBM,” includes the adversarial model having a visible layer and a hidden layer, such as a multilayer perceptron. Goodfellow teaches a discriminative model comprising a multilayer perceptron in the Abstract, lines 1-4 and on p. 2, § 3, lines 4-6. A discriminative model is an adversarial model in light of specification ¶ [0111], line 3. 
The claim rejections are maintained.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648. The examiner can normally be reached Monday - Friday, 9:00 am - 6:00 pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar can be reached on (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ASHER H. JABLON/Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127