DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-30 are presented for examination.

Response to Amendment
Applicant’s amendment has obviated most, but not all, of the objections to the specification, drawings, and claims given in the last Office Action.  To the extent that an objection or rejection appears in the previous Office Action(s) but not this Office Action, that objection or rejection is withdrawn.  To the extent that it appears both in the previous Office Action(s) and this Office Action, the objection or rejection is maintained.

Specification
The disclosure is objected to because of the following informalities:
Examiner notes that “data” is the plural of “datum” and that the specification contains multiple instances of the term “data” being used as singular.  Examiner requests that all such instances be corrected.  Examiner has identified instances of this error in paragraphs 40-41, 43-44, 48 (two instances), 78 (three instances), and 126.
In paragraph 65, “execute to cluster data” should be “execute clustering of data”.
In paragraph 95, the symbol “//” should be changed to a comma for consistency with the remainder of the specification.
In paragraph 137, “between the 10 different classes” should be “among the 10 different classes”.
In paragraph 138, “preprintarXiv:” should be “preprint arXiv:”.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1, 13-17, and 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over Prince, Tutorial #5: Variational Autoencoders, Borealis AI, https://www.borealisai.com/en/blog/tutorial-5-variational-auto-encoders/ (Jan. 28, 2020) (“Prince”) in view of Bingham et al. (US 20210316455) (“Bingham”) and further in view of Zhang et al., “Semi-Supervised Learning of Bearing Anomaly Detection via Deep Variational Autoencoders,” in arXiv preprint arXiv:1912.01096 (2019) (“Zhang”).
Regarding claim 1, Prince discloses “[a] non-transitory computer-readable medium having stored thereon computer-readable instructions (Prince, section entitled “The Reparameterization Trick” discloses that PyTorch/Tensorflow may be used to perform automatic differentiation via backpropagation [note that PyTorch/Tensorflow must be stored in memory, or a non-transitory computer-readable medium]) that when executed by a computing device cause the computing device to: …
(B) select a first batch of classified observation vectors from a plurality of classified observation vectors, … wherein the first batch of classified observation vectors includes the predefined number of observation vectors (during learning I samples [I = predefined number] of training data are given and the log likelihood of the model with respect to the parameters is maximized – Prince, section entitled “Evidence Lower Bound (ELBO),” first paragraph); 
(C) compute a prior regularization error value using a β-divergence distance computation (in one formulation of the ELBO, the ELBO is written as the difference of two terms, one of which measures the Kullback-Leibler divergence between an auxiliary distribution and the prior [prior regularization error value; note that KL divergence is a special case of β-divergence] – Prince, section entitled “ELBO as Reconstruction Loss Minus KL to Prior”); 
(D) compute a decoder reconstruction error value (in one formulation of the ELBO, the ELBO is written as the difference of two terms, one of which measures the reconstruction loss – Prince, section entitled “ELBO as Reconstruction Loss Minus KL to Prior”); 
(E) generate a first batch of noise observation vectors using a predefined noise function, wherein the first batch of noise observation vectors includes a predefined number of observation vectors (a VAE trained on the CELEBA faces dataset generates sample images by predicting the mean of the pixel data based on a random value of a latent variable, and per-pixel spherical Gaussian noise is added to each image [predefined noise function = spherical Gaussian; predefined number = number of images in dataset] – Prince, Fig. 11 and accompanying text); 
(F) compute an evidence lower bound (ELBO) value from the computed prior regularization error value and the computed decoder reconstruction error value (in one formulation of the ELBO, the ELBO is written as the difference of two terms, one of which measures the Kullback-Leibler divergence between an auxiliary distribution and the prior [prior regularization error value] and the other of which measures the reconstruction loss [decoder reconstruction error value] – Prince, section entitled “ELBO as Reconstruction Loss Minus KL to Prior”); 
(G) compute a gradient of an encoder neural network model (a variational autoencoder computes the ELBO for a point by estimating the mean and variance of the posterior distribution of the data point using an encoder network; a sample is drawn from this distribution and the ELBO is computed; to maximize the sum of the ELBO over all data examples, stochastic gradient descent is performed by running mini-batches of points through the network [since the variational autoencoder includes the encoder network, the gradient is deemed for purposes of examination to be the gradient of, inter alia, the encoder] – Prince, section entitled “The Variational Autoencoder”); 
(H) update the ELBO value (automatic differentiation may be performed via backpropagation to optimize [update] the cost [ELBO] function; because there is no way to differentiate through the sampling step, a reparameterization trick is performed – Prince, section entitled “The Reparameterization Trick”); 
(I) update a decoder neural network model with a plurality of observation vectors … (Prince Fig. 9 shows that, for each training data example x [classified observation vector, including noise, see Fig. 11], an encoder computes a latent vector h and a sample h* from a variational distribution is fed into the decoder to make a prediction; section entitled “The Reparameterization Trick” shows that the parameters θ may be updated in this process through a reparameterization trick [note that since the decoder is based on the sample h* which is based on the updated parameters θ, the decoder is updated]); 
(J) update the encoder neural network model with the plurality of observation vectors (Prince Fig. 9 shows that, for each training data example x [classified observation vector, including noise, see Fig. 11], an encoder computes a latent vector h and a sample h* from a variational distribution is fed into the decoder to make a prediction; section entitled “The Reparameterization Trick” shows that the parameters θ may be updated in this process through a reparameterization trick [since the encoder is based on the parameters θ, updating of θ implies updating the encoder]); [and]
(K) train the decoder neural network model to classify [a] plurality of unclassified observation vectors and the first batch of noise observation vectors by repeating (A) to (J) (Prince Fig. 9 discloses that an encoder encodes each training example x to create a variational distribution; a sample h* from the variational distribution is input to a decoder, and the decoder makes a prediction of x based on h* [note that the goal of training is to classify unclassified observation vectors; note also that the above process is repeated for each training data sample])….”
Bingham discloses “(A) select[ing] a first batch of unclassified observation vectors from a plurality of unclassified observation vectors, wherein the first batch of unclassified observation vectors includes a predefined number of observation vectors (system for determining a robot collision processes a state sequence of the robot performing a task including at least a current state of the robot, a next state of the robot, and an action to transition the robot from the current state to the next state [states/actions = unclassified observation vector] – Bingham, paragraph 60; after processing, the system determines whether to process any additional states and corresponding actions of the robot, and the system may determine not to process any additional states and actions based on the robot completing the task [so that the states/actions are part of a batch whose number is predetermined by whether the robot has completed the task or not] – id. at paragraph 66); 
(B) select[ing] a first batch of classified observation vectors …, wherein a target variable value is defined to represent a class for each respective observation vector of the plurality of classified observation vectors (system generates training data by generating a sequence of actions for a robot to perform a task and captures a state sequence corresponding to the sequence of actions; if the system determines that there is no collision, the sequence of actions and corresponding state sequence are stored as a training instance [vector = state/action sequence; target variable value = collision/no collision] – Bingham, paragraphs 32-34; see also Fig. 3);…
(I) updat[ing] a decoder neural network model with a plurality of observation vectors, wherein the plurality of observation vectors includes the first selected batch of unclassified observation vectors (after selecting a training instance, the initial state is selected as the current state, and a latent space of the current state is generated; a predicted latent space of the next state is generated, followed by the latent space of the next state; the predictor network portion [decoder] of a cascading variational autoencoder is then updated based on a generated loss – Bingham, Fig. 5 and paragraph 54);…
(K) train[ing] the decoder neural network model … until the computed ELBO gradient value indicates a decoder loss value and an encoder loss value have converged or a predetermined maximum number of iterations of (A) is performed (if there are no additional states, the system determines whether to perform any additional training on the predictor network portion [decoder] of the cascading VAE; the system can determine to perform more training if a criterion such as a threshold number of epochs [predetermined maximum number of iterations] has not yet been satisfied – Bingham, paragraph 57; see also Fig. 5, esp. ref. chars. 516-520); 
determin[ing] the target variable value for each observation vector of the plurality of unclassified observation vectors based on an output of the trained decoder neural network model (from a state sequence of the robot including a current state of the robot, a next state, and an action to transition from the current state to the next state [these three items forming an unclassified observation vector], a latent space of a next state and a predicted latent space of a next state are generated, and the VAE determines whether a collision has occurred [collision/no collision = target variable value]; the predicted latent space is output from a prediction network [decoder] portion of the cascading VAE; process continues until there are no additional states – Bingham, Fig. 6 and paragraphs 60-64 and 66); and 
output[ting] the target variable value for each observation vector of the plurality of unclassified observation vectors, wherein the target variable value selected for each observation vector of the plurality of unclassified observation vectors identifies the class of a respective observation vector (based on the latent space of the next state and the predicted latent space of the next state, the system determines whether there is a collision [target variable value whose class is either collision or no collision] – Bingham, paragraph 64; process continues as long as there are additional states and corresponding actions of the robot performing the task – id. at paragraph 66; see also Fig. 6).”
Bingham and the instant application relate to variational autoencoders and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prince to train and use the variational autoencoder using both classified and unclassified vectors and use the network to predict a target variable value, as disclosed by Bingham, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would allow the system to be optimally trained to classify unknown observations.  See Bingham, paragraphs 62-64.
Neither Prince nor Bingham appears to disclose explicitly the further limitations of the claim.  However, Zhang discloses “(C) comput[ing] a prior regularization error value … from the selected first batch of unclassified observation vectors and the selected first batch of classified observation vectors (a straightforward implementation of VAE in semi-supervised learning is to train the model using both labeled and unlabeled training data; the encoder part is utilized to provide a clustering of input data in the latent space; the dimension of latent space variables z is much smaller than that of the input data x – Zhang, sec. 3.1; see also Eq. (6) (showing that the KL regularization value is one component of the ELBO and is calculated based on both z and x [so that the regularization error is ultimately computed based on both the labeled data and the unlabeled data])), wherein a beta value for the β-divergence distance computation is greater than zero and less than one (when training a VAE-based M1 model, it is difficult to train a straight implementation of VAE that equally weights the likelihood and the KL divergence; to overcome this, the implementation uses “β-VAE” in which the introduced weight factor of the KL divergence term β will gradually increase from 0 to 1 over the course of the training epoch – Zhang, sec. 3.3.1, second paragraph);
(D) comput[ing] a … reconstruction error value from the selected first batch of unclassified observation vectors and the selected first batch of classified observation vectors (a straightforward implementation of VAE in semi-supervised learning is to train the model using both labeled and unlabeled training data; the encoder part is utilized to provide a clustering of input data in the latent space; the dimension of latent space variables z is much smaller than that of the input data x – Zhang, sec. 3.1; see also Eq. (6) (showing that the reconstruction error value is one component of the ELBO and is calculated based on both z and x [so that the reconstruction error is ultimately computed based on both the labeled data and the unlabeled data])); …
(F) comput[ing] an evidence lower bound (ELBO) gradient value (the maximization of ELBO requires its gradients with respect to model parameters and variational parameters, which are generally intractable; the dominant approach for circumventing this is by Monte Carlo (MC) integration to approximate the expectation of the gradients – Zhang, last paragraph before sec. 3) …; … [and]
(H) updat[ing] an ELBO value using the computed ELBO gradient value (the maximization of ELBO requires its gradients with respect to model parameters and variational parameters, which are generally intractable; the dominant approach for circumventing this is by Monte Carlo (MC) integration to approximate the expectation of the gradients – Zhang, last paragraph before sec. 3; see also last paragraph before sec. 3.3 (disclosing that the gradient descents are used to update the network parameters; note that these parameters are used to calculate the ELBO value per Eq. (6), so updating the parameters updates the ELBO value))….”
Zhang and the instant application both relate to semi-supervised learning using variational autoencoders and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Prince and Bingham to use semi-supervised data to compute the ELBO gradient value, as disclosed by Zhang, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would allow for effective utilization of the dataset when only a small subset of data have labels.  See Zhang, abstract.
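
For illustration only, the ELBO decomposition relied upon above (a reconstruction term minus a weighted KL-to-prior term, with the weight β raised from 0 to 1 during training) may be sketched as follows. The sketch is not drawn from Prince, Bingham, Zhang, or the instant application. The function names, the unit-variance Gaussian decoder, the standard-normal prior, and the linear warm-up schedule are all assumptions made for the example. Note also that β here is the KL weight of a β-VAE as described in Zhang, sec. 3.3.1, and the sketch does not implement a β-divergence.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ) (assumed prior)."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

def gaussian_log_likelihood(x, x_hat):
    """log p(x | h) for a unit-variance Gaussian decoder (assumed likelihood)."""
    sq_err = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return -0.5 * sq_err - 0.5 * len(x) * math.log(2.0 * math.pi)

def beta_schedule(epoch, warmup_epochs):
    """Hypothetical linear warm-up of the KL weight from 0 to 1."""
    return min(1.0, epoch / warmup_epochs)

def weighted_elbo(x, x_hat, mu, log_var, beta):
    """Reconstruction term minus beta times the KL-to-prior term."""
    return gaussian_log_likelihood(x, x_hat) - beta * kl_to_standard_normal(mu, log_var)
```

With β = 0 the objective reduces to the reconstruction term alone, and with β = 1 it is the conventional ELBO, so the schedule moves gradually from one to the other.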

Claim 16 is a computing device claim corresponding to non-transitory computer-readable medium claim 1 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 17 is a method claim corresponding to non-transitory computer-readable medium claim 1 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 13, Prince, as modified by Bingham and Zhang, discloses that “the updated decoder neural network model is further output (automatic differentiation via the backpropagation algorithm may be performed to optimize the cost function [update the neural network]; because there is no way to differentiate through the sampling step, a reparameterization trick is employed [and the network including the reparameterization trick is output] – Prince, section entitled “The Reparameterization Trick” and Fig. 10).”
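
For illustration only, the reparameterization trick referred to above may be sketched as follows. The sketch is not code from Prince. It assumes a diagonal Gaussian variational distribution parameterized by a mean and a log-variance, and all function names are hypothetical. The point shown is that the sample h* is written as a deterministic function of the encoder outputs and an independent noise draw, so derivatives with respect to the encoder outputs are well defined.

```python
import math
import random

def reparameterize(mu, log_var, eps):
    """Return h* = mu + sigma * eps, where sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * e for m, lv, e in zip(mu, log_var, eps)]

def sample_latent(mu, log_var, rng):
    """Draw eps ~ N(0, I) and return (h*, eps); all randomness lives in eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    return reparameterize(mu, log_var, eps), eps

def grads_wrt_encoder_outputs(log_var, eps):
    """Per-coordinate partial derivatives of h*:
    d h*/d mu = 1 and d h*/d log_var = 0.5 * sigma * eps."""
    d_mu = [1.0 for _ in eps]
    d_log_var = [0.5 * math.exp(0.5 * lv) * e for lv, e in zip(log_var, eps)]
    return d_mu, d_log_var
```

Because eps is sampled independently of the parameters, backpropagation can pass through `reparameterize` as through any other differentiable layer.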

Claim 29 is a method claim corresponding to non-transitory computer-readable medium claim 13 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 14, Prince, as modified by Bingham and Zhang, discloses that “after (K), the computer-readable instructions further cause the computing device to: 
read a new observation vector from a dataset (state sequence of a robot performing a task including a current state of the robot, a next state of the robot, and an action to transition the robot from the current state to a next state [these three items collectively form a new observation vector, which comes from a dataset] is processed by the trained VAE – Bingham, Fig. 6, ref. char. 602); 
input the read new observation vector to the updated decoder neural network model to predict a class for the read new observation vector (predicted latent space of the next state is generated by processing the current state’s latent space and an action to transition to the next state using a prediction network portion [decoder] of a next VAE; based on the predicted latent space and the latent space, it is determined whether there is a collision [collision/no collision = class] – Bingham, Fig. 6, esp. ref. chars. 602-610); and 
output the predicted class (based on the predicted latent space and the latent space, it is determined whether there is a collision [and outputs the result] – Bingham, Fig. 6, esp. ref. char. 610; see also paragraph 64).”   It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Prince and Zhang to use the trained network to classify new observations, as disclosed by Bingham, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the user automatically to predict an unknown class with the trained network.  See Bingham, paragraphs 62-64.

Regarding claim 15, Prince, as modified by Bingham and Zhang, discloses that “the decoder reconstruction error value is computed using Monte Carlo sampling (in non-linear latent variable models, the autoencoder architecture can approximate a lower bound on the maximum likelihood [ELBO] using a Monte Carlo sampling method [note that, because the ELBO has the decoder reconstruction error as a term, the reconstruction error value must also be determined by Monte Carlo sampling] – Prince, last paragraph before “Latent Variable Models”).”
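
For illustration only, a Monte Carlo estimate of the reconstruction term of the ELBO may be sketched as follows. The sketch is not taken from any cited reference. It assumes a one-dimensional toy decoder that predicts x_hat = h with unit variance and a Gaussian variational distribution N(mu, sigma^2), chosen because the expectation then has a closed form against which the estimate can be checked. All names are hypothetical.

```python
import math
import random

def log_likelihood(x, h):
    """log p(x | h) for a toy decoder that predicts x_hat = h with unit variance."""
    return -0.5 * (x - h) ** 2 - 0.5 * math.log(2.0 * math.pi)

def mc_reconstruction_term(x, mu, sigma, n_samples, rng):
    """Average log p(x | h*) over n_samples draws h* ~ N(mu, sigma^2)."""
    total = 0.0
    for _ in range(n_samples):
        h_star = mu + sigma * rng.gauss(0.0, 1.0)
        total += log_likelihood(x, h_star)
    return total / n_samples

def exact_reconstruction_term(x, mu, sigma):
    """Closed form for this toy case, using E[(x - h)^2] = (x - mu)^2 + sigma^2."""
    return -0.5 * ((x - mu) ** 2 + sigma ** 2) - 0.5 * math.log(2.0 * math.pi)
```

For example, `mc_reconstruction_term(1.0, 0.5, 0.3, 20000, random.Random(0))` should agree closely with `exact_reconstruction_term(1.0, 0.5, 0.3)`, and the agreement tightens as the number of samples grows.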


Allowable Subject Matter
Claims 2-12 and 18-28 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Response to Arguments
Applicant's arguments filed January 6, 2022 (“Remarks”) have been fully considered but they are, except insofar as rendered moot by a new ground of rejection, not persuasive.
Regarding Applicant’s arguments with regard to the objections to the specification, Remarks at 26-27, Examiner replies as follows:  (a) Regarding Applicant’s contention that the term “data” in the cited paragraphs is used to refer to a single box, amendment of the relevant sections of the specification to read “set of input classified data” and “set of input classified data” would also be acceptable, as it would remove the impression of the reader that there is only one datum. (b) Regarding the objection to “to execute to cluster data,” it is not proper English to follow the infinitive “to execute” immediately with another infinitive.  The proper construction in this case would be to place the second verb in gerund form. (c) Regarding the objection to “between the 10 different classes,” there is in fact a grammatical error if there are more than two classes, as “between” is used for comparison between two entities and “among” is used for comparison among three or more entities.  Here, because there are ten different classes, “among” and not “between” is appropriate.  
Applicant alleges that Bingham does not teach any computation based on semi-supervised data and is therefore irrelevant to the claim language.  Remarks at 30.  However, Examiner notes that the claim does not specify that the data are “semi-supervised” as such.  Rather, the contested portions of the claim merely require that one batch of observation vectors be “unclassified” and that another be “classified,” which is 
Applicant’s further arguments that Prince and Bingham do not disclose beta values in the range (0, 1) or computation based on semi-supervised data are moot by virtue of the addition of Zhang to the rejection.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7a-5:30p ET.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.C.V./             Examiner, Art Unit 2125

/KAMRAN AFSHAR/             Supervisory Patent Examiner, Art Unit 2125