DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-30 are presented for examination.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on September 8, 2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings are objected to because (a) in Fig. 3, the terms “depth” and “width” are written in a shade that is difficult to read; (b) in Fig. 6, reference characters 600-602, 604-605, 607, and 609 are written on a shaded background, see 37 CFR § 1.84(p)(3); and (c) in Fig. 8, the numerals representing the cluster labels are written on a shaded background, see id.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities:
In paragraph 9, “data … contains” should be “data … contain”.   Examiner notes that “data” is the plural of “datum” and that the specification contains multiple instances of the term “data” being used as singular.  Examiner requests that all such instances be corrected.  Examiner has identified further instances of this error in paragraphs 40-41, 43-44, 48 (two instances), 78 (three instances), and 126.
In paragraph 28, “description128” should be “description 128”.
In paragraph 34, “programming language, scripting language, assembly language” should be “programming languages, scripting languages, assembly languages”.
In paragraphs 37 and 108, “comprised of” should be “comprising”.
In paragraph 38, “hypertext transport protocol” should be “hypertext transfer protocol”.
In paragraph 55, last equation on page 17, the close parenthesis is larger than the open parenthesis.
In paragraphs 56, 59, 94 (two instances), and 95 (two instances), the equations contain the symbol “//”, which presumably means division but is ordinarily represented by a single slash “/”.
In paragraph 65, “execute to cluster data” should be “execute clustering of data”.
In paragraph 68, “observation vectors … includes” should be “observation vectors … include”.
In paragraphs 87-89, “for example” is repeated twice in the same sentence; there should be a comma between the default value and the word “though” in the last sentence of each paragraph.
In paragraph 91, “for example” is repeated twice in the same sentence.
In paragraph 98, “with the loss function is” should be “with the loss function”.
In paragraph 129, “eight column” should be “eighth column”.
In paragraph 132, “using deep neural network” should be “using a deep neural network”.
In paragraph 137, “between the 10 different classes” should be “among the 10 different classes”.
In paragraph 138, “preprintarXiv:” (two instances) should be “preprint arXiv:”.
Appropriate correction is required.
The use of the terms BLUETOOTH (paragraph 32), JAVA (paragraph 38), and GOOGLE (paragraph 144), which are trade names or marks used in commerce, has been noted in this application. The terms should be accompanied by the generic terminology; furthermore, the terms should be capitalized wherever they appear or, where appropriate, include a proper symbol indicating use in commerce such as ™, SM, or ® following the term.
Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) is permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.

Claim Objections
Claim 2 is objected to because of the following informalities:  “plurality of classes that each represent” should be “plurality of classes that each represents”.  Appropriate correction is required.
Claims 12 and 28 are objected to because of the following informalities:  the equations contain the symbol “//”, which presumably means division but is ordinarily represented by a single slash “/”.  Appropriate correction is required.

Claim Rejections - 35 USC § 103
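In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.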
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 13-17, and 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over Prince, Tutorial #5: Variational Autoencoders, Borealis AI, https://www.borealisai.com/en/blog/tutorial-5-variational-auto-encoders/ (Jan. 28, 2020) (“Prince”) in view of Bingham et al. (US 20210316455) (“Bingham”).
Regarding claim 1, Prince discloses “[a] non-transitory computer-readable medium having stored thereon computer-readable instructions (Prince, section entitled “The Reparameterization Trick” discloses that PyTorch/Tensorflow may be used to perform automatic differentiation via backpropagation [note that PyTorch/Tensorflow must be stored in memory, or a non-transitory computer-readable medium]) that when executed by a computing device cause the computing device to: …
(B) select a first batch of classified observation vectors from a plurality of classified observation vectors, … wherein the first batch of classified observation vectors includes the predefined number of observation vectors (during learning, I samples [I = predefined number] of training data are given and the log likelihood of the model with respect to the parameters is maximized – Prince, section entitled “Evidence Lower Bound (ELBO),” first paragraph); 
(C) compute a prior regularization error value using a β-divergence distance computation (in one formulation of the ELBO, the ELBO is written as the difference of two terms, one of which measures the Kullback-Leibler divergence between an auxiliary distribution and the prior [prior regularization error value; note that KL divergence is a special case of β-divergence] – Prince, section entitled “ELBO as Reconstruction Loss Minus KL to Prior”); 
(D) compute a decoder reconstruction error value (in one formulation of the ELBO, the ELBO is written as the difference of two terms, one of which measures the reconstruction loss – Prince, section entitled “ELBO as Reconstruction Loss Minus KL to Prior”); 
(E) generate a first batch of noise observation vectors using a predefined noise function, wherein the first batch of noise observation vectors includes a predefined number of observation vectors (a VAE trained on the CELEBA faces dataset generates sample images by predicting the mean of the pixel data based on a random value of a latent variable, and per-pixel spherical Gaussian noise is added to each image [predefined noise function = spherical Gaussian; predefined number = number of images in dataset] – Prince, Fig. 11 and accompanying text); 
(F) compute an evidence lower bound (ELBO) value from the computed prior regularization error value and the computed decoder reconstruction error value (in one formulation of the ELBO, the ELBO is written as the difference of two terms, one of which measures the Kullback-Leibler divergence between an auxiliary distribution and the prior [prior regularization error value] and the other of which measures the reconstruction loss [decoder reconstruction error value] – Prince, section entitled “ELBO as Reconstruction Loss Minus KL to Prior”); 
(G) compute a gradient of an encoder neural network model (a variational autoencoder computes the ELBO for a point by estimating the mean and variance of the posterior distribution of the data point using an encoder network; a sample is drawn from this distribution and the ELBO is computed; to maximize the sum of the ELBO over all data examples, stochastic gradient descent is performed by running mini-batches of points through the network [since the variational autoencoder includes the encoder network, the gradient is deemed for purposes of examination to be the gradient of, inter alia, the encoder] – Prince, section entitled “The Variational Autoencoder”); 
(H) update the ELBO value (automatic differentiation may be performed via backpropagation to optimize [update] the cost [ELBO] function; because there is no way to differentiate through the sampling step, a reparameterization trick is performed – Prince, section entitled “The Reparameterization Trick”); 
(I) update a decoder neural network model with a plurality of observation vectors, wherein the plurality of observation vectors includes … the first batch of classified observation vectors[] and the first batch of noise observation vectors (Prince Fig. 9 shows that, for each training data example x [classified observation vector, including noise, see Fig. 11], an encoder computes a latent vector h and a sample h* from a variational distribution is fed into the decoder to make a prediction; section entitled “The Reparameterization Trick” shows that the parameters θ may be updated in this process through a reparameterization trick [note that since the decoder is based on the sample h* which is based on the updated parameters θ, the decoder is updated]); 
(J) update the encoder neural network model with the plurality of observation vectors (Prince Fig. 9 shows that, for each training data example x [classified observation vector, including noise, see Fig. 11], an encoder computes a latent vector h and a sample h* from a variational distribution is fed into the decoder to make a prediction; section entitled “The Reparameterization Trick” shows that the parameters θ may be updated in this process through a reparameterization trick [since the encoder is based on the parameters θ, updating of θ implies updating the encoder]); [and]
(K) train the decoder neural network model to classify [a] plurality of unclassified observation vectors and the first batch of noise observation vectors by repeating (A) to (J) (Prince Fig. 9 discloses that an encoder encodes each training example x to create a variational distribution; a sample h* from the variational distribution is input to a decoder, and the decoder makes a prediction of x based on h* [note that the goal of training is to classify unclassified observation vectors; note also that the above process is repeated for each training data sample])….”
Bingham discloses “(A) select[ing] a first batch of unclassified observation vectors from a plurality of unclassified observation vectors, wherein the first batch of unclassified observation vectors includes a predefined number of observation vectors (system for determining a robot collision processes a state sequence of the robot performing a task including at least a current state of the robot, a next state of the robot, and an action to transition the robot from the current state to the next state [states/actions = unclassified observation vector] – Bingham, paragraph 60; after processing, the system determines whether to process any additional states and corresponding actions of the robot, and the system may determine not to process any additional states and actions based on the robot completing the task [so that the states/actions are part of a batch whose number is predetermined by whether the robot has completed the task or not] – id. at paragraph 66); 
(B) select[ing] a first batch of classified observation vectors …, wherein a target variable value is defined to represent a class for each respective observation vector of the plurality of classified observation vectors (system generates training data by generating a sequence of actions for a robot to perform a task and captures a state sequence corresponding to the sequence of actions; if the system determines that there is no collision, the sequence of actions and corresponding state sequence are stored as a training instance [vector = state/action sequence; target variable value = collision/no collision] – Bingham, paragraphs 32-34; see also Fig. 3);…
(I) updat[ing] a decoder neural network model with a plurality of observation vectors, wherein the plurality of observation vectors includes the first batch of unclassified observation vectors (after selecting a training instance, the initial state is selected as the current state, and a latent space of the current state is generated; a predicted latent space of the next state is generated, followed by the latent space of the next state; the predictor network portion [decoder] of a cascading variational autoencoder is then updated based on a generated loss – Bingham, Fig. 5 and paragraph 54);…
if there are no additional states, the system determines whether to perform any additional training on the predictor network portion [decoder] of the cascading VAE; the system can determine to perform more training if a criterion such as a threshold number of epochs [predetermined maximum number of iterations] has not yet been satisfied – Bingham, paragraph 57; see also Fig. 5, esp. ref. chars. 516-520); 
determin[ing] the target variable value for each observation vector of the plurality of unclassified observation vectors based on an output of the trained decoder neural network model (from a state sequence of the robot including a current state of the robot, a next state, and an action to transition from the current state to the next state [these three items forming an unclassified observation vector], a latent space of a next state and a predicted latent space of a next state are generated, and the VAE determines whether a collision has occurred [collision/no collision = target variable value]; the predicted latent space is output from a prediction network [decoder] portion of the cascading VAE; process continues until there are no additional states – Bingham, Fig. 6 and paragraphs 60-64 and 66); and 
output[ting] the target variable value for each observation vector of the plurality of unclassified observation vectors, wherein the target variable value selected for each observation vector of the plurality of unclassified observation vectors identifies the class of a respective observation vector (based on the latent space of the next state and the predicted latent space of the next state, the system determines whether there is a collision [target variable value whose class is either collision or no collision] – Bingham, paragraph 64; process continues as long as there are additional states and corresponding actions of the robot performing the task – id. at paragraph 66; see also Fig. 6).”
Bingham and the instant application relate to variational autoencoders and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prince to train and use the variational autoencoder using both classified and unclassified vectors and use the network to predict a target variable value, as disclosed by Bingham, and an ordinary artisan could reasonably expect to have done so successfully.  See Bingham, paragraphs 62-64.
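
By way of illustration only, the note in the mapping of element (C) that KL divergence is a special case of β-divergence can be checked numerically. The sketch below is not taken from Prince, Bingham, or the instant specification. It assumes the β-divergence convention commonly used for discrete distributions, under which the limit β → 1 yields KL divergence; other conventions (e.g., the density power divergence, in which the limit is β → 0) differ in parameterization. The function names and the example distributions p and q are hypothetical.

```python
import math

def beta_divergence(p, q, beta):
    """Beta-divergence between discrete distributions p and q.

    Assumed convention (not from the cited references):
    d_beta(p|q) = sum_i (p_i^beta + (beta-1) q_i^beta - beta p_i q_i^(beta-1))
                  / (beta (beta-1)),  valid for beta not in {0, 1}.
    """
    return sum(
        (pi ** beta + (beta - 1.0) * qi ** beta - beta * pi * qi ** (beta - 1.0))
        / (beta * (beta - 1.0))
        for pi, qi in zip(p, q)
    )

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for strictly positive p and q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical example distributions (each sums to 1).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

kl = kl_divergence(p, q)
for beta in (1.5, 1.1, 1.001):
    # As beta approaches 1, the beta-divergence approaches the KL value.
    print(f"beta={beta}: d_beta={beta_divergence(p, q, beta):.6f}, KL={kl:.6f}")
```

Under the same assumed convention, β = 2 gives half the squared Euclidean distance, which is why the β-divergence is described as a family that contains KL divergence as one member.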

Claim 16 is a computing device claim corresponding to non-transitory computer-readable medium claim 1 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 17 is a method claim corresponding to non-transitory computer-readable medium claim 1 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 13, Prince, as modified by Bingham, discloses that “the updated decoder neural network model is further output (automatic differentiation via the backpropagation algorithm may be performed to optimize the cost function [update the neural network]; because there is no way to differentiate through the sampling step, a reparameterization trick is employed [and the network including the reparameterization trick is output] – Prince, section entitled “The Reparameterization Trick” and Fig. 10).”
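
For illustration only, the reparameterization trick referenced above can be sketched as follows. The sketch is an assumption-laden toy, not Prince's implementation: it rewrites a sample h drawn from a Gaussian with mean mu and standard deviation sigma as h = mu + sigma * eps, with eps drawn from a standard normal, so that derivatives with respect to mu and sigma pass through the sampling step. The test function f(h) = h^2 and all names are hypothetical; this function is chosen because its expectation, mu^2 + sigma^2, has known derivatives 2*mu and 2*sigma against which the estimates can be compared.

```python
import random

def reparam_gradients(mu, sigma, n_samples=200_000, seed=0):
    """Estimate d/dmu and d/dsigma of E[f(h)], with f(h) = h**2 and h ~ N(mu, sigma^2).

    Sampling is rewritten as h = mu + sigma * eps with eps ~ N(0, 1), so the
    chain rule applies: df/dmu = f'(h) * 1 and df/dsigma = f'(h) * eps.
    """
    rng = random.Random(seed)
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # noise that does not depend on mu or sigma
        h = mu + sigma * eps        # differentiable function of mu and sigma
        df_dh = 2.0 * h             # derivative of the toy function f(h) = h**2
        grad_mu += df_dh            # dh/dmu = 1
        grad_sigma += df_dh * eps   # dh/dsigma = eps
    return grad_mu / n_samples, grad_sigma / n_samples

g_mu, g_sigma = reparam_gradients(mu=1.5, sigma=0.5)
# The estimates should lie close to the analytic values 2*mu and 2*sigma.
print(g_mu, g_sigma)
```

In a framework that performs automatic differentiation, the same rewriting is what allows backpropagation to reach the parameters that produce mu and sigma.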

Claim 29 is a method claim corresponding to non-transitory computer-readable medium claim 13 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 14, Prince, as modified by Bingham, discloses that “after (K), the computer-readable instructions further cause the computing device to: 
read a new observation vector from a dataset (state sequence of a robot performing a task including a current state of the robot, a next state of the robot, and an action to transition the robot from the current state to a next state [these three items collectively form a new observation vector, which comes from a dataset] is processed by the trained VAE – Bingham, Fig. 6, ref. char. 602); 
input the read new observation vector to the updated decoder neural network model to predict a class for the read new observation vector (predicted latent space of the next state is generated by processing the current state’s latent space and an action to transition to the next state using a prediction network portion [decoder] of a next VAE; based on the predicted latent space and the latent space, it is determined whether there is a collision [collision/no collision = class] – Bingham, Fig. 6, esp. ref. chars. 602-610); and 
output the predicted class (based on the predicted latent space and the latent space, it is determined whether there is a collision [and outputs the result] – Bingham, Fig. 6, esp. ref. char. 610; see also paragraph 64).”   It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prince to use the trained network to classify new observations, as disclosed by Bingham, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the user automatically to predict an unknown class with the trained network.  See Bingham, paragraphs 62-64.

Regarding claim 15, Prince, as modified by Bingham, discloses that “the decoder reconstruction error value is computed using Monte Carlo sampling (in non-linear latent variable models, the autoencoder architecture can approximate a lower bound on the maximum likelihood [ELBO] using a Monte Carlo sampling method [note that, because the ELBO has the decoder reconstruction error as a term, the reconstruction error value must also be determined by Monte Carlo sampling] – Prince, last paragraph before “Latent Variable Models”).”
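
For illustration only, a Monte Carlo estimate of the reconstruction term and the resulting ELBO can be sketched for a one-dimensional toy model. Every modeling choice here is an assumption made for the example and is not drawn from Prince, Bingham, or the instant specification: a standard normal prior, a Gaussian variational distribution with mean mu and standard deviation sigma, and a hypothetical linear decoder w*h + b with unit-variance Gaussian likelihood. The linear decoder is chosen only because the reconstruction term then has a closed form against which the sampled estimate can be compared.

```python
import math
import random

def log_gaussian(x, mean, var):
    """Log density of a univariate Gaussian N(mean, var) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var

def elbo_monte_carlo(x, mu, sigma, w, b, n_samples=100_000, seed=0):
    """Return (elbo, reconstruction_term, kl_term) for the toy model."""
    rng = random.Random(seed)
    recon = 0.0
    for _ in range(n_samples):
        h = mu + sigma * rng.gauss(0.0, 1.0)       # sample from q(h | x)
        recon += log_gaussian(x, w * h + b, 1.0)   # log p(x | h), hypothetical decoder
    recon /= n_samples                             # Monte Carlo average
    # Closed-form KL between N(mu, sigma^2) and the standard normal prior.
    kl = 0.5 * (mu ** 2 + sigma ** 2 - 1.0 - math.log(sigma ** 2))
    return recon - kl, recon, kl

def reconstruction_closed_form(x, mu, sigma, w, b):
    """Exact expected log-likelihood for the linear-Gaussian toy decoder."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * ((x - w * mu - b) ** 2 + (w * sigma) ** 2)

elbo, recon, kl = elbo_monte_carlo(x=1.0, mu=0.3, sigma=0.8, w=2.0, b=0.1)
exact = reconstruction_closed_form(x=1.0, mu=0.3, sigma=0.8, w=2.0, b=0.1)
print(recon, exact, kl, elbo)
```

With a non-linear decoder no such closed form is generally available, and the sampled average is used in its place.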

Claim 30 is a method claim corresponding to non-transitory computer-readable medium claim 15 and is rejected for the same reasons as given in the rejection of that claim.

Allowable Subject Matter
Claims 2-12 and 18-28 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7a-5:30p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.C.V./             Examiner, Art Unit 2125           

/KAMRAN AFSHAR/             Supervisory Patent Examiner, Art Unit 2125