DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Acknowledgement is made of Applicant’s claim amendments filed on 4/12/2022. The claim amendments are entered. Presently, claims 1-4, 6-13, 15-18, and 20 remain pending. Claims 1, 5, 12, 16, 22, and 27 have been amended and claims 28-32 are newly added.
Response to Arguments
Applicant’s arguments with respect to the claims, filed on 4/12/2022, have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7, 12, 22, and 27-29 are rejected under 35 U.S.C. 103 as being unpatentable over Ha et al. ("Generating large images from latent vectors.") in view of Arthur et al. (US-20140244631-A1) and Ganjam et al. (US-20170371958-A1).
Regarding Claim 1,
Ha teaches a method for training a model that implements a machine-learning algorithm, comprising: 
receiving a set of training data (pg. 7; In the current literature, the training dataset is typically composed of small images (such as 32x32px or 64x64, though I have seen 128x128 and 256x256 sets being used).), wherein the set of training data includes a plurality of input vectors (pg. 8; The vectors x, y, and r are all the possible coordinates we want to compute the pixel intensity for in our image, so for an image size of (26x26), we will need to calculate a total of 676 pixel intensities, to form (32+676+676+676) inputs into the network.) and a plurality of desired output vectors (pg. 20; y_real = Discriminator(Sample Image, w_d) The sample image is the desired output vector which is denoted by “xData” in the VAE network in figure labeled “Our final CPPN model combined with GAN + VAE” on page 19.), wherein each input vector in the plurality of input vectors is associated with a latent descriptor vector in a plurality of latent descriptor vectors…(pg. 8; The input vector Z is a vector of 32 real numbers that are drawn from unit gaussian random number generator, and all of them will be independent. Input vectors x, y, and r are associated with latent vector Z.); 
providing a plurality of augmented input vectors as inputs to a model, wherein each augmented input vector in the plurality of augmented input vectors comprises an input vector in the plurality of input vectors (pg. 8; The vectors x, y, and r are all the possible coordinates we want to compute the pixel intensity for in our image, so for an image size of (26x26), we will need to calculate a total of 676 pixel intensities, to form (32+676+676+676) inputs into the network. X, Y, and Z are the input vectors.) and a latent descriptor vector… (pg. 8; The input vector Z is a vector of 32 real numbers that are drawn from unit gaussian random number generator, and all of them will be independent. Z is the latent descriptor vector.); 
applying, by the model, a set of model parameters to the plurality of augmented input vectors to generate a set of output vectors (pg. 9; As we have seen earlier, by keeping the weights of the network constant, and by controlling the input Z vector, we can get a rich set of output images using this method. The goal here is to train our network in such a way that for any random Z vector we put in, the output would look like an image from our MNIST training set, and we would not really be able to tell them apart. The output vector is shown as Xgen (generated image) from the Generator network. See figure “Our final CPPN model combined with GAN + VAE:” on page 19. And pg. 10; We can define these performance measures, or cost functions, to evaluate the performance of the generator network: G_loss = −log(y∣Image Generated by G)); and 
adjusting the set of model parameters (pg. 19; Training this combined model will require some tweaks to our existing algorithm, because we will also need to train to optimise the VAE’s error. Note that we will adjust both Wq and Wg when optimising for both G_loss and VAE_loss. The weights (i.e., model parameters) are adjusted based on the loss.) and at least one latent descriptor vector (pg. 20; Z = Encoder(Sample Image, Z_vae_noise). Please note the Encoder network is updated based on the adjusted weights (i.e., model parameters) and Z (i.e., latent descriptor vector) is set to the output of the Encoder network (see figure “Our final CPPN model combined with GAN + VAE:” on page 19). Pg. 17; VAE’s help us do two things. Firstly, they allow us to encode existing image into a much smaller latent Z vector, kind of like compression. It does this by passing an image through the encoder network, which we will call the Q network, with weights Wq. And from this encoded latent vector Z, the generator network will produce an image that will be as close as possible to the original image passed in, hence it is an autoencoder system.) … based on differences between each output vector in the set of output vectors and a corresponding desired output vector in the plurality of desired output vectors (pg. 10; Likewise, we can evaluate the performance of the discriminator in a similar way: D_loss_real = −log(y∣Real Training Sample); D_loss_fake = −log(1 − y∣Image Generated by G); D_loss = 0.5 D_loss_real + 0.5 D_loss_fake. If the discriminator is doing a great job, both D_loss_real and D_loss_fake will be a number close to zero… The generator network can also be trained via backprop as well by adjusting the weights w in the direction of the gradient that will make G_loss smaller (thereby also making D_loss larger). A difference is computed between the sample image (i.e., desired output vectors) and the image generated (i.e., output vector). Pg. 20; Calculate G_loss; Calculate dG_loss/dW_g and dG_loss/dW_q gradients via backprop #i.e., the direction that would make D crappier; Adjust W_g's W_q's using SGD-type optimising step. Here we see the weight W_g for the generator network and W_q being adjusted based on the difference computed by the loss.), 
wherein each latent descriptor vector comprises a plurality of scalar values that are initialized prior to training the model (pg. 21; z = sampler.generate_z() # vector of 32 random gaussian samples ~ N. In addition, scalar values are shown in the image on page 4.).
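The cost functions quoted from pg. 10 of Ha can be written out as the following minimal sketch. Here y denotes the discriminator's output probability that an image is real. The function and variable names (generator_loss, discriminator_loss, d_real, d_fake) are labels assumed here for readability and are not taken from Ha.

```python
import math


def generator_loss(y_fake: float) -> float:
    """-log(y | image generated by G): small when the discriminator is fooled."""
    return -math.log(y_fake)


def discriminator_loss(y_real: float, y_fake: float) -> float:
    """Average of the loss on a real training sample and on a generated image."""
    d_real = -math.log(y_real)        # -log(y | real training sample)
    d_fake = -math.log(1.0 - y_fake)  # -log(1 - y | image generated by G)
    return 0.5 * d_real + 0.5 * d_fake


# A discriminator that is doing well scores real images near 1 and
# generated images near 0, so its loss is close to zero while the
# generator's loss is large.
good_d = discriminator_loss(y_real=0.99, y_fake=0.01)
bad_g = generator_loss(y_fake=0.01)
```

Under this reading, adjusting the generator weights to make the generator loss smaller raises y_fake, which in turn makes the discriminator loss larger.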
Ha does not explicitly disclose 
receiving …, a plurality of training identifiers, …wherein each input vector in the plurality of input vectors is associated with a training identifier in the plurality of training identifiers;
initializing, prior to training the model, a plurality of latent descriptor vectors stored in a database, wherein each latent descriptor vector stored in the database is mapped to one training identifier in the plurality of training identifiers; 
a latent descriptor vector selected from the database using the training identifier that is associated with the input vector;
…latent descriptor vectors stored in a database…
However, Arthur (US 20140244631 A1) teaches
…latent descriptor vectors stored in a database… (para [0033] …storing, in a latent feature vector database, latent feature vectors associated with each of the plurality of multimedia digital assets…).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the machine-learning model of Ha which utilizes latent vectors with the latent feature vector database of Arthur.
Doing so would allow for the storage and retrieval of latent vectors. Storing latent vectors in a database would be an improvement to the teachings of Ha because it would allow the user to search and query specific data stored in the database. This would help narrow the number of results, resulting in faster data retrieval (para [0033] …a search query from a user or group of users…comparing the normalized target latent feature vector with latent feature vectors retrieved from the latent feature vector database for each of the plurality of multimedia digital assets to generate a latent comparison value between each of the plurality of multimedia digital assets and the discovered target multimedia digital asset…).
Ganjam teaches
receiving …a plurality of training identifiers, …wherein each input vector in the plurality of input vectors is associated with a training identifier in the plurality of training identifiers (para [0065] FIG. 5 includes an example element of corporal data 500, an example probabilistic database (“PD”) 502 for a token-of-interest (“TOI”) 504, “Sofia” to find class identifiers associated with “Sofia”, an example PD 506 for identifying a latent attribute related to the class identifiers “F Name” and “L Name”, and an example PD 508 for identifying a latent attribute value for the TOIs “Sofia” “Barga” and the latent attribute, “Gender”. Class identifier (i.e. training identifier).); 
initializing, prior to training the model (fig. 4, para [0057] Database populated with structured data from parsed unstructured data before training a model. para [0021] In some examples, the techniques discussed herein can parse data and/or predict related data with limited or no training, which greatly reduces the complexity of development and deployment of the techniques.), a plurality of latent descriptor vectors stored in a database, wherein each latent descriptor vector stored in the database is mapped to one training identifier in the plurality of training identifiers (para [0065] FIG. 5 includes an example element of corporal data 500, an example probabilistic database (“PD”) 502 for a token-of-interest (“TOI”) 504, “Sofia” to find class identifiers associated with “Sofia”, an example PD 506 for identifying a latent attribute related to the class identifiers “F Name” and “L Name”, and an example PD 508 for identifying a latent attribute value for the TOIs “Sofia” “Barga” and the latent attribute, “Gender”. Class identifier (i.e. training identifier) identifies (i.e. maps to) a latent attribute and value (i.e. latent vector).); 
a latent descriptor vector selected from the database using the training identifier that is associated with the input vector (para [0070] FIG. 5 also includes a PD 506 generated to identify latent attribute identifiers that may be associated with the class identifiers “F Name” and “L Name”.); 
Ha and Ganjam are both directed towards the same field of endeavor for discovering missing values of latent variables.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of estimating latent variables of Ha with the database for storing latent variables and identifiers of Ganjam.
Doing so would allow parsing and/or predicting to be conducted iteratively at varying levels of granularity to improve accuracy and expose more hidden data (Abs.).
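For reference, the limitations of claim 1 as mapped above can be read as the following minimal sketch. It assumes a linear model, a squared-error loss, and plain gradient descent, none of which is recited in the claim or taken from Ha, Arthur, or Ganjam. A Python dictionary stands in for the database, and all names (latent_db, train_step, total_loss) and data values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: (training identifier, input vector, desired output vector).
# Identifier "a" is associated with two input vectors.
training_data = [
    ("a", np.array([1.0, 0.0, 0.5]), np.array([1.0])),
    ("a", np.array([0.5, 1.0, 0.0]), np.array([0.5])),
    ("b", np.array([0.0, 1.0, 1.0]), np.array([-1.0])),
]
IN_DIM, LATENT_DIM, OUT_DIM = 3, 2, 1

# Stand-in "database": one latent descriptor vector per training identifier,
# initialized with random scalar values before training.
latent_db = {tid: rng.normal(size=LATENT_DIM) for tid in ("a", "b")}
initial_latents = {tid: z.copy() for tid, z in latent_db.items()}

# Model parameters of the assumed linear model.
W = rng.normal(scale=0.1, size=(OUT_DIM, IN_DIM + LATENT_DIM))


def total_loss(W, latent_db, data):
    """Sum of squared differences between outputs and desired outputs."""
    return sum(
        float(np.sum((W @ np.concatenate([x, latent_db[tid]]) - y) ** 2))
        for tid, x, y in data
    )


def train_step(W, latent_db, data, lr=0.02):
    """One pass that adjusts both the parameters and the selected latent vectors."""
    for tid, x, y in data:
        z = latent_db[tid]                      # select by training identifier
        augmented = np.concatenate([x, z])      # augmented input vector
        err = W @ augmented - y                 # output minus desired output
        grad_W = 2.0 * np.outer(err, augmented)
        grad_z = 2.0 * (W[:, x.size:].T @ err)  # gradient w.r.t. the latent part
        W -= lr * grad_W                        # adjust model parameters in place
        latent_db[tid] = z - lr * grad_z        # adjust the latent descriptor vector


loss_before = total_loss(W, latent_db, training_data)
for _ in range(500):
    train_step(W, latent_db, training_data)
loss_after = total_loss(W, latent_db, training_data)
```

In this sketch the latent vector for identifier "a" receives updates from both input vectors that share the identifier, which is one way to read the identifier-to-vector mapping.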
Regarding Claim 7,
Ha, Arthur, and Ganjam teach the method of claim 1. Ha further teaches wherein each latent descriptor vector in the plurality of latent descriptor vectors is initialized with random values prior to training the model (pg. 21; z = sampler.generate_z() # vector of 32 random gaussian samples ~ N).
Regarding Claim 12,
Ha teaches a system, comprising: 
a model that implements a machine-learning algorithm (pg. 3; This function can also be defined as a neural network, with arbitrary architectures.); and 
wherein the model is trained by: 
receiving a set of training data (pg. 7; In the current literature, the training dataset is typically composed of small images (such as 32x32px or 64x64, though I have seen 128x128 and 256x256 sets being used).), wherein the set of training data includes a plurality of input vectors (pg. 8; The vectors x, y, and r are all the possible coordinates we want to compute the pixel intensity for in our image, so for an image size of (26x26), we will need to calculate a total of 676 pixel intensities, to form (32+676+676+676) inputs into the network.) and a plurality of desired output vectors (pg. 20; y_real = Discriminator(Sample Image, w_d) The sample image is the desired output vector which is denoted by “xData” in the VAE network in figure “Our final CPPN model combined with GAN + VAE:” on page 19.), wherein each input vector in the plurality of input vectors is associated with a latent descriptor vector in the plurality of latent descriptor vectors (pg. 8; The input vector Z is a vector of 32 real numbers that are drawn from unit gaussian random number generator, and all of them will be independent. Input vectors x, y, and r are associated with latent vector Z.), 
providing a plurality of augmented input vectors as inputs to the model, wherein each augmented input vector in the plurality of augmented input vectors comprises an input vector in the plurality of input vectors (pg. 8; The vectors x, y, and r are all the possible coordinates we want to compute the pixel intensity for in our image, so for an image size of (26x26), we will need to calculate a total of 676 pixel intensities, to form (32+676+676+676) inputs into the network. X, Y, and Z are the input vectors.) and a latent descriptor vector… (pg. 8; The input vector Z is a vector of 32 real numbers that are drawn from unit gaussian random number generator, and all of them will be independent. Z is the latent descriptor vector.);
applying, by the model, a set of model parameters to the plurality of augmented input vectors to generate a set of output vectors (pg. 9; As we have seen earlier, by keeping the weights of the network constant, and by controlling the input Z vector, we can get a rich set of output images using this method. The goal here is to train our network in such a way that for any random Z vector we put in, the output would look like an image from our MNIST training set, and we would not really be able to tell them apart. The output vector is shown as Xgen (generated image) from the Generator network. See figure “Our final CPPN model combined with GAN + VAE:” on page 19. And pg. 10; We can define these performance measures, or cost functions, to evaluate the performance of the generator network: G_loss = −log(y∣Image Generated by G)), and 
adjusting the set of model parameters (pg. 19; Training this combined model will require some tweaks to our existing algorithm, because we will also need to train to optimise the VAE’s error. Note that we will adjust both Wq and Wg when optimising for both G_loss and VAE_loss. The weights (i.e., model parameters) are adjusted based on the loss.) and at least one latent descriptor vector (pg. 20; Z = Encoder(Sample Image, Z_vae_noise). Please note the Encoder network is updated based on the adjusted weights (i.e., model parameters) and Z (i.e., latent descriptor vector) is set to the output of the Encoder network (see figure “Our final CPPN model combined with GAN + VAE:” on page 19). Pg. 17; VAE’s help us do two things. Firstly, they allow us to encode existing image into a much smaller latent Z vector, kind of like compression. It does this by passing an image through the encoder network, which we will call the Q network, with weights Wq. And from this encoded latent vector Z, the generator network will produce an image that will be as close as possible to the original image passed in, hence it is an autoencoder system.) … based on differences between each output vector in the set of output vectors and a corresponding desired output vector in the plurality of desired output vectors (pg. 10; Likewise, we can evaluate the performance of the discriminator in a similar way: D_loss_real = −log(y∣Real Training Sample); D_loss_fake = −log(1 − y∣Image Generated by G); D_loss = 0.5 D_loss_real + 0.5 D_loss_fake. If the discriminator is doing a great job, both D_loss_real and D_loss_fake will be a number close to zero… The generator network can also be trained via backprop as well by adjusting the weights w in the direction of the gradient that will make G_loss smaller (thereby also making D_loss larger). A difference is computed between the sample image (i.e., desired output vectors) and the image generated (i.e., output vector). Pg. 20; Calculate G_loss; Calculate dG_loss/dW_g and dG_loss/dW_q gradients via backprop #i.e., the direction that would make D crappier; Adjust W_g's W_q's using SGD-type optimising step. Here we see the weight W_g for the generator network and W_q being adjusted based on the difference computed by the loss.), 
Ha does not explicitly disclose
a database storing a plurality of latent descriptor vectors, 
receiving …, a plurality of training identifiers, …wherein each input vector in the plurality of input vectors is associated with a training identifier in the plurality of training identifiers,
initializing, prior to training the model, the plurality of latent descriptor vectors, wherein each latent descriptor vector stored in the database is mapped to one training identifier in the plurality of training identifiers, 
a latent descriptor vector selected from the database using the training identifier that is associated with the input vector,
However, Arthur (US 20140244631 A1) teaches
a database storing a plurality of latent descriptor vectors (para [0033] …storing, in a latent feature vector database, latent feature vectors associated with each of the plurality of multimedia digital assets…),
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the machine-learning model of Ha which utilizes latent vectors with the latent feature vector database of Arthur.
Doing so would allow for the storage and retrieval of latent vectors. Storing latent vectors in a database would be an improvement to the teachings of Ha because it would allow the user to search and query specific data stored in the database. This would help narrow the number of results, resulting in faster data retrieval (para [0033] …a search query from a user or group of users…comparing the normalized target latent feature vector with latent feature vectors retrieved from the latent feature vector database for each of the plurality of multimedia digital assets to generate a latent comparison value between each of the plurality of multimedia digital assets and the discovered target multimedia digital asset…).
Ganjam teaches
receiving …a plurality of training identifiers, …wherein each input vector in the plurality of input vectors is associated with a training identifier in the plurality of training identifiers (para [0065] FIG. 5 includes an example element of corporal data 500, an example probabilistic database (“PD”) 502 for a token-of-interest (“TOI”) 504, “Sofia” to find class identifiers associated with “Sofia”, an example PD 506 for identifying a latent attribute related to the class identifiers “F Name” and “L Name”, and an example PD 508 for identifying a latent attribute value for the TOIs “Sofia” “Barga” and the latent attribute, “Gender”. Class identifier (i.e. training identifier).),
initializing, prior to training the model (fig. 4, para [0057] Database populated with structured data from parsed unstructured data before training a model. para [0021] In some examples, the techniques discussed herein can parse data and/or predict related data with limited or no training, which greatly reduces the complexity of development and deployment of the techniques.), the plurality of latent descriptor vectors, wherein each latent descriptor vector stored in the database is mapped to one training identifier in the plurality of training identifiers (para [0065] FIG. 5 includes an example element of corporal data 500, an example probabilistic database (“PD”) 502 for a token-of-interest (“TOI”) 504, “Sofia” to find class identifiers associated with “Sofia”, an example PD 506 for identifying a latent attribute related to the class identifiers “F Name” and “L Name”, and an example PD 508 for identifying a latent attribute value for the TOIs “Sofia” “Barga” and the latent attribute, “Gender”. Class identifier (i.e. training identifier) identifies (i.e. maps to) a latent attribute and value (i.e. latent vector).), 
a latent descriptor vector selected from the database using the training identifier that is associated with the input vector (para [0070] FIG. 5 also includes a PD 506 generated to identify latent attribute identifiers that may be associated with the class identifiers “F Name” and “L Name”.),
Ha and Ganjam are both directed towards the same field of endeavor for discovering missing values of latent variables.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of estimating latent variables of Ha with the database for storing latent variables and identifiers of Ganjam.
Doing so would allow parsing and/or predicting to be conducted iteratively at varying levels of granularity to improve accuracy and expose more hidden data (Abs.).
Regarding Claim 22,
Ha, Arthur, and Ganjam teach the method of claim 1. Ganjam further teaches wherein at least one training identifier of the plurality of training identifiers is assigned to more than one of the input vectors in the plurality of input vectors (para [0065] FIG. 5 includes an example element of corporal data 500, an example probabilistic database (“PD”) 502 for a token-of-interest (“TOI”) 504, “Sofia” to find class identifiers associated with “Sofia”, an example PD 506 for identifying a latent attribute related to the class identifiers “F Name” and “L Name”, and an example PD 508 for identifying a latent attribute value for the TOIs “Sofia” “Barga” and the latent attribute, “Gender”. The identifier ‘Sofia’ is associated with a plurality of input vectors (i.e. rows).).
Regarding Claim 27,
Claim 27 is the system corresponding to the method of claim 1. Claim 27 is substantially similar to claim 22 and is rejected on the same grounds.
Regarding Claim 28,
Ha, Arthur, and Ganjam teach the method of claim 1. Ha further teaches the method further comprising, after training the model: 
receiving a new input vector that is not associated with a training identifier (pg. 8; The input vector Z is a vector of 32 real numbers that are drawn from unit gaussian random number generator, and all of them will be independent.); 
augmenting the new input vector with a first latent descriptor vector to produce a new augmented input vector (pg. 8; The input vector Z is a vector of 32 real numbers that are drawn from unit gaussian random number generator, and all of them will be independent. The vectors x, y, and r are all the possible coordinates we want to compute the pixel intensity for in our image, so for an image size of (26x26), we will need to calculate a total of 676 pixel intensities, to form (32+676+676+676) inputs into the network. Z is used to augment x, y, and r. See the drawing of the Generator Network.); and 
applying the set of model parameters to the new augmented input vector by the model to generate an output vector (pg. 9; As we have seen earlier, by keeping the weights of the network constant, and by controlling the input Z vector, we can get a rich set of output images using this method. The goal here is to train our network in such a way that for any random Z vector we put in, the output would look like an image from our MNIST training set, and we would not really be able to tell them apart. The output vector is shown as Xgen (generated image) from the Generator network. See figure “Our final CPPN model combined with GAN + VAE:” on page 19. And pg. 10; We can define these performance measures, or cost functions, to evaluate the performance of the generator network: G_loss = −log(y∣Image Generated by G)).
Regarding Claim 29,
Ha, Arthur, and Ganjam teach the method of claim 28. Arthur further teaches wherein the first latent descriptor vector is calculated based on at least a portion of the plurality of latent descriptor vectors stored in the database (para [0033]  in a latent feature vector database, latent feature vectors associated with each of the plurality of multimedia digital assets, comparing the normalized semantic feature vector with semantic feature vectors retrieved from the semantic feature vector database for each of the plurality of multimedia digital assets to generate a semantic comparison value between each of the plurality of multimedia digital assets and the discovered target multimedia digital asset, comparing the normalized target latent feature vector with latent feature vectors retrieved from the latent feature vector database for each of the plurality of multimedia digital assets to generate a latent comparison value between each of the plurality of multimedia digital assets and the discovered target multimedia digital asset).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the machine-learning model of Ha which utilizes latent vectors with the latent feature vector database of Arthur.
Doing so would allow for the storage and retrieval of latent vectors. Storing latent vectors in a database would be an improvement to the teachings of Ha because it would allow the user to search and query specific data stored in the database. This would help narrow the number of results, resulting in faster data retrieval (para [0033] …a search query from a user or group of users…comparing the normalized target latent feature vector with latent feature vectors retrieved from the latent feature vector database for each of the plurality of multimedia digital assets to generate a latent comparison value between each of the plurality of multimedia digital assets and the discovered target multimedia digital asset…).
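The post-training steps recited in claims 28 and 29 can be illustrated with the following minimal sketch. It assumes the same kind of linear model as a stand-in and assumes that the first latent descriptor vector is the mean of the stored vectors. The claim language only requires that it be calculated based on at least a portion of the stored vectors, so the mean is one hypothetical choice. All names and values are illustrative.

```python
import numpy as np

# Hypothetical stored latent descriptor vectors after training,
# keyed by training identifier.
latent_db = {
    "a": np.array([0.2, -0.4]),
    "b": np.array([0.6, 0.0]),
}

# Hypothetical trained parameters of an assumed linear model:
# 3 input features followed by 2 latent features.
W = np.array([[0.5, -0.25, 1.0, 2.0, -1.0]])


def first_latent(db):
    """Assumed rule: average the stored latent descriptor vectors."""
    return np.mean(list(db.values()), axis=0)


def predict(W, x, z):
    """Augment the input with the latent vector, then apply the parameters."""
    return W @ np.concatenate([x, z])


# A new input vector that has no training identifier, so no database lookup
# by identifier is possible.
new_x = np.array([1.0, 2.0, 0.0])
z_first = first_latent(latent_db)
output = predict(W, new_x, z_first)
```

With these values z_first is [0.4, -0.2] and the output vector is [1.0].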

Claims 3 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Ha et al., Arthur et al., and Ganjam et al., as applied above, and further in view of Chung et al. (“You said that?”).
Regarding Claim 3,
Ha, Arthur, and Ganjam teach the method of claim 1.
Ha, Arthur, and Ganjam do not explicitly disclose wherein the set of model parameters comprises filter coefficients for one or more convolution kernels of a convolution layer of a deep convolutional neural network.
However, Chung further teaches wherein the set of model parameters comprises filter coefficients for one or more convolution kernels of a convolution layer of a deep convolutional neural network ([chapter 3.2] The layer configurations is based on AlexNet [16] and VGGM [2], but filter sizes adapted for the unusual input dimensions And [see Fig.7] /2 refers to the stride of each kernel in a specific layer which is normally of equal stride in both spatial dimensions).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the variational autoencoder of Ha, Arthur, and Ganjam with the autoencoder of Chung.
Doing so would allow for both audio and image data as inputs. This would be an improvement because it would allow for the network to be used for a wider range of applications. For instance, Chung discloses the application of face tracking and video descriptions used by the networks (pg. 2; Moving forward, current research shows adversarial training proposed by [8] works well for generating natural-looking images; conditional generative models [20] are able to generate images based on auxilary information such as a class label. Our Speech2Vid model is closest in spirit to the image-to-image model by Isola et al. [10] in that we generate an output that closely resembles the input, but in our case we have both audio and image data as inputs.)
Regarding Claim 14,
Claim 14 is the system corresponding to the method of claim 1. Claim 14 is substantially similar to claim 3 and is rejected on the same grounds.

Claims 5, 16, 21, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Ha et al., Arthur et al., and Ganjam et al., as applied above, and further in view of Okanohara et al. (US-20180365089-A1).
Regarding Claim 5,
Ha, Arthur, and Ganjam teach the method of claim 1.
Ha, Arthur, and Ganjam do not explicitly disclose
further comprising, after the adjusting: 
receiving a new input vector that is included in the plurality of input vectors; 
augmenting the new input vector with a first latent descriptor vector of the at least one latent descriptor vector stored in the database to produce a new augmented input vector; and 
applying the set of model parameters to the new augmented input vector by the model to generate an output vector.
However, Okanohara further teaches
receiving a new input vector that is included in the plurality of input vectors (para [0123] Therefore, the restored data x~ obtained by one inference and generation process may be further input to the encoder.); 
augmenting the new input vector with a first latent descriptor vector of the at least one latent descriptor vector stored in the database to produce a new augmented input vector (para [0123] For example, the restored data close to the value of normal data can be obtained by repeating process including generating restored data x_0~ from input data x, generating restored data x_1~ from restored data x_0~); and 
applying the set of model parameters to the new augmented input vector by the model to generate an output vector (para [0123] generating restored data x_2~ from restored data x_1~, inputting the obtained output to the encoder again.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the variational autoencoder of Ha, Arthur, and Ganjam with the autoencoder of Okanohara.
Doing so would allow for regularizing error for the discriminator network and the encoder network. Regularization error improves not only the accuracy of the encoder but the accuracy of the discriminator as well (para [0106] In this way, a regularization error is obtained in the process of distinguishing between normal data and false data in the discriminator, and not only the discriminator but also the parameters of the encoder are updated and learned using the regularization error, so that this improves the accuracy of inference in the encoder and improves the discrimination accuracy of the discriminator.)
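The repeated inference and generation process quoted from para [0123] of Okanohara can be pictured with the following minimal sketch. The toy encode and decode functions are scalar stand-ins assumed purely for illustration and are not Okanohara's encoder or decoder networks. The name restore_repeatedly is hypothetical.

```python
from typing import Callable, List


def restore_repeatedly(
    x: float,
    encode: Callable[[float], float],
    decode: Callable[[float], float],
    n_rounds: int,
) -> List[float]:
    """Feed each restored value back into the encoder, collecting every result."""
    restored = []
    current = x
    for _ in range(n_rounds):
        z = encode(current)    # infer the latent variable from the current data
        current = decode(z)    # generate restored data from the latent variable
        restored.append(current)
    return restored


# Toy stand-ins: the composed map x -> 0.5 * x + 1.0 pulls values toward 2.0,
# which plays the role of "normal data" in this illustration.
def toy_encode(v: float) -> float:
    return 0.5 * v


def toy_decode(z: float) -> float:
    return z + 1.0


restorations = restore_repeatedly(10.0, toy_encode, toy_decode, n_rounds=3)
# restorations == [6.0, 4.0, 3.0]
```

Each round corresponds to one of the successive restorations described in the quoted paragraph, with the output of one round serving as the input to the next.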
Regarding Claim 16,
Claim 16 is the system corresponding to the method of claim 1. Claim 16 is substantially similar to claim 5 and is rejected on the same grounds.
Regarding Claim 21,
Ha, Arthur, and Ganjam teach the method of claim 1.
Ha, Arthur, and Ganjam do not explicitly disclose wherein, after the at least one latent descriptor vector is adjusted, the plurality of latent descriptor vectors encode residual data that is included in the desired output vectors and that is missing from the input vectors.
However, Okanohara further teaches wherein, after the at least one latent descriptor vector is adjusted, the plurality of latent descriptor vectors encode residual data that is included in the desired output vectors and that is missing from the input vectors (para [0123] In a second improvement in learning and in abnormality detection, in the first to fourth embodiments, in any of the learning process and abnormality detection process, the input data x is input to the encoder, the expression z (latent variable z) of the input data is inferred, and the expression z of the input data is input to a decoder to generate restored data and abnormality detection is performed by comparing the input data x with the restored data x~.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the variational autoencoder of Ha, Arthur, and Ganjam with the autoencoder of Okanohara.
Doing so would allow for regularizing error for the discriminator network and the encoder network. Regularization error improves not only the accuracy of the encoder but the accuracy of the discriminator as well (para [0106] In this way, a regularization error is obtained in the process of distinguishing between normal data and false data in the discriminator, and not only the discriminator but also the parameters of the encoder are updated and learned using the regularization error, so that this improves the accuracy of inference in the encoder and improves the discrimination accuracy of the discriminator.)
Regarding Claim 26,
Claim 26 is the system corresponding to the method of claim 1. Claim 26 is substantially similar to claim 21 and is rejected on the same grounds.

Claims 9, 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Ha et al., Arthur et al., and Ganjam et al., as applied above, and further in view of Cao et al. ("Expressive speech-driven facial animation.").
Regarding Claim 9,
Ha, Arthur, and Ganjam teach the method of claim 1.
	Ha, Arthur, and Ganjam do not explicitly disclose
wherein each input vector in the plurality of input vectors comprise speech features, each latent descriptor vector in the plurality of latent descriptor vectors is an E-dimensional emotional state vector, and each augmented input vector is an abstract feature vector describing a facial pose.
	However, Cao (Expressive Speech-Driven Facial Animation) teaches 
wherein each input vector (pg. 1288, section 4.1; To organize these sequences of phoneme/anime pairs into a useful structure, we need to associate each pair with an appropriate feature vector that we can later use during matching and searching.) in the plurality of input vectors comprise speech features (pg. 1285, section 2.1; This model can be applied to statistically interpolate novel video frames corresponding to input speech segments.), each latent descriptor vector in the plurality of latent descriptor vectors is an E-dimensional emotional state vector (pg. 1288, section 4.1; This vector consists of the trajectories of the first nine parameters returned by RASTA-PLP. In addition, each example utterance has an emotion label E that indicates one of five emotions: Happy, Angry, Neutral, Sad, Frustrated. All phonemes in a sequence share the same label E.), and each augmented input vector is an abstract feature vector describing a facial pose (pg. 1288, section 4.1; Each recorded utterance is converted to a sequence of nodes with one node per phoneme which we call animes. An anime captures a phoneme instance and contains a phoneme label, the associated motion segment, and other audio information. Like a viseme, an anime is a visual representation of a phoneme. Unlike a viseme that captures a single frame of a facial pose, an anime holds a number of motion frames.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of training a machine learning model of Ha, Arthur, and Ganjam with the method of training of Cao.
	Doing so would allow scaling feature values to fit within a predetermined range. Scaling feature values results in improved accuracy which is important for the reliability of the classification result (pg. 1298; Experiments. The training set is scaled so that feature values range from −1 to 1. Scaling the data has been proved to increase the accuracy of the classification result.).
Regarding Claim 10,
Ha, Arthur, Ganjam, and Cao teach the method of claim 9. Cao further teaches wherein the model generates three-dimensional positions of a plurality of vertices in a mesh by processing each augmented input vector (pg. 1287, section 3.1; Geometric Face Model. Pighin et al. [1998] developed an imaged-based rendering technique that constructs high-quality 3D face models from photographs. We applied this technique to photographs of the performer and constructed a separate face model for each of the emotions we consider. In addition, Pighin et al’s method [1998] produces a high-quality texture map of the face which we apply to our 3D geometric model to produce a realistic result).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of training a machine learning model of Ha, Arthur, and Ganjam with the method of training of Cao.
	Doing so would allow scaling feature values to fit within a predetermined range. Scaling feature values results in improved accuracy which is important for the reliability of the classification result (pg. 1298; Experiments. The training set is scaled so that feature values range from −1 to 1. Scaling the data has been proved to increase the accuracy of the classification result.).
Regarding Claim 19,
Claim 19 is the system corresponding to the method of claim 1. Claim 19 is substantially similar to claim 9 and is rejected on the same grounds.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Ha et al., Arthur et al., and Ganjam et al., as applied above, and further in view of Burkert et al. (“Deep Convolutional Neural Network for Expression Recognition”).
Regarding Claim 20,
Ha, Arthur, and Ganjam teach the system of claim 12.  
	Ha, Arthur, and Ganjam do not explicitly disclose
wherein the model is implemented as a set of instructions executed on a parallel processing unit.
However, Burkert teaches
wherein the model is implemented as a set of instructions executed on a parallel processing unit ([chapter 5] The key structure in our architecture is the Parallel Feature Extraction Block ... In the parallel path a Max Pooling layer is used to reduce information before applying a CNN of size 1 x 1).
Ha, Arthur, Ganjam, and Burkert are analogous art because they all describe convolutional neural networks using image data.
Therefore, it would have been obvious to one of ordinary skill in the art at the filing date of the instant application, having the teachings of Ha, Arthur, and Ganjam before him or her, to create the Convolutional Neural Network (CNN) model taught by Ha, Arthur, and Ganjam, and modify the model using the parallel CNN of Burkert.
Doing so would allow for improving the accuracy of the convolutional neural network. Improved accuracy leads to better performance and increases the efficacy and reliability of the CNN to be used for real world applications (pg. 1; For the MMI dataset, currently the best accuracy for emotion recognition is 93.33%. The proposed architecture achieves 99.6% for CKP and 98.63% for MMI, therefore performing better than the state of the art using CNNs. Automatic facial expression recognition has a broad spectrum of applications such as human-computer interaction and safety systems. This is due to the fact that non-verbal cues are important forms of communication and play a pivotal role in interpersonal communication. The performance of the proposed architecture endorses the efficacy and reliable usage of the proposed work for real world applications.)

Claim 30 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Ha et al., Arthur et al., and Ganjam et al., as applied above, and further in view of Heisterkamp ("Building a latent semantic index of an image database from patterns of relevance feedback.").
Regarding Claim 30,
Ha, Arthur, and Ganjam teach the method of claim 28. 
	Ha, Arthur, and Ganjam do not explicitly disclose
randomly selecting an index within a range of indices that represent the latent descriptor vectors stored in the database; and 
selecting the first latent descriptor vector from the database using the index.
However, Heisterkamp teaches
randomly selecting an index within a range of indices that represent the latent descriptor vectors stored in the database (pg. 134; The intra-query information from relevance feedback is a document whose words are the images of the database and whose meaning expresses the semantic intent of the user. The inter-query information takes the form of a collection of documents which can be subjected to latent semantic analysis.); and 
selecting the first latent descriptor vector from the database using the index (pg. 1366; Segmentation - the segmentation data set, taken from the UCI repository [SI, consists of images that were drawn randomly from a database of 7 outdoor images.).
Ha, Arthur, Ganjam, and Heisterkamp are analogous art because they are all directed towards learning latent variables.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the latent variables of Ha, Arthur, and Ganjam with the method of indexing latent variables of Heisterkamp.
Doing so would allow for incorporating relevance feedback generated by the data retrieval of the user. Using intra-query information increases performance of the data retrieval system (Abs.)

Claim 31 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Ha et al., Arthur et al., and Ganjam et al., as applied above, and further in view of Wierstra et al. (US-20170230675-A1).
Regarding Claim 31,
Ha, Arthur, and Ganjam teach the method of claim 28. 
	Ha, Arthur, and Ganjam do not explicitly disclose
wherein the first latent descriptor vector is randomly selected from the database.
However, Wierstra (US 20170230675 A1) teaches
wherein the first latent descriptor vector is randomly selected from the database (para [0059] The system generates the reconstruction of the compressed image by conditioning the generative neural network on the reconstructed compression latent variables and the randomly selected remaining latent variables (step 308).).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the encoder neural network of Ha, Arthur, and Ganjam with the encoder neural network of Wierstra.
Doing so would allow for improved compression. Improved compression results in better quality images which is important for the accuracy and reliability of the system. In addition, compression allows for smaller files which saves space in memory (para [0009] By compressing images using latent variables defined by outputs of encoder neural networks, an improved lossy image compression scheme, i.e., a scheme that improves compression quality, can be achieved. That is, compressed representations of images generated as described in this specification can be smaller in size than compressed representations generated by other lossy compression techniques yet can be reconstructed to yield reconstructed images that have a quality that is equal to or better than those generated by the other lossy compression techniques.)

Claim 32 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Ha et al., Arthur et al., and Ganjam et al., as applied above, and further in view of Steinemann et al. (US 20170139994 A1).
Regarding Claim 32,
Ha, Arthur, and Ganjam teach the method of claim 28. Arthur further teaches the method further comprising:
selecting, based on a comparison between the new input vector and the plurality of input vectors (para [0033] comparing the normalized target latent feature vector with latent feature vectors retrieved from the latent feature vector database for each of the plurality of multimedia digital assets to generate a latent comparison value between each of the plurality of multimedia digital assets and the discovered target multimedia digital asset), 
Ha, Arthur, and Ganjam do not explicitly disclose
an index within a range of indices that represent the latent descriptor vectors stored in the database; and 
selecting the first latent descriptor vector from the database using the index.
However, Steinemann (US 20170139994 A1) teaches
an index within a range of indices that represent the latent descriptor vectors stored in the database (para [0026] As noted above, the ValueID lookup table can include group ranges of ValueIDs such that the ValueID lookup table lists, for each group range, a first occurrence in the index vector of any ValueID in the group range.); and 
selecting the first latent descriptor vector from the database using the index (para [0026] At 220, a row of an index vector for the database column at which to begin a scan for the ValueID is identified. The identifying includes reading a ValueID lookup table that maps each unique ValueID to a starting position in the index vector for the database column such that the ValueID does not occur in the index vector prior to the starting position.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the latent vector database of Ha, Arthur, and Ganjam with the vector index of Steinemann.
Doing so would allow for quickly scanning through a database with limited memory and computing resources (para [0014]).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Steck (US 20170024391 A1) – discloses a machine learning model using latent matrices.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217. The examiner can normally be reached Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen, can be reached at (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/H.N./Examiner, Art Unit 2121


/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121