Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to amendments and remarks filed on 08/15/2022. In the current amendments, claims 1, 10-11 and 19 are amended. Claims 1-20 are pending and have been examined.
In response to amendments to the Drawing and Claim filed on 8/15/2022, the objections to Drawing and Claim put forth in the previous Office Action have been withdrawn.

Claim Interpretation
“computer readable storage medium” in claims 10-18 is interpreted as “non-transitory computer readable storage medium” in view of [0083], which recites “A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.”

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5 and 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Burgess et al. (“Understanding disentangling in β-VAE”) in view of Ma et al. (“Probabilistic Representation and Inverse Design of Metamaterials Based on a Deep Generative Model with Semi-Supervised Learning Strategy”).
Regarding Claim 1,
Burgess et al. teaches a computer-implemented method of improving an operation of a variational autoencoder, comprising (Burgess et al., Section 6 Pg. 8, “We have developed new insights into why β-VAE learns an axis-aligned disentangled representation of the generative factors of visual data compared to the standard VAE objective. In particular, we identified pressures which encourage β-VAE to find a set of representational axes which best preserve the locality of the data points, and which are aligned with factors of variation that make distinct contributions to improving the data log likelihood. We have demonstrated that these insight produce an actionable modification to the β-VAE training regime” teaches a method for improving the training of Variational Autoencoders).
Burgess et al. does not appear to explicitly teach training a generator network of a variational autoencoder to approximate a simulator and generate a first result, wherein the simulator, given input data, outputs output data that simulates output of an entity the simulator is simulating, wherein a training data set for the generator network includes the simulator's input data and output data; based on the simulator's output data and the first result of the generator network, training an inference network of the variational autoencoder to generate a second result, the second result of the trained inference network inverting the first result of the generator and approximating the simulator's input data, the trained inference network functioning as an inverted simulator
However, Ma et al. teaches training a generator network of a variational autoencoder to approximate a simulator and generate a first result (Ma et al., Figure 1 and Pg. 3 Para. 2, “the entire deep generative model is trained in an end-to-end manner with both labeled and unlabeled data employing a semi-supervised learning strategy… This means the proposed model can efficiently learn from similar metamaterial patterns without corresponding optical response obtained by numerical simulations” teaches training a deep generative model (corresponds to a generator network of a variational autoencoder) to approximate an optical response of the simulator and generates a first result).
wherein the simulator, given input data, outputs output data that simulates output of an entity the simulator is simulating (Ma et al., Figure 1 and Pg. 3 Para. 3, “The ground-truth geometry and reflection spectra obtained by numerical simulations… The forward prediction performance is first evaluated by feeding the ground-truth geometry to the prediction model, and the output spectra… The excellent agreement between the predicted spectra by our model and the numerically simulated spectra clearly confirms that our model can function as an effective simulator for fast metamaterial characterization” teaches the simulator obtaining the ground-truth geometry and reflection spectra (corresponds to input data) and outputting effective simulator for fast metamaterial characterization (corresponds to outputs output data that simulates output of an entity the simulator is simulating)).
wherein a training data set for the generator network includes the simulator's input data and output data (Ma et al., Figure 1 and Pg. 3 Para. 2, “the entire deep generative model is trained in an end-to-end manner with both labeled and unlabeled data employing a semi-supervised learning strategy. We show that, with the aid of unlabeled data, the model performance is obviously improved (Table S2, Supporting Information). This means the proposed model can efficiently learn from similar metamaterial patterns without corresponding optical response obtained by numerical simulations, which alleviate the burden in data acquisition compared with other supervised learning counterpart” teaches the generative model being trained with the geometric pattern of metamaterial structure (corresponds to the simulator's input data and output data) from the numerical simulator).
based on the simulator's output data and the first result of the generator network, training an inference network of the variational autoencoder to generate a second result, the second result of the trained inference network inverting the first result of the generator and approximating the simulator's input data, the trained inference network functioning as an inverted simulator (Ma et al., Figure 1 and Pg. 3 Para. 1, “generation model for the inverse design process of metamaterial given required spectra” teaches the generation model for the inverse design (corresponds to inference network of the variational autoencoder to generate a second result) that samples for inverse generation (corresponds to the second result of the trained inference network inverting the first result of the generator) to approximate the metamaterial design and optical response (corresponds to approximating the simulator's input data)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to approximate a simulator with a variational autoencoder, as taught by Ma et al., to improve the operation of a variational autoencoder of Burgess et al. The motivation would have been to provide an effective simulator for fast metamaterial characterization (Ma et al., Pg. 4 Para. 4, “The excellent agreement between the predicted spectra by our model and the numerically simulated spectra clearly confirms that our model can function as an effective simulator for fast metamaterial characterization”).
Regarding Claim 2,
The Burgess et al. in view of Ma et al. combination of claim 1 teaches the method of claim 1, 
The combination, as described in the rejection of claim 1, further teaches wherein the simulator's input data, based on which the generator network was trained, includes a representation of latent space of the variational autoencoder (Ma et al., Figure 1 and Pg. 2 Para. 6, “our deep generative model can be decomposed into three submodels, namely, recognition model for the encoding process of metamaterial structures into latent space” teaches latent space of the deep generative model (corresponds to the variational autoencoder)).
Regarding Claim 3,
The Burgess et al. in view of Ma et al. combination of claim 2 teaches the method of claim 2,
The combination, as described in the rejection of claim 2, further teaches wherein the latent space is disentangled and interpretable (Burgess et al., Section 1 Pg. 2, “β-VAE adds an extra hyperparameter β to the VAE objective, which constricts the effective encoding capacity of the latent bottleneck and encourages the latent representation to be more factorised. The disentangled representations learnt by β-VAE have been shown to be important for learning a hierarchy of abstract visual concepts conducive of imagination [17] and for improving transfer performance of reinforcement learning policies, including simulation to reality transfer in robotics” teaches the latent representation being disentangled and interpretable).
Regarding Claim 4,
The Burgess et al. in view of Ma et al. combination of claim 1 teaches the method of claim 1, 
The combination, as described in the rejection of claim 1, further teaches wherein the training the generator network includes supervised training (Ma et al., Pg. 2 Para. 3, “the proposed deep generative model offers interpretability and can utilize unlabeled data in a semi-supervised learning strategy to improve the model performance” teaches the deep generative model includes semi-supervised learning (corresponds to supervised training)).
Regarding Claim 5,
The Burgess et al. in view of Ma et al. combination of claim 1 teaches the method of claim 1,
The combination, as described in the rejection of claim 1, further teaches wherein the training the inference network includes unsupervised training (Burgess et al., Section 1 Pg. 1, “β-VAE is a state of the art model for unsupervised visual disentangled representation learning. It is a modification of the Variational Autoencoder (VAE)” teaches unsupervised training on a β-VAE (corresponds to a variational autoencoder that consists of an inference network)).
Regarding Claim 8,
The Burgess et al. in view of Ma et al. combination of claim 1 teaches the method of claim 1,
The combination, as described in the rejection of claim 1, further teaches wherein the generator network is an artificial neural network (Burgess et al., Section A.1 Pg. 11, “The neural network models used for experiments in this paper all utilised the same basic architecture. The encoder for the VAEs consisted of 4 convolutional layers, each with 32 channels, 4x4 kernels, and a stride of 2. This was followed by 2 fully connected layers, each of 256 units. The latent distribution consisted of one fully connected layer of 20 units parametrising the mean and log standard deviation of 10 Gaussian random variables (or 32 for the CelebA experiment). The decoder architecture was simply the transpose of the encoder, but with the output parametrising Bernoulli distributions over the pixels. ReLU activations were used throughout” teaches the neural network models (corresponds to the artificial neural network) architecture used for an encoder (corresponds to the generator network)).
Regarding Claim 9,
The Burgess et al. in view of Ma et al. combination of claim 1 teaches the method of claim 1,
The combination, as described in the rejection of claim 1, further teaches wherein the inference network is an artificial neural network (Burgess et al., Section A.1 Pg. 11, “The neural network models used for experiments in this paper all utilised the same basic architecture. The encoder for the VAEs consisted of 4 convolutional layers, each with 32 channels, 4x4 kernels, and a stride of 2. This was followed by 2 fully connected layers, each of 256 units. The latent distribution consisted of one fully connected layer of 20 units parametrising the mean and log standard deviation of 10 Gaussian random variables (or 32 for the CelebA experiment). The decoder architecture was simply the transpose of the encoder, but with the output parametrising Bernoulli distributions over the pixels. ReLU activations were used throughout” teaches the neural network models (corresponds to the artificial neural network) architecture used for a decoder (corresponds to the inference network)).
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Burgess et al. in view of Ma et al. in further view of Krajewski et al. (“Data-Driven Maneuver Modeling using Generative Adversarial Networks and Variational Autoencoders for Safety Validation of Highly Automated Vehicles”).
Regarding Claim 6,
The Burgess et al. in view of Ma et al. combination of claim 1 teaches the method of claim 1,
Burgess et al. in view of Ma et al. does not appear to explicitly teach wherein the training the generator network includes minimizing a measure of discrepancy D on observations from the simulator and the generator network on the same input data
However, Krajewski et al. teaches wherein the training the generator network includes minimizing a measure of discrepancy D on observations from the simulator and the generator network on the same input data (Krajewski et al., Section III.C Pg. 2387, “To measure the trajectory quality for our TraVAE, we do not need to repurpose any part of the network, as the network already minimizes the mean squared error loss between the original trajectories and the reconstructed trajectories during training. Although, images reconstructed by autoencoders are typically blurry, this does not pose a problem for trajectories, as smooth trajectories are desired” teaches training of the TraVAE (corresponds to the variational autoencoder that includes a generator network) by minimizing the mean squared error loss (corresponds to the measure of discrepancy D) of the original trajectory (corresponds to observation from the simulator that is the input data to the generator)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to minimize a measure of discrepancy D on observations from the simulator and the generator network on the same input data, as taught by Krajewski et al., to improve the operation of a variational autoencoder of Burgess et al. The motivation would have been to improve other neural networks or to guide the test case generation (Krajewski et al., Conclusion, “Both networks, called TraGAN and TraVAE, are able to generate synthetic lane change maneuver trajectories that are realistic”).
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Burgess et al. in view of Ma et al. in view of Krajewski et al. in further view of Tran et al. (“Hierarchical Implicit Models and Likelihood-Free Variational Inference”).
Regarding Claim 7,
The Burgess et al. in view of Ma et al. combination of claim 1 teaches the method of claim 1,
Burgess et al. in view of Ma et al. does not appear to explicitly teach wherein the training the inference network includes minimizing a measure of discrepancy D on observations from the simulator on the input data and from the generator network on posterior parameters that the inference network outputs
However, Krajewski et al. teaches wherein the training the inference network includes minimizing a measure of discrepancy D on observations from the simulator on the input data and from the generator network on posterior parameters that the inference network outputs (Krajewski et al., Section III.C Pg. 2387, “To measure the trajectory quality for our TraVAE, we do not need to repurpose any part of the network, as the network already minimizes the mean squared error loss between the original trajectories and the reconstructed trajectories during training. Although, images reconstructed by autoencoders are typically blurry, this does not pose a problem for trajectories, as smooth trajectories are desired” teaches training of the TraVAE (corresponds to the variational autoencoder that includes a discriminator network. The discriminator network corresponds to the inference network) by minimizing the mean squared error loss (corresponds to the measure of discrepancy D) of the original trajectory (corresponds to observation from the simulator that is the input data to the generator). Section II.C.1 Pg. 2385, “An encoder network is trained to encode inputs to a low dimensional representation. To learn a meaningful latent representation, a decoder network has to reconstruct the original input using this representation. By training the network to minimize the difference between the network input and output, the latent space representation has to retain as much information as possible. After training, the decoder network is used to generate new data by sampling from the latent space” teaches training the decoder network to minimize the difference (corresponds to the discrepancy D) of the input (corresponds to input data from the generator network) and output (corresponds to the posterior parameters that the inference network outputs)).
Burgess et al. in view of Ma et al. in view of Krajewski et al. does not appear to explicitly teach which is a sample from a standard Gaussian, and latent variables of the simulator's observation are hidden
However, Tran et al. teaches which is a sample from a standard Gaussian, and latent variables of the simulator's observation are hidden (Tran et al., Fig 3a and Section 2 Pg. 2, “where x_n is an observation, z_n are latent variables associated to that observation (local variables), and β are latent variables shared across observations (global variables)” teaches the hidden state is now an implicit variable of the observations. Section 4 Pg. 8, “If the injected noise ε_{t,z} combines linearly with the output of g_z, the induced distribution p(z_t | x_{t−1}, z_{t−1}) is Gaussian parameterized by that output” teaches the Gaussian is parameterized).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include a sample from a standard Gaussian, and latent variables of the simulator's observation being hidden, as taught by Tran et al., to improve the operation of a variational autoencoder of Burgess et al. The motivation would have been to match the model’s flexibility and allow for accurate approximation of the posterior (Tran et al., Abstract, “This matches the model’s flexibility and allows for accurate approximation of the posterior. We demonstrate diverse applications: a large-scale physical simulator for predator-prey populations in ecology; a Bayesian generative adversarial network for discrete data; and a deep implicit model for text generation”).
Claims 10-14 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Burgess et al. in view of Aliper et al. (US 20200090049 A1) in view of Ma et al.
Regarding Claim 10,
Burgess et al. teaches improving an operation of a variational autoencoder (Burgess et al., Section 6 Pg. 8, “We have developed new insights into why β-VAE learns an axis-aligned disentangled representation of the generative factors of visual data compared to the standard VAE objective. In particular, we identified pressures which encourage β-VAE to find a set of representational axes which best preserve the locality of the data points, and which are aligned with factors of variation that make distinct contributions to improving the data log likelihood. We have demonstrated that these insight produce an actionable modification to the β-VAE training regime” teaches a method for improving the training of Variational Autoencoders).
Burgess et al. does not appear to explicitly teach a computer program product… the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a device to cause the device to
However, Aliper et al. teaches a computer program product… the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a device to cause the device to (Aliper et al., Para. [0038], “The standard VAE with a normal prior p(z)=N(z|0, I) can be improved by replacing p(z) with a more complex distribution pψ(z), which is referred to as a learnable prior” teaches improving a standard VAE. Para. [0007], “” teaches a computer program product. Para. [0162], “In one embodiment, any of the operations, processes, or methods, described herein can be performed or cause to be performed in response to execution of computer-readable instructions stored on a computer-readable medium and executable by one or more processors. The computer-readable instructions can be executed by a processor of a wide range of computing systems from desktop computing systems, portable computing systems, tablet computing systems, hand-held computing systems, as well as network elements, and/or any other computing device” teaches a computer readable storage medium with program instructions that are executable by a device).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include a computer program product comprising a computer readable storage medium, as taught by Aliper et al., to improve the operation of a variational autoencoder of Burgess et al. The motivation would have been to allow the provided data to omit one or more properties of the object and still result in an object with a desired property (Aliper et al., Abstract, “The proposed model is a Variational Autoencoder having a learnable prior that is parametrized with a Tensor Train (VAE-TTLP). The VAE-TTLP can be used to generate new objects, such as molecules, that have specific properties and that can have specific biological activity (when a molecule). The VAE-TTLP can be trained in a way with the Tensor Train so that the provided data may omit one or more properties of the object, and still result in an object with a desired property”).
Burgess et al. in view of Aliper et al. does not appear to explicitly teach train a generator network of a variational autoencoder to approximate a simulator and generate a first result, wherein the simulator, given input data, outputs output data that simulates output of an entity the simulator is simulating, wherein a training data set for the generator network includes the simulator's input data and output data; based on the simulator's output data and the first result of the generator network, train an inference network of the variational autoencoder to generate a second result, the second result of the trained inference network inverting the first result of the generator and approximating the simulator's input data, the trained inference network functioning as an inverted simulator
However, Ma et al. teaches train a generator network of a variational autoencoder to approximate a simulator and generate a first result (Ma et al., Figure 1 and Pg. 3 Para. 2, “the entire deep generative model is trained in an end-to-end manner with both labeled and unlabeled data employing a semi-supervised learning strategy… This means the proposed model can efficiently learn from similar metamaterial patterns without corresponding optical response obtained by numerical simulations” teaches training a deep generative model (corresponds to a generator network of a variational autoencoder) to approximate an optical response of the simulator and generates a first result).
wherein the simulator, given input data, outputs output data that simulates output of an entity the simulator is simulating (Ma et al., Figure 1 and Pg. 3 Para. 3, “The ground-truth geometry and reflection spectra obtained by numerical simulations… The forward prediction performance is first evaluated by feeding the ground-truth geometry to the prediction model, and the output spectra… The excellent agreement between the predicted spectra by our model and the numerically simulated spectra clearly confirms that our model can function as an effective simulator for fast metamaterial characterization” teaches the simulator obtaining the ground-truth geometry and reflection spectra (corresponds to input data) and outputting effective simulator for fast metamaterial characterization (corresponds to outputs output data that simulates output of an entity the simulator is simulating)).
wherein a training data set for the generator network includes the simulator's input data and output data (Ma et al., Figure 1 and Pg. 3 Para. 2, “the entire deep generative model is trained in an end-to-end manner with both labeled and unlabeled data employing a semi-supervised learning strategy. We show that, with the aid of unlabeled data, the model performance is obviously improved (Table S2, Supporting Information). This means the proposed model can efficiently learn from similar metamaterial patterns without corresponding optical response obtained by numerical simulations, which alleviate the burden in data acquisition compared with other supervised learning counterpart” teaches the generative model being trained with the geometric pattern of metamaterial structure (corresponds to the simulator's input data and output data) from the numerical simulator). 
based on the simulator's output data and the first result of the generator network, train an inference network of the variational autoencoder to generate a second result, the second result of the trained inference network inverting the first result of the generator and approximating the simulator's input data, the trained inference network functioning as an inverted simulator (Ma et al., Figure 1 and Pg. 3 Para. 1, “generation model for the inverse design process of metamaterial given required spectra” teaches the generation model for the inverse design (corresponds to inference network of the variational autoencoder to generate a second result) that samples for inverse generation (corresponds to the second result of the trained inference network inverting the first result of the generator) to approximate the metamaterial design and optical response (corresponds to approximating the simulator's input data)).
Regarding Claim 11,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 10 teaches the computer program product of claim 10,
The combination, as described in the rejection of claim 10, further teaches wherein the simulator's input data, based on which the generator network was trained, includes a representation of latent space of the variational autoencoder (Ma et al., Figure 1 and Pg. 2 Para. 6, “our deep generative model can be decomposed into three submodels, namely, recognition model for the encoding process of metamaterial structures into latent space” teaches latent space of the deep generative model (corresponds to the variational autoencoder)).
Regarding Claim 12,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 11 teaches the computer program product of claim 11,
The combination, as described in the rejection of claim 11, further teaches wherein the latent space is disentangled and interpretable (Burgess et al., Section 1 Pg. 2, “β-VAE adds an extra hyperparameter β to the VAE objective, which constricts the effective encoding capacity of the latent bottleneck and encourages the latent representation to be more factorised. The disentangled representations learnt by β-VAE have been shown to be important for learning a hierarchy of abstract visual concepts conducive of imagination [17] and for improving transfer performance of reinforcement learning policies, including simulation to reality transfer in robotics” teaches the latent representation being disentangled and interpretable).
Regarding Claim 13,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 10 teaches the computer program product of claim 10,
The combination, as described in the rejection of claim 10, further teaches wherein the device is caused to train the generator network by supervised training (Ma et al., Pg. 2 Para. 3, “the proposed deep generative model offers interpretability and can utilize unlabeled data in a semi-supervised learning strategy to improve the model performance” teaches the deep generative model includes semi-supervised learning (corresponds to supervised training)).
Regarding Claim 14,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 10 teaches the computer program product of claim 10,
The combination, as described in the rejection of claim 10, further teaches wherein the device is caused to train the inference network by unsupervised training (Burgess et al., Section 1 Pg. 1, “β-VAE is a state of the art model for unsupervised visual disentangled representation learning. It is a modification of the Variational Autoencoder (VAE)” teaches unsupervised training on a β-VAE (corresponds to a variational autoencoder that consists of an inference network)).
Regarding Claim 17,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 10 teaches the computer program product of claim 10,
The combination, as described in the rejection of claim 10, further teaches wherein the generator network is an artificial neural network (Burgess et al., Section A.1 Pg. 11, “The neural network models used for experiments in this paper all utilised the same basic architecture. The encoder for the VAEs consisted of 4 convolutional layers, each with 32 channels, 4x4 kernels, and a stride of 2. This was followed by 2 fully connected layers, each of 256 units. The latent distribution consisted of one fully connected layer of 20 units parametrising the mean and log standard deviation of 10 Gaussian random variables (or 32 for the CelebA experiment). The decoder architecture was simply the transpose of the encoder, but with the output parametrising Bernoulli distributions over the pixels. ReLU activations were used throughout” teaches the neural network models (corresponds to the artificial neural network) architecture used for an encoder (corresponds to the generator network)).
Regarding Claim 18,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 10 teaches the computer program product of claim 10,
The combination, as described in the rejection of claim 10, further teaches wherein the inference network is an artificial neural network (Burgess et al., Section A.1 Pg. 11, “The neural network models used for experiments in this paper all utilised the same basic architecture. The encoder for the VAEs consisted of 4 convolutional layers, each with 32 channels, 4x4 kernels, and a stride of 2. This was followed by 2 fully connected layers, each of 256 units. The latent distribution consisted of one fully connected layer of 20 units parametrising the mean and log standard deviation of 10 Gaussian random variables (or 32 for the CelebA experiment). The decoder architecture was simply the transpose of the encoder, but with the output parametrising Bernoulli distributions over the pixels. ReLU activations were used throughout” teaches the neural network models (corresponds to the artificial neural network) architecture used for a decoder (corresponds to the inference network)).
Regarding Claim 19,
Burgess et al. teaches a system for improving an operation of a variational autoencoder, comprising (Burgess et al., Para. [0017], “A system, method and technique may be provided, which can generate and train a generative network, for example, a generative adversarial network (GAN). In an embodiment, the generative network is built and trained to learn the future market uncertainty in its multidimensional form for portfolio diversification. The generative network can allow for diversified portfolio combination with a risk adjusted return. For instance, a generative network model can be trained to directly model the market uncertainty, a factor driving future price trend in multidimensional form, such that the non-linear interactions between different assets can be embedded in a generative network” teaches a system for improving the training of Variational Autoencoders).
Burgess et al. does not appear to explicitly teach a hardware processor; a memory device coupled with the hardware processor; the hardware processor configured to at least.
However, Aliper et al. teaches a hardware processor (Aliper et al., Para. [0167], “In a very basic configuration 602, computing device 600 generally includes one or more processors 604” teaches the hardware processor).
a memory device coupled with the hardware processor; the hardware processor configured to at least (Aliper et al., Para. [0178], “In some embodiments, a computer program product can include a non-transient, tangible memory device having computer-executable instructions that when executed by a processor, cause performance of a method” teaches a memory device coupled with the hardware processor).
Burgess et al. in view of Aliper et al. does not appear to explicitly teach train a generator network of a variational autoencoder to approximate a simulator and generate a first result, wherein the simulator, given input data, outputs output data that simulates output of an entity the simulator is simulating, wherein a training data set for the generator network includes the simulator's input data and output data; based on the simulator's output data and the first result of the generator network, train an inference network of the variational autoencoder to generate a second result, the second result of the trained inference network inverting the first result of the generator and approximating the simulator's input data, the trained inference network functioning as an inverted simulator.
However, Ma et al. teaches train a generator network of a variational autoencoder to approximate a simulator and generate a first result (Ma et al., Figure 1 and Pg. 3 Para. 2, “the entire deep generative model is trained in an end-to-end manner with both labeled and unlabeled data employing a semi-supervised learning strategy… This means the proposed model can efficiently learn from similar metamaterial patterns without corresponding optical response obtained by numerical simulations” teaches training a deep generative model (corresponds to a generator network of a variational autoencoder) to approximate an optical response of the simulator and generates a first result).
wherein the simulator, given input data, outputs output data that simulates output of an entity the simulator is simulating (Ma et al., Figure 1 and Pg. 3 Para. 3, “The ground-truth geometry and reflection spectra obtained by numerical simulations… The forward prediction performance is first evaluated by feeding the ground-truth geometry to the prediction model, and the output spectra… The excellent agreement between the predicted spectra by our model and the numerically simulated spectra clearly confirms that our model can function as an effective simulator for fast metamaterial characterization” teaches the simulator obtaining the ground-truth geometry and reflection spectra (corresponds to input data) and outputting effective simulator for fast metamaterial characterization (corresponds to outputs output data that simulates output of an entity the simulator is simulating)).
wherein a training data set for the generator network includes the simulator's input data and output data (Ma et al., Figure 1 and Pg. 3 Para. 2, “the entire deep generative model is trained in an end-to-end manner with both labeled and unlabeled data employing a semi-supervised learning strategy. We show that, with the aid of unlabeled data, the model performance is obviously improved (Table S2, Supporting Information). This means the proposed model can efficiently learn from similar metamaterial patterns without corresponding optical response obtained by numerical simulations, which alleviate the burden in data acquisition compared with other supervised learning counterpart” teaches the generative model being trained with the geometric pattern of metamaterial structure (corresponds to the simulator's input data and output data) from the numerical simulator). 
based on the simulator's output data and the first result of the generator network, train an inference network of the variational autoencoder to generate a second result, the second result of the trained inference network inverting the first result of the generator and approximating the simulator's input data, the trained inference network functioning as an inverted simulator (Ma et al., Figure 1 and Pg. 3 Para. 1, “generation model for the inverse design process of metamaterial given required spectra” teaches the generation model for the inverse design (corresponds to inference network of the variational autoencoder to generate a second result) that samples for inverse generation (corresponds to the second result of the trained inference network inverting the first result of the generator) to approximate the metamaterial design and optical response (corresponds to approximating the simulator's input data)).
Regarding Claim 20,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 19 teaches the system of claim 19, 
The combination, as described in the rejection of claim 19, further teaches wherein the simulator's input data, based on which the generator network was trained, includes a representation of latent space of the variational autoencoder (Ma et al., Figure 1 and Pg. 2 Para. 6, “our deep generative model can be decomposed into three submodels, namely, recognition model for the encoding process of metamaterial structures into latent space” teaches latent space of the deep generative model (corresponds to the variational autoencoder)).
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Burgess et al. in view of Aliper et al. in view of Ma et al. in further view of Krajewski et al.
Regarding Claim 15,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 10 teaches the computer program product of claim 10,
Burgess et al. in view of Aliper et al. in view of Ma et al. does not appear to explicitly teach wherein the device is caused to train the generator network by minimizing a measure of discrepancy D on observations from the simulator and the generator network on the same input data.
However, Krajewski et al. teaches wherein the device is caused to train the generator network by minimizing a measure of discrepancy D on observations from the simulator and the generator network on the same input data (Krajewski et al., Section III.C Pg. 2387, “To measure the trajectory quality for our TraVAE, we do not need to repurpose any part of the network, as the network already minimizes the mean squared error loss between the original trajectories and the reconstructed trajectories during training. Although, images reconstructed by autoencoders are typically blurry, this does not pose a problem for trajectories, as smooth trajectories are desired” teaches training of the TraVAE (corresponds to the variational autoencoder that includes a generator network) by minimizing the mean squared error loss (corresponds to the measure of discrepancy D) of the original trajectory (corresponds to observation from the simulator that is the input data to the generator)).
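For illustration only, the mean squared error relied upon above as the measure of discrepancy D can be sketched as follows. The function name mse and the flat list-of-floats trajectory representation are hypothetical; the quoted Krajewski et al. passage states only that the loss is computed between original and reconstructed trajectories.

```python
# Illustrative sketch only; the trajectory representation is an ASSUMPTION.

def mse(original, reconstructed):
    """Mean squared error between two equal-length sequences of floats."""
    if len(original) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    if not original:
        raise ValueError("sequences must be non-empty")
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    original = [0.0, 1.0, 2.0, 3.0]        # hypothetical trajectory samples
    reconstructed = [0.0, 1.5, 2.0, 2.5]   # hypothetical network output
    print(mse(original, reconstructed))
```

A value of zero indicates the reconstruction matches the original exactly; training drives this value toward zero.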
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Burgess et al. in view of Aliper et al. in view of Ma et al. in view of Krajewski et al. in further view of Tran et al.
Regarding Claim 16,
The Burgess et al. in view of Aliper et al. in view of Ma et al. combination of claim 10 teaches the computer program product of claim 10,
Burgess et al. in view of Aliper et al. in view of Ma et al. does not appear to explicitly teach wherein the device is caused to train the inference network by minimizing a measure of discrepancy D on observations from the simulator on the input data and from the generator network on posterior parameters that the inference network outputs.
However, Krajewski et al. teaches wherein the device is caused to train the inference network by minimizing a measure of discrepancy D on observations from the simulator on the input data and from the generator network on posterior parameters that the inference network outputs (Krajewski et al., Section III.C Pg. 2387, “To measure the trajectory quality for our TraVAE, we do not need to repurpose any part of the network, as the network already minimizes the mean squared error loss between the original trajectories and the reconstructed trajectories during training. Although, images reconstructed by autoencoders are typically blurry, this does not pose a problem for trajectories, as smooth trajectories are desired” teaches training of the TraVAE (corresponds to the variational autoencoder that includes a discriminator network. The discriminator network corresponds to the inference network) by minimizing the mean squared error loss (corresponds to the measure of discrepancy D) of the original trajectory (corresponds to observation from the simulator that is the input data to the generator). Section II.C.1 Pg. 2385, “An encoder network is trained to encode inputs to a low dimensional representation. To learn a meaningful latent representation, a decoder network has to reconstruct the original input using this representation. By training the network to minimize the difference between the network input and output, the latent space representation has to retain as much information as possible. After training, the decoder network is used to generate new data by sampling from the latent space” teaches training the decoder network to minimize the difference (corresponds to the discrepancy D) of the input (corresponds to input data from the generator network) and output (corresponds to the posterior parameters that the inference network outputs)).
Burgess et al. in view of Aliper et al. in view of Ma et al. in view of Krajewski et al. does not appear to explicitly teach which is a sample from a standard Gaussian, and latent variables of the simulator's observation are hidden.
However, Tran et al. teaches which is a sample from a standard Gaussian, and latent variables of the simulator's observation are hidden (Tran et al., Fig. 3a and Section 2 Pg. 2, “where x_n is an observation, z_n are latent variables associated to that observation (local variables), and β are latent variables shared across observations (global variables)” teaches the hidden state is now an implicit variable of the observations. Section 4 Pg. 8, “If the injected noise ε_{t,z} combines linearly with the output of g_z, the induced distribution p(z_t | x_{t−1}, z_{t−1}) is Gaussian parameterized by that output” teaches the Gaussian is parameterized).
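For illustration only, the statement quoted above from Tran et al., that noise combined linearly with a network output induces a Gaussian parameterized by that output, can be sketched as follows. The function name sample_linear_noise and the scalar mu and sigma values are hypothetical stand-ins for the network output; they do not appear in the cited reference.

```python
# Illustrative sketch only; mu and sigma are hypothetical stand-ins
# for a network output, not values from the cited reference.
import random


def sample_linear_noise(mu, sigma, n_samples, seed=0):
    """Draw z = mu + sigma * eps, where eps is a standard Gaussian sample."""
    rng = random.Random(seed)
    return [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def mean_and_variance(samples):
    """Empirical mean and (population) variance of a list of floats."""
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, variance


if __name__ == "__main__":
    samples = sample_linear_noise(mu=2.0, sigma=0.5, n_samples=20000)
    print(mean_and_variance(samples))
```

Because the standard Gaussian noise enters linearly, the samples are themselves Gaussian with mean mu and variance sigma squared, which the empirical estimates approximate.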

Response to Arguments
Applicant's arguments filed 08/15/2022 with respect to the 35 U.S.C. 103 rejection of claims 1-20 have been fully considered but they are not persuasive. Applicant asserts that “Burgess and Korthals do not appear to disclose or suggest, "training a generator network of a variational autoencoder to approximate a simulator and generate a first result, wherein the simulator, given input data, outputs output data that simulates output of an entity the simulator is simulating, wherein a training data set for the generator network includes the simulator's input data and output data; based on the simulator's output data and the first result of the generator network, training an inference network of the variational autoencoder to generate a second result, the second result of the trained inference network inverting the first result of the generator and approximating the simulator's input data, the trained inference network functioning as an inverted simulator," claimed in amended claim 1. The same reasons apply to claims 10 and 19, and the dependent claims at least by virtue of their dependencies.” (Remarks, pg. 12).
Examiner’s Response:
The Examiner agrees that Burgess et al. in combination with Korthals et al. does not teach the above-mentioned amended limitations. However, Burgess et al. in combination with Ma et al. teaches “training a generator network of a variational autoencoder to approximate a simulator and generate a first result, wherein the simulator, given input data, outputs output data that simulates output of an entity the simulator is simulating, wherein a training data set for the generator network includes the simulator's input data and output data” (Ma et al., Figure 1 and Pg. 3 Para. 2, “the entire deep generative model is trained in an end-to-end manner with both labeled and unlabeled data employing a semi-supervised learning strategy. We show that, with the aid of unlabeled data, the model performance is obviously improved (Table S2, Supporting Information). This means the proposed model can efficiently learn from similar metamaterial patterns without corresponding optical response obtained by numerical simulations, which alleviate the burden in data acquisition compared with other supervised learning counterpart” teaches training a deep generative model (corresponds to a generator network of a variational autoencoder) to approximate an optical response of the simulator and generates a first result. The generative model being trained with the geometric pattern of metamaterial structure (corresponds to the simulator's input data and output data) from the numerical simulator. Figure 1 and Pg. 3 Para. 3, “The ground-truth geometry and reflection spectra obtained by numerical simulations… The forward prediction performance is first evaluated by feeding the ground-truth geometry to the prediction model, and the output spectra… The excellent agreement between the predicted spectra by our model and the numerically simulated spectra clearly confirms that our model can function as an effective simulator for fast metamaterial characterization” teaches the simulator obtaining the ground-truth geometry and reflection spectra (corresponds to input data) and outputting effective simulator for fast metamaterial characterization (corresponds to outputs output data that simulates output of an entity the simulator is simulating)).
Ma et al. further teaches “based on the simulator's output data and the first result of the generator network, training an inference network of the variational autoencoder to generate a second result, the second result of the trained inference network inverting the first result of the generator and approximating the simulator's input data, the trained inference network functioning as an inverted simulator” (Ma et al., Figure 1 and Pg. 3 Para. 1, “generation model for the inverse design process of metamaterial given required spectra” teaches the generation model for the inverse design (corresponds to inference network of the variational autoencoder to generate a second result) that samples for inverse generation (corresponds to the second result of the trained inference network inverting the first result of the generator) to approximate the metamaterial design and optical response (corresponds to approximating the simulator's input data)).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Henry T Nguyen whose telephone number is (571)272-8860. The examiner can normally be reached Monday-Friday 8:00am-4:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HENRY TRONG NGUYEN/Examiner, Art Unit 2125         

/BRIAN M SMITH/Primary Examiner, Art Unit 2122