DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner Remarks
The Examiner notes that the instant application was previously examined by a different examiner. As such, the Examiner proceeds with prosecution giving full faith and credit to the search and action of the previous examiner per MPEP § 704.01:

When an examiner is assigned to act on an application which has received one or more actions by some other examiner, full faith and credit should be given to the search and action of the previous examiner unless there is a clear error in the previous action or knowledge of other prior art. In general the second examiner should not take an entirely new approach to the application or attempt to reorient the point of view of the previous examiner, or make a new search in the mere hope of finding something. See MPEP § 719.05.
 In this case, the examiner has conducted an updated search in light of amended claim limitations.
Response to Arguments
Applicant’s amendments and remarks filed 06/17/2022 have been considered by the examiner. 

Applicant's remarks directed to the claim rejections under 35 U.S.C. § 101 made in the previous rejection have been fully considered. The examiner notes that the rejection made of record constitutes a prima facie case per the 2019 Revised Patent Subject Matter Eligibility Guidance issued by the USPTO on January 4, 2019. The examiner has followed the guidance in the current action, and the claims were found to recite an abstract idea, where the relevant additional elements have been examined per MPEP § 2106. See the current action for the rejection concerning the currently amended claims.

Applicant's remarks directed to the claim rejections under 35 U.S.C. § 102 and 35 U.S.C. § 103 made in the previous rejection have been fully considered.
With respect to the argument directed to the rejection of claims under 35 U.S.C. § 102, specifically the amended elements in the limitations of claims 1 and 25, the Applicant has presented arguments regarding claim language that has not been previously examined. Therefore, Applicant's arguments are rendered moot. See the current Office action regarding the rejection of the amended claim limitations.
 



Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-7, 10-13, 20-23, and 25-29 are rejected under 35 U.S.C. 101.

According to the first part of the Alice analysis, in the instant case, claims 1-7, 10-13, 20-23, and 25-29 were determined to be directed to one of the four statutory categories.

Regarding claim 1, the claimed invention is directed to an abstract idea without significantly more. The claim recites encoding the input … to generate a latent variable vector in a latent variable region space partitioned into regions, and decoding the latent variable vector … to generate an output response corresponding to a region, from among the regions, of the latent variable vector, which is a mental process. This judicial exception is not integrated into a practical application because the additional limitations, addressed below, do not integrate the judicial exception into a practical application. The additional limitations are directed to the following:
obtaining an input and providing a result of inference based on the output response inferred from the input.
The claimed elements are recited at a high level of generality and are considered insignificant extra-solution activity under MPEP § 2106.05(g); the courts have identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). MPEP § 2106.05(d)(II) further notes that the courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity:
i. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network); but see DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245, 1258, 113 USPQ2d 1097, 1106 (Fed. Cir. 2014) ("Unlike the claims in Ultramercial, the claims at issue here specify how interactions with the Internet are manipulated to yield a desired result‐‐a result that overrides the routine and conventional sequence of events ordinarily triggered by the click of a hyperlink." (emphasis added))
… using a neural network-based encoder … and … using a neural network-based decoder … 
The claim elements are recited as an application of the neural network as a processing tool, which amounts to merely reciting the words "apply it" (or an equivalent) with the judicial exception, merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f); the courts have identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I).
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
	
Regarding claim 2, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 1, where the vector has multiple dimensions comprising variables to generate the response. This judicial exception is not integrated into a practical application because the claim does not recite any additional limitation other than what has been analyzed above for claim 1. For the same reason, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 3, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 1, where the regions correspond to responses. This judicial exception is not integrated into a practical application because the claim limitation generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h). The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the region being able to correspond to multiple responses does not modify its function; the recitation generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 4, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 3, where control inputs that help to generate the variable partition the regions. This judicial exception is not integrated into a practical application because the claim limitation generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h). The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the region being able to correspond to multiple responses does not modify its function; the recitation generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 5, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 1, where generating the vector includes generating a variable which corresponds to the vector. This judicial exception is not integrated into a practical application because the claim merely adds a step to generating the vector, which still only results in generating a vector, and is considered part of the mental process. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the claimed information generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 6, the claimed invention is directed to an abstract idea without significantly more. The claim recites the abstract idea of claim 4 and further recites generating latent variables by sampling vectors in a distribution and generating the variables based on the sampled vectors, which as claimed are considered additional limitations directed to a mental process; the additional elements are not sufficient to integrate the abstract idea into a practical application. This judicial exception is not integrated into a practical application because the additional elements merely recite a probability distribution representing the claimed region space. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the claimed information generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 7, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 4 and further recites selecting inputs and generating variable vectors, which is considered directed to a mental process. The claim recites generating latent variables by sampling vectors in a distribution and generating the variables based on the sampled vectors, which as claimed are considered limitations directed to a mental process; the additional elements are not sufficient to integrate the abstract idea into a practical application. This judicial exception is not integrated into a practical application because the additional limitations are recited such that the elements generally link the abstract idea to a field of use, namely the claimed variables belonging to the claimed regions corresponding to a probability distribution. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the control input comprising the claimed information generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 10, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 3, where the input is a user utterance and the responses are different responses to the utterance. This judicial exception is not integrated into a practical application because the additional limitations are recited such that the elements generally link the abstract idea to a field of use, namely the claimed input type. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the claimed elements generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 11, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 1 and further recites steps for corresponding data to a mean and a variance of a distribution modeling variables, which as claimed are further directed to a mental process; the additional limitations directed to the claimed application of the neural network are considered insufficient to integrate the abstract idea into a practical application. This judicial exception is not integrated into a practical application because the additional limitations are recited such that the elements generally link the abstract idea to a field of use, namely the claimed neural network type. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the elements generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 12, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 1; the additional limitations directed to the claimed application of the neural network are considered insufficient to integrate the abstract idea into a practical application. This judicial exception is not integrated into a practical application because the additional limitations are recited such that the elements generally link the abstract idea to a field of use, namely the claimed neural network type and the claimed particular features. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the claimed elements generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 13, the claimed invention is directed to an abstract idea without significantly more. The claim recites instructions that, when executed, generate a vector by encoding an input, and generate a response by decoding the vector, which is a mental process. This judicial exception is not integrated into a practical application because the storage medium holding the instructions is a generic computer component that does not add a meaningful limitation to the mental process; it amounts to implementing the idea on a computer. Further, the additional limitations are recited such that the elements generally link the abstract idea to a technological environment as claimed. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the claimed elements generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 20, the claim recites limitations similar to those of claim 1 and is thus rejected under the same rationale. Furthermore, the additional limitations are directed to the following:
a processor …
This judicial exception is not integrated into a practical application because the additional limitations are recited such that the elements generally link the abstract idea to a technological environment as claimed. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the processor as claimed generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 21, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 20, where the claim limitations are similar to those of claim 2 and are thus rejected under the same rationale.

Regarding claim 22, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 20, where the claim limitations are similar to those of claim 4 and are thus rejected under the same rationale.

Regarding claim 23, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 20, where the claim limitations are similar to those of claim 5 and are thus rejected under the same rationale.

Regarding claim 25, the claimed invention is directed to an abstract idea without significantly more. The claim recites limitations similar to those of claim 1 and is thus rejected under the same rationale. Furthermore, the additional limitations are directed to the following:
… receive an input from a user and output the response through a user interface.
The claimed elements are recited at a high level of generality and are considered insignificant extra-solution activity under MPEP § 2106.05(g); the courts have identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). MPEP § 2106.05(d)(II) further notes that the courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity:
i. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network); but see DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245, 1258, 113 USPQ2d 1097, 1106 (Fed. Cir. 2014) ("Unlike the claims in Ultramercial, the claims at issue here specify how interactions with the Internet are manipulated to yield a desired result‐‐a result that overrides the routine and conventional sequence of events ordinarily triggered by the click of a hyperlink." (emphasis added))
a sensor …; a processor …;  a memory configured to store
This judicial exception is not integrated into a practical application because the additional limitations are recited such that the elements generally link the abstract idea to a technological environment as claimed. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the sensor, processor, and memory as claimed generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
… store a latent variable region space partitioned into regions corresponding to responses
The claimed elements are recited at a high level of generality and are considered insignificant extra-solution activity under MPEP § 2106.05(g); the courts have identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). MPEP § 2106.05(d)(II) further notes that the courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity:
iv. Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93;
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 26, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 25, where the processor generates the vector based on a control input, which is based on a latent variable produced by encoding the input, and thus is further considered a mental process. This judicial exception is not integrated into a practical application because the additional limitations are recited such that the elements generally link the abstract idea to a technological environment having the claimed processor. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited use of the claimed elements generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 27, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 26, where the control input randomly corresponds to any region. This judicial exception is not integrated into a practical application because the claim limitation generally links the use of a judicial exception to a particular field of use, as discussed in MPEP § 2106.05(h). The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the control input being able to correspond randomly to any region generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 28, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 26, where the control input corresponds to a combination of keywords and user sentiment, attitude, directive, and guidance. This judicial exception is not integrated into a practical application because the claim limitation generally links the use of a judicial exception to a particular field of use, as discussed in MPEP § 2106.05(h). The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the input corresponding to the claimed elements generally links the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h); and the courts have also identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding claim 29, the claimed invention is directed to an abstract idea without significantly more. The claim recites the mental process of claim 26, where the processor comprises an encoder and decoder implemented on two neural networks. This judicial exception is not integrated into a practical application because the use of two generic neural networks does not add a meaningful limitation, as they amount to simply implementing the mental process on a computer. The claim elements are recited as an application of the neural network as a processing tool, which amounts to merely reciting the words "apply it" (or an equivalent) with the judicial exception, merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f); the courts have identified that this class of limitations does not integrate a judicial exception into a practical application, see MPEP § 2106.04(d)(I). Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Thus, per the analysis above, claims 1-7, 10-13, 20-23, and 25-29, when examined individually and as an ordered combination (i.e., as a whole), do not recite what the courts have identified as "significantly more".

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-7, 11-13, and 20-23 are rejected under 35 U.S.C. 103 as being unpatentable over Stojevic (U.S. Pub. No. 2021/0081804 A1, hereinafter 'Stojevic') in view of DeFelice (U.S. Pub. No. 2019/0236139, hereinafter 'Def').

Regarding claim 1, Stojevic teaches a processor-implemented response inference method, comprising:
obtaining an input ([0008]: "...a tensor network representation of molecular quantum states of a dataset of small, drug-like molecules is provided as an input to a machine learning system…"; Stojevic teaches a machine learning system that takes an input).
encoding the input using a neural network-based encoder to a latent variable vector in a latent variable region space partitioned into regions ([0026]: "...trained to encode the input to a small dimensional vector in the latent space…"; [0139]: "The tensor network used to represent interesting regions of the exponentially large space needs to be determined using an intelligent prior based on available data,"; Stojevic teaches encoding an input to a vector (i.e. generating a latent variable vector … by encoding the input) in a region of the latent space, and that this space has different interesting regions (i.e. generating … in a latent variable region space partitioned into regions); the claimed encoding of the input using a neural network-based encoder is depicted in Fig. 5 and Fig. 21:

[Figure: media_image1.png (greyscale)]

[Figure: media_image2.png (greyscale)]


In [0147]-[0148]: FIG. 5 is a schematic diagram showing how data is processed in the method 100 to train the autoencoder. Input data 114 is encoded into a chosen latent space 116 using a neural network or tensor network… In one embodiment, input data in the form of a tensor is received. A tensor network provided as an autoencoder is used to encode the input data into a complete-graph tensor network latent space…; and, as a variational autoencoder, in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x) (i.e. encoding the input using a neural network-based encoder to a latent variable vector in a latent variable region space partitioned into regions), σ(x)…).
decoding the latent variable vector using a neural network-based decoder to generate an output response corresponding to a region, from among the regions, of the latent variable vector ([0026]: "The term ‘autoencoder’ preferably connotes an artificial neural network having an output in the same form as the input, trained to encode the input to a small dimensional vector in the latent space, and to decode this vector to reproduce the input as accurately as possible,"; [0139]: "The tensor network used to represent interesting regions of the exponentially large space needs to be determined using an intelligent prior based on available data,"; Stojevic teaches decoding the vector from a region of the latent space to produce (i.e. generate) an output that reproduces the input. The output is based on the decoded vector (i.e. corresponding to a region), and the region of the vector is one of several regions in the latent space (i.e. a region, from among the regions). The claimed decoding is depicted in Fig. 5 and Fig. 21, which show processing of the latent variables for decoding and generating the depicted output; in [0147]-[0148]: FIG. 5 is a schematic diagram showing how data is processed in the method 100 to train the autoencoder. Input data 114 is encoded into a chosen latent space 116 using a neural network or tensor network… In one embodiment, input data in the form of a tensor is received. A tensor network provided as an autoencoder is used to encode the input data into a complete-graph tensor network latent space…; and, as a variational autoencoder, in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x) (i.e. encoding the input using a neural network-based encoder to a latent variable vector in a latent variable region space partitioned into regions), … The decoder network (i.e. decoding the latent variable vector using a neural network-based decoder to generate an output response corresponding to a region, from among the regions, of the latent variable vector) samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimized to reproduce the inputs x maximally well, given a small latent space z...).
providing a result … based on the output response inferred from the input. (providing the claimed predictive output, in 0047: FIG. 16 15 shows a table of results. FIG. schematically represents processing tensor network input states to produce a predictive output (i.e. providing … the output response inferred from the input) and a generative output.; And in 0147: FIG. 5 is a schematic diagram showing how data is processed in the method 100 to train the autoencoder. Input data 114 is encoded into a chosen latent space 116 using a neural network or tensor network. The compressed representation of the data in the latent space is then decoded using the neural network or tensor network, thereby producing outputs 118. The weights in the neural network, or constituent tensors in a tensor network, are optimised to minimize the difference (i.e. providing a result … based on the output response inferred from the input) between outputs 118 and inputs 114…)
While Stojevic teaches using encoder and decoder networks to process information and produce a neural network output, with a result inferred as the difference between the decoded output and the input as disclosed above, Stojevic does not expressly teach the use of the decoder output to infer a result as claimed:
providing a result of inference based on the output response inferred from the input. (Def teaches in 0046-0047: Focusing on the text generation component 220, it includes both an encoder 222 and a decoder 224. The encoder 222 network encodes the words within the source text 201 as a list of vectors,… In various embodiments, the text generation component also includes a discriminator 226 and an evaluator 228. The discriminator 226 (i.e. providing a result of inference based on the output response inferred from the input) is used to judge whether a particular candidate output is "human-like," without primary regard to the content of the candidate output. The output of the discriminator (i.e. providing a result of inference based on the output response inferred from the input) is provided back to the encoder 222 at 229(a). The evaluator 228 is used to judge whether the generated text conforms to the target classifiers identified relative to FIG. 1. The output of the evaluator can be used both to disqualify a particular candidate text (for failing one or more binary classifiers or for falling too far outside an acceptable range on a Gaussian classifier) but it can also be used as part of a feedback loop for the encoder 222, shown…: And depicted in Fig. 2:

[Figure: Def, Fig. 2 (media_image3.png)]

)
The Stojevic and Def references would have been recognized by those of ordinary skill in the art as useful for applicant’s purpose in developing information processing and retrieval techniques using neural network models. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the prior art for generating inferred information from a decoder output as disclosed by Def with the method of information processing and retrieval techniques using encoder and decoder neural network models as disclosed by Stojevic.
One of ordinary skill in the art would have been motivated to combine the disclosed methods of Stojevic and Def in order to provide and use a result of inference that allows a network model to consider an outside objective based on a decoded output (Def, 0009). Doing so helps to improve learning by allowing the use of inferred results that capture external objectives by applying contextual information to the decoder output (Def, 0009).

	
Regarding claim 2, Stojevic in combination with Def teaches the method according to claim 1. Stojevic further teaches the latent variable vector is a multidimensional vector comprising latent information variables to generate a response to the input ([0014]: "The term ‘tensor’ preferably connotes a multidimensional or multi-rank array (a matrix and vector being examples of rank-2 or rank-1 tensors), where the components of the array are preferably functions of the coordinates of a space,"; [0149]: "...the latent space might be a tensorial object, or a simple vector (which is the usual setup in an autoencoder), or some other mathematical construct such as a graph. The output determined by a given element of the latent space (and in particular the optimal element of the latent space) will in general not be a part of the original dataset,"; Stojevic teaches a tensorial object which can have multiple dimensions (i.e. a multidimensional vector) that produces an output (i.e. response to an input) comprised of several elements (i.e. variables)).

Regarding claim 3, Stojevic in combination with Def teaches the method according to claim 1. Stojevic further teaches the regions correspond to a plurality of responses ([0139]: "The tensor network used to represent interesting regions of the exponentially large space needs to be determined using an intelligent prior based on available data,"; Stojevic teaches a tensor network and data (i.e. a plurality of responses), where the tensor network is used to represent a region (i.e. the network that the region corresponds to); And as depicted in Fig. 21, in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x) (i.e. regions of the latent variable region space), … The decoder network samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimized to reproduce the inputs x maximally well, given a small latent space z...).

Regarding claim 4, Stojevic in combination with Def teaches the method according to claim 3. Stojevic further teaches:
the latent variable region space is partitioned by control inputs corresponding to the plurality of responses ([0149]: "...the generative tensorial approach described here will explore regions of the huge space of possible compounds not accessible to other methods. The output data may alternatively or additionally be a filtered version of the input data, corresponding to a smaller number of data points,"; Stojevic teaches that the space has regions (i.e. the space is partitioned) and that the data that comes from it can be filtered based on data points (i.e. control inputs that correspond to responses); And as depicted in Fig. 21, in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x) (i.e. regions of the latent variable region space), … The decoder network  samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimized to reproduce the inputs x maximally well, given a small latent space z... ).
a control input of the control inputs comprises information to generate the latent variable vector in the region of the latent variable region space ([0139]: "The tensor network used to represent interesting regions of the exponentially large space needs to be determined using an intelligent prior based on available data,"; Stojevic teaches an intelligent prior based on available data (i.e. a control input) that is used to determine a network that a region of the latent variable space (i.e. information to generate the latent variable vector in the region of the latent variable region space); And as depicted in Fig. 21, in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x) (i.e. regions of the latent variable region space), … The decoder network  samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimized to reproduce the inputs x maximally well, given a small latent space z...).

Regarding claim 5, Stojevic in combination with Def teaches the method according to claim 1. Stojevic further teaches:
generating a latent variable by encoding the input ([0026]: "...trained to encode the input to a small dimensional vector in the latent space…"; Stojevic teaches encoding the input to a vector in the latent space (i.e. generating a latent variable))
generating the latent variable vector belonging to one of the regions of the latent variable region space corresponding to the latent variable ([0139]: "The tensor network used to represent interesting regions of the exponentially large space needs to be determined using an intelligent prior based on available data,"; Stojevic teaches determining a tensor network representing an interesting region (i.e. generating the latent variable vector belonging to one of the regions) using (i.e. corresponding to) an intelligent prior (i.e. latent variable); And as depicted in Fig. 21, in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x) (i.e. regions of the latent variable region space), … The decoder network  samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimized to reproduce the inputs x maximally well, given a small latent space z... ).

Regarding claim 6, Stojevic in combination with Def teaches the method according to claim 4. Stojevic further teaches:
sampling a plurality of vectors based on a probability distribution representing the latent variable region space ([0163]: "…samples of real molecules are fed to Discriminator D; molecules are represented as tensor networks T..."; [0217]: "...a generative model G that captures the training dataset distribution and (b) a discriminative model D that estimates the probability that a sample came from the training dataset rather than G,"; Stojevic teaches sampling a plurality of molecules represented as tensor networks (i.e. vectors), using a distribution of the training dataset (i.e. based on a probability distribution representing the latent variable region space))
generating the latent variable vector based on the sampled vectors ([0222]: "...the machine learning system outputs tensor network representations of the molecular quantum states of small drug-like molecules to a predictive model,"; Stojevic teaches outputting a tensor network (i.e. generating the latent variable vector) that represents molecules (i.e. the vector is based on the sampled vectors)).

Regarding claim 7, Stojevic in combination with Def teaches the method according to claim 4. Stojevic further teaches:
selecting one of control inputs corresponding to the regions of the latent variable region space ([0134]: “Tensor networks enable intelligent priors to be picked that, in turn, restrict the search to the space of physically relevant elements…"; Stojevic teaches selecting a prior (i.e. control input) that restricts a search to an area of a space (i.e. corresponding to the region[s] of the latent variable region space); And as depicted in Fig. 21, in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x) (i.e. regions of the latent variable region space), … The decoder network  samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimized to reproduce the inputs x maximally well, given a small latent space z... ).
generating the latent variable vector belonging to the region corresponding to the selected control input based on a probability distribution ([0134]: “Tensor networks enable intelligent priors to be picked that, in turn, restrict the search to the space of physically relevant elements…”; [0171]: “The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x). The decoder network samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimised to reproduce the inputs,”; Stojevic teaches selecting a prior that restricts a search to an area of the space (i.e. region corresponding to the selected control input). Stojevic also teaches an autoencoder that encodes a set of latent variables (i.e. the latent variable vector), then samples the latent space from the prior Gaussian distribution (i.e. the region corresponding to the selected control input based on a probability distribution) and decodes an output (i.e. generating the latent variable vector belonging to the region)).

Regarding claim 11, Stojevic in combination with Def teaches the method according to claim 1. Stojevic further teaches:
wherein a neural network of the neural network-based encoder comprises an input layer corresponding to the input and an output layer corresponding to a mean and a variance of a probability distribution modeling a latent variable. ([0026]: “The term 'autoencoder' preferably connotes an artificial neural network having an output in the same form as the input, trained to encode the input to a small dimensional vector in the latent space, and to decode this vector to reproduce the input as accurately as possible,”; Fig. 21, [0171]: “The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x). The decoder network samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimised to reproduce the inputs,”; Stojevic teaches a VAE neural network that accepts an input, and produces an output that is a reproduction of the input (i.e. modeling a latent variable). The output is produced with the use of a Gaussian (i.e. corresponding to the mean and variance of a probability); And as depicted in Fig. 21).

Regarding claim 12, Stojevic in combination with Def teaches the method according to claim 1. Stojevic further teaches:
wherein a neural network of the neural network-based decoder comprises an input layer corresponding to the latent variable vector and an output layer corresponding to the output response. ([0026]: “The term 'autoencoder' preferably connotes an artificial neural network having an output in the same form as the input, trained to encode the input to a small dimensional vector in the latent space, and to decode this vector to reproduce the input as accurately as possible,”; [0171]: “The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x). The decoder network (i.e. the neural network-based decoder comprises an input layer corresponding to the latent variable vector and an output layer corresponding to the output response) samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimised to reproduce the inputs,”; Stojevic teaches a VAE neural network decoder that takes an input from the latent space (i.e. latent variable vector) and produces an output corresponding to the input to the encoder (i.e. output response); And as depicted in Fig. 21).

Regarding claim 13, Stojevic in combination with Def teaches the method according to claim 1. Stojevic further teaches a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the response inference method of claim 1 ([0274]: "...a computer readable medium having stored thereon a program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein,"; "As used herein, means plus function features may be expressed alternatively in terms of their corresponding structure, such as a suitably programmed processor and associated memory,"; Stojevic teaches a computer readable storage holding instructions for carrying out methods, and a processor for carrying out those instructions).

Regarding claim 20, Stojevic teaches:
a processor configured to: ([0128]: "...a processor for processing the chemical compound dataset to determine a tensorial space for said chemical compound dataset…"; Stojevic teaches a processor)
Examiner notes that the remaining claim 20 limitations are similar to claim 1 limitations and thus rejected under the same rationale as claim 1 limitations.

Regarding claim 21, Stojevic in combination with Def teaches the apparatus according to claim 20. The claim limitations are similar to the claim 2 limitations and are thus rejected under the same rationale as the claim 2 limitations.

Regarding claim 22, Stojevic teaches the apparatus according to claim 20. The claim limitations are similar to the claim 4 limitations and are thus rejected under the same rationale as the claim 4 limitations.

Regarding claim 23, Stojevic teaches the apparatus according to claim 20. The claim limitations are similar to the claim 5 limitations and are thus rejected under the same rationale as the claim 5 limitations.

Claims 8, 9, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Stojevic in view of Def and in further view of Graves et al. (Graves, Alex, Jacob Menick, and Aaron van den Oord. "Associative compression networks for representation learning." arXiv preprint arXiv:1804.02476 (2018), hereinafter ‘Graves’).

Regarding claim 8, Stojevic teaches the method according to claim 4. Stojevic further teaches sampling vectors based on a probability distribution representing the latent variable region space ([0163]: "…samples of real molecules are fed to Discriminator D; molecules are represented as tensor networks T..." ; [0217]: "...a generative model G that captures the training dataset distribution and (b) a discriminative model D that estimates the probability that a sample came from the training dataset rather than G,"; Stojevic teaches sampling a plurality of molecules represented as tensor networks (i.e. vectors), using a distribution of the training dataset (i.e. based on a probability distribution representing the latent variable region space)).
Stojevic and Def do not expressly teach:
generating an embedded control input by randomizing a control input comprising information to generate the latent variable vector in the region of the latent variable region space
applying the embedded control input to each of the sampling vectors
generating the latent variable vector using a weighted sum of the sampled vectors to which the embedded control input is applied.
Graves does expressly teach:
generating an embedded control input by randomizing a control input comprising information to generate the latent variable vector in the region of the latent variable region space ((Section 3): "Associative compression networks (ACNs) are similar to VAEs, except the prior for each x is now conditioned on the distribution q(z|x̂) used to encode some neighboring datum x̂. We used a unit variance, diagonal Gaussian for all encoding distributions, meaning that q(z|x) is entirely described by its mean vector E_{z∼q(z|x)}[z], which we refer to as the code c for x. Given c, we randomly pick ĉ, the code for x̂, from KNN(x), the set of K nearest Euclidean neighbors to c among all the codes for the training data. We then pass ĉ to the prior network to obtain the conditional prior distribution p(z|ĉ) and hence determine the KL cost,"; Graves teaches randomly picking an input to a prior network (i.e. generating an embedded control input by randomizing a control input) to obtain a distribution (i.e. comprising information to generate the latent variable vector in the region of the latent variable region space))
applying the embedded control input to each of the sampling vectors ((Section 2): "The encoder receives observable data x as input and emits as output a data-conditional distribution q(z|x) over latent vectors z. A sample z ∼ q is drawn from this distribution..."; (Section 3): "We then pass ĉ to the prior network to obtain the conditional prior distribution p(z|ĉ) and hence determine the KL cost,"; Graves teaches applying a value (i.e. embedded control input) to a distribution, from which sample vectors are pulled (i.e. applying…to each of the sampling vectors))
generating the latent variable vector using a weighted sum of the sampled vectors to which the embedded control input is applied ((Section 4): "The encoding distribution q(z|x) was always a unit variance Gaussian with mean specified by the output of the encoder network. The dimensionality of z was 16 for binarized MNIST..."; (Section 4.1): "For the binarized MNIST experiments the ACN encoder had five convolutional layers..."; (Section 2): "The encoder receives observable data x as input and emits as output a data-conditional distribution q(z|x) over latent vectors z. A sample z ∼ q is drawn from this distribution..."; (Section 3): "We then pass ĉ to the prior network to obtain the conditional prior distribution p(z|ĉ) and hence determine the KL cost,"; Graves teaches a Gaussian (i.e. weighted sum of the sampled vectors) used in the encoding process (i.e. generating the latent variable vector). Graves also teaches applying a value to the distribution (i.e. to which the embedded control input is applied)).
Stojevic, Def, and Graves are analogous art because they are from the same field of endeavor in neural networks. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Stojevic, Def,  and Graves before him or her to modify the sampling of Stojevic to include the use of an embedded control input as in Graves, obtaining the advantage of reducing sampling noise (Graves; (Section 3.2): “Note that in order to reduce sampling noise we use the mean codes c as latent for the reconstructions, rather than samples from N(c; 1)…”).

Regarding claim 9, Stojevic, Def, and Graves teach the method of claim 8. Stojevic does not teach the control input comprises a vector having a dimension that is same as a dimension of the latent variable vector.
Def teaches the control input comprises a vector having a dimension that is same as a dimension of the latent variable vector (in 0087: Turning to the text generation component 220, one embodiment uses a VAE encoder/decoder model. VAEs are  generative models based upon a regularized autoencoder. Instead of just encoding the mapping from inputs to outputs, the VAE internally breaks the representation (i.e. the control input comprises a vector having a dimension that is same as a dimension of the latent variable vector) into a prior distribution and a learned posterior model… Encoder 1010 is a variational inference network, mapping observed inputs (i.e. the control input comprises a vector having a dimension that is same as a dimension of the latent variable vector)  to posterior distributions over latent space…; And the use of the claimed multi-dimensional vector data, in 0004: One aspect of a model is that as a mapping of many-valued inputs to many-valued outputs (i.e. the control input comprises a vector having a dimension that is same as a dimension of the latent variable vector), it is not limited to discrimination between existing inputs, but can be used to predict the mapping of a new, never-before seen input to the set of outputs given the model… In this case, the model maps multi-dimensional inputs (i.e. the control input comprises a vector having a dimension that is same as a dimension of the latent variable vector) onto a distribution of possible outputs…)
The Stojevic and Def references would have been recognized by those of ordinary skill in the art as useful for applicant’s purpose in developing information processing and retrieval techniques using neural network models. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the prior art for processing multi-dimensional data using a model maps of a autoencoder neural network as disclosed by Def with the method of information processing and retrieval techniques using encoder and decoder neural network models as disclosed by Stojevic.
One of ordinary skill in the art would have been motivated to combine the disclosed methods of Stojevic and Def in order to provide and use a result of inference that allows a network model to discriminate between existing inputs and predict the mapping of new inputs (Def, 0004). Doing so helps to improve learning by allowing the use of inferred results that capture external objectives by applying contextual information to the decoder output (Def, 0009), and allows for the minimization of error between the model distributions of the expected and observed outputs associated with the model’s mapping (Def, 0004).

Additionally, Graves teaches the control input comprises a vector having a dimension that is same as a dimension of the latent variable vector ((Section 3): "We used a unit variance, diagonal Gaussian for all encoding distributions, meaning that q(z|x) is entirely described by its mean vector E_{z∼q(z|x)}[z], which we refer to as the code c for x. Given c, we randomly pick ĉ, the code for x̂, from KNN(x)..."; Graves teaches a distribution (i.e. control input) that is described by the mean vector of a latent vector, which requires the distribution to share a dimension with the latent vector (i.e. a vector having a dimension that is same as a dimension of the latent variable vector)).
Stojevic, Def, and Graves are analogous art because they are from the same field of endeavor in neural networks. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Stojevic, Def, and Graves before him or her to modify the control input of Stojevic and Def to include the shared dimension as in Graves, obtaining the advantage of describing the distribution with a vector (Graves; (Section 3): “…meaning that q(z|x) is entirely described by its mean vector E_{z∼q(z|x)}[z]…”).

Regarding claim 24, Stojevic teaches the apparatus according to claim 23. The claim limitations are similar to the claim 8 limitations and are thus rejected under the same rationale as the claim 8 limitations.

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Stojevic in view of Def, in further view of Vigen (U.S. Patent Application Publication No. 20090198488-A1).

Regarding claim 10, Stojevic in combination with Def teaches the method according to claim 3. Stojevic does not teach:
the input is an utterance of a user not intended to get a specific response in a conversation, and the plurality of responses are different responses to the utterance.
Def does teach: the input is an utterance of a user not intended to get a specific response in a conversation, and the plurality of responses are different responses to the utterance. (Def teaches in 0060-0061: … In this embodiment, the named entity recognition (i.e. and the plurality of responses are different responses to the utterance) component receives each sentence or group of sentences (i.e. the input is an utterance of a user not intended to get a specific response in a conversation) and uses a processor to tag the words according to the part of speech (4a) and identify particular noun phrases (4b) (i.e. and the plurality of responses are different responses to the utterance) within the input… In one embodiment, disambiguation component 330 performs Bayesian inference using the marginal likelihood of two different models correctly predicting the associated data:… Then, the features are used to train the classifier, which learns to disambiguate entities in the text. In one embodiment, this is done as a form of supervised learning, where known information (or information that has a high-enough likelihood of being correct) is used to inform the probabilities of each particular assertion ascertainable within the text…)
 It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Stojevic and Def for the same reasons disclosed above.
Additionally, Vigen teaches:
the input is an utterance of a user not intended to get a specific response in a conversation ([0127]: "...microphones...voice recognizers...devices for input and/or output. The CPU 202 may acquire communications, instructions and/or data for implementing communications analysis through the input/output bus 210,"; [0008]: "These profiles may then be utilized by the system to generate responsive communications that are selected based upon the communicator's preferences as interpreted from the attributes,"; Vigen teaches an input that can be collected as spoken audio (i.e. utterance of a user) and can have multiple responses generated for it (i.e. not intended to get a specific response in conversation))
and the plurality of responses are different responses to the utterance ([0008]: "These profiles may then be utilized by the system to generate responsive communications that are selected based upon the communicator's preferences as interpreted from the attributes,"; Vigen teaches multiple different responses to the input (i.e. utterance) can be generated).
Stojevic, Def, and Vigen are analogous art because they are from the same field of endeavor in machine learning. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Stojevic, Def, and Vigen before him or her to modify the encoding, decoding, and response of Stojevic to include receiving audio inputs as in Vigen, obtaining the advantage of allowing Stojevic’s and Def’s methods to analyze spoken audio (Vigen; [0127]: “The CPU 202 may acquire communications, instructions and/or data for implementing communications analysis through the input/output bus 210,”).

Claims 14-19 are rejected under 35 U.S.C. 103 as being unpatentable over Stojevic in view of Graves (Graves, Alex, Jacob Menick, and Aaron van den Oord. "Associative compression networks for representation learning." arXiv preprint arXiv:1804.02476 (2018)).

Regarding claim 14, Stojevic teaches:
obtaining a training input ([0008]: "...a tensor network representation of molecular quantum states of a dataset of small, drug-like molecules is provided as an input to a machine learning system…"; Stojevic teaches obtaining an input (i.e. training input))
generating a latent variable by applying the training input to an encoder ([0171]: "The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x),"; Stojevic teaches using an autoencoder to encode an input (i.e. training input) into a set of latent variables (i.e. generating a latent variable))
generating a training latent variable vector of a region corresponding to the control input in a latent variable region space corresponding to the latent variable ([0171]: “The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x). The decoder network samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimised to reproduce the inputs,”; Stojevic teaches encoding an input into a set of latent variables in the latent space (i.e. generating a training latent variable vector of a region corresponding to the control input). The space is associated with the encoded set such that it reproduces the input when decoded (i.e. a latent variable region space corresponding to the latent variable))
generating an output response by applying the training latent variable vector to a decoder ([0026]: "...to decode this vector to reproduce the input as accurately as possible,"; Stojevic teaches decoding a vector (i.e. applying the training latent variable vector to a decoder) to reproduce an input (i.e. generating an output response))
training neural networks of the encoder and the decoder based on the output response and the training response ([0147]: "The weights in the neural network, or constituent tensors in a tensor network, are optimised to minimize the difference between outputs 118 and inputs 114,"; Stojevic teaches updating neural network weights (i.e. training neural networks of the encoder and the decoder) to minimize difference between the output and input (i.e. based on the output response and the training response)).
Stojevic does not teach:
obtaining a training response from among training responses to the training input
obtaining a control input corresponding to the training response from among control inputs corresponding to the training responses, respectively
Graves teaches:
obtaining a training response from among training responses to the training input ((Section 2): "The encoder receives observable data x as input and emits as output a data-conditional distribution q(z|x) over latent vectors z. A sample z ∼ q is drawn from this distribution and used by the decoder to determine a code-conditional reconstruction distribution r(x|z) over the original data,"; Graves teaches taking a sample from a distribution based on an input, which is used to determine a response (i.e. obtaining a training response from among training responses to the training input))
obtaining a control input corresponding to the training response from among control inputs corresponding to the training responses, respectively ((Section 3): "Given c, we randomly pick ĉ, the code for x̂, from KNN(x), the set of K nearest Euclidean neighbors to c among all the codes for the training data,"; Graves teaches selecting a code (i.e. obtaining a control input…from among control inputs) associated with the training data (i.e. corresponding to the training response))
Stojevic and Graves are analogous art because they are from the same field of endeavor in neural networks. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Stojevic and Graves before him or her to modify the encoder and decoder networks of Stojevic to include the organization of inputs and responses as in Graves, obtaining the advantage of preventing diminished information (Graves; (Section 5): “Our experiments show that the latent representations learned by ACNs contain meaningful, high-level information that is not diminished by the use of autoregressive decoders,”).
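For illustration only, the encode, sample, and decode sequence quoted from Stojevic [0171] can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from either reference: the linear toy encoder and decoder, the weight matrices, the function names, and the reparameterised sampling z = μ + σ·ε are all hypothetical choices made to keep the example self-contained.

```python
import math
import random

def encode(x, w_mu, w_logvar):
    """Toy linear encoder (hypothetical): returns per-dimension mean and std deviation."""
    mu = [sum(w * xi for w, xi in zip(row, x)) for row in w_mu]
    sigma = [math.exp(0.5 * sum(w * xi for w, xi in zip(row, x))) for row in w_logvar]
    return mu, sigma

def sample_latent(mu, sigma, rng):
    """Assumed reparameterised sample: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

def decode(z, w_dec):
    """Toy linear decoder (hypothetical): maps a latent vector back to input space."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in w_dec]

rng = random.Random(0)
x = [1.0, 2.0, 3.0]                              # training input
w_mu = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]]        # 3 inputs -> 2 latent means
w_logvar = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]    # log-variance 0, so sigma = 1
w_dec = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # 2 latent dims -> 3 outputs

mu, sigma = encode(x, w_mu, w_logvar)            # latent variables mu(x), sigma(x)
z = sample_latent(mu, sigma, rng)                # latent variable vector
x_out = decode(z, w_dec)                         # output response x'
```

In this sketch the pair (mu, sigma) plays the role of the latent variables, z the latent variable vector, and x_out the output response; an actual autoencoder would learn its weights rather than fix them by hand.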

Regarding claim 15, Stojevic and Graves teach the method according to claim 14. Stojevic further teaches:
the training latent variable vector is a multidimensional vector comprising information variables latent to generate a response to the training input ([0149]: "...the latent space might be a tensorial object, or a simple vector (which is the usual setup in an autoencoder), or some other mathematical construct such as a graph. The output determined by a given element of the latent space (and in particular the optimal element of the latent space) will in general not be a part of the original dataset,"; Stojevic teaches a tensorial object (i.e. vector) that gives an output (i.e. response to an input) which is comprised of several elements (i.e. variables))
the control input is information to induce generation of a latent variable vector in a region of the latent variable region space ([0139]: "The tensor network used to represent interesting regions of the exponentially large space needs to be determined using an intelligent prior based on available data,"; Stojevic teaches an intelligent prior (i.e. control input) used to determine a tensor network (i.e. induce generation of a latent variable vector) that represents an interesting region of the space (i.e. in a region of the latent variable region space)).

Regarding claim 16, Stojevic and Graves teach the method according to claim 14. Stojevic further teaches the latent variable region space is partitioned into regions corresponding to the control inputs ([0149]: "...the generative tensorial approach described here will explore regions of the huge space of possible compounds not accessible to other methods. The output data may alternatively or additionally be a filtered version of the input data, corresponding to a smaller number of data points,"; Stojevic teaches a space that has regions (i.e. is partitioned into regions), and the data that comes from it can be filtered corresponding to an input, in this case a number of data points (i.e. corresponding to control inputs)).

Regarding claim 17, Stojevic and Graves teach the method according to claim 14. Stojevic further teaches sampling vectors based on a probability distribution representing the latent variable region space ([0163]: "…samples of real molecules are fed to Discriminator D; molecules are represented as tensor networks T..." ; [0217]: "...a generative model G that captures the training dataset distribution and (b) a discriminative model D that estimates the probability that a sample came from the training dataset rather than G,"; Stojevic teaches sampling a plurality of molecules represented as tensor networks (i.e. vectors), using a distribution of the training dataset (i.e. based on a probability distribution representing the latent variable region space)).
Stojevic does not teach:
generating an embedded control input by randomizing the control input
applying the embedded control input to each of the sampled vectors
generating a training latent variable vector using a weighted sum of the sampled vectors to which the embedded control input is applied.
Graves teaches:
generating an embedded control input by randomizing the control input (Section 3: "Associative compression networks (ACNs) are similar to VAEs, except the prior for each x is now conditioned on the distribution q(z|x̂) used to encode some neighboring datum x̂. We used a unit variance, diagonal Gaussian for all encoding distributions, meaning that q(z|x) is entirely described by its mean vector E_{z∼q(z|x)}[z], which we refer to as the code c for x. Given c, we randomly pick ĉ, the code for x̂, from KNN(x), the set of K nearest Euclidean neighbors to c among all the codes for the training data. We then pass ĉ to the prior network to obtain the conditional prior distribution p(z|ĉ) and hence determine the KL cost,"; Graves teaches randomly selecting a code (i.e. randomizing the control input), and using that code as an input (i.e. generating an embedded control input))
applying the embedded control input to each of the sampled vectors ((Section 2): "The encoder receives observable data x as input and emits as output a data-conditional distribution q(z|x) over latent vectors z. A sample z ∼ q is drawn from this distribution..."; (Section 3): "We then pass ĉ to the prior network to obtain the conditional prior distribution p(z|ĉ) and hence determine the KL cost,"; Graves teaches applying a value (i.e. embedded control input) to a distribution, from which sample vectors are pulled (i.e. applying…to each of the sampled vectors))
generating a training latent variable vector using a weighted sum of the sampled vectors to which the embedded control input is applied ((Section 4): "The encoding distribution q(z|x) was always a unit variance Gaussian with mean specified by the output of the encoder network. The dimensionality of z was 16 for binarized MNIST..." (Section 4.1): "For the binarized MNIST experiments the ACN encoder had five convolutional layers..."; Graves teaches a Gaussian (i.e. weighted sum of the sampled vectors) used in the encoding process (i.e. generating a training latent variable vector). Graves also teaches applying a value to the distribution (i.e. to which the embedded control input is applied)).
Stojevic and Graves are analogous art because they are from the same field of endeavor in neural networks. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Stojevic and Graves before him or her to modify the sampling of Stojevic to include the use of an embedded control input as in Graves, obtaining the advantage of conserving computing resources (Graves; (Abstract): “Since the prior need only account for local, rather than global variations in the latent space, the coding cost is greatly reduced, leading to rich, informative codes”).
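For illustration only, the neighbor selection quoted from Graves, Section 3 (randomly picking ĉ from the K nearest Euclidean neighbors of c among the training codes) can be sketched as below. This is a minimal sketch under stated assumptions: the function names and the toy two-dimensional codes are hypothetical, and excluding c itself from its own neighbor set is an assumption of this sketch rather than something stated in the quoted passage.

```python
import math
import random

def k_nearest_codes(c, codes, k):
    """Return the k codes closest to c in Euclidean distance.

    Assumption: c itself is excluded from its own neighbor set.
    """
    others = [code for code in codes if code is not c]
    others.sort(key=lambda code: math.dist(c, code))
    return others[:k]

def pick_neighbor_code(c, codes, k, rng):
    """Uniformly pick one code c_hat from the k nearest neighbors of c."""
    return rng.choice(k_nearest_codes(c, codes, k))

# Hypothetical 2-D codes for five training examples.
codes = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (6.0, 5.0)]
c = codes[0]

neighbors = k_nearest_codes(c, codes, k=2)
c_hat = pick_neighbor_code(c, codes, k=2, rng=random.Random(0))
```

In Graves the selected code is then passed to a prior network to obtain the conditional prior; that network is outside the scope of this sketch.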

Regarding claim 18, Stojevic and Graves teach the method according to claim 14. Stojevic further teaches a value of a loss function comprising a difference between the training response and the output response is minimized ([0025]: "The term ‘cost function’ preferably connotes a mathematical function representing a measure of performance of an artificial neural network, or a tensor network, in relation to a desired output. The weights in the network are optimised to minimize some desired cost function,"; Stojevic teaches a function (i.e. a loss function) that measures the relation between the performance of a network and the desired output of that network (i.e. the difference between the training response and the output response), and that the network is updated to minimize the value of this function).
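For illustration only, minimizing a cost function that measures the difference between a desired output and the network output can be sketched as below. This is a minimal sketch under stated assumptions: the single scalar weight, the squared-error cost, the learning rate, and the function names are hypothetical and are not drawn from Stojevic, which describes the cost function only in general terms.

```python
def squared_error(target, output):
    """Sum of squared differences between the training response and the output response."""
    return sum((t - o) ** 2 for t, o in zip(target, output))

def gradient_step(w, x, target, lr):
    """One gradient-descent update for the toy model output = w * x.

    d/dw sum((t - w*x)^2) = -2 * sum(x * (t - w*x))
    """
    grad = -2.0 * sum(xi * (t - w * xi) for xi, t in zip(x, target))
    return w - lr * grad

x = [1.0, 2.0]          # training input
target = [2.0, 4.0]     # training response
w = 0.0                 # initial weight (hypothetical single-parameter model)

loss_before = squared_error(target, [w * xi for xi in x])
w = gradient_step(w, x, target, lr=0.05)
loss_after = squared_error(target, [w * xi for xi in x])
```

With these values the update moves the weight toward the value that reproduces the target, so the cost after the step is lower than before it.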

Regarding claim 19, Stojevic and Graves teach the method according to claim 14. Stojevic further teaches a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the training method of claim 14 ([0274]: "...a computer readable medium having stored thereon a program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein,"; "As used herein, means plus function features may be expressed alternatively in terms of their corresponding structure, such as a suitably programmed processor and associated memory,"; Stojevic teaches a computer readable storage medium storing a program for running methods (i.e. instructions), and a processor that runs the program).

Claims 25-27 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Stojevic in view of Zadeh (U.S. Patent No 20180204111-A1).
Regarding claim 25, Stojevic teaches:
a memory configured to store a latent variable region space partitioned into regions corresponding to responses ([0139]: "The tensor network used to represent interesting regions of the exponentially large space needs to be determined using an intelligent prior based on available data,"; [0274]: "As used herein, means plus function features may be expressed alternatively in terms of their corresponding structure, such as a suitably programmed processor and associated memory,"; Stojevic teaches using a memory in its structure. Stojevic also teaches determining regions of a space (i.e. space partitioned into regions) using intelligent priors (i.e. corresponding to responses))
a processor configured to: ([0128]: "...a processor for processing the chemical compound dataset to determine a tensorial space for said chemical compound dataset…"; Stojevic teaches a processor)
encode the input using a neural network-based encoder to generate a latent variable vector in the latent variable region space ([0026]: "...trained to encode the input to a small dimensional vector in the latent space…"; [0139]: "The tensor network used to represent interesting regions of the exponentially large space needs to be determined using an intelligent prior based on available data,"; Stojevic teaches encoding an input to a vector in a region of the latent space, and that this space has different interesting regions (i.e. encode the input using a neural network-based encoder to generate a latent variable vector in the latent variable region space)). Fig. 5 and Fig. 21 of Stojevic depict the claimed encoding of the input using a neural network-based encoder to generate a latent variable vector in the latent variable region space:

[Image: media_image1.png, greyscale]

[Image: media_image2.png, greyscale]


In [0147]-[0148]: FIG. 5 is a schematic diagram showing how data is processed in the method 100 to train the autoencoder. Input data 114 is encoded into a chosen latent space 116 using a neural network or tensor network… In one embodiment, input data in the form of a tensor is received. A tensor network provided as an autoencoder is used to encode the input data into a complete-graph tensor network latent space…; and as a variational autoencoder in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x) (i.e. encode the input using a neural network-based encoder to generate a latent variable vector in the latent variable region space)…
decode the latent variable vector using a neural network-based decoder to generate a response corresponding to a region from among the regions ([0026]: "...to decode this vector to reproduce the input as accurately as possible,"; [0145]: "...the data is decoded using the neural network (or tensor data),"; Stojevic teaches decoding the vector from a region of the latent space to produce (i.e. generate) an output that reproduces the input. The output is thus linked with the vector (i.e. corresponding to a region). The claimed decoding is depicted in Fig. 5 and Fig. 21, which show the latent variables being decoded to generate the depicted output. In [0147]-[0148]: FIG. 5 is a schematic diagram showing how data is processed in the method 100 to train the autoencoder. Input data 114 is encoded into a chosen latent space 116 using a neural network or tensor network… In one embodiment, input data in the form of a tensor is received. A tensor network provided as an autoencoder is used to encode the input data into a complete-graph tensor network latent space…; and as a variational autoencoder in [0171]: In FIG. 21, we show the operation of a Variational Auto-Encoder (VAE): The standard VAE first encodes an input x into a set of latent variables μ(x) (i.e. encoding the input using a neural network-based encoder to a latent variable vector in a latent variable region space partitioned into regions), … The decoder network (i.e. decode the latent variable vector using a neural network-based decoder to generate a response corresponding to a region from among the regions) samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimized to reproduce the inputs x maximally well, given a small latent space z...).
Stojevic does not teach:
a sensor configured to receive an input from a user
output the response through a user interface.
Zadeh teaches:
a sensor configured to receive an input from a user ([0729]: "The Z-mouse is for example provided through a user interface on a computing device or other controls such as sliding/knob type controls, to control the position and size of an f-mark,"; Zadeh teaches providing an input through a user interface via computing device or other controls (i.e. a sensor configured to receive an input from a user))
output the response through a user interface ([1810]: "...the results from above, which is connected to output module, e.g. printout or computer monitor or display or any graphic or table or list generator, for the user to use or see…"; Zadeh teaches an output module that allows the user to see results (i.e. output the response)).
Stojevic and Zadeh are analogous art because they are from the same field of endeavor in neural networks. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Stojevic and Zadeh before him or her to modify the encoding and decoding processes of Stojevic to include the user interface as in Zadeh, obtaining the advantage of allowing the user to interact with the implementation (Zadeh; [0729]: "The Z-mouse is for example provided through a user interface on a computing device or other controls such as sliding/knob type controls, to control the position and size of an f-mark,"; [1810]: "...the results from above, which is connected to output module, e.g. printout or computer monitor or display or any graphic or table or list generator, for the user to use or see…").

Regarding claim 26, Stojevic and Zadeh teach the apparatus according to claim 25. Stojevic further teaches:
encode the input to generate a latent variable ([0026]: "...trained to encode the input to a small dimensional vector in the latent space…"; Stojevic teaches encoding the input and generating a latent vector (i.e. latent variable))
partition the latent variable region space into the regions corresponding to control inputs ([0149]: "...the generative tensorial approach described here will explore regions of the huge space of possible compounds not accessible to other methods. The output data may alternatively or additionally be a filtered version of the input data, corresponding to a smaller number of data points,"; Stojevic teaches that the space has regions (i.e. the space is partitioned) and that the data that comes from it can be filtered based on data points (i.e. control inputs that correspond to responses))
select a control input, from the control inputs, corresponding to the latent variable ([0134]: “Tensor networks enable intelligent priors to be picked that, in turn, restrict the search to the space of physically relevant elements…"; [0171]: “The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x). The decoder network samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimised to reproduce the inputs,”; Stojevic teaches a prior distribution (i.e. control input) associated with (i.e. corresponding to) samples from a latent space (i.e. latent variable[s]), used in the encoding and decoding operations. It also teaches selecting such priors)
generate the latent variable vector from the region of the latent variable region space corresponding to the control input ([0149]: “…the latent space might be a tensorial object, or a simple vector (which is the usual setup in an autoencoder), or some other mathematical construct such as a graph. The output determined by a given element of the latent space…”; [0171]: “The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x). The decoder network samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimised to reproduce the inputs,”; Stojevic teaches an autoencoder that encodes a set of latent variables (i.e. the latent variable vector), then samples the latent space from a prior Gaussian distribution (i.e. the region corresponding to the selected control input based on a probability distribution) and decodes an output (i.e. generating the latent variable vector belonging to the region)).

Regarding claim 27, Stojevic and Zadeh teach the apparatus according to claim 26. Stojevic further teaches the control input is configured to randomly correspond to any one of the regions ([0461]: "Starting from a random n−1×n−1 dimensional orthogonal matrix, a random n×n dimensional orthogonal matrix can be constructed by taking a randomly distributed n-dimensional vector, constructing its Householder transformation, and then applying the n−1 dimensional matrix to this vector").

Regarding claim 29, Stojevic and Zadeh teach the apparatus according to claim 26. Stojevic further teaches:
the neural network-based encoder implementing a first neural network to receive the input at an input layer of the first neural network, and an output layer of the first neural network corresponding to a mean and a variance of a probability distribution modeling the latent variable ([0026]: "The term 'autoencoder' preferably connotes an artificial neural network having an output in the same form as the input, trained to encode the input to a small dimensional vector in the latent space, and to decode this vector to reproduce the input as accurately as possible,”; Fig. 21, [0171]: “The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x). The decoder network samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimised to reproduce the inputs,”; Stojevic teaches a VAE neural network that accepts an input, and produces an output that is a reproduction of the input (i.e. modeling the variable). The output is produced with the use of a Gaussian (i.e. mean and variance of a probability); the claimed neural network-based encoder is depicted in Fig. 5 and Fig. 21 and described in [0171])
the neural network-based decoder implementing a second neural network to receive the latent variable vector at an input layer of the second neural network, and an output layer of the second neural network corresponding to the response ([0026]: "The term 'autoencoder' preferably connotes an artificial neural network having an output in the same form as the input, trained to encode the input to a small dimensional vector in the latent space, and to decode this vector to reproduce the input as accurately as possible,”; [0171]: “The standard VAE first encodes an input x into a set of latent variables μ(x), σ(x). The decoder network samples the latent space from a prior distribution p(z), usually a Gaussian, and decodes to an output x'. The network is optimised to reproduce the inputs,”; Stojevic teaches a VAE neural network decoder that takes an input from the latent space (i.e. latent variable vector) and produces an output corresponding to the input to the encoder (i.e. output response); the claimed neural network-based decoder is depicted in Fig. 5 and Fig. 21 and described in [0171]).

Claim 28 is rejected under 35 U.S.C. 103 as being unpatentable over Stojevic and Zadeh in view of Vigen.

Regarding claim 28, Stojevic and Zadeh teach the apparatus according to claim 26. Vigen further teaches the control input corresponds to any one or any combination of keywords, sentiment of the user, attitude of the user, directive of the user, and guidance of the user ([0050]: "After the received communication is placed into a recognizable and parsable form, the received communication is separated or parsed in process block 102 into individual communication elements, such as words, phrases or groups,"; [0231]: "For example, the system could recognize those different goals, voice (tone), motivation, etc. of a communication and generate responses different to each sender group based upon the attributes and patterns of the communication and of each sender group,"; Vigen teaches the system accepting words, phrases or groups (i.e. keywords), as well as goals, tone, and motivation (i.e. sentiment of the user, attitude of the user, directive of the user, and guidance of the user), as data the system could recognize and operate with (i.e. a control input)).
Stojevic, Zadeh and Vigen are analogous art because they are from the same field of endeavor in computing. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Stojevic, Zadeh and Vigen before him or her to modify the control input of Stojevic and Zadeh to accept keywords and user sentiment, attitude, directive, and guidance as in Vigen, obtaining the advantage of being able to operate on such data (Vigen; [0127]: “The CPU 202 may acquire communications, instructions and/or data for implementing communications analysis through the input/output bus 210,”).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Bahuleyan et al. (NPL: “Variational Attention for Sequence-to-Sequence Models”): teaches the inherent property of a variational autoencoder (VAE) of mapping to regions, as “VAE populates hidden representations to a region (instead of a single point), making it possible to generate diversified data from the vector space (Bowman et al., 2016) or even control the generated samples”, and the VAE's use of encoder and decoder neural networks to map the latent vector space z using a probability distribution.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to OLUWATOSIN ALABI whose telephone number is (571)272-0516. The examiner can normally be reached Monday-Friday, 8:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley, can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/O.O.A./Examiner, Art Unit 2129                                                                                                                                                                                                        
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129