DETAILED ACTION
1.	This Office action is responsive to the reply filed 8/15/2022. Claims 1-20 are pending in the application and have been examined.

Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
3.	Applicant’s arguments with respect to the Olabiyi references have been fully considered and are persuasive. The rejections under 35 U.S.C. 103 based on these references are withdrawn. New rejections are detailed below.

Claim Rejections - 35 USC § 103
4.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

5.	Claims 1-5, 7, 8, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent App. Pub. No. 20190130903 (Sriram et al., hereinafter “Sri”) in view of U.S. Patent App. Pub. No. 20190385019 (Bazrafkan et al., hereinafter “Baz”) and U.S. Patent App. Pub. No. 20200152184 (Steedman Henderson et al., hereinafter “Steed”).
With regard to Claim 1, Sri teaches:
initialize a machine classifier having a deep neural network architecture and a plurality of machine classifier parameters, wherein the deep neural network architecture comprises an encoder, a generator, a discriminator, and an output layer; (Figure 7 shows the generator 30 and discriminator 70, as described in paragraphs 106 and 107.  Figure 9 shows the encoder 101 and decoder 103 output, as described in paragraphs 141 and 142.  Paragraph 83 describes that an input data is classified.)
train the machine classifier, based on a training set comprising a plurality of examples, to refine the plurality of machine classifier parameters, wherein training the machine classifier using an example comprises: (Paragraph 84 describes training the classifier.)
generating, by the encoder, an encoded input based on the example; (Paragraph 141 describes that the encoder pools the input data.)
generating, by the generator, a generator response based on the encoded input; (Paragraph 105 describes that generator G generates realistic samples of the target domain Y from the source domain X.) and
generate, by the output layer, one or more class labels based on an input data set using the trained machine classifier.  (Paragraph 168 describes that the model provides a classification result.)
However, Sri does not explicitly describe:
“one or more processors; and 
memory storing instructions that, when executed by the one or more processors, cause the computing device to:
…
generating, by the discriminator, discriminator feedback based on the encoded input and the generator response; 
updating the plurality of machine classifier parameters based on minimizing an average gradient of a loss function calculated based on a first weight determined based on the discriminator feedback and a second weight determined based on the generated response.”
However, Baz describes:
 “generating, by the discriminator, discriminator feedback based on the encoded input and the generator response.” (Paragraph 28 describes that discriminator error can be backpropagated through the generator (feedback).)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the feedback between a generator and discriminator of Baz into the system of Sri to provide superior GAN results, as described in paragraph 28 of Baz.
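For illustration only, the kind of feedback referred to above (discriminator error backpropagated through the generator) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Baz or Sri: the one-parameter generator, the logistic discriminator, the non-saturating generator loss, and all names and numeric values are hypothetical.

```python
import math


def sigmoid(t):
    """Logistic function used by the hypothetical discriminator."""
    return 1.0 / (1.0 + math.exp(-t))


def discriminator(x, w, b):
    """Hypothetical discriminator: probability that sample x is real."""
    return sigmoid(w * x + b)


def generator_loss(a, z, w, b):
    """Generator loss -log D(G(z)) for the hypothetical generator G(z) = a * z."""
    return -math.log(discriminator(a * z, w, b))


def generator_gradient(a, z, w, b):
    """Backpropagate the discriminator error through the generator parameter a."""
    d_out = discriminator(a * z, w, b)
    # Chain rule: d(-log D)/d(fake) = -(1 - D) * w, and d(fake)/da = z.
    dloss_dfake = -(1.0 - d_out) * w
    return dloss_dfake * z


# Assumed values chosen only for the example.
a, z, w, b = 0.5, 1.5, 2.0, -1.0
grad = generator_gradient(a, z, w, b)
a_updated = a - 0.1 * grad  # one gradient step on the generator parameter
```

In this sketch the discriminator output is the only training signal the generator receives, which is the sense in which the error is fed back.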
However, Sri in view of Baz does not explicitly describe:
“one or more processors; and 
memory storing instructions that, when executed by the one or more processors, cause the computing device to:
…
updating the plurality of machine classifier parameters based on minimizing an average gradient of a loss function calculated based on a first weight determined based on the discriminator feedback and a second weight determined based on the generated response.”
Steed describes:
 “one or more processors; (paragraph 120, processor 3) and 
memory storing instructions (paragraph 120, a computer program 5 is stored in non-volatile memory.) that, when executed by the one or more processors, cause the computing device to:
…
updating the plurality of machine classifier parameters based on minimizing an average gradient of a loss function calculated based on a first weight determined based on the discriminator feedback and a second weight determined based on the generated response”
In this regard, paragraph 554 of Steed states “In general, gradient descent based optimizers update the parameter in the direction of steepest descent of the loss function with respect to the parameter, scaled by a learning rate. The parameters are replaced with the new values and the process iterates with another slot value combination for the dialogue turn for example.”  (“update the parameter in the direction of steepest descent of the loss function with respect to the parameter” is cited as minimizing a gradient of a loss function calculated based on the generated response.)
Further, paragraph 555 describes that “The average of the gradient values may be taken across the batch, and the update performed using the batch average values.”  This describes minimizing the average gradient of a loss function calculated based on the generated response.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the minimum average gradient of a loss function of Steed into the system of Sri in view of Baz to optimize update of the model parameters, as described in paragraph 556 of Steed.
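For illustration only, the batch-averaged update quoted from paragraphs 554 and 555 of Steed can be sketched as follows. This is a minimal sketch under stated assumptions: the one-parameter linear model, the squared-error loss, the learning rate, and the data are hypothetical and do not appear in Steed.

```python
def per_example_gradient(w, x, y):
    """Gradient of the squared-error loss (w * x - y) ** 2 with respect to w."""
    return 2.0 * (w * x - y) * x


def batch_average_update(w, batch, learning_rate):
    """One gradient descent step using the average of the gradients across the batch."""
    grads = [per_example_gradient(w, x, y) for x, y in batch]
    avg_grad = sum(grads) / len(grads)
    # Move against the averaged gradient, scaled by the learning rate.
    return w - learning_rate * avg_grad


# Hypothetical batch generated by y = 2 * x.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
for _ in range(100):
    w = batch_average_update(w, batch, learning_rate=0.05)
```

With these assumed values the parameter converges toward 2.0, the slope that generated the hypothetical data.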
With regard to Claim 2, Sri does not explicitly describe “wherein the instructions, when executed by the one or more processors, further cause the computing device to generate the discriminator feedback based on a ground truth label associated with the encoded input and at least one previous response generated by the generator based on the encoded input.”  However, Figure 1 of Sri shows that the ground truth X is input into the discriminator.
Further, paragraph 28 of Baz describes that discriminator error can be backpropagated (fed back) through the generator.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the feedback between a generator and discriminator of Baz into the system of Sri to provide superior GAN results, as described in paragraph 28 of Baz.
With respect to Claim 3, Sri in view of Baz does not explicitly describe “wherein the instructions, when executed by the one or more processors, further cause the computing device to generate the generator response for an example by using the discriminator feedback to weight a cross-entropy loss for each of a set of candidate responses generated by the generator and selecting the candidate response with the lowest loss as the generator response.”  
Steed describes, in paragraph 554, using a cross-entropy loss function and updating the parameter in the direction of steepest descent of the loss function with respect to the parameter.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the cross entropy loss function of Steed into the system of Sri in view of Baz to optimize update of the model parameters, as described in paragraph 556 of Steed.
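For illustration only, the selection step recited in Claim 3 (weighting a cross-entropy loss for each candidate response and selecting the candidate with the lowest loss) can be sketched as follows. This is a minimal sketch under stated assumptions: the multiplicative weighting, the candidate names, the probabilities, and the weights are hypothetical and are not drawn from the claims or from any cited reference.

```python
import math


def cross_entropy(probs, target_index):
    """Cross-entropy loss of a predicted distribution against a single target class."""
    return -math.log(probs[target_index])


def select_candidate(candidates, target_index):
    """Return (weighted_loss, name) for the candidate with the lowest weighted loss.

    candidates: list of (name, predicted_probabilities, discriminator_weight).
    The weight is assumed to multiply the loss; other weighting schemes are possible.
    """
    scored = [
        (weight * cross_entropy(probs, target_index), name)
        for name, probs, weight in candidates
    ]
    return min(scored)


# Hypothetical candidates; the weights stand in for discriminator feedback.
candidates = [
    ("a", [0.7, 0.2, 0.1], 1.0),
    ("b", [0.5, 0.3, 0.2], 0.4),
    ("c", [0.1, 0.8, 0.1], 1.0),
]
best_loss, best_name = select_candidate(candidates, target_index=0)
```

In this sketch candidate "b" is selected even though candidate "a" has the lower unweighted loss, because the assumed weight scales the loss of "b" down.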
With regard to Claim 4, Sri describes “wherein the encoded input comprises a single vector representation of the example.”  Paragraph 227 describes that the encoded input is a vector.
With regard to Claim 5, Sri describes “the discriminator comprises a metric encoder.”  Paragraph 30 of Sri describes that the encoder includes a decoder distance enhancer, which is cited as a “metric encoder.”
With regard to Claim 7, Sri in view of Baz does not explicitly describe this subject matter.  However, Steed describes “the deep neural network architecture comprises a feed-forward neural network.”  Paragraph 171 describes that the classifier can include a feed-forward network.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the feed-forward classifier of Steed into the system of Sri in view of Baz in order to generate classification based on current user utterance, the previous dialogue output and the candidate value, as described in paragraph 172 of Steed.
With regard to Claim 8, Sri in view of Baz does not explicitly describe “the deep neural network architecture comprises a convolutional neural network.”  However, paragraph 156 of Steed describes the use of CNNs.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the CNN of Steed into the system of Sri in view of Baz in order to generate a single fixed length feature vector from the input set of vector representations, as described in paragraph 155 of Steed.
With regard to Claim 17, Sri describes:
initializing a machine classifier having a deep neural network architecture and a plurality of machine classifier parameters, wherein the deep neural network architecture comprises an encoder, a generator, and a discriminator; (Figure 7 shows the generator 30 and discriminator 70, as described in paragraphs 106 and 107.  Figure 9 shows the encoder 101 and decoder 103 output, as described in paragraphs 141 and 142.  Paragraph 83 describes that an input data is classified.)
training the machine classifier, based on a training set comprising a plurality of examples, to refine the plurality of machine classifier parameters, (Paragraph 84 describes training the classifier.)
wherein training the machine classifier using an example comprises:
generating, by the encoder, an encoded input based on the example, the encoded input comprising a single vector (Paragraph 141 describes that the encoder pools the input data. Paragraph 30 describes that the input is the vector X.)
generating, by the generator, a generator response based on the encoded input by weighting a cross-entropy loss for each of a set of candidate responses, generated by the generator for the encoded input, and selecting the candidate response in the set of candidate responses with the lowest loss as the generator response; (Paragraph 31 describes using the minimum cross-entropy loss to determine the output of the GAN)
generating, by the output layer, one or more class labels based on an input data set using the trained machine classifier.  (Paragraph 168 describes that the model provides a classification result.)
However, Sri does not explicitly describe:
“A non-transitory machine-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform steps comprising:
generating, by the discriminator, discriminator feedback based on the generator response, a ground truth label associated with the encoded input, and at least one previous response generated by the generator based on the encoded input;
updating the plurality of machine classifier parameters based on minimizing an average gradient of a loss function calculated based on a first weight determined based on the discriminator feedback and a second weight determined based on the generator response.”
However, Figure 1 of Sri shows that the ground truth X is input into the discriminator.
Further, Baz describes:
“generating, by the discriminator, discriminator feedback based on the generator output, [[a ground truth label associated with the encoded input,]] and at least one previous response generated by the generator based on the encoded input.” (Paragraph 28 describes that discriminator error can be backpropagated through the generator (feedback).  This could be based on the ground truth input to the discriminator as described by Sri.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the feedback between a generator and discriminator of Baz into the system of Sri to provide superior GAN results, as described in paragraph 28 of Baz.
However, Sri in view of Baz does not explicitly describe:
“A non-transitory machine-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform steps comprising:
updating the plurality of machine classifier parameters based on minimizing an average gradient of a loss function calculated based on a first weight determined based on the discriminator feedback and a second weight determined based on the generated response.”
Steed describes:
“A non-transitory machine-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform steps comprising: (paragraph 120, a computer program 5 is stored in non-volatile memory and executed by processor 3.)
updating the plurality of machine classifier parameters based on minimizing an average gradient of a loss function calculated based on a first weight determined based on the discriminator feedback and a second weight determined based on the generated response.”
Steed describes at paragraph 554 “In general, gradient descent based optimizers update the parameter in the direction of steepest descent of the loss function with respect to the parameter, scaled by a learning rate. The parameters are replaced with the new values and the process iterates with another slot value combination for the dialogue turn for example.”  (“update the parameter in the direction of steepest descent of the loss function with respect to the parameter” is cited as minimizing a gradient of a loss function calculated based on the generated response.)
Further, paragraph 555 describes that “The average of the gradient values may be taken across the batch, and the update performed using the batch average values.”  This describes minimizing the average gradient of a loss function calculated based on the generated response.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the minimum average gradient of a loss function of Steed into the system of Sri in view of Baz to optimize update of the model parameters, as described in paragraph 556 of Steed.
With regard to Claim 19, Sri describes “the discriminator comprises a metric encoder.”  Paragraph 30 of Sri describes that the encoder includes a decoder distance enhancer, which is cited as a “metric encoder.”

6.	Claims 6 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Sri in view of Baz and Steed and further in view of U.S. Patent App. Pub. No. 20150055855 (Rodriguez et al., hereinafter “Rod”).
With regard to Claims 6 and 20, Sri in view of Baz and Steed does not explicitly describe that “the generator comprises a maximum likelihood estimator classifier.”  
However, Rod describes at paragraph 112 that vector classification can be done with a maximum likelihood estimator.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the maximum likelihood estimator of Rod into the system of Sri in view of Baz and Steed to provide a simple classification method, as described in paragraph 112 of Rod.

7.	Claims 9-14, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Sri in view of Baz and Steed and further in view of U.S. Patent App. Pub. No. 20210217408 (Hakkani-Tur et al., hereinafter “Hak”).
With regard to Claim 9, Sri in view of Baz and Steed does not explicitly describe this subject matter.  However, Hak describes “the deep neural network architecture comprises a recurrent neural network.” Paragraph 19 of Hak describes that the encoder of a classifier can include a unidirectional RNN.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the RNN of Hak into the system of Sri in view of Baz and Steed to accumulate dialogue context from previous dialogue turns, as described in paragraph 19 of Hak.
With regard to Claim 10, Sri in view of Baz and Steed does not explicitly describe this subject matter.  However, Hak describes “the input data comprises a multi-turn dialogue data set.” Paragraph 112 of Hak describes the use of multi-turn dialogue sets.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the multi-turn dialogue sets of Hak into the system of Sri in view of Baz and Steed to resolve ambiguities, as described in paragraph 112 of Hak.
With regard to Claim 11, Sri teaches:
initializing, by a computing device, a machine classifier having a deep neural network architecture and a plurality of machine classifier parameters, wherein the deep neural network architecture comprises an encoder, a generator, and a discriminator; (Figure 7 shows the generator 30 and discriminator 70, as described in paragraphs 106 and 107.  Figure 9 shows the encoder 101 and decoder 103 output, as described in paragraphs 141 and 142.  Paragraph 83 describes that an input data is classified.)
training, by the computing device, the machine classifier, based on a training set comprising a plurality of examples, to refine the plurality of machine classifier parameters, (Paragraph 84 describes training the classifier.)
wherein training the machine classifier using an example comprises:
generating, by the encoder, an encoded input based on the example, the encoded input comprising a single vector; (Paragraph 141 describes that the encoder pools the input data.  Paragraph 30 describes that the input is the vector X.)
generating, by the generator, a generator response based on the encoded input; (Figure 2 shows decoder RNN generating a response based on the output of the Context RNN.)
generating, by the computing device and using the trained machine classifier, at least one dialogue response [[based on a multi-turn dialogue data set]].  (Section 2.1 describes that input X is a multi-turn dialogue data set at the beginning, and that responses to a dialogue history are generated by the system.)
However, Sri does not explicitly describe the use of a multi-turn dialogue data set or:
“generating, by the discriminator, discriminator feedback based on the generator output, a ground truth label associated with the encoded input, and at least one previous response generated by the generator based on the encoded input; and
updating the plurality of machine classifier parameters based on minimizing an average gradient of a loss function calculated based on a first weight determined based on the discriminator feedback and a second weight determined based on the generator response.”
However, paragraph 30 of Sri describes that the ground truth label is input into the discriminator.
Further, Baz describes:
“generating, by the discriminator, discriminator feedback based on the generator output, [[a ground truth label associated with the encoded input,]] and at least one previous response generated by the generator based on the encoded input.” (Paragraph 28 describes that discriminator error can be backpropagated through the generator (feedback).  This could be based on the ground truth input to the discriminator as described by Sri.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the feedback between a generator and discriminator of Baz into the system of Sri to provide superior GAN results, as described in paragraph 28 of Baz.
However, Sri in view of Baz does not explicitly describe the use of a multi-turn dialogue data set or:
“updating the plurality of machine classifier parameters based on minimizing an average gradient of a loss function calculated based on a first weight determined based on the discriminator feedback and a second weight determined based on the generator response.”
Steed describes at paragraph 554 “In general, gradient descent based optimizers update the parameter in the direction of steepest descent of the loss function with respect to the parameter, scaled by a learning rate. The parameters are replaced with the new values and the process iterates with another slot value combination for the dialogue turn for example.”  (“update the parameter in the direction of steepest descent of the loss function with respect to the parameter” is cited as minimizing a gradient of a loss function calculated based on the generated response.)
Further, paragraph 555 describes that “The average of the gradient values may be taken across the batch, and the update performed using the batch average values.”  This describes minimizing the average gradient of a loss function calculated based on the generated response.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the minimum average gradient of a loss function of Steed into the system of Sri in view of Baz to optimize update of the model parameters, as described in paragraph 556 of Steed.
Sri in view of Baz and Steed does not explicitly describe the use of a multi-turn dialogue data set. However, Hak describes “the input data comprises a multi-turn dialogue data set.” Paragraph 112 of Hak describes the use of multi-turn dialogue sets.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the multi-turn dialogue sets of Hak into the system of Sri in view of Baz and Steed to resolve ambiguities, as described in paragraph 112 of Hak.
With regard to Claim 12, Sri describes “generating the generator response by: 
generating, by the generator, a plurality of candidate responses, each candidate response comprising a loss calculated [[based on the discriminator feedback;]] (Paragraph 30 describes that a loss is calculated for each of multiple hidden states.) and  
selecting the generator response from the plurality of candidate response based on the loss for each candidate response.”  (Paragraph 31 describes that one candidate is selected from among multiple candidates based on minimizing a loss function.)
Sri does not explicitly describe discriminator feedback.
However, Baz describes that discriminator error can be backpropagated through the generator (feedback).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the feedback between a generator and discriminator of Baz into the system of Sri to provide superior GAN results, as described in paragraph 28 of Baz.
With regard to Claim 13, Sri does not explicitly describe “generating the generator response based on a ground truth label associated with the encoded input and at least one previous response generated by the generator based on the encoded input.”
However, paragraph 30 of Sri describes that the ground truth X is input into the discriminator.
Further, paragraph 28 of Baz describes that discriminator error can be backpropagated (fed back) through the generator.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the feedback between a generator and discriminator of Baz into the system of Sri to provide superior GAN results, as described in paragraph 28 of Baz.
With regard to Claim 14, Sri describes “the discriminator comprises a metric encoder.”  Section 5.2 of Sri describes the metrics used to evaluate the network. This combination of metrics is cited as a “metric encoder.”
With regard to Claim 16, Sri in view of Baz and Steed does not explicitly describe this subject matter.  However, Hak describes “the deep neural network architecture comprises a recurrent neural network.” Paragraph 19 of Hak describes that the encoder of a classifier can include a unidirectional RNN.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the RNN of Hak into the system of Sri in view of Baz and Steed to accumulate dialogue context from previous dialogue turns, as described in paragraph 19 of Hak.
With regard to Claim 18, Sri in view of Baz and Steed does not explicitly describe this subject matter.  However, Hak describes “the deep neural network architecture comprises a recurrent neural network.” Paragraph 19 of Hak describes that the encoder of a classifier can include a unidirectional RNN.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the RNN of Hak into the system of Sri in view of Baz and Steed to accumulate dialogue context from previous dialogue turns, as described in paragraph 19 of Hak.

8.	Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Sri in view of Baz, Steed, and Hak and further in view of Rod.
With regard to Claim 15, Sri in view of Baz, Steed, and Hak does not explicitly describe that “the generator comprises a maximum likelihood estimator classifier.”  
However, Rod describes at paragraph 112 that vector classification can be done with a maximum likelihood estimator.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of the maximum likelihood estimator of Rod into the system of Sri in view of Baz, Steed, and Hak to provide a simple classification method, as described in paragraph 112 of Rod.

Conclusion
9.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
U.S. Patent App. Pub. No. 20200027442 (Mathur et al.) describes a device that performs classification using a discriminator and generator within a GAN.
10.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWARD TRACY whose telephone number is (571)272-8332. The examiner can normally be reached Monday-Friday, 9 AM-5 PM.  Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/EDWARD TRACY JR./Examiner, Art Unit 2656
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656