DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
	In response to the preinterview first office action mailed 5/2/2022 and the interview conducted 5/13/2022, Applicant has submitted an amendment filed 5/25/2022.
	Claims 1-5, 7, 11-18, and 20 have been amended.

Allowable Subject Matter
Claims 1-20 are allowed.
The following is an examiner’s statement of reasons for allowance: 

	As per Claim(s) 1 (and similarly claim[s] 11 and 18 [which are narrower than claim 1], and consequently claim[s] 2-10, 12-17, and 19-20 which depend on claim[s] 1, 11, and 18), the prior art of record does not teach or suggest the combination of all limitations in claim(s) 1, including (i.e. in combination with the remaining limitations in claim[s] 1) A computer-implemented method, comprising: initializing a machine classifier having a sequence to sequence network architecture, wherein the sequence to sequence network architecture comprises an encoder and a decoder; obtaining input data comprising a conversation history and a user utterance; determining a user intent based on the user utterance; determining, based on the conversation history and the user utterance, at least one entity in the user utterance; generating, by the machine classifier and based on the user intent and the user utterance, a candidate response; obtaining, by the machine classifier and based on the user intent and the at least one entity, a response template; generating, by the machine classifier and based on the response template and the candidate response, a first response; and providing the first response.
Ham et al. (“End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using GPT-2”, cited in IDS) appears to describe where dialogue history and a current utterance are used to determine candidate responses, and where a candidate response is then used to fill in a response template to generate an output response (Figure 2).  This reference does not appear to determine an entity in the user utterance based on both the conversation history and the user utterance (assuming “south” in “I would like the south part of town please” is considered an entity, no previous part of the dialogue is used to determine “south” to be an entity in the user utterance, and if “I would like the south part of town please” is considered conversation history, then that same utterance cannot be considered the user utterance).  Additionally, it does not appear that the response template is obtained based on the at least one entity.
2018/0181673 teaches “In this embodiment, after webpages containing candidate answers are obtained by searching according to the query statement, the first candidate answer can be obtained based on the webpage analysis method, the second candidate answer can be obtained based on the deep learning method, and the third candidate answer can be obtained based on the template matching method” (paragraph 58) and “The third candidate answer is obtained based on the template matching method as follows. The candidate answers in the webpages containing candidate answers are obtained, a semantic analysis is performed on the candidate answers to extract word characteristics of the candidate answers, candidate templates are obtained from an answer template according to the word characteristics, a correlation between each of the candidate templates and the query statement is obtained, an answer template is obtained from the candidate templates according to the correlation, and the third candidate answer is generated according to the answer template” (paragraph 71).  Paragraph 71 appears to describe where candidate answers (in webpages containing candidate answers) are obtained, and where candidate templates are obtained based on those candidate answers (such that the third candidate answer is generated based on both a template and candidate answers [upon which the template is based]).  The webpages containing candidate answers are obtained based on characteristics of (and entities in) a query statement (paragraphs 47-55).  This reference does not specifically describe where a user intent is used in addition to an entity to generate a template.
2014/0222436 teaches where an informational answer is provided in response to a user request, and where a user’s spoken/text input is interpreted to deduce user intent, and where the deduced user intent is used to generate responses (paragraphs 42-43).
2009/0150156 teaches “According to various aspects of the invention, the natural language voice user interface may enable successive refinement of a final destination by progressively narrowing the final destination. For example, successively refining the destination may be modeled after patterns of human interaction in which a route or a destination may be narrowed down or otherwise refined over a course of interaction. For example, a user may generally approximate a destination, which may result in a route being calculated along a preferred route to the approximated destination. While en route to the approximated destination, the user and the voice user interface may cooperatively refine the final destination through one or more subsequent interactions. Thus, a user may provide a full or partial destination input using free form natural language, for example, including voice commands and/or multi-modal commands. One or more interpretations of a possible destination corresponding to the voice destination input may be organized in an N-best list. The list of possible destinations may be post-processed to assign weights or ranks to one or more of the entries therein, thus determining a most likely intended destination from a full or partial voice destination input. Thus, the post-processing operation may rank or weigh possible destinations according to shared knowledge about the user, domain-specific knowledge, dialogue history, or other factors. As a result, the full or partial destination input may be analyzed to identify an address to which a route can be calculated, for example, by resolving a closest address that makes "sense" relative to the input destination. Subsequent inputs may provide additional information relating to the destination, and the weighted N-best list may be iteratively refined until the final destination can be identified through successive refinement. 
As a result, when a suitable final destination has been identified, the route to the final destination may be completed” (paragraph 12).  This reference describes where a destination entity can be determined based on dialog/conversation history and a user utterance (i.e. a spoken destination).  This reference does not appear to determine a user intent and use the user intent to generate a candidate response.
2021/0056270 (priority to KR 20210022819 A, filed 8/20/2019, where provisional application 62/877,076 does not appear to support the independent claims) teaches “The response generator 301 may generate candidate responses by using a hierarchical recurrent encoder decoder (HRED)-based sequence to sequence deep neural network and the user language model 302, and transfer the candidate responses to the ranking network 305” (paragraphs 64-73).  This reference describes where an encoder decoder based sequence to sequence neural network is used to generate responses (paragraph 66), and where past conversation content (conversation “history”) is used to obtain a response to a user’s current input (paragraph 68).
2020/0202887 teaches “According to a conventional sequence-to-sequence model having an encoder and a decoder but no feature corresponding to affective re-ranking stage 370, the encoder computes vector representation h.sub.S for source sequence S, while the decoder generates one word at a time, and computes the conditional probability of a candidate response, R.sub.C, as” (paragraph 28).
7853557 teaches “receiving a first query; displaying a plurality of candidate response templates in a user interface, wherein the candidate response templates are displayed in an agent language selected by a user; displaying candidate response data for populating the candidate response templates, wherein the candidate response data is displayed in the agent language; receiving an input via a user interface, wherein the input identifies that the first query is in a first query language, the first query language is different from the agent language, and the input selects a first suitable response template from among the candidate response templates; selecting a first outgoing response template, wherein the first outgoing response template is in the first query language, and the first outgoing response template corresponds to the first suitable response template; populating, using a processor, the first outgoing response template with first response data, wherein the populating the first outgoing response template is performed by a first thread in a multi-threaded process, the first response data is obtained using a first pointer to a first set of data; the first response data is based at least in part on the first set of data, the first set of data corresponds to the first thread and to the first query language; receiving a second query in a second query language; and populating the second outgoing response template with second response data” (claim 56) and “populating the first suitable response template with first response data in the agent language; displaying the first suitable response template populated with the response data in the agent language; and transmitting the outgoing response template populated with the response data in the query language” (claim 57).  This reference appears to describe where candidate response data is used to populate candidate response templates in order to generate an outgoing response template, but it is not clear that the candidate response data is a candidate response (as opposed to data used to populate a template but which is not, itself, a response/answer).
6721706 teaches “If the mood/personality classifer 290 receives a signal from the video image classifier 240, indicating the user is moving in a fashion consistent with being agitated, that mood/personality classifer 290 may combine this information with other classifier signals to generate a mood/personality state vector indicating an emotional state of heightened anxiety. For example, the audio classifier 210 may be contemporaneously indicating that the speaker's voice is more highly pitched than usual and the input parser 410 may indicate that the word count of the most recent responses is unusually low. The choices of candidate response templates chosen by the response generator 415 will be affected by the mood/personality state, for example by choosing to change the topic of conversation to one or more that the response generator 415 is programmed to select in such circumstances” (col. 26, line 35 – col. 27, line 3).  This reference suggests where responses are generated using candidate response templates (not where a response is generated based on a response template and a separate candidate response).
10978056 teaches “In particular embodiments, one or more of the candidate responses may be generated by a language-template. The language-template may integrate one or more n-grams based on a particular order into a candidate response. In particular embodiments, one or more of the candidate responses may be generated by an information-retrieval algorithm” (col. 23, lines 41-46).  This reference also describes generating candidate responses based on a language template (not where a response is generated based on a response template and a separate candidate response).

Upon further search (in response to an interview):
2009/0150156 (paragraph 12, quoted in full above) appears to describe determining a destination “entity” based on a user utterance and dialog history (“conversation history”).  This reference does not appear to describe where a user intent and the destination entity are used to obtain a response template.
2014/0358890 teaches “Referring back to FIG. 3, at 312, the answer assembler 208 constructs the candidate answer based on the most relevant candidate paragraph. There are many ways of delivering the same answer. In one implementation, the answer assembler 208 determines the template of the input question, and maps the question template to an answer template. The candidate answer may then be assembled in accordance with the answer template” (paragraph 91) and “After the training set is generated, the candidate answer may be extracted from input paragraphs. For purposes of illustration, assume that the input question is: "When was Wolfgang Amadeus Mozart born?". The original text of a paragraph returned by the answer retrieval and ranking unit 206 may be as follows: "Wolfgang Amadeus Mozart was born on 1756". Based on the training set, the input question type may correspond to the answer template: <NAME> was born on <Answer>. Since the paragraph matches the answer template, the answer "1756" may be extracted as the candidate answer” (paragraph 105).  This reference appears to describe where a candidate answer is extracted based on a template and based on text returned by an answer retrieval and ranking unit (where the text is a sentence that can be interpreted as an answer to an input question).  It is not clear that a question type or question template can be interpreted as an intent that is used to obtain a response template, and it does not appear that the response template is also obtained based on intent and “at least one entity” which is determined based on conversation history and a user utterance.
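For illustration only, the template-based extraction quoted from paragraph 105 might be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the function names `template_to_regex` and `extract_answer` are hypothetical, and converting the answer template into a regular expression is one assumed way of checking whether "the paragraph matches the answer template".

```python
import re


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Convert an answer template such as '<NAME> was born on <Answer>'
    into a regex with one named group per slot (hypothetical helper)."""
    parts = re.split(r"(<[A-Za-z]+>)", template)
    pattern = ""
    for part in parts:
        if re.fullmatch(r"<[A-Za-z]+>", part):
            # Slot marker: capture lazily under a lower-cased group name.
            pattern += f"(?P<{part[1:-1].lower()}>.+?)"
        else:
            # Literal text between slots must match exactly.
            pattern += re.escape(part)
    # Tolerate an optional sentence-final period.
    return re.compile(pattern + r"\.?")


def extract_answer(template: str, paragraph: str):
    """Return the text captured by the <Answer> slot, or None when the
    paragraph does not match the template."""
    match = template_to_regex(template).fullmatch(paragraph)
    return match.group("answer") if match else None


print(extract_answer("<NAME> was born on <Answer>",
                     "Wolfgang Amadeus Mozart was born on 1756"))
```

With the example paragraph quoted above, the sketch returns "1756"; a paragraph that does not fit the template yields no candidate answer.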
Wang, F. (2019). Building high-performance distributed systems with synchronized clocks (Order No. 28113373). Available from ProQuest Dissertations and Theses Professional. (2467863602). Retrieved from https://dialog.proquest.com/professional/docview/2467863602?accountid=131444 teaches “If the user uses voice interaction, the utterance is converted to text using speech recognition. The text is then processed by the natural language understanding (NLU) module. The goal is to understand the intent of the given text, and extract the entities, or parameters from the text. The intent maps what the user says to what action the chatbot should take. The entities are the concepts in the text, often with types. Typical examples of entity types are locations, dates, time, as well as domain-specific concepts, such as server, switch, and link etc. An example text with its intent and entities are shown in Figure 4.13. In the meanwhile, the chatbot keeps track of the context, which is the state of the conversation based on previous interactions. The context needs to be incorporated with the input to understand the intent…Once the intent and entities are obtained from the natural language understanding module, our software has actions logic to process the result and take corresponding actions.  For the text in Figure 4.13, the intent is to query the amount of data. The corresponding action is to generate a database query, execute it against the database, and display the response using the query result. Our data is stored in a relational database that uses SQL as its query language. We define a SQL template and a response template for each intent. The SQLs are generated by filling in the templates with the given entities. The responses are generated by filling in the templates with parameters from the input and the query results…  The actions logic module takes the intent and entities as input, and perform desired actions.  There are two types of actions in our system. 
The first type involves only actions on the dashboard, such as animation control, opening up reconstruction plots, etc. Another type of actions involves querying the database, and possibly analyzing the data. As described earlier, we use a relational database with SQL, and prepare a SQL template for each of the intents. Given an intent and entities, we pick the corresponding SQL template, and fill in the template with the given entities. The query is executed against the database, and the result is filled into response templates before displaying the responses to the users” (pages 74-79).  This reference appears to suggest where responses provided to users are generated by filling in templates with entities, and where a template is chosen based on intent and entities.  Page 74 describes where context is based on conversation history, but this reference does not appear to describe where conversation history is used to determine entities.  The provided response also does not appear to be generated based on a candidate response and a template.
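For illustration only, the per-intent SQL template and response template arrangement quoted from Wang might be sketched as follows. This is a minimal sketch under stated assumptions: the intent name `query_data_amount`, the `traffic` table, its columns, and both template strings are hypothetical and do not come from the reference. The sketch also binds entities as query parameters rather than splicing them into the SQL text, which is a substituted technique and not necessarily what the reference does.

```python
import sqlite3

# Hypothetical templates, one SQL template and one response template per intent.
SQL_TEMPLATES = {
    "query_data_amount":
        "SELECT SUM(bytes) FROM traffic WHERE server = :server AND day = :day",
}
RESPONSE_TEMPLATES = {
    "query_data_amount": "Server {server} transferred {total} bytes on {day}.",
}


def respond(conn: sqlite3.Connection, intent: str, entities: dict) -> str:
    """Pick the templates for the intent, fill the SQL template with the
    entities, run the query, and fill the response template with the result."""
    row = conn.execute(SQL_TEMPLATES[intent], entities).fetchone()
    return RESPONSE_TEMPLATES[intent].format(total=row[0], **entities)


def demo() -> str:
    """Build a small in-memory database and answer one hypothetical request."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE traffic (server TEXT, day TEXT, bytes INTEGER)")
    conn.executemany(
        "INSERT INTO traffic VALUES (?, ?, ?)",
        [("s1", "2019-01-01", 100), ("s1", "2019-01-01", 250),
         ("s2", "2019-01-01", 999)],
    )
    return respond(conn, "query_data_amount",
                   {"server": "s1", "day": "2019-01-01"})


print(demo())
```

In this sketch the response is produced from a template, the entities, and the query result only; no separately generated candidate response is combined with the template.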

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249. The examiner can normally be reached M-F 12:00PM-8:30PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL, can be reached on (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





EY 6/22/2022
/ERIC YEN/Primary Examiner, Art Unit 2658