Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Acknowledgement  
Acknowledgement is made of applicant’s amendment filed on 06/11/2021. Applicant’s submission has been entered and made of record.
Status of the Claims
Claims 1-8 are pending. 
Response to Applicant’s Argument
Applicant argues that “Robichaud is, however, completely silent with respect to the claimed features "generate situation information indicating the talking situation," "determine an intention of the talk based on the situation information," "generate intention information indicating the intention of the talk," and "convert the situation information and the intention information into the natural language script form" as recited in claim 1.”
In view of applicant’s amendment to claims 1 and 5, the rejection set forth under 35 USC 102 has been withdrawn. Upon further search and consideration, a new ground of rejection based on a new combination of references is set forth below.
Claim Rejections - 35 USC § 103
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 103 that form the basis for the rejections under this section made in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1 and 5 are rejected under 35 USC 103 as being unpatentable over Robichaud et al. (US 2016/0188565 A1) in view of Pitschel et al. (US 9922642 B2).
Regarding Claims 1 and 5, Robichaud discloses a user adaptive conversation apparatus (Fig. 1 and ¶18, dynamic system 100 may be implemented on a client computing device 104), comprising: 
a voice recognition unit configured to convert a talk of a user in a conversational situation into a natural language script form to generate talk information (¶21, speech recognition component 110 receives natural language expression and outputs a plurality of n-best representations of the received natural language expression); 
an artificial visualization unit configured to recognize a talking situation from a video acquired in the conversational situation to generate situation information indicating the talking situation and generate intention information indicating an intention of the talk (¶70 and ¶74, camera 830 captures gesture entry for interpreting user gestures for controlling functionality of the computing device; in view of ¶24-25, the gesture may constitute a previous request / contextual information for language understanding component 120 to evaluate a current natural language expression in a current dialog session to perform intent prediction; ¶23 and ¶25, language understanding component 120 performs domain and intent prediction and slot tagging using the contextual information; e.g., slot tagging of “from 2 pm to 4 pm” fills slot type “start_time” with “2 pm” and slot type “end_time” with “4 pm”; intent prediction of “how is the weather tomorrow” includes predicting domain / intent as “weather” and using this as context to predict domain / intent of “how about this weekend” as relating to “weather”); 
a natural language analysis unit configured to perform a natural language analysis for the talk information, the intention information, and the situation information (¶22 and ¶25, language understanding component 120 may perform domain and intent prediction and slot tagging based on / using (1) contextual information (domain / intent / slot from a previous turn) and (2) received n-best representations from speech recognition component 110); and 
a conversation state tracing unit configured to generate current talk state information representing a meaning of the talk information by interpreting the talk information according to the intention information and the situation information (¶28, predictions determined by language understanding component 120 may be sent to dialog component 130 to create a dialog hypothesis set for each natural language expression and determine what response / action to take for each natural language expression), and determine next talk state information that includes a plurality of candidate responses corresponding to the current talk state information (¶46, after the dialog hypothesis set is created using contextual information, a plurality of dialog responses are generated for the dialog hypothesis set). 
Robichaud does not disclose determining an intention of the talk based on the situation information.
Pitschel discloses a user adaptive conversation apparatus (Col 7, Rows 15-25, digital assistant system 300 residing on a user device 104), comprising: 
a voice recognition unit configured to convert a talk of a user in a conversational situation into a natural language script form to generate talk information (Col 9, Rows 49-56, speech to text processing module 330 recognizes speech input as a sequence of words or tokens); 
an artificial visualization unit (Col 5, Rows 41-43, camera subsystem 220 for taking photographs and recording video clips) configured to: 
recognize a talking situation from a video acquired in the conversational situation (Col 6, Rows 41-55, digital assistant client module 264 utilizes various sensors and subsystems to establish context information associated with the user including images or videos of surrounding environment);
generate situation information indicating the talking situation (Col 10, Rows 34-38, the context information includes sensor information collected before, during, or shortly after the user request); 
determine an intention of the talk based on the situation information (Col 10, Rows 6-11 and Rows 30-34, natural language processor 332 uses context information to clarify, supplement and further define the information contained in the token sequence received from speech to text processing module 330 and attempts to associate the token sequence with one or more actionable intents; Col 13, Rows 31-35, once NLP 332 identifies an actionable intent based on the user request, NLP 332 generates a structured query to represent the identified actionable intent); and 
a natural language analysis unit configured to convert the situation information and the intention information into the natural language script form to generate talk information (Col 13, Rows 51-56 and Col 14, Rows 26-33, task flow processor 336 invokes dialog flow processor 334 to determine missing parameters necessary to complete the structured query generated by NLP 332; e.g., “Make me a dinner reservation at a sushi place at 7” of Col 13, Rows 40-55 and Col 14, Rows 31-32 is processed by NLP 332 to identify the actionable intent “restaurant reservation” with a partial structured query {cuisine = “sushi”} {Time = “7 pm”} and missing parameters {party size} and {date} such that dialog flow processor 334 generates questions such as “For how many people?” and “On which day?”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Robichaud to determine an intention of the talk based on situation information indicating the talking situation recognized from a video acquired in the conversational situation, as taught by Pitschel, in order to clarify, supplement, and further define information contained in token sequences received from voice recognition to generate the natural language script form (Pitschel, Col 14, Rows 31-32).
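For illustration only, the dialog flow behavior cited from Pitschel above (a partial structured query whose missing parameters prompt follow-up questions) can be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of either reference: the names REQUIRED_PARAMS, PROMPTS, and follow_up_questions are hypothetical, and only the example intent, parameters, and questions are taken from the cited passages.

```python
# Hypothetical illustration; none of these identifiers appear in Pitschel.
# Parameters assumed necessary to complete the "restaurant reservation" query.
REQUIRED_PARAMS = {
    "restaurant reservation": ["cuisine", "time", "party size", "date"],
}

# Follow-up questions quoted in the cited example.
PROMPTS = {
    "party size": "For how many people?",
    "date": "On which day?",
}


def follow_up_questions(intent, partial_query):
    """Return one question per parameter missing from the partial query."""
    missing = [p for p in REQUIRED_PARAMS[intent] if p not in partial_query]
    # Fall back to a generic prompt when no specific question is defined.
    return [PROMPTS.get(p, f"Please provide the {p}.") for p in missing]


# "Make me a dinner reservation at a sushi place at 7"
partial_query = {"cuisine": "sushi", "time": "7 pm"}
questions = follow_up_questions("restaurant reservation", partial_query)
print(questions)
```

In this sketch the situation and intention information is represented by the identified intent and its filled parameters, and the generated questions stand in for the natural language script form.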
Claims 2-4 and 6-8 are rejected under 35 USC 103 as being unpatentable over Robichaud et al. (US 2016/0188565 A1) and Pitschel et al. (US 9922642 B2) as applied to claims 1 and 5, in view of Tseretopaulos et al. (US 2019/0103127 A1).

Regarding Claims 2 and 6, Robichaud does not disclose an emotion tracing unit and an ethic analysis unit.
Tseretopaulos teaches a conversational computer system (Abstract) for implementing an emotion tracing unit (¶77, determine lexical personality score based on sarcasm (i.e., intent) detected in the input, words associated with particular moods, particular regional words / phrases) configured to generate emotion state information indicating an emotional state of the user based on talk information (¶63, virtual assistants can interpret input using natural language processing (NLP) to match user text or voice input to executable commands), intention information (¶49, processing performed by the NLP engine 110 can include identifying an intent associated with the input received via the conversational interface 108; i.e., detect sarcasm), and situation information (¶88, obtain contextual data associated with user such as user location to determine particular regional words / phrases); and 
an ethic analysis unit configured to generate ethical state information indicating ethics of the conversation based on the talk information, the intention information, and the situation information (¶77, determine lexical personality score based on predefined words like curse words; i.e., given a certain user context / location, a word spoken at said location is a curse word and the user spoke it with an intent interpreted as cursing). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Robichaud to generate emotional state information / lexical personality score and ethical state information / lexical personality score in order to generate persona-associated responses for a particular user (Tseretopaulos, ¶22 and ¶77, modify the corresponding conversational response in a personalized manner such that the lexical personality of the response corresponds to the lexical personality of the conversational input).
Regarding Claims 3 and 7, Robichaud as modified by Tseretopaulos teaches a multi-modal conversation management unit configured to select one of the plurality of candidate responses (Robichaud, ¶46, generating dialog responses for each dialog hypothesis set) according to at least one of the emotion state information and the ethical state information to determine final next talk state information including a selected response (Tseretopaulos, ¶77, the intent may be used to determine a particular response to be generated and the lexical personality score can be used to identify a personality of the conversational input to modify the corresponding conversational response in a personalized manner). 
Regarding Claims 4 and 8, Robichaud discloses a natural language generation unit configured to convert the final talk state information into an output conversation script having the natural language script form (¶46, domain specific components to generate the plurality of dialog responses; see also Tseretopaulos, ¶54, NLG 124); and 
an adaptive voice synthesizing unit configured to synthesize a voice signal in which an intonation conforming to at least one of the emotion state information, the situation information, and the intention information is given to the output conversation script (Robichaud, ¶21, text to speech component; Tseretopaulos, ¶77, modify the corresponding conversational response in a personalized manner such that the lexical personality of the response corresponds to the lexical personality of the conversational input). 


Conclusion
Applicant's amendment necessitated the new grounds of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
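As a rough illustration of the date arithmetic described in the preceding paragraph, the reply periods can be computed as sketched below. This is a simplified sketch under stated assumptions: the mailing date and the other dates used are hypothetical, the helper names add_months and shortened_period_expiry are illustrative, and the sketch ignores weekends, federal holidays, and extension fees.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the same day-of-month `months` later, clamped to month end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_expiry(mailing, first_reply=None, advisory_mailed=None):
    """Expiry of the shortened statutory period as described in the text.

    Assumption: calendar-day arithmetic only, with no weekend/holiday rule.
    """
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    replied_within_two = (
        first_reply is not None and first_reply <= add_months(mailing, 2)
    )
    if replied_within_two and advisory_mailed and advisory_mailed > three_months:
        # Period runs to the advisory action date, never past six months.
        return min(advisory_mailed, six_months)
    return three_months


# Hypothetical mailing date, chosen only for illustration.
mailing_date = date(2021, 6, 16)
print(shortened_period_expiry(mailing_date))
print(add_months(mailing_date, 6))
```

The six-month value is the outer limit stated in the paragraph; any extension fee calculation under 37 CFR 1.136(a) is outside the scope of this sketch.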
Any inquiry concerning this communication or earlier communications from the examiner should be directed to examiner Richard Z. Zhu whose telephone number is 571-270-1587 or examiner’s supervisor King Y. Poon whose telephone number is 571-272-7440. Examiner Richard Zhu can normally be reached on M-Th, 0730-1700.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197.
/RICHARD Z ZHU/
Primary Examiner, Art Unit 2675
06/16/2021