DETAILED ACTION

Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
2.	The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


3.	Claims 15-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. In particular, claims 15-16 recite “the first second audio speech segment” in line 2. There is insufficient antecedent basis for “the first second audio speech segment” in the claims. For compact prosecution, the Examiner interprets “the first second audio speech segment” as “the first audio speech segment”.

Claim Rejections - 35 USC § 102
4.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

5.	Claims 1, 2, 4, 5, 11, 12, 14, 17, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Koukoumidis et al. (US 11,086,858 B1).

	With respect to Claim 1, Koukoumidis et al. disclose
	A computer-implemented method for performing incremental natural language understanding, the method comprising: 
 	acquiring a first audio speech segment associated with a user utterance (Koukoumidis et al. col. 10 lines 53-57 If the user input is based on an audio modality (e.g., the user may speak to the assistant application 136 or send a video including speech to the assistant application 136), the assistant system 140 may process it using an audio speech recognition (ASR) module 210 to convert the user input into text); 
 	converting the first audio speech segment into a first text segment (Koukoumidis et al. col. 10 lines 53-57 If the user input is based on an audio modality (e.g., the user may speak to the assistant application 136 or send a video including speech to the assistant application 136), the assistant system 140 may process it using an audio speech recognition (ASR) module 210 to convert the user input into text); 
 	determining a first intent based on a text string associated with the first text segment, wherein the text string represents a portion of the user utterance (Koukoumidis et al. col. 22 lines 40-49 The assistant system 140 may use a predictive model to determine whether to proceed with speculatively executing a query given a received initial portion of a user input (i.e., the assistant system 140 does not wait for the user to complete the user input). As an example and not by way of limitation, if the user begins to input “What’s the ...” at 7:30 am, the assistant system 140 may infer from this partial request that the user is about to ask “What’s the weather today [in user’s current location]” and may start by executing a query for the answer); and 
 	generating a first response based on the first intent prior to when the user utterance completes (Koukoumidis et al. col. 22 lines 49-56 By the time the user is actually finished with the user input, the assistant system 140 may have generated and cached a response, which is ready to be returned when the assistant system 140 confirms the received user input matches the speculative query. This may improve upon the response time by reducing the latency by proactively executing a speculative query prior to the completion of the user input.)
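
	For illustration only, the mapping of Claim 1 above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not code from the application or from Koukoumidis et al.: the prefix table, the confidence values, and the function names `determine_intent` and `generate_early_response` are hypothetical, and a simple lookup stands in for the predictive model described in the reference.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table mapping a partial utterance to (intent, confidence).
# Assumption: a real system would use a trained predictive model instead.
PREFIX_INTENTS: Dict[str, Tuple[str, float]] = {
    "what's the": ("weather_query", 0.6),
}


def determine_intent(text_string: str) -> Optional[Tuple[str, float]]:
    """Return (intent, confidence) for a partial utterance, or None if unknown."""
    return PREFIX_INTENTS.get(text_string.strip().lower())


def generate_early_response(text_segment: str) -> Optional[str]:
    """Generate and cache a response before the user utterance completes."""
    result = determine_intent(text_segment)
    if result is None:
        return None  # no speculative query for this prefix
    intent, _confidence = result
    return f"cached response for {intent}"
```

	In this sketch the text segment is assumed to have already been produced by a speech recognizer; the response is generated from the partial text alone, which is the point of the "prior to when the user utterance completes" limitation.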

	With respect to Claim 2, Koukoumidis et al. disclose
 	further comprising determining that a confidence score associated with the first intent is greater than a threshold value, wherein generating the first response comprises performing one or more operations based on an intent specific response library to generate a response that is related to the first intent (Koukoumidis et al. col. 3 lines 12-18 the assistant system may calculate a confidence score for each speculative query based on the predictive model. The confidence score represents a likelihood that the predicted complete request corresponding to the respective speculative query will match an intended complete request associated with the user input after the further input is provided, col. 25 lines 40-44 the assistant system 140 may rank the one or more speculative queries. In particular embodiments, the assistant system 140 may rank the one or more speculative queries based on their respective confidence scores, col. 26 lines 37-41 the assistant system 140 may adjust a threshold rank for executing speculative queries based on the current processing load and execute speculative queries that have a rank greater than or equal to the adjusted threshold rank.)

 	With respect to Claim 4, Koukoumidis et al. disclose
further comprising: 
 	acquiring a second audio speech segment associated with the user utterance (Koukoumidis et al. col. 27 lines 41-52 the process of determining and executing speculative queries may happen in an iterative manner, such that as the user’s input gets longer, new speculative queries may be determined and executed. As an example and not by way of limitation, if the user input “What’s the...” the assistant system 140 may speculatively execute a weather query as described above. But if the user input continues as “What’s the way...”); 
 	converting the second audio speech segment into a second text segment (Koukoumidis et al. col. 10 lines 53-57 If the user input is based on an audio modality (e.g., the user may speak to the assistant application 136 or send a video including speech to the assistant application 136), the assistant system 140 may process it using an audio speech recognition (ASR) module 210 to convert the user input into text); 
 	concatenating the second text segment to the text string to generate a concatenated text string (Koukoumidis et al. col. 27 lines 41-56 the process of determining and executing speculative queries may happen in an iterative manner, such that as the user’s input gets longer, new speculative queries may be determined and executed. As an example and not by way of limitation, if the user input “What’s the...” the assistant system 140 may speculatively execute a weather query as described above. But if the user input continues as “What’s the way...”, the assistant system 140 may re-calculate its confidence scores for the possible speculative queries and determines now that the user is likely intending to ask “What’s the way to [location in calendar appointment]?” Similarly, if the user input continues as “What’s the way to make...” the assistant system 140 may recalculate its confidence scores again and determine that the user is likely intending to ask “What’s the way to make corned beef?”);
 	determining a second intent based on the concatenated text string that is different than the first intent (Koukoumidis et al. col. 27 lines 41-56 the process of determining and executing speculative queries may happen in an iterative manner, such that as the user’s input gets longer, new speculative queries may be determined and executed. As an example and not by way of limitation, if the user input “What’s the...” the assistant system 140 may speculatively execute a weather query as described above. But if the user input continues as “What’s the way...”, the assistant system 140 may re-calculate its confidence scores for the possible speculative queries and determines now that the user is likely intending to ask “What’s the way to [location in calendar appointment]?” Similarly, if the user input continues as “What’s the way to make...” the assistant system 140 may recalculate its confidence scores again and determine that the user is likely intending to ask “What’s the way to make corned beef?”); and 
 	generating a second response based on the second intent prior to when the user utterance completes (Koukoumidis et al. col. 27 lines 56-62 At each instance, the speculative query may be executed in advance of the user completing his input, and the response may be cached and then discarded as the user input gets longer and the assistant system 140 determines that a particular speculative query is no longer correct (e.g., the confidence score for a particular speculative query drops below a threshold).)
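
	For illustration only, the iterative behavior mapped to Claim 4 above (append each new text segment, re-determine the intent, and replace the cached response when the intent changes) can be sketched as follows. This is a hedged sketch, not the applicant's or the reference's implementation: the rule table `INTENT_RULES`, the longest-prefix-first matching, and the function names are hypothetical, and the example strings are taken from the passage of Koukoumidis et al. quoted above.

```python
from typing import List, Optional, Tuple

# Hypothetical rules, ordered so the longest matching prefix is tried first.
# Assumption: stands in for re-calculating confidence scores per query.
INTENT_RULES: List[Tuple[str, str]] = [
    ("what's the way to make", "recipe_query"),
    ("what's the way", "directions_query"),
    ("what's the", "weather_query"),
]


def determine_intent(text_string: str) -> Optional[str]:
    """Return the intent of the longest rule prefix that the text starts with."""
    normalized = " ".join(text_string.lower().split())
    for prefix, intent in INTENT_RULES:
        if normalized.startswith(prefix):
            return intent
    return None


def process_segments(text_segments: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Concatenate segments one at a time and re-determine the intent each time."""
    text_string = ""
    history: List[Tuple[str, Optional[str]]] = []
    for segment in text_segments:
        # Concatenate the new text segment to the running text string.
        text_string = (text_string + " " + segment).strip()
        # A changed intent means any previously cached response is discarded.
        history.append((text_string, determine_intent(text_string)))
    return history
```

	Calling `process_segments(["What's the", "way", "to make"])` walks through the weather, directions, and recipe intents in turn, mirroring the example in the cited passage.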

 	With respect to Claim 5, Koukoumidis et al. disclose
 	further comprising: 
 	applying text prediction to the text string to determine a second text segment that is likely to follow the first text segment (Koukoumidis et al. col. 24 lines 37-42 From the initial portion of the user input, the predictive model may generate one or more speculative queries and assign a confidence score related to the likelihood that the assistant system 140 determines the user input is associated with the speculative query (e.g., user input matches the speculative query)); and 
 	prior to determining the first intent, concatenating the second text segment to the text string (Koukoumidis et al. col. 24 lines 49-67 if a user input starts with “What’s the...” the assistant system 140 may generate the speculative queries “What’s the weather [in the user’s location]?” and “What’s the traffic like today [in the user’s location]?” However, given spatial signals, if the assistant system 140 determines that the user is already at work when the initial portion of the user input is received, then the assistant system 140 may assign a higher confidence score to the speculative query, “What’s the weather [in the user’s location]?” because of the likelihood the user may want to know the weather as opposed to the traffic conditions. In particular embodiments, the predictive model may be trained to analyze the broader context of the user inputting the user input, such as the weather at the user’s location. Therefore, as an example and not by way of limitation, if the assistant system has detected the user is in a location subject to an oncoming hurricane, for the initial portion of the user input “What’s the ...” the assistant system may generate the speculative queries, “What’s the deadline to evacuate?” “What’s the best way to prepare for a hurricane?” and the like and assign a higher confidence score to these speculative queries.)

 	With respect to Claim 11, Koukoumidis et al. disclose
 	One or more non-transitory computer-readable media storing program instructions that, when executed by one or more processors, cause the one or more processors (Koukoumidis et al. col. 49 lines 9-22, col. 47 lines 47-66), to perform the steps of:
 	acquiring a first audio speech segment associated with a user utterance (Koukoumidis et al. col. 10 lines 53-57 If the user input is based on an audio modality (e.g., the user may speak to the assistant application 136 or send a video including speech to the assistant application 136), the assistant system 140 may process it using an audio speech recognition (ASR) module 210 to convert the user input into text);
 	converting the first audio speech segment into a first text segment (Koukoumidis et al. col. 10 lines 53-57 If the user input is based on an audio modality (e.g., the user may speak to the assistant application 136 or send a video including speech to the assistant application 136), the assistant system 140 may process it using an audio speech recognition (ASR) module 210 to convert the user input into text); 
 	concatenating the first text segment to a text string that represents a portion of the user utterance (Koukoumidis et al. col. 27 lines 41-56 the process of determining and executing speculative queries may happen in an iterative manner, such that as the user’s input gets longer, new speculative queries may be determined and executed. As an example and not by way of limitation, if the user input “What’s the...” the assistant system 140 may speculatively execute a weather query as described above. But if the user input continues as “What’s the way...”, the assistant system 140 may re-calculate its confidence scores for the possible speculative queries and determines now that the user is likely intending to ask “What’s the way to [location in calendar appointment]?” Similarly, if the user input continues as “What’s the way to make...” the assistant system 140 may recalculate its confidence scores again and determine that the user is likely intending to ask “What’s the way to make corned beef?”); 
 	determining a first intent based on the text string (Koukoumidis et al. col. 27 lines 41-56 the process of determining and executing speculative queries may happen in an iterative manner, such that as the user’s input gets longer, new speculative queries may be determined and executed. As an example and not by way of limitation, if the user input “What’s the...” the assistant system 140 may speculatively execute a weather query as described above. But if the user input continues as “What’s the way...”, the assistant system 140 may re-calculate its confidence scores for the possible speculative queries and determines now that the user is likely intending to ask “What’s the way to [location in calendar appointment]?” Similarly, if the user input continues as “What’s the way to make...” the assistant system 140 may recalculate its confidence scores again and determine that the user is likely intending to ask “What’s the way to make corned beef?”); and 
 	generating a first response based on the first intent prior to when the user utterance completes (Koukoumidis et al. col. 27 lines 56-62 At each instance, the speculative query may be executed in advance of the user completing his input, and the response may be cached and then discarded as the user input gets longer and the assistant system 140 determines that a particular speculative query is no longer correct (e.g., the confidence score for a particular speculative query drops below a threshold), col. 22 lines 49-56 By the time the user is actually finished with the user input, the assistant system 140 may have generated and cached a response, which is ready to be returned when the assistant system 140 confirms the received user input matches the speculative query. This may improve upon the response time by reducing the latency by proactively executing a speculative query prior to the completion of the user input.)

 	With respect to Claim 12, Koukoumidis et al. disclose
 	further comprising determining that a confidence score associated with the first intent is greater than a threshold value, wherein generating the first response comprises performing one or more operations based on an intent specific response library to generate a response that is related to the first intent (Koukoumidis et al. col. 3 lines 12-18 the assistant system may calculate a confidence score for each speculative query based on the predictive model. The confidence score represents a likelihood that the predicted complete request corresponding to the respective speculative query will match an intended complete request associated with the user input after the further input is provided, col. 25 lines 40-44 the assistant system 140 may rank the one or more speculative queries. In particular embodiments, the assistant system 140 may rank the one or more speculative queries based on their respective confidence scores, col. 26 lines 37-41 the assistant system 140 may adjust a threshold rank for executing speculative queries based on the current processing load and execute speculative queries that have a rank greater than or equal to the adjusted threshold rank.)

 	With respect to Claim 14, Koukoumidis et al. disclose
further comprising: 
 	acquiring a second audio speech segment associated with the user utterance (Koukoumidis et al. col. 27 lines 41-52 the process of determining and executing speculative queries may happen in an iterative manner, such that as the user’s input gets longer, new speculative queries may be determined and executed. As an example and not by way of limitation, if the user input “What’s the...” the assistant system 140 may speculatively execute a weather query as described above. But if the user input continues as “What’s the way...”); 
 	converting the second audio speech segment into a second text segment (Koukoumidis et al. col. 10 lines 53-57 If the user input is based on an audio modality (e.g., the user may speak to the assistant application 136 or send a video including speech to the assistant application 136), the assistant system 140 may process it using an audio speech recognition (ASR) module 210 to convert the user input into text); 
 	concatenating the second text segment to the text string to generate a concatenated text string (Koukoumidis et al. col. 27 lines 41-56 the process of determining and executing speculative queries may happen in an iterative manner, such that as the user’s input gets longer, new speculative queries may be determined and executed. As an example and not by way of limitation, if the user input “What’s the...” the assistant system 140 may speculatively execute a weather query as described above. But if the user input continues as “What’s the way...”, the assistant system 140 may re-calculate its confidence scores for the possible speculative queries and determines now that the user is likely intending to ask “What’s the way to [location in calendar appointment]?” Similarly, if the user input continues as “What’s the way to make...” the assistant system 140 may recalculate its confidence scores again and determine that the user is likely intending to ask “What’s the way to make corned beef?”);
 	determining a second intent based on the concatenated text string that is different than the first intent (Koukoumidis et al. col. 27 lines 41-56 the process of determining and executing speculative queries may happen in an iterative manner, such that as the user’s input gets longer, new speculative queries may be determined and executed. As an example and not by way of limitation, if the user input “What’s the...” the assistant system 140 may speculatively execute a weather query as described above. But if the user input continues as “What’s the way...”, the assistant system 140 may re-calculate its confidence scores for the possible speculative queries and determines now that the user is likely intending to ask “What’s the way to [location in calendar appointment]?” Similarly, if the user input continues as “What’s the way to make...” the assistant system 140 may recalculate its confidence scores again and determine that the user is likely intending to ask “What’s the way to make corned beef?”); and 
 	generating a second response based on the second intent prior to when the user utterance completes (Koukoumidis et al. col. 27 lines 56-62 At each instance, the speculative query may be executed in advance of the user completing his input, and the response may be cached and then discarded as the user input gets longer and the assistant system 140 determines that a particular speculative query is no longer correct (e.g., the confidence score for a particular speculative query drops below a threshold).)

	With respect to Claim 17, Koukoumidis et al. disclose
 	further comprising: 
 	applying text prediction to the text string to determine a second text segment that is likely to follow the first text segment (Koukoumidis et al. col. 24 lines 37-42 From the initial portion of the user input, the predictive model may generate one or more speculative queries and assign a confidence score related to the likelihood that the assistant system 140 determines the user input is associated with the speculative query (e.g., user input matches the speculative query)); and 
 	prior to determining the first intent, concatenating the second text segment to the text string (Koukoumidis et al. col. 24 lines 49-67 if a user input starts with “What’s the...” the assistant system 140 may generate the speculative queries “What’s the weather [in the user’s location]?” and “What’s the traffic like today [in the user’s location]?” However, given spatial signals, if the assistant system 140 determines that the user is already at work when the initial portion of the user input is received, then the assistant system 140 may assign a higher confidence score to the speculative query, “What’s the weather [in the user’s location]?” because of the likelihood the user may want to know the weather as opposed to the traffic conditions. In particular embodiments, the predictive model may be trained to analyze the broader context of the user inputting the user input, such as the weather at the user’s location. Therefore, as an example and not by way of limitation, if the assistant system has detected the user is in a location subject to an oncoming hurricane, for the initial portion of the user input “What’s the ...” the assistant system may generate the speculative queries, “What’s the deadline to evacuate?” “What’s the best way to prepare for a hurricane?” and the like and assign a higher confidence score to these speculative queries.)

 	With respect to Claim 20, Koukoumidis et al. disclose
 	A system, comprising: 
 	a memory that includes instructions, and a processor that is coupled to the memory and, when executing the instructions (Koukoumidis et al. col. 49 lines 9-22, col. 47 lines 47-66), is configured to: 
 	acquire an audio speech segment associated with a user utterance (Koukoumidis et al. col. 10 lines 53-57 If the user input is based on an audio modality (e.g., the user may speak to the assistant application 136 or send a video including speech to the assistant application 136), the assistant system 140 may process it using an audio speech recognition (ASR) module 210 to convert the user input into text); 
 	convert the audio speech segment into a text segment (Koukoumidis et al. col. 10 lines 53-57 If the user input is based on an audio modality (e.g., the user may speak to the assistant application 136 or send a video including speech to the assistant application 136), the assistant system 140 may process it using an audio speech recognition (ASR) module 210 to convert the user input into text);
 	determine an intent based on a text string associated with the text segment, wherein the text string represents a portion of the user utterance (Koukoumidis et al. col. 22 lines 40-49 The assistant system 140 may use a predictive model to determine whether to proceed with speculatively executing a query given a received initial portion of a user input (i.e., the assistant system 140 does not wait for the user to complete the user input). As an example and not by way of limitation, if the user begins to input “What’s the ...” at 7:30 am, the assistant system 140 may infer from this partial request that the user is about to ask “What’s the weather today [in user’s current location]” and may start by executing a query for the answer); and 
 	generate a response based on the intent prior to when the user utterance completes (Koukoumidis et al. col. 22 lines 49-56 By the time the user is actually finished with the user input, the assistant system 140 may have generated and cached a response, which is ready to be returned when the assistant system 140 confirms the received user input matches the speculative query. This may improve upon the response time by reducing the latency by proactively executing a speculative query prior to the completion of the user input.)

Claim Rejections - 35 USC § 103
6.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

7.	Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Koukoumidis et al. (US 11,086,858 B1) in view of Pyo et al. (US 2005/0261905 A1).

 	With respect to Claim 3, Koukoumidis et al. disclose all the limitations of Claim 1 upon which Claim 3 depends. Koukoumidis et al. fail to explicitly teach 
 	further comprising determining that a confidence score associated with the first intent is less than a threshold value, wherein generating the first response comprises performing one or more operations based on a non-intent specific response library to generate a response that is unrelated to the first intent.  
	However, Pyo et al. teach 
 	further comprising determining that a confidence score associated with the first intent is less than a threshold value, wherein generating the first response comprises performing one or more operations based on a non-intent specific response library to generate a response that is unrelated to the first intent (Pyo et al. [0053] Referring to FIGS. 1 and 4, in operation 410, a user utterance is input from the system speaking style determination unit 120, and the reliability of a recognized character string included in the user utterance is compared with a predetermined threshold. According to the comparison result, it is determined whether or not it is necessary for the speaker to re-utter the user utterance. If the reliability is less than the threshold, an interrogative in a predetermined system utterance generated in advance is set as an emphasis part and the user is asked to re-utter the sentence.) 
 	Koukoumidis et al. and Pyo et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using the teaching of generating an interrogative in advance as taught by Pyo et al., for the benefit of asking the user to re-utter the input (Pyo et al. [0053] Referring to FIGS. 1 and 4, in operation 410, a user utterance is input from the system speaking style determination unit 120, and the reliability of a recognized character string included in the user utterance is compared with a predetermined threshold. According to the comparison result, it is determined whether or not it is necessary for the speaker to re-utter the user utterance. If the reliability is less than the threshold, an interrogative in a predetermined system utterance generated in advance is set as an emphasis part and the user is asked to re-utter the sentence.)

 	With respect to Claim 13, Koukoumidis et al. disclose all the limitations of Claim 11 upon which Claim 13 depends. Koukoumidis et al. fail to explicitly teach 
 	further comprising determining that a confidence score associated with the first intent is less than a threshold value, wherein generating the first response comprises performing one or more operations based on a non-intent specific response library to generate a response that is unrelated to the first intent.  
	However, Pyo et al. teach
 	further comprising determining that a confidence score associated with the first intent is less than a threshold value, wherein generating the first response comprises performing one or more operations based on a non-intent specific response library to generate a response that is unrelated to the first intent (Pyo et al. [0053] Referring to FIGS. 1 and 4, in operation 410, a user utterance is input from the system speaking style determination unit 120, and the reliability of a recognized character string included in the user utterance is compared with a predetermined threshold. According to the comparison result, it is determined whether or not it is necessary for the speaker to re-utter the user utterance. If the reliability is less than the threshold, an interrogative in a predetermined system utterance generated in advance is set as an emphasis part and the user is asked to re-utter the sentence.) 
 	Koukoumidis et al. and Pyo et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using the teaching of generating an interrogative in advance as taught by Pyo et al., for the benefit of asking the user to re-utter the input (Pyo et al. [0053] Referring to FIGS. 1 and 4, in operation 410, a user utterance is input from the system speaking style determination unit 120, and the reliability of a recognized character string included in the user utterance is compared with a predetermined threshold. According to the comparison result, it is determined whether or not it is necessary for the speaker to re-utter the user utterance. If the reliability is less than the threshold, an interrogative in a predetermined system utterance generated in advance is set as an emphasis part and the user is asked to re-utter the sentence.)
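
	For illustration only, the threshold comparison at issue in Claims 3 and 13 (and its counterpart in Claims 2 and 12) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application, Koukoumidis et al., or Pyo et al.: the two response tables, the default threshold of 0.5, the treatment of a score exactly equal to the threshold, and the function name `select_response` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical intent-specific response library (Claims 2 and 12).
INTENT_RESPONSES: Dict[str, str] = {
    "weather_query": "Here is today's forecast.",
}

# Hypothetical non-intent-specific library (Claims 3 and 13), e.g. a
# pre-generated request to repeat the input, as in the Pyo et al. passage.
GENERIC_RESPONSES: List[str] = ["Could you say that again?"]


def select_response(intent: str, confidence: float, threshold: float = 0.5) -> str:
    """Pick a response library based on the confidence score of the intent."""
    if confidence > threshold and intent in INTENT_RESPONSES:
        # High confidence: respond with content related to the intent.
        return INTENT_RESPONSES[intent]
    # Low confidence (assumption: also equal to the threshold, or an intent
    # with no library entry): respond with content unrelated to the intent.
    return GENERIC_RESPONSES[0]
```

	The claims do not state what happens when the score equals the threshold; the sketch arbitrarily routes that case to the generic library.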

8.	Claims 6, 7, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Koukoumidis et al. (US 11,086,858 B1) in view of Penilla et al. (US 2016/0104486 A1).

	With respect to Claim 6, Koukoumidis et al. disclose all the limitations of Claim 1 upon which Claim 6 depends. Koukoumidis et al. fail to explicitly teach 
 	further comprising: 
 	determining a personality attribute weighting of an artificial intelligence avatar associated with the first response; and 
 	modifying the first response based on the personality attribute weighting. 
	However, Penilla et al. teach 
 	further comprising: 
 	determining a personality attribute weighting of an artificial intelligence avatar associated with the first response (Penilla et al. [0025] the voice profile identifies a type of vehicle response that is customized for the user, based on the identified tone in the voice input by the user); and 
 	modifying the first response based on the personality attribute weighting (Penilla et al. [0298] the vehicle response can be adjusted to cater to the tone of the user, e.g., so as to provide, augment, modify, moderate, and/or change the vehicle response to detected tone in the user’s voice.)
 	Koukoumidis et al. and Penilla et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using the teaching of modifying the response as taught by Penilla et al., for the benefit of adapting the response to the detected tone in the user’s voice (Penilla et al. [0298] the vehicle response can be adjusted to cater to the tone of the user, e.g., so as to provide, augment, modify, moderate, and/or change the vehicle response to detected tone in the user’s voice.)

 	With respect to Claim 7, Koukoumidis et al. in view of Penilla et al. teach 
 	wherein the personality attribute weighting includes at least one of an excitability weighting, a curiosity weighting, and an interruptability weighting (Penilla et al. [0013] the mood of the user includes one or more of a normal mood, a frustrated mood, an agitated mood, an upset mood, a hurried mood, an urgency mood, a rushed mood, a stressed mood, a calm mood, a passive mood, a sleepy mood, a happy mood, an excited mood, or combinations of two or more thereof.)

 	With respect to Claim 18, Koukoumidis et al. disclose all the limitations of Claim 11 upon which Claim 18 depends. Koukoumidis et al. fail to explicitly teach 
 	further comprising: 
 	determining a personality attribute weighting of an artificial intelligence avatar associated with the first response; and 
 	modifying the first response based on the personality attribute weighting. 
	However, Penilla et al. teach 
 	further comprising: 
 	determining a personality attribute weighting of an artificial intelligence avatar associated with the first response (Penilla et al. [0025] the voice profile identifies a type of vehicle response that is customized for the user, based on the identified tone in the voice input by the user); and 
 	modifying the first response based on the personality attribute weighting (Penilla et al. [0298] the vehicle response can be adjusted to cater to the tone of the user, e.g., so as to provide, augment, modify, moderate, and/or change the vehicle response to detected tone in the user’s voice.)
 	Koukoumidis et al. and Penilla et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using teaching of modifying the response as taught by Penilla et al. for the benefit of modifying the response in response to detected tone in the user’s voice (Penilla et al. [0298] the vehicle response can be adjusted to cater to the tone of the user, e.g., so as to provide, augment, modify, moderate, and/or change the vehicle response to detected tone in the user’s voice.)

9.	Claims 8, 9, 19 are rejected under 35 U.S.C. 103 as being unpatentable over Koukoumidis et al. (US 11,086,858 B1) in view of Arora et al. (US 2019/0318219 A1.)

	With respect to Claim 8, Koukoumidis et al. disclose all the limitations of Claim 1 upon which Claim 8 depends. Koukoumidis et al. fail to explicitly teach 
 	further comprising: 
 	determining an intonation cue associated with the first audio speech segment; and
 	modifying the first response based on the intonation cue.  
	However, Arora et al. teach 
 	determining an intonation cue associated with the first audio speech segment (Arora et al. [0026] the customized response program may detect changes in the speaking pattern and tone of voice of the user and, based on an analysis of the user’s persona, the customized response program may appropriate match the user’s tone of voice, emotion, speaking pattern and language); and
 	modifying the first response based on the intonation cue (Arora et al. [0026] the customized response program may detect changes in the speaking pattern and tone of voice of the user and, based on an analysis of the user’s persona, the customized response program may appropriate match the user’s tone of voice, emotion, speaking pattern and language) 
Koukoumidis et al. and Arora et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using teaching of customizing the response as taught by Arora et al. for the benefit of customizing the response in response to changes in tone of voice of the user (Arora et al. [0026] the customized response program may detect changes in the speaking pattern and tone of voice of the user and, based on an analysis of the user’s persona, the customized response program may appropriate match the user’s tone of voice, emotion, speaking pattern and language.)

 	With respect to Claim 9, Koukoumidis et al. in view of Arora et al. teach 
 	wherein the intonation cue includes at least one of a rising intonation, a trailing intonation, and a declarative intonation (Arora et al. [0026] the customized response program may detect changes in the speaking pattern and tone of voice of the user and, based on an analysis of the user’s persona, the customized response program may appropriate match the user’s tone of voice, emotion, speaking pattern and language.)

 	With respect to Claim 19, Koukoumidis et al. disclose all the limitations of Claim 11 upon which Claim 19 depends. Koukoumidis et al. fail to explicitly teach 
 	further comprising: 
 	determining an intonation cue associated with the first audio speech segment; and
 	modifying the first response based on the intonation cue.  
	However, Arora et al. teach 
 	determining an intonation cue associated with the first audio speech segment (Arora et al. [0026] the customized response program may detect changes in the speaking pattern and tone of voice of the user and, based on an analysis of the user’s persona, the customized response program may appropriate match the user’s tone of voice, emotion, speaking pattern and language); and
 	modifying the first response based on the intonation cue (Arora et al. [0026] the customized response program may detect changes in the speaking pattern and tone of voice of the user and, based on an analysis of the user’s persona, the customized response program may appropriate match the user’s tone of voice, emotion, speaking pattern and language) 
Koukoumidis et al. and Arora et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using teaching of customizing the response as taught by Arora et al. for the benefit of customizing the response in response to changes in tone of voice of the user (Arora et al. [0026] the customized response program may detect changes in the speaking pattern and tone of voice of the user and, based on an analysis of the user’s persona, the customized response program may appropriate match the user’s tone of voice, emotion, speaking pattern and language.)

10.	Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Koukoumidis et al. (US 11,086,858 B1) in view of Wang et al. (US 2018/0357286 A1.)

	With respect to Claim 10, Koukoumidis et al. disclose all the limitations of Claim 1 upon which Claim 10 depends. Koukoumidis et al. fail to explicitly teach 
 	further comprising: 
 	analyzing a video feed associated with the user utterance; 
 	determining a second intent based on the video feed; and 
 	modifying the first response based on the second intent.  
	However, Wang et al. teach 
 	further comprising: 
 	analyzing a video feed associated with the user utterance (Wang et al. [0030] the labeled emotion database 118 includes labeled user data where the user data is associated with one or more potential emotions. Potential emotions include, but are not limited to, "happy," "sad," "agitated," "angry," "upset," "joyful," "tearful," "depressed," "despair," and other such emotions or combinations of emotions. As known to one of ordinary skill in the art, the labeled user data, such as the historical GPS locations, the current GPS location, the prosody of a given query and/or command, the words and/or phrases used in a particular query and/or command, an image and/or video, are each associated with one or more of the potential emotions);
 	determining a second intent based on the video feed (Wang et al. [0013] the disclosed emotional chatbot continuously tracks a given user's emotions using a multimodal emotion detection approach that includes determining the user's emotions from a variety of signals including, but not limited to, biometric data, voice data (e.g., the prosody of the user's voice), content/text, one or more camera images, one or more facial expressions, and other such sources of contextual data); and 
 	modifying the first response based on the second intent (Wang et al. [0017] The conversational chatbot may then further modify response to a user’s query and/or command based on assigned particular emotional state.)
 	Koukoumidis et al. and Wang et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using teaching of modifying the response as taught by Wang et al. for the benefit of modifying the response in response to the particular emotional state (Wang et al. [0017] The conversational chatbot may then further modify response to a user’s query and/or command based on assigned particular emotional state.)

11.	Claims 15, 16 are rejected under 35 U.S.C. 103 as being unpatentable over Koukoumidis et al. (US 11,086,858 B1) in view of Costa (US 9,361,084 B1.)

 	With respect to Claim 15, Koukoumidis et al. disclose all the limitations of Claim 14 upon which Claim 15 depends. Koukoumidis et al. fail to explicitly teach 
 	wherein a first duration of time represented by the first second audio speech segment overlaps with a second duration of time represented by the second audio speech segment.  
	However, Costa teaches 
 	wherein a first duration of time represented by the first second audio speech segment overlaps with a second duration of time represented by the second audio speech segment (Costa col. 6 lines 60-67 The speech recognition module 104 may also be configured to sample and quantize the received input, divide the received input into overlapping or non-overlapping frames of time (e.g., 15 milliseconds), and/or perform spectral analysis on the frames to derive the spectral components of each frame. In addition, the speech recognition module 104 or a similar component may be configured to perform processes relating to noise removal.)
 	Koukoumidis et al. and Costa are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using teaching of overlapping frames in time domain as taught by Costa for the benefit of performing spectral analysis in speech recognition (Costa col. 6 lines 60-67 The speech recognition module 104 may also be configured to sample and quantize the received input, divide the received input into overlapping or non-overlapping frames of time (e.g., 15 milliseconds), and/or perform spectral analysis on the frames to derive the spectral components of each frame. In addition, the speech recognition module 104 or a similar component may be configured to perform processes relating to noise removal.)

 	With respect to Claim 16, Koukoumidis et al. disclose all the limitations of Claim 14 upon which Claim 16 depends. Koukoumidis et al. fail to explicitly teach 
 	wherein a first duration of time represented by the first second audio speech segment is non- overlapping with a second duration of time represented by the second audio speech segment.  
 	However, Costa teaches 
 	wherein a first duration of time represented by the first second audio speech segment is non- overlapping with a second duration of time represented by the second audio speech segment (Costa col. 6 lines 60-67 The speech recognition module 104 may also be configured to sample and quantize the received input, divide the received input into overlapping or non-overlapping frames of time (e.g., 15 milliseconds), and/or perform spectral analysis on the frames to derive the spectral components of each frame. In addition, the speech recognition module 104 or a similar component may be configured to perform processes relating to noise removal.)
 	Koukoumidis et al. and Costa are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of generating the response based on the speculative query as taught by Koukoumidis et al., using teaching of non-overlapping frames in time domain as taught by Costa for the benefit of performing spectral analysis in speech recognition (Costa col. 6 lines 60-67 The speech recognition module 104 may also be configured to sample and quantize the received input, divide the received input into overlapping or non-overlapping frames of time (e.g., 15 milliseconds), and/or perform spectral analysis on the frames to derive the spectral components of each frame. In addition, the speech recognition module 104 or a similar component may be configured to perform processes relating to noise removal.)

Conclusion
12.	The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. See PTO-892.
a.	Bhardwaj et al. (US 8,041,565 B1.) Bhardwaj et al. disclose a method of overlapping the conversion of speech into text. 
b.	Murugeshan et al. (US 2018/0032884 A1.) Murugeshan et al. disclose a method for generating adaptive response to user interaction. 
c.	Campbell et al. (US 2016/0063118 A1.) Campbell et al. disclose a method of predicting queries based on the partial query input. 

13.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to THUYKHANH LE whose telephone number is (571)272-6429.  The examiner can normally be reached on Mon-Fri: 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL can be reached on 571-272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to 



/THUYKHANH LE/Primary Examiner, Art Unit 2658