DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 11, and 21 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 8-16, 18-26, and 28-30 are rejected under 35 U.S.C. 103 as being unpatentable over Gillespie et al. (US-10229680-B1) in view of Robichaud et al. (US-20160188565-A1) and Badaskar et al. (US-20140081633-A1).
Regarding Claim 1,
Gillespie et al. teaches a computer-implemented method for providing an artificial intelligence (AI) -based digital assistant, the method being executed by one or more processors and comprising: 
receiving, by an interaction manager, communication data from a device (“In one embodiment, orchestrator 250 may be configured to receive the audio data, and may determine that the notification has also been received. Orchestrator 250 may determine whether or not the notification indicates that there is (or was) content displayed by display screen 212 at the time that the utterance was spoken (e.g., when the wakeword was uttered).” See e.g., Gillespie, Col. 26 lines 42-48), the communication data comprising data input by a user of the device (“FIG. 1 is an illustrative diagram of an exemplary system for using information associated with displayed content for anaphora resolution, in accordance with various embodiments. In the non-limiting embodiment, an individual 2 may speak an utterance 4 to a voice activated electronic device 100.” See e.g., Gillespie, Col. 5 lines 9-14), the interaction manager coordinating interactions between the device and an action handler (“In some embodiments, orchestrator 250 may also provide text data representing received audio data to NLU system 260. The output data from NLU system 260, which may include one or more resolved entities and a selected context file, may then be provided to functionalities system 262 to cause, or to attempt to cause, one or more actions to occur.” See e.g., Gillespie, Col. 13 lines 57-63); 
(“Similarly, orchestrator 250 may also receive text data representing the audio data from ASR system 258.” See e.g., Gillespie, Fig. 3A, Col. 33 lines 35-36); 
providing an intent set and an entity set based on processing the text data through an artificial intelligence service, the intent set comprising one or more intents indicated in the text data (Col. 2 lines 29-35; For example, if an individual says, "Alexa, buy this," the intent of this utterance may be related to a shopping domain, and the intent may be for purchasing of an item. The "purchasing an item" intent may include various slots that may be resolved based, in one embodiment, on entity data requested by the orchestrator.), the entity set comprising one or more entities indicated in the text data (Col. 22 lines 30-33 NLU system 260 may include a named entity recognition ("NER") system 272, which may be used to identify portions of text that correspond to a named entity recognizable by NLU system 260.); 
identifying, by the interaction manager, a set of actions based on one or more of the text data, the intent set, and the entity set, the set of actions comprising one or more actions to be executed by one or more third-party computer-implemented services (See e.g., Gillespie, Col. 3 lines 4-14 and Col. 28 lines 6-23);
transmitting, by the interaction manager, data representative of an action to the action handler, the action handler requesting execution of the action by a third-party computer-implemented service (“If the heuristics score is greater than zero, then NLU system 260 may be configured to generate a selected context file that may be included with the output data from NLU system 260, which orchestrator 250 may provide back to an application, or applications, of functionalities system 262 to perform, or attempt to perform, one or more actions.” See e.g., Gillespie, Col. 28 lines 6-23) and (“Functionalities system 262 may, for example, correspond to various action specific applications, which are capable of processing various task specific actions and/or performing various functionalities. Functionalities system 262 may further correspond to first party applications and/or third party applications capable of performing various tasks or actions, or performing various functionalities.” See e.g., Gillespie, Col. 28 lines 24-30); 
Gillespie et al. does not explicitly disclose
the artificial intelligence service implementing one or more convolution neural networks (CNNs); 
receiving, by the interaction manager and from the action handler, a set of results comprising at least one result from the third-party computer-implemented service executing the action of the set of actions; 
determining, by the interaction manager, that the set of results includes a deficiency comprising the set of results including too many results to be efficiently communicated to the user, and in response and prior to transmitting the at least one result to the device, transmitting at least one disambiguation question to the device;
receiving, by the interaction manager, a disambiguation response that is responsive to the at least one disambiguation question;
defining a sub-set of results based on the disambiguation response, the sub-set of results comprising fewer results than the set of results and comprising the at least one result;

transmitting, by the interaction manager, the result data to the device.
However, Robichaud et al. teaches
the artificial intelligence service implementing one or more convolution neural networks (CNNs) (para [0022] In aspects, the language understanding component 120 may include standard spoken language understanding models such as support vector machines, conditional random fields and/or convolutional non-recurrent neural networks for training purposes.); 
receiving, by the interaction manager and from the action handler, a set of results (para [0040] “The dialog component 130 may send this query to the backend engine 360 and get relevant results back.”) comprising at least one result from the third-party computer-implemented service executing the action of the set of actions (para [0030]); 
determining, by the interaction manager, that the set of results includes a deficiency, and in response and prior to transmitting the at least one result to the device, transmitting at least one disambiguation question to the device (para [0030] In other cases, when there is an ambiguity as to which dialog hypothesis should be ranked the highest, the HRS component 350 may decide to pick the dialog hypothesis with the highest score, even if the difference is very small. In other cases, when there is an ambiguity as to which dialog hypothesis should be ranked the highest, the HRS component 350 may send a disambiguation question to a user of the client computing device 104 such as, "I'm not sure what you want to do, do you want to look up the opening hours of 5 Guys Burger restaurant?" If the user answers yes, the HRS component 350 may rank the dialog hypothesis associated with the answer as the highest. and para [0043] In other cases, when there is an ambiguity as to which dialog hypothesis should be ranked the highest, the HRS component 350 may send a disambiguation question to a user of the client computing device 104 such as, "I'm not sure what you want to do, do you want to look up the opening hours of 5 Guys Burger restaurant?");
receiving, by the interaction manager, a disambiguation response that is responsive to the at least one disambiguation question (para [0043] If the user answers yes, the HRS component 350 may rank the dialog hypothesis associated with the answer as the highest. In the user answers no, the HRS component 350 may send a generic web search query to the backend engine 360.);
providing, by the interaction manager, result data comprising data describing the at least one result… (para [0048] “In one case, the action performed may include using the highest ranked dialog hypothesis to query a web backend engine for results and sending the results to the user of the client computing device.”); and 
transmitting, by the interaction manager, the result data to the device (para [0048] “In one case, the action performed may include using the highest ranked dialog hypothesis to query a web backend engine for results and sending the results to the user of the client computing device.”).
It would have been obvious to persons having ordinary skill in the art before the effective filing date to combine the digital assistant of Gillespie et al. with the CNN of Robichaud et al.
(para [0022] In one aspect, domain prediction may include classifying the natural language expression into a supported domain of the language understanding component 120.).
	Badaskar (US 20140081633 A1) teaches 
determining, by the interaction manager, that the set of results includes a deficiency comprising the set of results including too many results to be efficiently communicated to the user (para [0102] For example, the digital assistant may recognize that the search query for "photos from last summer's vacation" is likely to return too many results, or that those photos that are returned do not match a model or profile of "vacation" pictures (e.g., they were taken in too many different locations, they span too long a time, there are significant gaps between returned photos, etc.).), and in response and prior to transmitting the at least one result to the device, transmitting at least one disambiguation question to the device (para [0102] In some implementations, the digital assistant engages in a dialogue with a user in order to refine and/or disambiguate a search query, to acquire additional information that may help limit search results to a more relevant set, or to increase a confidence that the digital assistant has correctly understood the query.);
receiving, by the interaction manager, a disambiguation response that is responsive to the at least one disambiguation question (para [0102] In some implementations, the digital assistant engages in a dialogue with a user in order to refine and/or disambiguate a search query, to acquire additional information that may help limit search results to a more relevant set, or to increase a confidence that the digital assistant has correctly understood the query. For example, if a user searches for "photos from last summer's vacation," the digital assistant may respond to the user (e.g., via audible and/or visual output) by asking "did you mean all photos, or photos taken in a particular area?");
defining a sub-set of results based on the disambiguation response, the sub-set of results comprising fewer results than the set of results and comprising the at least one result (para [0102] Thus, the digital assistant requests the additional information from the user in order to determine which photographs the user wishes to see. The digital assistant may also or additionally identify that a search query does not contain sufficient information with which to generate a relevant result set.);
providing, by the interaction manager, result data comprising data describing the sub-set of results (para [0102] Accordingly, in some implementations, the digital assistant will request additional information from the user in order to generate a search query that will return a more appropriate result set.); and 
	It would have been obvious to persons having ordinary skill in the art before the effective filing date to combine the method of retrieving information with a digital assistant of Gillespie et al. with the digital assistant of Badaskar.
	Doing so would allow for retrieving more accurate responses (para [0102] Accordingly, in some implementations, the digital assistant will request additional information from the user in order to generate a search query that will return a more appropriate result set.).
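For illustration only, the limitations addressed above (determining that a result set is too large to communicate, transmitting a disambiguation question before any result is sent, and defining a smaller sub-set from the response) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Gillespie et al., Robichaud et al., Badaskar, or the application: the names `Photo`, `MAX_RESULTS`, `has_too_many`, `refine`, and `handle_results`, the threshold of 3 results, and the location-based filter are all hypothetical. The question text is the example quoted from Badaskar, para [0102].

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed threshold: more results than this are treated as "too many"
# to communicate efficiently. The value 3 is purely illustrative.
MAX_RESULTS = 3


@dataclass
class Photo:
    """Hypothetical result item returned by a third-party service."""
    name: str
    location: str


def has_too_many(results: List[Photo], limit: int = MAX_RESULTS) -> bool:
    """Deficiency check: the result set exceeds the assumed limit."""
    return len(results) > limit


def refine(results: List[Photo], answer: str) -> List[Photo]:
    """Define a sub-set of results from the disambiguation response."""
    wanted = answer.strip().lower()
    return [r for r in results if r.location.lower() == wanted]


def handle_results(results: List[Photo],
                   ask_user: Callable[[str], str]) -> List[Photo]:
    """Return the results to transmit to the device.

    If the set is too large, a disambiguation question is asked first
    and only the narrowed sub-set is returned.
    """
    if not has_too_many(results):
        return results
    answer = ask_user(
        "Did you mean all photos, or photos taken in a particular area?"
    )
    subset = refine(results, answer)
    # Fall back to the full set if the response matches nothing.
    return subset if subset else results
```

In this sketch the `ask_user` callback stands in for the round trip to the device, so the question is necessarily sent before any result data is returned.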
Regarding Claim 2,
(Col. 2 lines 17-20 The text data may then be provided to the natural language understanding system, which may attempt to resolve an intent of the utterance based, at least in part, on the text data. and Col. 24 lines 8-11 An intent classification ("IC") system 274 may parse the query to determine an intent or intents for each identified domain, where the intent corresponds to the action to be performed that is responsive to the query.).
Regarding Claim 3,
Gillespie et al., Badaskar, and Robichaud et al. teach the method of claim 2. Gillespie et al. further teaches wherein the NLP comprises word embedding (Col. 21 lines 49-52 ASR system 258 may further attempt to match received feature vectors to language phonemes and words as known in acoustic models and language models stored within storage/memory 254 of ASR system 258.).
Regarding Claim 4,
Gillespie et al., Badaskar, and Robichaud et al. teach the method of claim 1. Gillespie et al. further teaches wherein the artificial intelligence service comprises an entity extraction model using named entity recognition (NER) to provide the entity set (Col. 22 lines 30-33 NLU system 260 may include a named entity recognition ("NER") system 272, which may be used to identify portions of text that correspond to a named entity recognizable by NLU system 260.).
Regarding Claim 5,
(Col. 7 lines 1-12 In some embodiments, speech-processing system 200 may be unable to identify, or resolve, the entity that utterance 4 corresponds to based, at least in part, on the lack of filled declared slots associated with the particular intent. For example, speech-processing system 200 may be unable to determine what song "this" refers to using only the text data provided to natural language understanding processing from automatic speech generation processing. This may cause speech-processing system 200 to prompt individual 2 for additional information related to their request so as to determine an appropriate action, or actions, to occur in response.), 
Robichaud et al. teaches transmitting at least one disambiguation question to the device (Para [0043] In other cases, when there is an ambiguity as to which dialog hypothesis should be ranked the highest, the HRS component 350 may send a disambiguation question to a user of the client computing device 104 such as, "I'm not sure what you want to do, do you want to look up the opening hours of 5 Guys Burger restaurant?").
It would have been obvious to persons having ordinary skill in the art before the effective filing date to combine the request for more information of Gillespie et al. with the disambiguation question of Robichaud et al. 
Doing so would allow for discriminating ambiguous requests (para [0017] Accordingly, aspects described herein include machine learning based techniques for dynamically discriminating ambiguous requests.).

Regarding Claim 6,
Gillespie et al., Badaskar, and Robichaud et al. teach the method of claim 1.
Gillespie et al. further teaches further comprising determining that an expected entity is absent from the entity set based on an intent of the intent set, and in response (Col. 9 lines 52-64 The application may then cause a certain action to be performed by voice activated electronic device 100 in an attempt to resolve any entities from the declared slots that may be still be needed. For example, voice activated electronic device 100 may be caused to output a message requesting more information, such as, "I did not understand," or "Please say that again." In some embodiments, instead of passing the intent back to the application, an output may be generated including only the filled declared slots from the natural language understanding processing, and a domain ranking may occur to determine if any domains are capable of servicing the request based on the available information.), 
Robichaud et al. teaches transmitting at least one disambiguation question to the device (Para [0043] In other cases, when there is an ambiguity as to which dialog hypothesis should be ranked the highest, the HRS component 350 may send a disambiguation question to a user of the client computing device 104 such as, "I'm not sure what you want to do, do you want to look up the opening hours of 5 Guys Burger restaurant?").
It would have been obvious to persons having ordinary skill in the art before the effective filing date to combine the request for more information of Gillespie et al. with the disambiguation question of Robichaud et al. 
Doing so would allow for discriminating ambiguous requests (para [0017] Accordingly, aspects described herein include machine learning based techniques for dynamically discriminating ambiguous requests.).
Regarding Claim 8,
Gillespie et al., Badaskar, and Robichaud et al. teach the method of claim 1. Gillespie et al. further teaches wherein the communication data comprises audio data (Col. 2 lines 14-20; In a non-limiting embodiment, audio data representing an utterance may be received by the speech-processing system.), and the result data comprises audio result data (Col. 41 lines 29-31 At step 532, the determined action(s) may be performed. For instance, the audio file may being playing the song "Song 1" using speaker(s) 210.).
Regarding Claim 9,
Gillespie et al., Badaskar, and Robichaud et al. teach the method of claim 1. Gillespie et al. further teaches wherein the communication data comprises text data (Col. 2 lines 14-20; In a non-limiting embodiment, audio data representing an utterance may be received by the speech-processing system. Using automatic speech recognition processing, text data representing the audio data may be generated. The text data may then be provided to the natural language understanding system, which may attempt to resolve an intent of the utterance based, at least in part, on the text data.), and the result data comprises text result data (Col. 32 lines 3-10 For instance, NLU system 260 and/or functionalities system 262 may be employed to determine contextual features of a response to be generated, and may generate the corresponding text data representing that response. The text data may then be provided to TTS system 264, which may generate audio data representing the text data, which may then be sent to the requesting device).
Regarding Claim 10,
Gillespie et al., Badaskar, and Robichaud et al. teach the method of claim 1. Gillespie et al. further teaches wherein the result data comprises audio data that is provided by a voice response composition module based on text result data (As an illustrative example, the text, "play this," and contextual metadata describing content displayed by display screen 112 may resolve the entities [Domain]: "Music," [Anaphoric Term]: "this," [ Song Name] : "Song 1," [Artist Name]: "Artist 1," and [Album Name]: "Album 1." Therefore, the generated output may include each of these entities--[Domain], [Anaphoric Term], [ Song Name], [Artist Name], and [Album Name]--with their respective values--"Music," "this," " Song 1," "Artist 1," and "Album 1." In this particular scenario, an appropriate action, or actions, to occur for the intent having the output interpretation may be determined. Continuing the previous example, the action to occur may be to cause electronic device 100 to begin playing an audio file for a song having a title " Song 1." At step 532, the determined action(s) may be performed. For instance, the audio file may being playing the song "Song 1" using speaker(s) 210.).
Regarding Claim 11,
Claim 11 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 11 is substantially similar to claim 1 and is rejected on the same grounds.
Regarding Claim 12,
Claim 12 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 12 is substantially similar to claim 2 and is rejected on the same grounds.
Regarding Claim 13,
Claim 13 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 13 is substantially similar to claim 3 and is rejected on the same grounds.
Regarding Claim 14,
Claim 14 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 14 is substantially similar to claim 4 and is rejected on the same grounds.
Regarding Claim 15,
Claim 15 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 15 is substantially similar to claim 5 and is rejected on the same grounds.
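Regarding Claim 16,
Claim 16 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 16 is substantially similar to claim 6 and is rejected on the same grounds.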
Regarding Claim 18,
Claim 18 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 18 is substantially similar to claim 8 and is rejected on the same grounds.
Regarding Claim 19,
Claim 19 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 19 is substantially similar to claim 9 and is rejected on the same grounds.
Regarding Claim 20,
Claim 20 is the non-transitory computer-readable storage media corresponding to the method of claim 1. Claim 20 is substantially similar to claim 10 and is rejected on the same grounds.
Regarding Claim 21,
Claim 21 is the system corresponding to the method of claim 1. Claim 21 is substantially similar to claim 1 and is rejected on the same grounds.
Regarding Claim 22,
Claim 22 is the system corresponding to the method of claim 1. Claim 22 is substantially similar to claim 2 and is rejected on the same grounds.
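Regarding Claim 23,
Claim 23 is the system corresponding to the method of claim 1. Claim 23 is substantially similar to claim 3 and is rejected on the same grounds.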
Regarding Claim 24,
Claim 24 is the system corresponding to the method of claim 1. Claim 24 is substantially similar to claim 4 and is rejected on the same grounds.
Regarding Claim 25,
Claim 25 is the system corresponding to the method of claim 1. Claim 25 is substantially similar to claim 5 and is rejected on the same grounds.
Regarding Claim 26,
Claim 26 is the system corresponding to the method of claim 1. Claim 26 is substantially similar to claim 6 and is rejected on the same grounds.
Regarding Claim 28,
Claim 28 is the system corresponding to the method of claim 1. Claim 28 is substantially similar to claim 8 and is rejected on the same grounds.
Regarding Claim 29,
Claim 29 is the system corresponding to the method of claim 1. Claim 29 is substantially similar to claim 9 and is rejected on the same grounds.
Regarding Claim 30,
Claim 30 is the system corresponding to the method of claim 1. Claim 30 is substantially similar to claim 10 and is rejected on the same grounds.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Jadhav et al. (US 20180174220 A1) – discloses modifying a query whenever too many results are returned.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217.  The examiner can normally be reached on Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen, can be reached at 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Li B. Zhen/
Supervisory Patent Examiner, Art Unit 2121