DETAILED ACTION
This action is responsive to the Request for Continued Examination filed on 27 September 2021. Claims 21-40 are pending in the case; claims 1-20 were previously canceled. Claims 21 and 31 are independent claims.
This action is non-final.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.
Applicant's submission filed on 27 September 2021 has been entered.
Applicant’s Response
Applicant had previously filed an after-final response on 26 July 2021 which was responded to with an advisory action mailed 2 August 2021. In the advisory action, Examiner made suggestions for clarifications in the claims:
If a first portion of the output data is determined separate from the application and a second portion of the output data is determined from the application, Applicant may wish to make further amendments to this effect, with appropriate citations of support in the disclosure as originally filed.
In Applicant’s response dated 27 September 2021 (hereinafter Response), Applicant amended Claims 21-22, 27, 30-32, 37, and 40; and argued against the objections and/or rejections previously set forth in the Office Action dated 30 April 2021 (hereinafter Previous Action). It is noted that Applicant’s response provides no citations of support for 
Response to Amendment/Arguments
In response to Applicant's amendment to cure the 35 U.S.C. § 112 rejection(s) of claim(s) 22 and 32, the arguments are persuasive, and the 35 U.S.C. § 112 rejection(s) of the claim(s) is respectfully withdrawn.
In response to Applicant's statement that GRUBER fails to teach performing speech processing using the input audio data to determine natural language understanding (NLU) data, the NLU data including intent data corresponding to the user utterance; determining the intent data is associated with the user selection; determining, based at least in part on the intent data, a first portion of output data; and performing text-to-speech processing on the first portion of the output data to determine output audio data  as recited in claims 21 and 31 (see Response, page 8), and thus does not anticipate the claimed subject matter, Examiner respectfully disagrees.
Applicant states “Gruber does not determine intent data that corresponds to the user utterance”; however, the instant application provides no specific definition of “intent data”, and GRUBER clearly teaches performing speech processing using the input audio data to determine natural language understanding (NLU) data, the NLU data including intent data corresponding to the user utterance (FIG 28 shows the natural language processing procedure 200 in detail, note the output is a representation of user intent; this process is used in the method of FIG 33 to process input).
Applicant makes no other specific arguments against GRUBER on either page 8 or page 9 of the Response, other than a general allegation that “determination of intent data as claimed and supported by the Specification that is different from the intent of Gruber which is parsed from input text via a specific match list” which appears to acknowledge that GRUBER analyzes the input to determine the user intent.
It is noted that Applicant makes no specific argument as to how the amendment to independent claims (replacing “command” with the broader “user utterance”; 
The rejections of record of independent claims 21 and 31 are respectfully maintained, restated where necessary in response to Applicant’s amendment.
Applicant makes no argument with respect to the rejection of any dependent claim other than to rely on their dependency from the independent claims. These rejections are respectfully maintained.
Further, new grounds of rejection under 35 USC § 112 are provided in response to Applicant’s amendment.
Examiners are to give claims their broadest reasonable interpretation in light of the supporting disclosure. As explained in MPEP § 2111, giving a claim its broadest reasonable interpretation during prosecution will reduce the possibility that the claim, when issued, will be interpreted more broadly than is justified. Applicant is reminded that although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
	
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 21-40 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Regarding claim 21, the claim recites the limitation “determining the intent data is associated with the user selection”; however, there is no clear support for this limitation in the original disclosure of the instant application.
The term “intent” or “intent data” appears in the originally-filed specification at [0047] and [0060]. These paragraphs are solely with respect to the natural language understanding of speech input and are silent with respect to how the intent data may be related or associated with any other element.
The term “intent” or “intent data” does not appear in the originally-filed claims. 
The term “intent data” does appear in the later-filed preliminary amendment to the claims (“processing the input text data to determine intent data corresponding to the command; using the intent data to determine output text data; determining the intent data corresponds to a first invocation of the application; sending the intent data to the application”; filed without identifying any support in the originally-filed application). Applicant cannot rely on the later-filed preliminary amendment for support of this element (see 37 CFR 1.115(a)(2) A preliminary amendment filed after the filing date of the application is not part of the original disclosure of the application. See also MPEP § 608.04 
Thus, the original disclosure does not clearly support determining the intent data (at least part of the output of natural language processing) is associated with the user selection (corresponding to an application displayed using the GUI). 
It is noted that Applicant provided no citation of support with their amendment.
Regarding claim 31, the claim recites a similar limitation and is rejected under similar rationale.
Regarding dependent claims 22-30 and 32-40, dependent claims necessarily inherit the deficiency of the parent claim.
Claim Rejections - 35 USC § 102
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 21-25, 27-35, and 37-40 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by GRUBER et al. (Pub. No.: US 2013/0275164 A1).
Regarding claim 21, GRUBER teaches the computer-implemented method (FIGs 3, 4; [0080] computing device 60 suitable for implementing at least a portion of the intelligent automated assistant features/functionalities… end-user, network server or server system; note also FIG 5 showing multiple clients communicating with multiple servers and service providers), comprising:
causing a graphical user interface (GUI) to be displayed on a device (FIG 33 (100) prompted to enter request [0675] using e.g. FIG 26 “Active multi-modal input” which includes (2640) “present GUI for input” and “actively offer suggested possible responses in dialogs”; see further FIG 23 (140) “Active GUI-based Input solicitation”, (141) “present GUI with links and buttons”; note also FIG 24 (151,152) responses are suggested to user; see one of many example UI representations in FIG 12);
receiving a user selection (FIG 23 (142) “user interacts with GUI element”; FIG 24 (154) “user picks suggested response”) corresponding to an application displayed using the GUI (see e.g. FIG 12 listing various tasks that the assistant can perform; interpreting each possible task as an “application” under the broadest reasonable interpretation of the term, particularly in view of the different natures of requesting information about weather, setting task reminders, booking a taxi, or making a restaurant reservation);
receiving input audio data (FIG 26 includes (2610) “actively elicit speech input”; see further FIG 22 “active speech input elicitation” (121) receive speech input) corresponding to a user utterance (some task the user wishes to perform, e.g. for restaurants can “find” and/or “make reservation”, for weather can “get forecasts”);
performing speech processing using the input audio data to determine intent data corresponding to the user utterance ([0681] An embodiment of an active input elicitation component 1094 calls a speech-to-text service (this converts the input audio to text which may then be analyzed for intent with respect to a command to be performed); [0681] embodiment of language interpreter component 1070 is then called in step 200, as described in connection with FIG. 29. Language interpreter component 1070 parses the text input and generates a list of possible interpretations of the user's intent 290; interpreting “intent” as what the user intends to accomplish using the “user utterance” input);
determining the intent data is associated with the user selection (part of FIG 33 step 300 “Identify task, task parameters, dialog flow”; this includes determining which application (e.g. the one selected by the user or some other application) needs to receive the user’s command in order to complete the task; [0683] user intent is passed to dialog flow processor 1080… (e.g. constrained selection task to find a restaurant by constraints));
determining, based at least in part on the intent data, a first portion of output data (once the system knows the task, parameters, then FIG 33 (400) the flow is executed; see also [0685] In step 500, output processor 1092 generates a dialog summary of the results … formats output for user device in step 600…);
performing text-to-speech processing on the first portion of the output data to determine output audio data (inherent because [1006] makes clear that the dialog flow may be speech-only; or a combination of conversation screen and speech output; thus there necessarily must be some text-to-speech processing of text output provided by the digital assistant to an audio format); and
causing output audio data to be sent to the device ([0686] in step 700 device specific output is sent to mobile device which renders it on the screen (or other output device); [1006] dialog may be speech-only, or may be both conversation screen and speech-based input/output).
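For illustration only, the sequence of steps mapped above for claim 21 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from GRUBER or from the instant application: every name (`NluData`, `APP_INTENTS`, `handle_request`, and the helper functions) is hypothetical, and the speech-to-text, NLU, and text-to-speech steps are toy stand-ins rather than real speech processing.

```python
from dataclasses import dataclass

# All names below are hypothetical; none come from GRUBER or the claims.


@dataclass
class NluData:
    text: str    # transcript produced by the speech-to-text stand-in
    intent: str  # intent data corresponding to the user utterance


# Toy table: which intents each GUI-selectable application handles.
APP_INTENTS = {
    "restaurants": {"find_restaurant", "make_reservation"},
    "weather": {"get_forecast"},
}


def speech_to_text(audio: bytes) -> str:
    """Stand-in for ASR: the 'audio' here is just UTF-8 encoded text."""
    return audio.decode("utf-8")


def understand(text: str) -> NluData:
    """Stand-in for NLU: keyword matching instead of a language interpreter."""
    lowered = text.lower()
    if "forecast" in lowered or "weather" in lowered:
        intent = "get_forecast"
    elif "reserv" in lowered or "book" in lowered:
        intent = "make_reservation"
    elif "restaurant" in lowered:
        intent = "find_restaurant"
    else:
        intent = "unknown"
    return NluData(text=text, intent=intent)


def intent_matches_selection(nlu: NluData, selected_app: str) -> bool:
    """Determine whether the intent data is associated with the user selection."""
    return nlu.intent in APP_INTENTS.get(selected_app, set())


def text_to_speech(text: str) -> bytes:
    """Stand-in for TTS: tags the text rather than synthesizing audio."""
    return b"AUDIO:" + text.encode("utf-8")


def handle_request(selected_app: str, audio: bytes) -> bytes:
    """Run the steps in order and return the output audio data for the device."""
    nlu = understand(speech_to_text(audio))
    if intent_matches_selection(nlu, selected_app):
        first_portion = f"OK, handling {nlu.intent} with {selected_app}."
    else:
        first_portion = f"Sorry, {selected_app} cannot handle that request."
    return text_to_speech(first_portion)
```

In this sketch the association step is a simple table lookup; an actual system could make that determination in many other ways.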
Regarding dependent claim 22, incorporating the rejection of claim 21, GRUBER further teaches: wherein the determining the output data further comprises:
determining the intent data corresponds to a first invocation of the application (FIG 33: (300) identify task, task parameters, dialog flow);
sending the intent data to the application (FIG 33: (400) execute flow, orchestrating services 1084; see discussion of what a service could include starting at [0504], some examples include [0512] browsing movies [0513] restaurant selection and meal planning); and
receiving a second portion of the output data from the application ([0684] services 1084 contribute some data (a second portion) to the common result (the total output to be provided)).
Regarding dependent claim 23, incorporating the rejection of claim 21, GRUBER further teaches: wherein the GUI is configured to display data corresponding to a plurality of applications (see e.g. FIG 12 examples of different tasks the user can ask the assistant to perform; booking a taxi is clearly a different task/application than inquiring about the weather or inquiring about reminders; see further FIG 20).
Regarding dependent claim 24, incorporating the rejection of claim 21, GRUBER further teaches receiving, from the device, a first identifier ([0548] match task requirements with declarative descriptions of capabilities and properties of services; interpreting “first identifier” as any means to indicate specific task desired [0555] note service numbers 1-99; citations drawn from [0544] FIG 37 which is an example procedure for executing service orchestration 1082, which is part of overall method in FIG 33 (400) orchestrating services) corresponding to the application (through the multimodal input; indicating which service is desired), wherein a representation corresponding to the first identifier was selected from a displayed set of options on the GUI (e.g. selected from FIG 12, such as weather, reminders, or some other task to be performed).
Regarding dependent claim 25, incorporating the rejection of claim 21, GRUBER further teaches wherein the user selection corresponds to a touch input ([0088] input device can be any suitable type including touchscreen; [0150] clicking and menu selection from GUI includes touches to a touch screen).
Regarding dependent claim 27, incorporating the rejection of claim 21, GRUBER further teaches causing a microphone associated with the device to determine the input audio data based at least in part on audio corresponding to the user utterance ([0144] types of input data includes [0145] voice input from mobile devices, computers with microphones).
Regarding dependent claim 28, incorporating the rejection of claim 27, GRUBER further teaches wherein the causing the microphone to determine the audio is performed at least in part in response to the user selection (note that in FIG 12 there is a speech-input icon which may be activated by the user to turn on speech input; part of the overall active elicitation of multi-modal input).
Regarding dependent claim 30, incorporating the rejection of claim 21, GRUBER further teaches wherein the performing speech processing using the input text data to determine the intent data is based at least in part on the application ([0682] Language interpreter component 1070 parses the text input and generates a list of possible interpretations of the user's intent 290… as part of the conversation flow, the intent is translated into a task (once a task has been determined) and task parameters, which then [0684] causes a service to be invoked in order to obtain results; additional information regarding dialog flow may be found in FIG 32 which makes clear that if the intent is not too ambiguous (310) then the domain and task parameters (312) can be determined, otherwise additional information is requested (322) and the conversation flow loops around again; note the use of domain models 1056; see [0225] for discussion of domain model 1622 in FIG 8 for dining out/meal event tasks; see further [0256] domain models used to constrain inputs; additional information about domain models starts at [0372]).
Regarding dependent claim 29, incorporating the rejection of claim 21, GRUBER further teaches determining a second user selection (breadth of interpretation includes receiving user input to close (or switch away from) the current conversation dialog because the user is finished (at least for the current session); see FIG 33, [0687] (790) “User is done?”) or selecting a different application that does not require voice interaction. Note that GRUBER provides suggestions which the user can select (see for example FIG 35, which shows an example set of restaurant results that the user can interact with (generating a map of the results by clicking “Map All”, calling a particular restaurant “Call”)); these at least suggest that the user is finished with the conversation with the digital assistant; and in response to the second user selection (if the user has indicated they are finished with their conversation with the digital assistant) ceasing (the digital assistant will stop) certain {any additional} processing {of the conversation loop} with regard to the input audio data (no more conversational input required).
Regarding claims 31-35, 37-40, GRUBER similarly teaches the system (e.g. FIG 5 [0090], clients 1304 communicating with servers 1340 and external services 1360; [0080] computing device 60 suitable for implementing at least a portion of the intelligent automated assistant features/functionalities… end-user, network server or server system) comprising: at least one processor (FIG 4 processor(s) 63 for each computing device 60, whether client or server); at least one memory ([0088] memory 1210 and/or storage device 1208) comprising instructions that ([0090] each client 1304 may run software for implementing client-side portions of the present invention. In addition, any number of servers 1340 can be provided for handling requests received from clients 1304), when executed by the at least one processor, cause the system to .


Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 26 and 36 are rejected under 35 USC 103 as unpatentable over GRUBER in view of GONG, Li (Pub. No.: US 2003/0167167 A1).
Regarding dependent claim 26, incorporating the rejection of claim 21, GRUBER does not appear to expressly disclose determining, based at least in part on the application, data corresponding to a voice type, wherein the performing the text-to-speech processing is based at least in part on the data corresponding to the voice type.
GONG is similarly directed to (abstract) an intelligent personal assistant (agent) that assists a user in operating a computing device and using application programs on the computing device. As can be seen in FIG 3, information about the user (305) as well as application information (310) are used to adapt (using adaptation engine 330) to generate verbal responses (340) which are modified by an affect generator (360). FIG 5 is a specific method for assisting the device user and generating appropriate visual (facial expression of assistant) and verbal (vocal expression of assistant) feedback. GONG states:
[0046] The verbal generator 340 then sends the textual verbal content to an I/O device for the computer device, typically a display device, or a text-to-speech generation program that converts the text to speech and sends the speech to a speech synthesizer.
[0047] The affect generator 360 receives information from the adaptation engine 330 and produces the affective expression for the intelligent social agent 350. The affect generator 360 produces facial expressions and vocal expressions for the intelligent social agent 350 based on an indication from the dynamic adaptor module 336 as to what emotion the intelligent social agent 350 should express. 
[0069] The processor then generates the appropriate affect for the verbal expression of the intelligent social agent (step 555). This may be accomplished by modifying the speech style from the baseline style of speech for the intelligent social agent. Speech style may include speech rate, pitch average, pitch range, intensity, voice quality, pitch changes, and level of articulation.

Applicant may also wish to note the intended use of affect generator 360 in the overall system, for example as explained with respect to FIG 9:
[0095] an architecture 900 of an intelligent personal assistant helping a user to operate applications in a computing device. The intelligent personal assistant 910 may assist the user 915 across various application programs or functions. As described with respect to FIGS. 3 and 7, intelligent personal assistant 910 interacts with the user 915 and the application programs 920 in a computing device, including basic functions relating to the device itself and applications running on the device such as enterprise applications.


Thus GONG may clearly be relied upon to teach determining, based at least in part on the application (the context and content information for the request), data corresponding to a voice type (the affect for vocal expression), wherein the performing the text-to-speech processing is based at least in part on the data corresponding to the voice type (modifying the speech style from the baseline style of speech for the intelligent social agent. Speech style may include speech rate, pitch average, pitch range, intensity, voice quality, pitch changes, and level of articulation; all used by speech generation program that converts the text to speech and sends the speech to a speech synthesizer).
Accordingly, it would have been obvious to one having ordinary skill in graphical user interfaces before the effective filing date of the claimed invention, having the teachings of GRUBER and GONG before them, to have combined GRUBER (inherently teaching text-to-speech conversion without details with respect to an automated assistant) and GONG (teaching a specific text-to-speech mechanism which adapts the vocal expression based on context information including the user and the application with respect to an automated assistant) and arrived at the claimed invention with expected and predictable results, motivated by GONG [0055] process 500 may help an intelligent social agent to act appropriately based on the user and the application context, for example [0041] more relaxed, [0042] reflect the user is happy or energetic, [0043] be apologetic when the user is frustrated with the device or the assistant itself.
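For illustration only, the combination discussed above (text-to-speech processing that depends on a voice type determined from the application) can be sketched as follows. This is a minimal sketch under stated assumptions: the style fields borrow three of the speech-style attributes quoted from GONG [0069] (speech rate, pitch average, intensity), while the names `SpeechStyle`, `APP_VOICE_ADJUSTMENTS`, `voice_type_for`, and `synthesize`, the application labels, and all numeric values are hypothetical and do not appear in either reference.

```python
from dataclasses import dataclass, replace

# Hypothetical names and values throughout; only the attribute names
# (rate, pitch average, intensity) echo the speech-style list in GONG [0069].


@dataclass(frozen=True)
class SpeechStyle:
    rate: float = 1.0
    pitch_average: float = 1.0
    intensity: float = 1.0


# Baseline style that is modified per application.
BASELINE = SpeechStyle()

# Assumed per-application adjustments to the baseline style.
APP_VOICE_ADJUSTMENTS = {
    "enterprise": {"rate": 0.95, "intensity": 0.9},
    "game": {"rate": 1.1, "pitch_average": 1.1},
}


def voice_type_for(application: str) -> SpeechStyle:
    """Determine voice-type data based at least in part on the application."""
    return replace(BASELINE, **APP_VOICE_ADJUSTMENTS.get(application, {}))


def synthesize(text: str, style: SpeechStyle) -> dict:
    """Stand-in for TTS: returns the request a synthesizer would receive."""
    return {
        "text": text,
        "rate": style.rate,
        "pitch_average": style.pitch_average,
        "intensity": style.intensity,
    }
```

An application with no entry in the table falls back to the baseline style, which mirrors the idea of modifying a baseline speech style only when the context calls for it.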

It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)). Further, a reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including nonpreferred embodiments. Merck & Co. v. Biocraft Laboratories, 874 F.2d 804, 10 USPQ2d 1843 (Fed. Cir.), cert. denied, 493 U.S. 975 (1989). See also Upsher-Smith Labs. v. Pamlab, LLC, 412 F.3d 1319, 1323, 75 USPQ2d 1213, 1215 (Fed. Cir. 2005); Celeritas Technologies Ltd. v. Rockwell International Corp., 150 F.3d 1354, 1361, 47 USPQ2d 1516, 1522-23 (Fed. Cir. 1998).

CONCLUSION
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 20200251111 A1 (TEMKIN) FIG. 20C shows an interface 2002 that explains that the voice assistant functionality can be provided as an application or “skill” that can extend the functionality of a voice assistant. Note particularly relevant FIGs 20E showing sample invocations with additional information, and 20G showing sample response to an invocation. This is an intervening reference relevant to unclaimed subject matter, and would be applicable when claims are deemed not entitled to priority due to new matter.
US 20090030691 A1 (CERRA) A user may control a mobile communication facility through recognized speech provided to the mobile communication facility. Speech that is recorded by a user using a mobile communication facility resident capture facility. A speech recognition facility generates results of the recorded speech using an unstructured language model based at least in part on information relating to the recording. An application resident on the mobile communications facility is identified, wherein the resident application is capable of taking the results generated by the speech recognition facility as an input. The generated results are input to the application.
US 8924219 B1 (BRINGERT) A computing device listens for speech that corresponds to one of a plurality of activation phrases or "hotwords" that cause the computing device to recognize further speech input in a second speech detection mode. Each activation phrase is associated with a respective application. During the first speech detection mode, the computing device compares detected speech to the activation phrases to identify any potential matches. In response to identifying a matching activation phrase with a sufficiently high confidence, the computing device invokes the application associated with the matching activation phrase and enters the second speech detection mode. In the second speech detection mode, the computing device listens for speech input related to the invoked application.
Methods and devices for enabling and disabling applications using voice are described herein… the backend system may receive one or more rules for performing functionalities of the application, as well as one or more sample templates of sample utterances and sample responses that future utterances may use when requesting the application [relevant to subject matter described but not claimed, note patented claim 1]
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY M LEVY whose telephone number is (571)270-3771. The examiner can normally be reached Mon-Fri 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RENEE CHAVEZ, can be reached on (571) 270-1104. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Amy M Levy/
Primary Examiner, Art Unit 2179