DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on March 22, 2021 has been entered.  Claims 1-4, 6-11, and 13-20 are pending.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4, 6-11, and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gruber et al. (US 9,318,108 B2) in view of Lee et al. (US 2014/0316776 A1).
In re Claim 1, Gruber discloses an electronic device (see FIGS. 3-5: Computing Device 60/Client 1304; cols. 8-10: ll. 9-27, describing a computing device 60; col. 10: ll. 28-47, describing an architecture for implementing at least a portion of an intelligent automated assistant on a standalone computing system; and cols. 10-11: ll. 48-60, describing an architecture for implementing at least a portion of an intelligent automated assistant on a distributed computing network) for supporting a speech recognition service (see FIG. 1 and cols. 13-17: ll. 36-51, describing an intelligent automated assistant 1002; FIG. 9 and cols. 19-23: ll. 11-35, describing an intelligent automated assistant 1002 with respect to an active ontology 1050; FIG. 22 and cols. 28-30: ll. 41-46, describing an active speech input elicitation procedure; FIG. 28 and cols. 39-44: ll. 34-12, describing a natural language processing procedure; FIG. 32 and cols. 44-46: ll. 13-30, describing a dialog and flow analysis procedure; FIG. 33 and cols. 57-61: ll. 54-34, describing an automated call and response procedure; FIGS. 37-38 and cols. 47-52: ll. 5-67, describing a flow and service orchestration procedure as well as a service invocation procedure; and see FIGS. 39-42 and cols. 53-55: ll. 1-19, describing a multiphase output procedure as well as a multimodal output processing procedure), the electronic device comprising: 
a communication circuit (see FIG. 3: Interface(s) 68 and cols. 8-10: ll. 9-27, whereby computing device 60 includes interfaces 68 and a bus 67) configured to communicate with at least one external device (Id., and see FIGS. 5-7: Servers 1340/External Services 1360 and cols. 10-11: ll. 48-60, whereby each client 1304 may run software for implementing client-side portions, any number of servers 1340 can be provided for handling requests received from clients 1304, clients 1304 and servers 1340 can communicate with one another via electronic network 1361 using any known network protocols, and whereby servers 1340 can call external services 1360 when needed to obtain additional information or refer to store data concerning previous interactions with particular users via network 1361; see also cols. 18-19: ll. 36-10, whereby some or all of the intelligent automated assistant components may be distributed between client 1304 and server 1340);
a microphone (see FIG. 4: Input Device 1206 and col. 10: ll. 28-47, whereby input device 1206 can be a microphone for voice input) configured to receive a speech input corresponding to a user utterance (Id., and see col. 16: ll. 10-38, whereby input data/information may include voice input from mobile devices such as mobile telephones and tablets; cols. 17-18: ll. 52-35, whereby a user is speaking to intelligent automated assistant 1002 using input device 1206 which may be a speech input mechanism; see also FIG. 22 and cols. 28-30: ll. 41-46, whereby assistant 1002 receives voice or speech input 121 in the form of an auditory signal); 
a memory (see FIG. 3: Memory 61/Memory 65 and cols. 8-10: ll. 9-27, whereby memory block 61 may be used for a variety of purposes such as caching and/or storing data, programming instructions, and the like, and memory block 65 is configured to store data, program instructions for the general-purpose network operations and/or other information relating to the functionality of the intelligent automated assistant; also see FIG. 4: Storage Device 1208/Memory 1210 and col. 10: ll. 28-47) configured to store at least one piece of data associated with an operation of the speech recognition service (Id., whereby the memory or memories may also be configured to store data structures, keyword taxonomy information, advertisement information, user click and impression information, and/or other specific non-program information; see also col. 13: ll. 3-35, whereby client 1304 maintains subsets and/or portions of these components locally to improve responsiveness and reduce dependence on network communications, where such subsets and/or portions include a subset of vocabulary 1058a, subset of library of language pattern recognizers 1060a, cache of short term personal memory 1052a, and cache of long term personal memory 1054a; and cols. 55-57: ll. 20-53, describing short term personal memory components 1052 and long term personal memory components 1054);
at least one speaker (see FIG. 4: Output Device 1207 and col. 10: ll. 28-47, whereby output device 1207 can be a speaker) configured to output speech data (see col. 17: ll. 15-51, whereby output data/information may include speech output of synthesized speech, sampled speech, recorded messages, or combinations thereof; see also FIG. 42: Step 636 via Step 626 and cols. 53-55: ll. 1-19, whereby output processor components 1090 may be operable to render output data for modalities that include speech output and render output data in different speech voices dynamically, and whereby speech output is generated in Step 626, and the generated speech output is sent to a speech generation module in Step 636, and further whereby if the output modality is speech 626, the language used to paraphrase user input 730, text interpretations 732, task and domain interpretations 734, progress 736, and/or result summaries 738 may be more or less verbose or use sentences that are easier to comprehend in audible form than in written form); and
a processor (see FIG. 3: CPU 62 and cols. 8-10: ll. 9-27, whereby computing device 60 includes central processing unit 62 which may include one or more processor(s) 63; also see FIG. 4: Processor(s) 63 and col. 10: ll. 28-47) electrically connected to the communication circuit, the microphone, the memory, and the at least one speaker (Id., via Bus 67), 
wherein the processor is configured to: 
	receive the speech input (see FIG. 39: Step 710 and cols. 53-54: ll. 1-50, whereby in Step 710, a speech input utterance is obtained and a speech-to-text component, such as component described in connection with FIG. 22, interprets the speech to produce a set of candidate speech interpretations 712) and transmit first data associated with the speech input (Id., whereby in Step 714, the candidate speech interpretations 712 are sent to a language interpreter 1070, and whereby in Step 718, task and dialog analysis is performed, and whereby in Step 720, requests are dispatched to services and results are dynamically gathered, where services can include web-enabled services and/or services that access information stored locally on the device and/or from any other source) to a first server (see FIGS. 5-7: Server 1340 and col. 13: ll. 3-35, whereby language interpreter 1070, dialog flow processor 1080, output processor 1090, task flow models 1086, services orchestration 1082, and service capability models 1088 may be implemented as part of server 1340) configured to support the speech recognition service (Id., whereby input elicitation functionality and output processing functionality are distributed among client 1304 and server 1340, with server part of input elicitation 1094b and server part of output processing 1092b located at server 1340, and whereby server 1340 obtains additional information by interfacing with external services 1360 when needed; and see col. 23: ll. 27-35, whereby when input is provided by speech, the waveform might be sent to a server 1340 where words are extracted, and semantic interpretation performed, where the results of such semantic interpretation can then be used to drive active input elicitation; see also FIGS. 37-38 and cols. 47-52: ll. 5-67, whereby service invocation is used to obtain additional information or to perform tasks by the use of external services);
	output a specified sound or control a motion of at least a part of the electronic device, based on completion of recognition of the first data by the first server, at a first time (see FIGS. 39-41: Step 738 and cols. 53-54: ll. 1-50, whereby after the final output format is completed, a different kind of paraphrase may be offered in Step 738, whereby in this phase the entire result set may be analyzed and compared against the initial request, and a summary of results or answer to a question may then be offered; and whereby output processor components 1090 may be operable to perform and/or implement various types of functions, operations, actions, and/or other features, such as, render output data for modalities that may include speech output); 
	receive second data (see FIGS. 39-41: Step 720/Step 736 and cols. 53-54: ll. 1-50, whereby in Step 720, requests are dispatched to services and results are dynamically gathered) corresponding to processing of a part of the first data (Id., by way of Steps 710-720) from the first server (see FIGS. 5-7: Server 1340 and col. 13: ll. 3-35; and see col. 54: ll. 30-44, whereby results are dynamically gathered) and output the second data (see FIGS. 39-41: Step 736 and cols. 53-54: ll. 1-50, whereby in Step 736, intermediate results may be displayed in the form of real-time progress 736) at a second time after a first period of time has elapsed since the transmission of the first data (Id., by way of real-time progress in Step 736 displaying intermediate results, and whereby screen 4101 depicts real-time progress 4103 generated by Step 736); and 
	output third data corresponding to processing of a remaining part of the first data (see FIGS. 39-41: Step 738 by way of Steps 710-724 and cols. 53-54: ll. 1-50, whereby after the final output format is completed, a different kind of paraphrase may be offered in Step 738, and whereby in this phase, the entire result set may be analyzed and compared against the initial request, and a summary of results or answer to a question may then be offered) at a third time after a second period of time has elapsed (Id., by way of Step 738 wherein the entire result set may be analyzed and compared against the initial request and a summary of results or answer to a question may then be offered, and whereby screen 4101 depicts paraphrased summary 4104 generated by Step 738, and where detailed results 4105 are also included), in response to receiving the third data from a second server configured to support the speech recognition service before the third time (by way of External Services 1360; see FIGS. 5-7: Server 1340 and col. 13: ll. 3-35, whereby input elicitation functionality and output processing functionality are distributed among client 1304 and server 1340, with server part of input elicitation 1094b and server part of output processing 1092b located at server 1340, and whereby language interpreter 1070, dialog flow processor 1080, output processor 1090, task flow models 1086, services orchestration 1082, and service capability models 1088 may be implemented as part of server 1340; see also cols. 57-61: ll. 54-34, describing an automated call and response procedure; and see cols. 10-11: ll. 64-28, whereby servers 1340 can call external services 1360 when needed to obtain additional information or refer to store data concerning previous interactions with particular users, and whereby communications with external services 1360 can take place via network 1361, and where external services 1360 include web-enabled services and/or functionality related to or installed on the hardware device itself; see also FIGS. 37-38 and cols. 47-52: ll. 5-67, whereby service invocation is used to obtain additional information or to perform tasks by the use of external services),
	wherein the second data (i.e., intermediate results) is a part of a response to the user utterance (see e.g., FIG. 41: Real Time Progress 4103), and wherein the third data (i.e., paraphrase summary or answer) is a remaining part of the response to the user utterance (see e.g., FIG. 41: Paraphrased Summary 4104 and Detailed Results 4105; see also cols. 74-77: ll. 21-31, describing paraphrase and prompt text; and cols. 85-88: ll. 24-15, describing suggesting possible responses in dialog).

In a similar speech recognition endeavor, Lee teaches a speech recognition client system 120 and a speech recognition server system 130 (see FIG. 1 and ¶37). In Lee, the speech recognition client system 120 may be a terminal of the user 110 or a single module included in the terminal (see ¶38). When the user 110 inputs a speech through the speech recognition client system 120, the speech recognition client system 120 may extract features of the input speech (see ¶38). The speech recognition client system 120 may transfer the extracted features to the speech recognition server system 130, and the speech recognition server system 130 may generate a result of speech recognition by performing the speech recognition using the received features (see ¶38). The speech recognition server system 130 may transfer the generated result of the speech recognition to the speech recognition client system 120, and the speech recognition client system 120 may display the result of the speech recognition using a display apparatus, and the like (see ¶38). In doing so, the user 110 may be provided with the result of the speech recognition with respect to the speech input by the user 110 (see ¶38).
In this way, Lee teaches:
receive second data corresponding to processing of a part of the first data from the first server and output the second data at a second time after a first predetermined period of time has elapsed since the transmission of the first data (see ¶¶38-39); and
output third data corresponding to processing of a remaining part of the first data at a third time after a second predetermined period of time has elapsed from the second time (see ¶¶38-39).
DOCKET No. SAMS13-01183 APPLICATION No. 16/038,893 PATENT
Based on the foregoing, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Gruber’s electronic device by implementing the intermediate results providing feature of Lee’s speech recognition service as it amounts to nothing (see Lee at ¶25), or to reassure the user of the speech recognition (Id., at ¶26).
In re Claim 2, Gruber discloses wherein the processor (see FIGS. 3-4: CPU 62/Processor(s) 63 and cols. 8-10: ll. 9-47) is configured to construct at least one piece of specified response data (see FIGS. 8-9 and cols. 19-22: ll. 11-64: Active Ontologies 1050, whereby the different types of data which may be accessed by active ontologies 1050 include static data that is available from one or more components of intelligent automated assistant 1002, and data that is dynamically instantiated per user session, for example, maintaining the state of the user-specific inputs and outputs exchanged among components of intelligent automated assistant 1002, the contents of short term personal memory, the inferences made from previous states of the user session, and the like) in the memory (see FIG. 3: Memory 61/Memory 65 and FIG. 4: Storage Device 1208/Memory 1210, and cols. 8-10: ll. 9-47; see also FIG. 7 and col. 13: ll. 3-35, whereby client 1304 maintains subset of vocabulary 1058a, subset of library of language pattern recognizers 1060a, cache of short term personal memory 1052a, and cache of long term personal memory 1054a) as a database (see cols. 8-10: ll. 9-47, whereby the memory or memories may be configured to store data structures; see also FIGS. 8-9 and cols. 19-22: ll. 11-64: Active Ontologies 1050, whereby a given instance of active ontology 1050 may access and/or utilize information from one or more associated databases, and at least a portion of the database information may be accessed via communication with one or more local and/or remote memory devices, also whereby active ontologies 1050 may be embodied as configurations of models, databases, and components in which the relationships among models, databases, and components are any of: containership and/or inclusion, relationship with links and/or pointers; and interface over APIs, both internal to a program and between programs) in relation to processing of the speech input (see FIGS. 8-9 and cols. 19-22: ll. 11-64, whereby domain models 1056, vocabulary 1058, language pattern recognizers 1060, short term personal memory 1052, and long term personal memory 1054 components are organized under a common container associated with active ontology 1050, and other components such as active input elicitation components 1094, language interpreter 1070 and dialog flow processor 1080 are associated with active ontology 1050 via API relationships; and see cols. 55-57: ll. 20-53, describing short term personal memory components 1052 and long term personal memory components 1054 in detail).
In re Claim 3, Gruber discloses wherein the processor (see FIGS. 3-4: CPU 62/Processor(s) 63) is configured to output fourth data (see FIG. 47: Steps 4710/4712 and cols. 66-68: ll. 26-16, whereby in Step 4709, assistant 1002 determines whether all required constraints can be determined, and if not, assistant 1002 prompts for required information in Step 4710, and whereby in Step 4711, assistant 1002 determines whether any result items can be found given the constraints, and if there are no items that meet the constraints, assistant 1002 offers ways to relax the constraints in Step 4712) corresponding to any one of the at least one piece of specified response data at the second time (Id., and see FIGS. 39-41 and cols. 53-54: ll. 1-50; cols. 74-77: ll. 21-31, whereby in one embodiment, assistant 1002 responds to user input relatively quickly with the paraphrase, and the paraphrase is then updated after results are known; and see cols. 85-88: ll. 24-15, describing suggesting possible responses in dialog), in response to not receiving the third data from the first server or the second server before the second time (see FIG. 47: Steps 4709/4711 and cols. 66-68: ll. 26-16, whereby in Step 4709, assistant 1002 determines whether all required constraints can be determined, and if not, assistant 1002 prompts for required information in Step 4710, and whereby in Step 4711, assistant 1002 determines whether any result items can be found given the constraints, and if there are no items that meet the constraints, assistant 1002 offers ways to relax the constraints in Step 4712).
In re Claim 4, Gruber discloses further comprising: 
a display (see FIG. 3: Interface(s) 68 and cols. 8-10: ll. 9-27, e.g., a user's personal digital assistant (PDA) may be configured or designed to function as an intelligent automated assistant system utilizing CPU 62, memory 61, 65, and interface(s) 68; and FIG. 4: Output Device 1207 and col. 10: ll. 28-47, whereby output device 1207 can be a screen, speaker, printer, and/or any combination thereof),
wherein the processor (see FIGS. 3-4: CPU 62/Processor(s) 63) is configured to:
control the display (see cols. 13-17: ll. 36-51, whereby output data/information may include one or more of: text output sent directly to an output device and/or to the user interface of a device; text and graphics sent to a user over email; text and graphics sent to a user over a messaging service; graphical layout of information with photos, rich text, videos, sounds, and hyperlinks; actuator output to control physical actions on a device such as causing it to turn on or off, change color, control a light, and more; invoking other applications on a device; and actuator output to control physical actions to devices attached or controlled by a device such as playing videos on remote displays; see also FIG. 42 and cols. 53-55: ll. 1-19, whereby output processor components 1090 may be operable to: format output data into forms and layouts that render it appropriately on different modalities; render output data for different modalities; dynamically render data for different graphical user interface display engines based on the request; dynamically render to specified modalities based on user preferences; dynamically render output using user-specific "skins" that customize the look and feel; and send a stream of output packages to a modality showing intermediate status, feedback, or results throughout phases of interaction with assistant 1002) to output first contents in a first form (see FIGS. 39-42 and cols. 53-55: ll. 1-19, whereby a uniform representation of response is generated in Step 722 and formatted for the appropriate output modality in Step 724, and whereby assistant 1002 is capable of generating output in multiple modes, and any of a number of different output mechanisms can be used in any combination, and the content of output messages generated by multiphase output procedure 700 is tailored to the mode of multimodal output processing 600, and the language is tailored in the steps of the multiphase output procedure 700, and whereby the multiphase output procedure 700 produces an intermediate result that is further refined into specific language by multimodal output processing 600) when outputting the third data at the second time (see FIGS. 39-42 and cols. 53-55: ll. 1-19, whereby a different kind of paraphrase may be offered in Step 738, whereby the entire result set may be analyzed and compared against the initial request, and a summary of results or answer to a question may then be offered); and
control the display (see cols. 13-17: ll. 36-51, by way of said output data/information detailed above; and see FIG. 42 and cols. 53-55: ll. 1-19, by way of said output processor components 1090 detailed above) to output second contents in a second form that is at least partly different from the first form (see FIGS. 39-42 and cols. 53-55: ll. 1-19, whereby assistant 1002 is capable of generating output in multiple modes, and any of a number of different output mechanisms can be used in any combination, and the content of output messages generated by multiphase output procedure 700 is tailored to the mode of multimodal output processing 600, and the language is tailored in the steps of the multiphase output procedure 700, and whereby the multiphase output procedure 700 produces an intermediate result that is further refined into specific language by multimodal output processing 600; and see FIG. 47 and cols. 66-68: ll. 26-16, whereby assistant 1002 prompts for required information in Step 4710, and assistant 1002 offers ways to relax the constraints in Step 4712) when outputting the fourth data at the second time (see FIGS. 39-42 and cols. 53-55: ll. 1-19, whereby a different kind of paraphrase may be offered in Step 738; and see cols. 74-77: ll. 21-31, whereby in one embodiment, assistant 1002 responds to user input relatively quickly with the paraphrase, and the paraphrase is then updated after results are known, and whereby the paraphrase algorithm accounts for the query, domain model 1056, and the service results).
In re Claim 6, Gruber discloses wherein the processor (see FIGS. 3-4: CPU 62/Processor(s) 63) is configured to control the electronic device to remain in a standby state at the first time (see FIGS. 39-41 and cols. 53-54: ll. 1-50, whereby in Step 720, requests are dispatched to services and results are dynamically gathered, and in Step 736, real time progress of service coordination is shown, and whereby screen 4101 depicts real-time progress 4103 generated by Step 736), the electronic device being capable of receiving at least one piece of data from the external device (see FIGS. 5-7: Server 1340 and col. 13: ll. 3-35) in the standby state (see FIGS. 39-41 and cols. 53-54: ll. 1-50, whereby results are dynamically gathered and intermediate results may be displayed in the form of real-time progress 736, and whereby output processor components 1090 may be operable to perform and/or implement various types of functions, operations, actions, and/or other features, such as, render output data for different modalities, dynamically render data for different graphical user interface display engines based on the request, and send a stream of output packages to a modality, showing intermediate status, feedback, or results throughout phases of interaction with assistant 1002; see also FIGS. 37-38 and cols. 47-52: ll. 5-67, describing a flow and service orchestration procedure as well as a service invocation procedure).
In re Claim 7, Gruber discloses wherein the third data comprises personified contents (see cols. 13-17: ll. 36-51, whereby output data/information may include one or more of: text and graphics sent to a user over email; text and graphics sent to a user over a messaging service; speech output; graphical layout of information with photos, rich text, videos, sounds, and hyperlinks; and invoking other applications on a device; see also FIGS. 39-42 and cols. 53-55: ll. 1-19, whereby output processor components 1090 may be operable to: format output data that is represented in a uniform internal data structure into forms and layouts that render it appropriately on different modalities; render output data for modalities that may include any combination of graphical user interfaces, messages, sounds, animations, and/or speech output; dynamically render data for different graphical user interface display engines based on the request; render output data in different speech voices dynamically; dynamically render to specified modalities based on user preferences; dynamically render output using user-specific "skins" that customize the look and feel; and send a stream of output packages to a modality showing intermediate status, feedback, or results throughout phases of interaction with assistant 1002, and further whereby the content of output messages generated by multiphase output procedure 700 is tailored to the mode of multimodal output processing 600).
Claims 8-11, 13, and 14 essentially recite the same limitations as claims 1-4, 6, and 7, and are rejected for similar reasons. Therefore, Gruber in view of Lee makes obvious all limitations of the claims.
In re Claim 15, Gruber discloses a server (see FIGS. 3 and 5-7: Computing Device 60/Server 1340 and cols. 8-13: ll. 9-35, describing a computing device 60 suitable for implementing at least a portion of the intelligent automated assistant features and/or functionalities, and an architecture for implementing at least a portion of an intelligent automated assistant on a distributed computing network) for supporting a speech recognition service (see FIG. 1 and cols. 13-17: ll. 36-51, describing an intelligent automated assistant 1002; FIG. 9 and cols. 19-23: ll. 11-35, describing an intelligent automated assistant 1002 with respect to an active ontology 1050; FIG. 22 and cols. 28-30: ll. 41-46, describing an active speech input elicitation procedure; FIG. 28 and cols. 39-44: ll. 34-12, describing a natural language processing procedure; FIG. 32 and cols. 44-46: ll. 13-30, describing a dialog and flow analysis procedure; FIG. 33 and cols. 57-61: ll. 54-34, describing an automated call and response procedure; FIGS. 37-38 and cols. 47-52: ll. 5-67, describing a flow and service orchestration procedure as well as a service invocation procedure; and see FIGS. 39-42 and cols. 53-55: ll. 1-19, describing a multiphase output procedure as well as a multimodal output processing procedure), the server comprising: 
a communication interface (see FIG. 3: Interface(s) 68 and cols. 8-13: ll. 9-35, whereby computing device 60 includes interfaces 68 and a bus 67) configured to support communication with at least one external device (Id., and see FIGS. 5-7: Clients 1304/External Services 1360, whereby each client 1304 may run software for implementing client-side portions, any number of servers 1340 can be provided for handling requests received from clients 1304, clients 1304 and servers 1340 can communicate with one another via electronic network 1361 using any known network protocols, and servers 1340 can call external services 1360 when needed to obtain additional information or refer to store data concerning previous interactions with particular users via network 1361, and whereby input elicitation functionality and output processing functionality are distributed among client 1304 and server 1340; see also cols. 18-19: ll. 36-10, whereby some or all of the intelligent automated assistant components may be distributed between client 1304 and server 1340); 
a memory (see FIG. 3: Memory 61/Memory 65 and cols. 8-10: ll. 9-47, whereby memory block 61 may be used for a variety of purposes such as caching and/or storing data, programming instructions, and the like, and memory block 65 is configured to store data, program instructions for the general-purpose network operations and/or other information relating to the functionality of the intelligent automated assistant) configured to store at least one piece of data associated with an operation of the speech recognition service (Id., whereby the memory or memories may also be configured to store data structures, keyword taxonomy information, advertisement information, user click and impression information, and/or other specific non-program information; see also FIGS. 5-7 and col. 13: ll. 3-35, whereby server part of input elicitation 1094b and server part of output processing 1092b are located at server 1340, and whereby server 1340 includes: complete vocabulary 1058b, complete library of language pattern recognizers 1060b, master version of short term personal memory 1052b, master version of long term personal memory 1054b, language interpreter 1070, dialog flow processor 1080, output processor 1090, domain entity databases 1072, task flow models 1086, services orchestration 1082, and service capability models 1088; and cols. 55-57: ll. 20-53, describing short term personal memory components 1052 and long term personal memory components 1054); 
a speech input processing module (see FIGS. 5-7 and col. 13: ll. 3-35, whereby input elicitation functionality and output processing functionality are distributed among client 1304 and server 1340, with server part of input elicitation 1094b and server part of output processing 1092b located at server 1340, and whereby server 1340 includes: complete vocabulary 1058b, complete library of language pattern recognizers 1060b, language interpreter 1070, dialog flow processor 1080, domain entity databases 1072, task flow models 1086, services orchestration 1082, and service capability models 1088) configured to process a speech input corresponding to a user utterance (see FIG. 22 and cols. 28-30: ll. 41-46, whereby assistant 1002 receives voice or speech input 121 in the form of an auditory signal; and see FIG. 39 and cols. 53-54: ll. 1-50, whereby in Step 710, a speech input utterance is obtained and a speech-to-text component, such as component described in connection with FIG. 22, interprets the speech to produce a set of candidate speech interpretations 712, and in Step 714, the candidate speech interpretations 712 are sent to a language interpreter 1070, and in Step 718, task and dialog analysis is performed, and in Step 720, requests are dispatched to services and results are dynamically gathered), the speech input received from a first external device (see FIG. 4: Input Device 1206 and col. 10: ll. 28-47, whereby input device 1206 can be a microphone for voice input; and col. 16: ll. 10-38, whereby input data/information may include voice input from mobile devices such as mobile telephones and tablets; and cols. 17-18: ll. 52-35, whereby a user is speaking to intelligent automated assistant 1002 using input device 1206 which may be a speech input mechanism; and see col. 23: ll. 27-35, whereby when input is provided by speech, the waveform might be sent to a server 1340 where words are extracted, and semantic interpretation performed, where the results of such semantic interpretation can then be used to drive active input elicitation); and
a processor (see FIG. 3: CPU 62 and cols. 8-10: ll. 9-47, whereby computing device 60 includes central processing unit 62 which may include one or more processor(s) 63) electrically connected to the communication interface, the memory, and the speech input processing module (Id., via Bus 67), 
wherein the memory (see FIG. 3: Memory 61/Memory 65) stores at least one instruction (Id., and see cols. 8-10: ll. 9-47, whereby memory block 61 may be used for a variety of purposes such as caching and/or storing data, programming instructions, and the like, and memory block 65 is configured to store data, program instructions for the general-purpose network operations and/or other information relating to the functionality of the intelligent automated assistant) that, when executed, causes the processor to:
receive first data (see FIG. 39 and cols. 53-54: ll. 1-50, whereby in Step 710, a speech input utterance is obtained and a speech-to-text component, such as component described in connection with FIG. 22, interprets the speech to produce a set of candidate speech interpretations 712, and/or in Step 714, the candidate speech interpretations 712 are sent to a language interpreter 1070) associated with the speech input from the first external device (Id., and see col. 23: ll. 27-35, whereby when input is provided by speech, the waveform might be sent to a server 1340 where words are extracted, and semantic interpretation performed, and where the results of such semantic interpretation can then be used to drive active input elicitation; and FIG. 22 and cols. 28-30: ll. 41-46, whereby assistant 1002 receives voice or speech input 121 in the form of an auditory signal; see also FIG. 4: Input Device 1206 and col. 10: ll. 28-47, whereby input device 1206 can be a microphone for voice input), based on the communication interface (see FIGS. 5-7 and cols. 8-13: ll. 9-35, whereby clients 1304 and servers 1340 can communicate with one another via electronic network 1361 using any known network protocols, and whereby input elicitation functionality and output processing functionality are distributed among client 1304 and server 1340, with server part of input elicitation 1094b and server part of output processing 1092b located at server 1340, and whereby server 1340 includes: complete vocabulary 1058b, complete library of language pattern recognizers 1060b, language interpreter 1070, dialog flow processor 1080, domain entity databases 1072, task flow models 1086, services orchestration 1082, and service capability models 1088; see also cols. 17-18: ll. 52-35, whereby a user is speaking to intelligent automated assistant 1002 using input device 1206 which may be a speech input mechanism; and cols. 18-19: ll. 36-10, whereby some or all of the intelligent automated assistant components may be distributed between client 1304 and server 1340);
(see FIG. 39 and cols. 53-54: ll. 1-50, whereby in Step 716, language interpreter 1070 produces representations of user intent for at least one candidate speech interpretation 712, and in Step 718, task and dialog analysis is performed, and in Step 720, requests are dispatched to services and results are dynamically gathered), based on at least one of communication with the speech input processing module (see FIGS. 5-7 and cols. 8-13: ll. 9-35, whereby input elicitation functionality and output processing functionality are distributed among client 1304 and server 1340, and whereby server 1340 includes: complete vocabulary 1058b, complete library of language pattern recognizers 1060b, language interpreter 1070, dialog flow processor 1080, domain entity databases 1072, task flow models 1086, services orchestration 1082, and service capability models 1088; and see also cols. 18-19: ll. 36-10, whereby some or all of the intelligent automated assistant components may be distributed between client 1304 and server 1340) and communication with at least one second external device (see FIGS. 5-7: External Services 1360, whereby any number of servers 1340 can be provided for handling requests received from clients 1304, clients 1304 and servers 1340 can communicate with one another via electronic network 1361 using any known network protocols, and servers 1340 can call external services 1360 when needed to obtain additional information or refer to store data concerning previous interactions with particular users via network 1361); 
transmit second data (see FIGS. 39-41 and cols. 53-54: ll. 1-50, whereby as requests are dispatched to services and results are dynamically gathered in Step 720, intermediate results may be displayed in the form of real-time progress in Step 736) corresponding to processing of a part of the first data (Id., by way of Steps 710-720) to the first external device (Id., and see FIGS. 3-7: Computing Device 60/Clients 1304 and cols. 8-13: ll. 9-35) at a first time that the processing of the part of the first data is completed (see FIGS. 39-41 and cols. 53-54: ll. 1-50, whereby output processor components 1090 may be operable to perform and/or implement various types of functions, operations, actions, and/or other features, such as, render output data for modalities that may include speech output, and send a stream of output packages to a modality, showing intermediate status, feedback, or results throughout phases of interaction with assistant 1002, and whereby screen 4101 depicts real-time progress 4103 generated by Step 736); and 
receive third data corresponding to processing of a remaining part of the first data (see FIGS. 39-41 and cols. 53-54: ll. 1-50, by way of Steps 710-724) from the at least one second external device (see cols. 10-11: ll. 64-28, whereby servers 1340 can call external services 1360 when needed to obtain additional information or refer to store data concerning previous interactions with particular users, and whereby communications with external services 1360 can take place via network 1361, and where external services 1360 include web-enabled services and/or functionality related to or installed on the hardware device itself; and FIGS. 5-7 and col. 13: ll. 3-35, by way of External Services 1360; see also FIGS. 37-38 and cols. 47-52: ll. 5-67, whereby service invocation is used to obtain additional information or to perform tasks by the use of external services) and transmit the third data (see FIGS. 39-41 and cols. 53-54: ll. 1-50, by way of Steps 722-724 to Step 738) to the first external device (Id., and see FIGS. 3-7: Computing Device 60/Clients 1304 and cols. 8-13: ll. 9-35) at a second time after the transmission of the second data (see FIGS. 39-41: Step 738 by way of Steps 710-724 and cols. 53-54: ll. 1-50, whereby after the final output format is completed, a different kind of paraphrase may be offered in Step 738, and whereby in this phase, the entire result set may be analyzed and compared against the initial request, and a summary of results or answer to a question may then be offered, and whereby screen 4101 depicts paraphrased summary 4104 generated by Step 738 with detailed results 4105 also included; and see col. 53: ll. 1-36, whereby output processor components 1090 may be operable to perform and/or implement various types of functions, operations, actions, and/or other features, such as send a stream of output packages to a modality, showing intermediate status, feedback, or results throughout phases of interaction with assistant 1002),
wherein the second data (i.e., intermediate results) is a part of a response to the user utterance (see e.g., FIG. 41: Real Time Progress 4103), and wherein the third data (i.e., paraphrase summary or answer) is a remaining part of the response to the user utterance (see e.g., FIG. 41: Paraphrased Summary 4104 and Detailed Results 4105; see also cols. 74-77: ll. 21-31, describing paraphrase and prompt text; and cols. 85-88: ll. 24-15, describing suggesting possible responses in dialog).
In re Claim 16, Gruber discloses wherein the speech input processing module includes at least one of an automatic speech recognition module, a natural language understanding module, and a text-to-speech module (see FIGS. 5-7: Components of Server 1340 and cols. 10-13: ll. 48-35; also see cols. 37-40: ll. 50-59, by way of language interpreter components 1070; cols. 42-43: ll. 23-40, by way of language pattern recognizer components 1060; cols. 43-45: ll. 41-26, by way of dialog flow processor components 1080; and cols. 53-55: ll. 1-19, by way of output processor components 1090).
In re Claim 17, Gruber discloses wherein the processor (see FIG. 3: CPU 62 and cols. 8-10: ll. 9-47) is configured to derive an intent of user utterance associated with the speech input from the first data as partial processing of the first data (see cols. 22-26: ll. 65-33, whereby by performing active input elicitation, assistant 1002 is able to disambiguate intent at an early phase of input processing; see also FIG. 39: Steps 714-716 and cols. 53-55: ll. 1-19, whereby in Step 714, the candidate speech interpretations 712 are sent to a language interpreter 1070 which may produce representations of user intent in Step 716 for at least one candidate speech interpretation 712), based on the automatic speech recognition module (Id., and see FIG. 28 and cols. 37-40: ll. 50-59, describing a natural language processing procedure 200; and FIG. 32 and cols. 43-45: ll. 41-26, describing a dialog and flow analysis procedure 300; and FIG. 33 and cols. 57-61: ll. 54-34, describing an automated call and response procedure; see also FIGS. 5-7: Components of Server 1340) and to generate the second data by converting at least a part of the first data associated with the derivation of the intent of the user utterance (see FIGS. 39-41 and cols. 53-55: ll. 1-19, whereby in Step 732, paraphrases of these representations of user intent 716 are generated and presented to the user, and in Step 734, task and domain interpretations are presented to the user using an intent paraphrasing algorithm, and in Step 736, intermediate results are displayed in the form of real-time progress 736 as results are dynamically gathered), based on the text-to-speech module (see FIG. 4: Output Device 1207 and col. 10: ll. 28-47, whereby output device 1207 can be a speaker; and col. 17: ll. 15-51, whereby output data/information may include speech output of synthesized speech, sampled speech, recorded messages, or combinations thereof; see also FIG. 42: Step 636 via Step 626 and cols. 53-55: ll. 1-19, whereby output processor components 1090 may be operable to render output data for modalities that include speech output and render output data in different speech voices dynamically, and whereby speech output is generated in Step 626, and the generated speech output is sent to a speech generation module in Step 636, and further whereby if the output modality is speech 626, the language used to paraphrase user input 730, text interpretations 732, task and domain interpretations 734, progress 736, and/or result summaries 738 may be more or less verbose or use sentences that are easier to comprehend in audible form than in written form).
In re Claim 18, Gruber discloses wherein the processor (see FIG. 3: CPU 62 and cols. 8-10: ll. 9-47) is configured to map and store the first data and the intent of the user utterance in the memory (Id., whereby memory block 61 may be used for a variety of purposes such as caching and/or storing data, programming instructions, and the like, and memory block 65 is configured to store data, program instructions for the general-purpose network operations and/or other information relating to the functionality of the intelligent automated assistant, and whereby the memory or memories may also be configured to store data structures; see also FIGS. 5-7 and col. 13: ll. 3-35, whereby server part of input elicitation 1094b and server part of output processing 1092b are located at server 1340, and whereby server 1340 includes: master version of short term personal memory 1052b, master version of long term personal memory 1054b, language interpreter 1070, dialog flow processor 1080, and output processor 1090; and cols. 55-57: ll. 20-53, describing short term personal memory components 1052 and long term personal memory components 1054).
In re Claim 19, Gruber discloses wherein the processor (see FIG. 3: CPU 62 and cols. 8-10: ll. 9-47) is configured to: 
identify at least one third external device (see FIGS. 5-7: External Services 1360 and cols. 10-13: ll. 48-35, whereby servers 1340 can call external services 1360 when needed to obtain additional information or refer to store data concerning previous interactions with particular users, and whereby communications with external services 1360 can take place via network 1361) associated with the intent of the user utterance (see FIGS. 37-38 and cols. 47-52: ll. 5-67, whereby service invocation is used to obtain additional information or to perform tasks by the use of external services; and see FIGS. 39-41 and cols. 53-55: ll. 1-19, by way of Steps 716-720), among the at least one second external device (Id., and see FIGS. 5-7: External Services 1360 and cols. 10-13: ll. 48-35, whereby servers 1340 can call external services 1360 when needed to obtain additional information or refer to store data concerning previous interactions with particular users, and whereby communications with external services 1360 can take place via network 1361, and where external services 1360 include web-enabled services and/or functionality related to or installed on the hardware device itself), as processing of the remaining part of the first data (see FIGS. 39-41 and cols. 53-55: ll. 1-19, by way of Steps 720-724, whereby after the final output format is completed, a different kind of paraphrase may be offered in Step 738, and whereby in this phase, the entire result set may be analyzed and compared against the initial request, and a summary of results or answer to a question may then be offered); and 
request the third data corresponding to the intent of the user utterance from the third external device (see FIGS. 39-41 and cols. 53-55: ll. 1-19, by way of Steps 710-724, and whereby screen 4101 depicts paraphrased summary 4104 generated by Step 738 with detailed results 4105 also included; also see FIGS. 37-38 and cols. 47-52: ll. 5-67, whereby service invocation is used to obtain additional information or to perform tasks by the use of external services).
In re Claim 20, Gruber discloses wherein the third data comprises personified contents (see cols. 13-17: ll. 36-51, whereby output data/information may include one or more of: text and graphics sent to a user over email; text and graphics sent to a user over a messaging service; speech output; graphical layout of information with photos, rich text, videos, sounds, and hyperlinks; and invoking other applications on a device; see also FIGS. 39-42 and cols. 53-55: ll. 1-19, whereby output processor components 1090 may be operable to: format output data that is represented in a uniform internal data structure into forms and layouts that render it appropriately on different modalities; render output data for modalities that may include any combination of graphical user interfaces, messages, sounds, animations, and/or speech output; dynamically render data for different graphical user interface display engines based on the request; render output data in different speech voices dynamically; dynamically render to specified modalities based on user preferences; dynamically render output using user-specific "skins" that customize the look and feel; and send a stream of output packages to a modality showing intermediate status, feedback, or results throughout phases of interaction with assistant 1002, and further whereby the content of output messages generated by multiphase output procedure 700 is tailored to the mode of multimodal output processing 600).
Response to Arguments
Applicant’s arguments with respect to claims 1, 8, and 15 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.  Examiner has detailed above the manner in which the prior art enables the claimed invention.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER L ELJAIEK whose telephone number is (571)272-5474. The examiner can normally be reached Monday-Thursday, 9:00am-3:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DUC M NGUYEN, can be reached at (571)272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ALEXANDER L. ELJAIEK/
Examiner
Art Unit 2651



/DUC NGUYEN/
Supervisory Patent Examiner, Art Unit 2651