DETAILED ACTION
This office action is in response to Applicant’s submission filed on 3/13/2020. Claims 1-20 are pending in the application. As such, claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365 is acknowledged. The prior-filed application is Provisional Application No. 62/967352, filed on 1/29/2020.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 3/13/2020 and 4/7/2021 have been considered by the examiner.

Drawings
The drawings filed on 3/13/2020 have been accepted and considered by the examiner.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 8, and 15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Sekar et al. (US20210201238A1)(hereinafter "Sekar").

Regarding claims 1, 8, and 15, Sekar teaches a data processing system, method, and memory device comprising: a processor; and a computer-readable medium storing executable instructions for causing the processor to perform operations of: (Sekar, Par. 0003: “The system may include: a hardware processor; and a machine-readable storage medium on which is stored instructions that cause the hardware processor to execute a process.”, and Par. 0020: “These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.”).
receiving a first audio input from a user comprising spoken content; (Sekar, Par. 0089: “... the input may be provided in the form of free speech or text [e.g., unstructured, natural language input]. Input also may include other forms of data received or stored on the customer device.”).
analyzing the first audio input using one or more natural language processing models to produce a first textual output comprising a textual representation of the first audio input; (Sekar, Par. 0123: “Upon determining the context, the analysis may proceed with segmenting and correspond to that particular context.”).
analyzing the first textual output using one or more machine learning models to determine first context information of the first textual output; and (Sekar, Par. 0123: “For example, the personal bot may have a machine learning or artificial intelligence (AI) engine that is trained with predefined set of intents that are segregated by context.”).
processing the first textual output in the application based on the first context information. (Sekar, Par. 0123: “In this case, the personal bot has analyzed the transcript from the interaction 505 and, from that analysis, recognized, inferred, and/or classified an overall “context” of the interaction and identified “intents” (i.e., the meaning or intention behind spoken phrases or word groupings), which, as will be seen, then may be used to identify pending actions.”).
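Note: For illustration only, the sequence of operations recited in claims 1, 8, and 15 (audio input, textual output, context information, processing based on context) may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Sekar or from the application. All names (ContextInfo, process_audio, and the stand-in functions) are hypothetical, and the stand-ins use trivial rules in place of the natural language processing and machine learning models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ContextInfo:
    # Hypothetical label describing how the text should be treated.
    label: str


def process_audio(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    classify: Callable[[str], ContextInfo],
    apply: Callable[[str, ContextInfo], str],
) -> str:
    """Run the three recited stages in order and return the application result."""
    text = transcribe(audio)      # audio input -> first textual output
    context = classify(text)      # textual output -> first context information
    return apply(text, context)   # process the text based on the context


# Stand-ins for real models (assumptions for the sketch only).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_classify(text: str) -> ContextInfo:
    is_command = text.lower().startswith("delete ")
    return ContextInfo("command" if is_command else "text")


def fake_apply(text: str, context: ContextInfo) -> str:
    return f"[{context.label}] {text}"


result = process_audio(
    b"delete the last word", fake_transcribe, fake_classify, fake_apply
)
```

Passing the stages in as callables keeps the sketch independent of any particular speech recognizer or classifier.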


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
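The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
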
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2, 3, 4, 6, 9, 10, 11, 13, 16, 17, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Sekar, and in further view of Gruber et al. (US20170263248A1)(hereinafter "Gruber").

Regarding claims 2, 9, and 16, Sekar does not teach the data processing system of claims 1, 8, and 15, respectively, wherein the spoken content comprises a command, textual content, or both; and wherein the first context information of the first textual output provides an indication of whether the first textual output includes the command and an indication of how the user intended to apply the command to content in an application.
Gruber teaches wherein the spoken content comprises a command, textual content, or both; and (Gruber, Par. 0255: “In some embodiments, while displaying previously typed or dictated text, a device can receive [e.g., via a microphone] a natural-language input from a user and determine whether the natural-language input includes a predefined editing command. If the natural-language user input includes a predefined editing command, the device can modify the text based on the predefined editing command.”). Note: Editing is considered the context.
wherein the first context information of the first textual output provides an indication of whether the first textual output includes the command and an indication of how the user intended to apply the command to content in an application. (Gruber, Figures 8B through 8MMM illustrate various examples of processing the recognized text according to the context, i.e., the recognized command and target, and as an example in Figure 9, ref. 908: “Modify the textual data based on the predefined editing command.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Sekar in view of Gruber such that the spoken content comprises a command, textual content, or both, and such that the first context information of the first textual output provides an indication of whether the first textual output includes the command and an indication of how the user intended to apply the command to content in an application, in order to improve the accuracy of subsequent speech recognition on the device by updating a language model associated with the speech recognition engine, as evidenced by Gruber (see Par. 0314).
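Note: For illustration only, the determination quoted from Gruber, Par. 0255 (whether a natural-language input includes a predefined editing command, and modifying the text if it does) may be sketched as follows. This is a minimal sketch and not Gruber's implementation. The single "replace X with Y" pattern, the function names, and the choice to append non-command input as dictated text are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Assumed predefined editing command of the form "replace <target> with <replacement>".
_REPLACE = re.compile(r"^\s*replace\s+(.+?)\s+with\s+(.+?)\s*$", re.IGNORECASE)


def parse_editing_command(utterance: str) -> Optional[Tuple[str, str]]:
    """Return (target, replacement) if the utterance is a replace command, else None."""
    match = _REPLACE.match(utterance)
    return (match.group(1), match.group(2)) if match else None


def handle_utterance(document: str, utterance: str) -> str:
    """Apply the utterance as an editing command, or append it as dictated text."""
    command = parse_editing_command(utterance)
    if command is None:
        return f"{document} {utterance}".strip()
    target, replacement = command
    # Whole-word match so that "is" does not match inside "This".
    pattern = r"\b" + re.escape(target) + r"\b"
    # A function replacement avoids interpreting backslashes in the replacement.
    return re.sub(pattern, lambda _m: replacement, document, count=1)


edited = handle_utterance("This is fine.", "replace is with isn't")
appended = handle_utterance("Hello", "world")
```

Here the parse result plays the role of the context information: it indicates both whether a command is present and what the command is to be applied to.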

Regarding claims 3, 10, and 17, Sekar does not teach the data processing system (of claims 2, 9, and 16, respectively) wherein the instructions to process the first textual output in the application based on the first context information further include instructions configured to 
Gruber teaches rendering first textual content to a document in the application responsive to the first contextual information indicating that the first textual output includes the textual content. (Gruber, Par. 0255: “In some embodiments, while displaying previously typed or dictated text, a device can receive [e.g., via a microphone] a natural-language input from a user and determine whether the natural-language input includes a predefined editing command. If the natural-language user input includes a predefined editing command, the device can modify [render] the text based on the predefined editing command.”). Note: Editing is considered the context.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Sekar in view of Gruber to render first textual content to a document in the application responsive to the first contextual information indicating that the first textual output includes the textual content, in order to improve the accuracy of subsequent speech recognition on the device by updating a language model associated with the speech recognition engine, as evidenced by Gruber (see Par. 0314).

Regarding claims 4, 11, and 18, Sekar does not teach the data processing system (of claims 2, 9, and 16, respectively) wherein the instructions to process the first textual output in the application based on the first context information further include instructions configured to cause the processor to perform operations of executing a command on the contents of a 
Gruber teaches executing a command on the contents of a document in the application responsive to the first contextual information indicating that the first textual output includes the command. (Gruber, Par. 0291: “In some embodiments, if there are multiple instances of the target text string in the textual data, device 104 requests a user confirmation before replacing each instance of the target text string with the replacement text string. In the example depicted in FIGS. 8S-V, in response to receiving the natural-language user input 822 and identifying the predefined editing command ‘replace is with isn't,’ device 104 finds the first instance 824 of the target text string. Device 104 visually highlights the first instance 824 by underlining it, and audibly requests confirmation from the user by asking, “this one?” In other embodiments, device 104 may request confirmation by displaying a request for confirmation on the display, such as by displaying a ‘yes’ affordance.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Sekar in view of Gruber to execute a command on the contents of a document in the application responsive to the first contextual information indicating that the first textual output includes the command, in order to improve the accuracy of subsequent speech recognition on the device by updating a language model associated with the speech recognition engine, as evidenced by Gruber (see Par. 0314).
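Note: For illustration only, the behavior quoted from Gruber, Par. 0291 (requesting confirmation before replacing each instance of a target text string) may be sketched as follows. This is a minimal sketch and not Gruber's implementation. The function name and the confirm callback, which stands in for the audible or displayed confirmation request, are hypothetical.

```python
import re
from typing import Callable


def replace_with_confirmation(
    text: str,
    target: str,
    replacement: str,
    confirm: Callable[[int, str], bool],
) -> str:
    """Replace each whole-word instance of target only when confirm() approves it.

    confirm receives the zero-based instance index and the matched text,
    standing in for a prompt such as "this one?".
    """
    pattern = re.compile(r"\b" + re.escape(target) + r"\b")
    seen = 0

    def decide(match: "re.Match[str]") -> str:
        nonlocal seen
        index = seen
        seen += 1
        return replacement if confirm(index, match.group(0)) else match.group(0)

    return pattern.sub(decide, text)


sentence = "It is raining and it is cold."
# Simulated user who declines the first instance and accepts the second.
second_only = replace_with_confirmation(
    sentence, "is", "isn't", lambda index, _matched: index == 1
)
```

A callback is used so that the same routine works whether the confirmation is spoken, tapped, or simulated in a test.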


Gruber teaches receiving usage information from the application indicative of user interactions with the application prior to receiving the first audio input, while receiving the first audio input, or after receiving the first audio input; and (Gruber, Par. 0093: “In some examples, the contextual information that accompanies the user input can include sensor information, e.g., lighting, ambient noise, ambient temperature, images or videos of the surrounding environment, etc. In some examples, the contextual information can also include the physical state of the device, e.g., device orientation, device location, device temperature, power level, speed, acceleration, motion patterns, cellular signals strength, etc. In some examples, information related to the software state of DA server 106, e.g., running processes, installed programs, past and present network activities, background services, error logs, resources usage, etc., and of portable multifunction device 200 can be provided to DA server 106 as contextual information associated with a user input.”).
wherein analyzing the first textual output further comprises analyzing the first textual output and the application usage information using one or more machine learning models to contextual information or a subset thereof with the user input to DA server 106 to help infer the user's intent. In some examples, the digital assistant can also use the contextual information to determine how to prepare and deliver [machine learning] outputs to the user. Contextual information can be referred to as context data.”, Par. 0287: “For example, if the user has previously used the word ‘reign’ [or has previously replaced the word ‘rain’ with ‘reign’] during the application context of a text message conversation with another user, John, then the device may rank the word ‘reign’ higher than the word ‘rain’ when the user is using dictation-based editing during a text message conversation [first textual output] with John. In contrast, when the user in using dictation-based editing during a different application context [first context information], such as during text message conversations with someone else [other than John] or in a different application [e.g., in a notepad application], device 104 may instead rank [model] ‘rain’ more highly than ‘reign’.”, and Par. 0288: “For example, if the textual data includes the phrase ‘the rein of the queen’ and the user provides a natural-language input requesting to replace the word ‘rein’ with a homophone, the device may rank the alternative replacement text string ‘reign’ more highly than ‘rain’ based on the presence of the phrase ‘of the queen’ after the target text string.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Sekar in view of Gruber to receive usage information from the application indicative of user interactions with the application prior to receiving the first audio input, while receiving the first audio input, or after .
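Note: For illustration only, the context-dependent ranking quoted from Gruber, Par. 0287 (ranking ‘reign’ above ‘rain’ in one application context and the reverse in another, based on prior usage) may be sketched as follows. This is a minimal sketch and not Gruber's implementation. The class name, the context keys, and the use of simple usage counts in place of a trained model are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List


class HomophoneRanker:
    """Rank candidate words by how often the user chose them in a given context."""

    def __init__(self) -> None:
        # Maps a context key (hypothetical, e.g. "messages:John") to word counts.
        self._usage: Dict[str, Counter] = defaultdict(Counter)

    def record(self, context: str, word: str) -> None:
        """Record that the user used or selected a word in a context."""
        self._usage[context][word] += 1

    def rank(self, context: str, candidates: List[str]) -> List[str]:
        """Order candidates by descending usage; ties keep their input order."""
        counts = self._usage[context]
        return sorted(candidates, key=lambda word: -counts[word])


ranker = HomophoneRanker()
ranker.record("messages:John", "reign")

candidates = ["rain", "reign", "rein"]
with_john = ranker.rank("messages:John", candidates)
in_notepad = ranker.rank("notepad", candidates)
```

Because Python's sort is stable, a context with no recorded usage returns the candidates in their default order.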

Claims 5 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Sekar, and in further view of Weber et al. (US9495955B1)(hereinafter "Weber").

Regarding claims 5 and 12, Sekar does not teach the data processing system (of claims 1, 8, and 15, respectively) wherein the instructions to analyze the first textual output further include instructions configured to cause the processor to perform an operation of: disambiguating between textual input and command input included in the first textual output based on the output of the one or more machine learning models.
Weber teaches disambiguating between textual input and command input included in the first textual output based on the output of the one or more machine learning models. (Weber, Col. 7, lines 20-37: “As shown, the transcript excerpt 300 includes conversational speech with one command and multiple questions or queries being identified [by double-underline and single underline, respectively]. As described above in connection with FIG. 1, one of the utterance characteristics may be utterance type [such as a query or command]. As described above in connection with FIG. 2, the characteristic determination module 110 or the transcript excerpt 300 includes a phrase C1 identified as a command and three phrases Q1, Q2, Q3 identified as queries. The corresponding portions or segments of audio excerpt 310 may also be tagged or labeled as indicative of the utterance characteristic of being a command or query. In some implementations, data in the corpus 105 may be tagged as having multiple characteristics if such is the case.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Sekar in view of Weber to disambiguate between textual input and command input included in the first textual output based on the output of the one or more machine learning models, in order to improve the efficiency and speed of acoustic model training by the acoustic model generator, as evidenced by Weber (see Col. 6, lines 7-9).

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Sekar and Gruber, and in further view of Kennewick et al. (US20090150156A1)(hereinafter "Kennewick").

Regarding claims 7 and 14, Sekar and Gruber do not teach the data processing system (of claims 6 and 13, respectively) wherein the instructions to analyze the first textual output further include instructions configured to cause the processor to perform operations of disambiguating command scope based on the usage information from the application.
recognition model may be used in conjunction with, or independently of, a peer to peer recognition model. For example, contextual histories may include various preferences or user characteristics, in addition to providing a basis for inferring information about a user based on patterns of usage or behavior, among others. As a result, the recognition models may include additional awareness relating to global usage patterns, preferences, or other characteristics of peer users on a global basis. For example, certain keywords, concepts, queries, commands, or other aspects of a contextual framework may be commonly employed by all users within a context. In another example, users of certain demographics may share common jargon, slang, or other semantic speech patterns. As a result, operation 440 may utilize various recognition models, which consider context and semantics in various dimensions, to identify queries or command. For example, in addition to information generally available within a given environment or context, a voice input may be recognized using context and semantics based on short-term or long-term behavior or preferences for a specific user, global users, peer users, or other meaningful user abstractions.”). Note: Usage by specific users allows for disambiguating command scope.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Sekar and Gruber in view of Kennewick to disambiguate command scope based on the usage information from the application, in order to reduce overhead, assure that mistakes will not be repeated, and/or improve accuracy of interpretations, as evidenced by Kennewick (see Par. 0069).



Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. Dielmann et al. (U.S. Patent Application No. US20170125015A1) teaches (Par. 0022): “a system 100 having voice dictation formatting in accordance with embodiments of the invention. A microphone 102 is coupled to a pre-processing module 104 that can provide speech enhancement and denoising along with voice activity detection and acoustic feature extraction. The pre-processed output from module 104 is provided to a transcription module 106 that transcribes the speech received from the user by the microphone 102. The verbatim speech transcription generated by the transcription module 106 is transformed by a formatting processing module 112 into text formatted with the formatting style usually expected in a written document. A deterministic formatting module 108 and a trained stochastic model 110 are coupled to a formatting processing module 112, which is coupled to an output module 114. As described more fully below, deterministic and stochastic processing are integrated via the deterministic formatting module 108 and the trained model 110 to disambiguate competing readings of the verbatim speech transcription 106 and hence to provide formatted text 114.”
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARIOUSH AGAHI whose telephone number is (408)918-7689. The examiner can normally be reached Monday - Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DARIOUSH AGAHI/
Examiner, Art Unit 2656
/HUYEN X VO/
Primary Examiner, Art Unit 2656