Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This action is responsive to the Amendment filed on 08/17/2022. Claims 1 - 10 are pending in the case. 

Applicant Response
In Applicant’s response dated 08/17/2022, Applicant amended Claim 1 and argued against all objections and rejections previously set forth in the Office Action dated 05/18/2022.

Examiner Comments
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.


Claim Rejections - 35 USC § 103
6.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


7.	Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Sung et al. (Pub. No.: US 20170372703 A1, Pub. Date: December 28, 2017, hereinafter Sung) in view of Yanagihara (Pub. No.: US 20120166192 A1, Pub. Date: June 28, 2012), in further view of Phillips et al. (Pub. No.: US 20110066634 A1, Pub. Date: March 17, 2011), and in further view of Arrouye et al. (Pat. No.: US 8150826 B2, Date of Patent: April 3, 2012).

Regarding independent Claim 1, 
	Sung teaches a method to allow a thin client using dictation to provide dictation functionality (see Sung: Fig.1, [0045], “The user device 102 detects the spoken input and records audio data (dictation) that represents the voice command 108.”, i.e., the recording of the audio data by the client device is the dictation functionality), when the thin client device does not have connectivity to a remotely hosted speech to text application (see Sung: [0088], “device may determine that the action (the action is the recorded audio data file) involves communication with a server, and that network connectivity is temporarily disconnected or that the server is currently responding slowly or is unavailable …. the device can determine to execute the action asynchronously instead, for example, by placing the task in a buffer or queue, or scheduling later execution action”, i.e., the thin client device will store or queue the recorded or dictated audio data for later execution), the method comprising:
Invoking, at the thin client device, an application configured to receive audio data (see Sung: Fig.2, [0045], “user device 104 receives a user request from the user 102. The user 102 may make the request to digital assistant functionality accessed through or provided by the user device 104. The user 102 may invoke the digital assistant in any multiple ways, such as speaking a hot word, pressing an on-screen button, pressing, and holding a "home" button, performing a gesture. The user may make the request through any appropriate type of user input, such as typed input or voice input. In the illustrated example, the user 102 speaks a voice command 108, "Set a reminder for tomorrow at 4:00 pm." The user device 102 detects the spoken input and records audio data that represents the voice command 108.”), and transmit the audio data over a communication link to the remotely hosted speech to text application (see Sung: Fig.2, [0046], “the user device 104 sends (transmits) data indicating the user request 115 to the server system 110. For example, when the request is made as a voice input, the user device 104 can provide audio data for the user's utterance. The audio data can be an audio waveform recorded by the user device 102, a compressed form of the audio information, or information derived or extracted from recorded audio, such as data indicating speech features such as mel frequency coefficients.”)
determining, by the application on the thin client device, whether the communication link to transmit the audio data is available to allow communication of the audio data to the remotely hosted speech to text application (see Sung: Fig.3, [0087], “It is determined that the action is classified as an action that can be performed asynchronously (remotely) to the user request (306). For example, certain actions or types of actions can be assigned as appropriate for synchronous or asynchronous execution. In some implementations, a system determines that the action corresponds to a particular action type. Assignment data is accessed that indicates whether different action types are assigned to be executed synchronously or asynchronously to a request. Based on the assignment data, a system can determine that the particular action type is assigned as capable of being executed asynchronously to a request. As another example, certain applications or services may be designated for asynchronous or synchronous processing.”)
if the communication link to the remotely hosted speech to text application is available, transmitting the audio data to the remotely hosted speech to text application wherein the remotely hosted speech to text application is configured to convert the audio data to textual data (see Sung: Fig.1, [0046], [0047], “the user device 104 sends data indicating the user request 115 to the server system 110. For example, when the request is made as a voice input, the user device 104 can provide audio data for the user's utterance.”; “the server system 110 interprets the user request 115 to determine what action the user 102 has requested to be performed. The server system 110 can also determine other details about how the action should be performed. The server system 110 includes a request interpreter module 120 that analyzes the request. In some implementations, the request interpreter module 120 obtains text representing the user request.”)
if the communication link to the remotely hosted speech to text application is not available (see Sung: Fig.1, [0088], “For example, a device may determine that an action that is classified as appropriate for synchronous execution. However, the device may determine that the action involves communication with a server, and that network connectivity is temporarily disconnected or that the server is currently responding slowly or is unavailable.”)
generating, on the thin client device, an audio data file (see Sung: Fig.2, [0013], “a text-to-speech system is used to generate audio data comprising synthesized speech.”; [0045], “The user device 102 detects the spoken input and records audio data that represents the voice command 108,”)
generating, on the thin client device, a context file (see Sung: Fig.1, [0086], “the user device 104 can send context information indicating its current context. This context information may include, for example, data indicating items visible on a display of the user device 104, data indicating applications installed or running on the user device 104, and so on.”) 
storing, in the audio data file, audio data received by the thin client device (see Sung: [0043], “the asynchronous nature of processing can allow a device to cache (store) interactions or deal with low connectivity. A queue of commands may be created at a device and then be sent for later execution. A device that lacks connectivity to a server can still receive commands and store them, then send them to a server for processing once connectivity is restored.”) and
	As shown above, Sung teaches or suggests that synchronous and asynchronous speech to text transcription requested by a user may be performed by one or more client devices (thin client devices), or by a combination of a server system and one or more client devices, when network connectivity is temporarily disconnected or the server is currently responding slowly or is unavailable.
	However, Sung does not explicitly teach or disclose storing, in the context file, data, commands, or data and commands, wherein the context file includes a command to launch the application on the client device, such that on execution, the thin client device can navigate to a text entry field for which the audio data was generated, wherein the data identifies a location of the audio file in the application on the thin client device.
	Yanagihara teaches storing, in the context file, data, commands, or data and commands, […] such that on execution, the thin client device can navigate to a text entry field for which the audio data was generated (see Yanagihara: Fig.1B, Fig.7, [0065], “The textual representation (context file), at state 740, is presented to the user. For example, a presentation engine (e.g., the presentation engine 460 of FIG. 4) can present the textual representation to the user using a display (e.g., the input window 130). In some implementations, the mobile device can receive edits on the displayed text. For example, a user can use a virtual keyboard (e.g., the virtual keyboard 140 of FIG. 1) to revised the displayed text. Based on the received edits, the mobile device can correct the displayed text.”)
	Because both Sung and Yanagihara are in the same or a similar field of endeavor, namely speech to text transcription and performing a corresponding action on a user interface, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the method of Sung to include storing, in the context file, data, commands, or data and commands such that on execution, the thin client device can navigate to a text entry field for which the audio data was generated, as taught by Yanagihara. After the modification of Sung, the synchronous and asynchronous speech to text transcription that generates an action on the client device user interface can also allow the client device application to navigate to a text entry field and enter the text data generated by the speech to text transcription, as taught by Yanagihara. One would have been motivated to make such a combination to provide users with an easier, more efficient, and time-saving document processing and data entry application by effectively generating textual data from the speech data.

	As shown above, Sung and Yanagihara teach or suggest the limitations of Claim 1 addressed above.
Yanagihara teaches providing an application, such as user interface 110, that is used to compose a text message, such as a text message for an electronic mail (email) application, a short message service (SMS) application, a word processing application, a data entry application, and/or an instant message (IM) application, among many others. Sung teaches processing user speech requests asynchronously or synchronously by a combination of a server system and one or more client devices.
	However, Sung and Yanagihara do not explicitly teach or suggest that the context file includes a command to launch the application on the client device or that the data identifies a location of the audio file in the application on the thin client device.
	
	Phillips, however, teaches or discloses that the context file includes a command to launch the application on the client device (see Phillips: Fig.1, [0096], “The combined or selected result may be used to perform a function on the mobile communication facility 120, such as filling in a text field, launching an application, and the like as described herein.”)
	Because Sung, Yanagihara, and Phillips are in the same or a similar field of endeavor, namely speech to text transcription and performing a corresponding action on a user interface, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Sung to include a context file that includes a command to launch the application on the client device, as taught by Phillips. After the modification of Sung, the synchronous and asynchronous speech to text transcription that generates an action on the client device user interface can also allow the client device to launch an application as required by the command generated by the speech to text transcription, as taught by Phillips. One would have been motivated to make such a combination to improve the efficiency of using an electronic device by allowing a speech command to launch an application.

	Sung, Yanagihara, and Phillips do not explicitly teach or suggest that the data identifies a location of the audio file in the application on the thin client device.
	However, Arrouye teaches or suggests that the data identifies a location of the audio file in the application on the thin client device (see Arrouye: Fig.4, Col.17, Line 62-67 and Line 3-10, “the software architecture 400 also includes a file system directory 417 for the metadata. This file system directory keeps track of the relationship between the data files and their metadata and keeps track of the location of the metadata object (e.g. a metadata file which corresponds to the data file from which it was extracted)”, i.e., the metadata identifies and indicates the location of any file, including an audio file, stored in the application on the client device)
	Because Sung, Yanagihara, Phillips, and Arrouye address the same issue of storing a data file in a thin client to be accessed by applications, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Sung to include a context file that identifies a location of the audio file in the application on the thin client device, as taught by Arrouye. After the modification of Sung, the synchronous and asynchronous speech to text transcription that generates an action on the client device user interface can also identify the data file location information, directory information, or file property information in the client device application, as taught by Arrouye. One would have been motivated to make such a combination to provide users with an easier, more efficient, and time-saving document processing, data searching, and data accessing application by effectively managing, locating, and presenting the data file.

Regarding Claim 2, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 1. Sung further teaches the method wherein:
if the communication link to the remotely hosted speech to text application is not available (see Sung: Fig.1, [0088], “For example, a device may determine that an action that is classified as appropriate for synchronous execution. However, the device may determine that the action involves communication with a server, and that network connectivity is temporarily disconnected or that the server is currently responding slowly or is unavailable. As a result, the device can determine to execute the action asynchronously instead.”)
monitoring, at the thin client device, for re-establishment of the communication link to the remotely hosted speech to text application and transmitting the audio data from the audio data file to the remotely hosted speech to text application (see Sung: Fig.2, [0072], “requested action may be designated as being most appropriate for synchronous execution, upon determining that connectivity with an application server needed to perform the action is not available, the client device 104 may store data causing the action to be performed at a later time. For example, the task may be scheduled, placed in a buffer of tasks to be completed, set to occur in response to connectivity being restored, and/or set to be retried at a certain time period. The client device 104 can use a multi-threaded or multi-process technique to receive and fulfill other user requests in the meantime.”), wherein, 
the remotely hosted speech to text application is configured to convert the audio data from the audio data file to textual data (see Sung: Fig.3, [0047], “The server system 110 includes a request interpreter module 120 that analyzes the request. In some implementations, the request interpreter module 120 obtains text representing the user request. For voice requests, the request interpreter module 120 may obtain a transcription for received audio from an automated speech recognizer, which may be provided by the server system 110 or another system.”)
Sung does not explicitly teach or disclose:
receiving, at the thin client device, the textual data generated by the remotely hosted speech to text application
navigating, by the thin client device, to the text entry field using the data, commands, or data and commands stored in the context file and populating the text entry field with the textual data.
	However, Yanagihara teaches:
receiving, at the thin client device, the textual data generated by the remotely hosted speech to text application (see Yanagihara: Fig.7, [0065], “The textual representation, at state 740, is presented to the user.”)
navigating, by the thin client device, to the text entry field using the data, commands, or data and commands stored in the context file (see Yanagihara: Fig.7, [0065], “The textual representation, at state 740, is presented to the user. For example, a presentation engine (e.g., the presentation engine 460 of FIG. 4) can present the textual representation to the user using a display (e.g., the input window 130).”)
populating the text entry field with the textual data (see Yanagihara: Fig.7, [0065], “The textual representation, at state 740, is presented to the user. For example, a presentation engine (e.g., the presentation engine 460 of FIG. 4) can present the textual representation to the user using a display (e.g., the input window 130). In some implementations, the mobile device can receive edits on the displayed text. For example, a user can use a virtual keyboard (e.g., the virtual keyboard 140 of FIG. 1) to revised the displayed text. Based on the received edits, the mobile device can correct the displayed text.”)
	Because Sung, Yanagihara, Phillips, and Arrouye are in the same or a similar field of endeavor, namely speech to text transcription and conversion to perform a corresponding action on a user interface, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to further modify (refer to claim 1) the teaching of Sung to include populating the text entry field by using speech to text transcription, as taught by Yanagihara. One would have been motivated to make such a combination to provide users with a quicker, more effective, and time-saving document processing mechanism for entering text data in an application from the speech data.
	
Regarding Claim 3, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 1. Yanagihara further teaches the method wherein the text entry field is an editable tab in a graphical user interface (see Yanagihara: Fig.1B, [0017], “The editing interface 110 (editable tab) can support speech input from the user. For example, the mobile device 100 can receive speech through a microphone 160. In some implementations, the editing interface 110 can display text derived from the received speech using the input window 130”)
	Because Sung, Yanagihara, Phillips, and Arrouye are in the same or a similar field of endeavor, namely speech to text transcription and conversion to perform a corresponding action on a user interface, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to further modify (refer to claim 1) the teaching of Sung to include an editable tab in a graphical user interface as the text entry field, as taught by Yanagihara. One would have been motivated to make such a combination to provide users with a quicker, more effective, and time-saving document processing mechanism for entering text data in an application from the speech data.

Regarding Claim 4, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 1. Yanagihara further teaches the method wherein the text entry field is a word document (see Yanagihara: Fig.1B, [0017], “user can use the editing interface 110 to compose a text message, such as a text message for an electronic mail (email) application, a short message service (SMS) application, a word processing application, a data entry application, and/or an instant message (IM) application, among many others.”). 
	Because Sung, Yanagihara, Phillips, and Arrouye are in the same or a similar field of endeavor, namely speech to text transcription and conversion to perform a corresponding action on a user interface, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to further modify (refer to claim 1) the teaching of Sung so that the text entry field is a word document in a word processing application, as taught by Yanagihara. One would have been motivated to make such a combination to provide users with a quicker, more effective, and time-saving document processing mechanism for entering text data in an application from the speech data.

Regarding Claim 5, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 1. Sung further teaches the method wherein the context file comprises metadata appended to the audio data file (see Sung: Fig.1, [0083], “server system 110 uses additional information to determine whether a requested action should be performed synchronously or asynchronously. For example, the user device 104 can send context information (metadata) indicating its current context. This context information may include, for example, data indicating items visible on a display of the user device 104, data indicating applications installed or running on the user device 104, and so”)

Regarding Claim 6, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 1. Phillips further teaches the method wherein the data, commands, or data and commands stored in the context file are transmitted to the remotely hosted speech to text application along with the audio data from the audio data file (see Phillips: Fig.7, [0063], “processing the speech using a resident speech recognition facility to recognize command elements and content elements; transmitting at least a portion of the speech through a wireless communication facility to a remote speech recognition facility; transmitting information from the mobile communication facility to the remote speech recognition facility, wherein the information includes information about a command recognizable by the resident speech recognition facility and at least one of language, location, display type, model, identifier, network provider, and phone number associated with the mobile communication facility; generating speech-to-text results utilizing the remote speech recognition facility based at least in part on the speech and on the information related to the mobile communication facility; and transmitting the text results for use on the mobile communications facility.”)
	Because Sung, Yanagihara, and Phillips are in the same or a similar field of endeavor, namely speech to text transcription and performing a corresponding action on a user interface, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Sung so that the context file, including a command, is transmitted to a remotely hosted speech to text application, as taught by Phillips. One would have been motivated to make such a combination to provide users with a quicker, more effective, and time-saving document processing mechanism for entering text data in an application from the speech data.

Regarding Claim 7, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 1. Sung further teaches the method wherein receiving, at the thin client device, comprises receiving an executable file (see Sung: Fig.3, [0063], “The action is caused to be performed asynchronously to the user request (310). The execution of the action can be decoupled from the user's conversation with the digital assistant, allowing other requests to the digital assistant to be received and processed independently and in parallel to the first request.”)

Regarding Claim 8, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 1. Sung further teaches the method comprising processing the audio data by an alternative speech to text application on the thin client device (see Sung: Fig.2, [0097], “Based on determining that the second action is not classified as an action to be performed asynchronously to the second user request, the second action can be caused to be performed synchronously with respect to the user request. Confirmation can be provided to the client device after synchronous execution has completed.”)

Regarding Claim 9, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 8. Yanagihara further teaches the method wherein the alternative speech to text application data temporarily populates the primary application data field (see Yanagihara: Fig.7, [0065], “presentation engine (e.g., the presentation engine 460 of FIG. 4) can present the textual representation to the user using a display (e.g., the input window 130). In some implementations, the mobile device can receive edits on the displayed text. For example, a user can use a virtual keyboard (e.g., the virtual keyboard 140 of FIG. 1) to revised the displayed text. Based on the received edits, the mobile device can correct the displayed text”). 
	Because Sung, Yanagihara, Phillips, and Arrouye are in the same or a similar field of endeavor, namely speech to text transcription and conversion to perform a corresponding action on a user interface, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to further modify (refer to claim 1) the teaching of Sung to transmit the audio data file to a remote server for data transcription, as taught by Yanagihara. One would have been motivated to make such a combination to provide users with time-saving, efficient, and effective speech to text transcription remotely or over a server.
	
Regarding Claim 10, 
	Sung, Yanagihara, Phillips, and Arrouye teach all the limitations of Claim 8. Yanagihara further teaches the method wherein the textual data received from the hosted application replaces the alternative speech to text application data (see Yanagihara: Fig.4, [0055], “text composition engine 440 produces text derived from the speech data and supplemented with the non-speech data, the edited text data can be returned to the editing interface instructions 374 via a link 450. The editing interface instructions 374 can assemble the data for presentation using a presentation engine 460 and output the data to a user interface (e.g., using GUI instructions 354). In some implementations, the presentation engine 460 can generate an output 470 to be displayed. For example, the output 470 can be the text displayed in the input window 130 as shown in FIG. 1B.”)
	Because Sung, Yanagihara, Phillips, and Arrouye are in the same or a similar field of endeavor, namely speech to text transcription and conversion to perform a corresponding action on a user interface, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to further modify (refer to claim 1) the teaching of Sung to transmit the audio data file to a remote server for data transcription, as taught by Yanagihara. One would have been motivated to make such a combination to provide users with time-saving, efficient, and effective speech to text transcription remotely or over a server.

Response to Arguments
	Applicant’s arguments with respect to claim amendments have been considered but are moot considering the new combination of references being used in the current rejection. The new combination of references was necessitated by Applicant’s claim amendments. Therefore, the claims are rejected under the new combination of references as indicated above.
Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZELALEM "Zee" W SHALU whose telephone number is (571)272-3003. The examiner can normally be reached M-F, 8:00 am to 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CESAR B PAULA, can be reached on (571)272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Zelalem "Zee" Shalu/Examiner, Art Unit 2177   

/CESAR B PAULA/Supervisory Patent Examiner, Art Unit 2177