DETAILED ACTION
Applicant’s arguments filed in the reply on 12/17/2021 were received and fully considered. Claims 1, 3, 8, 10, and 15 were amended.  Claim 16 is a new claim. Please see corresponding rejection headings and response to arguments section below for more detail.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. IN20184109106, filed on 5/22/2018.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 5/25/2021 has been considered by the examiner.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant’s submission filed on 12/17/2021 has been entered.

Response to Arguments
Applicant’s arguments with respect to the prior art rejections raised in the previous Office action have been considered but are moot because the new ground of rejection does not rely on the combination of references applied in the previous Office action. Please see the claim rejections below for more detail, including updated citations and obviousness rationale.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


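The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.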
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 2, 8, 9, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Gella (US10089983B1), and in further view of Markus Vogel (US20190027149A1)(hereinafter “Vogel”).

Gella was applied in the previous Office Action.
Regarding claim 1, Gella teaches a method, performed by an electronic device, the method comprising: receiving, by a user inputter of the electronic device, a speech input; (Gella, Col. 10, lines 0045 - 0049:”Upon receipt, the first audio data may be provided to an automatic speech recognition [“ASR”] system capable of performing automatic speech recognition processing.”, and Col. 14, lines 0053 - 0055:”For example, electronic device 100 may include, or be in communication with, one or more microphones that listen for a wakeword by continually monitoring local audio.”).
in response to receiving the speech input, obtaining, by at least one processor of the electronic device, text corresponding to the speech input by performing speech recognition on the speech input; (Gella, Col. 10:”… text data representing the first audio data may be generated. Upon receipt, the first audio data may be provided to an automatic speech recognition [“ASR”] system capable of performing automatic speech recognition processing. The ASR system, as described in greater detail below with reference to FIG. 2A, may perform speech-to-text processing to the first audio data to generate first text data representing the first audio data.”, and Col. 15, lines 0012 – 0017:”Electronic device 100 may include one or more processors 202a, storage/memory 204a, communications circuitry 206a, one or more microphones 208a or other audio input devices [e.g., transducers], one or more speakers 210a or other audio output devices, a display screen 212a, and one or more cameras 214a or other image capturing components.”).
obtaining, by the at least one processor, metadata for the speech input based on the obtained text; (Gella, Col. 10, lines 0054 - 0056, 0059 – 0061:”After the first text data is generated, the text data may be provided to a natural language understanding [“NLU”] system to perform NLU processing to the text data…. The NLU system may determine one or more domains, which may also be referred to as categories that may be capable of handling the intent of the utterance.”).
based on the metadata and the preference information, selecting, by the at least one processor, at least one application from among the plurality of applications for outputting the response to the speech input; and (Gella, Col. 10, line 0067- col. 11, line 0001-0014:“Order a pizza from ‘Pizza Application’,” may be identified by a Food domain as possibly being able to handle the corresponding request. For instance, the NLU system may identify that the word ‘order’ may be a recognized intent as being an invocation word associated with the food domain, and may use various sample utterances and invocation phrases associated with the food domain to determine an intent of the utterance. In some embodiments, the NLU system may determine that the intent of utterance 4 is for placing an order with an application [e.g., {Intent}: “Order Item”], where the item to be ordered is a pizza [e.g., {Item To Be Ordered}: “Pizza”], and that a particular application to be used to order that item [e.g., {Skill/Application}: ‘Pizza Application’].”).
outputting, by the at least one processor, the response to the speech input by using the selected at least one application. (Gella, Col. 11, line 0008 – 0014:”… the NLU system may determine that the intent of utterance 4 is for placing an order with an application [e.g., {Intent}: “Order Item”], where the item to be ordered is a pizza [e.g., {Item To Be Ordered}: “Pizza”], and that a particular application to be used to order that item [e.g., {Skill/Application}: ‘Pizza Application’].”, and Col. 13, lines 0008 – 0015:”At step 180, notification [response] data indicating the action has been/is going to be completed by first application system 140 may be generated. For instance, the notification data may indicate to language processing system 200 that a request associated with the intent is being processed. Therefore, language processing system 200 may be able to inform a requesting device (e.g., voice activated electronic device 100), that the action is being carried out.”, and Col. 15, lines 0022 – 0037:“At step 192, the notification [response] data may be received. For instance, the notification data generated and sent by first application system 140 may be received by language processing system 200. In response to receiving the notification data, the functionality associated with the first application may determine an action to be performed by language processing system 200. At step 194, second text data representing a response using the first application's functionality may be determined. For example, the first functionality may have caused sample responses to be added to the language model associated with the first account. In response to receiving the notification data, language processing system 200 may determine text data representing a sample response to use to indicate that first application system 140 is carrying out the action requested by the intent of utterance 4.”).
Gella does not teach, based on the metadata, obtaining preference information about a plurality of applications for processing the speech input, the preference information comprising at least one of information about a result of processing the speech input by the plurality of applications or information about a time taken for the plurality of applications to output responses.
Vogel teaches, based on the metadata, obtaining preference information about a plurality of applications for processing the speech input, the preference information comprising at least one of information about a result of processing the speech input by the plurality of applications or information about a time taken for the plurality of applications to output responses (Vogel, Par. 0064:”If, for example, the text contains a patient ID for John Smith, the tag strings may map to dataset fields that contain information relevant only to John Smith. The context information may additionally or alternatively be derived from metadata associated with a text. For example, dictation application 124 may have access to metadata that is associated with the text and describes the speech input that was input and the text that was generated from the speech input, such as a person [e.g., patient] to which the speech/text relates or a form [e.g., medical report] to which the speech/text relates.”).
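For illustration only, the obtaining and selecting limitations of claim 1 can be sketched as follows. This sketch is hypothetical and is not drawn from Gella, Vogel, or the application: the names Preference, PREFERENCES, obtain_preferences, and select_application, the intent labels, the application names, and the numeric values are all assumptions, as is the choice to rank by success rate first and response time second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preference:
    """Hypothetical preference record for one application."""
    app: str
    success_rate: float    # assumed: fraction of past speech inputs processed successfully
    avg_response_s: float  # assumed: average time taken to output a response, in seconds


# Hypothetical preference store keyed by metadata (here, an intent label).
PREFERENCES = {
    "order_food": [
        Preference("PizzaApp", 0.95, 1.2),
        Preference("GroceryApp", 0.80, 0.9),
    ],
    "play_music": [Preference("MusicApp", 0.90, 0.5)],
}


def obtain_preferences(metadata):
    """Obtain preference information based on the metadata of the speech input."""
    return PREFERENCES.get(metadata.get("intent"), [])


def select_application(metadata):
    """Select one application using the metadata and the preference information."""
    prefs = obtain_preferences(metadata)
    if not prefs:
        return None
    # Assumed ranking: higher success rate wins; a shorter response time breaks ties.
    best = max(prefs, key=lambda p: (p.success_rate, -p.avg_response_s))
    return best.app
```

Under these assumptions, select_application({"intent": "order_food"}) returns "PizzaApp", and metadata with no matching intent yields no selection.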


Regarding claim 8, Gella teaches an electronic device comprising: an outputter, (Gella, Col. 14, lines 41-42:”For example, electronic device 100 may be able to receive and output audio”, and Col. 15, lines 12-17:”Electronic device 100 may include one or more processors 202a, storage/memory 204a, communications circuitry 206a, one or more microphones 208a or other audio input devices [e.g., transducers], one or more speakers 210a or other audio output devices, a display screen 212a, and one or more cameras 214a or other image capturing components.”), with functions virtually identical to the steps performed in method claim 1. As a result, claim 8 is rejected as obvious over Gella and Vogel under section 103 for the same reasons as claim 1.

Regarding claim 15, Gella teaches a non-transitory computer-readable storage medium configured to store one or more computer programs including instructions that, when executed by at least one processor of an electronic device, cause the at least one processor to control to: (Gella:”… computer-readable instructions, data structures, and/or program systems. Various types of storage/memory may include, but are not limited to, hard drives, solid state drives, flash memory, permanent memory [e.g., ROM], electronically erasable programmable read-only memory [“EEPROM”], CD-ROM, digital versatile disk [“DVD”] or other optical storage medium, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, RAID storage systems, or any other storage type, or any combination thereof. Furthermore, storage/memory 204 may be implemented as computer-readable storage media [“CRSM”], which may be any available physical media accessible by processor[s] 202 to execute one or more instructions stored within storage/memory 204.”), with operations virtually identical to the steps performed in method claim 1. As a result, claim 15 is rejected as obvious over Gella and Vogel under section 103 for the same reasons as claim 1.

Regarding claims 2 and 9, Gella teaches the method of claim 1 and the electronic device of claim 8, respectively, wherein the metadata comprises at least one of a keyword extracted from the obtained text, information about an intention of a user obtained based on the obtained text, information about a sound characteristic of the speech input, or information about the user of the electronic device. (Gella, Col. 10, lines 0049 – 0056:”The ASR system, as described in greater detail below with reference to FIG. 2A, may perform speech-to-text processing to the first audio data to generate first text data representing the first audio data. At step 160, an intent of the utterance may be determined to be associated with a first application. After the first text data is generated, the text data may be provided to a natural language understanding [“NLU”] system to perform NLU processing to the text data.”, and Col. 10, line 0064 – Col. 11, line 0008:”The NLU system may determine one or more domains, which may also be referred to as categories that may be capable of handling the intent of the utterance. For example, utterance 4, 'Order a pizza from ‘Pizza Application’,' may be identified by a Food domain as possibly being able to handle the corresponding request. For instance, the NLU system may identify that the word ‘order’ [keyword] may be a recognized intent as being an invocation word associated with the food domain, and may use various sample utterances and invocation phrases associated with the food domain to determine an intent of the utterance.”).


Claims 3 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Gella and Vogel, as applied to claims 1 and 8 respectively, in further view of Yeom et al. (US9740751B1)(hereinafter “Yeom”).

Regarding claims 3 and 10, Gella and Vogel teach a method, performed by an electronic device.
Neither Gella nor Vogel teaches the method of claim 1, wherein the preference information further comprises information about feedback information of a user about responses output by the plurality of applications.
Yeom teaches wherein the preference information further comprises information about feedback information of a user about responses output by the plurality of applications. (Yeom, Col. 1, lines 34 – 47:”For example, the most appropriate application may be identified based on a metadata associated with the application, for a particular uniform resource identifier [URI] associated with the action keyword and object keyword combination, a power score associated with the application that is based at least on a popularity metric and a rating associated with the application, and a feedback score for the URI that is based at least on a user action associated with the URI. Based on those factors, candidate applications that correspond to the action keyword and object keyword may be ranked, and the candidate application with the highest ranked application with a URI associated with the action keyword and object keyword may be executed.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Gella and Vogel in view of Yeom to select an application based on feedback information, in order to execute the most appropriate application for that action keyword, as evidenced by Yeom (See Col. 1, lines 24 – 25).


Claims 4 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Gella and Vogel as applied to claims 1 and 8 respectively, and in further view of Van Os (US9576574B2).

Van Os was applied in the previous Office Action.
Regarding claims 4 and 11, Gella and Vogel teach a method, performed by an electronic device, of outputting a response to a speech input by using an application.

Van Os teaches wherein the outputting of the response comprises: obtaining by the at least one processor, at least one response to the speech input from the selected at least one application; (Van Os, Col. 25, line 59 – 0067:"In some embodiments, the context-sensitive interruption handler of the digital assistant intercepts the responses, reminders, and/or notifications before they are provided to the user, and determines dynamically in real-time, a relative urgency [priority] between the responses, reminders, and/or notifications. The context-sensitive interruption handler of the digital assistant then provides the responses, reminders, and/or notifications in an order based on the relative urgency thereof. "). 
determining by the at least one processor, a priority of the at least one response; (Van Os, Col. 26, line 0001- 0007:”In some embodiments, since the context may change again during the time it takes for the most highly prioritized response/reminder/notification to be provided to the user, the relative urgency [priority] is re-evaluated among the remaining and any newly available responses, reminders, and notifications. In some embodiments, the re-evaluation takes into account new information that alters the present context.”).
and outputting, by the at least one processor, the at least one response according to the determined priority. (Van Os, Col. 26, lines 0012 - 0018:"For example, if a reminder can be provided via a graphical interface, and a response to user input can be provided to the user via a speech output, the digital assistant can optionally provide the reminder and the response simultaneously using the graphical interface and the speech output without resorting to the interruption handler.", and Col. 26, lines 0021 - 0027:"In some embodiments, the digital assistant prioritizes the concurrently available outputs [e.g., responses, reminders, and/or notifications] for delivery one at a time over a single output channel when the digital assistant detects that the user is likely to have diminished or impaired ability to focus on multiple output channels at the same time. ”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Gella and Vogel in view of Van Os to obtain at least one response to the speech input from the selected at least one application, determine a priority of the at least one response, and output the response according to the determined priority, because a well-designed response procedure is needed to improve a user's experience in interacting with the system and to promote the user's confidence in the system's services and capabilities, as evidenced by Van Os (See Col. 1, lines 0041 – 0044).
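For illustration only, the three steps recited in claims 4 and 11 (obtaining responses, determining a priority, and outputting according to that priority) can be sketched as follows. This sketch is hypothetical and does not reproduce the interruption handler of Van Os: the names Response, order_by_priority, and output_responses, and the integer urgency field, are assumptions chosen to show one simple ordering scheme.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical response obtained from a selected application."""
    app: str
    text: str
    urgency: int  # assumed scale: a larger value means a more urgent response


def order_by_priority(responses):
    """Determine the output order; the most urgent response comes first.

    sorted() is stable, so responses of equal urgency keep their arrival order.
    """
    return sorted(responses, key=lambda r: r.urgency, reverse=True)


def output_responses(responses):
    """Output the responses one at a time according to the determined priority."""
    return [f"{r.app}: {r.text}" for r in order_by_priority(responses)]
```

Under these assumptions, a reminder with urgency 3 is output before a weather response with urgency 1, regardless of which was obtained first.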


Claims 5 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Gella, Vogel, and Van Os as applied to claims 4 and 11 respectively, and in further view of Zhang (US20140188477A1).

Zhang was applied in the previous Office Action.

Gella, Vogel and Van Os do not teach the method of claim 4, wherein the determining of the priority comprises determining, by the at least one processor, the priority based on at least one of an intention of a user related to the at least one response, a size of the at least one response, whether the at least one response comprises a characteristic preferred by the user, or information about a time taken to output the at least one response after obtaining the at least one response. 
Zhang teaches wherein the determining of the priority comprises determining, by the at least one processor, the priority based on at least one of an intention of a user related to the at least one response, a size of the at least one response, whether the at least one response comprises a characteristic preferred by the user, or information about a time taken to output the at least one response after obtaining the at least one response. (Zhang, Par. 0196:”The natural language comprehension system 720 may also determine the priorities of the report answers 711 according to a user's usage frequencies. Specifically, the natural language comprehension system 720 is able to register those received user's speech inputs 701 in the properties database 730, and the properties database 730 may register those keywords 709 obtained when the natural language comprehension system 720 parses the user's speech inputs 701 and may also register all the report answers 711 generated by the natural language comprehension system 720. Afterwards, the natural language comprehension system 720 may find the report answer 711 relatively conformable to the user's intention [determined by the user's speech input] according to the priority, so as to find the speech response finally. The recorded information mentioned here may include the user's preferences/dislikes/habits and even the public preferences/dislikes/habits.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Gella, Vogel, and Van Os in view of Zhang to determine the priority based on at least one of an intention of a user related to the at least one response or a size of the at least one response, in order to facilitate the use of the natural language dialogue system, to correct the previously output speech response, and to output another speech response according to another speech input subsequently provided by the user, as evidenced by Zhang (see Par. 0012).


Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Gella, Vogel, and Van Os as applied to claims 4 and 11 respectively, and in further view of Ohmura (US20180074785A1).

Ohmura was applied in the previous Office Action.
Regarding claims 6 and 13, Gella, Vogel, and Van Os teach a method, performed by an electronic device, of outputting a response to a speech input by using an application.
Gella, Vogel, and Van Os do not teach the method of claim 4, wherein the determining of the priority comprises, in response to the user inputter receiving a plurality of speech inputs, determining, by the at least one processor, the priority based on metadata of each of the plurality of speech inputs.
Ohmura teaches wherein the determining of the priority comprises, in response to the user inputter receiving a plurality of speech inputs, determining, by the at least one processor, the priority based on metadata of each of the plurality of speech inputs. (Ohmura:”… generate responses to speeches from a plurality of users; a decision unit configured to decide methods of outputting the responses to the respective users on the basis of priorities according to order of the speeches from the plurality of users; and an output control unit configured to perform control such that the generated responses are output by using the decided methods of outputting the responses.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Gella, Vogel, and Van Os in view of Ohmura to determine the priority, when the electronic device receives a plurality of speech inputs, based on metadata of each of the plurality of speech inputs, in order to improve convenience of a speech recognition system by outputting appropriate responses to respective users when the plurality of users are talking, as evidenced by Ohmura (See Par. 0014).


Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Gella and Vogel as applied to claims 1 and 8 respectively, and in further view of Takafumi (WO2018025668A1).

Takafumi was applied in the previous Office Action.

Gella and Vogel do not teach the method of claim 1, further comprising, based on detecting an event comprising a state in which an application previously determined as an application for outputting the response to the speech input cannot output the response to the speech input, selecting, by the at least one processor, at least one of a plurality of applications for outputting the response to the speech input.
Takafumi teaches further comprising, based on detecting an event comprising a state in which an application previously determined as an application for outputting the response to the speech input cannot output the response to the speech input, selecting, by the at least one processor, at least one of a plurality of applications for outputting the response to the speech input. (Takafumi, Par. 0052:"In other words, the response unit 121 suspends the other specific conversation application when the user starts a topic different from the topic related to the other specific conversation application while executing the other specific conversation application. In response to the interruption of the other specific conversation application, the response unit 121 sets the specific conversation application that can respond to the user's speech among the specific conversation applications previously interrupted to the suspended state of the specific conversation application. You may resume based. That is, in response to the interruption of another specific conversation application, the response unit 121 selects a specific conversation application corresponding to a topic newly started by the user from among the specific conversation applications previously interrupted. You may resume based on the interruption status.", and Par. 0064:"After an appropriate conversation application is selected by the selection unit 134 for the user's utterance, the response unit 121 executes the selected conversation application and responds to the user's utterance [S134].").
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Gella and Vogel in view of Takafumi to select at least one of a plurality of applications for outputting the response to the speech input when it is detected that an application previously determined for outputting the response cannot output the response, because a system that can communicate more naturally with users is desired, as evidenced by Takafumi (See Par. 0003).
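For illustration only, the fallback selection recited in claims 7 and 14 can be sketched as follows. This sketch is hypothetical and is not taken from Takafumi or the application: the function select_with_fallback, its parameters, and the can_respond callback are assumptions used to show re-selection when the previously determined application cannot output a response.

```python
def select_with_fallback(previous_app, candidates, can_respond):
    """Return an application able to output a response to the speech input.

    previous_app: the application previously determined for outputting the response.
    candidates:   the plurality of applications, in an assumed preference order.
    can_respond:  assumed callback reporting whether an application can respond now.
    """
    # No event detected: keep the previously determined application.
    if can_respond(previous_app):
        return previous_app
    # Event detected: select another application from the plurality.
    for app in candidates:
        if app != previous_app and can_respond(app):
            return app
    return None
```

Under these assumptions, if the previously determined application is unavailable, the first remaining candidate that can respond is selected, and None is returned when no candidate can respond.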


Allowable Subject Matter
Claim 16 is objected to as being dependent upon a rejected base claim, but would be allowable if written in independent form including all of the limitations of the base claim and any intervening claims.
Claim 16 recites “The method of claim 1, further comprising: updating the preference information based on information about speech output successful for outputting the response to the speech input by each application and the time taken for each application to output the response to the speech input; and storing the updated preference information,” which is allowable over the prior art. The closest teachings to the indicated allowable subject matter are the references that are cited in the current Office action. One such reference of record is Lim et al. (U.S. Patent No. 10446142B2), which at Col. 11, lines 39 – 67 teaches “Alternatively or in addition to the other examples described herein, examples include any combination of the


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. Bozarth et al. (US 9754016 B1) teaches a user interacting with an electronic device can receive suggestions for applications or services that can help the user with a specific task. Using such an approach, a device can utilize a semantic process to attempt to infer an action or .
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARIOUSH AGAHI whose telephone number is (408)918-7689. The examiner can normally be reached Monday - Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached on 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DARIOUSH AGAHI/             Examiner, Art Unit 2656                                                                                                                                                                                           
/HUYEN X VO/             Primary Examiner, Art Unit 2656