DETAILED ACTION

Introduction

1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. A response was filed in this application on 07/20/2022 after the non-final rejection of 02/03/2022. No claims have been added; claim 12 has been cancelled, and claims 5-6, 9-10 and 16-19 have been amended in this submission. Thus, claims 1-11 and 13-20 are currently pending for reconsideration by the Examiner and are examined below.

Response to amendments and arguments

2.	The Applicants have acknowledged the allowable subject matter indicated by the Examiner in the last Office action and have accordingly amended the independent claims to incorporate said allowable subject matter. Based on an updated search, the instant claims therefore overcome the prior art. The Applicants' arguments are also persuasive in light of these amendments. Therefore, no outstanding issues remain in the instant application, and claims 1-11 and 13-20 are in condition for allowance.



Allowable Subject Matter

3.	Claims 1-11 and 13-20 are allowable over the prior art of record. The following is the examiner’s statement of the reasons for allowance. The closest relevant prior art, either taken individually or in combination, fails to explicitly teach or reasonably suggest the invention as represented by independent claims. The applicant has provided a novel interactive voice response system configured to detect an interrupt event and process a voice command without detecting a wake-word.

Most pertinent prior art:

Iwase (U.S. Patent Application Publication # 2021/0035554 A1), in figures 4 and 14 along with paragraphs 154-158, teaches a system utterance and a user/barge-in utterance. Figures 1-5, along with paragraphs 185-187, teach a user barge-in utterance such as "I'm coming out, tell me the weather today", to which the system responds "Today's weather is fine. However, there may be a thunderstorm in the evening". Paragraph 213 teaches voice activity detection technology that makes it possible to differentiate user utterance speech from environmental noise in an input sound signal and to specify the period during which the user speech is uttered. Paragraphs 219-224 teach that if intention and entity information can be accurately estimated and acquired from a user utterance, the information processing apparatus 100 can perform accurate processing on the user utterance. For example, the intention of the user utterance "Tell me the weather tomorrow afternoon in Osaka" is to know the weather, and the entity information is the words "Osaka", "tomorrow", and "afternoon". The apparatus can thus acquire the weather for tomorrow afternoon in Osaka and output it as a response. However, Iwase does not teach detecting, by a speech processing component of the device, an endpoint of the first speech represented in the first audio data; determining, by the speech processing component, semantic information associated with the first speech; processing, using a classifier of a second component of the device, the second portion of the first audio data and the semantic information to determine that the first speech corresponds to a first device-directed speech event and causing speech processing to be performed on the second portion of the first audio data.

Kapinos (U.S. Patent # 10848392 B2), in col. 1 and figure 4, teaches a scenario in which, when a device attempts to access a local area network (LAN) such as a Wi-Fi network or wired local network in a personal residence or other environment, a user may be presented with a one-time use password or code on his or her personal device's display. The user may then provide the password/code to a digital assistant via voice input in a command for the device to be able to join the LAN. The digital assistant may then communicate with a management console (e.g., a network access point) that may then give the device access to the LAN for a threshold amount of time. This may be done since the management console may already know identifying information for the device, as might have been discovered based on broadcast packets sent to/from the device during a network discovery protocol, and it may pair that identifying information with the password/code. Accordingly, the user's device may be seamlessly and quickly added to an access list that the management console might maintain of devices that are approved to communicate over the LAN, and the device may be provided with LAN access when the password/code is received via voice input. Then, at the end of the threshold amount of time, the management console may revoke the device's LAN access and purge the device's information from its access list, at which point the user's device may have to re-authenticate using a different one-time use code in order to continue communicating over the LAN. However, Kapinos also fails to teach detecting, by a speech processing component of the device, an endpoint of the first speech represented in the first audio data; determining, by the speech processing component, semantic information associated with the first speech; processing, using a classifier of a second component of the device, the second portion of the first audio data and the semantic information to determine that the first speech corresponds to a first device-directed speech event and causing speech processing to be performed on the second portion of the first audio data.

Boss (U.S. Patent Application Publication # 2019/0279624 A1), in paragraph 5 and figure 4, teaches a method for detecting voice commands without using wake words, wherein a computer system records audio to form recorded audio. The computer system then determines whether a voice command spoken by a first person is present in the recorded audio. If the voice command is present in the recorded audio, the computer system determines whether the voice command is directed to a second person by the first person. If the voice command is not directed to the second person, the computer system processes the voice command, wherein the processing of the voice command occurs without a wake word. However, once again, Boss fails to teach detecting, by a speech processing component of the device, an endpoint of the first speech represented in the first audio data; determining, by the speech processing component, semantic information associated with the first speech; processing, using a classifier of a second component of the device, the second portion of the first audio data and the semantic information to determine that the first speech corresponds to a first device-directed speech event and causing speech processing to be performed on the second portion of the first audio data.
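
For illustration only, the decision flow attributed to Boss above can be sketched as follows. This is a minimal sketch, not the reference's implementation: the `Utterance` type, its fields, and the `should_process` function are hypothetical names introduced here, and the command detector and addressee detector are assumed to have already run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    """Hypothetical summary of one stretch of recorded audio."""
    speaker: str
    text: str
    is_command: bool          # assumed output of some command detector
    addressee: Optional[str]  # another person being addressed, or None


def should_process(utterance: Utterance) -> bool:
    """Return True when a command is present and is not directed to another person.

    No wake word is consulted anywhere in this decision.
    """
    if not utterance.is_command:
        return False
    return utterance.addressee is None
```

Under these assumptions, a command with no human addressee is processed, while a command spoken to a second person, or speech containing no command, is ignored.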

Typrin (U.S. Patent Application Publication # 2015/0088514 A1) in paragraphs 8-16 and figure 1, teaches techniques for providing virtual assistants to assist users during voice communications between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. After establishing the voice communication, or as part of establishing this communication, the techniques may join another computing device to the voice communication, namely a computing device that hosts a virtual assistant for performing tasks for one or both users. After joining the virtual assistant to a voice communication, one or both of the users on the voice communication may invoke the virtual assistant when the respective user desires the assistance of the virtual assistant. However, Typrin also fails to teach detecting, by a speech processing component of the device, an endpoint of the first speech represented in the first audio data; determining, by the speech processing component, semantic information associated with the first speech; processing, using a classifier of a second component of the device, the second portion of the first audio data and the semantic information to determine that the first speech corresponds to a first device-directed speech event and causing speech processing to be performed on the second portion of the first audio data.

Hence, as evidenced above, the prior art of record, although teaching individual elements, fails to completely describe the invention set forth in the instant independent claims, namely a computer-implemented method, comprising: generating first output audio using a loudspeaker associated with a device; receiving first audio data; processing the first audio data using a first component of the device to determine that a first portion of the first audio data represents first speech that is directed to the device; in response to determining that the first speech is represented in the first portion of the first audio data, performing a first action; determining that the first speech is represented in a second portion of the first audio data that includes the first portion of the first audio data; detecting, by a speech processing component of the device, an endpoint of the first speech represented in the first audio data; determining, by the speech processing component, semantic information associated with the first speech; processing, using a classifier of a second component of the device, the second portion of the first audio data and the semantic information to determine that the first speech corresponds to a first device-directed speech event and causing speech processing to be performed on the second portion of the first audio data.
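
For illustration only, the order of the recited steps can be sketched as follows. This is a rough sketch under stated assumptions and is not the Applicants' implementation: the energy-threshold endpointer, the `Trace` recorder, the "discard" branch, and every function and parameter name below are hypothetical stand-ins, and the first component, semantic step, and classifier are passed in as plain callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Trace:
    """Records which steps ran so the control flow can be inspected."""
    steps: List[str] = field(default_factory=list)


def find_endpoint(samples: Sequence[float], threshold: float, hangover: int) -> int:
    """Toy endpointer (assumption): index just past the last loud sample that is
    followed by `hangover` quiet samples; len(samples) if no such run exists."""
    quiet = 0
    seen_speech = False
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            seen_speech = True
            quiet = 0
        elif seen_speech:
            quiet += 1
            if quiet >= hangover:
                return i - hangover + 1
    return len(samples)


def handle_audio(
    samples: Sequence[float],
    first_len: int,
    is_directed: Callable[[Sequence[float]], bool],
    semantics_of: Callable[[Sequence[float]], Dict[str, str]],
    classifier: Callable[[Sequence[float], Dict[str, str]], bool],
    trace: Trace,
    threshold: float = 0.5,
    hangover: int = 2,
) -> bool:
    """Return True when speech processing is caused to run on the second portion."""
    # First component: does the first portion represent device-directed speech?
    first_portion = samples[:first_len]
    if not is_directed(first_portion):
        return False
    trace.steps.append("first_action")  # the nature of the action is left open here

    # Speech processing component: endpoint, then semantic information.
    endpoint = find_endpoint(samples, threshold, hangover)
    second_portion = samples[:max(endpoint, first_len)]  # includes the first portion
    semantic_info = semantics_of(second_portion)

    # Second component: the classifier sees both the audio and the semantics.
    if not classifier(second_portion, semantic_info):
        trace.steps.append("discard")  # assumed handling of a negative decision
        return False
    trace.steps.append("speech_processing")
    return True
```

The point of the sketch is only the ordering: the classifier decision is made after endpoint detection and uses both the longer audio portion and the semantic information, which is the combination the cited references are stated not to teach.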

The dependent claims represent narrower and more specific versions of the invention set forth in the independent claims and are therefore also allowable for at least the preceding reasons. Furthermore, it would not have been obvious to one of ordinary skill in the art to modify the teachings of the prior art of record to obtain the recited claim limitations of the independent claims as noted above.

CONCLUSION

4.	The following prior art, made of record but not relied upon, is considered pertinent to applicant's disclosure: Finkelstein (U.S. Patent # 10984782 B2), Booker (U.S. Patent Application Publication # 2019/0147880 A1), Simpson (U.S. Patent # 8301452 B2), Pratt (U.S. Patent Application Publication # 2020/0133629 A1), Rodgers (U.S. Patent Application Publication # 2019/0095524 A1), Krishnamoorthy (U.S. Patent Application Publication # 2016/0063998 A1). All references are included in the PTO-892 form attached to this office action.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. If you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). In case you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NEERAJ SHARMA, whose contact information is given below. The examiner can normally be reached Monday to Friday, 8 am to 5 pm. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Louis-Desir, can be reached on 571-272-7799 (Direct Phone). The fax number for the organization where this application or proceeding is assigned is 571-273-8300.

/NEERAJ SHARMA/
Primary Examiner, Art Unit 2659
571-270-5487 (Direct Phone)
571-270-6487 (Direct Fax)
neeraj.sharma@uspto.gov (Direct Email)