DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter

Claims 12-14 and 26-28 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-7 and 15-21 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Aleksic et al. (US 2017/0069309 A1).
As per claims 1 and 15, Aleksic et al. teach a method/system comprising: receiving, at data processing hardware, audio data corresponding to an utterance spoken by a user and captured by a user device associated with the user (0027, 34, Figs. 2-3, item 310, 0210-0214); processing, by the data processing hardware, using a speech recognizer, the audio data to determine that the utterance includes a query for a digital assistant to perform an operation (0027, 35, Fig. 3, item 340), wherein the speech recognizer is configured to trigger endpointing of the utterance after a predetermined duration of non-speech in the audio data (0020, 0027, 0037 – “end-of-speech (EOS) timeout”); and before the predetermined duration of non-speech in the audio data: detecting, by the data processing hardware, a freeze word in the audio data, the freeze word following the query in the utterance spoken by the user and captured by the user device (0027, 0030 – “match a phrase present in the client provided context data”, Fig. 3, items 350, 360); and in response to detecting the freeze word in the audio data, triggering, by the data processing hardware, a hard microphone closing event at the user device to prevent the user device from capturing any audio subsequent to the freeze word (0020, Fig. 3, item 370, “Microphone maybe closed or turned off”).
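For illustration only, the interaction recited in claims 1 and 15 between the endpointing timeout and the freeze word can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's or Aleksic's implementation: the `Frame` type, the `run_capture` function, the per-frame word labels, and the returned reason strings are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Frame:
    """Hypothetical unit of captured audio, already labelled for illustration."""
    duration_ms: int
    word: Optional[str] = None  # None stands for a non-speech frame


def run_capture(
    frames: Iterable[Frame], freeze_word: str, endpoint_timeout_ms: int
) -> Tuple[List[str], str]:
    """Collect query words until a freeze word or an endpointing timeout occurs."""
    captured: List[str] = []
    silence_ms = 0
    for frame in frames:
        if frame.word is None:
            # Accumulate non-speech; endpoint once the predetermined duration elapses.
            silence_ms += frame.duration_ms
            if silence_ms >= endpoint_timeout_ms:
                return captured, "endpoint_timeout"
            continue
        silence_ms = 0  # Any speech resets the non-speech timer.
        if frame.word.lower() == freeze_word.lower():
            # Freeze word arrived before the timeout: stop capturing immediately.
            return captured, "hard_microphone_close"
        captured.append(frame.word)
    return captured, "stream_ended"
```

In this sketch, any audio after the freeze word is never examined, which mirrors the claimed effect of preventing capture of audio subsequent to the freeze word.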
As per claims 2 and 16, Aleksic et al. teach the method/system of claims 1 and 15, wherein the freeze word comprises one of: a predefined freeze word comprising one or more fixed terms across all users in a given language; a user-selected freeze word comprising one or more terms specified by the user of the user device; or an action-specific freeze word associated with the operation to be performed by the digital assistant (0020, 0027, 0030).
As per claims 3 and 17, Aleksic et al. teach the method/system of claims 1 and 15, wherein detecting the freeze word in the audio data comprises: extracting audio features from the audio data; generating, using a freeze word detection model, a freeze word confidence score by processing the extracted audio features, the freeze word detection model executing on the data processing hardware; and determining that the audio data corresponding to the utterance includes the freeze word when the freeze word confidence score satisfies a freeze word confidence threshold (0020, 0027, 0030).
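For illustration only, the three steps recited in claims 3 and 17 (feature extraction, confidence scoring, threshold comparison) can be sketched as below. This is a toy example under stated assumptions: the per-frame mean-amplitude features and the logistic scoring function are stand-ins chosen here for brevity, and `extract_features`, `freeze_word_score`, `contains_freeze_word`, the weights, and the default threshold are hypothetical and are not taken from the application or from Aleksic et al.

```python
import math
from typing import List, Sequence


def extract_features(samples: Sequence[float], frame_size: int = 4) -> List[float]:
    """Toy features: mean absolute amplitude of each fixed-size frame."""
    features = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        features.append(sum(abs(s) for s in frame) / len(frame))
    return features


def freeze_word_score(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Stand-in detection model: logistic function of a weighted feature sum."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def contains_freeze_word(
    samples: Sequence[float],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,
) -> bool:
    """Report a detection when the confidence score satisfies the threshold."""
    score = freeze_word_score(extract_features(samples), weights, bias)
    return score >= threshold
```

A real detector would replace the logistic stand-in with a trained model; only the score-versus-threshold decision at the end corresponds directly to the claim language.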
As per claims 4 and 18, Aleksic et al. teach the method/system of claims 1 and 15, wherein detecting the freeze word in the audio data comprises recognizing, using the speech recognizer executing on the data processing hardware, the freeze word in the audio data (0020, 0027, 0030).
As per claims 5 and 19, Aleksic et al. teach the method/system of claims 1 and 15, further comprising, in response to detecting the freeze word in the audio data: instructing, by the data processing hardware, the speech recognizer to cease any active processing on the audio data; and instructing, by the data processing hardware, the digital assistant to fulfill performance of the operation (0020, 0027, 0030).
As per claims 6 and 20, Aleksic et al. teach the method/system of claims 1 and 15, wherein processing the audio data to determine that the utterance includes the query for the digital assistant to perform the operation comprises: processing, using the speech recognizer, the audio data to generate a speech recognition result for the audio data; and performing semantic interpretation on the speech recognition result for the audio data to determine that the audio data includes the query to perform the operation (0020, 0026, 0028, 0031).
As per claims 7 and 21, Aleksic et al. teach the method/system of claims 6 and 20, further comprising, in response to detecting the freeze word in the audio data: modifying, by the data processing hardware, the speech recognition result for the audio data by stripping the freeze word from the speech recognition result; and instructing, by the data processing hardware, using the modified speech recognition result, the digital assistant to perform the operation requested by the query (0020, 0026, 0028, 0031).
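For illustration only, the stripping step recited in claims 7 and 21 can be sketched as a trailing-phrase removal on a text transcript. This is a minimal sketch assuming the speech recognition result is a whitespace-delimited string and the freeze word is its final phrase; `strip_freeze_word` and the example phrases are hypothetical and are not drawn from the application or the cited reference.

```python
def strip_freeze_word(transcript: str, freeze_word: str) -> str:
    """Remove a trailing freeze word (one or more terms) from a transcript."""
    words = transcript.split()
    freeze_terms = freeze_word.split()
    n = len(freeze_terms)
    # Compare case-insensitively against the last n words of the transcript.
    if n and [w.lower() for w in words[-n:]] == [t.lower() for t in freeze_terms]:
        words = words[:-n]
    return " ".join(words)
```

For example, `strip_freeze_word("set a timer for ten minutes stop listening", "stop listening")` yields the query text alone, which would then be passed to the digital assistant.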
Claims 11 and 25 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Binder et al. (US 2019/0122692 A1).
As per claims 11 and 25, Binder et al. teach a method/system comprising: receiving, at data processing hardware, a first instance of audio data corresponding to a dictation-based query for a digital assistant to dictate audible contents spoken by a user, the dictation-based query spoken by the user and captured by an assistant-enabled device associated with the user (0013, 0107, 0138, 0153); receiving, at the data processing hardware, a second instance of the audio data corresponding to an utterance of the audible contents spoken by the user and captured by the assistant-enabled device (0022, 0026, 0056); processing, by the data processing hardware, using a speech recognizer, the second instance of the audio data to generate a transcription of the audible contents (0077, 0107); and during the processing of the second instance of the audio data: detecting, by the data processing hardware, a freeze word in the second instance of the audio data, the freeze word following the audible contents in the utterance spoken by the user and captured by the assistant-enabled device (0132, 0161); and in response to detecting the freeze word in the second instance of the audio data, providing, by the data processing hardware, for output from the assistant-enabled device, the transcription of the audible contents spoken by the user (0013, 0107, 0138, 0153).
Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 8-10 and 22-24 are rejected under 35 U.S.C. 103 as being unpatentable over Aleksic et al. (US 2017/0069309 A1) in view of Basye et al. (US 2014/0163978 A1).
As per claims 8 and 22, Aleksic et al. teach the method/system of the base claims from which claims 8 and 22 depend. However, they do not explicitly teach, prior to processing the audio data using the speech recognizer: detecting, by the data processing hardware, using a hotword detection model, a hotword in the audio data that precedes the query; and in response to detecting the hotword, triggering, by the data processing hardware, the speech recognizer to process the audio data by performing speech recognition on the hotword and/or one or more terms following the hotword in the audio data. Basye et al. do teach, prior to processing the audio data using the speech recognizer: detecting, by the data processing hardware, using a hotword detection model, a hotword in the audio data that precedes the query; and in response to detecting the hotword, triggering, by the data processing hardware, the speech recognizer to process the audio data by performing speech recognition on the hotword and/or one or more terms following the hotword in the audio data (0013, 0039, 0047-0048, 0051-0052).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the teachings of Basye et al. into the method/system of Aleksic et al., because this would enable the speech recognizer to effectively determine the presence or absence of a hotword in the received speech (Basye et al., 0027-0028).
As per claims 9 and 23, Aleksic et al., in view of Basye et al., teach the method/system of claims 8 and 22, further comprising verifying, by the data processing hardware, a presence of the hotword detected by the hotword detection model based on detecting the freeze word in the audio data (Basye et al., 0013, 0039, 0047-0048, 0051-0052).
As per claims 10 and 24, Aleksic et al., in view of Basye et al., teach the method/system of claims 8 and 22, wherein: detecting the freeze word in the audio data comprises executing a freeze word detection model on the data processing hardware that is configured to detect the freeze word in the audio data without performing speech recognition on the audio data; and the freeze word detection model and the hotword detection model each comprise the same or different neural network-based models (Basye et al., 0055, 0036, 0060-0061).
Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Please see attached form PTO-892.
Sundaram et al. (US 10,679,621 B1) teach systems and methods for utilizing microphone array information for acoustic modeling. Audio data may be received from a device having a microphone array configuration. Microphone configuration data may also be received that indicates the configuration of the microphone array. The microphone configuration data may be utilized as an input vector to an acoustic model, along with the audio data, to generate phoneme data. Additionally, the microphone configuration data may be utilized to train and/or generate acoustic models, select an acoustic model to perform speech recognition with, and/or to improve trigger sound detection.
Chang et al. (US 2020/0335091 A1) teach a method that includes receiving audio data of an utterance and processing the audio data to obtain, as output from a speech recognition model configured to jointly perform speech decoding and endpointing of the utterance: partial speech recognition results for the utterance; and an endpoint indication indicating when the utterance has ended. While processing the audio data, the method also includes detecting, based on the endpoint indication, the end of the utterance. In response to detecting the end of the utterance, the method also includes terminating the processing of any subsequent audio data received after the end of the utterance was detected.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIJAY B CHAWAN, whose telephone number is (571)272-7601. The examiner can normally be reached 7-5, Monday through Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/VIJAY B CHAWAN/Primary Examiner, Art Unit 2658