Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
All objections/rejections not mentioned in this Office Action have been withdrawn by the Examiner.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on January 20, 2022 is being considered by the examiner.

Response to Amendments 
Applicant’s amendment filed on November 5, 2021 has been entered. 
The amendment of claims 1, 10, and 18 and the cancellation of claims 3, 12, and 20 have been acknowledged and entered.
In view of the amendment to claim(s) 1, 10, and 18 and the cancellation of claims 3, 12, and 20, the rejection of claims 1-20 under 35 U.S.C. §103 is withdrawn.

Response to Arguments
Applicant’s arguments regarding the prior art rejections under 35 U.S.C. §103, see pages 17-34 of the Response to Non-Final Office Action dated August 16, 2021, received on November 5, 2021 (hereinafter Response and Office Action, respectively), have been fully considered.
With respect to the rejection(s) of claim(s) 1, 10, and 18 under 35 U.S.C. §103 in light of Meyers (U.S. Pat. No. 9,691,378, hereinafter Meyers) in view of Smith et al. (U.S. Pat. App. Pub. No. 2020/0090646, hereinafter Smith) and Joseph (U.S. Pat. No. 9,934,777, hereinafter Joseph), the applicant argues that Meyers in view of Smith and Joseph
The Applicant further argues that, in light of the previous arguments and amendments, claims 2-9, 11, 13-17, and 19 are patentable over the cited references, at least based on dependency from claims 1, 10, and 18.  
Applicant’s arguments in light of the amendments have been fully considered and are persuasive. Therefore, the rejection has been withdrawn.
The Applicant has not provided any further statement and, therefore, the Examiner directs the Applicant to the below rationale.

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given by Mark Friedman, Agent for the Applicant, by teleconference on February 3, 2022.
The application has been amended as follows: 
Please cancel claims 12 and 20.

Reasons for Allowance
Claims 1-2, 4-11, and 13-19 are allowed.
The following is an examiner’s statement of reasons for allowance: 
The closest prior art of record, Meyers, teaches an audio detection method (The methods described with reference to the voice activated electronic device 10 and the backend systems 100) comprising: receiving, by a processor (processor 252) of a control device (backend systems 100) from a user, data indicating an activation term associated with enabling an automated device (voice activated electronic device 10) from a deactivated state (“Voice activated electronic device …”; Meyers, Col. 4, lines 47-50; Col. 5, lines 30-39); retrieving, by said processor, text data with respect to an associated language comprised by a transcript of a multimedia file (In response to the command in the initial file 6, “an appropriate server or servers of backend system 100 may be accessed to retrieve or obtain a response to command 4” where “the response may... be speech converted from text, or it may be a portion of an audio file (e.g., a song or audio from a video),” (the multimedia file) and the response includes data tags which indicate “(i) what a particular word within the response is (e.g., a word identifier),” where determination of a particular word is text data, and where a language is an associated language by using said language; Meyers, Col. 6, lines 24-30; Col. 2, lines 49-55; Col. 2, line 64 - Col. 3, line 3) comprising an audio and video file for user presentation (As indicated above, the multimedia file may be “audio from a video,” thus, Meyers contemplates the multimedia file comprising an audio and video file. Audio from a video file indicates that the multimedia file comprises both audio and video; “for user presentation” is intended use and, as any multimedia file may be used for user presentation, the multimedia file described in Meyers may be said to be “for user presentation”; Meyers, Col. 6, lines 24-30; Col. 2, lines 49-55; Col. 2, line 64 - Col. 3, line 3); first analyzing at a remote location, by said processor via a first pass filtering process, said text data (The response is received from the backend system 100 “along with one or more data tags (e.g., word identifiers, temporal identifiers), back to voice activated electronic device 10 in the form of return file 8.” By associating the response 12 with the data tags, the text data is analyzed and the backend system 100 can be a remote location {“Initial file 6 may be transmitted over a network, such as the Internet, to backend system 100”}; Meyers, Col. 6, lines 60-65; Col. 5, lines 44-45); wherein said first analyzing comprises executing a phonetic analysis process via execution of natural language processing (“After the audio data is …”; Meyers, Col. 6, lines 51-54), and wherein said phonetic analysis process comprises phonetically comparing said activation term to terms of said text data (“Upon receipt of initial file 6, backend system 100 may perform various actions based on, and in response to, command 4. For instance, backend system 100 may convert the audio data representing command 4 into text, and may use the text to determine the word(s) within command 4. Furthermore, backend system 100 may also include automatic speech recognition and natural language understanding function thereon to process and analyze the audio data representing command 4.”; Meyers, Col. 6, lines 15-25); configuring for subsequent processes, by said processor, associated timestamps at various locations within said text data (“The data tag(s), such as the word identifiers and temporal identifiers (e.g., start/end time of a word within the speech) may be sent within return file 8 such that they are processed by voice activated electronic device 10 prior to the speech being outputted.”; Meyers, Col. 7, lines 5-10); determining, by said processor based on results of said first analyzing, potential phonetic matches between a set of terms of said text data and said activation term… (“The data tag(s), such as the word identifiers and temporal identifiers (e.g., start/end time of a word within the speech) may be sent within return file 8 such that they are processed by voice activated electronic device 10 prior to the speech being outputted.” Thus, the backend system is determining instances of the wakewords within the text data (potential phonetic matches) between all terms within the response 12 (a set of terms from the text data), and the activation term (wakeword); Meyers, Col. 7, lines 5-10 and 33-37); [and] detecting, by said processor via an audio sensor, said multimedia file being potentially activated (The system detects the multimedia file being potentially activated by detecting a wakeword from one or more echoes which occur after the speech is output; Meyers, Col. 3, lines 52-56; Col. 4, lines 47-51).
Smith further teaches wherein said set of terms are phonetically similar to but differing from said activation term (“The wake-word engine suppressor 576 may in turn be configured to identify within the audio stream AS the first and second particular wake words and other false wake words related thereto.” where the “false wake word may be a word that is phonetically similar to an actual wake word,” and where, given the example of the wakeword Alexa, false wake words can include “Alexis,” “Lexus,” “Election,” thus a set of terms; Smith, ¶¶ [0150], [0026], [0030])…; second analyzing, by said processor via a listening device engine, an audio portion and an associated location within said audio portion of said audio sequence with respect to said associated timestamps of said text data (“wake-word engine suppressor 576 is configured to (i) receive as input an audio stream (e.g., the processed audio stream AP) and (ii) identify the presence of at least the particular wake word therein according to a second sensitivity level for false positives of the particular wake word” where “the wake-word engine suppressor may perform keyword spotting in the audio stream in a manner similar to keyword spotting by the primary wake-word engine, except that the wake-word engine suppressor is configured to identify keywords in the path of the audio stream (i.e., the audio that the playback device is to playback) rather than the path of detected sound.”; Smith, ¶¶ [0147], [0032]); and said potential phonetic matches comprising phonetic similarities and sounds with respect to said activation term (the false wake words can include “Alexis,” “Lexus,” and “Election,” where “Alexa” is the wakeword; the words “Alexis,” “Lexus,” and “Election” comprise phonetic similarities and sounds with respect to the activation term “Alexa”; Smith, ¶ [0147]); determining, by said processor based on results of said second analyzing, a subset of terms of said set of terms, wherein said subset of terms are determined to potentially and inadvertently enable said automated device to be automatically activated from said deactivated state (“the second sensitivity level may cause the wake-word engine suppressor 576 to identify more words that are …”; Smith, ¶ [0147]); selecting, by said processor, at least one term of said subset of terms determined to comprise a phonetic match to but differing from said activation term, wherein said phonetic match indicates said at least one term being associated with an action for inadvertently enabling said automated device from said deactivated state (“the playback device 102a identifies in an audio stream, via the wake-word engine suppressor 576, a false wake word for the at least one primary wake-word engine 570,” thus a set of false wake words for the primary wake word (said activation term), where false wake words can include “words that are phonetically similar to the particular wake word (e.g., “Lexus,” “Alexis,” “Election,” etc.),” thus phonetically similar to but differing from the primary wake word (said activation term), and where activation by a false wake word is inadvertently enabling said automated device from said deactivated state; Smith, ¶ [0148]); presenting with respect to said associated timestamps, by said processor, enabled audio of said multimedia file comprising said subset of terms with respect to said automated device (“In practice, the playback device 102a may receive the audio stream AS via the audio interface 519, which receives or otherwise obtains audio from an audio source” where “the audio interface 519 provides the audio stream AS to the audio output processing components 515 that then process the audio stream AS. The audio processing components 515 output the processed audio stream AP to the wake-word engine suppressor 576.”; Smith, ¶ [0149]); additionally determining, by said processor in response to said presenting, that said automated device has been enabled from said deactivated state in response to said at least one term (The system can further include “in response to identifying in the first sound input the particular wake word or the false wake word, {in response to said presenting}” to “while activated, trigger extraction of a first sound input received via the at least one microphone.” Thus, the system can be activated {enabled from a deactivated state} by the false wakeword {in response to said at least one term}; Smith, ¶ [0182]); flagging, by said processor based on results of said selecting and in response to said additionally determining, said at least one term for future … (Smith, ¶¶ [0150], [0151], [0182]); [modifying sensitivity to] a group of terms of said subset of terms determined not to enable said automated device from said deactivated state (“In some implementations, a remote server associated with the MPS 100 that is configured to store and process information corresponding to past wake-word triggers for a particular wake word (e.g., spectral and/or gain information for past voice inputs comprising the particular wake word) may send to the playback device 102a, a message or the like that defines or otherwise updates the second sensitivity level based on such information from one or more NMDs.” In updating the second sensitivity level using actual detections, the system is modifying the second sensitivity level in response to past wakeword triggers for a particular wakeword {a group of terms of said subset of terms} determined to not enable said automated device from said deactivated state; Smith, ¶ [0152]); determining, by said processor, a specified time within said audio and video file, associated with a timestamp of said associated timestamps, (“the playback device 102a may temporarily deactivate the at least one primary wake-word engine 570 in different manners…” such as by “ignor[ing] wake-word triggers output by the at least one primary wake-word engine 570 for a certain amount of time {a specified time}” where “the playback device 102a may be configured to define the particular amount of time based on an evaluation of one or more characteristics of the identified false wake word, such as the length, number of syllables, number of vowels, etc., of the false wake word {thus, a timestamp}”; Smith, ¶¶ [0154], [0167]), within said multimedia file associated with an occurrence of said at least one term (“the wake-word engine suppressor 576 identifies in the audio stream AS a false wake word for the at least one primary wake-word engine 570. Based on that identification, at block 704, the playback device …”; Smith, ¶¶ [0153], [0034]); generating, by said processor, a control action for preventing said automated device from being enabled from said deactivated state (“the wake-word engine suppressor 576 may instruct the playback device 102a, or components thereof (e.g., the voice extractor 572 and/or the VAS selector 574), to ignore wake-word triggers output by the at least one primary wake-word engine 570 for a certain amount of time.”; Smith, ¶ [0154]); storing within a memory structure, by said processor, said at least one term with an associated flag, a reference to said specified time, and said control action (“the wake-word engine suppressor 576 may send a suppression trigger S1 to the at least one primary wake-word engine 570 instructing it not to indicate wake-word triggers for a certain amount of time,” where the suppression trigger is the associated flag, the certain amount of time is the reference to the specified time, and the instruction to not indicate wake-word triggers is the control action which can be stored in memory 213 (“instructions stored in memory 213”); Smith, ¶¶ [0155], [0055]); detecting during a subsequent timeframe with respect to a timeframe associated with said storing, by said processor within said multimedia file being presented via a multimedia device, presentation of said at least one term (“In response to this instruction, the at least one primary wake-word engine 570 ... spots the particular wake word in the sound-data stream SDS.” which indicates that the primary wake-word engine 570 detected the false wake word in a time frame after the instruction was received (a subsequent time frame) within the audio stream (multimedia file); Smith, ¶ [0155]); and executing, by said processor based on results of said detecting said presentation, said control action such that said automated device remains in said deactivated state at said specified time during play back of said multimedia file presenting said at least one term (“In response to this instruction, the at least one primary wake-word engine 570 may not output a wake-word trigger signal SW when it spots the particular wake word in the …”; Smith, ¶ [0155]).
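For illustration only, the time-limited suppression behavior attributed to Smith above (ignoring wake-word triggers for a certain amount of time after a false wake word is identified in the playback stream) might be sketched as follows. This is a minimal sketch and is not drawn from any reference of record: the class `WakeWordGate`, the function `suppression_duration`, and the length-based duration heuristic are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class WakeWordGate:
    """Hypothetical gate that ignores wake-word triggers inside suppression windows."""
    windows: list = field(default_factory=list)  # (start_s, end_s) pairs

    def suppress(self, at_s: float, duration_s: float) -> None:
        """Open a suppression window starting at at_s and lasting duration_s seconds."""
        self.windows.append((at_s, at_s + duration_s))

    def allows_trigger(self, at_s: float) -> bool:
        """Return True if a trigger at at_s falls outside every suppression window."""
        return not any(start <= at_s < end for start, end in self.windows)


def suppression_duration(false_wake_word: str, seconds_per_char: float = 0.1) -> float:
    """Assumed heuristic: scale the window with the length of the false wake word."""
    return max(0.5, len(false_wake_word) * seconds_per_char)


gate = WakeWordGate()
# Assume the false wake word "Alexis" is spotted in the playback stream at t = 12.0 s.
gate.suppress(at_s=12.0, duration_s=suppression_duration("Alexis"))
blocked = not gate.allows_trigger(12.3)  # trigger inside the window is ignored
allowed = gate.allows_trigger(13.0)      # trigger after the window is honored
```

The duration heuristic stands in for the "length, number of syllables, number of vowels" evaluation quoted above; any of those characteristics could be substituted.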
Joseph teaches systems and methods for configuring a language model (Joseph, Col. 3, lines 66-67). Regarding claim 1, Joseph teaches discarding, by said processor, a group of terms of said subset of terms determined not to enable said automated device from said deactivated state (“To save computational resources, the ASR engine 258 may prune and discard low recognition score states or paths {a group of terms} that have little likelihood of corresponding to the spoken utterance, either due to low recognition scores {determined not to enable said automated device}, or for other reasons,” where the states or paths correspond to different word recognition outcomes. Thus, as applied to the false wakewords rejection of Smith, Joseph discloses discarding possible recognitions (including false wakewords) when said possible recognitions have low recognition scores {determined not to enable said automated device} in practice; Joseph, Col. 15, lines 15-20).
Newly cited reference Dagtas (U.S. Pat. No. 6,973,256, hereinafter Dagtas) teaches wherein said second analyzing comprises: analyzing closed caption metadata of said audio portion (“audio processor 270 may use text received from closed caption (CC) detector 260 to identify keywords in a video program. … CC text is typically inserted in the blanking interval at the end of line 21 of the video signal. CC detector 260 uses a time stamp associated with each line of CC data to identify a segment of the video program corresponding to the CC text. CC detector 260 transmits each line of CC text and the time stamp to audio processor 270. Audio processor 270 then compares the CC text words to selected keywords stored in keyword (KW) library 284. When a match occurs, audio processor 270 stores on hard disk drive 230 a keyword identifier associated with the corresponding segment of the stored video program identified by the time stamp.”; Dagtas, Col. 6, lines 43-59).
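For illustration only, the closed-caption keyword comparison quoted from Dagtas above (comparing the words of each time-stamped caption line against a keyword library and recording the time stamp of each match) might be sketched as follows. This is a minimal sketch under stated assumptions, not Dagtas's implementation: `CaptionLine`, `index_keywords`, and the sample captions are hypothetical, and exact word matching stands in for whatever comparison the reference actually performs.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptionLine:
    timestamp_s: float  # time stamp of the caption line within the program
    text: str


def index_keywords(captions, keywords):
    """Map each matched keyword to the time stamps of the caption lines containing it."""
    wanted = {k.lower() for k in keywords}
    index = {}
    for line in captions:
        # Split the caption into lowercase words, ignoring punctuation.
        words = set(re.findall(r"[a-z']+", line.text.lower()))
        for word in sorted(words & wanted):
            index.setdefault(word, []).append(line.timestamp_s)
    return index


# Hypothetical caption data and keyword library.
captions = [
    CaptionLine(3.0, "The election results are in."),
    CaptionLine(7.5, "Alexis bought a new Lexus."),
    CaptionLine(11.0, "Nothing of note here."),
]
keyword_hits = index_keywords(captions, ["Alexis", "Lexus", "Election"])
```

Here `keyword_hits` associates each keyword with the segment time stamps at which it appears, analogous to storing a keyword identifier with the corresponding segment.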

More specifically, the limitation of “wherein said flagging comprises adding to a temporal flag list, detected audio, of said multimedia file, determined to cause said automated device to be inadvertently enabled from said deactivated state… [and] wherein said discarding comprises adding said group of terms to a specialized discarded term structure” as recited in amended claims 1, 10, and 18, is not taught by the prior art of record.
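For illustration only, one way to picture the two structures named in the quoted limitation (a temporal flag list and a specialized discarded term structure) is sketched below. This is a hypothetical sketch and does not represent the applicant's disclosed implementation or any reference of record: the class `TermReview`, its fields, and the tolerance parameter are assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class TermReview:
    """Hypothetical bookkeeping for terms tested against an automated device."""
    temporal_flags: list = field(default_factory=list)  # (term, timestamp_s) pairs
    discarded_terms: set = field(default_factory=set)   # terms found not to activate

    def record(self, term: str, timestamp_s: float, activated_device: bool) -> None:
        """Flag audio that activated the device; otherwise set the term aside."""
        if activated_device:
            self.temporal_flags.append((term, timestamp_s))
        else:
            self.discarded_terms.add(term)

    def should_suppress(self, timestamp_s: float, tolerance_s: float = 0.5) -> bool:
        """Return True if playback time is within tolerance of any flagged time."""
        return any(abs(timestamp_s - t) <= tolerance_s for _, t in self.temporal_flags)


review = TermReview()
review.record("Alexis", 7.5, activated_device=True)     # inadvertently enabled the device
review.record("Election", 3.0, activated_device=False)  # did not enable the device
```

In this sketch, flagged entries carry a time reference so that a later playback can consult them, while discarded terms are kept in a separate structure rather than simply dropped.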
Claims 2, 4-9, 11, 13-17, and 19 are allowable at least in light of their dependence from independent claims 1, 10, and 18.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sean E. Serraguard, whose telephone number is (313) 446-6627. The examiner can normally be reached 07:00-17:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel C. Washburn, can be reached at (571) 272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/Sean E Serraguard/Patent Examiner, Art Unit 2657

/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657