DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/01/2020 has been entered.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Response to Amendment
This communication is responsive to the applicant’s amendment dated 08/31/2020.  The applicant(s) amended claims 1, 11, and 20.

Response to Arguments
Applicant's arguments with respect to claims 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the 

Claim Rejections - 35 USC § 103
Claims 1-5, 7, 9-14, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Hanes et al. (US 20190132694 A1) in view of Argyropoulos et al. (US 10586534 B1).

Regarding claim 1 and claim 11, Hanes teaches:
“accessing one or more stored blocked directions of background voice noise from one or more audio output devices for a location of a voice command device” (par. 0021; ‘Specifying the wide range listening mode to ignore sound from the direction of such walls means that optimal speech recognition is more likely to occur.’; par. 0061; ‘The electronic device receives user definition of the portion to be selectively ignored when the microphone of the device is operated in the wide range, non-focused listening mode, from a computing device to which the electronic device is communicatively connected (1202).’);
“receiving a voice input at the voice command device at the location and determining that the voice input is received from a blocked direction” (par. 0024; ‘For example, if two people are in the same room and one person asks the other, "what's the weather supposed to be like tomorrow," the electronic device should not audibly respond with tomorrow's weather forecast because the communication was not directed to the device.’; par. 0027; ‘In response to the microphone 204 detecting the spoken 
However, Hanes does not expressly teach:
“querying a status of an audio output device to determine whether it is currently emitting audio”;
“receiving the status from the audio output device indicating that the audio output device is currently emitting audio”;
“obtaining, in response to a determination that the audio output device is currently emitting audio based on the received status, an audio file from the audio output device, the audio file being a formatted audio recording generated at the audio output device corresponding to a time when the voice input was received”;
“comparing the obtained audio file with the received voice input”; and
“ignoring the received voice input if there is a substantial match with the obtained audio file.”
Argyropoulos teaches:
“querying a status of an audio output device to determine whether it is currently emitting audio” (col. 10, lines 33-64; ‘However, the AEC Statistics Engine 412 may monitor various AEC statistics for various other purposes (e.g., to detect current playback conditions, external loudspeaker connectivity, etc.).’);
“receiving the status from the audio output device indicating that the audio output device is currently emitting audio” (col. 3, lines 52-67; ‘Other statistics of the AEC component may be monitored and used to control a voice-controlled device such as local device 102 in various ways. Examples of statistics of the AEC component (sometimes referred to herein as "AEC statistics") may include the ERLE, correlations (e.g., correlation values) between an input signal (e.g., a signal received by a microphone) and an output signal (e.g., a signal output to one or more loudspeakers), an energy level of a signal output to a loudspeaker, a determination of the existence of a "double talk" condition, during which a user utterance of the wake-word is detected by one or more microphones of local device 102 in the presence of playback 110, etc.’ The monitoring reads on receiving status from the audio output device.);
“obtaining, in response to a determination that the audio output device is currently emitting audio based on the received status, an audio file (reference signal) from the audio output device, the audio file being a formatted audio recording generated at the audio output device corresponding to a time when the voice input was received” (col. 15, lines 1-17; ‘A single talk condition may occur when the audio detected by the microphone(s) of local device 102 closely matches the audio from the reference signal 490. Upon detection of the ST condition, AEC Statistics Engine 412 may set the wake-word accept/reject flag to "reject."’);
“comparing the obtained audio file with the received voice input” (col. 15, lines 1-17; ‘A single talk condition may occur when the audio detected by the microphone(s) of local device 102 closely matches the audio from the reference signal 490. Upon 
“ignoring the received voice input if there is a substantial match with the obtained audio file” (col. 15, lines 1-17; ‘A single talk condition may occur when the audio detected by the microphone(s) of local device 102 closely matches the audio from the reference signal 490. Upon detection of the ST condition, AEC Statistics Engine 412 may set the wake-word accept/reject flag to "reject."’).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Hanes’ electronic device by incorporating Argyropoulos’ AEC Statistics Engine in order to prevent wake-word false triggers during audio playback and to control automatic speech recognition (ASR) enabled devices based on acoustic echo cancellation statistics (Argyropoulos: col. 2, lines 8-41).

Regarding claim 2 (dep. on claim 1) and claim 12 (dep. on claim 11), the combination of Hanes in view of Argyropoulos further teaches:
“wherein the audio output device is associated with the blocked direction” (Hanes: Fig. 5, item 506; Television).

Regarding claim 3 (dep. on claim 1) and claim 13 (dep. on claim 11), the combination of Hanes in view of Argyropoulos further teaches:
“querying the status of a plurality of audio output devices in a range of the voice command device to determine which of the plurality of audio output devices are currently emitting audio output” (Argyropoulos: col. 10, lines 33-64; ‘However, the AEC 
“obtaining an audio file from each of the audio output devices which were determined to be emitting audio output” and “comparing each of the obtained audio files with the received voice input” (Argyropoulos: col. 15, lines 18-41; ‘AEC Statistics Engine 412 may detect a double talk condition using, for example, the process described above along with one or more of Eqns. 1-14. In other examples, AEC Statistics Engine 412 may detect the DT condition based upon a sharp decline in a correlation between input audio signal 492 and reference signal 490.’); and
“ignoring the received voice input if there is a substantial match with at least one of the obtained audio files” (Argyropoulos: col. 15, lines 1-17; ‘A single talk condition may occur when the audio detected by the microphone(s) of local device 102 closely matches the audio from the reference signal 490. Upon detection of the ST condition, AEC Statistics Engine 412 may set the wake-word accept/reject flag to "reject."’).

Regarding claim 4 (dep. on claim 1) and claim 14 (dep. on claim 11), the combination of Hanes in view of Argyropoulos further teaches:
“wherein comparing the obtained audio file with the received voice input includes processing the obtained audio file and the received voice input to convert the audio file and the voice input to text and to represent the text as phonetic strings for comparison”  (Argyropoulos: col. 16, lines 1-21; ‘The voice recognition servers may analyze the received audio stream and may translate the audio stream into natural language. The one or more voice recognition servers 220 may determine whether or not the natural 

Regarding claim 5 (dep. on claim 1), the combination of Hanes in view of Argyropoulos further teaches:
“wherein the audio file is buffered at the audio output device” (Argyropoulos: col. 4, lines 51-59; ‘In the example shown in FIG. 2, local device 102 comprises a processor 160 and a memory 162 although, as described in further detail herein, local device 102 may further comprise other components, such as a microphone and/or a microphone array, an audio output interface, a loudspeaker, etc.’ Buffers are well-known in the art and are just a type of memory storage.).

Regarding claim 7 (dep. on claim 1), the combination of Hanes in view of Argyropoulos further teaches:
“wherein the audio output device is queried for power status, wherein the audio output device is determined to be emitting audio in response to the audio output device being powered on” (Argyropoulos: col. 3, lines 1-16; ‘In various examples, the AEC statistics described herein may be used to determine whether or not an external loudspeaker 108, separate from local device 102, is powered off, or is otherwise unavailable for streaming audio.’; col. 10, lines 33-64; ‘However, the AEC Statistics Engine 412 may monitor various AEC statistics for various other purposes (e.g., to detect current playback conditions, external loudspeaker connectivity, etc.).’).

Regarding claim 9 (dep. on claim 1) and claim 17 (dep. on claim 11), the combination of Hanes in view of Argyropoulos further teaches:
“wherein determining whether there is a substantial match between the received voice input and audio file is based on at least one threshold” (Argyropoulos: col. 4, lines 39-50; ‘AEC Statistics Engine 412 may determine that a low correlation exists (e.g., a correlation value below a low threshold correlation value) between a microphone input signal and the audio output signal sent to external loudspeaker 108.’).

Regarding claim 10 (dep. on claim 1), the combination of Hanes in view of Argyropoulos further teaches:
“wherein the audio file and the voice input are each converted into a text transcript for comparison, wherein the substantial match is determined based on a threshold number of characters matching between the transcript for the audio file and the transcript for the voice input” (Argyropoulos: col. 16, lines 1-21; ‘The voice recognition servers may analyze the received audio stream and may translate the audio stream into natural language. The one or more voice recognition servers 220 may determine whether or not the natural language corresponds to a command.’ It would be obvious to modify Argyropoulos’s method of matching microphone signals of the local device with the reference signal by translating both audio streams into natural language and then comparing. The Examiner takes official notice for “threshold number of characters matching between the transcript for the audio file and the transcript for the 

Regarding claim 16 (dep. on claim 11), the combination of Hanes in view of Argyropoulos further teaches:
“wherein the blocked direction is determined via triangulation at two or more microphones within the voice command device” (Hanes: par. 0054; ‘Therefore, the electronic device 202 defines the portion to be selectively ignored by the microphone 204 from the dotted line 908A, corresponding to the location 906A at which the user tapped, and held his or her finger on the touch-sensitive surface 902, clockwise to the dotted line 908B, corresponding to the location 906B at which the user released the finger from the surface 902.’).

Regarding claim 18 (dep. on claim 11), the combination of Hanes in view of Argyropoulos further teaches:
“wherein the audio file and the voice input are each converted into a text transcript for comparison, wherein the substantial match is determined based on a threshold number of words matching between the transcript for the audio file and the transcript for the voice input” (Argyropoulos: col. 16, lines 1-21; ‘The voice recognition servers may analyze the received audio stream and may translate the audio stream into natural language. The one or more voice recognition servers 220 may determine whether or not the natural language corresponds to a command.’ It would be obvious to modify Argyropoulos’s method of matching microphone signals of the local device with 

Regarding claim 19 (dep. on claim 11), the combination of Hanes in view of Argyropoulos further teaches:
“wherein the audio output device includes an audio output providing component comprising: a monitoring component configured to monitor an audio output of the audio output device” (Hanes: par. 0021; ‘As another example, a television when being watched can inadvertently result in the electronic device detecting the trigger phrase when the same or similar phrase is output by the TV.’);
“a buffering component configured to buffer a predefined duration of audio output” (Argyropoulos: col. 4, lines 51-59; ‘In the example shown in FIG. 2, local device 102 comprises a processor 160 and a memory 162 although, as described in further detail herein, local device 102 may further comprise other components, such as a microphone and/or a microphone array, an audio output interface, a loudspeaker, etc.’ Buffers are well-known in the art and are just a type of memory storage.);
“a status component configured to send a status reply to the querying component of the voice command device to indicate if the audio output device is currently emitting audio output”  (Argyropoulos: col. 3, lines 52-67; ‘Other statistics of the AEC component may be monitored and used to control a voice-controlled device such as local device 
“an audio file transmitting component configured to transmit an audio file to the voice command device if the audio output device is currently emitting audio output, wherein the audio file is a latest buffered audio output from the audio output device” (Argyropoulos: col. 15, lines 1-17; ‘A single talk condition may occur when the audio detected by the microphone(s) of local device 102 closely matches the audio from the reference signal 490. Upon detection of the ST condition, AEC Statistics Engine 412 may set the wake-word accept/reject flag to "reject."’).

Claims 6, 15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hanes in view of Argyropoulos as applied to claim 1 above, and further in view of Pogue et al. (US 20150006176 A1).

Regarding claim 6 (dep. on claim 1) and claim 15 (dep. on claim 11), the combination of Hanes in view of Argyropoulos teaches the audio output device.
However, Hanes and Argyropoulos do not expressly teach:
“wherein the audio output device is queried for volume status”;
“wherein the audio output device is determined to be emitting audio by: receiving the queried volume status”;
“comparing the received volume status to a volume threshold”; and
“determining, in response to the received volume status exceeding the volume threshold, that the audio output device is emitting audio.”
Pogue teaches:
“wherein the audio output device is queried for volume status” (par. 0049; ‘The parameters 404 may also include loudness parameters 404(c), indicating the current loudness or volume level at which audio is being generated by the speaker 110 and/or the loudness of each of the received directional audio signals.’);
“wherein the audio output device is determined to be emitting audio by: receiving the queried volume status” (par. 0049; ‘As with the previously described parameters, the loudness parameters 404(c) may be indicated as values on a continuous scale, such as a percentage that ranges from 0% to 100%.’);
“comparing the received volume status to a volume threshold” (par. 0051; ‘As examples, the following factors may indicate the probability of a speaker-generated wake expression’; par. 0053; ‘high speaker volume’);
“determining, in response to the received volume status exceeding the volume threshold, that the audio output device is emitting audio” (par. 0051; ‘As examples, the following factors may indicate the probability of a speaker-generated wake expression’; par. 0053; ‘high speaker volume’).


Regarding claim 20, the combination of Hanes in view of Argyropoulos and Pogue further teaches:
“accessing one or more stored blocked directions of background voice noise from one or more audio output devices for a location of a voice command device” (see claim 1);
“receiving a voice input at the voice command device at the location and determining that the voice input is received from a blocked direction” (see claim 1);
“querying a volume status of an audio output device to determine whether it is currently emitting audio” (see claim 6),
“wherein the audio output device is determined to be emitting audio by: receiving the queried volume status; comparing the received volume status to a volume threshold; and determining, in response to the received volume status exceeding the volume threshold, that the audio output device is emitting audio” (see claim 6);
“obtaining, in response to a determination that the audio output device is currently emitting audio, an audio file from the audio output device, the audio file (‘downlink signal’) corresponding to a time when the voice input was received, wherein the audio file is buffered for a predetermined time interval” (Argyropoulos: col. 10, lines 
“comparing the obtained audio file with the received voice input by converting the audio file and the voice input each into text transcripts” (see claim 10); and
“ignoring the received voice input if there is a substantial match with the obtained audio file, wherein the substantial match is determined based on a threshold number of characters matching between the transcript for the audio file and the transcript for the voice input” (see claims 1 and 10).
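
For illustration only, the recited comparison of text transcripts against a threshold number of matching characters can be sketched as below. The sketch forms no part of the grounds of rejection and is not drawn from Hanes, Argyropoulos, or Pogue. The function names, the case and whitespace normalization, and the use of Python's standard `difflib.SequenceMatcher` to count matching characters are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def matching_characters(transcript_a, transcript_b):
    """Count characters shared by two transcripts, in order.

    Hypothetical helper: lowercases both transcripts and collapses
    whitespace, then sums the sizes of the matching blocks found by
    difflib.SequenceMatcher.
    """
    a = " ".join(transcript_a.lower().split())
    b = " ".join(transcript_b.lower().split())
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def should_ignore(voice_transcript, audio_file_transcript, min_matching_chars):
    """Treat the voice input as a substantial match (and ignore it) when
    the count of matching characters meets the assumed threshold."""
    count = matching_characters(voice_transcript, audio_file_transcript)
    return count >= min_matching_chars
```

Under these assumptions, a voice input whose transcript largely repeats the transcript of the audio file would be ignored, while an unrelated utterance would fall below the threshold and be processed normally.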

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Hanes in view of Argyropoulos as applied to claim 1 above, and further in view of Chen et al. (US 20100211387 A1).

Regarding claim 8 (dep. on claim 1), the combination of Hanes in view of Argyropoulos teaches a blocked direction.
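However, Hanes and Argyropoulos do not expressly teach:
“wherein the blocked direction is determined via time difference of arrival at two or more microphones within the voice command device.”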
Chen teaches:
“wherein the blocked direction is determined via time difference of arrival at two or more microphones within the voice command device” (par. 0020; ‘The distance and direction estimation are used to determine whether the speech segment is coming from a predetermined source. The distance and direction may be determined by comparing the volume and time of arrival delay property of signals from different microphones corresponding to a short segment of a single human voice signal. The distance and direction information can be used to reject background human speech.’).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the blocked direction taught by Hanes in view of Argyropoulos by incorporating the source location estimation system taught by Chen in order to determine a blocked direction using a time difference of arrival at two or more microphones. The combination reliably estimates the intended voice signal for a pre-specified microphone (Chen: par. 0021).
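
For illustration only, a time-difference-of-arrival direction estimate at two microphones can be sketched as below. The sketch forms no part of the grounds of rejection and does not characterize Chen's disclosure. It assumes a far-field source, a brute-force cross-correlation to find the delay, and a speed of sound of 343 m/s. All function names, parameters, and the representation of blocked directions as angular sectors are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at room temperature


def estimate_delay_samples(mic_a, mic_b, max_lag):
    """Return the lag (in samples) by which mic_b trails mic_a.

    Brute-force cross-correlation: try each lag in [-max_lag, max_lag]
    and keep the one with the largest correlation score.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(mic_a):
            j = i + lag
            if 0 <= j < len(mic_b):
                score += a * mic_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(delay_samples, sample_rate_hz, mic_spacing_m):
    """Convert a time difference of arrival into a far-field bearing.

    0 degrees is broadside (source equidistant from both microphones).
    """
    tau = delay_samples / sample_rate_hz
    # Clamp so measurement noise cannot push asin outside its domain.
    ratio = max(-1.0, min(1.0, SPEED_OF_SOUND_M_S * tau / mic_spacing_m))
    return math.degrees(math.asin(ratio))


def is_blocked(angle_deg, blocked_sectors):
    """True if the bearing lies in any stored (start, end) sector, in degrees."""
    return any(start <= angle_deg <= end for start, end in blocked_sectors)
```

Under these assumptions, a voice input whose estimated bearing falls inside a stored sector would be treated as received from a blocked direction.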

Conclusion
Other pertinent prior art is cited in the PTO-892 for the Applicant’s consideration.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK VILLENA, whose telephone number is (571) 270-3191.  The examiner can normally be reached from 10 am to 6 pm EST, Monday through Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571) 272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


MARK VILLENA
Examiner
Art Unit 2658



/MARK VILLENA/Examiner, Art Unit 2658