EXAMINER’S AMENDMENT AND REASONS FOR ALLOWANCE
Examiner’s Amendment
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in a telephone interview with Stephen Martin, Reg. No. 56,640 on August 25, 2021.
The claims are presented below in two sets: marked up form indicating the amendments being made to the claims, and in final form with all Examiner’s amendments having been entered. Only the claims presented below are being amended in this Examiner’s Amendment, with all other claims being in final form as presented in the Amendment filed July 7, 2021.

Marked Up Amended Claim Set
Claim 1.  A method of voice control in a multi-talker and multimedia environment, comprising: 
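receiving a microphone signal for each zone in a plurality of zones of an acoustic environment, wherein at least one microphone is located in each zone, wherein the microphone signal from each zone in the plurality of zones is provided by a separate audio channel extending between the at least one microphone located in each zone and an application post processor, wherein one audio channel is provided per zone;
removing, by an acoustic echo cancellation module, echo caused by audio transducers in the acoustic environment from the microphone signal of each audio channel to generate an echo cancelled microphone signal for each audio channel;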
removing, by a zone interference cancellation module, interference from the echo cancelled microphone signal of each audio channel to generate a processed microphone signal for each audio channel; 
performing speech recognition on the processed microphone signals of each audio channel to generate words from the processed microphone signals;
performing keyword spotting on the words generated from the processed microphone signals of each audio channel; and
in response to detection of a wake word in the words generated from the processed microphone signals of each audio channel:
setting a first zone in the plurality of zones in which the wake word was detected as an active zone; 
setting an audio channel of the active zone as an active audio channel;
initiating an automatic speech recognition session for the active audio channel, wherein during the automatic speech recognition session, speech recognition is only performed on the active audio channel, wherein the speech recognition is performed by the application post processor on the processed microphone signal output from the zone interference cancellation module for the active audio channel; 
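performing natural language processing on results of the speech recognition to determine an action to be performed, wherein both the active zone and the results of speech recognition are used to determine the action to be performed; and 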
performing the determined action.

Claim 7.  The method of claim 1, wherein during the automatic speech recognition session, the echo caused by the audio transducers in the acoustic environment from each of the microphone signals is removed from the active audio channel and interference from the microphone signals of other audio channels is removed from the active audio channel.

Claim 17.  The method of claim 16, wherein removing interference speech caused by speech originating in other zones comprises: 
using measured signal and noise level differences between the plurality of microphone signals to detect speech of an occupant of a respective zone; 
for each zone in which speech of an occupant is detected, using an adaptive filter to estimate a speech contribution of the occupant on the microphone signals in other zones; 
for each microphone signal, removing the estimated speech contribution of occupants in other zones.

Claim 22.  A system for voice control in a multi-talker and multimedia environment, comprising: 
a plurality of microphones, each microphone being located in and associated with a zone in a plurality of zones of an acoustic environment; 
a plurality of speakers, each speaker being located in and associated with a zone in the plurality of zones of the acoustic environment; 
a processor system comprising one or more processors coupled to the plurality of microphones and the plurality of speakers programmed to: 
receive a microphone signal for each zone in a plurality of zones of an acoustic environment, wherein at least one microphone is located in each zone, wherein the microphone signal from each zone in the plurality of zones is provided by a separate audio channel extending between the at least one microphone located in each zone and an application post processor, wherein one audio channel is provided per zone;
remove, by an acoustic echo cancellation module, echo caused by audio transducers in the acoustic environment from the microphone signal of each audio channel to generate an echo cancelled microphone signal for each audio channel;
remove, by a zone interference cancellation module, interference from the echo cancelled microphone signal of each audio channel to generate a processed microphone signal for each audio channel; 
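perform speech recognition on the processed microphone signals of each audio channel to generate words from the processed microphone signals;
perform keyword spotting on the words generated from the processed microphone signals of each audio channel; and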
in response to detection of a wake word in the words generated from the processed microphone signals of each zone in the plurality of zones:
set a first zone in the plurality of zones in which the wake word was detected as an active zone; 
set an audio channel of the active zone as an active audio channel;
initiate an automatic speech recognition session for the active audio channel, wherein during the automatic speech recognition session, speech recognition is only performed on the active audio channel, wherein the speech recognition is performed by the application post processor on the processed microphone signal output from the zone interference cancellation module for the active audio channel; 
perform natural language processing on results of the speech recognition to determine an action to be performed, wherein both the active zone and the results of speech recognition are used to determine the action to be performed; and 
perform the determined action.

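Claim 23.  A non-transitory machine readable medium having tangibly stored thereon executable instructions for execution by a processor, wherein the executable instructions, when executed by the processor of the electronic device, cause the processor to: 
receive a microphone signal for each zone in a plurality of zones of an acoustic environment, wherein at least one microphone is located in each zone, wherein the microphone signal from each zone in the plurality of zones is provided by a separate audio channel extending between the at least one microphone located in each zone and an application post processor, wherein one audio channel is provided per zone;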
remove, by an acoustic echo cancellation module, echo caused by audio transducers in the acoustic environment from the microphone signal of each audio channel to generate an echo cancelled microphone signal for each audio channel;
remove, by a zone interference cancellation module, interference from the echo cancelled microphone signal of each audio channel to generate a processed microphone signal for each audio channel; 
perform speech recognition on the processed microphone signals of each audio channel to generate words from the processed microphone signals;
perform keyword spotting on the words generated from the processed microphone signals of each audio channel; and
in response to detection of a wake word in the words generated from the processed microphone signals of each zone in the plurality of zones:
set a first zone in the plurality of zones in which the wake word was detected as an active zone; 
set an audio channel of the active zone as an active audio channel;
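initiate an automatic speech recognition session for the active audio channel, wherein during the automatic speech recognition session, speech recognition is only performed on the active audio channel, wherein the speech recognition is performed by the application post processor on the processed microphone signal output from the zone interference cancellation module for the active audio channel; 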
perform natural language processing on results of the speech recognition to determine an action to be performed, wherein both the active zone and the results of speech recognition are used to determine the action to be performed; and 
perform the determined action.

Final Form Claim Set
Claim 1.  A method of voice control in a multi-talker and multimedia environment, comprising: 
receiving a microphone signal for each zone in a plurality of zones of an acoustic environment, wherein at least one microphone is located in each zone, wherein the microphone signal from each zone in the plurality of zones is provided by a separate audio channel extending between the at least one microphone located in each zone and an application post processor, wherein one audio channel is provided per zone;
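removing, by an acoustic echo cancellation module, echo caused by audio transducers in the acoustic environment from the microphone signal of each audio channel to generate an echo cancelled microphone signal for each audio channel;
removing, by a zone interference cancellation module, interference from the echo cancelled microphone signal of each audio channel to generate a processed microphone signal for each audio channel; 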
performing speech recognition on the processed microphone signals of each audio channel to generate words from the processed microphone signals;
performing keyword spotting on the words generated from the processed microphone signals of each audio channel; and
in response to detection of a wake word in the words generated from the processed microphone signals of each audio channel:
setting a first zone in the plurality of zones in which the wake word was detected as an active zone; 
setting an audio channel of the active zone as an active audio channel;
initiating an automatic speech recognition session for the active audio channel, wherein during the automatic speech recognition session, speech recognition is only performed on the active audio channel, wherein the speech recognition is performed by the application post processor on the processed microphone signal output from the zone interference cancellation module for the active audio channel; 
performing natural language processing on results of the speech recognition to determine an action to be performed, wherein both the active zone and the results of speech recognition are used to determine the action to be performed; and 
performing the determined action.
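
For illustration only, and not as part of the claim language or the record, the control flow recited in claim 1 can be sketched in Python. The wake word, zone names, and function names below are hypothetical, and already-recognized words are supplied directly in place of real echo cancellation, zone interference cancellation, and speech recognition stages.

```python
from typing import Dict, List, Optional

WAKE_WORD = "hello"  # hypothetical wake word, not taken from the record


def spot_wake_word(words_by_zone: Dict[str, List[str]]) -> Optional[str]:
    """Keyword spotting over every channel: return the first zone whose words contain the wake word."""
    for zone, words in words_by_zone.items():
        if WAKE_WORD in words:
            return zone
    return None


def determine_action(zone: str, words: List[str]) -> str:
    """Toy stand-in for natural language processing; uses both the active zone and the words."""
    if "warmer" in words:
        return f"raise temperature in {zone}"
    return f"no action for {zone}"


def run(frames: List[Dict[str, List[str]]]) -> List[str]:
    """Process frames of per-zone recognized words and return the actions determined."""
    active_zone: Optional[str] = None
    actions: List[str] = []
    for words_by_zone in frames:
        if active_zone is None:
            # No session yet: every channel is checked for the wake word.
            active_zone = spot_wake_word(words_by_zone)
            continue
        # Session open: only the active audio channel is considered.
        words = words_by_zone.get(active_zone, [])
        actions.append(determine_action(active_zone, words))
        active_zone = None  # this sketch ends the session after one utterance
    return actions
```

In this sketch, a request spoken in a zone other than the active zone during the session is ignored, which mirrors the limitation that speech recognition is only performed on the active audio channel.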

Claim 7.  The method of claim 1, wherein during the automatic speech recognition session, the echo caused by the audio transducers in the acoustic environment from each of the microphone signals is removed from the active audio channel and interference from the microphone signals of other audio channels is removed from the active audio channel.

Claim 17.  The method of claim 16, wherein removing interference speech caused by speech originating in other zones comprises: 
using measured signal and noise level differences between the plurality of microphone signals to detect speech of an occupant of a respective zone; 
for each zone in which speech of an occupant is detected, using an adaptive filter to estimate a speech contribution of the occupant on the microphone signals in other zones; 
for each microphone signal, removing the estimated speech contribution of occupants in other zones.
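
For illustration only, and not as part of the claim language or the record, the three steps of claim 17 can be sketched in Python. The sketch assumes a normalized least-mean-squares (NLMS) filter as the adaptive filter and a simple whole-signal level comparison as the speech detector; the claim does not name either technique, and the function names, tap count, step size, and 6 dB threshold are hypothetical.

```python
import math
from typing import Dict, List


def level_db(x: List[float]) -> float:
    """Mean signal power in dB."""
    power = sum(v * v for v in x) / len(x)
    return 10.0 * math.log10(power + 1e-12)


def nlms_cancel(reference: List[float], target: List[float],
                taps: int = 4, mu: float = 0.5) -> List[float]:
    """Estimate the reference talker's contribution to target and subtract it."""
    w = [0.0] * taps
    out = []
    for n in range(len(target)):
        # Most recent reference samples, newest first, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        estimate = sum(wk * xk for wk, xk in zip(w, x))
        err = target[n] - estimate  # target with the estimated contribution removed
        norm = sum(xk * xk for xk in x) + 1e-9
        w = [wk + mu * err * xk / norm for wk, xk in zip(w, x)]
        out.append(err)
    return out


def remove_cross_talk(mics: Dict[str, List[float]],
                      threshold_db: float = 6.0) -> Dict[str, List[float]]:
    """Detect talking zones by level difference, then cancel their speech in other zones."""
    levels = {zone: level_db(x) for zone, x in mics.items()}
    cleaned = {zone: list(x) for zone, x in mics.items()}
    for talker, ref in mics.items():
        others = [lvl for zone, lvl in levels.items() if zone != talker]
        if not others or levels[talker] < max(others) + threshold_db:
            continue  # no occupant speech detected in this zone
        for zone in mics:
            if zone != talker:
                cleaned[zone] = nlms_cancel(ref, cleaned[zone])
    return cleaned
```

A practical system would run the detection frame by frame and freeze adaptation when the local occupant is also speaking; the sketch omits both for brevity.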

Claim 22.  A system for voice control in a multi-talker and multimedia environment, comprising: 
a plurality of microphones, each microphone being located in and associated with a zone in a plurality of zones of an acoustic environment; 
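a plurality of speakers, each speaker being located in and associated with a zone in the plurality of zones of the acoustic environment; 
a processor system comprising one or more processors coupled to the plurality of microphones and the plurality of speakers programmed to: 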
receive a microphone signal for each zone in a plurality of zones of an acoustic environment, wherein at least one microphone is located in each zone, wherein the microphone signal from each zone in the plurality of zones is provided by a separate audio channel extending between the at least one microphone located in each zone and an application post processor, wherein one audio channel is provided per zone;
remove, by an acoustic echo cancellation module, echo caused by audio transducers in the acoustic environment from the microphone signal of each audio channel to generate an echo cancelled microphone signal for each audio channel;
remove, by a zone interference cancellation module, interference from the echo cancelled microphone signal of each audio channel to generate a processed microphone signal for each audio channel; 
perform speech recognition on the processed microphone signals of each audio channel to generate words from the processed microphone signals;
perform keyword spotting on the words generated from the processed microphone signals of each audio channel; and
in response to detection of a wake word in the words generated from the processed microphone signals of each zone in the plurality of zones:
set a first zone in the plurality of zones in which the wake word was detected as an active zone; 
set an audio channel of the active zone as an active audio channel;
initiate an automatic speech recognition session for the active audio channel, wherein during the automatic speech recognition session, speech recognition is only performed on the active audio channel, wherein the speech recognition is performed by the application post processor on the processed microphone signal output from the zone interference cancellation module for the active audio channel; 
perform natural language processing on results of the speech recognition to determine an action to be performed, wherein both the active zone and the results of speech recognition are used to determine the action to be performed; and 
perform the determined action.

Claim 23.  A non-transitory machine readable medium having tangibly stored thereon executable instructions for execution by a processor, wherein the executable instructions, when executed by the processor of the electronic device, cause the processor to: 
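receive a microphone signal for each zone in a plurality of zones of an acoustic environment, wherein at least one microphone is located in each zone, wherein the microphone signal from each zone in the plurality of zones is provided by a separate audio channel extending between the at least one microphone located in each zone and an application post processor, wherein one audio channel is provided per zone;
remove, by an acoustic echo cancellation module, echo caused by audio transducers in the acoustic environment from the microphone signal of each audio channel to generate an echo cancelled microphone signal for each audio channel;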
remove, by a zone interference cancellation module, interference from the echo cancelled microphone signal of each audio channel to generate a processed microphone signal for each audio channel; 
perform speech recognition on the processed microphone signals of each audio channel to generate words from the processed microphone signals;
perform keyword spotting on the words generated from the processed microphone signals of each audio channel; and
in response to detection of a wake word in the words generated from the processed microphone signals of each zone in the plurality of zones:
set a first zone in the plurality of zones in which the wake word was detected as an active zone; 
set an audio channel of the active zone as an active audio channel;
initiate an automatic speech recognition session for the active audio channel, wherein during the automatic speech recognition session, speech recognition is only performed on the active audio channel, wherein the speech recognition is performed by the application post processor on the processed microphone signal output from the zone interference cancellation module for the active audio channel; 
perform natural language processing on results of the speech recognition to determine an action to be performed, wherein both the active zone and the results of speech recognition are used to determine the action to be performed; and 
perform the determined action.


Reasons for Allowance
The following is an examiner’s statement of reasons for allowance.
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's Amendment filed on June 28, 2021 has been entered by way of the RCE filed on July 19, 2021.
Applicant’s arguments and amendments in the Amendment filed June 28, 2021 (herein “Amendment”), with respect to the objection to claims 1, 22 and 23, and therefore all claims depending therefrom, have been fully considered and are persuasive.  The objection to claims 1, 22 and 23, and therefore all claims depending therefrom, has been withdrawn. 
Applicant’s arguments and amendments in the Amendment with respect to the rejections of claims 1, 22 and 23, and claims depending therefrom under 35 U.S.C. 103 have been fully considered and are persuasive.  
Regarding claims 1, 22 and 23, and claims depending therefrom, the cited art of record does not, in any combination obvious to one having ordinary skill in the art before the effective filing date of the claimed invention, teach or suggest “the microphone 
The closest cited art of record includes Mohammad and Premont, as detailed in the most recent Office Action dated April 29, 2021. However, neither Mohammad nor Premont, nor the other cited art of record, whether considered alone or in a combination obvious to one of ordinary skill in the art before the effective filing date, teaches or suggests performing speech recognition specifically on the processed microphone signals of each audio channel, and performing keyword spotting on each audio channel to set a first zone where a wake word is detected as an active zone for further speech 
Newly cited prior art, Stokes et al., US 7,831,035 B2, is directed towards microphone array signal processing in which each microphone signal undergoes echo cancellation followed by a kind of interference cancellation (residual echo suppression, RES), as shown in figure 9. Stokes, however, does not disclose, nor would it be combinable in any combination obvious to one of ordinary skill to provide, speech recognition performed on each of the processed microphone signals, detection of a wake word therefrom, or the designation of one of the individual microphone channels for an ASR session/active session for intent determination.
Further, newly cited prior art Schillmoeller, US 20210118439 A1, is directed towards detecting a wake word and subsequent speech recognition in a system that includes multiple microphones with echo cancellation and noise cancellation, as shown in figure 7E. However, Schillmoeller does not teach or suggest separate audio channels for each microphone where separate echo cancellation followed by noise cancellation is performed, and therefore does not teach or suggest setting an audio channel as an active audio channel, and only performing speech recognition on the active audio channel.
Therefore, the cited art of record does not, in any combination obvious to one having ordinary skill in the art before the effective filing date of the claimed invention, 
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M KOETH whose telephone number is (571)272-5908.  The examiner can normally be reached on Monday-Friday, 9:30a-6:30p, EDT/EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached on 571-272-7453.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.



MICHELLE M. KOETH
Primary Examiner
Art Unit 2656


/MICHELLE M KOETH/Primary Examiner, Art Unit 2656