DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: “audio signal obtaining module configured to obtain,” “a state determining module, configured to determine,” “a parameter adjusting module, configured to adjust,” “a filter processing module, configured to control,” and “a speech recognition module, configured to perform” in claim 11, where these limitations are further limited respectively in dependent claims 12-15. Further, other limitations are: “a desired signal generating module, configured to ... perform,” “a signal inputting unit, configured to input,” and “a filter processing unit, configured to control,” in claim 16; “a target sound zone determining module, configured to ... determine,” and “an engine waking up module, configured to wake up,” in claim 18; and “a recognition result responding module, configured to ... respond,” in claim 19.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 9, 11, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ganeshkumar (US 10,499,139 B2 herein “Ganeshkumar”).
Regarding claims 1 and 20, Ganeshkumar teaches [a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are configured to cause the computer to execute – claim 20 only (Ganeshkumar col. 23, ll. 3-12 teach the functions disclosed are carried out with a digital signal processor (DSP) including software such as firmware)] a speech recognition method, comprising (Ganeshkumar fig. 5, col. 14, ll. 6-13, operations of system 500 to produce a voice output used for voice recognition and speech to text processing): 
obtaining audio signals collected by microphones in at least two sound zones (Ganeshkumar col. 11, ll. 30-34, col. 10, ll. 26-40, two sets of microphones, one for a right side and another for a left side (thus at least two sound zones – one for right zone, one for left zone) produce acoustic signals to the system 500); 
determining whether each audio signal comprises a key speech according to sound energy of the audio signal to acquire a determined result (Ganeshkumar col. 13, l. 49- col. 14, l. 5, a weighting calculator 570 monitors and analyzes the microphone signals from the right and left microphone zones (thus each audio signal) to determine an energy level that indicates a higher voice to noise ratio (comprises key speech)); 
adjusting an adaptive adjustment parameter of an adaptive filter in each sound zone according to the determined result (Ganeshkumar fig. 5, col. 13, ll. 32-50, weighting calculator establishes factors (adjustment parameter) at combiners 542 and 544 which as shown control the signal levels into the adaptive filter, where the factors are weights applied to the individual left and right signals (each sound zone), where for the combiner 542, the signals are the primary/speech signals, and for the combiner 544, the signals are the noise/reference signals); 
controlling the adaptive filter to perform adaptive filtering processing on the audio signal collected in the sound zone corresponding to the adaptive filter according to the adaptive adjustment parameter (Ganeshkumar fig. 5, col. 13, l. 58-col. 14, l. 5, the weights calculated by the weighting calculator are applied (controlling) to combiners 542 and 544 (which are shown to be components directly upstream of the adaptive filter, and as disclosed below, would be an obvious integration into an adaptive filter) respective to each left and right signals), and outputting a filtered signal (Ganeshkumar fig. 5, col. 12, ll. 7-12, adaptive filter 540 filters input signals and outputs a voice estimate signal 556, which is further spectrally enhanced to produce voice output 562); and 
performing speech recognition on the filtered signal (Ganeshkumar col. 14, ll. 6-11, voice output signal is provided to a virtual personal assistant for voice recognition and speech to text processing).
Although Ganeshkumar shows the combiners 542 and 544 as components separate from the adaptive filter 540, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have the combiners and applied weights be part of the adaptive filter at least because doing so would simply involve moving them as a first stage of the adaptive filter with the predictable result that the outputs of the combiners are then processed by the other components of the adaptive filter. Moreover, col. 23, ll. 3-12 of Ganeshkumar discloses that combinations of components may be carried out in a digital signal processor, thus at least suggesting the predictable results of combining the components disclosed. Therefore, such a combination would have been combining prior art elements according to known methods to yield predictable results. See MPEP 2143(I)(A).
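For illustration only, the arrangement discussed above (per-zone weights applied at two combiners whose outputs then feed the adaptive filter) might be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Ganeshkumar: the function names, the two-zone (left/right) layout, and the convention that each zone pair is mixed with weights summing to one are all hypothetical choices made for the example.

```python
from typing import List, Tuple


def combine(left: List[float], right: List[float], left_weight: float) -> List[float]:
    """Weighted sum of a left-zone and a right-zone signal.

    Assumption: the two weights sum to one, so only the left weight is passed.
    """
    if not 0.0 <= left_weight <= 1.0:
        raise ValueError("left_weight must be in [0, 1]")
    if len(left) != len(right):
        raise ValueError("signals must have the same length")
    return [left_weight * l + (1.0 - left_weight) * r for l, r in zip(left, right)]


def filter_front_end(
    primary_left: List[float],
    primary_right: List[float],
    reference_left: List[float],
    reference_right: List[float],
    primary_left_weight: float,
    reference_left_weight: float,
) -> Tuple[List[float], List[float]]:
    """Hypothetical first stage of an adaptive filter: both combiners in one place.

    Returns the combined primary signal and the combined reference signal,
    which a later adaptive stage would then process.
    """
    primary = combine(primary_left, primary_right, primary_left_weight)
    reference = combine(reference_left, reference_right, reference_left_weight)
    return primary, reference
```

Whether the combiners sit in front of the filter or inside it as a first stage, the signals reaching the adaptive stage are the same in this sketch, which is the point of the predictable-results rationale above.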
Regarding claim 9, Ganeshkumar teaches wherein after performing the speech recognition on the filtered signal, the method further comprises: responding to a speech recognition result of the filtered signal according to the speech recognition result (Ganeshkumar fig. 5, col. 14, ll. 6-30, the voice output signal is provided to various other components such as a virtual personal assistant performing voice recognition and speech to text to be further provided for internet searching (response to speech recognition)) in combination with a setting function of the sound zone (the Examiner notes that this limitation is broad and includes any “setting function” so long as it is “of the sound zone”, where setting function is broad and includes any function that involves a “setting” of some kind, where “calendar management” is also a given response of the speech processing in Ganeshkumar, col. 14, ll. 6-30, where an obvious type of calendar management would be to make a calendar appointment, which is both a “setting function” and “of the sound zone” since calendars are specific to people, and so the person who is making the calendar appointment via speech is the person making the speech in the sound zone).
While Ganeshkumar teaches that the speech recognition of the voice output can be used for “calendar management,” Ganeshkumar does not explicitly enumerate what the calendar management functions are. However, there is only a closed set of calendar management functions that a virtual personal assistant would perform, one such function being to make a calendar appointment, and as such, modifying the calendar management of Ganeshkumar to include making a calendar appointment would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention as doing so would have been obvious to try. See MPEP 2143(I)(E).
Regarding claim 11, Ganeshkumar teaches a speech recognition apparatus, comprising: a non-transitory computer-readable medium including computer-executable instructions stored thereon, and an instruction execution system which is configured by the instructions to implement at least one of: (Ganeshkumar fig. 5, col. 14, ll. 6-13, operations of system 500 to produce a voice output where col. 14, ll. 6-15 teach the voice output is used for voice recognition and speech to text processing, and col. 23, ll. 3-12 teach the functions disclosed are carried out with a digital signal processor (DSP) including software such as firmware): 
an audio signal obtaining module (Ganeshkumar col. 23, ll. 3-12, DSP), configured to obtain audio signals collected by microphones in at least two sound zones (Ganeshkumar col. 11, ll. 30-34, col. 10, ll. 26-40, two sets of microphones, one for a right side and another for a left side (thus at least two sound zones – one for right zone, one for left zone) produce acoustic signals to the system 500); 
a state determining module  (Ganeshkumar col. 23, ll. 3-12, DSP), configured to determine whether each audio signal comprises a key speech according to sound energy of the audio signal to acquire a determined result (Ganeshkumar col. 13, l. 49- col. 14, l. 5, a weighting calculator 570 monitors and analyzes the microphone signals from the right and left microphone zones (thus each audio signal) to determine an energy level that indicates a higher voice to noise ratio (comprises key speech)); 
a parameter adjusting module (Ganeshkumar col. 23, ll. 3-12, DSP), configured to adjust an adaptive adjustment parameter of an adaptive filter in each sound zone according to the determined result (Ganeshkumar fig. 5, col. 13, ll. 32-50, weighting calculator establishes factors (adjustment parameter) at combiners 542 and 544 which as shown control the signal levels into the adaptive filter, where the factors are weights applied to the individual left and right signals (each sound zone), where for the combiner 542, the signals are the primary/speech signals, and for the combiner 544, the signals are the noise/reference signals); 
a filter processing module (Ganeshkumar col. 23, ll. 3-12, DSP), configured to control the adaptive filter to perform adaptive filtering processing on the audio signal collected in the sound zone corresponding to the adaptive filter according to the adaptive adjustment parameter (Ganeshkumar fig. 5, col. 13, l. 58-col. 14, l. 5, the weights calculated by the weighting calculator are applied (controlling) to combiners 542 and 544 (which are shown to be components directly upstream of the adaptive filter, and as disclosed below, would be an obvious integration into an adaptive filter) respective to each left and right signals), and output a filtered signal (Ganeshkumar fig. 5, col. 12, ll. 7-12, adaptive filter 540 filters input signals and outputs a voice estimate signal 556, which is further spectrally enhanced to produce voice output 562); and 
a speech recognition module  (Ganeshkumar col. 23, ll. 3-12, DSP), configured to perform speech recognition on the filtered signal (Ganeshkumar col. 14, ll. 6-11, voice output signal is provided to a virtual personal assistant for voice recognition and speech to text processing).
Although Ganeshkumar shows the combiners 542 and 544 as components separate from the adaptive filter 540, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have the combiners and applied weights be part of the adaptive filter at least because doing so would simply involve moving them as a first stage of the adaptive filter with the predictable result that the outputs of the combiners are then processed by the other components of the adaptive filter. Moreover, col. 23, ll. 3-12 of Ganeshkumar discloses that combinations of components may be carried out in a digital signal processor, thus at least suggesting the predictable results of combining the components disclosed. Therefore, such a combination would have been combining prior art elements according to known methods to yield predictable results. See MPEP 2143(I)(A).
Regarding claim 19, Ganeshkumar teaches wherein the instruction execution system is further configured by the instructions to implement: a recognition result responding module (Ganeshkumar col. 23, ll. 3-12, DSP), configured to, after performing the speech recognition on the filtered signal, respond to a speech recognition result of the filtered signal according to the speech recognition result (Ganeshkumar fig. 5, col. 14, ll. 6-30, the voice output signal is provided to various other components such as a virtual personal assistant performing voice recognition and speech to text to be further provided for internet searching (response to speech recognition)) in combination with a setting function of the sound zone (the Examiner notes that this limitation is broad and includes any “setting function” so long as it is “of the sound zone”, where setting function is broad and includes any function that involves a “setting” of some kind, where “calendar management” is also a given response of the speech processing in Ganeshkumar, col. 14, ll. 6-30, where an obvious type of calendar management would be to make a calendar appointment, which is both a “setting function” and “of the sound zone” since calendars are specific to people, and so the person who is making the calendar appointment via speech is the person making the speech in the sound zone).
While Ganeshkumar teaches that the speech recognition of the voice output can be used for “calendar management,” Ganeshkumar does not explicitly enumerate what the calendar management functions are. However, there is only a closed set of calendar management functions that a virtual personal assistant would perform, one such function being to make a calendar appointment, and as such, modifying the calendar management of Ganeshkumar to include making a calendar appointment would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention as doing so would have been obvious to try. See MPEP 2143(I)(E).
Claims 2-3, 5-6, 12-13 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Ganeshkumar, as set forth above regarding claim 1 from which claims 2 and 5 depend, and as regarding claim 11 from which claims 12 and 15 depend, and further in view of Sugiyama et al., (US 2009/0121934 A1, herein “Sugiyama”).
Regarding claims 2 and 12, Ganeshkumar teaches [wherein the determining whether each audio signal comprises the key speech according to the sound energy of the audio signal comprises (Ganeshkumar fig. 5, as shown, signal processing that occurs upstream of the weighting calculator 570 that monitors and analyzes the microphone signals from the right and left microphone zones (thus each audio signal) to determine an energy level that indicates a higher voice to noise ratio (comprises key speech), where col. 13, ll. 49-57 teaches the determined weighting being from monitored and analyzed primary and reference signals from the left and right microphones (zones)): - claim 2 / wherein the state determining module comprises (Ganeshkumar col. 23, ll. 3-12, DSP): - claim 12]: 
[an audio signal inputting unit (Ganeshkumar col. 23, ll. 3-12, DSP), configured to: - claim 12] inputting/input the audio signal to a blocking corresponding to the sound zone, and determining/determine the sound zone as a current sound zone of the blocking matrix (Ganeshkumar fig. 5, col. 10, l. 56-col. 11, l. 11, the right microphones (zone) input their signals to a right null steering 514 (blocking matrix), and the left microphones (zone) input their signals to a left null steering 524 (blocking matrix), where col. 13, ll. 39-49 teaches that the weighting calculator decides the weighting/which or what balance of left and right signals will contribute more to the calculated reference signal (a current sound zone of the blocking matrix)); 
[a reference signal determining unit (Ganeshkumar col. 23, ll. 3-12, DSP), configured to – claim 12] determining/determine, for the blocking, at least one reference signal of the current sound zone based on an audio signal of the current sound zone and audio signals of at least one non-current sound zone (Ganeshkumar fig. 5, col. 10, l. 56-col. 11, l. 11, and col. 13, ll. 39-49, reference signal 548 is determined from a weighting of the two zones, where one may contribute more than the other (the current sound zone), but both can contribute according to the given weighting), wherein the at least one reference signal is configured to environmental noises other than the key speech in the current sound zone (Ganeshkumar col. 10, l. 62-col. 11, l. 5, the right and left null processors reduce the user voice component, thus the noise component being more prominent);
[a state determining unit (Ganeshkumar col. 23, ll. 3-12, DSP), configured to – claim 12] performing/perform comparison on sound energies of reference signals of the at least two sound zones to obtain a comparison result, and determining whether the audio signal comprises the key speech according to the comparison result (Ganeshkumar col. 13, ll. 49-61, col. 14, ll. 1-5, weighting calculator analyzes the left and right reference signals and determines which one has more noise based on energy, and weighs that reference signal more heavily, determining that the less noisy reference has speech).
Although Ganeshkumar teaches generating a reference signal from null processing, this is not the same as a blocking matrix, and does not “strengthen environmental noises”; rather, it reduces the signal to noise ratio.
However, Sugiyama teaches a blocking matrix, that enhances interference (Sugiyama para. 68, blocking matrix circuit 340 enhances interference).
Therefore, taking the teachings of Ganeshkumar and Sugiyama together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the null steering of Ganeshkumar to be a blocking matrix as disclosed in Sugiyama at least because doing so would provide accurate interference estimation (see Sugiyama para. 64).
Regarding claims 3 and 13, Ganeshkumar teaches [wherein the performing the comparison on the sound energies of reference signals of the at least two sound zones to obtain the comparison result, and determining whether the audio signal comprises the key speech according to the comparison result comprise (Ganeshkumar col. 13, ll. 49-61, col. 14, ll. 1-5, weighting calculator analyzes the left and right reference signals and determines which one has more noise based on energy, and weighs that reference signal more heavily, determining that the less noisy reference has speech) – claim 3 / wherein the state determining unit comprises (Ganeshkumar col. 23, ll. 3-12, DSP) – claim 13]: 
[a state determining subunit (Ganeshkumar col. 23, ll. 3-12, DSP), configured to – claim 13] performing/perform the comparison on the sound energies of reference signals of the at least two sound zones, determining/determine that an audio signal collected in a sound zone with the smallest sound energy comprises the key speech, and determining/determine that audio signals collected in sound zones other than the sound zone with the smallest sound energy do not comprise the key speech (Ganeshkumar col. 13, l. 49 - col. 14, l. 5, weighting calculator analyzes the left and right reference signals and determines which one has more noise based on the energy of the signal being the smallest, where the signal with the lower total amplitude or energy is determined to have less of the noise, and thus is the “zone” (left or right) that has the voice in its signal, where the other zone (left or right) is considered to have mostly noise and thus not the voice signal).
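For illustration only, the energy comparison recited in claims 3 and 13 might be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation disclosed by Ganeshkumar or by the application: energy is taken to be the mean squared amplitude of one frame, and the function names and zone labels are hypothetical.

```python
from typing import Dict, List


def signal_energy(samples: List[float]) -> float:
    """Mean squared amplitude of one frame (an assumed energy measure)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def zones_with_key_speech(reference_signals: Dict[str, List[float]]) -> Dict[str, bool]:
    """Flag the zone whose reference signal has the smallest energy.

    Per the claim language, only that zone is treated as comprising the key
    speech; every other zone is treated as not comprising it.
    """
    energies = {zone: signal_energy(sig) for zone, sig in reference_signals.items()}
    quietest = min(energies, key=energies.get)
    return {zone: zone == quietest for zone in energies}
```

With a left reference frame of `[0.1, -0.1, 0.1]` and a right reference frame of `[0.5, -0.6, 0.4]`, the left zone has the smaller reference energy and would be the one flagged.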
Regarding claims 5 and 15, Ganeshkumar does not teach the limitations of claim 5. Ganeshkumar teaches wherein the parameter adjusting module comprises: a adjusting unit (Ganeshkumar col. 23, ll. 3-12, DSP), but does not teach the remainder of the limitations of claim 15. Sugiyama teaches [wherein the adjusting the adaptive adjustment parameter of the adaptive filter in each sound zone according to the determined result comprises: - claim 5] adjusting/adjust step size calculation strategy of the adaptive filter in the sound zone where the audio signal comprises the key speech to precise step size strategy (Sugiyama paras. 75-78, 89, when normalized mutual-correlation is larger than a threshold, it is determined that the target signal (speech) is prevailing in the input signals (thus also the sound zone signal where key speech is present), where speed (step size) of the update of the coefficients of the adaptive blocking matrix filter and multi-input canceller is controlled by the value of the normalized mutual-correlation, such that when target (speech) signal is present/prevailing, the accuracy of the coefficient update is changed by using a smaller step); and 
adjusting/adjust the step size calculation strategy of the adaptive filter in the sound zone where the audio signal does not comprise the key speech to rough step size strategy (Sugiyama paras. 75-78, 89, when normalized mutual-correlation is not larger than a threshold, the target signal (speech) is not prevailing in the input signals (thus also the sound zone signal where key speech would be present), where speed (step size) of the update of the coefficients of the adaptive blocking matrix filter and multi-input canceller is controlled by the value of the normalized mutual-correlation, such that when target (speech) signal is not present/prevailing, the speed/accuracy of the coefficient update is changed to be slower/bigger steps); wherein a step size determined by the precise step size strategy is smaller than a step size determined by the rough step size strategy (Sugiyama paras. 75-78, smaller step for when the target signal is prevailing is a more accurate/precise controlling, with a smaller step size versus a slower update speed/accuracy for when the target signal is not prevailing (rough)).
Therefore, taking the teachings of Ganeshkumar and Sugiyama together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the filtering of Ganeshkumar to include updates via different step sizes as disclosed in Sugiyama at least because doing so would provide accurate interference estimation (see Sugiyama para. 64).
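For illustration only, the step size strategies recited in claims 5 and 15 might be sketched as follows. This is a minimal sketch under stated assumptions: the Office action does not say which adaptation rule either reference uses, so a normalized LMS (NLMS) update is swapped in as a stand-in, and the two step size constants and all function names are hypothetical.

```python
from typing import List, Tuple

PRECISE_STEP = 0.05  # hypothetical "precise" (smaller) step size
ROUGH_STEP = 0.5     # hypothetical "rough" (larger) step size


def choose_step_size(has_key_speech: bool) -> float:
    """Smaller step for a zone that comprises the key speech, larger otherwise."""
    return PRECISE_STEP if has_key_speech else ROUGH_STEP


def nlms_update(
    weights: List[float],
    reference: List[float],
    desired_sample: float,
    step_size: float,
    eps: float = 1e-8,
) -> Tuple[List[float], float]:
    """One NLMS coefficient update; returns the new weights and the error sample."""
    estimate = sum(w * x for w, x in zip(weights, reference))
    error = desired_sample - estimate
    norm = sum(x * x for x in reference) + eps  # eps avoids division by zero
    new_weights = [w + step_size * error * x / norm for w, x in zip(weights, reference)]
    return new_weights, error
```

The only claim-relevant property of the sketch is the ordering of the two constants: the step chosen for a zone with key speech is smaller than the step chosen for a zone without it.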
Regarding claims 6 and 16, Ganeshkumar teaches [the instruction execution system is further configured by the instructions to implement: a desired signal generating module (Ganeshkumar col. 23, ll. 3-12, DSP), configured to – claim 16 / wherein – claim 6] before controlling the adaptive filter to perform adaptive filtering processing on the audio signal collected in the sound zone corresponding to the adaptive filter according to the adaptive adjustment parameter, [the method further comprises – claim 6 only] (Ganeshkumar fig. 5, as shown, upstream processing from the adaptive filter 540, thus before):
performing/perform filtering processing on the audio signal of the sound zone by adopting at least two parameter filters corresponding to the sound zone, so as to generate a desired signal, wherein the desired signal is configured to strengthen the key speech in the sound zone (Ganeshkumar fig. 5, col. 10, l. 56 – col. 11, l. 3, beamformers 512 and 522, which produce a primary signal in which the user’s voice (desired signal with speech) is increased (strengthened), where each beamformer has a beam respectively directed towards the user’s mouth, in a headset configuration, where one microphone is on the left, and the other is on the right (see figs. 1 and 2)); and 
[controlling the adaptive filter to perform adaptive filtering processing on the audio signal collected in the sound zone corresponding to the adaptive filter according to the adaptive adjustment parameter, and outputting a filtered signal comprise (Ganeshkumar fig. 5, col. 12, ll. 7-18, by way of the adaptive coefficients, the adaptive filter receives the combined primary signal and the combined reference signal and applies a digital filter to output a voice estimate signal and a noise estimate signal) – claim 6 only]: 
[the filter processing module comprise: a signal inputting unit, configured to (Ganeshkumar col. 23, ll. 3-12, DSP) – claim 16 only] inputting/input the desired signal and the reference signal of the sound zone into the adaptive filter corresponding to the sound zone (Ganeshkumar fig. 5, col. 12, ll. 7-18, the adaptive filter receives the combined primary signal (desired signal) and the combined reference signal); and 
[a filter processing unit, configured to (Ganeshkumar col. 23, ll. 3-12, DSP) – claim 16 only] controlling/control the adaptive filter to perform the adaptive filtering processing on the desired signal and the reference signal by adopting the adaptive adjustment parameter, and outputting the filtered signal (Ganeshkumar fig. 5, col. 12, ll. 7-18, by way of the adaptive coefficients (adaptive adjustment parameter), the adaptive filter applies a digital filter to output a voice estimate signal and a noise estimate signal).
Ganeshkumar does not explicitly teach that the beamformers have “fixed parameter.” However, Sugiyama teaches fixed beamformers with fixed parameters (Sugiyama paras. 16-17, fixed beamformer operating on multiple signals (beamformers) with the beamformer having a set group delay time).
Therefore, taking the teachings of Ganeshkumar and Sugiyama together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the beamformers of Ganeshkumar to include fixed beamformers as disclosed in Sugiyama at least because doing so would provide accurate interference estimation (see Sugiyama para. 64).
Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Ganeshkumar in view of Sugiyama, as set forth above regarding claim 2 from which claim 4 depends, and regarding claim 12 from which claim 14 depends, further in view of Vitte et al., (US 2011/0070926 A1, herein “Vitte”).
Regarding claims 4 and 14, Ganeshkumar teaches [wherein the performing the comparison on the sound energies of reference signals of the at least two sound zones to obtain the comparison result, and determining whether the audio signal comprises the key speech according to the comparison result comprise (Ganeshkumar col. 13, ll. 49-61, col. 14, ll. 1-5, weighting calculator analyzes the left and right reference signals and determines which one has more noise based on energy, and weighs that reference signal more heavily, determining that the less noisy reference has speech) – claim 4 / wherein the state determining unit comprises: a probability relative order determining subunit, configured to (Ganeshkumar col. 23, ll. 3-12, DSP) – claim 14]: 
performing/perform the comparison on the sound energies of the reference signals of the at least two sound zones to obtain the comparison result, and determining/determine a relative order of each audio signal comprising the key speech according to the comparison result (Ganeshkumar col. 13, l. 58 – col. 14, l. 5, weighting calculator determines which side (zone- left or right) has the lower or higher total energy (thus comparing the total sound energies of left versus right) where an order is determined – that is the lower energy one of the two is considered to have less noise, and thus have more of the speech signal); and 
[a state determining unit (Ganeshkumar col. 23, ll. 3-12, DSP), configured to: - claim 14 only] determining/determine that an audio signal collected in a sound zone with the maximum comprises the key speech according to the comparison result, and determining that audio signals collected in sound zones other than the sound zone with the maximum do not comprise the key speech (Ganeshkumar col. 13, l. 49 - col. 14, l. 5, the signal with the lower total amplitude or energy is determined to have less of the noise, and thus is the “zone” (left or right) that has the voice in its signal, where the other zone (left or right) is considered to have mostly noise and thus not the voice signal).
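For illustration only, the energy comparison mapped above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Ganeshkumar or of the claimed invention: the function names `total_energy` and `pick_voice_zone`, the zone labels, and the sample values are hypothetical, and the rule that the lowest-energy reference signal marks the zone containing the voice is taken from the examiner's reading of Ganeshkumar col. 13, l. 49 - col. 14, l. 5.

```python
def total_energy(samples):
    """Return the total energy of a signal as the sum of squared samples."""
    return sum(s * s for s in samples)


def pick_voice_zone(reference_signals):
    """Compare reference-signal energies across zones.

    reference_signals maps a zone name (hypothetical labels such as
    "left" and "right") to a list of samples. Returns the zone whose
    reference signal has the lowest energy (assumed here to be the
    least noisy zone) and the zones ordered from lowest to highest energy.
    """
    energies = {zone: total_energy(sig) for zone, sig in reference_signals.items()}
    order = sorted(energies, key=energies.get)
    return order[0], order


# Hypothetical reference signals for two zones.
zone, order = pick_voice_zone({
    "left": [0.1, -0.1, 0.2],
    "right": [0.5, -0.6, 0.4],
})
```

In this sketch the ordering of zones by energy stands in for the claimed "relative order", and the first zone in that ordering is the one treated as containing the voice.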
Ganeshkumar does not explicitly teach probabilities or probability.
Vitte teaches probabilities and probability (Vitte fig. 1, paras. 23, 100-112, system calculating the probability of speech being present (or in the inverse, absent) in an input audio signal).
Therefore, taking the teachings of Ganeshkumar and Vitte together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the weighting calculations of Ganeshkumar to include the calculations of the probability of speech being present as disclosed in Vitte at least because doing so would allow for distinguishing between non-steady noise and speech, and allow adoption of de-noising to the detected presence of non-steady noise (see Vitte paras. 12-13).
Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Ganeshkumar in view of Sugiyama, as set forth above regarding claim 6 from which claim 7 depends, and regarding claim 16 from which claim 17 depends, further in view of Even et al., "Semi-blind suppression of internal noise for hands-free robot spoken dialog system," 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 658-663, doi: 10.1109/IROS.2009.5354451 (herein “Even NPL”).
Regarding claims 7 and 17, Ganeshkumar teaches wherein the adaptive filter is an adaptive filter (Ganeshkumar fig. 5, adaptive filter 540), and the parameter filters are parameter beamforming filters (Ganeshkumar fig. 5, col. 10, l. 56 – col. 11, l. 3, beamformers 512 and 522, which produce a primary signal in which the user’s voice (desired signal with speech) is increased (strengthened)).
Ganeshkumar does not explicitly teach an adaptive beamforming filter.
Ganeshkumar further does not teach fixed filters, or that initial parameters of the fixed parameter beamforming filters and blocking matrixes are determined according to sound transmission time delays among the microphones in the at least two sound zones.
However, Sugiyama teaches fixed beamformers with fixed parameters (Sugiyama paras. 16-17, fixed beamformer operating on multiple signals (beamformers) with the beamformer having a set group delay time).
Sugiyama further teaches that initial parameters of the fixed parameter beamforming filters and blocking matrixes are determined according to sound transmission time delays among the microphones in the at least two sound zones (Sugiyama paras. 16-17 and 82, fixed beamformer having a set group delay time, where the blocking matrices have a delay z^(-iD) between microphones).
Even NPL teaches an adaptive beamforming filter (Even NPL fig. 4, section III(E), a delay and sum beamformer is adapted by way of the direction θ_target of the target speech).
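For illustration only, a delay and sum beamformer steered toward a target direction can be sketched as follows. This is a generic textbook sketch under stated assumptions, not the implementation of Even NPL or Sugiyama: it assumes a uniform linear array, a far-field source, whole-sample delays, and a speed of sound of 343 m/s, and the function names `steering_delays` and `delay_and_sum` are hypothetical.

```python
import math


def steering_delays(num_mics, spacing_m, theta_target_rad, sample_rate_hz,
                    speed_of_sound=343.0):
    """Return per-microphone compensating delays, in whole samples.

    Assumes a uniform linear array and a far-field source at angle
    theta_target_rad measured from broadside. Microphone m receives the
    wavefront m * spacing * sin(theta) / c seconds after microphone 0;
    each channel is delayed so that all channels line up with the one
    that receives the wavefront last.
    """
    arrival = [
        round(m * spacing_m * math.sin(theta_target_rad)
              / speed_of_sound * sample_rate_hz)
        for m in range(num_mics)
    ]
    latest = max(arrival)
    return [latest - a for a in arrival]


def delay_and_sum(channels, delays):
    """Delay each channel by its sample count, then average the channels."""
    n = len(channels[0])
    out = [0.0] * n
    for channel, d in zip(channels, delays):
        for i in range(d, n):
            out[i] += channel[i - d]
    return [v / len(channels) for v in out]


# Hypothetical two-microphone example: the second microphone hears the
# same impulse one sample later than the first.
delays = steering_delays(2, 0.0343, math.pi / 2, 10000)
output = delay_and_sum([[0, 0, 1, 0, 0], [0, 0, 0, 1, 0]], delays)
```

When the delays are held constant the sketch behaves as a fixed beamformer; recomputing them as the estimated target direction changes is one simple way such a beamformer can be adapted.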
Therefore, taking the teachings of Ganeshkumar and Sugiyama together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the beamformers of Ganeshkumar to include fixed beamformers as disclosed in Sugiyama at least because doing so would provide accurate interference estimation (see Sugiyama para. 64).
Further, taking the teachings of Ganeshkumar and Even NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the adaptive filter of Ganeshkumar to be an adaptive beamformer as disclosed in Even NPL at least because doing so would improve word accuracy in a dictation task in presence of diffuse background noise and robot noise (see Even NPL section I).
Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Ganeshkumar, as set forth above regarding claim 1 from which claim 8 depends, and regarding claim 11 from which claim 18 depends, further in view of Huang et al., "Hotword Cleaner: Dual-microphone Adaptive Noise Cancellation with Deferred Filter Coefficients for Robust Keyword Spotting," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 6346-6350, doi: 10.1109/ICASSP.2019.8682682 (herein “Huang NPL”).
Regarding claims 8 and 18, Ganeshkumar teaches [wherein after performing the speech recognition on the filtered signal, the method further comprises: - claim 8/ wherein the instruction execution system is further configured by the instructions to implement: a target sound zone determining module (Ganeshkumar col. 23, ll. 3-12, DSP), configured to, after performing the speech recognition on the filtered signal – claim 18] determining/determine a sound zone where the audio signal comprises the key speech as a target sound zone (Ganeshkumar col. 13, l. 49- col. 14, l. 5, a weighting calculator 570 monitors and analyzes the microphone signals from the right and left microphone zones (thus each audio signal) to determine an energy level in a particular one of the right or left signals (zones) that indicates a higher voice to noise ratio (comprises key speech)).
Regarding just claim 18, Ganeshkumar also teaches an engine waking up module, configured to (Ganeshkumar col. 23, ll. 3-12, DSP – which is programmable to perform any function).
Ganeshkumar does not teach waking/wake up a speech recognition engine for recognizing subsequent audio signals of the target sound zone when a speech recognition result of the target sound zone comprises a wake-up word.
Huang NPL teaches waking/wake up a speech recognition engine for recognizing subsequent audio signals of the target sound zone when a speech recognition result of the target sound zone comprises a wake-up word (Huang NPL sections 1 and 5, a hotword is detected in one of two microphone audio signals, where hotword is another term for a wake word, and where upon the wake word being detected, speech recognition and comprehension is activated (waking up a speech recognition engine)).
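For illustration only, the wake-up behavior mapped above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Huang NPL or of the claimed invention: the class name `WakeWordGate`, its methods, the zone labels, and the example wake-up word are hypothetical, and wake-up word detection is reduced to a case-insensitive substring match on a recognition result.

```python
class WakeWordGate:
    """Enable recognition of later audio only for the zone that spoke the wake-up word."""

    def __init__(self, wake_word):
        self.wake_word = wake_word.lower()
        self.active_zone = None  # no zone is awake until the wake-up word is seen

    def on_recognition_result(self, zone, text):
        """Wake the engine for `zone` if its recognition result contains the wake-up word."""
        if self.active_zone is None and self.wake_word in text.lower():
            self.active_zone = zone
        return self.active_zone

    def should_recognize(self, zone):
        """Return True if subsequent audio from `zone` should be recognized."""
        return self.active_zone == zone


# Hypothetical usage with two zones.
gate = WakeWordGate("hello car")
gate.on_recognition_result("passenger", "turn it up")
gate.on_recognition_result("driver", "Hello car, navigate home")
```

After the second call, only subsequent audio from the hypothetical "driver" zone would be passed on for recognition.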
Therefore, taking the teachings of Ganeshkumar and Huang NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the speech detection of Ganeshkumar to include the hotword detection and activation thereafter as disclosed in Huang NPL at least because doing so would reduce a false-reject rate of speech with hotwords in them when the speech is in the presence of strong background noise (see Huang NPL, Abstract).
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Ganeshkumar, as set forth above regarding claim 9 from which claim 10 depends, further in view of Vitte.
Regarding claim 10, Ganeshkumar does not teach the limitations of claim 10. Vitte teaches wherein the at least two sound zones comprise a driver sound zone and at least one non-driver sound zone (Vitte paras. 9, 46, 62, an array of microphones arranged in a predetermined configuration, where the microphones are spaced apart, and the main lobe is directed towards the driver (driver sound zone), whereas the non-driver interference sounds are from a “far away” source (non-driver sound zone) such as a car horn, or a scooter going past the car).
Therefore, taking the teachings of Ganeshkumar and Vitte together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the microphone locations/sound zones of Ganeshkumar to include directionality of a driver’s noise as well as other farther noises as disclosed in Vitte at least because doing so would allow for distinguishing between non-steady noise and speech, and allow adoption of de-noising to the detected presence of non-steady noise (see Vitte paras. 12-13).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Hoshuyama et al., "A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters," in IEEE Transactions on Signal Processing, vol. 47, no. 10, pp. 2677-2684, Oct. 1999, doi: 10.1109/78.790650. Hoshuyama is directed towards noise filtering for microphone arrays using a fixed beamformer, blocking matrices, and multi-input canceller that use a determined desired signal component and a reference signal component.
Baik, US 8,468,018 B2, directed towards canceling noise in a voice signal using a type of beamformer – a generalized sidelobe canceller that can be adjusted according to step-sizes determined by the signal-to-noise ratio of an input signal.
Visser et al., US 2007/0021958 A1, directed towards separating speech signals in a noise environment where there is a plurality of microphones and the speech signal level in each microphone signal is considered in order to determine adjustments to filtering out the noise.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M KOETH whose telephone number is (571)272-5908. The examiner can normally be reached Monday-Friday, 09:30-18:30 EDT/EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MICHELLE M. KOETH
Primary Examiner
Art Unit 2656