Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Compact Prosecution
The Examiner would like to suggest amending independent claims 1 and 11 to include the limitations wherein an input having a maximum and a minimum power, such as word error rates and spectra extracted from sentences read from a Speecon database and a speech-to-text service, wherein an ASR performance is selected when the word error rate is calculated as the ratio of incorrectly recognized words. These amendments will overcome the current rejection.
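As an illustrative sketch only, the word error rate referred to above could be computed as follows. The suggestion states only that it is "the ratio of incorrectly recognized words"; the word-level edit-distance definition used here (substitutions, deletions and insertions divided by the number of reference words) is an assumption, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: errors are counted as substitutions + deletions + insertions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, a four-word reference recognized with one wrong word gives a rate of one quarter under this definition.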
Allowable Subject Matter
Claims 7 and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. None of the prior art cited, alone or in combination, provides the motivation to teach the limitation wherein the apparatus is caused to estimate a direction of sound arrival by being further caused to one of:
estimate a direction of sound arrival based on a cross-correlation analysis of the two or more audio signals; 
estimate a direction of sound arrival based on a cross-correlation analysis of the two or more audio signals when an active speech segment is detected; or 
estimate a direction of sound arrival based on a cross-correlation analysis of the two or more audio signals when an active speech segment is detected and a noise level of the two or more audio signals is lower than a threshold value.
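The three alternatives above can be sketched, under stated assumptions, as a cross-correlation lag search with optional gating. The two-microphone far-field geometry, the speed of sound of 343 m/s, and all function and parameter names (`cross_correlation_lag`, `arrival_angle_degrees`, `estimate_direction`, `speech_active`, `noise_threshold`) are hypothetical and are not taken from the claims or the cited references.

```python
import math


def cross_correlation_lag(x1, x2, max_lag):
    """Return the lag d (in samples) that maximizes sum_n x1[n] * x2[n + d]."""
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n in range(len(x1)):
            m = n + lag
            if 0 <= m < len(x2):
                total += x1[n] * x2[m]
        if total > best_value:
            best_lag, best_value = lag, total
    return best_lag


def arrival_angle_degrees(lag, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Convert an inter-microphone lag to a far-field arrival angle."""
    delay_s = lag / sample_rate_hz
    # Clamp so that measurement noise cannot push asin outside its domain.
    s = max(-1.0, min(1.0, delay_s * speed_of_sound / mic_spacing_m))
    return math.degrees(math.asin(s))


def estimate_direction(x1, x2, max_lag, sample_rate_hz, mic_spacing_m,
                       speech_active=True, noise_level=0.0,
                       noise_threshold=float("inf")):
    """Estimate the angle only when speech is active and noise is below a threshold.

    With the default arguments the gate is always open (first alternative).
    Returns None when the gate is closed.
    """
    if not speech_active or noise_level >= noise_threshold:
        return None
    lag = cross_correlation_lag(x1, x2, max_lag)
    return arrival_angle_degrees(lag, sample_rate_hz, mic_spacing_m)
```

Passing only `speech_active` corresponds to the second alternative, and passing both `speech_active` and a finite `noise_threshold` corresponds to the third.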


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6, 8-14, 16 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Dusan et al. (US20130329895) in view of Ebenezer et al. (US20180330747). 
Claim 1, Dusan discloses an apparatus (Section 0015, lines 15-16- thus mobile phone shown in fig. 5) comprising at least one processor (Processor 19- section 0040) and at least one memory including a computer program code, the at least one memory (Memory 28) and the computer program code configured to, with the at least one processor, (Section 0040 the functions of the mobile device 2 are controlled by an application processor with instructions stored in memory 28) cause the apparatus at least to:
obtain two or more microphone audio signals; (Section 0026, detector 49 detects first and second audio signal from Mic 1 and Mic 2) 
analyse the two or more microphone audio signals for a defined noise type; (section 0017, lines 1-4- thus noise estimates are generated by processing the two audio signals from Mic 1 and Mic 2) and process the two or more microphone audio signals based on the analysis to generate at least one audio signal suitable for automatic speech recognition. (Section 0025, lines 6-10 thus out of the 2 signals from Mic 1 and Mic 2, Mic 1 is selected for automatic speech recognition instead of the signal of Mic 2).
Dusan does not disclose analysing the audio signal to determine the noise type. 
Ebenezer discloses analysing the audio signal to determine the noise type (Section 0039, lines 24-26) using an adaptive threshold mechanism. (Section 0039, lines 1-3). 
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of using an adaptive threshold mechanism to determine a noise type within an audio signal. The motivation is that noise removal becomes easier since the threshold of the noise is adaptive to different noise conditions or different noise types.

Claim 2, Dusan in view of Ebenezer discloses wherein the apparatus is caused to analyse the two or more microphone audio signals by being further caused to: 
determine energy estimates for the two or more microphone audio signals; (Dusan: Section 0026 lines 2-5- the calculated or computed power or energy of the audio signals reads on the determined energy estimate)
determine correlation estimates between pairs of the two or more microphone audio signals; (Dusan: Section 0017, lines 1-6- based on the estimated noise by noise estimators 43 (Mic 1) and 44 (Mic 2), estimator A performs better than estimator B – thus comparing the two signals reads on the correlations or relationships)
determine a defined noise type noise estimate (Ebenezer: Section 0039, lines 1-2 “diverse noise types”) based on the energy estimates for the two or more microphone audio signals (Dusan: Section 0025 lines 4-8- thus the noise estimator estimates noise within the audio signal) and the correlation estimates between pairs of the two or more microphone audio signals; (Dusan: Section 0026, lines 1-4 processed to compute a power or energy ratio between signal x1 and x2 reads on the correlation between the two signals) 
determine a defined noise type noise frequency threshold below which the defined noise type noise is a dominant disturbance based on the defined noise type noise estimate, (Dusan: Section 0028, lines 8-12- bandpass filter should be 2000Hz means any audio with less than 2000Hz will be noise dominant) 
the energy estimates for the two or more microphone audio signals and the correlation estimates between pairs of the two or more microphone audio signals. (Ebenezer: Section 0006, lines 4-7 cross correlation function between a first microphone signal  and the second microphone)
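As a minimal sketch of the quantities recited in claim 2, the following computes a per-channel energy estimate, a pairwise correlation estimate, and the power ratio for which Dusan section 0026 is cited. The mean-square energy, the zero-lag normalized correlation, and the function names are assumptions made for illustration; neither the claim nor the cited sections are quoted here as fixing these formulas.

```python
import math


def energy(x):
    """Mean-square energy estimate of one microphone frame."""
    return sum(s * s for s in x) / len(x)


def normalized_correlation(x1, x2):
    """Zero-lag normalized cross-correlation between two frames, in [-1, 1]."""
    numerator = sum(a * b for a, b in zip(x1, x2))
    denominator = math.sqrt(sum(a * a for a in x1) * sum(b * b for b in x2))
    return numerator / denominator if denominator > 0 else 0.0


def energy_ratio_db(x1, x2, eps=1e-12):
    """Power ratio between two frames in decibels (eps avoids division by zero)."""
    return 10.0 * math.log10((energy(x1) + eps) / (energy(x2) + eps))
```

Under these definitions, two frames with identical waveforms but different gains have a correlation of 1 and a nonzero energy ratio.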
Claim 3, Dusan in view of Ebenezer discloses wherein the apparatus is caused to process the two or more microphone audio signals by being further caused to: 
select, for frequency bands below the defined noise type noise frequency threshold, (Dusan: Section 0028 lines 4-13- thus the frequency filters such as 2000Hz and 4000Hz set the frequency domain and therefore the noise frequency threshold)
a lowest energy microphone audio signal of the two or more microphone audio signals; (Dusan: Section 0029, lines 5-6- thus the base value of 12.5dB shows the lowest energy) and select, for frequency bands above the defined noise type frequency threshold, a highest energy microphone audio signal of the two or more audio signals. (Dusan: Section 0028, lines 8-12- “Computing the power or energy ratio from band pass filtered signals such as between 2000Hz and 4000Hz” and this means any signal above 4000Hz shows a higher energy)
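The per-band selection recited in claim 3 can be sketched as follows, assuming the band energies of each microphone have already been computed. The data layout (one list of band energies per microphone), the use of band centre frequencies, and the name `select_per_band` are hypothetical.

```python
def select_per_band(band_energies, band_centers_hz, threshold_hz):
    """Return, for each band, the index of the microphone to use.

    band_energies[m][b] is the energy of microphone m in band b.
    Below the threshold the lowest-energy microphone is chosen;
    at or above it the highest-energy microphone is chosen.
    """
    choices = []
    for b, center_hz in enumerate(band_centers_hz):
        energies = [mic[b] for mic in band_energies]
        if center_hz < threshold_hz:
            choices.append(energies.index(min(energies)))
        else:
            choices.append(energies.index(max(energies)))
    return choices
```

The treatment of a band lying exactly on the threshold is a design choice made here for the sketch; the claim language does not specify it.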
Claim 4, Dusan in view of Ebenezer discloses wherein the apparatus is further caused to select for frequency bands below the defined noise type noise frequency threshold, (Dusan: Section 0028 lines 4-13- thus the frequency filters such as 2000Hz and 4000Hz set the frequency domain and therefore the noise frequency threshold) a lowest energy microphone audio signal of the two or more microphone audio signals and generate for frequency bands above the defined noise type noise frequency threshold (Dusan: Section 0029, lines 5-6- thus the base value of 12.5dB shows the lowest energy) a filter-and-sum combination (Dusan: Section 0023 lines 1-2- thus combiner selector 45) of the two or more microphone audio signals. (Ebenezer: Section 0033 lines 2-5 -filter and sum beamformers)
Claim 6, Dusan in view of Ebenezer discloses wherein the apparatus is caused to time-align by being further caused to estimate a direction of sound arrival; (Ebenezer: Section 0043, lines 8-10- thus a directional near-field speech source) and filter the two or more microphone audio signals based on the direction of sound arrival (Dusan: Section 0022, lines 1-4- thus time varying filter considers the signal when I -1 when the sound arrives) and a microphone configuration defining the relative locations of microphones configured to capture the two or more microphone audio signals. (Ebenezer: Section 0043, lines 1-4- thus discriminating a near-field signal reads on the defined relative locations of the microphones)
Claim 7, See item 3 for details 
Claim 8, Dusan in view of Ebenezer discloses wherein the defined noise type (Dusan: Section 0018 lists different types of noise) comprises at least one of structure borne noise; motor noise; actuator noise; wind noise; (Ebenezer: Section 0054, lines 2-6- thus wind noise detected) or handling noise.
Claim 9, Dusan in view of Ebenezer discloses wherein the apparatus is caused to obtain two or more microphone audio signals (Dusan: Section 0026, detector 49 detects first and second audio signal from Mic 1 and Mic 2)
by being further caused to at least one of receive the two or more microphone audio signals from the two or more microphones; (Dusan: Section 0006, lines 6-9 audio device microphone receives an audio signal) or retrieve the two or more microphone audio signals from memory.
Claim 10, Dusan in view of Ebenezer discloses wherein the two or more microphone audio signals (Dusan: Section 0018 “using two microphones”) are captured from at least one of directional microphones; (Ebenezer: Section 0033, line 3 “direction such that the entire bank of beamformers” means the microphones form beams based on directions) pressure microphones; or pressure gradient microphones.
Claim 11, Dusan discloses a method comprising obtaining two or more microphone audio signals; (Section 0026, detector 49 detects first and second audio signal from Mic 1 and Mic 2) 
analysing the two or more microphone audio signals for a defined noise; (section 0017, lines 1-4- thus noise estimates are generated by processing the two audio signals from Mic 1 and Mic 2) and processing the two or more microphone audio signals based on the analysing to generate at least one audio signal suitable for automatic speech recognition. (Section 0025, lines 6-10 thus out of the 2 signals from Mic 1 and Mic 2, Mic 1 is selected for automatic speech recognition instead of the signal of Mic 2)
Dusan does not disclose analysing the audio signal to determine the noise type. 
Ebenezer discloses analysing the audio signal to determine the noise type (Section 0039, lines 24-26) using an adaptive threshold mechanism. (Section 0039, lines 1-3). 
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of using an adaptive threshold mechanism to determine a noise type within an audio signal. The motivation is that noise removal becomes easier since the threshold of the noise is adaptive to different noise conditions or different noise types.
Claim 12, Dusan in view of Ebenezer discloses wherein analysing the two or more microphone audio signals for the defined noise type comprises: 
determining energy estimates for the two or more microphone audio signals; (Dusan: Section 0026 lines 2-5- the calculated or computed power or energy of the audio signals reads on the determined energy estimate)
determining correlation estimates between pairs of the two or more microphone audio signals; (Dusan: Section 0017, lines 1-6- based on the estimated noise by noise estimators 43 (Mic 1) and 44 (Mic 2), estimator A performs better than estimator B – thus comparing the two signals reads on the correlations or relationships)
determining a defined noise type noise estimate (Ebenezer: Section 0039, lines 1-2 “diverse noise types”) based on the energy estimates for the two or more microphone audio signals (Dusan: Section 0025 lines 4-8- thus the noise estimator estimates noise within the audio signal) and the correlation estimates between pairs of the two or more microphone audio signals; (Dusan: Section 0026, lines 1-4 processed to compute a power or energy ratio between signal x1 and x2 reads on the correlation between the two signals) 
and determining a defined noise type noise frequency threshold below which the 
defined noise type noise is a dominant disturbance based on the defined noise type noise estimate, (Dusan: Section 0028, lines 8-12- bandpass filter should be 2000Hz means any audio with less than 2000Hz will be noise dominant) 
 the energy estimates for the two or more microphone audio signals and the correlation estimates between pairs of the two or more microphone audio signals. (Ebenezer: Section 0006, lines 4-7 cross correlation function between a first microphone signal  and the second microphone)

Claim 13, Dusan in view of Ebenezer discloses wherein processing the two or more microphone audio signals comprises: 
selecting, for frequency bands below the defined noise type noise frequency threshold (Dusan: Section 0028 lines 4-13- thus the frequency filters such as 2000Hz and 4000Hz set the frequency domain and therefore the noise frequency threshold)
a lowest energy microphone audio signal of the two or more microphone audio signals; (Dusan: Section 0029, lines 5-6- thus the base value of 12.5dB shows the lowest energy) and
selecting, for frequency bands above the defined noise type frequency threshold, a highest energy microphone audio signal of the two or more audio signals. (Dusan: Section 0028, lines 8-12- “Computing the power or energy ratio from band pass filtered signals such as between 2000Hz and 4000Hz” and this means any signal above 4000Hz shows a higher energy)

Claim 14, Dusan in view of Ebenezer discloses wherein processing the two or more microphone audio signals comprises: 
selecting, for frequency bands below the defined noise type noise frequency threshold, (Dusan: Section 0028 lines 4-13- thus the frequency filters such as 2000Hz and 4000Hz set the frequency domain and therefore the noise frequency threshold) a lowest energy microphone audio signal of the two or more microphone audio signals; and generating, for frequency bands above the defined noise type noise frequency threshold, (Dusan: Section 0029, lines 5-6- thus the base value of 12.5dB shows the lowest energy) a filter-and-sum combination (Dusan: Section 0023 lines 1-2- thus combiner selector 45) of the two or more microphone audio signals. (Ebenezer: Section 0033 lines 2-5 -filter and sum beamformers)
Claim 16, Dusan in view of Ebenezer discloses wherein time-aligning the two or more microphone audio signals comprises estimating a direction of sound arrival; (Ebenezer: Section 0043, lines 8-10- thus a directional near-field speech source) and filtering the two or more microphone audio signals based on the direction of sound arrival (Dusan: Section 0022, lines 1-4- thus time varying filter considers the signal when I -1 when the sound arrives) and a microphone configuration defining the relative locations of microphones configured to capture the two or more microphone audio signals. (Ebenezer: Section 0043, lines 1-4- thus discriminating a near-field signal reads on the defined relative locations of the microphones)
Claim 17, See item 3 for details. 
Claim 18, Dusan in view of Ebenezer discloses wherein the defined noise type (Dusan: Section 0018 lists different types of noise) comprises at least one of structure borne noise; motor noise; actuator noise; wind noise; (Ebenezer: Section 0054, lines 2-6- thus wind noise detected) or handling noise.
Claim 19, Dusan in view of Ebenezer discloses wherein obtaining two or more microphone audio signals (Dusan: Section 0026, detector 49 detects first and second audio signal from Mic 1 and Mic 2) comprises, at least one of receiving the two or more microphone audio signals from the two or more microphones; (Dusan: Section 0006, lines 6-9 audio device microphone receives an audio signal) or retrieving the two or more microphone audio signals from memory.
Claim 20, Dusan in view of Ebenezer discloses wherein the two or more microphone audio signals are captured from at least one directional microphone, (Ebenezer: Section 0033, line 3 “direction such that the entire bank of beamformers” means the microphones form beams based on directions) and processing the two or more microphone audio signals based on the analysing to generate at least one audio signal suitable for automatic speech recognition (Dusan: Section 0025, lines 6-10 thus out of the 2 signals from Mic 1 and Mic 2, Mic 1 is selected for automatic speech recognition instead of the signal of Mic 2)
comprises filter-and-summing the two or more microphone audio signals to generate a directional audio signal. (Ebenezer: Section 0033 lines 2-5 -filter and sum beamformers)
Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Dusan et al. (US20130329895) in view of Ebenezer et al. (US20180330747) as applied to claims 1-4, 6, 8-14, 16 and 18-20 above, and further in view of Mao (US 20050047611).
Claims 5 and 15, Dusan in view of Ebenezer discloses wherein generating, for frequency bands above the defined noise type noise frequency threshold, a filter-and-sum combination of the two or more microphone audio signals (Dusan: Section 0023 lines 1-2- thus combiner selector 45) and (Ebenezer: Section 0033 lines 2-5 -filter and sum beamformers); however, Dusan in view of Ebenezer does not disclose wherein the audio signal comprises time-aligning the two or more microphone audio signals; and generating a weighted average of the time-aligned two or more microphone audio signals.
Mao discloses wherein the audio signal comprises time-aligning the two or more microphone audio signals; and generating a weighted average of the time-aligned two or more microphone audio signals. (Section 0044 lines 4-8 thus the cross-correlation is used to time-align signals from different directions and the time-aligned signals from various sensors are weighted according to the beam forming directions).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of generating a weighted average of the signals. The motivation for generating the weighted average is that it will make the system more accurate because it takes into account the relative importance or frequency of the signals.
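The limitation discussed above (time-aligning the microphone signals and then forming a weighted average) can be sketched as follows. Integer sample delays, zero padding at the frame edges, equal-length frames, and the names `shift` and `align_and_average` are assumptions made for illustration and are not drawn from the cited references.

```python
def shift(signal, delay):
    """Delay a frame by `delay` samples (negative values advance it), zero padded."""
    n = len(signal)
    out = [0.0] * n
    for i in range(n):
        j = i - delay
        if 0 <= j < n:
            out[i] = signal[j]
    return out


def align_and_average(signals, delays, weights):
    """Time-align each channel, then return the weighted average.

    delays[k] is how many samples channel k lags the reference, so each
    channel is advanced by its own delay before averaging.
    All frames are assumed to have the same length.
    """
    if not (len(signals) == len(delays) == len(weights)):
        raise ValueError("signals, delays and weights must have equal length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    aligned = [shift(s, -d) for s, d in zip(signals, delays)]
    n = len(aligned[0])
    return [
        sum(w * a[i] for w, a in zip(weights, aligned)) / total_weight
        for i in range(n)
    ]
```

In practice the delays would come from a direction-of-arrival or cross-correlation estimate, and the weights from whatever channel-quality measure the system uses; both are left as inputs here.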
Cited Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Virolainen et al. (US20150312691) discloses an apparatus comprising a controller configured to initiate a sound capture event; at least two microphones configured to capture at least two audio signals for the sound capture event; a detector configured to determine at least one microphone operational parameter based on the at least two audio signals; an audio capture processor configured to process at least one of the at least two audio signals; and wherein the controller is configured to control the sound capture event such that at least one of the at least two audio signals is processed based on the at least one microphone operational parameter.
Elko (US20030147538) discloses a consumer device comprising (a) at least two microphones; (b) a filter configured to filter audio signals generated in response to a sound field by the at least two microphones to compensate for a phase difference between the at least two microphones; and (c) a signal processor configured to (1) generate a revised phase difference between the at least two microphones based on the audio signals; and (2) update, based on the revised phase difference, at least one calibration parameter used by the filter.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Akwasi M Sarpong whose telephone number is (571)270-3438. The examiner can normally be reached Mon-Fri. 8:00am-4:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KING D POON, can be reached at 571-272-7440. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/AKWASI M SARPONG/
Primary Examiner, Art Unit 2675
05/31/2022