DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Drawings
The drawing (fig. 2) was received on 06/08/2021.  The drawing (fig. 2) has been accepted.

Response to Amendments and Arguments
Regarding the rejection under 35 U.S.C. §112(b), applicant amended independent claims 1, 8 and 15 to correct the antecedent basis issue. 

Applicant further stated (Remarks, pages 6-7) that a single output of one Fourier Transform operation is used separately by a noise estimate #314 and a filter #316.

The examiner notes that applicant's argument above equates the filter #316 with the claimed “feature extraction”. The argument (Remarks, pages 6-7) is inconsistent with the claimed “wherein the feature extraction and the noise estimate use an output of a same Fourier Transform”.

After carefully studying the disclosure (Spec. [0026], Fig. 3, #308, #310, #314 and #316), the examiner believes that the claimed feature extraction refers to Fig. 3, #308 and #310. In other words, both the feature extraction path (Fig. 3, #308) and the noise estimation path (Fig. 3, #314) use a single output of one Fourier Transform operation (Fig. 3, #306). The rejection under §112(b) is therefore withdrawn.
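
For illustration only, the examiner's reading of Fig. 3 (one Fourier Transform output feeding both the feature extraction path and the noise estimation path) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the disclosure: the function names, the band-energy features, and the exponential smoothing of the noise estimate are all hypothetical stand-ins chosen only to show a single transform output being consumed twice.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Naive DFT magnitude spectrum (stand-in for the Fourier Transform block #306)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def extract_features(spectrum, bands=4):
    """Hypothetical feature path (#308/#310): log energy in equal-width bands."""
    size = max(1, len(spectrum) // bands)
    return [
        math.log(sum(m * m for m in spectrum[i:i + size]) + 1e-12)
        for i in range(0, size * bands, size)
    ]


def update_noise_estimate(previous, spectrum, alpha=0.9):
    """Hypothetical noise path (#314): per-bin exponential smoothing."""
    if previous is None:
        return list(spectrum)
    return [alpha * p + (1 - alpha) * m for p, m in zip(previous, spectrum)]


def process_frame(frame, noise):
    """Compute the transform once and hand the same output to both paths."""
    spectrum = dft_magnitudes(frame)
    features = extract_features(spectrum)
    noise = update_noise_estimate(noise, spectrum)
    return features, noise
```

In this sketch the transform is computed once per frame inside process_frame, and neither path recomputes it, which is the relationship the examiner understands the claim language to describe.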

Regarding the rejection under 35 U.S.C. §103, applicant's arguments filed 06/08/2021 have been fully considered but they are not persuasive. 

Applicant alleged that the primary reference, Borjeson (US PG Pub. 2016/0267908), does not teach two limitations: “determining whether …” and “performing noise filtering …” (Remarks, page 8).

In particular, applicant argued (Remarks, page 9) “That is, Borjeson performs the noise filtering step regardless of whether speech is present in the sampled sound it is only after the noise filtering step that that Borjeson's method determines whether speech is present in the sampled sound at step 68.” (Emphasis in the Remarks). 

Regarding applicant's argument above concerning the order of the speech detection step #68 and the noise filtering step #66 in Borjeson's Fig. 3, the examiner notes that Borjeson expressly states that the order may be changed (Borjeson, [0034], “Although FIG. 3 shows a specific order of executing functional logic blocks, the order of executing the blocks may be changed relative to the order shown”).

In addition, Borjeson clearly discloses a low power speech recognition stage continuously monitoring audio to distinguish speech from ambient noise ([0020], [0041]) and performing filtering when an audio signal contains speech ([0042]). The argument is not persuasive.

Applicant argued that Borjeson performs the noise filtering step regardless of whether speech is present (Remarks, page 9). From the argument (Remarks, page 9), it appears that applicant wants the examiner to improperly interpret the recited limitation “performing noise filtering … when speech is found to be present” as “not performing noise filtering when no speech is found”. (See Remarks, page 9, “In some examples, conditionally performing noise filtering only when speech is found to be present advantageously reduces processing overhead. See [0024] of the published application”.)

The examiner further points out that applicant's argument above (Remarks, page 9) is inconsistent with the disclosure (Spec. [0024]), which discloses that noise reduction is not performed when the noise level is below the threshold. The disclosure does NOT state that “In some examples, conditionally performing noise filtering only when speech is found to be present advantageously reduces processing overhead”.

Because Borjeson discloses continuously monitoring ambient noise and distinguishing noise from speech ([0020]), Borjeson still meets the broadly recited limitations even under the features applicant argued (Remarks, page 9). The limitation only requires performing noise filtering when speech is found to be present. The claim limitations do not exclude performing the noise filtering function when no speech is found.
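
For illustration only, the difference between the limitation as recited (“when speech is found to be present”) and the narrower reading applicant argues (“only when speech is found to be present”) can be expressed as a logical implication. The following is a minimal sketch; the function names are hypothetical and do not come from the claims, the disclosure, or Borjeson.

```python
def recited_limitation_met(speech_present, filtering_performed):
    """As recited: if speech is present, noise filtering is performed.

    Nothing is required when speech is absent, so filtering in that
    case does not take a method outside the limitation.
    """
    return (not speech_present) or filtering_performed


def always_filter(speech_present):
    """Hypothetical system that filters every frame, with or without speech."""
    return True


def filter_only_on_speech(speech_present):
    """Hypothetical system that filters only when speech is present."""
    return speech_present


# Both systems satisfy the limitation as recited, for every input.
for system in (always_filter, filter_only_on_speech):
    assert all(recited_limitation_met(s, system(s)) for s in (True, False))
```

Under this reading, a system that filters regardless of speech satisfies the recited limitation just as a system that filters only on speech does; only a system that fails to filter when speech is present falls outside it.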
 
Applicant further argued that the dependent claims are allowable for the reasons argued for the independent claims (Remarks, page 10). For the reasons explained above for the independent claims, the argument is not persuasive.

Applicant further argued with respect to dependent claims 7 and 14 (Remarks, page 9) that neither Borjeson nor Komeji teaches “lower power processor performs the feature extraction and noise processing to identify the wakeup phrase” because Borjeson discloses that a second stage or a third stage is enabled to further analyze the detected sound to determine whether a voice command is included in the sound.

In response, the examiner notes that Borjeson clearly discloses that a lower power processor performs feature extraction and noise processing to identify a keyword (see the rejection of claims 7 and 14 below).

Claim Rejections - 35 USC § 103
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Borjeson et al. (US PG Pub. 2016/0267908, hereinafter referred to as Borjeson) in view of Komeji et al. (US PG Pub. 2013/0231929, applicant submitted IDS, hereinafter referred to as Komeji).

Regarding claims 1, 8 and 15, Borjeson discloses a method, a system and computer readable medium (Fig. 2, [0005-0006],  [0020], a computer implemented system using a low power speech recognition stage for detecting keywords, when keywords detected, invoking 2nd stage and/or 3rd stage speech recognition to further process the received speech signal), comprising:
receiving a microphone signal ([0030], fig. 5, #42); 
determining whether the microphone signal contains noise above a noise threshold ([0006], [0041], Fig. 3, #64); 
determining whether the microphone signal contains speech when the noise threshold is exceeded ([0041], Fig. 3, #68); 
([0043], Fig. 3, #70, processing noise to estimate a new threshold); 
performing noise filtering using the noise estimate on the microphone signal (fig. 3, #66, [0041]); 

Borjeson discloses using a low power speech recognition stage to monitor the received audio signal. If keywords are detected in the received audio signal, the more power intensive second stage and/or third stage speech recognition is invoked ([0005-0006]). Because feature extraction for speech recognition is very common, Borjeson does not give further details of feature extraction. Borjeson does not disclose “performing feature extraction on the microphone signal when speech is found to be present, wherein the feature extraction and the noise estimate use an output of the same Fourier Transform, such that the noise filtering of the speech is embedded with the feature extraction of the speech.”

Komeji discloses noise estimation and suppression, and extracting features from a noise suppressed audio signal using a Fourier transform (Fig. 1, [0057-0059], [0105], estimating noise from a silence frame (claimed “when speech is found not to be present”) and suppressing noise from an audio signal having noise and speech [0061], [0086-0089]). Komeji further discloses using a short time Fourier Transform to extract features ([0058-0061], [0067], [0100], [0112], [0182]). 



Regarding claims 2, 9 and 16, Borjeson in view of Komeji further discloses the noise estimate and the noise filtering are not performed on a same frame of the microphone signal (Borjeson, [0042-0043]; Komeji, [0070], [0072], estimating noise from a silent frame).

Regarding claims 3, 10 and 17, Borjeson in view of Komeji further discloses determining whether the microphone signal contains a wake-up phrase after the noise filtering (Borjeson, [0004], detecting a keyword such as “OK, Google”, [0005-0006], [0048], detecting keywords such as “search”, “call” and invoking 2nd stage and/or 3rd stage speech recognition if keywords are detected from noise suppressed audio signals). 

Regarding claims 4, 11 and 18, Borjeson in view of Komeji further discloses the feature extraction is performed while a device containing the microphone is in a sleep state (Borjeson, [0005-0006], [0036-0037], power consumption is minimized in a low-power speech recognition stage to detect keywords; “low power speech recognition” corresponds to the claimed “a sleep state”).

Regarding claims 5, 12 and 19, Borjeson in view of Komeji further discloses the feature extraction includes the use of mel-frequency cepstral coefficients (MFCCs) (Komeji, [0045], speech recognition using MFCC coefficients).

Regarding claims 6, 13 and 20, Borjeson in view of Komeji further discloses using a main processor and a lower power processor to provide processing of a wakeup phrase for a device (Borjeson, Fig. 2, #12, [0005-0006], [0020-0021], using a server and a local processor to process a wakeup phrase; Komeji, [0067]). 
 
Regarding claims 7 and 14, Borjeson in view of Komeji further discloses the lower power processor performs the feature extraction and noise processing to identify the wakeup phrase (Borjeson, [0005-0006], [0019-0021], lower power speech recognition, [0041], [0048], recognizing keywords from the noise filtered audio signal by the low power consumption speech recognition).

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Borjeson in view of Komeji, and further in view of Shagalov (US PG Pub. 2015/0120290).

Regarding claim 21, Borjeson in view of Komeji discloses a client / server speech recognition system (Borjeson, [0004], [0006], [0020-0021]). Borjeson further discloses a low power recognition stage performing a noise filtering function (Borjeson, Fig. 3, #66) and a keyword detection function (Borjeson, Fig. 3, #78). 


It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Borjeson in view of Komeji with Shagalov’s teaching to send extracted feature vectors to a speech recognition server. One having ordinary skill in the art would have been motivated to make such a modification to minimize network data traffic (Shagalov, [0009]). 

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jialong He, whose telephone number is (571) 270-5359.  The examiner can normally be reached on Monday – Friday, 8:00AM – 4:30PM, EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached on (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JIALONG HE/Primary Examiner, Art Unit 2659