DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: Audio Keyword Detection by Low-Power and High-Power Detection Engines
The disclosure is objected to because of the following informalities:
In ¶[0016], “buffer 13” should be “buffer 131”.  See Figure 1.  
In ¶[0017], “unless awaken” should be “unless awakened”.
In ¶[0019], “in that was” should be “that was”.  
In ¶[0020], “stitching” appears to be incorrect.
In ¶[0021], it appears that “the processor” should be “or the processor”.
In ¶[0023], “remains in the second state 303” should be “remains in the second state 203”.  See Figure 2.
In ¶[0023], “KHDE” should be “HKDE”.  
In ¶[0026], “in which an processor” should be “in which a processor”.
In ¶[0028], “not more than one of the electrical signal” should be “not more than one of the electrical signals”.
Appropriate correction is required.


Election/Restrictions
Applicants’ election without traverse of Invention I, Claims 1 to 17, in the reply filed on 29 July 2022 is acknowledged.
Claims 18 to 20 are withdrawn from further consideration pursuant to 37 CFR 1.142(b) as being drawn to a nonelected invention, there being no allowable generic or linking claim.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1 and 9 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Patel et al. (U.S. Patent Publication 2019/0207777).
Concerning independent claim 1, Patel et al. discloses an audio processing device (“a digital processor for processing audio data”), comprising:
“an audio data interface” – audio sensor array 105 comprises a plurality of microphones 105a-105n, and audio input circuitry 121 of audio signal processor 120; audio signal processor 120 includes a digital signal processor (DSP) 123 and analog-to-digital converter circuitry (¶[0016] - ¶[0017]: Figure 1); 
“a buffer coupled to the interface and configured to buffer data received at the interface” – a delay buffer is operable to continuously receive and store audio samples (Abstract); delay buffer 124 is operable to receive and store audio samples from audio input circuitry 121; delay buffer 124 comprises a first-in first-out (FIFO) device (¶[0020]: Figure 1);
“a low-power keyword detection engine (LKDE) configured to determine likely presence of a keyword in data received at the interface while the data is buffered in the buffer” – a low power DSP receives audio input and runs a lower power trigger engine to detect a keyword (¶[0010]); a low power trigger engine performs an initial coarse detection of keywords (¶[0015]: Figure 1); delay buffer 124 is operable to store at least ‘T seconds’ of audio samples, where T is the amount of time it takes trigger engine 125 to detect a keyword plus the amount of time it takes to wake up host 150 (¶[0020]: Figure 1);
“a high-power keyword detection engine (HKDE) configured to wakeup from a low-power sleep mode if the LKDE determines likely presence of a keyword, and after awakening, verify the likely presence of the keyword detected by the LKDE by processing data in the buffer” – after a keyword is detected, a DSP transmits a wake up signal to a high power host processor; a host system may include a high performance trigger engine that revalidates the keyword to ensure the keyword is detected (¶[0010]); a high-power trigger engine performs a more precise detection of keywords; while low-power trigger engine is processing received audio, high-power trigger engine is in a sleep mode to conserve power; after the low-power trigger engine detects a keyword within received audio, the received audio is transferred to high-power trigger engine, which is awakened from sleep mode to process the audio to validate whether the audio did indeed include a keyword (¶[0015]: Figure 1); after trigger engine 125 detects a keyword in audio signals, DSP 123 transmits a wake up signal to host 150; host 150 then executes a wake up sequence and requests audio samples stored in delay buffer 124; delay buffer 124 is operable to transfer stored audio samples to host 150 (¶[0021]: Figure 1);
“wherein the HKDE is configured to detect keywords with higher certainty than the LKDE” – a high-power trigger engine performs a more precise detection of keywords (¶[0015]); a low power trigger engine may be configured to trigger with a high probability of identifying a presence of a trigger word, but without the robustness to avoid false detection; a higher power trigger engine 155 runs a more robust keyword detection algorithm than trigger engine 125, and reviews the identified set of audio samples to perform a more precise detection of keywords to determine whether the set of audio samples does indeed comprise a keyword (¶[0019]: Figure 1); here, a high-power trigger engine that is more robust and precise in keyword detection than a low-power trigger engine detects keywords with “higher certainty”.
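For illustration only, the arrangement mapped to claim 1 above can be sketched as a minimal two-stage cascade: an always-on coarse stage examines each frame while a FIFO holds recent audio, and a precise stage is invoked on the buffered audio only when the coarse stage fires. This is a sketch under stated assumptions, not code from Patel et al. or from the application. The class name, the callables `coarse` and `precise`, the toy stand-in detectors, and the example figures (16 kHz sampling, 0.5 s detection time, 0.25 s wake-up time) are all hypothetical.

```python
import math
from collections import deque
from typing import Callable, Deque, List

Frame = List[float]


def min_buffer_samples(sample_rate_hz: int, detect_s: float, wake_s: float) -> int:
    """Samples needed to hold T seconds, with T = detection time + wake-up time."""
    return math.ceil(sample_rate_hz * (detect_s + wake_s))


class TwoStageDetector:
    """Hypothetical cascade: a coarse stage gates a precise stage that otherwise sleeps."""

    def __init__(self, coarse: Callable[[Frame], bool],
                 precise: Callable[[List[Frame]], bool],
                 buffer_frames: int) -> None:
        self.coarse = coarse
        self.precise = precise
        # Bounded FIFO: once full, appending drops the oldest frame.
        self.buffer: Deque[Frame] = deque(maxlen=buffer_frames)
        self.precise_wakeups = 0

    def push(self, frame: Frame) -> bool:
        """Buffer one frame; return True only if both stages report a keyword."""
        self.buffer.append(frame)
        if not self.coarse(frame):
            return False  # precise stage is never invoked, i.e. it stays asleep
        self.precise_wakeups += 1
        # The precise stage re-examines the buffered audio, not just the new frame.
        return self.precise(list(self.buffer))


# Toy stand-ins for real detectors (assumptions, for demonstration only).
def toy_coarse(frame: Frame) -> bool:
    return max(frame) > 0.5


def toy_precise(frames: List[Frame]) -> bool:
    return sum(max(f) > 0.5 for f in frames) >= 2
```

In this sketch the counter `precise_wakeups` stands in for how often the higher-power stage would be awakened; a quiet frame leaves it unchanged.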

Concerning independent claim 9, Patel et al. discloses all of the limitations of independent claim 1, and additionally:
“a microphone assembly comprising: a housing having a sound port and an external device interface with electrical contacts” – audio sensor array 105 comprises one or more microphones 105a-105n (“a microphone assembly”) (¶[0016]: Figure 1); audio processing device 100 may be implemented by a mobile phone, a smart speaker, a tablet, a laptop computer, a desktop computer, a voice controlled appliance, or an automobile (¶[0023]: Figure 1); implicitly, a mobile phone has “a housing”, “a sound port”, and “an external device interface with electrical contacts”; that is, “a housing” is the plastic case that circuitry of the mobile phone resides in, “a sound port” can be construed as microphone holes to enable sound to be detected by a microphone embedded in the mobile phone, and a USB connection enables “an external device interface with electrical contacts”;
“an electro-acoustic transducer disposed in the housing and configured to generate an electrical signal in response to detecting acoustic energy” – audio sensor array comprises a plurality of microphones 105a-105n, each generating an audio input signal which is provided to audio input circuitry 121 of audio signal processor 120; audio sensor array 105 generates a multichannel audio signal with each channel corresponding to an audio input signal from one of microphones 105a-105n (¶[0016]: Figure 1); each of microphones 105a-105n is “an electro-acoustic transducer . . . configured to generate an electrical signal in response to detecting acoustic energy”; implicitly, microphones 105a-105n are “disposed in the housing” of a mobile phone;
“an electrical circuit disposed in the housing and electrically coupled to contacts of the external device interface, the electrical circuit comprising: a converter configured to convert the electrical signal to digital audio” – audio signal processor 120 includes audio input circuitry 121, a digital signal processor (DSP) 123, and analog-to-digital converter circuitry (“the electrical circuit comprising: a converter configured to convert the electrical signal to digital audio”) (¶[0017]: Figure 1); implicitly, audio signal processor 120 is “disposed in the housing” of a mobile phone; a mobile phone has at least a USB connection that is “coupled to contacts of the external device interface”.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 6 to 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Patel et al. (U.S. Patent Publication 2019/0207777) in view of Bayse et al. (U.S. Patent Publication 2014/0163978).
Concerning claim 6, Patel et al. discloses a low-power trigger engine (“the LKDE”) and a high-power trigger engine (“the HKDE”), but omits these limitations of “wherein the LKDE is configured to determine likely presence of a keyword only if a preliminary condition is satisfied, and wherein the HKDE is configured to wakeup from the low-power sleep mode and determine likely presence of a keyword in data received at the interface while the data is buffered in the buffer if the preliminary condition is not satisfied.”  
Concerning claim 6, however, Bayse et al. teaches speech recognition with power management by one or more keywords.  (Abstract)  A power management subsystem may process an audio input to determine that the audio input includes a wakeword, and activates a network interface module in response to determining the audio input comprises the wakeword.  (¶[0010])  A power management subsystem may determine one or more values including an energy level or volume of the audio input, a score corresponding to a likelihood that speech is present in the audio input, and a score corresponding to a likelihood that a keyword is present in the speech.  (¶[0012])  Some embodiments provide that power supply 218 communicates a level of power that it can supply to power management subsystem 100, e.g., a percentage of battery life remaining or whether power supply 218 is plugged into an electrical outlet.  Power management subsystem 100 may selectively activate or deactivate one or more modules based at least in part on the power level indicated by the power supply.  (¶[0043]: Figure 2)  User interface 500 may include an on-device recognition selection element 514, wherein the user 401 may select whether user computing device 200 generates speech recognition results by itself or whether audio inputs are routed to speech recognition server 420 for processing into speech recognition results.  On-device recognition selection element 514 may be automatically deselected and on-device speech recognition capabilities automatically disabled if power supply 218 drops below a threshold power supply level, e.g., a battery charge percentage, as on-device speech recognition capabilities may require a relatively large power draw.  (¶[0079]: Figure 5)  Here, on-device speech recognition capabilities are only enabled if a threshold power supply level is met (“only if a preliminary condition is satisfied”), but speech is sent to a server for recognition if a threshold power supply level is not met (“if the preliminary condition is not satisfied”).  Compare Specification, ¶[0020], where LKDE is enabled using a noise level algorithm or an external power detection algorithm only if a preliminary condition is satisfied, otherwise HKDE is enabled without prior keyword detection.  An objective is to advantageously selectively activate modules of a computer device so that a power management subsystem may improve the energy efficiency of the computing device.  (¶[0014])  It would have been obvious to one having ordinary skill in the art to determine if a preliminary condition of a threshold power supply level is satisfied as taught by Bayse et al. to enable a low-power or high-power trigger engine of Patel et al. for a purpose of efficiently improving energy management of a computing device.
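For illustration only, the gating recited in claim 6 as construed above can be sketched as a simple selection between two processing paths. This is a hedged sketch, not the implementation of either reference or of the application: the names `Path`, `select_path`, and `battery_condition`, and the 20% default charge threshold, are hypothetical.

```python
from enum import Enum


class Path(Enum):
    LOW_POWER_FIRST = "low_power_first"      # coarse engine screens, precise engine verifies
    HIGH_POWER_DIRECT = "high_power_direct"  # precise engine runs on buffered data directly


def battery_condition(charge_fraction: float, threshold: float = 0.2) -> bool:
    """Hypothetical preliminary condition: charge at or above an assumed threshold."""
    return charge_fraction >= threshold


def select_path(preliminary_condition_met: bool) -> Path:
    """Coarse screening runs only when the condition holds; otherwise go direct."""
    return Path.LOW_POWER_FIRST if preliminary_condition_met else Path.HIGH_POWER_DIRECT
```

Any boolean condition can be substituted for `battery_condition`; the sketch only shows that the condition is evaluated before either engine is chosen.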

Concerning claim 7, Bayse et al. teaches speech recognition with power management by one or more keywords.  (Abstract)  A power management subsystem may process an audio input to determine that the audio input includes a wakeword, and activates a network interface module in response to determining the audio input comprises the wakeword.  (¶[0010])  A power management subsystem may determine one or more values including an energy level or volume of the audio input, a score corresponding to a likelihood that speech is present in the audio input, and a score corresponding to a likelihood that a keyword is present in the speech.  (¶[0012])  If an audio detection module determines that an audio input meets a threshold energy level or volume, speech detection module may be activated to determine whether audio input includes speech.  (¶[0013])  Audio detection module 106 is configured to determine that audio input has an energy level satisfying a threshold for at least a threshold duration of time.  (¶[0019]: Figure 1)  Some embodiments provide that power supply 218 communicates a level of power that it can supply to power management subsystem 100, e.g., a percentage of battery life remaining or whether power supply 218 is plugged into an electrical outlet.  Power management subsystem 100 may selectively activate or deactivate one or more modules based at least in part on the power level indicated by the power supply.  (¶[0043]: Figure 2)  Bayse et al., then, teaches at least that “the preliminary condition is . . . a supply of battery power to the processor”.  Additionally, an energy level satisfying a threshold can be construed as “the preliminary condition is a noise level below a threshold”, i.e., if an energy level threshold is not met.
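For illustration only, the cited teaching that an audio input has “an energy level satisfying a threshold for at least a threshold duration of time” can be sketched as a run-length check over per-frame energies. This is a sketch under assumptions, not code from Bayse et al.: the function name `energy_gate`, the frame-based notion of duration, and the reading of “satisfying” as greater than or equal to are hypothetical.

```python
from typing import Iterable


def energy_gate(energies: Iterable[float], level_threshold: float, min_frames: int) -> bool:
    """Return True once energy stays at or above the threshold for min_frames in a row."""
    run = 0
    for energy in energies:
        # Extend the current run on a loud frame; reset it on a quiet frame.
        run = run + 1 if energy >= level_threshold else 0
        if run >= min_frames:
            return True
    return False
```

A brief spike shorter than `min_frames` does not open the gate, which is the point of pairing the level threshold with a duration threshold.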

Concerning claim 15, Patel et al. discloses a low-power trigger engine (“the LKDE”) and a high-power trigger engine (“the HKDE”), and these limitations of “wherein the electronic circuit is configured to provide a host device wakeup signal, the buffered data, and real-time digital data representative of the electrical signal to the external device interface”.  Here, Patel et al. discloses that a high-power trigger engine is disposed on a host 150, so that after low-power trigger engine 125 detects a keyword in audio samples, DSP 123 transmits a wake up signal to host 150 across a communication bus, and host 150 then executes a wake up sequence and requests audio samples stored in delay buffer 124.  (¶[0021]: Figure 1)  One embodiment provides that audio samples may be transmitted to trigger engine 182 located on remote server 181, and trigger engine 155 or 182 receives the audio samples and validates the presence of a keyword in the received audio samples.  (¶[0033]: Figure 1)  Patel et al., then, provides “a host wakeup signal, the buffered digital data, and real-time digital data representative of the electrical signal to the external device interface”, but the host comprises a high-power trigger engine (“the HKDE”).  However, Patel et al. omits “only after the HKDE verifies the presence of a keyword detected by the LKDE.”  That is, Patel et al. performs low-power keyword detection and high-power keyword detection, but does not transmit speech for recognition by a host after low-power and high-power keyword detection, as it places high-power keyword detection on the host.  Still, Bayse et al. teaches that speech detection module 108 and speech processing module 110 may be used to determine whether to activate application processing module 112 and network interface module 206.  (¶[0028]: Figure 2)  Upon activation, network interface module 206 may transmit received audio input recorded to memory buffer module 104 over network to a remote speech recognition server.  (¶[0031]: Figure 2)  Bayse et al., then, teaches “wherein the electronic circuit is configured to provide . . . real-time digital data representative of the electrical signal to the external device interface” “only after the HKDE verifies the presence of a keyword detected by the LKDE” in Patel et al.

Claims 2 to 3 and 10 to 11 are rejected under 35 U.S.C. 103 as being unpatentable over Patel et al. (U.S. Patent Publication 2019/0207777) in view of Bayse et al. (U.S. Patent Publication 2014/0163978) as applied to claims 1 and 9 above, and further in view of Stavron et al. (U.S. Patent Publication 2017/0161478).
Concerning claims 2 and 10, Patel et al. generally discloses that a low-power trigger engine may be configured to trigger with a high probability of identifying the presence of a trigger word, but without the robustness to avoid false detection.  (¶[0019]: Figure 1)  Here, false detection is equivalent to “false acceptance” in a limitation of “wherein the LKDE is configured to determine likely presence of a keyword with . . . a false acceptance rate”.  Similarly, Patel et al. discloses a limitation of “wherein the HKDE is configured to detect likely presence of a keyword with a lower FAR than the LKDE” because a high-power trigger engine is consequently more robust to false detection.  Still, Patel et al. omits the limitations of “a true positive rate (TPR) above a first threshold and a false acceptance rate (FAR) below a second threshold, wherein the first and second thresholds are constrained by a maximum acceptable power consumption associated with a duty cycle with which the HKDE is awakened.”  That is, Patel et al. does not disclose “a true positive rate (TPR)”, thresholds of “a first threshold” and “a second threshold”, and a constraint of “a maximum acceptable power consumption associated with a duty cycle”.
Concerning claims 2 and 10, Bayse et al. teaches adaptive thresholding to reduce false positives by increasing a score threshold and reducing false negatives by periodically lowering threshold scores.  (¶[0051] - ¶[0052]: Figure 3)  Additionally, some embodiments provide that power supply 218 communicates a level of power that it can supply to power management subsystem 100, e.g., a percentage of battery life remaining or whether power supply 218 is plugged into an electrical outlet.  Power management subsystem 100 may selectively activate or deactivate one or more modules based at least in part on the power level indicated by the power supply.  (¶[0043]: Figure 2)  On-device recognition selection element 514 may be automatically deselected and on-device speech recognition capabilities automatically disabled if power supply 218 drops below a threshold power supply level, e.g., a battery charge percentage, as on-device speech recognition capabilities may require a relatively large power draw.  (¶[0079]: Figure 5)  Implicitly, if on-device recognition is deselected, then its “duty cycle” is decreased.  That is, a “duty cycle” can be defined as a percentage of time that a load is ON, so if an on-device recognition is disabled because a power supply drops below a threshold, then this provides that speech recognition is “constrained by a maximum acceptable power consumption associated with a duty cycle” of a high-power trigger engine that is on-device in Patel et al.  Additionally, Bayse et al. does not expressly teach “a true positive rate (TPR) above a first threshold”, but Stavron et al. teaches that a false rejection rate and a false acceptance rate are related by Equation (3): FAR=1-TruePositive.  (¶[0125] - ¶[0126]: Equation (3))  Given that a true positive rate is simply one minus a false acceptance rate, Bayse et al.’s threshold for a false positive rate (“false acceptance rate (FAR)”) is just an inverse of the threshold for a true positive rate (“true positive rate (TPR)”), i.e., these first and second thresholds can be the same.  Bayse et al. teaches an objective of advantageously selectively activating modules of a computer device so that a power management subsystem may improve the energy efficiency of the computing device.  (¶[0014])  It would have been obvious to one having ordinary skill in the art to determine a true positive rate being above a threshold and a false acceptance rate being below a threshold subject to a maximum acceptable power consumption constraint as taught by Bayse et al. to enable a low-power or high-power trigger engine of Patel et al. for a purpose of efficiently improving energy management of a computing device.
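For illustration only, the definition of “duty cycle” as the fraction of time that a load is ON can be tied to a power budget with simple arithmetic. This is a sketch under an assumed power model that is not taken from any reference of record: the low-power engine is assumed to draw `p_low_mw` continuously, and the high-power engine is assumed to draw `p_high_mw` only while awake. The function names and the example figures are hypothetical.

```python
def average_power_mw(p_low_mw: float, p_high_mw: float, duty_cycle: float) -> float:
    """Average draw when the high-power engine is awake for duty_cycle of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must lie in [0, 1]")
    return p_low_mw + duty_cycle * p_high_mw


def max_duty_cycle(p_low_mw: float, p_high_mw: float, budget_mw: float) -> float:
    """Largest duty cycle whose average draw stays within budget_mw, clamped to [0, 1]."""
    if p_high_mw <= 0.0:
        raise ValueError("p_high_mw must be positive")
    return min(1.0, max(0.0, (budget_mw - p_low_mw) / p_high_mw))
```

Under these assumed figures, a 1 mW low-power engine, a 50 mW high-power engine, and a 2 mW budget permit the high-power engine to be awake 2% of the time; under this model, a tighter budget lowers the permissible rate at which the low-power engine may trigger wake-ups.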

Concerning claims 3 and 11, Bayse et al. teaches that speech detection module 108 may determine a score or confidence level whose value corresponds to a likelihood that speech is actually present in the audio input; if a score satisfies a threshold, speech detection module 108 can determine that speech is present in the audio input; however, if the score does not satisfy the threshold, speech detection module 108 may determine that there is no speech in the audio input (“wherein the LKDE is configured to determine likely presence of a keyword based on whether a confidence level associated with detection of the keyword satisfies a condition”) (¶[0022]: Figure 1); speech processing module 110 may determine a score or confidence level whose value corresponds to a likelihood that a keyword is actually present in the speech (¶[0027]: Figure 1).  Here, speech detection module 108 is analogous to low-power trigger engine 125 and speech processing module 110 is analogous to high-power trigger engine 155 of Patel et al.  A score or confidence level of speech detection module 108 of Bayse et al. corresponds to a score or confidence level of low-power trigger engine 125 (“wherein the LKDE is configured to determine likely presence of a keyword based on whether a confidence level associated with detection of the keyword satisfies a condition”) of Patel et al.
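For illustration only, a decision based on whether a confidence level “satisfies a condition” can be sketched as two staged score comparisons, with the second score computed only if the first passes. This is a hedged sketch, not code from Bayse et al. or Patel et al.: the function name `staged_decision`, the default threshold values, and the reading of “satisfies” as greater than or equal to are hypothetical.

```python
from typing import Callable


def staged_decision(speech_score: float,
                    keyword_score: Callable[[], float],
                    speech_threshold: float = 0.5,
                    keyword_threshold: float = 0.8) -> bool:
    """Evaluate the costlier keyword score only if the speech score passes its threshold."""
    if speech_score < speech_threshold:
        return False  # first stage rejects; the second score is never computed
    return keyword_score() >= keyword_threshold
```

Passing the second score as a callable makes the deferral explicit: the more expensive computation is skipped whenever the first stage rejects.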
     
Claims 16 to 17 are rejected under 35 U.S.C. 103 as being unpatentable over Patel et al. (U.S. Patent Publication 2019/0207777) in view of Bayse et al. (U.S. Patent Publication 2014/0163978) as applied to claims 9 and 15 above, and further in view of Fürst et al. (U.S. Patent No. 9,712,923).
Patel et al. omits the limitations directed to “the electrical circuit further comprising a local oscillator, wherein the electrical circuit is configured to be clocked by the local oscillator before the electrical circuit provides the host device wakeup signal to the host device interface” and “the external device interface including an external clock contact, wherein the electrical circuit is configured to be clocked by an external clock signal received at the external clock contact after the electrical circuit provides the wakeup signal to the external device interface.”  However, it is well known that processing chips include timing signals provided by clocks of local oscillators.
Fürst et al. teaches a voice activity detection microphone that includes a microelectromechanical system (MEMS) circuit that receives a clock signal from an external host, where the clock signal is effective to operate a full system operation mode during a first time period and a voice activity mode of operation during a second time period.  The voice activity mode of operation has a first power consumption and the full system operation mode has a second power consumption.  (Abstract)  A microphone with voice or event detection enables the microphone to generate an interrupt signal which can wake the system up.  The microphone can have several modes of operation that are controlled by a clock signal.  The host generates the clock signal for the microphone, where an absence of a clock causes the microphone to enter voice activity detection (VAD) mode.  The VAD mode has a very low power consumption, and runs on a relatively low clock frequency which can be supplied by an on-chip oscillator.  (Column 2, Lines 10 to 43)  Host 104 generates clock signal 106 for microphone 102 and controls the mode of operation of microphone 102.  Microphone 102 runs on a relatively low clock frequency which can be supplied externally from clock signal 106 supplied by host 104 or from an internal on-chip oscillator in microphone 102.  The absence of a clock causes the microphone to enter a voice activity detection mode, where a clock circuit may be located on the same chip as the other components or located externally.  (Column 3, Lines 36 to 64: Figure 1)  In the wake up host mode 204, an external clock is received from the host.  The host becomes partially awake due to the detection of a keyword.  Subsequently, an external clock for the microphone is enabled with a clock frequency corresponding to a higher performance level, enough for reliable keyword detection.  (Column 4, Lines 39 to 46: Figure 2)  Fürst et al., then, teaches an on-chip oscillator (“the electrical circuit further comprising a local oscillator”) and that microphone 102 is clocked at a lower clock frequency by an on-chip oscillator when it is in voice activity detection mode (“wherein the electrical circuit is configured to be clocked by the local oscillator before the electrical circuit provides the host device wakeup signal to the host device interface”).  Once a microphone is woken up, an external clock runs the microphone at a higher clock frequency in a full system operation mode (“the external device interface including an external clock contact, where the electrical circuit is configured to be clocked by an external clock signal received at the external clock contact after the electrical circuit provides the wakeup signal to the external device interface”).  Implicitly, a clock signal is provided to an electrical circuit at “an external clock contact”.  An objective is to provide lower power consumption for smart-phones to enable very low power consumption levels so that only the most necessary signal processing is active.  (Column 1, Lines 38 to 44 and Column 2, Lines 44 to 46)  It would have been obvious to one having ordinary skill in the art to provide a local oscillator and an external clock signal as taught by Fürst et al. to control a low-power trigger engine and a high-power trigger engine of Patel et al. for a purpose of enabling low power consumption in smart phones so that only the most necessary signal processing is active.
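For illustration only, the clocking sequence discussed above can be sketched as a selection between two clock sources. This is a sketch under assumptions, not the circuit of any reference of record or of the application: the names `ClockSource` and `select_clock` are hypothetical, and falling back to the local oscillator whenever the external clock is absent is an assumed simplification.

```python
from enum import Enum


class ClockSource(Enum):
    LOCAL_OSCILLATOR = "local_oscillator"  # low-frequency, on-chip
    EXTERNAL = "external"                  # supplied by the host at a clock contact


def select_clock(wakeup_sent: bool, external_clock_present: bool) -> ClockSource:
    """Use the external clock only after the wakeup signal has been sent and the clock is present."""
    if wakeup_sent and external_clock_present:
        return ClockSource.EXTERNAL
    return ClockSource.LOCAL_OSCILLATOR
```

Before the wakeup signal is provided, the sketch always returns the local oscillator, which corresponds to the lower-power detection mode described above.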




Allowable Subject Matter
Claims 4 to 5, 8, and 12 to 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Vitaladevuni et al. (‘560), Sundaram et al., Vitaladevuni et al. (‘021), Wolff et al., Virolainen et al., Tulli, Lesso, and Li et al. disclose related prior art.  
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608.  The examiner can normally be reached Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571) 272-5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center.  Unpublished application information in Patent Center is available to registered users.  To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.  Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.  For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MARTIN LERNER/
Primary Examiner
Art Unit 2657
September 6, 2022