Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
	Status of Claims
The present application is being examined under the claims filed on 03/25/2020.
Claims 1-20 are rejected.
Claims 1-20 are pending.
Drawings
The drawings were submitted on 0 in compliance with all requirements.  Accordingly, they are being considered by the examiner in their entirety.
Specification
The specification was submitted on 0 in compliance with all requirements.  Accordingly, it is being considered by the examiner in its entirety.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 6, 9, 11-13, 15, 18 and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Yavagal et al. (US 11232788 B2), hereinafter Yavagal.

Regarding Claim 1:
Yavagal teaches:  An electronic device comprising: 
a first processing device;  (“The device includes a digital-signal processor (DSP)” (Yavagal, Abstract).)
and a second processing device, wherein: the first processing device is configured to: use a keyword-detection model to determine if a segment of an input stream comprises a keyword; (“The device determines (162), using a second speech-processing component of the voice-controlled device, that the audio data includes a representation of a wakeword” (Yavagal, Fig. 1B and related text, 4:1-4).  Further, “A device having a voice-based interface activates or "wakes" when it detects an utterance (segment of an input stream) that includes a wakeword [keyword] […] wakeword-detection model of the first speech-processing component [keyword-detection model]” (Yavagal, Abstract).  Yavagal is thereby disclosing a second speech-processing sub-device, or second processor, which applies a keyword-detection model to detect keywords in input audio streams.)
wake up the second processing device in response to determining that the segment of the input stream comprises the keyword; and  (“The device includes a digital-signal processor (DSP) that implements a first, lower-power speech-processing component that detects when captured audio includes the wakeword; the first speech-processing component then activates a second, higher-accuracy wakeword detection component” (Yavagal, Abstract).  Yavagal describes the wakeword detection component as “a second-stage, higher-accuracy speech processing component” (processor) (Yavagal, Fig. 2 and related text, 2:17-21).  Yavagal is thereby waking up a second processing device in response to determining the segment comprises the keyword.)
modify the keyword-detection model in response to receiving a training input from the second processing device; and  (“The other system or device may send a command, via the API, to the first-stage speech-processing component 220 to receive a different wakeword-detection model than the one the first-stage speech-processing component 220 is currently using, and may subsequently send data corresponding to the different wakeword-detection model. The first-stage speech-processing component 220 may then receive the different wakeword-detection model and load it into memory, firmware, or similar storage for use; the first-stage speech-processing component 220 may then use the different wakeword-detection model for wakeword detection” (Yavagal, Fig. 2 and related text, 15:17-28).  Yavagal is thereby modifying the keyword-detection model of the first processing device in response to training inputs received from the second processing device.)
the second processing device is configured to: use a first neural network to determine whether the segment of the input stream comprises the keyword; and  (“a second, higher-accuracy wakeword detection component that confirms that the captured audio includes the wakeword.  The device may configure a wakeword-detection parameter, speaker-identification parameter, and/or wakeword-detection model” (Yavagal, Abstract).  Further, “The firmware includes a DSP 308 that implements a first-stage speech processing component 220.  The application software 306 includes an executable application 312 that includes a second-stage speech-processing component 222; the application software 306 may further include a user-training component 316 and/or a bridging component 318” (Yavagal, Fig. 3A and related text, 9:10-16).  Yavagal is thereby running both processor components on the device that receives the input stream, and determining on the second processing device whether the segment indeed comprises the keyword.  Further, a first neural network is used for making this determination on the second processing device, as described in Fig. 2 and as follows: “the wakeword detection components 220, 222 may be built on deep neural network (DNN)/recursive neural network (RNN) structures directly, without HMM being involved” (Yavagal, Fig. 2 and related text, 5:38-41).)
provide the training input to the first processing device in response to determining that the segment of the input stream does not comprise the keyword.  (“The quiet- and noisy-location wakeword-detection models may be implemented to have higher and lower sensitivities, respectively, for detecting a wakeword” (Yavagal, 16:47-52).  Yavagal is thereby selecting one of multiple possible wakeword-detection models based on the need to implement higher or lower wakeword-detection sensitivities.)
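
For illustration only, the two-stage arrangement mapped to claim 1 above can be sketched as follows. This is a minimal sketch and not Yavagal's implementation: the class names, the use of strings as stand-ins for audio segments, the score tables, the starting thresholds, and the fixed threshold step are all assumptions, and the "training input" is modeled simply as a raised first-stage detection threshold rather than a full model replacement.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical scorer type: maps an audio segment (a string stand-in here)
# to a similarity score on an assumed 0-100 scale.
Scorer = Callable[[str], float]


@dataclass
class FirstStage:
    """Low-power detector (the role the action maps to the DSP)."""
    score: Scorer
    threshold: float = 50.0  # assumed starting value

    def detects(self, segment: str) -> bool:
        return self.score(segment) > self.threshold

    def apply_training_input(self, new_threshold: float) -> None:
        # Assumption: the training input is a replacement threshold.
        self.threshold = new_threshold


@dataclass
class SecondStage:
    """Higher-accuracy detector that is woken by the first stage."""
    score: Scorer
    threshold: float = 50.0  # assumed value
    awake: bool = False

    def wake(self) -> None:
        self.awake = True

    def confirms(self, segment: str) -> bool:
        return self.score(segment) > self.threshold


def process(segment: str, first: FirstStage, second: SecondStage,
            step: float = 10.0) -> str:
    """Run one segment through both stages and report the outcome."""
    if not first.detects(segment):
        return "ignored"            # second stage is never woken
    second.wake()
    if second.confirms(segment):
        return "confirmed"
    # False positive: the second stage sends a training input back.
    first.apply_training_input(first.threshold + step)
    return "false_positive"


def demo() -> List[str]:
    # Made-up scores for made-up segments, for illustration only.
    cheap = {"hello device": 80.0, "yellow device": 60.0}
    accurate = {"hello device": 90.0, "yellow device": 20.0}
    first = FirstStage(score=lambda s: cheap.get(s, 0.0))
    second = SecondStage(score=lambda s: accurate.get(s, 0.0))
    segments = ["background", "hello device", "yellow device", "yellow device"]
    return [process(s, first, second) for s in segments]
```

In this sketch the second "yellow device" segment is ignored by the first stage because the earlier false positive raised its threshold.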

Regarding Claim 2:
Yavagal teaches:  The electronic device of claim 1, wherein: the electronic device comprises an input device configured to provide the input stream;  (“Referring to FIG. 8, the device 110 may include input/output device interfaces 802 that connect to a variety of components such as an audio output component such as a speaker 812, a wired headset or a wireless headset (not illustrated), or other component capable of outputting audio. The device 110 may also include an audio capture component. The audio capture component may be, for example, a microphone 820 or array of microphones” (Yavagal, Fig. 8 and related text, 25:5-12).  Yavagal is thereby using input devices such as microphones to capture the input audio stream.)
the input device comprises a microphone; and  (“These input devices may include, for example, a camera, a microphone,” (Yavagal, 15:40-41).)
the input stream is an audio data stream.  (“The device 110 may also include an audio capture component. The audio capture component may be, for example, a microphone 820 or array of microphones” (Yavagal, Fig. 8 and related text, 25:5-12).)
 
Regarding Claim 3:
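Yavagal teaches:  The electronic device of claim 1, wherein: the first processing device is a low-power processor;  (“The present disclosure improves computing devices by implementing a first-stage, lower-power speech-processing component that detects a representation of a wakeword in audio data and activates a second-stage, higher-accuracy speech-processing component” (Yavagal, 2:17-21).)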
the second processing device is a system on chip (SoC) configured to: operate in a reduced-activity mode;  (“The present disclosure improves computing devices by implementing a first-stage, lower-power speech-processing component that detects a representation of a wakeword in audio data and activates a second-stage, higher-accuracy speech-processing component” (Yavagal, 2:17-21).)
transition from operating in the reduced-activity mode to operating in an unrestricted-activity mode upon wake-up by the first processing device.  (“Once the wakeword is detected by both the first-stage wakeword detection component 220 and the second-stage wakeword detection component 222, the device 110 may wake and begin transmitting audio data 211, representing the audio 11, to the server(s) 120” (Yavagal, Fig. 2 and related text, 5:48-52).)

Regarding Claim 4:
	Yavagal teaches:  The electronic device of claim 1, wherein the electronic device comprises a battery configured to provide power to the first and second processing devices.  (“The device 110 and/or the second-stage speech-processing component 222 may further command the first-stage speech-processing component 220, via the API, to change the wakeword-detection threshold based determining a connection to a power source, such as an electrical outlet. If the device 110 is not connected to the power source, the device 110 and/or the second-stage speech-processing component 222 may command the first-stage speech-processing component 220, via the API, to raise the wakeword-detection threshold (to, for example, 75);” (Yavagal, 18:33-42).  Yavagal is thereby teaching that without an external power source such as a wall socket, the first and second processors continue to operate on an internal power source (battery), with a raised wakeword-detection threshold.)

Regarding Claim 6:
	The electronic device of claim 1, wherein using the keyword-detection model by the first processing device comprises:
generating a user score for the segment of the input stream;  (“the device 110 may select and use wakeword-detection and/or speaker-identification models for either or both of the first and second speech-processing components based on the device-use data. The device 110 may further determine and use a wakeword-detection threshold” (Yavagal, 4:8-14).  “The device, however, may determine (158), using the first speech-processing component, a score corresponding to similarity between the audio data and user data and determines (160), using the first speech-processing component, that the score satisfies the speaker-identification parameter” (Yavagal, Fig. 1B and related text, 3:63-4:1).  Yavagal’s first processing device thereby determines, in using the keyword-detection model, a user score for the segment of the input stream.)
determining that the segment of the input stream comprises the keyword if the user score exceeds a keyword threshold; and  (“The first-stage speech-processing component 220 may, for example, determine a similarity score for the candidate wakeword based on how similar it is to the stored wakeword; if the similarity score is higher than the wakeword-detection threshold the first-stage speech processing component 220 determines that the wakeword is present in the audio data,” (Yavagal, 17:40-46).  Yavagal is thereby assigning a similarity score based on similarity between the user’s stored wakeword audio and the newly received input stream (user score), and determining the wakeword is present if this score exceeds a keyword threshold.)
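determining that the segment of the input stream does not comprise the keyword if the user score is less than the first threshold.  (“The first-stage speech-processing component 220 may, for example, determine a similarity score for the candidate wakeword based on how similar it is to the stored wakeword; if the similarity score is higher than the wakeword-detection threshold the first-stage speech processing component 220 determines that the wakeword is present in the audio data, and if the similarity score is less than the wakeword-detection threshold, the first-stage speech-processing component 220 determines that the wakeword not is present in the audio data.” (Yavagal, 17:40-46).  Further, “The device 110 may further determine and use a wakeword-detection threshold” (Yavagal, 4:8-14).)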
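
For illustration only, the score-and-threshold comparison mapped to claim 6 above can be sketched as follows. This is a sketch under stated assumptions: the quoted passages do not say how the similarity score is computed, so cosine similarity over hypothetical feature vectors, scaled to an assumed 0-100 range, is swapped in here, and the function names are invented for the example. A score exactly equal to the threshold is treated as "not detected", which is also an assumption.

```python
import math
from typing import Sequence


def similarity_score(candidate: Sequence[float],
                     stored: Sequence[float]) -> float:
    """Assumed scoring: cosine similarity of two feature vectors,
    clamped at zero and scaled to a 0-100 range."""
    dot = sum(a * b for a, b in zip(candidate, stored))
    norm = (math.sqrt(sum(a * a for a in candidate))
            * math.sqrt(sum(b * b for b in stored)))
    if norm == 0.0:
        return 0.0  # no signal in one of the vectors
    return 100.0 * max(0.0, dot / norm)


def contains_keyword(user_score: float, keyword_threshold: float) -> bool:
    """Keyword is deemed present only if the score exceeds the threshold."""
    return user_score > keyword_threshold
```

For example, with the threshold of 75 mentioned in the quoted passage on power sources, `contains_keyword(80.0, 75.0)` is true and `contains_keyword(60.0, 75.0)` is false.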

Regarding Claim 9:
The electronic device of claim 1, wherein the second processing device is further configured to provide the segment of the input stream to an automatic speech-recognition (ASR) system if the second processing device determines that the segment of the input stream comprises the keyword.  (“Once the wakeword is detected by both the first-stage wakeword detection component 220 and the second-stage wakeword detection component 222, the device 110 may wake and begin transmitting audio data 211, representing the audio 11, to the server(s) 120” (Yavagal, 5:48-52).  Further, “Upon receipt by the server(s) 120, the audio data 211 may be sent to an orchestrator component 240. […] The orchestrator component 240 may send the audio data 211 to an ASR component 250” (Yavagal, Fig. 2 and related text, 5:57-58,62-64).)

Regarding Claim 11:
Yavagal teaches:
A method for an electronic device comprising a first processing device and a second processing device, the method comprising:  (“The device includes a digital-signal processor (DSP)” (Yavagal, Abstract).  “The device determines (162), using a second speech-processing component of the voice-controlled device, that the audio data includes a representation of a wakeword” (Yavagal, 4:1-4).  Yavagal is thereby disclosing a DSP (first processing device) and a separate speech-processing component (second processing device).)
receiving, at the first processing device, an input stream (“a voice-controlled device 110 receives audio 11 from a user 5” (Yavagal, 3:6-8));
using a keyword-detection model, by the first processing device, to determine if a segment of the input stream comprises a keyword;  (“The device determines (162), using a second speech-processing component of the voice-controlled device, that the audio data includes a representation of a wakeword” (Yavagal, Fig. 1B and related text, 4:1-4).  Further, “A device having a voice-based interface activates or "wakes" when it detects an utterance (segment of an input stream) that includes a wakeword [keyword] […] wakeword-detection model of the first speech-processing component [keyword-detection model]” (Yavagal, Abstract).  Yavagal is thereby disclosing a second speech-processing sub-device, or second processor, which applies a keyword-detection model to detect keywords in input audio streams.)
waking up the second processing device, by the first processing device, in response to determining that the segment of the input stream comprises the keyword;  (“The device includes a digital-signal processor (DSP) that implements a first, lower-power speech-processing component that detects when captured audio includes the wakeword; the first speech-processing component then activates a second, higher-accuracy wakeword detection component” (Yavagal, Abstract).  Yavagal describes the wakeword detection component as “a second-stage, higher-accuracy speech processing component” (processor) (Yavagal, Fig. 2 and related text, 2:17-21).  Yavagal is thereby waking up a second processing device in response to determining the segment comprises the keyword.)
using a first neural network, by the second processing device, to determine whether the segment of the input stream comprises the keyword;  (“a second, higher-accuracy wakeword detection component that confirms that the captured audio includes the wakeword.  The device may configure a wakeword-detection parameter, speaker-identification parameter, and/or wakeword-detection model” (Yavagal, Abstract).  Further, “The firmware includes a DSP 308 that implements a first-stage speech processing component 220.  The application software 306 includes an executable application 312 that includes a second-stage speech-processing component 222; the application software 306 may further include a user-training component 316 and/or a bridging component 318” (Yavagal, Fig. 3A and related text, 9:10-16).  Yavagal is thereby running both processor components on the device that receives the input stream, and determining on the second processing device whether the segment indeed comprises the keyword.  Further, a first neural network is used for making this determination on the second processing device, as described in Fig. 2 and as follows: “the wakeword detection components 220, 222 may be built on deep neural network (DNN)/recursive neural network (RNN) structures directly, without HMM being involved” (Yavagal, Fig. 2 and related text, 5:38-41).)
providing, by the second processing device, a training input to the first processing device in response to determining that the segment of the input stream does not comprise the keyword; and  (“The quiet- and noisy-location wakeword-detection models may be implemented to have higher and lower sensitivities, respectively, for detecting a wakeword” (Yavagal, 16:47-52).  Yavagal is thereby selecting one of multiple possible wakeword-detection models based on the need to implement higher or lower wakeword-detection sensitivities.  “The device 110 and/or the second-stage speech-processing component 222 may further command the first-stage speech-processing component 220, via the API, to change the wakeword-detection threshold based on a number of false-positive wakeword detections” (Yavagal, Fig. 2 and related text, 18:17-26).  Yavagal is thereby providing the training to the first processing device based on the other (second) processor on device 110 determining the segment(s) do not comprise the keyword(s).)
modifying, by the first processing device, the keyword detection model in response to receiving the training input. (“The other system or device may send a command, via the API, to the first-stage speech-processing component 220 to receive a different wakeword-detection model than the one the first-stage speech-processing component 220 is currently using, and may subsequently send data corresponding to the different wakeword-detection model. The first-stage speech-processing component 220 may then receive the different wakeword-detection model and load it into memory, firmware, or similar storage for use; the first-stage speech-processing component 220 may then use the different wakeword-detection model for wakeword detection” (Yavagal, Fig. 2 and related text, 15:17-28).  Yavagal is thereby modifying the keyword-detection model of the first processing device in response to training inputs received from the second processing device.)

Regarding Claim 12:
The method of claim 11, wherein: the electronic device comprises an input device configured to provide the input stream;  (“Referring to FIG. 8, the device 110 may include input/output device interfaces 802 that connect to a variety of components such as an audio output component such as a speaker 812, a wired headset or a wireless headset (not illustrated), or other component capable of outputting audio. The device 110 may also include an audio capture component. The audio capture component may be, for example, a microphone 820 or array of microphones” (Yavagal, Fig. 8 and related text, 25:5-12).  Yavagal is thereby using input devices such as microphones to capture the input audio stream.)
the input device comprises a microphone; and  (“These input devices may include, for example, a camera, a microphone,” (Yavagal, 15:40-41).)
the input stream is an audio data stream.  (“The device 110 may also include an audio capture component. The audio capture component may be, for example, a microphone 820 or array of microphones” (Yavagal, Fig. 8 and related text, 25:5-12).)

Regarding Claim 13:
	The method of claim 11, wherein: the first processing device is a low-power processor;  (“The present disclosure improves computing devices by implementing a first-stage, lower-power speech-processing component that detects a representation of a wakeword in audio data and activates a second-stage, higher-accuracy speech-processing component” (Yavagal, 2:17-21).)
the second processing device is a system on chip (SoC) configured to operate in a reduced-activity mode; and  (“The present disclosure improves computing devices by implementing a first-stage, lower-power speech-processing component that detects a representation of a wakeword in audio data and activates a second-stage, higher-accuracy speech-processing component” (Yavagal, 2:17-21).)
the method further comprises the second processing device transitioning from operating in the reduced activity mode to operating in an unrestricted-activity mode upon wake-up by the first processing device.  (“Once the wakeword is detected by both the first-stage wakeword detection component 220 and the second-stage wakeword detection component 222, the device 110 may wake and begin transmitting audio data 211, representing the audio 11, to the server(s) 120” (Yavagal, Fig. 2 and related text, 5:48-52).)

Regarding Claim 15:
The method of claim 11, wherein using the keyword detection model by the first processing device comprises: generating a user score for the segment of the input stream;  (“the device 110 may select and use wakeword-detection and/or speaker-identification models for either or both of the first and second speech-processing components based on the device-use data. The device 110 may further determine and use a wakeword-detection threshold” (Yavagal, 4:8-14).  “The device, however, may determine (158), using the first speech-processing component, a score corresponding to similarity between the audio data and user data and determines (160), using the first speech-processing component, that the score satisfies the speaker-identification parameter” (Yavagal, Fig. 1B and related text, 3:63-4:1).  Yavagal’s first processing device thereby determines, in using the keyword-detection model, a user score for the segment of the input stream.)
determining that the segment of the input stream comprises the keyword if the user score exceeds a keyword threshold; and  (“The first-stage speech-processing component 220 may, for example, determine a similarity score for the candidate wakeword based on how similar it is to the stored wakeword; if the similarity score is higher than the wakeword-detection threshold the first-stage speech processing component 220 determines that the wakeword is present in the audio data,” (Yavagal, 17:40-46).  Yavagal is thereby assigning a similarity score based on similarity between the user’s stored wakeword audio and the newly received input stream (user score), and determining the wakeword is present if this score exceeds a keyword threshold.)
determining that the segment of the input stream does not comprise the keyword if the user score is less than the first threshold.  (“The first-stage speech-processing component 220 may, for example, determine a similarity score for the candidate wakeword based on how similar it is to the stored wakeword; if the similarity score is higher than the wakeword-detection threshold the first-stage speech processing component 220 determines that the wakeword is present in the audio data, and if the similarity score is less than the wakeword-detection threshold, the first-stage speech-processing component 220 determines that the wakeword not is present in the audio data.” (Yavagal, 17:40-46).  Yavagal is thereby assigning a similarity score based on similarity between the user’s stored wakeword audio and the newly received input stream (user score), and determining the wakeword is present if this score exceeds a keyword threshold.  Further, “The device 110 may further determine and use a wakeword-detection threshold” (Yavagal, 4:8-14).)

Regarding Claim 18:
	The method of claim 11, further comprising providing, by the second processing device, the segment of the input stream to an automatic speech-recognition (ASR) system if the second processing device determines that the segment of the input stream comprises the keyword.  (“Once the wakeword is detected by both the first-stage wakeword detection component 220 and the second-stage wakeword detection component 222, the device 110 may wake and begin transmitting audio data 211, representing the audio 11, to the server(s) 120” (Yavagal, 5:48-52).  Further, “Upon receipt by the server(s) 120, the audio data 211 may be sent to an orchestrator component 240. […] The orchestrator component 240 may send the audio data 211 to an ASR component 250” (Yavagal, Fig. 2 and related text, 5:57-58,62-64).)

Regarding Claim 20:
	A non-transitory computer-readable medium comprising instructions that, when executed by a first and second processing device, cause the first and second processing device to perform operations for:  (“The device includes a digital-signal processor (DSP)” (Yavagal, Abstract).  “The device determines (162), using a second speech-processing component of the voice-controlled device, that the audio data includes a representation of a wakeword” (Yavagal, 4:1-4).  Yavagal is thereby disclosing a DSP (first processing device) and a separate speech-processing component (second processing device).)
receiving, at the first processing device, an input stream (“a voice-controlled device 110 receives audio 11 from a user 5” (Yavagal, 3:6-8)); 
using a keyword-detection model, by the first processing device, to determine if a segment of the input stream comprises a keyword;  (“The device determines (162), using a second speech-processing component of the voice-controlled device, that the audio data includes a representation of a wakeword” (Yavagal, Fig. 1B and related text, 4:1-4).  Further, “A device having a voice-based interface activates or "wakes" when it detects an utterance (segment of an input stream) that includes a wakeword [keyword] […] wakeword-detection model of the first speech-processing component [keyword-detection model]” (Yavagal, Abstract).  Yavagal is thereby disclosing a second speech-processing sub-device, or second processor, which applies a keyword-detection model to detect keywords in input audio streams.)
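waking up the second processing device, by the first processing device, in response to determining that the segment of the input stream comprises the keyword;  (“The device includes a digital-signal processor (DSP) that implements a first, lower-power speech-processing component that detects when captured audio includes the wakeword; the first speech-processing component then activates a second, higher-accuracy wakeword detection component” (Yavagal, Abstract).  Yavagal describes the wakeword detection component as “a second-stage, higher-accuracy speech processing component” (processor) (Yavagal, Fig. 2 and related text, 2:17-21).  Yavagal is thereby waking up a second processing device in response to determining the segment comprises the keyword.)
using a first neural network, by the second processing device, to determine whether the segment of the input stream comprises the keyword;  (“a second, higher-accuracy wakeword detection component that confirms that the captured audio includes the wakeword.  The device may configure a wakeword-detection parameter, speaker-identification parameter, and/or wakeword-detection model” (Yavagal, Abstract).  Further, “The firmware includes a DSP 308 that implements a first-stage speech processing component 220.  The application software 306 includes an executable application 312 that includes a second-stage speech-processing component 222; the application software 306 may further include a user-training component 316 and/or a bridging component 318” (Yavagal, Fig. 3A and related text, 9:10-16).  Yavagal is thereby running both processor components on the device that receives the input stream, and determining on the second processing device whether the segment indeed comprises the keyword.  Further, a first neural network is used for making this determination on the second processing device, as described in Fig. 2 and as follows: “the wakeword detection components 220, 222 may be built on deep neural network (DNN)/recursive neural network (RNN) structures directly, without HMM being involved” (Yavagal, Fig. 2 and related text, 5:38-41).)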
providing, by the second processing device, a training input to the first processing device in response to determining that the segment of the input stream does not comprise the keyword; and  (“The quiet- and noisy-location wakeword-detection models may be implemented to have higher and lower sensitivities, respectively, for detecting a wakeword” (Yavagal, 16:47-52).  Yavagal is thereby selecting one of multiple possible wakeword-detection models based on the need to implement higher or lower wakeword-detection sensitivities.  “The device 110 and/or the second-stage speech-processing component 222 may further command the first-stage speech-processing component 220, via the API, to change the wakeword-detection threshold based on a number of false-positive wakeword detections” (Yavagal, Fig. 2 and related text, 18:17-26).  Yavagal is thereby providing the training to the first processing device based on the other (second) processor on device 110 determining the segment(s) do not comprise the keyword(s).)
modifying, by the first processing device, the keyword detection model in response to receiving the training input.  (“The other system or device may send a command, via the API, to the first-stage speech-processing component 220 to receive a different wakeword-detection model than the one the first-stage speech-processing component 220 is currently using, and may subsequently send data corresponding to the different wakeword-detection model. The first-stage speech-processing component 220 may then receive the different wakeword-detection model and load it into memory, firmware, or similar storage for use; the first-stage speech-processing component 220 may then use the different wakeword-detection model for wakeword detection” (Yavagal, Fig. 2 and related text, 15:17-28).  Yavagal is thereby modifying the keyword-detection model of the first processing device in response to training inputs received from the second processing device.)

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 5 and 14 are rejected under 35 U.S.C.  103 as being unpatentable over Yavagal et al. (US 11232788 B2), hereinafter Yavagal, and further in view of Gunn et al. (US 10381001 B2), hereinafter Gunn.
Regarding Claim 5:
	Yavagal teaches:  the first processing device is further configured to delay modifying the keyword-detection model until the battery is in a […] mode; and  (“The device 110 and/or the second-stage speech-processing component 222 may further command the first-stage speech-processing component 220, via the API, to change the wakeword-detection threshold based determining a connection to a power source, such as an electrical outlet. If the device 110 is not connected to the power source, the device 110 and/or the second-stage speech-processing component 222 may command the first-stage speech-processing component 220, via the API, to raise the wakeword-detection threshold (to, for example, 75); if the device 110 is connected to the power source, the device 

Yavagal does not explicitly teach:  
[…] battery is in a recharging mode; and
The electronic device of claim 4, wherein: the battery is a rechargeable battery configured to be recharged in a recharging mode; and  

However, in an analogous art of voice control during low-power mode, Gunn teaches: […] recharging mode; and 
The electronic device of claim 4, wherein: the battery is a rechargeable battery configured to be recharged in a recharging mode; and  (“most mobile devices will display at least a clock showing the time of day and a battery charge level” (Gunn, 2:2-4).  Gunn is thereby describing that the battery charge level is to be associated with most mobile devices, which since before the time of filing contained rechargeable batteries configured to be recharged.  The state of being recharged is a recharging mode.)
It would have been obvious before the effective filing date of the invention to combine Yavagal with Gunn in order to improve Yavagal’s wakeword detection system by explicitly enabling mobile device 110 to recharge in a recharging mode when it is plugged into a wall socket as most mobile devices do, as is taught by Gunn.  This embodiment of providing battery recharging and a recharging 

Regarding Claim 14:
Yavagal teaches:
The method of claim 11, wherein: the electronic device comprises a battery configured to provide power to the first and second processing devices;  (“The device 110 and/or the second-stage speech-processing component 222 may further command the first-stage speech-processing component 220, via the API, to change the wakeword-detection threshold based determining a connection to a power source, such as an electrical outlet. If the device 110 is not connected to the power source, the device 110 and/or the second-stage speech-processing component 222 may command the first-stage speech-processing component 220, via the API, to raise the wakeword-detection threshold (to, for example, 75);” (Yavagal, 18:33-42).  Yavagal is thereby teaching that without an external power source such as a wall socket, the first and second processors continue to operate on an internal power source (battery), with a raised wakeword-detection threshold.)
the method further comprises the first processing device delaying modifying the keyword-detection model until the battery is in a […] mode;  (“The device 110 and/or the second-stage speech-processing component 222 may further command the first-stage speech-processing component 220, via the API, to change the wakeword-detection threshold based determining a connection to a power source, such as an electrical outlet. If the device 110 is not connected to the power source, the device 110 and/or the second-stage speech-processing component 222 may command the first-stage speech-processing component 220, via the API, to raise the wakeword-detection threshold (to, for example, 75); if the device 110 is connected to the power source, the device 110 and/or the second-stage speech-

Yavagal does not explicitly teach:  
[…] battery is in a recharging mode; and
the battery is a rechargeable battery configured to be recharged in a recharging mode;  

However, in an analogous art of voice control during low-power mode, Gunn teaches: […] recharging mode; and 
the battery is a rechargeable battery configured to be recharged in a recharging mode;  (“most mobile devices will display at least a clock showing the time of day and a battery charge level” (Gunn, 2:2-4).  Gunn thereby describes a battery charge level associated with most mobile devices, which since before the time of filing have contained rechargeable batteries configured to be recharged.  The state of being recharged is a recharging mode.)
It would have been obvious before the effective filing date of the invention to combine Yavagal with Gunn in order to improve Yavagal’s wakeword detection system by explicitly enabling mobile device 110 to recharge in a recharging mode when it is plugged into a wall socket as mobile devices do, as is taught by Gunn.  This embodiment of providing battery recharging and a recharging mode when the mobile device is plugged into external power is taught in the art and commonly desired for the . 

Claims 7, 8, 16, and 17 are rejected under 35 U.S.C.  103 as being unpatentable over Yavagal et al. (US 11232788 B2), hereinafter Yavagal, and further in view of Ponte et al. (US 20120047172 A1), hereinafter Ponte.

Regarding Claim 7:
Yavagal teaches:
The electronic device of claim 6, wherein: using the first neural network by the second processing device comprises: recalculating the user score for the segment of the input stream; and  (“a second, higher-accuracy wakeword detection component that confirms that the captured audio includes the wakeword.  The device may configure a wakeword-detection parameter, speaker-identification parameter, and/or wakeword-detection model” (Yavagal, Abstract).  These parameters and models may be built on the neural network and used for making this user score recalculation on the second processing device, as described in Fig. 2 and as follows: “the wakeword detection components 220, 222 may be built on deep neural network (DNN)/recursive neural network (RNN) structures directly, without HMM being involved” (Yavagal, Fig. 2 and related text, 5:38-41).  Further, “the device determines (162), using a second speech-processing component of the voice-controlled device, that the audio data includes a representation of a wakeword.  The second speech-processing component may, in addition determine a score corresponding to a similarity between the audio data and user data and may further determine that the score satisfies the speaker identification parameter” (Yavagal, 4:1-8).  Yavagal thereby runs both processor components on the device that receives the input stream and recalculates, on the second processing device, the user score for the input segment.)
appending a datum corresponding to the user score to a [learning bin if the] recalculated user score [exceeds a learning threshold]; and  (“The device determines (162), using a second speech-processing component of the voice-controlled device, that the audio data includes a representation of a wakeword.  The second speech-processing component may, in addition determine a score corresponding to a similarity between the audio data and user data and may further determine that the score satisfies the speaker identification parameter” (Yavagal, 4:1-8).  Yavagal thereby represents input stream segments with data that represent the similarity between the user-trained voice data and the keyword (user scores).)
modifying the keyword-detection model comprises using the learning bin.  (“The trained wakeword detection components 220, 220 implemented by the device 110 may be trained and operated according to various machine learning techniques” (Yavagal, 13:66-14:1).  The keyword detection components and their models are trained by the data in the learning bin as discussed above.  Any memory location comprising training data is a learning bin.  Thus, the full process of training a model will always comprise a learning bin.)

Yavagal does not explicitly teach:  
appending a datum corresponding to the user score to a learning bin if the recalculated user score exceeds a learning threshold.

However, in an analogous art of scoring voice data, Ponte et al. (US 20120047172 A1), hereinafter Ponte, teaches: appending a datum corresponding to a user score to a learning bin if the user score exceeds a learning threshold (“evaluating candidate voice recording-transcriptions pairs based on common features shared by the pairs, scoring the candidate voice recording-transcription pairs based on the evaluation, and determining whether a voice recording-transcription pair is a voice recording and 
It would have been obvious before the effective filing date of the invention to combine Yavagal’s user score datum with Ponte’s voice data similarity score training threshold, which also represents the similarity between voice data and another piece of speech-related data.  This would improve the training data and thus the accuracy of the neural network for wakeword speech recognition on the secondary processor.  
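For illustration only, the limitation as mapped above (appending a datum to a learning bin when the recalculated user score exceeds a learning threshold) can be sketched as follows.  This is a hypothetical sketch and is not code from Yavagal, Ponte, or the application; the names LearningBin and maybe_append, and the numeric values, are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LearningBin:
    """Hypothetical store of training data gated by a learning threshold."""
    learning_threshold: float  # assumed value; not taken from any reference
    data: List[float] = field(default_factory=list)

    def maybe_append(self, recalculated_score: float) -> bool:
        """Append the score only if it exceeds the learning threshold."""
        if recalculated_score > self.learning_threshold:
            self.data.append(recalculated_score)
            return True
        return False


# Example: three recalculated user scores checked against a threshold of 0.9.
bin_ = LearningBin(learning_threshold=0.9)
results = [bin_.maybe_append(s) for s in (0.95, 0.80, 0.92)]
```

In this sketch only the scores above the assumed threshold are retained, so the bin holds only the data that would later be used to modify the keyword-detection model.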

Regarding Claim 8:
The electronic device of claim 7, wherein the learning threshold is greater than the keyword threshold.  (“identifying audio recording-transcription pairs can include identifying a group of candidate transcriptions from a collection of transcriptions, where each of the candidate transcriptions shares one or more "rare" features (e.g., tokens),” (Ponte, ¶0069).  Ponte is digitally ascribing tokens when the data features attain some threshold of digital weight/worthiness (keyword threshold).   The tokens are ascribed to transcription candidates (user scores), which are algorithmically compared to their associated input speech: “voice audio recordings and voice audio recording transcriptions are mined to identify audio recording-transcription pairs”.   Further, a greater threshold standard is “evaluating candidate voice recording-transcriptions pairs based on common features shared by the pairs, scoring the candidate voice recording-transcription pairs based on the evaluation, and determining whether a voice recording-transcription pair is a voice recording and its corresponding transcription if the score associated with the pair is above a pre-defined threshold.” (Ponte, ¶0069).)  Since Ponte is using the 

Regarding Claim 16:
Yavagal teaches:
The method of claim 15, wherein: using the first neural network by the second processing device comprises: recalculating the user score for the segment of the input stream; and  (“a second, higher-accuracy wakeword detection component that confirms that the captured audio includes the wakeword.  The device may configure a wakeword-detection parameter, speaker-identification parameter, and/or wakeword-detection model” (Yavagal, Abstract).  These parameters and models may be built on the neural network and used for making this user score recalculation on the second processing device, as described in Fig. 2 and as follows: “the wakeword detection components 220, 222 may be built on deep neural network (DNN)/recursive neural network (RNN) structures directly, without HMM being involved” (Yavagal, Fig. 2 and related text, 5:38-41).  Further, “the device determines (162), using a second speech-processing component of the voice-controlled device, that the audio data includes a representation of a wakeword.  The second speech-processing component may, in addition determine a score corresponding to a similarity between the audio data and user data and may further determine that the score satisfies the speaker identification parameter” (Yavagal, 4:1-8).  Yavagal thereby runs both processor components on the device that receives the input stream and recalculates, on the second processing device, the user score for the input segment.)
appending a datum corresponding to the user score to a [learning bin if the] recalculated user score [exceeds a learning threshold]; and  (“The device determines (162), using a second speech-
modifying the keyword-detection model comprises using the learning bin.  (“The trained wakeword detection components 220, 220 implemented by the device 110 may be trained and operated according to various machine learning techniques” (Yavagal, 13:66-14:1).  The keyword detection components and their models are trained by the data in the learning bin as discussed above.  Any memory location comprising training data is a learning bin.)

Yavagal does not explicitly teach:  
	appending a datum corresponding to the user score to a learning bin if the recalculated user score exceeds a learning threshold.

However, in an analogous art of scoring voice data, Ponte et al. (US 20120047172 A1), hereinafter Ponte, teaches: appending a datum corresponding to a user score to a learning bin if the user score exceeds a learning threshold (“evaluating candidate voice recording-transcriptions pairs based on common features shared by the pairs, scoring the candidate voice recording-transcription pairs based on the evaluation, and determining whether a voice recording-transcription pair is a voice recording and its corresponding transcription if the score associated with the pair is above a pre-defined threshold.  The pairs identified as a match (i.e., having a score above the threshold) then can be used as input data for training automatic speech recognition tools.” (Ponte, ¶0069).)  Since Ponte is training automatic 
It would have been obvious before the effective filing date of the invention to combine Yavagal’s user score datum with Ponte’s voice data similarity score training threshold, which also represents the similarity between voice data and another piece of speech-related data.  This would improve the training data and thus the accuracy of the neural network for wakeword speech recognition on the secondary processor.  

Regarding Claim 17:
The method of claim 16, wherein the learning threshold is greater than the keyword threshold.  (“identifying audio recording-transcription pairs can include identifying a group of candidate transcriptions from a collection of transcriptions, where each of the candidate transcriptions shares one or more "rare" features (e.g., tokens),” (Ponte, ¶0069).  Ponte is digitally ascribing tokens when the data features attain some threshold of digital weight/worthiness (keyword threshold).   The tokens are ascribed to transcription candidates (user scores), which are algorithmically compared to their associated input speech: “voice audio recordings and voice audio recording transcriptions are mined to identify audio recording-transcription pairs”.   Further, a greater threshold standard is “evaluating candidate voice recording-transcriptions pairs based on common features shared by the pairs, scoring the candidate voice recording-transcription pairs based on the evaluation, and determining whether a voice recording-transcription pair is a voice recording and its corresponding transcription if the score associated with the pair is above a pre-defined threshold.” (Ponte, ¶0069).)  Since Ponte is using the candidate keyword transcriptions as pairs with the keyword audio recording, and only using these pairs for training data when a learning threshold is passed, the learning threshold is greater than the keyword threshold.

Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Yavagal in view of Wightman (US 10,074,364), and further in view of Gunn.

Regarding Claim 10:  
Yavagal teaches:
The electronic device of claim 9, wherein: the second processing device monitors a false-alarm rate of the first processing device; and  (“the second-stage speech processing component 222 may command the first-stage speech-processing component 220, via the API, to raise the wakeword-detection threshold (to, for example, 75); if the number of false-positive wakeword detections is low (e.g., less than one per minute),” (Yavagal, 18:23-32).)

the first processing device is configured to, [after achieving a predetermined false-alarm rate], subsequently bypass waking up the second processing device and [provide the segment of the input stream to the ASR system without waking up the second processing device].  (“the device 110 may select and use wakeword-detection and/or speaker-identification models for either or both of the first and second speech-processing components based on the device-use data. The device 110 may further determine and use a wakeword-detection threshold” (Yavagal, 4:8-14).  Further, “The wakeword-detection sensitivity of the wakeword detector 412 and/or second-stage classifier 416 may be altered in response to determining device-use data corresponding to the device 110 and determining a wakeword detection parameter corresponding to the device-use data” (Yavagal, Fig. 2 and related text, 13:33-37).  Yavagal uses either the first or the second speech processor for the keyword-detection models and, when only the first is used, bypasses waking up the second.)

Yavagal does not explicitly teach: […] after achieving a predetermined false-alarm rate

However, in an analogous art of voice control during low-power mode, Wightman teaches: […] after achieving a predetermined false-alarm rate (“If, at step 812, it is determined that the similarity value is greater than the similarity threshold value, then process 800 may proceed to step 818 where speech recognition processing may be caused to stop.” (Wightman, Fig. 8 and related text, 31:59-62).  Further related to multiple processors, “The electronic system of claim 14, wherein: the at least one processor comprises at least one first processor associated with at least one first electronic device and at least one second processor associated with at least one second electronic device; the at least one first processor is operable to: receive the first instance of the audio data determine that the plurality of additional instances of audio data representing the first sound are also received, determine the number of the instances of audio data, determine that the number of instances is greater than the threshold value, generate the first sound profile, and cause a communication to be sent to the second electronic device that causes the first sound profile to be stored; and the at least one second processor is operable to: generate the second sound profile of the second sound, determine that the similarity value of the second sound profile and the first sound profile is greater than the similarity threshold value, and based at least in part on the determination that the similarity value is greater than the similarity threshold value [achieving predetermined false-alarm rate], refrain from causing the at least some automated speech recognition processing to be performed for the second audio data.” (Wightman, 37:10-38:14).  Wightman thereby determines that a secondary, more intensive speech recognition processor is not needed when a low enough falseness likelihood (false-alarm rate) is achieved by the first, lower-power processing.)
It would have been obvious to combine Yavagal with Wightman in order to improve Yavagal’s Wakeword Detection system by enabling mobile device 110 to achieve a predetermined false-alarm rate  

Yavagal in combination with Wightman does not explicitly teach: […] provide the segment of the input stream to the ASR system without waking up the second processing device

However, in an analogous art of voice control during low-power mode, Gunn teaches: […] provide the segment of the input stream to the ASR system without waking up the second processing device  (“the low-power processing 103 takes over some operations of the application processor 101 and may execute at least a reduced code version of the voice recognition engine 205 executable code. In other embodiments, the speech segment monitor 207 takes over when the application processor 101 goes into sleep mode, and provides a limited voice recognition capability”  (Gunn, 7:32-41).  Gunn uses the ASR system on the low-power processor without waking up the second, more powerful application processor which, in normal operation, would access the ASR system.  Gunn thereby teaches the first processing device configured to bypass waking up the second and to provide the voice segment directly to the ASR system.)
It would have been obvious to combine Yavagal and Wightman with Gunn in order to improve Yavagal’s Wakeword Detection system by enabling mobile device 110 to operate more fully with the ASR function in the low power mode without the secondary processor activated.  This combination is taught by Gunn: “any function that may be performed while a device is in low-power operating mode may benefit from the embodiments herein described as may occur to those skilled in the art.”  Therefore, the embodiment of providing voice recognition capability when the application processor, or secondary processor, goes to sleep, is both taught and desired in the art. 
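For illustration only, the combined limitation addressed above (a second stage monitoring the false-alarm rate of a first stage, and the first stage bypassing the second once a predetermined rate is achieved) can be sketched as follows.  This is a hypothetical sketch and is not code from Yavagal, Wightman, Gunn, or the application; the names FalseAlarmMonitor and route, the minimum-sample requirement, and the numeric values are assumptions chosen for the example.

```python
class FalseAlarmMonitor:
    """Hypothetical tracker of first-stage detections checked by the second stage."""

    def __init__(self, target_rate: float, min_detections: int) -> None:
        self.target_rate = target_rate        # assumed predetermined false-alarm rate
        self.min_detections = min_detections  # assumed minimum sample size
        self.detections = 0
        self.false_alarms = 0

    def record(self, confirmed_by_second_stage: bool) -> None:
        """Count one first-stage detection and whether the second stage rejected it."""
        self.detections += 1
        if not confirmed_by_second_stage:
            self.false_alarms += 1

    def false_alarm_rate(self) -> float:
        """Fraction of first-stage detections rejected; 1.0 before any data exists."""
        return self.false_alarms / self.detections if self.detections else 1.0

    def bypass_allowed(self) -> bool:
        """True once enough detections were observed at or below the target rate."""
        return (self.detections >= self.min_detections
                and self.false_alarm_rate() <= self.target_rate)


def route(monitor: FalseAlarmMonitor) -> str:
    """Choose where the first stage sends a detected segment."""
    return "asr_direct" if monitor.bypass_allowed() else "wake_second_processor"


monitor = FalseAlarmMonitor(target_rate=0.05, min_detections=20)
before = route(monitor)  # no history yet, so the second processor is still woken
for _ in range(20):
    monitor.record(confirmed_by_second_stage=True)
after = route(monitor)   # target rate achieved, so the segment goes straight to ASR
```

The minimum-sample requirement is a design choice of the sketch, added so that the bypass is not enabled before any false-alarm history exists.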

Regarding Claim 19:
Yavagal teaches:
The method of claim 18, further comprising: monitoring, by the second processing device, a false alarm rate of the first processing device; and  (“the second-stage speech processing component 222 may command the first-stage speech-processing component 220, via the API, to raise the wakeword-detection threshold (to, for example, 75); if the number of false-positive wakeword detections is low (e.g., less than one per minute),” (Yavagal, 18:23-32).)
[after achieving a predetermined false-alarm rate] by the first processing device, subsequently bypassing waking up the second processing device and [providing the segment of the input stream to the ASR system without waking up the second processing device.]   (“the device 110 may select and use wakeword-detection and/or speaker-identification models for either or both of the first and second speech-processing components based on the device-use data. The device 110 may further determine and use a wakeword-detection threshold” (Yavagal, 4:8-14).  Further, “The wakeword-detection sensitivity of the wakeword detector 412 and/or second-stage classifier 416 may be altered in response to determining device-use data corresponding to the device 110 and determining a wakeword detection parameter corresponding to the device-use data” (Yavagal, Fig. 2 and related text, 13:33-37).  Yavagal uses either the first or the second speech processor for the keyword-detection models and, when only the first is used, bypasses waking up the second.)

Yavagal does not explicitly teach: after achieving a predetermined false-alarm rate […]

However, in an analogous art of voice control during low-power mode, Wightman teaches: after achieving a predetermined false-alarm rate […]  (“If, at step 812, it is determined that the similarity value is greater than the similarity threshold value, then process 800 may proceed to step 818 where speech recognition processing may be caused to stop.” (Wightman, Fig. 8 and related text, 31:59-62).  Further related to multiple processors, “The electronic system of claim 14, wherein: the at least one processor comprises at least one first processor associated with at least one first electronic device and at least one second processor associated with at least one second electronic device; the at least one first processor is operable to: receive the first instance of the audio data determine that the plurality of additional instances of audio data representing the first sound are also received, determine the number of the instances of audio data, determine that the number of instances is greater than the threshold value, generate the first sound profile, and cause a communication to be sent to the second electronic device that causes the first sound profile to be stored; and the at least one second processor is operable to: generate the second sound profile of the second sound, determine that the similarity value of the second sound profile and the first sound profile is greater than the similarity threshold value, and based at least in part on the determination that the similarity value is greater than the similarity threshold value [achieving predetermined false-alarm rate], refrain from causing the at least some automated speech recognition processing to be performed for the second audio data.” (Wightman, 37:10-38:14).  Wightman thereby determines that a secondary, more intensive speech recognition processor is not needed when a low enough falseness likelihood (false-alarm rate) is achieved by the first, lower-power processing.)
It would have been obvious to combine Yavagal with Wightman in order to improve Yavagal’s Wakeword Detection system by enabling mobile device 110 to achieve a predetermined false-alarm rate  in the low power mode without needing to enable the secondary, more power-hungry processor.  The embodiment of providing low error rates in speech comparison processing while maintaining relatively low power operation by keeping a secondary processor asleep, is both taught and desired in the art. 

Yavagal in combination with Wightman does not explicitly teach: […] providing the segment of the input stream to the ASR system without waking up the second processing device.

However, in an analogous art of voice control during low-power mode, Gunn teaches: […] providing the segment of the input stream to the ASR system without waking up the second processing device.  (“the low-power processing 103 takes over some operations of the application processor 101 and may execute at least a reduced code version of the voice recognition engine 205 executable code. In other embodiments, the speech segment monitor 207 takes over when the application processor 101 goes into sleep mode, and provides a limited voice recognition capability”  (Gunn, 7:32-41).  Gunn uses the ASR system on the low-power processor without waking up the second, more powerful application processor which typically would enable access to the ASR system.  Gunn thereby teaches the first processing device configured to bypass waking up the second and to provide the voice segment directly to the ASR system.)
It would have been obvious to combine Yavagal and Wightman with Gunn in order to improve Yavagal’s Wakeword Detection system by enabling mobile device 110 to operate more fully with the ASR function in the low power mode without the secondary processor activated.  This combination is taught by Gunn: “any function that may be performed while a device is in low-power operating mode may benefit from the embodiments herein described as may occur to those skilled in the art.”  Therefore, the embodiment of providing voice recognition capability when the application processor, or secondary processor, goes to sleep, is both taught and desired in the art. 




Conclusion
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.  
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PIERCE ANDREW MOONEY whose telephone number is (571)272-0971. The examiner can normally be reached Monday-Friday 8:30am-5:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like 




/PIERCE ANDREW MOONEY/Examiner, Art Unit 2657

/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657