DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
In response to the Office Action mailed 3/10/2021, applicant has submitted an amendment filed 6/10/2021.
Claims 21, 24, 27, 30, 31, 33, 34-38, and 40 have been amended.  Claims 1-20, 29, 32, and 39 have been cancelled.
Response to Arguments
	As per Claim 21, Applicant’s amendments did not incorporate what was previously indicated to be allowable subject matter (the combination of all limitations in claims 21 and 29) into claim 21, particularly because Applicant altered what was previously recited to be 
“wherein the capture of the second samples of the audio data at the first intervals comprises the audio interface sampling the second samples of the audio data at a first sample rate and the detection by the speech onset detector of speech onset in the first sample of the audio data comprises the audio interface sampling the first sample of the audio data at a second sample rate, wherein the first sample rate is greater than the second sample rate” 
to currently recite 
“wherein the capturing of the second samples of the audio data at the second intervals (NOT at the first intervals) comprises the audio interface sampling the second samples of the audio data at a second sample rate (NOT the first sample rate) and the detection by the speech onset detector of speech onset in the first sample of the audio data is based on the sampling by the audio interface of the first sample of the audio data at a first sample rate (i.e. a rate lower than the rate at which the second samples were sampled at the first intervals), wherein the second sample rate is greater than the first sample rate”
	More specifically, what Applicant currently claims in claim 21 is where the sampling of second samples of the audio data after detecting speech onset is at a rate which is higher than the rate at which the first sample of the audio data is sampled before/as part of detecting speech onset.  This is not unusual and is suggested based on the previous prior art rejection because Rossum suggests where, responsive to detecting vocalization after a period of non-speech, the “audio processing device” is changed from sampling at a lower internal oscillator sampling rate to sampling at a higher DSP-provided CLK312 rate (see Office Action mailed 3/10/2021, pages 15-17).
	In contrast, previously presented claim 29 required that the second samples of the audio data captured at the first intervals before detecting speech onset (“wherein responsive to detection by the speech onset detector of speech onset in the first sample of the audio data, the audio interface control is operable to switch the audio processing device from capturing second samples of the audio data at first intervals, to capturing the second samples of the audio data at second intervals”) be sampled at a faster rate than the rate at which the first sample of the audio data (in which speech onset is detected) is sampled.  Therefore, what was previously claimed in claim 29 is where the first sample is sampled at an even slower rate than the rate at which the second samples were sampled before detecting speech onset.
	Also put another way:
	Current claim 21 requires: sampling rate of first samples before onset detection < sampling rate of second samples after onset detection
	Previous claim 29 required: sampling rate of first samples before onset detection < sampling rate of second samples before onset detection < sampling rate of second samples after onset detection (because the second intervals used to sample the second samples after onset detection are shorter than the first intervals used to sample the second samples before onset detection)
	Therefore, what is currently claimed in claim 21 is not directed to the allowable subject matter previously indicated for previously presented claim 29, and the reasons for indicating allowable subject matter that pertained to claim 29 do not apply to claim 21.
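	For illustration only, the two orderings compared above can be checked mechanically. The following Python sketch is not part of the record: the function names and the rate values (in Hz) are hypothetical, chosen only to show that a configuration can satisfy the ordering of current claim 21 without satisfying the stricter ordering of previously presented claim 29.

```python
def satisfies_current_claim_21(first_rate, second_rate_after_onset):
    """Ordering of current claim 21: first-sample rate < post-onset rate."""
    return first_rate < second_rate_after_onset


def satisfies_previous_claim_29(first_rate, second_rate_before_onset,
                                second_rate_after_onset):
    """Ordering of previous claim 29: three strictly increasing rates."""
    return first_rate < second_rate_before_onset < second_rate_after_onset


# Hypothetical device that uses one lower rate for everything before onset
# and one higher rate after onset (values are assumptions, not from any reference).
first_rate = 8_000
second_before = 8_000
second_after = 16_000

meets_21 = satisfies_current_claim_21(first_rate, second_after)                   # True
meets_29 = satisfies_previous_claim_29(first_rate, second_before, second_after)   # False
```

Under these assumed values the single lower pre-onset rate meets the two-rate ordering but not the three-rate ordering, which is the distinction drawn above.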
	Therefore, a prior art rejection similar to what was previously applied to reject claim 21, adjusted as necessitated by the amendments to claim 21, is presented below.

	Additionally, since this amended subject matter is also suggested by the claims of Parent Patent 1 (US Patent 10,332,543), Claim 21, as amended, is still rejected under Double Patenting, and new Double Patenting rejections (adjusted as necessary for the amendments made to the claims) are presented below.


	After further search and consideration, claim 25 is also able to be rejected based on prior art, and a new prior art rejection of claim 25 necessitated by amendment (because claim 25 no longer “inherits” the allowable subject matter of previously presented claim 24) is presented below.

	New Double Patenting rejections of claims 24-25, 27, 31, 33-34, 40 are also necessitated by amendment (these claims were not previously rejected because they were directed to new matter, and as currently claimed they are suggested/taught by the claims of Parent Patent 1)

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are:
The “audio interface”, “speech onset detector”, “combiner”, and “audio interface control” in claim 21.

The “communication interface”, “audio interface”, “speech onset detector”, “combiner”, “wake-up phrase detector”, and “audio interface control” in claim 36.
“the audio interface”, “the speech onset detector” and “a threshold computation module” in claims 38-40.
All of the names of these elements are functional words (i.e. functional element names are both generic placeholders and functional language), and all descriptions of these elements are functional, and there is not sufficient structure in the claim to perform the claimed functions.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Objections
	Nothing is linguistically wrong with claim 23, but claim 21 recites “at least one portion of the second samples of the audio data captured at the first intervals” which is the antecedent basis for “the at least one captured portion of the second samples of the audio data captured at the first intervals” in claim 23.  Applicant may, at applicant’s discretion, amend claim 23 to recite –the at least one portion of the second samples of the audio data captured at the first intervals—(delete “captured” in “at least one captured portion”) for language consistency or just having fewer words in claim 23.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 21-28 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Claim 21 recites “the sampling by the audio interface of the first sample of the audio data at a first sample rate”, which, as an entire phrase, lacks antecedent basis.
Claim 21 recites “wherein the audio interface is operable to provide a first sample of the audio data to the speech onset detector” and also “an audio interface operable to at a first sample rate.

The dependent claims include the issues of their respective parent claims.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 21 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Rossum et al. (US 2016/0196838), hereafter Rossum.

As per Claim 21, Rossum suggests An audio processing device, comprising: an audio interface operable to sample audio data; a speech onset detector; a combiner; and an audio interface control, wherein the audio interface is operable to provide a first sample of the audio data to the speech onset detector, wherein responsive to detection by the speech onset detector of speech onset in the first sample of the audio data, the audio interface control is operable to switch the audio processing device from capturing second samples of the audio data at first intervals, to capturing the second samples of the audio data at second intervals, wherein each second interval is shorter than each first interval, and wherein the combiner is operable to provide contiguous audio data using at least one portion of the second samples of the audio data captured at the first intervals and the second samples of the audio data captured at the second intervals, wherein the capturing of the second samples of the audio data at the second intervals comprises the audio interface sampling the second samples of the audio data at a second sample rate and the detection by the speech onset detector of speech onset in the first sample of the audio data is based on the sampling by the audio interface of the first sample of the audio data at a first sample rate , wherein the second sample rate is greater than the first sample rate  (Figures 1-3; paragraphs 3-6, 9, 23, 26, 29, 33, 34, 36, 37, 38, 39, 40, 41, 45, 46;
“An audio processing device,”: Paragraph 36 describes a system which includes at least one microphone [also referred to by DMIC] coupled to either an external or host DSP, which suggests an embodiment where “host” DSP refers to a DSP that is part of the same device as the microphone[s] [since “host” is described as an alternative to “external” which suggests where a “host DSP” is an “internal” component].  Figure 2 depicts microphone[s] and a “processor 210” which is part of the same device, and paragraph 33 describes where the processor 210, in one example, includes a DSP.  These portions suggest “An audio processing device” [a device including the DMIC and the DSP]
“comprising: an audio interface operable to sample audio data;”: Figure 3 and paragraph 36 describe where the DMIC/microphone 120 [part of “An audio processing 
“a speech onset detector;”: Figure 3 depicts where the DMIC/microphone 120 [part of “An audio processing device” including the DMIC/microphone and a DSP, suggested by paragraphs 33, 36, and Figures 2-3 as discussed above] includes a vocalization detector 320.  Paragraph 29 describes where vocalization detection is synonymous with “voice activity detection” [which conventionally/commonly in the art refers to detecting the presence or absence of speech] and buffering audio data significantly prior to the vocalization detection.  Paragraph 37 describes actions done “prior to the vocalization detection” and paragraph 38 describes where certain actions are done “when the DMIC 120 detects a vocalization”.  These paragraphs suggest where detecting “vocalization” detects presence of speech [at least because it would be unusual to call absence of speech “vocalization”], where no vocalization [i.e. VAD detection detecting no presence of speech] is detected for a period of time “prior to the vocalization detection” such that “when the DMIC… detects a vocalization”, the DMIC [particularly the vocalization detector 320 of the DMIC] is detecting speech after a period of non-speech [i.e. the vocalization detector 320 detects a “speech onset”].  These portions suggest where the “audio processing device” [device including the DMIC 
“a buffer;”: Figure 3 depicts where the DMIC/microphone 120 [part of “An audio processing device” including the DMIC/microphone and a DSP, suggested by paragraphs 33, 36, and Figures 2-3 as discussed above] includes a buffer 310 [i.e. such that the “audio processing device” comprises “a buffer”].
“a combiner;”: As discussed above, paragraphs 33, 36 and Figure 2-3 suggest “An audio processing device” [a device including the DMIC and the DSP].  Paragraph 41 describes where one of the functions of the DSP is to “pre-pend” buffered data to real-time audio data [at least suggested to be combining the buffered data with the real-time audio data].  These portions thus suggest where the “audio processing device” comprises “a combiner” [i.e. a portion of the DSP that pre-pends/combines buffered data to real-time audio data]
“and an audio interface control”: As discussed above, paragraphs 33, 36 and Figure 2-3 suggest “An audio processing device” [a device including the DMIC and the DSP].  Also, as discussed above, the combination of the transducer 302, the amplifier 304, the A/D converter 306 and the PDM 308 can be interpreted as “an audio interface operable to sample audio data” [as suggested by paragraphs 9, 36, 37, and Figure 3].  Paragraphs 38-39 describe where the DSP, among other things, outputs a clock on the CLK line appropriate for receiving real-time PDM 308 audio data from the DMIC 120, and where the DMIC 120 responds to the CLK input 312 by switching from the internal sample rate to the sample rate of the provided clock line.  These portions suggest where the “audio processing device” comprises “an audio interface control” [a portion of 
“wherein the audio interface is operable to provide a first sample of the audio data to the speech onset detector,”: As discussed above [see portions of this rejection directed to “an audio interface” and “a speech onset detector”, incorporated here by reference], the combination of elements 302, 304, 306, and 308 in Figure 3 is interpreted as “the audio interface” [Figures 2-3; paragraphs 9, 33, 36, 37] and Rossum suggests where the vocalization detector 320 of Figure 3 is “the speech onset detector” [Figures 2-3, paragraphs 29, 33, 36-38], and “audio data” is interpreted as all audio data received by the DMIC/microphone over time from which the buffered audio data “samples” were acquired by sampling at a sampling rate.  Paragraph 37 describes where the DMIC analyzes audio data to determine whether a vocalization has occurred, which, together with Figure 3, suggests “wherein the audio interface is operable to provide a first sample of the audio data to the speech onset detector,” [where the transducer+amplifier+A/D-converter+PDM “audio interface” provides a “first” portion/”sample” of all audio data received by the DMIC/microphone over time to the vocalization detector so that the vocalization detector can determine whether a vocalization has occurred in the “first” portion/”sample”]
“wherein responsive to detection by the speech onset detector of speech onset in the first sample of the audio data, the audio interface control is operable to switch the audio processing device from capturing second samples of the audio data at first intervals, to capturing the second samples of the audio data at second intervals, wherein each second interval is shorter than each first interval,”: As discussed above 
“and wherein the combiner is operable to provide contiguous audio data using at least one portion of the second samples of the audio data captured at the first intervals and the second samples of the audio data captured at the second intervals”: As discussed above [see portion of this rejection directed to “a combiner;”, and “wherein responsive to detection by the speech onset detector of speech onset in the first sample of the audio data, the audio interface control is operable to switch the audio processing device from capturing second samples of the audio data at first intervals, to capturing the second samples of the audio data at second intervals, wherein each second interval is shorter than each first interval,”, incorporated here by reference] Rossum suggests 
“wherein the capturing of the second samples of the audio data at the second intervals comprises the audio interface sampling the second samples of the audio data at a second sample rate and the detection by the speech onset detector of speech onset in the first sample of the audio data is based on the sampling by the audio interface of the first sample of the audio data at a first sample rate, wherein the second sample rate is greater than the first sample rate”: Figures 2-3, paragraphs 9, 33, 36-39, 45-46; the transducer+amplifier+A/D-converter+PDM “audio interface” “captures”/”samples” the “second” samples of all “audio data” received by the DMIC/microphone over time which are not the “first” sample provided to the vocalization detector at a higher DSP-provided CLK line 312 sample rate [“at the second intervals” and “at a second” DSP-provided CLK line 312 “sample rate”] and the “first” audio data 

As per Claim 24, Rossum suggests wherein the audio processing device is operable to capture the second samples of the audio data periodically according to the first intervals and capture the second samples of the audio data continuously according to the second intervals (Figures 1-3; paragraphs 3-6, 9, 23, 26, 29, 33, 34, 36, 37, 38, 39, 40, 41, 45, 46;
Figures 2-3, paragraphs 9, 33, 36-39, 45-46; The “audio processing device” [a device including the DMIC and the DSP] captures the “second” samples of all “audio data” received by the DMIC/microphone over time which are not the “first” sample provided to the vocalization detector, where, prior to vocalization detection, “the second samples” are “captured”/sampled “periodically according to the first intervals” at the internal oscillator sample rate [where sampling at a rate samples periodically once every time interval/sampling-period] and where, after vocalization detection, “the second samples” are “captured”/sampled “continuously according to the second intervals” at the DSP-provided CLK line 312 sample rate [where sampling at a rate samples again and again, i.e. “continuously”, once every time interval/sampling-period])
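For illustration only, the interval switching discussed above (one capture per longer first interval before vocalization detection, one capture per shorter second interval afterward) can be sketched as follows. This Python sketch is not taken from Rossum or from the application: the function name `capture_ticks`, the integer tick timeline, and all numeric values are hypothetical.

```python
def capture_ticks(onset_tick, end_tick, first_interval, second_interval):
    """Return the tick indices at which samples are captured.

    Captures are spaced first_interval ticks apart until a capture lands at
    or after onset_tick; from then on they are spaced second_interval apart.
    """
    if not 0 < second_interval < first_interval:
        raise ValueError("second interval must be positive and shorter than the first")
    ticks = []
    t = 0
    while t < end_tick:
        ticks.append(t)
        # Spacing to the next capture depends on whether onset has been reached.
        t += first_interval if t < onset_tick else second_interval
    return ticks


# Hypothetical values: onset detected at tick 8, capture window ends at tick 12.
ticks = capture_ticks(onset_tick=8, end_tick=12, first_interval=4, second_interval=1)
# ticks == [0, 4, 8, 9, 10, 11]
```

In this sketch both phases sample once per interval; only the interval length changes at onset, so the post-onset phase corresponds to the higher sample rate.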

Claims 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Rossum, as applied to Claim 21, above, and further in view of Scott et al. (US 9,398,367), hereafter Scott.

As per Claim 22, Rossum suggests a… detector operable to process the contiguous audio data to recognize a… in the second samples of the audio data captured at the second intervals (Figures 1-3; paragraphs 3-6, 9, 23, 26, 29, 33, 34, 36, 37, 38, 39, 40, 41, 45, 46;
Paragraph 41 describes where “The buffered data may be pre-pended to the real-time audio data for the purposes of keyword recognition” [where buffered data sampled at an internal oscillator sampling rate pre-pended to the real-time audio data sampled at a DSP-provided CLK sampling rate is interpreted as “the contiguous audio data”, as discussed in the rejection of claim 21].  Paragraphs 3-5 describe where keyword recognition follows vocalization detection and includes examining an utterance and results in a keyword match or no match, where vocalization detection determines whether a person begins to utter a possible keyword.  Paragraph 6 also describes where a DSP is used to perform computations for detecting keywords.  Also, as discussed in the rejection of claim 21, paragraph 37 describes actions done “prior to the vocalization detection” and paragraph 38 describes where certain actions are done “when the DMIC 120 detects a vocalization”.  These paragraphs suggest where detecting “vocalization” detects presence of speech [at least because it would be unusual to call absence of speech “vocalization”], where no vocalization [i.e. VAD detection detecting no presence of speech] is detected for a period of time “prior to the vocalization detection” such that “when the DMIC… detects a vocalization”, the DMIC [particularly the vocalization detector 320 of the DMIC] is detecting speech after a period of non-speech [i.e. the vocalization detector 320 detects a “speech onset”].  
The portions discussed in the previous paragraph further suggest where an utterance of a keyword is in the real-time audio data sampled at the CLK sampling rate [because the buffered data is sampled “prior to vocalization” which is suggested to be 
These portions suggest “a… detector operable to process the contiguous audio data to recognize a… in the second samples of the audio data captured at the second intervals” [the-portion-of-the-DSP-that-performs-keyword-recognition/keyword-“detector” performs-keyword-recognition-on/”processes” the-buffered-audio-data-pre-pended-to-the-real-time-audio-data/”contiguous audio data” to recognize a keyword in the real-time-audio-data-sampled-at-the-DSP-provided-CLK-sampling-rate/”the second samples of the audio data captured at the second intervals”])
Rossum does not, but Scott suggests a wake-up phrase detector operable to process the contiguous audio data to recognize a wake phrase in the second samples of the audio data captured at the second intervals (col. 4, lines 4-22; col. 5, lines 4-27; col. 5, line 61 – col. 6, line 17;
Col. 4, lines 4-22 describes where an aural cue can include a “specific keyword[s]” and also where keyword[s] may be referred to as a “wakephrase”.  Col. 5, 
Scott thus suggests where a keyword recognized by the keyword recognizer/“detector” in “the contiguous audio” [as suggested by Rossum, as discussed above] is more specifically a wakephrase/”wake-up phrase”/”wake phrase” [such that the keyword recognizer/”detector” is more specifically a “wake-up phrase detector”])
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of keyword with another because the prior art teaches the claimed invention except for the substitution of a keyword which is not necessarily a wake-up phrase with a keyword which is.  Scott teaches that a keyword which is a wake-up phrase was known in the art.  One of ordinary skill in the art could have substituted one type of keyword with another to obtain the predictable results of a device which samples audio data at an internal oscillator sampling rate, and which, in response to detecting vocalization, samples audio data at a DSP-provided CLK sampling rate, which pre-pends buffered audio data sampled at the internal oscillator sampling rate to real-time audio data sampled at the DSP-provided CLK sampling rate for the purposes of keyword recognition, which performs keyword recognition, and which can send speech/audio wirelessly (as per Rossum) where the keyword recognition recognizes a wakephrase (as per Scott).
	
As per Claim 23, Rossum suggests wherein the… detector is operable to recognize the… using the at least one captured portion of the second samples of the audio data captured at the first intervals (Figures 1-3; paragraphs 3-6, 9, 23, 26, 29, 33, 34, 36, 37, 38, 39, 40, 41, 45, 46;
Rossum suggests “a… detector operable to process the contiguous audio data to recognize a… in the second samples of the audio data captured at the second intervals” as discussed in the rejection of claim 22 [the-portion-of-the-DSP-that-performs-keyword-recognition/keyword-“detector” performs-keyword-recognition-on/”processes” the-buffered-audio-data-pre-pended-to-the-real-time-audio-data/”contiguous audio data” to recognize a keyword in the real-time-audio-data-sampled-at-the-DSP-provided-CLK-sampling-rate/”the second samples of the audio data captured at the second intervals”]
Paragraph 41 describes where “The buffered data may be pre-pended to the real-time audio data for the purposes of keyword recognition” which, in addition to suggesting where the keyword recognition recognizes a keyword in the real-time audio data [since the real-time data is suggested to be sampled after vocalization/”onset”] also suggests “wherein the… detector is operable to recognize the… using the at least one captured portion of the second samples of the audio data captured at the first intervals” [i.e. where the keyword recognition performed by the portion of the DSP that performs keyword recognition recognizes the keyword in the real-time audio data based, in part, on the buffered audio data, which is suggested by the pre-pending because if the buffered audio was not used to recognize the keyword, then there would be no point to pre-pending the buffered audio data “for the purposes of keyword recognition”])
Rossum does not, but Scott suggests wherein the wake-up phrase detector is operable to recognize the wake phrase using the at least one captured portion of the second samples of the audio data captured at the first intervals (col. 4, lines 4-22; col. 5, lines 4-27; col. 5, line 61 – col. 6, line 17;
Same combination as discussed in the rejection of claim 22, where the keyword recognized in the real-time audio data is more specifically a wakephrase [such that the recognized keyword is more specifically a “wake phrase” and such that the keyword “detector” is more specifically a “wake-up phrase detector”])
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of keyword with another because the prior art teaches the claimed invention except for the substitution of a keyword which is not necessarily a wake-up phrase with a keyword which is.  Scott teaches that a keyword which is a wake-up phrase was known in the art.  One of ordinary skill in the art could have substituted one type of keyword with another to obtain the predictable results of a device which samples audio data at an internal oscillator sampling rate, and which, in response to detecting vocalization, samples audio data at a DSP-provided CLK sampling rate, which pre-pends buffered audio data sampled at the internal oscillator sampling rate to real-time audio data sampled at the DSP-provided CLK sampling rate for the purposes of keyword recognition, which performs keyword recognition, and which can send speech/audio wirelessly (as per Rossum) where the keyword recognition recognizes a wakephrase (as per Scott).
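For illustration only, the pre-pending arrangement relied upon above (audio buffered at a lower rate in a recirculating memory, matched to the higher rate, and pre-pended to real-time audio) can be sketched as follows. This Python sketch is not Rossum's implementation: the helper names, the buffer length, the rate values, and the sample-and-hold rate matching are all assumptions made for the example, since the cited passages do not specify a conversion method.

```python
from collections import deque


def upsample_repeat(samples, factor):
    """Sample-and-hold upsampling by an integer factor (assumed method)."""
    return [s for s in samples for _ in range(factor)]


def contiguous_audio(buffered, realtime, internal_rate, clk_rate):
    """Pre-pend rate-matched buffered samples to the real-time samples."""
    if clk_rate % internal_rate != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    factor = clk_rate // internal_rate
    return upsample_repeat(buffered, factor) + list(realtime)


# Recirculating buffer: a fixed-length deque keeps only the newest samples.
ring = deque(maxlen=4)
for sample in [1, 2, 3, 4, 5, 6]:   # hypothetical pre-detection samples
    ring.append(sample)

# Hypothetical rates: 8 kHz buffered audio, 16 kHz real-time audio.
combined = contiguous_audio(ring, [7, 8, 9], internal_rate=8_000, clk_rate=16_000)
# combined == [3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9]
```

A keyword or wakephrase recognizer would then operate on the combined sequence; the recognizer itself is outside the scope of this sketch.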

Claim 25 is rejected under 35 U.S.C. 103 as being unpatentable over Rossum, as applied to Claim 24, above, and further in view of Mok et al. (US 10,847,149), hereafter Mok.

As per Claim 25, Rossum suggests a buffer operable to store the second samples of the audio data captured at the first intervals, wherein the buffer is operable to operate… during each first interval (Figures 1-3; paragraphs 3-6, 9, 23, 26, 29, 33, 34, 36, 37, 38, 39, 40, 41, 45, 46;
Figure 3 depicts where the DMIC/microphone 120 [part of “An audio processing device” including the DMIC/microphone and a DSP, suggested by paragraphs 33, 36, and Figures 2-3 as discussed above] includes a buffer 310 [i.e. such that the “audio processing device” comprises “a buffer”].  
Paragraph 37 describes where, prior to vocalization, the DMIC, among other things, buffers audio data into a recirculating memory using buffer 310 and paragraph 9 describes where buffered audio data was previously acquired at a sample rate determined by the internal oscillator [which suggests that the buffered data is captured at the internal oscillator sample rate].  Paragraphs 38-40 and 45, as discussed above, describe switching to a DSP-provided CLK line 312 sample rate which is higher than the internal sample rate, and then providing both real-time PDM data and buffered PDM data on at least one channel of DATA output 314 [depicted in Figure 3 as going in the direction of DSP].  Paragraph 41 describes pre-pending buffered data to the real-time audio data.  It is at least suggested that the real-time audio data is sampled at the CLK line 312 rate [because paragraph 41 describes that the buffered audio data is converted to the same rate as the host CLK sample rate as an example of processing the buffered data in a manner matching the buffered data to the real-time audio data, and paragraph 39 describes a sample rate of the provided CLK line 312].  

Also of relevance to the combination applied to reject claim 25:
Paragraph 41 describes where “The buffered data may be pre-pended to the real-time audio data for the purposes of keyword recognition” [where buffered data sampled at an internal oscillator sampling rate pre-pended to the real-time audio data sampled at a DSP-provided CLK sampling rate is interpreted as “the contiguous audio data”, as discussed in the rejection of claim 21].  Paragraphs 3-5 describe where keyword recognition follows vocalization detection and includes examining an utterance and results in a keyword match or no match, where vocalization detection determines whether a person begins to utter a possible keyword.  Paragraph 6 also describes where a DSP is used to perform computations for detecting keywords.  Also, as discussed in the rejection of claim 21, paragraph 37 describes actions done “prior to the vocalization detection” and paragraph 38 describes where certain actions are done 
The portions discussed in the previous paragraph further suggest where an utterance of a keyword is in the real-time audio data sampled at the CLK sampling rate [because the buffered data is sampled “prior to vocalization” which is suggested to be during a period of non-speech, and because the real-time audio data is sampled at the DSP-provided CLK sampling rate in response to detecting vocalization, such that the real-time audio data is suggested to include speech following the “onset” that caused the vocalization detector to detect vocalization] and where the DSP includes a portion that performs keyword recognition [keyword recognizer/”detector”] that recognizes a keyword in “the contiguous audio data” [since keyword recognition is performed on speech, the speech is suggested to be in the real-time audio data as just discussed, and the buffered data is pre-pended to the real-time audio data which suggests that the audio data analyzed to recognize a keyword is the buffered data pre-pended to the real-time audio data]
What was discussed in the previous 2 paragraphs further at least suggests where the buffering of audio data sampled at an internal oscillator sampling rate precedes keyword recognition [because keyword recognition logically cannot be 
Rossum does not, but Mok suggests a buffer operable to store the second samples of the audio data captured at the first intervals, wherein the buffer is operable to operate in a sleep mode during each first interval (col. 2, lines 44-59;
Mok [col. 2, lines 44-59] describes where a user device remains in sleep mode until the device “detects speech corresponding to a keyword” [at least suggested to be recognizing a keyword, such as an “Alexa” wakeword], and also describes where “While in sleep mode, the device may continuously buffer and process captured audio to detect speech corresponding to the keyword” [similar to how Rossum buffers audio data prior to keyword recognition].
Mok thus suggests “a buffer operable to store the second samples of the audio data captured at the first intervals, wherein the buffer is operable to operate in a sleep mode during each first interval”: where Rossum’s “audio processing device” [including the DMIC/microphone and a DSP, and including, in the “audio processing device’s” DMIC/microphone, the buffer that is suggested to store “second” samples that are captured/sampled at the internal oscillator sample rate prior to detecting vocalization and prior to keyword recognition] is in a sleep mode prior to keyword recognition, such that the buffer “operates” as-part-of/”in” the “audio processing device’s” “sleep mode” and “during each first interval” [in order to capture/sample the “second” samples that are captured/sampled at the internal oscillator sample rate/”at the first intervals”])
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of device that buffers .
	
Allowable Subject Matter
The following is a statement of reasons for the indication of allowable subject matter:  

As per Claim 26 (and similarly claims 30 and 36, and consequently claims 27-28, 31, 33-35, 37-38, and 40 which depend directly/indirectly on claims 26, 30, and 36) the prior art of record does not teach or suggest the combination of all limitations in claim(s) 21 and 26, including (i.e. in combination with the remaining limitations in claim[s] 21 and 26) wherein the audio interface is operable to provide the first sample of the audio data to the speech onset detector responsive to sound waves meeting or exceeding a threshold activity level, wherein the audio data is representative of the sound waves (additionally, for claim 30, even assuming providing the audio including a wakeword in Mutagi could be interpreted as a different “providing the first sample of the audio data”, it would not be provided “responsive to sound waves meeting or exceeding a threshold activity level” because the providing/transmitting-to-a-server in Mutagi is performed responsive to keyword/wakeword recognition).
2015/0340042 describes one or more VAD processing stages has determined that the acoustic input likely contains speech” (paragraph 77).  Paragraph 48 describes “When the voice system is enabled to monitor the acoustic environment, act 210 may be performed continuously or periodically at a frequency that provides an appearance of continuous monitoring, even though it may not be strictly continuous”.  This reference also describes where a subsequent stage of a plurality of processing stages is performed only if one or more previous processing stages is unable to conclude that the acoustic input corresponds to spurious acoustic activity (paragraph 34).  Paragraph 83 describes where one or more VAD processing stages may be performed in sequence, including a low level amplitude check performed as a threshold inquiry, followed by evaluations of audio signal characteristics if the amplitude is sufficient to suggest it may be associated with voice.  This reference does not appear to specifically teach that one stage provides the audio input to the next stage (as opposed to applying VAD analysis one after the other on the same audio signal in the same “location”, see e.g. Figure 4 where the first stage and a second first stage are part of the same box) when sound waves meet or exceed the threshold activity level, where it would not necessarily be obvious to add, to Rossum’s DMIC, a low level amplitude check at a sound wave level immediately prior to the vocalization detector (because at this point the audio is already converted into digital samples) or before the A/D conversion.  Additionally, it would not be obvious to add the thresholding that triggers collecting and processing of acoustic input (in paragraph 48) to Rossum because one of the points of Rossum is to buffer audio periodically, and to the extent that triggering collecting and processing is collecting and processing continuously, this is the function performed by Rossum based on the vocalization detector.  
Paragraph 115 describes where low battery may be used to discourage “passing acoustic information on for further processing such that additional power consumption is incurred only in situations where the confidence is very high that the acoustic input includes a voice command” (suggesting where a threshold determination may lead to “passing acoustic information on for further processing”) but this does not appear to describe where the acoustic input is passed on to a speech onset detector (as opposed to passing on to speech recognition or something else other than another voice activity detector).  Paragraph 161 similarly describes performing one or more VAD techniques that determine whether acoustic input includes a voice command to assess whether the received signal merits passing on to further “processing stages”.  Paragraph 13 also describes where two processing stages performed on the same acoustic input to evaluate whether the acoustic input includes a voice command are performed by two different processors, which suggests providing, “if further processing is needed”, the audio input to another processor after one processor processes the audio input.  all audio data”
2014/0244273 teaches “One or more stages of voice activity detection (VAD) can be used” (paragraph 34).  This reference does not appear to specifically teach that one stage provides the audio input to the next stage.
2008/0040109 teaches multiple VAD stages (first, second and third circuits) and associated VAD decisions in each stage (paragraphs 90-91) but these stages appear to operate on different portions of an audio signal (Figure 4).
2014/0278435 describes “performing one or more voice activity detection (VAD) processing stages that evaluate whether the acoustic input has the characteristics of voice/speech or whether the acoustic input is more likely the result of non-voice acoustic activity in the environment. VAD techniques refer generally to those that analyze one or more properties or characteristics of acoustic input (e.g., signal characteristics of acoustic input) to evaluate whether the one or more properties/characteristics are suggestive of speech, some techniques of which are described in further detail below” (paragraph 53) and “wherein performing the at least one first processing stage includes performing at least one voice activity detection processing stage including performing at least one of spectral analysis on the acoustic input to evaluate whether the spectrum of the acoustic input is indicative of voice activity, periodicity analysis to evaluate whether the signal periodicity is indicative of 
and similarly 2015/0340042 teaches “performing the at least one voice activity detection processing stage comprises performing spectral analysis on the acoustic input to evaluate whether a spectrum of the acoustic input is indicative of voice activity, performing periodicity analysis to evaluate whether signal periodicity of the acoustic input is indicative of voice activity and/or using phone loops to evaluate whether the acoustic input includes speech” (claim 28).  It is not clear if performing multiple stages of voice activity detection involves the same component performing multiple analyses or multiple components sequentially performing respective analyses (while passing audio data from one component to the next), and additionally it is not clear that one of the stages is a sound wave threshold comparison and where the audio is sent from the sound wave threshold comparison to a speech onset detection
2002/0138255 teaches two stages of voice activity detecting (paragraph 134; Figure 3; where one stage receives an input from another stage [element 22 receives an input from element 21]).  This reference does not appear to specifically teach that one stage provides the audio input to the next stage.
2014/0122078 teaches “When voice activity above the preset threshold level is detected in the audio input, the parts of the speech having the voice activity in them are then propagated to the feature creator 116. For example a command phrase like 
9478231 teaches transmitting “an interrupt signal to the DSP/CPU core… in response to detected sound energy” including in one example “the interrupt signal may be output if a filtered sound sample is above a threshold value” and “The threshold value may be dynamically updated”  
9484030 teaches “In the context of speech processing, if a specific sound is a "wakeword," once the wakeword is detected, the local device 110 may "wake" and begin transmitting audio data 111 corresponding to input audio 11 to the server(s) 120 for speech processing. Further, a local device 110 may "wake" upon detection of speech/spoken audio above a threshold, as described herein. Audio data corresponding to that audio may be sent to a server 120 for routing to a recipient device or may be sent to the server for speech processing for interpretation of the included speech (either for purposes of enabling voice-communications and/or for purposes of executing a command in the speech). Thus, upon receipt by the server(s) 120, an ASR module 250 may convert the audio data 111 into text”.  This describes providing audio to a server, in response to detection of speech/spoken audio above a threshold, for speech processing to interpret the included speech, whereas claim 26 provides audio to an onset detector when the threshold is exceeded.
5983186 teaches “speech recognition devices and instruments include an input sound signal power or volume detector in communication with a central CPU for providing audio to a speech onset detector is performed in response to the exceeding of the threshold (this reference appears to describe where the speech recognition CPU performs keyword recognition after waking the speech recognition CPU, not onset detection.).
	2015/0051906 teaches where a VAD system is activated when an input signal exceeds a threshold level (“voice activity detector… use a broadband root mean square [RMS] measure of the signal energy… threshold of signal activity.  When the incoming RMS first exceeds this threshold, the VAD/SAD may be activated and the signal blocks may begin being passed to the other possible pre-processing”, paragraph 32)  This is or is not provided to the voice activity detector based on the threshold.
2016/0284363 teaches turning an audio sensor on/off based on a voice activity detection and a threshold, where turning an audio sensor on/off suggests determining whether to provide audio to a further speech processor (“If BSM determines that voice activity… evaluate biosignal data… against the first threshold… audio sensor… may remain in an OFF or low power state until… voice activity is present… limit power consumption by limiting the activity of audio sensor… and, therefore, the activity of a downstream speech recognition system”, paragraph 41).  In this reference, however, the voice activity detector is the device which controls the audio-providing function of the audio sensor, and is not the device whose input of audio is controlled by the threshold.
	5459814 further teaches where a VAD periodically updates a threshold (“VAD periodically monitors and updates the threshold values to reflect changes in the level of background noise”).  This reference, however, does not specifically teach where the VAD is “sleeping” when it is not performing the periodic updates.
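For illustration only, the kind of threshold gating discussed in the references above (an RMS energy check that decides whether a block of audio is passed on to a later stage, with the threshold updated from background noise) can be sketched as follows. This sketch does not reproduce any cited reference or the claimed audio interface; the class name, the smoothing scheme, and all parameter values are assumptions.

```python
import math


def rms(block):
    """Root mean square level of a block of samples (0.0 for an empty block)."""
    if not block:
        return 0.0
    return math.sqrt(sum(x * x for x in block) / len(block))


class ThresholdGate:
    """Hypothetical gate that forwards a block only when its level meets
    or exceeds a threshold, and adapts the threshold on quiet blocks."""

    def __init__(self, threshold, margin=2.0, smoothing=0.9):
        self.threshold = threshold
        self.margin = margin            # threshold = margin * noise floor
        self.smoothing = smoothing      # weight given to the old noise floor
        self.noise_floor = threshold / margin

    def process(self, block):
        level = rms(block)
        if level >= self.threshold:
            # Block would be provided to the next stage (e.g. an onset detector).
            return block
        # Quiet block: fold its level into the noise estimate and
        # recompute the threshold from that estimate.
        self.noise_floor = (self.smoothing * self.noise_floor
                            + (1.0 - self.smoothing) * level)
        self.threshold = self.margin * self.noise_floor
        return None
```

Here `process` returns the block when it would be forwarded and `None` when it is withheld; whether the downstream stage is a speech onset detector, a keyword recognizer, or a server is exactly the distinction drawn among the references above.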

	Upon further search:
2015/0269954 teaches “In an example implementation, the VAD heuristic (block 410) may be a moving threshold value that adapts to the level of background noise. If the MFCC is above the threshold value, the VAD module 130 determines that the sound samples are speech samples and then those samples are forwarded to other modules in the speech pipeline such as a keyword detector module, a speaker detector module, 
CN 103472960 B teaches “when have touch body touch when, surface acoustic wave signal through touch body reflect after reach receive transducer, work as surface When acoustic signals intensity is more than the threshold values for setting, receiving circuit sends to CPU MCU the surface acoustic wave signal, in Central Processing Unit MCU determines to touch the coordinate of body” (see Google Patents machine translation, page 3, paragraph starting with “(3)”)
	10176809 teaches “Another way to compress audio data is to remove portions of the audio data based on a noise level in the audio data, as illustrated in FIG. 10. The speech-controlled device 110 may capture (620) spoken audio corresponding to a spoken query. The speech-controlled device 110 may convert the spoken audio into audio data (not illustrated). The speech-controlled device 110 may then determine (1002) a level of background noise (i.e., a magnitude of audio in the audio data that 
	10297250 teaches “Block 508 illustrates providing, by the voice-controlled device, an audio signal that corresponds to the captured audio to the remote device. Provided that it is determined that the network 118 has sufficient bandwidth (e.g., the amount of bandwidth meets or exceeds the bandwidth threshold), the unprocessed audio signal(s) may be provided to (e.g., streamed, transferred, transmitted, etc.) the remote computing resource(s) 116 for processing”.  This reference is directed to bandwidth meeting/exceeding a threshold, not sound waves meeting/exceeding a threshold.
10649727 teaches “Returning to block 610, if the wake word was detected, such as at or above the threshold confidence level, then at block 614, the process 600 may include formatting the audio data for sending to the remote system. Formatting of the audio data may include removing portions of the audio data not corresponding to the user utterance, performing beamforming on the audio data, and/or one or more other techniques described herein with respect to the audio-data component of the electronic”
10692489 teaches “If a wake gesture is detected (1116:Yes), meaning the wake command module 220 or other component has detected a wake gesture with a sufficiently high confidence (e.g., a confidence above a threshold), the device 110 may capture audio and send audio data to the server(s) 120. The audio may include the original audio received in 1102 or may correspond to new audio data corresponding to audio received during or after the wake gesture”

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 21-28, 30-31, and 33-35 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 14-16 of U.S. Patent No. 10,332,543, hereafter Parent Patent 1. Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of this application are rendered obvious by the claims of Parent Patent 1.

As per Claim 21, Claim 14 of Parent Patent 1 teaches An audio processing device, comprising: an audio interface operable to sample audio data; a speech onset detector; a combiner; and an audio interface control, wherein the audio interface is operable to provide a first sample of the audio data to the speech onset detector (first 6 lines of Claim 14 of Parent Patent 1)
wherein responsive to detection by the speech onset detector of speech onset in the first sample of the audio data, the audio interface control is operable to switch the audio processing device from capturing second samples of the audio data at first intervals, to capturing the second samples of the audio data at second intervals, wherein each second interval is shorter than each first interval (lines 8-14 of Claim 14 of Parent Patent 1, where continuously, by definition, is suggested to be a capturing that is as frequent as possible, such that sampling continuously is suggested to be more frequent and with shorter intervals than the periodic capturing of second samples)
and wherein the combiner is operable to provide contiguous audio data using at least one portion of the second samples of the audio data captured at the first intervals and the second samples of the audio data captured at the second intervals (lines 14-19 of Claim 14 of Parent Patent 1; as discussed in the previous paragraph, periodic capturing is suggested to be capturing at first intervals and continuous capturing is suggested to be capturing at second intervals that are shorter 
wherein the capturing of the second samples of the audio data at the second intervals comprises the audio interface sampling the second samples of the audio data at a second sample rate and the detection by the speech onset detector of speech onset in the first sample of the audio data is based on the sampling by the audio interface of the first sample of the audio data at a first sample rate, wherein the second sample rate is greater than the first sample rate (lines 1-2, 4-14 of Claim 14 of Parent Patent 1; 
The audio interface is described as an element that samples audio data [lines 1-2 of Claim 14 of Parent Patent 1] and as part of the audio processing device [lines 1-2 of Claim 14 of Parent Patent 1]. 
Lines 8-14 describe where “the audio interface control” [suggested to control the “audio interface” that samples audio data, based on its name] switches “the audio processing device from periodically capturing second samples of the audio data at intervals to continuously capturing the second samples of the audio data at intervals” “responsive to detection by the speech onset detector of speech onset in the first sample of the audio data”, which suggests where, before onset detection, sampling by the audio processing device using the audio interface [including sampling the first sample and sampling/capturing the second samples] is performed periodically [“at a first sample rate”], and after onset detection, the audio interface control changes the audio 

As per Claim 22, Claim 15 of Parent Patent 1 suggests Claim 22 (continuously captured is suggested to be capturing at second intervals, as discussed in the rejection of claim 21).

As per Claim 23, Claim 15 of Parent Patent 1 suggests Claim 23 (Claim 15 of Parent Patent 1 recites processing “the contiguous audio data” to recognize a wake phrase in the continuously captured audio data, where, as per Claim 14 of Parent Patent 1, “the contiguous audio data” includes both the periodically captured second samples and the continuously captured second samples, and so based on the plain meaning of Claim 15 of Parent Patent 1, it is suggested that the recognizing of the wake phrase uses “the contiguous audio data” and everything in it, including the periodically captured samples “captured at the first intervals”).

	As per Claim 24, Claim 14 of Parent Patent 1 suggests wherein the audio processing device is operable to capture the second samples of the audio data periodically according to the first intervals and capture the second samples of the audio data continuously according to the second intervals (lines 8-14 of Claim 14 of Parent Patent 1; periodically capturing and continuously capturing is at least suggested to refer to capturing one sample at every periodic/continuous sampling period/interval)

	As per Claim 25, Claim 16 of Parent Patent 1 (interpreted as incorporating the limitations of Claim 14 of Parent Patent 1) suggests a buffer operable to store the second samples of the audio data captured at the first intervals, wherein the buffer is operable to operate in a sleep mode during each first interval (line 3 of claim 14 of Parent Patent 1 recites a buffer, lines 8-14 of Claim 14 of Parent Patent 1 describes capturing second samples of the audio data periodically “at intervals” [i.e. at first intervals before onset detection].
	Captured data is suggested to be stored somewhere [because it is captured/sampled] and a buffer is commonly/conventionally used to store data, which suggests where the periodically captured second samples are stored in the buffer.
	Claim 16 of Parent Patent 1 teaches where the buffer is in a sleep mode buffer power domain “during the intervals” [i.e. during the periodic-capturing/”first” intervals]).

As per Claim 26, Claim 14 of Parent Patent 1 suggests Claim 26 (lines 4-8, where “sound waves associated with the audio data” suggests where the audio data is the sound waves in data form [such that the audio data is representative of the sound waves])

As per Claim 27, Claim 14 of Parent Patent 1 does not, but Claim 1 of Parent Patent 1 suggests wherein the speech onset detector is configured to wake from a sleep mode in response to the first sample of the audio data exceeding an activation threshold level (Claim 1 of Parent Patent 1;
Claim 1 of Parent Patent 1 teaches “waking up a speech onset detector responsive to a sound wave associated with the audio data meeting or exceeding a first activation threshold of an audio interface”, where “the audio data” has antecedent basis to “audio data” in “capturing a first plurality of portions of audio data by periodically capturing the audio data at first intervals” and “detecting of the speech onset without using the captured first plurality of portions of the audio data” and “combining at least one captured portion of the first plurality of portions of the audio data with the continuously captured audio data to provide contiguous audio data”.  
More specifically, the “first plurality of portions” corresponds to the “second samples” captured at the “first intervals” which are combined with the continuously captured audio data that corresponds to the “second samples” captured at the “second intervals”, and since the speech onset is detected without using the captured first plurality of portions of the audio data, the audio data used to detect speech onset is logically a sample of the audio data which is different from the captured first plurality of portions [i.e. a “first” sample of the audio data which is different from the “second” periodically captured samples]
Claim 1 of Parent Patent 1 does not exclude where the audio data used to detect speech onset [i.e. the “first sample”] and the audio data associated with the sound wave 
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of speech onset detector with another because Claim 14 of Parent Patent 1 teaches the claimed invention except for the substitution of a speech onset detector which is not necessarily awakened in response to audio data used to detect speech onset meeting/exceeding an activation threshold with a speech onset detector which is.  Claim 1 of Parent Patent 1 suggests that a speech onset detector which is awakened in response to audio data used to detect speech onset meeting/exceeding an activation threshold was known in the claims.  One of ordinary skill in the art could have substituted one type of speech onset detector with another to obtain the predictable results of Claim 14 of Parent Patent 1, where the speech onset detector is awakened when sound waves associated with the first sample of the audio data meet or exceed an activation threshold (as suggested by Claim 1 of Parent Patent 1)
	
As per Claim 28, Claim 14 of Parent Patent 1 suggests Claim 28 (last 7 lines).



As per Claim 31, Claim 14 of Parent Patent 1 teaches Claim 31 (lines 8-14 of Claim 14 of Parent Patent 1 [lines 8-14 of Claim 14 of Parent Patent 1 particularly characterize the capturing of second samples to be switched to “continuously captured”]).

As per Claim 33, its limitations are similar to those in claim 27, and so is rejected under similar rationale (col. 8, lines 3-16 of Parent Patent 1 [where this application is a continuation of Parent Patent 1] describes active mode as ON and sleep mode as OFF, such that waking the speech onset detector as described in Claim 1 of Parent Patent 1 can be interpreted as “turning on” the speech onset detector)

As per Claim 34, Claim 14 of Parent Patent 1 suggests Claim 34 (Claim 14 of Parent Patent 1 describes where a first sample in which speech onset is detected is provided in response to sound waves meeting or exceeding a threshold activity level, and where an updated threshold activity level [the same phrase used to describe what was met or exceeded by the sound waves] is calculated and provided to the audio interface, which suggests an embodiment where the threshold activity level which is met/exceeded by the sound waves is a previously calculated updated threshold activity 

As per Claim 35, Claim 14 of Parent Patent 1 suggests Claim 35 (the switch from periodically sampling to continuously sampling occurs after detecting speech onset, and so the first sample which is analyzed to detect the onset that causes the switch is suggested to be sampled using periodic sampling which is suggested to be a lower sample rate than the continuous sampling used to sample the “continuously captured second samples”) 

Claims 36-38 and 40 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 14-15 of Parent Patent 1 in view of Mutagi et al. (US 10,027,662), hereafter Mutagi.

As per Claim 36, Claim 15 of Parent Patent 1 (interpreted as incorporating claim 14 of Parent Patent 1) suggests An electronic communication device, comprising: a microphone;… and an audio processing device (first 2 lines of Claim 14 of Parent 
comprising an audio interface coupled to the microphone and configured to sample audio data, a speech onset detector, a combiner, a wake-up phrase detector, and an audio interface control (first 4 lines of Claim 14 of Parent Patent 1 and Claim 15 of Parent Patent 1, where the audio interface samples data which is suggested to be received by a microphone which suggests that the microphone is coupled to the audio interface)
wherein the audio interface is operable to provide a first sample of the audio data to the speech onset detector responsive to sound waves received at the microphone meeting or exceeding a threshold activity level, wherein the audio data is representative of the sound waves (lines 4-8 of Claim 14 of Parent Patent 1; reciting that the sound waves are associated with the audio data suggests that the audio data is representative of the sound waves [i.e. the audio data is an audio data representation of the sound waves], and sound waves are at least suggested to be received by a microphone [audio/sound is commonly/conventionally received by microphone])
wherein responsive to detection by the speech onset detector of speech onset in the first sample of the audio data, the audio interface control is operable to switch the audio interface from sampling second samples of the audio data at first intervals, to sampling the second samples of the audio data at second intervals (lines 8-14 of Claim 14 of Parent Patent 1, where continuously, by definition, 
wherein the combiner is operable to provide contiguous audio data using at least one portion of the second samples of the audio data sampled at the first intervals and the second samples of the audio data sampled at the second intervals, (lines 14-19 of Claim 14 of Parent Patent 1; as discussed in the previous paragraph, periodic capturing is suggested to be capturing at first intervals and continuous capturing is suggested to be capturing at second intervals that are shorter than the first intervals, and so providing contiguous data using the periodically captured second samples and the continuously captured second samples is suggested to provide contiguous audio data using second samples of audio data captured at first intervals and second samples of audio data captured at second intervals).
the wake-up phrase detector is configured to process the contiguous audio data to recognize a wake phrase… (Claim 15 of Parent Patent 1)
Claims 14-15 of Parent Patent 1 do not, but Claim 1 of Parent Patent 1 suggests wherein the speech onset detector is configured to wake from a sleep mode in response to the first sample of the audio data exceeding an activation threshold level (Claim 1 of Parent Patent 1;

More specifically, the “first plurality of portions” corresponds to the “second samples” captured at the “first intervals” which are combined with the continuously captured audio data that corresponds to the “second samples” captured at the “second intervals”, and since the speech onset is detected without using the captured first plurality of portions of the audio data, the audio data used to detect speech onset is logically a sample of the audio data which is different from the captured first plurality of portions [i.e. a “first” sample of the audio data which is different from the “second” periodically captured samples]
Claim 1 of Parent Patent 1 does not exclude where the audio data used to detect speech onset [i.e. the “first sample”] and the audio data associated with the sound wave are the same audio data, and so one embodiment within the scope of Claim 1 of Parent Patent 1 is where the speech onset detector is awakened responsive to a sound wave associated with the “first sample” exceeding a first activation threshold [i.e. “wherein the speech onset detector is configured to wake from a sleep mode in response to the first sample of the audio data exceeding an activation threshold level” in the embodiment of 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of speech onset detector with another because Claims 14-15 of Parent Patent 1 teach the claimed invention except for the substitution of a speech onset detector which is not necessarily awakened in response to audio data used to detect speech onset meeting/exceeding an activation threshold with a speech onset detector which is.  Claim 1 of Parent Patent 1 suggests that a speech onset detector which is awakened in response to audio data used to detect speech onset meeting/exceeding an activation threshold was known in the claims.  One of ordinary skill in the art could have substituted one type of speech onset detector with another to obtain the predictable results of Claims 14-15 of Parent Patent 1, where the speech onset detector is awakened when sound waves associated with the first sample of the audio data meet or exceed an activation threshold (as suggested by Claim 1 of Parent Patent 1).
Claims 1 and 14-15 of Parent Patent 1 do not, but Mutagi suggests an electronic communication device and a communication interface configured to wirelessly transmit and receive data and wherein the communication interface is configured to wirelessly transmit at least a portion of the second samples of the audio data sampled at the second intervals to a network, responsive to detection of the wake up phrase (Figure 2; col. 7, lines 24-55; col. 8, line 50 - col. 9, line 6; Figure 9; col. 26, lines 21-51;

Mutagi suggests where the electronic device that includes the audio processing device of Claim 14 of Parent Patent 1 and a microphone further includes “a communication interface configured to wirelessly transmit and receive data” [Wifi commonly/conventionally involves sending and receiving data] such that the electronic device is more specifically an “electronic communication device” and “wherein the 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to combine prior art elements according to known methods because Claims 1, 14-15 of Parent Patent 1 and Mutagi included each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference (Claims 1, 14-15 of Parent Patent 1 suggest the limitations of Claim 36 except for a wireless communication interface and wirelessly sending audio data in response to detecting a wakephrase, and Mutagi suggests a wireless communication interface and wirelessly sending audio data in response to detecting a wakeword).  One of ordinary skill in the art could have combined the elements as claimed by known methods (by adding Mutagi’s wireless communication interface to the audio processing device of Claims 14-15 of Parent Patent 1 [which includes a speech onset detector which is awakened when sound waves associated with the first sample of the audio data meet or exceed an activation threshold, as suggested by Claim 1 of Parent Patent 1] and by adding Mutagi’s transmission of audio data in response to detecting a wakephrase to the functions performed by the combination of Claims 14-15 of Parent Patent 1), and that in combination, each element merely performs the same function as it does separately (the transmission follows the detection of the keyword/wake-phrase, and the communication is a separate element 

	As per Claim 37, Claim 14 of Parent Patent 1 suggests Claim 37 (continuous capturing is suggested to capture at smaller intervals than periodic capturing)

As per Claim 38, Claim 14 of Parent Patent 1 suggests Claim 38 (the switch from periodically sampling to continuously sampling occurs after detecting speech onset, and so the first sample which is analyzed to detect the onset that causes the switch is suggested to be sampled using periodic sampling which is suggested to be a lower sample rate than the continuous sampling used to sample the “continuously captured second samples”)

As per Claim 40, Claim 14 of Parent Patent 1 suggests Claim 40 (last 7 lines of Claim 14 of Parent Patent 1; the last 7 lines of Claim 14 of Parent Patent 1 at least suggest where every time the threshold computation module wakes up, it calculates an updated threshold activity level and provides the calculated updated threshold activity level to the audio interface)

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249.  The examiner can normally be reached on M-F 12:00PM - 8:30PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL can be reached on (571)272-7602.  The fax phone 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






EY 7/13/2021
/ERIC YEN/Primary Examiner, Art Unit 2658