DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-16 and 19-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
As to claim 1, the distinction between “one or more processors” in line 3 and “a first one or more processors,” “a second one or more processors,” and “a third one or more processors” is unclear. For the sake of examination, this is interpreted as first, second and third processors of the one or more processors.
As to claim 16, there is a lack of antecedent basis for “the first probability of voice activity” in lines 4-5. 
As to claim 19, it has the same issues as claim 1 above.
Claims 2-15 and 20 are rejected for depending on the above claims.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 13-14, 17 and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Rosner et al. (US 2013/0339028 A1) hereinafter “Rosner.”
	As to claim 1, Rosner discloses a system comprising: 
a first microphone (¶0031 and Fig. 2. Microphone 202.); 
one or more processors configured to execute a method (¶0065) comprising: 
receiving, via the first microphone, an audio signal (¶0031-0032, Fig. 2. Microphone 202 receives an audio signal.); 
determining, via a first one or more processors, whether the audio signal comprises a voice onset event (¶0032, Fig. 2. “First stage 206 is configured to analyze at least one energy characteristic of the received audio signal to determine whether the received signal includes speech.”); 
in accordance with a determination that the audio signal comprises the voice onset event: 
waking a second one or more processors (¶0032 and ¶0034, Fig. 2. “If the energy characteristics of the received audio signal meets or exceeds the one or more thresholds, first stage 206 outputs a first activation signal that activates second stage 208.” “A first state of second stage 208 can be a stand-by state in which only the components in second stage 208 that are needed to recognize the first activation signal remain active. Once the first activation signal is received, second stage 208 can transition to a second state. For example, the second state can be a fully-operational state.”); 
determining, via the second one or more processors, whether the audio signal comprises a predetermined trigger signal (¶0035, Fig. 2. “Second stage 208 can be configured to analyze at least one profile of the received audio signal to determine if "wake-up" words are present in the signal.”); 
in accordance with a determination that the audio signal comprises the predetermined trigger signal: 
waking a third one or more processors (¶0035-0036, Fig. 2. “If the received audio signal substantially matches the respective at least one predetermined profile, a second stage 208 can output a second activation signal.” “Third stage 210 receives the second activation signal output by second stage 208.” “A first state of the speech recognition engine can be a stand-by state in which only the components needed to recognize the second activation signal remain active. Once the second activation signal is received, the speech recognition engine can be transitioned to a fully-operational state.”); 
performing, via the third one or more processors, automatic speech recognition based on the audio signal (¶0036, Fig. 2. “Third stage 210 includes a speech recognition engine.” “In the fully-operational state, the speech recognition engine is able to recognize a full vocabulary of words within the received audio signal.”); and 
in accordance with a determination that the audio signal does not comprise the predetermined trigger signal: forgoing waking the third one or more processors (¶0035-0036, Fig. 2. The second activation signal is not output if the audio signal does not substantially match the predetermined profile.); and 
in accordance with a determination that the audio signal does not comprise the voice onset event: forgoing waking the second one or more processors (¶0032, Fig. 2. The first activation signal is not output if the energy characteristics of the audio signal do not meet or exceed the threshold.).
	As to claim 13, Rosner discloses wherein the determining whether the audio signal comprises the voice onset event further comprises determining an amount of voice activity in the audio signal (¶0032. “First stage 206 can be configured to compare one or more energy characteristics of the received audio signal to one or more respective thresholds.”).
	As to claim 14, Rosner discloses wherein the determining whether the audio signal comprises the voice onset event further comprises determining whether the amount of voice activity in the audio signal is greater than a voice activity threshold (¶0032. “If the energy characteristics of the received audio signal meets or exceeds the one or more thresholds, first stage 206 outputs a first activation signal that activates second stage 208. In doing so, first stage 206 monitors the ambient environment to determine if a speech signal has been received.”).
	As to claim 17, it is directed towards substantially the same subject matter as claim 1 and is therefore rejected using the same rationale as claim 1 above.
	As to claim 19, it is directed towards substantially the same subject matter as claim 1 and is therefore rejected using the same rationale as claim 1 above.
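For illustration only, the staged wake-up flow mapped in claim 1 above can be sketched as follows. This sketch is not taken from Rosner or from the application: the names `mean_energy` and `run_cascade`, the threshold value, the wake phrase, and the use of a text transcript in place of an acoustic profile match are all hypothetical assumptions.

```python
from typing import List, Sequence

ENERGY_THRESHOLD = 0.01      # hypothetical value, not from Rosner
WAKE_WORD = "hello device"   # hypothetical wake-up phrase


def mean_energy(samples: Sequence[float]) -> float:
    """Average squared amplitude of the samples (0.0 for empty input)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def run_cascade(samples: Sequence[float], transcript: str) -> List[str]:
    """Return the stages that end up awake for this input.

    `transcript` stands in for the audio content; an actual second stage
    would compare the signal against a stored profile, not against text.
    """
    woken = ["stage1"]  # the first stage is always listening
    # Stage 1: energy check, standing in for the voice onset determination.
    if mean_energy(samples) < ENERGY_THRESHOLD:
        return woken  # forgo waking the second stage
    woken.append("stage2")
    # Stage 2: wake-word check, standing in for the trigger determination.
    if WAKE_WORD not in transcript.lower():
        return woken  # forgo waking the third stage
    woken.append("stage3")  # stage 3 would perform full speech recognition
    return woken
```

Under these assumptions, a silent input leaves only the first stage awake, a loud input without the wake phrase wakes the second stage only, and a loud input containing the wake phrase wakes all three stages.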

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 3-4 are rejected under 35 U.S.C. 103 as being unpatentable over Rosner, as applied to claim 1 above.
As to claim 3, Rosner does not expressly disclose wherein the first one or more processors comprises an application-specific integrated circuit or a digital signal processor configured to determine whether the audio signal comprises the voice onset event.
However, Rosner discloses that the embodiments can be implemented using multiple processors (¶0065) and that the audio signal is a digital signal output by A/D 204 (Fig. 2). DSPs (and ICs) are well known in the art and, before the effective filing date of the claimed invention, using a DSP for processing the digital signals would have been obvious to one of ordinary skill in the art.
As to claim 4, Rosner does not expressly disclose wherein the second one or more processors comprises a digital signal processor or an application-specific integrated circuit.
However, Rosner discloses that the embodiments can be implemented using multiple processors (¶0065) and that the audio signal is a digital signal output by A/D 204 (Fig. 2). DSPs (and ICs) are well known in the art and, before the effective filing date of the claimed invention, using a DSP for processing the digital signals would have been obvious to one of ordinary skill in the art.

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Rosner, as applied to claim 1 above, in view of Piersol et al. (US 2020/0279552 A1) hereinafter “Piersol.”
As to claim 2, Rosner discloses in accordance with the determination that the audio signal comprises the predetermined trigger signal: 
providing, via the second one or more processors, an audio stream to the third one or more processors based on the audio signal (¶0035-0036, Fig. 2. Third stage performs speech recognition on the audio signal.); 
Rosner does not expressly disclose identifying an endpoint corresponding to the audio signal; and 
ceasing to provide the audio stream to the third one or more processors in response to identifying the endpoint.
Rosner in view of Piersol discloses identifying an endpoint corresponding to the audio signal (Piersol, ¶0023, Fig. 1. “The device may check to see if the audio has reached the end of the utterance (also called the endpoint), for example using endpointing techniques.”); and
ceasing to provide the audio stream to the third one or more processors in response to identifying the endpoint (Piersol, ¶0023, Fig. 1. “The device may continue sending audio data to the server until an endpoint of the utterance is detected (166:Yes). The device may then stop sending audio data (168).”).
Rosner and Piersol are analogous art because they are from the same field of endeavor with respect to speech processing. 
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to detect an endpoint, as taught by Piersol. The motivation would have been to identify the end of the utterance in order to only send the utterance for further processing.
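For illustration only, forwarding audio until an endpoint is identified and then ceasing can be sketched as follows. The endpoint rule used here (a run of consecutive low-amplitude frames) is a hypothetical stand-in and is not asserted to be the endpointing technique of Piersol; the name `stream_until_endpoint` and both constants are likewise assumptions.

```python
from typing import Iterable, List

SILENCE_LEVEL = 0.05             # hypothetical: peak amplitude below this counts as silent
SILENT_FRAMES_FOR_ENDPOINT = 3   # hypothetical: consecutive silent frames marking the endpoint


def stream_until_endpoint(frames: Iterable[List[float]]) -> List[List[float]]:
    """Return the frames that would be forwarded before the endpoint is identified."""
    sent: List[List[float]] = []
    silent_run = 0
    for frame in frames:
        peak = max((abs(s) for s in frame), default=0.0)
        # Count consecutive silent frames; any non-silent frame resets the run.
        silent_run = silent_run + 1 if peak < SILENCE_LEVEL else 0
        if silent_run >= SILENT_FRAMES_FOR_ENDPOINT:
            break  # endpoint identified: cease providing the audio stream
        sent.append(frame)
    return sent
```

In this sketch, frames arriving after the endpoint are never forwarded, which corresponds to only the utterance being sent on for further processing.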

Claims 5, 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rosner, as applied to claims 1, 17 and 19 above, in view of Dusan et al. (US 2021/0125609 A1) hereinafter “Dusan.”
As to claim 5, Rosner does not expressly disclose wherein the system further comprises: a head-wearable device comprising the first one or more processors and the second one or more processors, and 
an auxiliary unit comprising the third one or more processors, wherein the auxiliary unit is external to the head-wearable device and configured to communicate with the head-wearable device.
Rosner in view of Dusan discloses wherein the system further comprises: a head-wearable device comprising the first one or more processors and the second one or more processors (Dusan, ¶0029-0030 and Figs. 1 and 3. Headphones 2 include VAD determination and key-phrase detection.), and 
an auxiliary unit comprising the third one or more processors, wherein the auxiliary unit is external to the head-wearable device and configured to communicate with the head-wearable device (Dusan, ¶0042, Figs. 1 and 3. Multimedia device 7 receives trigger signal from headphone 2 and performs automatic speech recognition.).
Rosner and Dusan are analogous art because they are from the same field of endeavor with respect to speech recognition.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use a head-wearable device and auxiliary device, as taught by Dusan. The motivation would have been to split up the processing load. 
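	As to claims 18 and 20, they are directed towards substantially the same subject matter as claim 5 and are therefore rejected, as applied to claims 17 and 19 respectively, using the same rationale and motivation as claim 5 above.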

Claims 6-12 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Rosner, as applied to claim 1 above, in view of Soto (US 2020/0213729 A1).
As to claim 6, Rosner does not expressly disclose wherein the predetermined trigger signal comprises a phrase.
Rosner in view of Soto discloses wherein the predetermined trigger signal comprises a phrase (Soto, ¶0135. Phrases detected by wake-word engine.).
Rosner and Soto are analogous art because they are from the same field of endeavor with respect to voice activation devices.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to recognize a phrase as a trigger signal, as taught by Soto. Rosner (¶0035) discloses wake-up words, and a phrase is simply multiple words.
As to claim 7, Rosner does not expressly disclose storing the audio signal in a buffer.
Rosner in view of Soto discloses storing the audio signal in a buffer (Soto, ¶0027, Fig. 5. “The NMD may buffer sound detected by a microphone of the NMD and then use the wake-word engine to process that buffered sound to determine whether a wake word is present.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use a buffer, as taught by Soto. The motivation would have been to facilitate wake-word identification (Soto, ¶0031).
As to claim 8, Rosner does not expressly disclose in accordance with a determination that the audio signal comprises the voice onset event: performing, via the second one or more processors, acoustic echo cancellation based on the audio signal.
Rosner in view of Soto discloses in accordance with a determination that the audio signal comprises the voice onset event: performing, via the second one or more processors, acoustic echo cancellation based on the audio signal (Soto, ¶0114, Fig. 5. “AEC 564 receives the detected sound SD and filters or otherwise processes the sound to suppress echoes and/or to otherwise improve the quality of the detected sound.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to perform AEC, as taught by Soto. The motivation would have been to improve the quality of the detected sound (Soto, ¶0114).
As to claim 9, Rosner does not expressly disclose in accordance with a determination that the audio signal comprises the voice onset event: performing, via the second one or more processors, beamforming based on the audio signal.
Rosner in view of Soto discloses in accordance with a determination that the audio signal comprises the voice onset event: performing, via the second one or more processors, beamforming based on the audio signal (Soto, ¶0140, Fig. 5. Spatial processing of detected sound can comprise beam-forming algorithms.).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use a beam-forming algorithm, as taught by Soto. The motivation would have been to improve the quality of the detected sound (Soto, ¶0140).
As to claim 10, Rosner does not expressly disclose in accordance with a determination that the audio signal comprises the voice onset event: performing, via the second one or more processors, noise reduction based on the audio signal.
Rosner in view of Soto discloses in accordance with a determination that the audio signal comprises the voice onset event: performing, via the second one or more processors, noise reduction based on the audio signal (Soto, ¶0138 and ¶0140. Sound data processed to reduce noise.).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use noise reduction, as taught by Soto. The motivation would have been to improve the quality of the detected sound (Soto, ¶0138 and ¶0140).
As to claim 11, Rosner does not expressly disclose receiving, via a second microphone, the audio signal, wherein the determination whether the audio signal comprises the voice onset event is based on an output of the first microphone and further based on an output of the second microphone.
Rosner in view of Soto discloses receiving, via a second microphone, the audio signal, wherein the determination whether the audio signal comprises the voice onset event is based on an output of the first microphone and further based on an output of the second microphone (Soto, ¶0112-0113, Fig. 5. Multiple microphones 222 used for input to voice processor to improve SNR of the detected sound. As cited above in claim 1, Rosner (¶0032) discloses the onset detection.).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use multiple microphones, as taught by Soto. The motivation would have been to improve SNR of the detected sound (Soto, ¶0113) for improved voice onset detection.
As to claim 12, Rosner does not expressly disclose the audio signal includes a signal generated by a voice source of a user of a device comprising the first microphone and further comprising the second microphone.
Rosner in view of Soto discloses the audio signal includes a signal generated by a voice source of a user of a device comprising the first microphone and further comprising the second microphone (Soto, ¶0112-0113, Fig. 5. Multiple microphones 222 used for input to voice processor to improve SNR of the detected sound.).
The motivation is the same as claim 11 above.
Rosner in view of Soto does not expressly disclose a first distance from the first microphone to the voice source and a second distance from the second microphone to the voice source are different.
However, this is implicit based on having multiple microphones. Unless the voice source is perfectly aligned between two microphones, the distances will be different.
The motivation would have been to improve SNR of the detected signal.
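For illustration only, the geometric point made as to claim 12 can be checked numerically. The two-dimensional coordinates and the names `distance` and `path_difference` below are hypothetical and are not taken from Rosner or Soto; the sketch merely shows that the two source-to-microphone distances are equal only when the source lies on the line equidistant from both microphones.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two points in a plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def path_difference(source: Point, mic1: Point, mic2: Point) -> float:
    """Absolute difference between the source-to-mic1 and source-to-mic2 distances."""
    return abs(distance(source, mic1) - distance(source, mic2))


# Hypothetical layout: two microphones 2 units apart on the x-axis.
MIC1: Point = (-1.0, 0.0)
MIC2: Point = (1.0, 0.0)

centered = path_difference((0.0, 2.0), MIC1, MIC2)    # source midway between the microphones
off_center = path_difference((0.5, 2.0), MIC1, MIC2)  # source shifted toward MIC2
```

With this layout, `centered` is zero and `off_center` is greater than zero, so any source position off the equidistant line yields different first and second distances.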
As to claim 15, Rosner does not expressly disclose wherein the determining whether the audio signal comprises the voice onset event further comprises determining a first probability of voice activity with respect to the audio signal.
Rosner in view of Soto discloses wherein the determining whether the audio signal comprises the voice onset event further comprises determining a first probability of voice activity with respect to the audio signal (Soto, ¶0115 and ¶0140. “The spatial processor 566 may be configured to determine a speech presence probability.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to determine speech presence probability, as taught by Soto. The motivation would have been to improve accuracy.

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Rosner, as applied to claim 1 above, in view of Habets et al. (US 2015/0310857 A1) hereinafter “Habets.”
As to claim 16, Rosner does not expressly disclose wherein the determining whether the audio signal comprises the voice onset event further comprises: determining a second probability of voice activity with respect to the audio signal; and determining a combined probability of voice activity based on the first probability of voice activity and further based on the second probability of voice activity.	
Rosner in view of Habets discloses wherein the determining whether the audio signal comprises the voice onset event further comprises: determining a second probability of voice activity with respect to the audio signal; and determining a combined probability of voice activity based on the first probability of voice activity and further based on the second probability of voice activity (Habets, ¶0103-0104, Fig. 2. First and second speech probabilities are determined. The second is based on the first, so a combined probability is implicit.).
Rosner and Habets are analogous art because they are from the same field of endeavor with respect to speech processing.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to determine multiple probabilities, as taught by Habets. The motivation would have been to improve the estimation of speech probability (Habets, ¶0105).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Elkhatib et al. (US 2016/0066113 A1) (see at least Figs. 2-6 and corresponding description) and Gustavsson et al. (US 2016/0180837 A1) (see at least ¶0004-0007 and Fig. 1).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES K MOONEY whose telephone number is (571)272-2412. The examiner can normally be reached Monday-Thursday, 8:30-6:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin, can be reached at (571) 272-7848. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/JAMES K MOONEY/Primary Examiner, Art Unit 2654