DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on April 26, 2022 has been entered.

Response to Arguments
Applicants argue that the cited prior art fails to teach the claims as amended.  Applicants’ arguments are persuasive, but are moot in view of the new grounds of rejection.  It is noted that although the claims have been amended to include using a pre-trained model, the claims do not specifically recite how the model is trained or how the actual trained model is used.  Therefore, the 101 rejection remains for reasons as set forth below.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 21, 29 and 34 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.  In particular, the last limitation recites determining that the sound originated from one or more speakers.  However, upon review, the specification only shows that the determination is based on sound originating from the microphones, not the speakers (see the originally filed specification, p. 0035-0036 and 0085).

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 21-32, 34-36 and 38-40 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.

The claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. The claims are directed to the abstract idea of detecting a wake expression, as explained in detail below. The claims recite the limitations of generating first audio data corresponding to sound (can be done by a user speaking); determining one or more parameters associated with the first audio data (can be done by a user analyzing the data and making a determination); determining, using a machine-learned model and based at least in part on the one or more parameters, a confidence level that the first audio data includes a trigger expression (analyzing data to see if it includes a predetermined expression); and determining, based at least in part on the confidence level, that the sound originated from the one or more audio speakers (can be done by determining if the data came from a speaker).  As drafted, this is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting various processors, nothing in the claim elements precludes the steps from practically being performed by mental processing. The present claim language, under its broadest reasonable interpretation, covers performance of mental processing and recites generic computer components, all of which falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements which are recited at a high level of generality (i.e., as a generic processor performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
The dependent claims are also non-statutory for reasons set forth above.  For example:
wherein a first parameter of the one or more parameters corresponds to an audio input characteristic and a second parameter of the one or more parameters corresponds to a device operation characteristic (can be done by a user analyzing the data and setting up parameters);
wherein the audio input characteristic comprises an echo characteristic associated with the first audio data or a loudness characteristic associated with the first audio data (can be done by a user listening to the audio and determining how loud the data is);
wherein the device operation characteristic comprises a presence of the one or more audio speakers, a loudness characteristic of sound generated by the one or more audio speaker, or an amount of echo reduction performed by the one or more processors (can be done by a user listening to the audio and determining how loud the data is);
determining a pattern of input signals based at least partly on the one or more parameters; and generating a reference file based at least partly on the pattern of input signals (can be done by the user listening to the audio, making a determination and compiling the data);
generating third audio data; and determining a confidence value indicating a comparison of the third audio data to the reference file, the second audio data being based at least partly on the confidence value (can be done by a user setting up particular values);
wherein: a first parameter of the one or more parameters indicates a low audio speaker output loudness; a second parameter of the one or more parameters indicates a high input audio loudness; and the instructions, when executed by the one or more processors, further cause the system to determine whether to analyze content of the first audio data based at least partly on the first parameter and the second parameter (can be done by the user analyzing the data, setting up various parameters and making a determination based on those parameters);
wherein: a first parameter of the one or more parameters indicates a low degree of echo cancellation; a second parameter of the one or more parameters indicates that the one or more audio speakers are producing sound; and the instructions, when executed by the one or more processors, further cause the system to determine whether to analyze content of the first audio data based at least partly on the first parameter and the second parameter (can be done by the user analyzing the data and making a determination);
determining a first parameter based at least partly on content of the first audio data; determining a second parameter based at least partly on an echo associated with generating the first audio data; determining a third parameter based at least partly on a loudness associated with the first audio data; determining a fourth parameter based at least partly on operational information of the device; and generating a reference parameter based at least partly on the first parameter, the second parameter, the third parameter, or the fourth parameter (can be done by the user analyzing the data, setting up various parameters and making a determination based on those parameters);
generating, using the one or more audio speakers and as part of a device initialization, a reference sound; and detecting the reference sound, wherein the reference parameter is based at least partly on the reference sound (can be done by a user listening for a particular sound);
wherein the first audio data is based at least partly on a user utterance spoken into the one or more microphones (can be done by a user listening to a user speak);
wherein the plurality of parameters are based at least partly on a confidence value indicating a likelihood that the audio data includes a predefined expression (can be done by a user analyzing the speech and matching it against other data); and
wherein the plurality of parameters are based at least partly on an amount of echo associated with the audio data, wherein the plurality of parameters are based at least partly on a loudness associated with an output at an audio speaker of the computing device, and wherein the plurality of parameters are based at least partly on whether an audio speaker of the computing device or a text-to-speech algorithm generate an output (can be done by a user setting up parameters).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 21, 29, 34 and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Vignoli (PGPUB 2006/0074686) in view of Byrne et al. (PGPUB 2011/0301955), hereinafter referenced as Byrne, and in further view of Visser et al. (PGPUB 2013/0156209), hereinafter referenced as Visser.

Regarding claims 21, 29 and 34, Vignoli discloses a system and device, hereinafter referenced as a system comprising: 
one or more microphones (microphone; p. 0003, 0009, 0013, 0034-0036); 
one or more audio speakers (speaker; p. 0038, 0042);
one or more processors (processor; p. 0041); and 
non-transitory computer-readable media storing instructions that, when executed by the one or more processors (memory; p. 0041), cause the system to: 
generate, using the one or more microphones, first audio data corresponding to sound (microphone; p. 0003, 0009, 0013, 0034-0036); 
determine one or more parameters associated with the first audio data (parameters; p. 0013);
determine, based at least in part on the plurality of parameters, that the first audio data includes a trigger expression (determine that the audio has an activation word or a predetermined keyword; p. 0013, 0046); and 
determine that the sound originated from one or more speakers, but does not specifically teach a confidence level that the first audio data includes a trigger expression, a housing, one or more microphones proximate a top of the housing, and one or more audio speakers disposed proximate a bottom of the housing and directed at least partly away from the one or more microphones.
Byrne discloses a system comprising:
determining, using a machine-learned model (machine learning; p. 0019, 0027, 0038), a confidence level that the first audio data includes a trigger expression (likelihood connected to carrier phrase; p. 0022), to assist with learning a user’s intended action.
Therefore, it would have been obvious to one of ordinary skill in the art to modify the system as described above, to improve means for determining users’ intended actions based on speech input.
Vignoli in view of Byrne discloses a system as described above, but does not specifically teach a housing, one or more microphones proximate a top of the housing and one or more audio speakers disposed proximate a bottom of the housing and directed at least partly away from the one or more microphones.  However, Applicant's placement of the microphones on the top and the speakers on the bottom is simply a design choice.  Nonetheless, Visser discloses a system comprising:
a housing (figs 1-2 with p. 0045-0046); 
one or more microphones proximate a top of the housing (microphones; figures 2-4 with p. 0053); 
one or more audio speakers (speakers) disposed proximate a bottom of the housing and directed at least partly away from the one or more microphones (proper distance; figs. 2-4 with p. 0053), to optimize performance.
Therefore, it would have been obvious to one of ordinary skill in the art to modify the system as described above, to avoid unwanted feedback.
Regarding claim 36, Vignoli discloses a method wherein the computing device comprises a speech interface device in communication with a server device (speech control unit; p. 0041). 

Claims 22-24, 27-28, 30-32 and 38-39 are rejected under 35 U.S.C. 103 as being unpatentable over Vignoli in view of Byrne and Visser and in further view of Zad-Issa (USPN 7,680,465).

Regarding claim 22, Vignoli in view of Byrne and Visser discloses the system as described above, but does not specifically teach wherein a first parameter of the one or more parameters corresponds to an audio input characteristic and a second parameter of the one or more parameters corresponds to a device operation characteristic.
Zad-Issa discloses a system wherein a first parameter of the one or more parameters corresponds to an audio input characteristic and a second parameter of the one or more parameters corresponds to a device operation characteristic (various parameters; column 3, line 56 – column 4, line 22), to enhance the sounds.
Therefore, it would have been obvious to one of ordinary skill in the art to modify the system as described above, to allow decisions to be made based on various data, for versatility.
Regarding claim 23, it is interpreted and rejected for similar reasons as set forth above. In addition, Zad-Issa discloses a system wherein the audio input characteristic comprises an echo characteristic associated with the first audio data or a loudness characteristic associated with the first audio data (echo; column 4, lines 17-32 and column 5, lines 54-64, column 8, lines 1-5). 
Regarding claim 24, it is interpreted and rejected for similar reasons as set forth above. In addition, Zad-Issa discloses a system wherein the device operation characteristic comprises a presence of the one or more audio speakers, a loudness characteristic of sound generated by the one or more audio speaker, or an amount of echo reduction performed by the one or more processors (loudness/echo; column 4, lines 17-32 and column 8, lines 1-5). 
Regarding claim 27, it is interpreted and rejected for similar reasons as set forth above. In addition, Zad-Issa discloses a system wherein: 
a first parameter of the one or more parameters indicates a low audio speaker output loudness (loud parameter; column 3, line 56 – column 4, line 22); 
a second parameter of the one or more parameters indicates a high input audio loudness (volume parameter; column 3, line 56 – column 4, line 22); and 
the instructions, when executed by the one or more processors, further cause the system to determine whether to analyze content of the first audio data based at least partly on the first parameter and the second parameter (parameter data; column 3, line 56 – column 4, line 22). 
Regarding claim 28, it is interpreted and rejected for similar reasons as set forth above. In addition, Zad-Issa discloses a system wherein: 
a first parameter of the one or more parameters indicates a low degree of echo cancellation (echo; column 4, lines 17-32 and column 8, lines 1-5); 
a second parameter of the one or more parameters indicates that the one or more audio speakers are producing sound (speaker; column 3, line 56 – column 4, line 22); and 
the instructions, when executed by the one or more processors, further cause the system to determine whether to analyze content of the first audio data based at least partly on the first parameter and the second parameter (parameter data; column 3, line 56 – column 4, line 22). 
Regarding claim 30, it is interpreted and rejected for similar reasons as set forth in the combination of claims 22-24.
Regarding claim 31, it is interpreted and rejected for similar reasons as set forth above. In addition, Zad-Issa discloses a system wherein the instructions, when executed by the one or more processors, further cause the device to: 
determine a first parameter based at least partly on content of the first audio data (parameter; column 3, line 56 – column 4, line 22); 
determine a second parameter based at least partly on an echo associated with generating the first audio data (echo; column 3, line 56 – column 4, line 22 with column 8, lines 1-5); 
determine a third parameter based at least partly on a loudness associated with the first audio data (loudness; column 3, line 56 – column 4, line 22); 
determine a fourth parameter based at least partly on operational information of the device (column 3, line 56 – column 4, line 22); and 
generate a reference parameter based at least partly on the first parameter, the second parameter, the third parameter, or the fourth parameter (parameter data; column 3, line 56 – column 4, line 22). 
Regarding claim 32, it is interpreted and rejected for similar reasons as set forth above. In addition, Zad-Issa discloses a system wherein the instructions, when executed by the one or more processors, further cause the device to: 
generate, using the one or more audio speakers and as part of a device initialization, a reference sound (speaker; column 3, line 56 – column 4, line 22); and 
detect, using the one or more microphones, the reference sound, wherein the reference parameter is based at least partly on the reference sound (parameter; column 3, line 56 – column 4, line 22). 
Regarding claim 38, Zad-Issa discloses a method wherein the plurality of parameters are based at least partly on an amount of echo associated with the first audio data (echo; column 4, lines 17-32 and column 8, lines 1-5). 
Regarding claim 39, Zad-Issa discloses a method wherein the plurality of parameters are based at least partly on a loudness associated with an output at the speaker (loudness; column 4, lines 17-32 and column 5, lines 54-64, column 8, lines 1-5).

Claim 25 is rejected under 35 U.S.C. 103 as being unpatentable over Vignoli in view of Byrne and Visser and in further view of Lu et al. (PGPUB 2009/0164215), hereinafter referenced as Lu.

Regarding claim 25, Vignoli in view of Byrne and Visser discloses a system as described above, but does not specifically teach wherein the instructions, when executed by the one or more processors, further cause the system to: 
determine a pattern of input signals based at least partly on the one or more parameters; and 
generate a reference file based at least partly on the pattern of input signals. 
Lu discloses a system wherein the instructions, when executed by the one or more processors, further cause the system to: 
determine a pattern of input signals based at least partly on the one or more parameters (the air is stifling; p. 0070); and 
generate a reference file based at least partly on the pattern of input signals (database; p. 0070), to reduce operation complexity of a user. 
Therefore, it would have been obvious to one of ordinary skill in the art to modify the system as described above, to enhance the system.

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Vignoli in view of Byrne and Visser and in further view of Paquier et al. (PGPUB 2011/0111805), hereinafter referenced as Paquier.

Regarding claim 26, it is interpreted and rejected for similar reasons as set forth above.  In addition, Vignoli discloses a system wherein the instructions, when executed by the one or more processors, further cause the system to: 
generate, using the one or more microphones, third audio data (p. 0003, 0009, 0013, 0034).  Furthermore, Byrne discloses determining a confidence value indicating a comparison of the third audio data to the reference file, the second audio data being based at least partly on the confidence value (likelihood connected to carrier phrase based on matching data; p. 0022).  However, Vignoli in view of Byrne and Visser does not specifically teach analyzing, using the one or more parameters, the first audio data to generate text data corresponding to the first audio data; and causing, using the one or more audio speakers and based at least partly on the text data, output of second audio data.
Paquier discloses a system comprising:
analyzing, using the one or more parameters, the first audio data to generate text data corresponding to the first audio data (p. 0032-0034); and 
causing, using the one or more audio speakers (p. 0025) and based at least partly on the text data, output of second audio data (synthesize speech; p. 0032-0034), to provide various ways to output data.
Therefore, it would have been obvious to one of ordinary skill in the art to modify the method as described above, to provide synthesized data.

Claim 35 is rejected under 35 U.S.C. 103 as being unpatentable over Vignoli in view of Byrne and Visser and in further view of Solem et al. (PGPUB 2013/0346068), hereinafter referenced as Solem.
Regarding claim 35, Vignoli in view of Byrne and Visser discloses a method as described above, but does not specifically teach wherein generating the audio output comprises content from a third-party application.
Solem discloses a system wherein generating the audio output comprises content from a third-party application (p. 0051), to provide underlying computing resources or infrastructure.
Therefore, it would have been obvious to one of ordinary skill in the art to modify the method as described above, to provide flexibility to the system.

Claim 40 is rejected under 35 U.S.C. 103 as being unpatentable over Vignoli in view of Byrne and Visser and in further view of Paquier et al. (PGPUB 2011/0111805), hereinafter referenced as Paquier.

Regarding claim 40, it is interpreted and rejected for similar reasons as set forth above.  However, Vignoli in view of Byrne and Visser does not specifically teach wherein the plurality of parameters are based at least partly on whether the speaker or a text-to-speech algorithm generate an output. 
Paquier discloses a method wherein the plurality of parameters are based at least partly on whether the speaker or a text-to-speech algorithm generate an output (p. 0032-0034), to provide various modalities.
Therefore, it would have been obvious to one of ordinary skill in the art to modify the method as described above, to provide synthesized data.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  This information has been detailed in the PTO 892 attached (Notice of References Cited).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAKIEDA R JACKSON whose telephone number is (571)272-7619.  The examiner can normally be reached on Mon - Fri 6:30a-2:30p.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at 571.272.5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/JAKIEDA R JACKSON/Primary Examiner, Art Unit 2657