DETAILED ACTION
Introduction
1.	This office action is in response to Applicant’s submission filed on October 27, 2022.  Claims 1, 3-11, 13-21, and 23-33 are pending in the application and have been examined.

Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to RCE and Amendment
3.	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on October 27, 2022 has been entered.
4.	Applicant’s arguments and amendments in the Amendment filed October 27, 2022 (herein “Amendment”) with respect to rejections of independent Claims 1, 11, 21, and 30 under 35 U.S.C. 103 have been fully considered, but are not considered persuasive.  With regard to the feature “select a subset of band outputs from the plurality of band outputs from the domain transformation circuit by filtering symmetric band outputs from the plurality of band outputs,” it is noted that “filtering” is broader than “discarding,” which is how the specification describes the treatment of symmetric bands.  “Filtering” can be considered simply modifying some of the data in some way.  In this regard, Ramirez describes in Section 3.1.1 that the input signal is sent through a filter bank.  This would “filter” all of the data, including symmetric band outputs.  Since the claims do not recite “filtering only symmetric band outputs from the plurality of band outputs” or “removing symmetric band outputs from the plurality of band outputs,” Ramirez’s description of filtering all data renders obvious the claimed feature.  Therefore, in view of the above, while all of Applicant’s amendments and arguments have been fully considered, they are not persuasive with respect to Claims 1, 11, 21, and 30.
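For illustration only, the narrower “discarding” reading can be sketched as follows.  The sketch assumes that the band outputs are the bins of a discrete Fourier transform of a real-valued frame, in which case the bins above N/2 mirror those below it.  The function names dft and discard_symmetric and the sample values are hypothetical and appear in neither the specification nor Ramirez.

```python
import cmath

def dft(frame):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]

def discard_symmetric(bands):
    """Keep bins 0..N/2; for real input the remaining bins are their conjugates."""
    return bands[: len(bands) // 2 + 1]

frame = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.25, -0.5]  # illustrative samples only
bands = dft(frame)                 # plurality of band outputs
subset = discard_symmetric(bands)  # subset with the mirrored bins removed
```

Under this assumption, an 8-sample frame yields 8 band outputs, of which 5 remain once the mirrored bins are dropped.  Passing every band through a filter bank, by contrast, modifies the band outputs without necessarily reducing their number.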

Claim Rejections - 35 USC § 103
5.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


6.	Claims 1, 3, 6-9, 11, 13, 16-19, 21, 23, and 26-30 are rejected under 35 U.S.C. 103 as being unpatentable over “Speech/Non-Speech discrimination combining advanced feature extraction and SVM learning” (Ramirez et al., hereinafter “Ramirez”) (Cited in IDS filed December 3, 2021) in view of U.S. Patent App. Pub. No. 20200395042 (Hanazawa) and U.S. Patent App. Pub. No. 20200279557 (Li, hereinafter “Li”).
	With regard to Claim 1, Ramirez describes:
A processing system configured for performing voice activity detection, comprising:
receive audio data from an audio source; (Figure 1 and Section 3, noisy speech is input to a feature extraction unit as shown below)
generate a plurality of model input features using a [[hardware-based]] feature generator based on the received audio data, (Input features X are generated by the feature extraction unit, Figure 1 and Section 3) wherein in order to generate the plurality of model input features using the hardware-based feature generator, the one or more processors are further configured to cause the processing system to:
preprocess the received audio data to generate domain transformation input data; (Section 3.1 the power spectral magnitude Xl is cited as “domain transformation input data”)
generate a plurality of band outputs with a domain transformation circuit based on the domain transformation input data; (Section 3.1.1 the signal magnitude El is cited as “band outputs”)
select a subset of band outputs from the plurality of band outputs from the domain transformation circuit by filtering symmetric band outputs from the plurality of band outputs; (Section 3.1.1 k out of K of the signal magnitudes El are selected after the input signal is sent through a filter bank, which is cited as “filtering symmetric band outputs.”  As all of the band outputs are filtered, the subset of symmetric band outputs are also filtered.) and
determine a signal to noise ratio for each band output of the subset of band outputs, (Section 3.1.1, equation 7)
wherein each signal to noise ratio for each band output is a model input feature of the plurality of model input features; and (Section 3.1.1, the output SNRs are the feature vector.)
determine a presence of voice activity in the audio data based on an output value generated by a [[hardware-based]] voice activity detection model based on the model input features.  (Voice activity Detection (VAD) flag generated based on features, Figure 1 and Section 3)

[Image: Figure 1 of Ramirez (media_image1.png, greyscale)]

	Ramirez does not explicitly describe:
	“a memory comprising computer-executable instructions;
one or more processors configured to execute the computer-executable instructions and cause the processing system to: …
wherein the subset of band outputs comprises fewer band outputs than the plurality of band outputs.”
or using “hardware-based” elements.
However, Hanazawa describes a voice activity detector including processor 102 and memory 103 (Figure 12B and paragraph 107).  Further, paragraphs 106 and 113 of Hanazawa describe that the device can perform the needed functions with software, hardware, firmware, or a combination thereof.  Further, the broadest reasonable interpretation of “hardware-based” includes a processor and memory.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the hardware of Hanazawa into the system of Ramirez to provide more design flexibility, as described in paragraphs 105-113 of Hanazawa.
Ramirez in view of Hanazawa does not explicitly describe “wherein the subset of band outputs comprises fewer band outputs than the plurality of band outputs.”
However, paragraph 8 of Li describes processing only a subset of frequency band outputs.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the subset of outputs of Li into the system of Ramirez in view of Hanazawa to focus on known frequency bands of interest, as described in paragraph 8 of Li.
With regard to Claim 3, Ramirez describes “the [[hardware-based]] feature generator comprises a [[hardware-implemented]] fast Fourier transformation circuit.” (Section 3.1, a DFT)
Ramirez does not explicitly describe using “hardware-based” elements.
However, paragraphs 106 and 113 of Hanazawa describe that a voice activity detection device can perform the needed functions with software, hardware, firmware, or a combination thereof.  Further, the broadest reasonable interpretation of “hardware-based” includes a processor and memory.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the hardware of Hanazawa into the system of Ramirez to provide more design flexibility, as described in paragraphs 105-113 of Hanazawa.
With regard to Claim 6, Ramirez describes “the [[hardware-based]] voice activity detection model comprises a [[hardware-implemented]] SVM model.”  (Section 3.1)
Ramirez does not explicitly describe using “hardware-based” elements.
However, paragraphs 106 and 113 of Hanazawa describe that a voice activity detection device can perform the needed functions with software, hardware, firmware, or a combination thereof.  Further, the broadest reasonable interpretation of “hardware-based” includes a processor and memory.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the hardware of Hanazawa into the system of Ramirez to provide more design flexibility, as described in paragraphs 105-113 of Hanazawa.
With regard to Claim 7, Ramirez describes “wherein the [[hardware-implemented]] SVM model comprises:
a first, multi-column SVM circuit; (SVM training circuit in Figure 1, the multi-columns are noted by the index i.) and
a second, single-column SVM circuit configured to generate the output value.”  (SVM VAD circuit in Figure 1.)
Ramirez does not explicitly describe using “hardware-based” elements.
However, paragraphs 106 and 113 of Hanazawa describe that a voice activity detection device can perform the needed functions with software, hardware, firmware, or a combination thereof.  Further, the broadest reasonable interpretation of “hardware-based” includes a processor and memory.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the hardware of Hanazawa into the system of Ramirez to provide more design flexibility, as described in paragraphs 105-113 of Hanazawa.
With regard to Claim 8, Ramirez does not explicitly describe “the one or more processors are further configured to cause the processing system to:
load a plurality of model parameters for the hardware-implemented SVM model into the memory.”
However, paragraphs 102-104 of Hanazawa describe that a storage unit 100 stores parameters of a neural network for a processor to use.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the stored parameters of Hanazawa into the system of Ramirez to allow sharing of the parameters between a learning circuit and a detection circuit, as described in paragraphs 103-105 of Hanazawa.
With regard to Claim 9, Ramirez does not explicitly describe that “the subset of band outputs comprises eight band outputs.”  However, Section 3.1 of Ramirez describes that there are K band outputs, and thus it would be a simple design choice yielding predictable results to select 8 for the parameter K.
With regard to Claims 11, 13, and 16-19, system Claim 1 and method Claim 11 are related as system and the method of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, Claim 11 is similarly rejected under the same rationale as applied above with respect to Claim 1.  In a similar manner, corresponding dependent Claims 13 and 16-19 are rejected under the same rationale as Claims 3 and 6-9.
With regard to Claims 21, 23, and 26-29, system Claim 1 and computer readable medium Claim 21 are related as system and a non-transitory computer readable medium which perform the method of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, Claim 21 is similarly rejected under the same rationale as applied above with respect to Claim 1, and to the extent Ramirez does not explicitly teach a non-transitory computer readable medium, at least Hanazawa teaches a non-transitory computer readable medium as claimed in claim 21, in Hanazawa paras. 108 and 111.  In a similar manner, corresponding dependent Claims 23 and 26-29 are rejected under the same rationale as Claims 3 and 6-9.
With regard to Claim 30, system Claim 1 and means plus function Claim 30 are related as system and a means plus function claim which recites the same functions. Accordingly, Claim 30 is similarly rejected under the same rationale as applied above with respect to Claim 1.  

7.	Claims 4, 10, 14, 20, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Ramirez in view of Hanazawa and Li and further in view of U.S. Patent App. Pub. No. 20140358552 (Xu).
With regard to Claim 4, Ramirez describes:
“in order to determine the signal to noise ratio for each band output of the subset of band outputs, the one or more processors are further configured to cause the processing system to: 
determine a noise [[floor]] for each band output of the subset of band outputs;  (Section 3.3.1, Equation 6, noise is NlB)
apply a log function to the noise [[floor]] for each band output of the subset of band outputs; (Section 3.3.1, Equation 7, noise is NlB)
determine a signal power level for each band output of the subset of band outputs; (Section 3.3.1, Equation 6, signal is ElB) and 
apply the log function to the signal power level for each band output of the subset of band outputs, (Section 3.3.1, Equation 7, signal is ElB)
wherein the signal to noise ratio for each band output of the subset of band outputs comprises a log signal to noise ratio.” (Section 3.3.1, Equation 7)
However, Ramirez in view of Hanazawa and Li does not describe that the noise is a noise floor.
Paragraph 48 of Xu describes that a noise floor is used as the noise level for computing an SNR.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the noise floor of Xu into the system of Ramirez in view of Hanazawa and Li to allow smoothing of the voice detection probability, as described in paragraph 47 of Xu.
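For illustration only, the claimed log signal to noise ratio can be sketched as the difference between the log of the signal power and the log of the noise floor in each band.  The minimum-tracking noise floor estimate, the base-10 logarithm, the eps guard, and the names noise_floor and log_snr are assumptions made for this sketch; they are not taken from Ramirez, Xu, or the specification.

```python
import math

def noise_floor(history):
    """Assumed estimator: lowest power observed in each band over past frames."""
    return [min(band) for band in zip(*history)]

def log_snr(signal_power, noise_power, eps=1e-12):
    """Per-band log SNR computed as log(signal) - log(noise)."""
    return [
        math.log10(s + eps) - math.log10(n + eps)
        for s, n in zip(signal_power, noise_power)
    ]

# Each inner list holds the power in three bands for one past frame (made-up values).
history = [[4.0, 2.0, 1.0], [1.0, 3.0, 0.5], [2.0, 1.0, 2.0]]
current = [10.0, 100.0, 0.5]

floor = noise_floor(history)        # per-band noise floor
features = log_snr(current, floor)  # one model input feature per band
```

In this sketch, each entry of features would serve as one model input feature for the corresponding band.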
With respect to Claim 10, Ramirez in view of Hanazawa and Li does not explicitly describe “the audio source comprises one or more microphones of the processing system.”
However, paragraph 27 of Xu describes microphone 102.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the microphone of Xu into the system of Ramirez in view of Hanazawa and Li to allow for voice signals to be detected, as described in paragraph 28 of Xu.
With regard to Claims 14 and 20, system Claim 1 and method Claim 11 are related as system and the method of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, corresponding dependent Claims 14 and 20 are rejected under the same rationale as Claims 4 and 10.
With regard to Claim 24, system Claim 1 and computer readable medium Claim 21 are related as system and a computer readable medium which perform the method of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, corresponding dependent Claim 24 is rejected under the same rationale as Claim 4.

8.	Claims 5, 15, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Ramirez in view of Hanazawa and Li and further in view of U.S. Patent App. Pub. No. 20110090100 (Shemirani et al., hereinafter “Shem”).
With regard to Claim 5, Ramirez in view of Hanazawa and Li does not explicitly describe:
“in order to preprocess the received audio data, the one or more processors are further configured to cause the processing system to:
split the received audio data into a first audio data stream and a second audio data stream;
apply a delay function to the second audio data stream to generate a delayed second audio data stream;
apply a window function to the first audio data stream and the delayed second audio data stream; and
apply a serial to parallel conversion to the first audio data stream and the delayed second audio data stream.”
However, Shem describes:
“in order to preprocess the received audio data, the one or more processors are further configured to cause the processing system to:
split the received audio data into a first audio data stream and a second audio data stream; (Paragraph 45, a single input signal may go to several inputs of a signal combiner.)
apply a delay function to the second audio data stream to generate a delayed second audio data stream; (Paragraph 46 describes that a delay line may be used to delay one of the input signals.)
apply a window function to the first audio data stream and the delayed second audio data stream; and (Paragraph 49 describes that the bandwidth of a signal combiner may vary, and this bandwidth is cited as “a window function.”)
apply a serial to parallel conversion to the first audio data stream and the delayed second audio data stream.”  (Paragraph 49 describes that the routing circuit described may include a serial to parallel converter to merge multiple signals.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the split-delay-combine functions of Shem into the system of Ramirez in view of Hanazawa and Li so that the input signal for the passive signal combiner will be substantially continuous, as described in paragraph 49 of Shem.
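For illustration only, one possible reading of the four recited preprocessing steps is sketched below.  The Hann window, the zero-padded delay, the frame length, and the names hann and preprocess are assumptions made for this sketch; they are not taken from Shem, Ramirez, or the specification.

```python
import math

def hann(n):
    """Periodic Hann window of length n (an assumed choice of window function)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def preprocess(samples, frame_len, delay):
    # Split: the same input feeds two streams.
    first = list(samples)
    # Delay: shift the second stream by `delay` samples, padding with zeros.
    second = ([0.0] * delay + list(samples))[: len(samples)]

    window = hann(frame_len)

    def window_and_parallelize(stream):
        # Window: weight the serial stream with a repeating window.
        weighted = [x * window[i % frame_len] for i, x in enumerate(stream)]
        # Serial to parallel: group the serial samples into frame_len-wide blocks.
        return [
            weighted[start : start + frame_len]
            for start in range(0, len(weighted) - frame_len + 1, frame_len)
        ]

    return window_and_parallelize(first), window_and_parallelize(second)

frames_a, frames_b = preprocess([1.0] * 8, frame_len=4, delay=2)
```

With a delay of half a frame, the blocks of the second stream overlap those of the first by two samples, which is one way the two streams together could cover the input without gaps at the window edges.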
With regard to Claim 15, system Claim 1 and method Claim 11 are related as system and the method of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, corresponding dependent Claim 15 is rejected under the same rationale as Claim 5.
With regard to Claim 25, system Claim 1 and computer readable medium Claim 21 are related as system and a computer readable medium which perform the method of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, corresponding dependent Claim 25 is rejected under the same rationale as Claim 5.

9.	Claims 31 and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Ramirez in view of Hanazawa and Li and further in view of U.S. Patent App. Pub. No. 20160035350 (Jung et al., hereinafter “Jung”).
With regard to Claim 31, Ramirez in view of Hanazawa and Li does not explicitly describe “the hardware-based voice activity detection model comprises a context-specific model associated with an audio environment from a plurality of audio environments.”
However, paragraph 41 of Jung describes selecting a context-specific model associated with a specific audio environment from a plurality of audio environments.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the multiple models for each of a plurality of environments of Jung into the system of Ramirez in view of Hanazawa and Li to update the current model based on environmental effects, as described in paragraph 41 of Jung.
With regard to Claim 32, Ramirez in view of Hanazawa and Li does not explicitly describe “identifying the audio environment from which the audio data was received based on the plurality of model input features.”
However, paragraphs 72 and 73 of Jung describe identifying the audio environment (such as an environment with a vacuum cleaner running) from which the audio data was received based on the plurality of model input features.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the multiple models for each of a plurality of environments of Jung into the system of Ramirez in view of Hanazawa and Li to update the current model based on environmental effects, as described in paragraph 41 of Jung.

10.	Claim 33 is rejected under 35 U.S.C. 103 as being unpatentable over Ramirez in view of Hanazawa, Li, and Jung and further in view of U.S. Patent App. Pub. No. 20060100866 (Alewine et al., hereinafter “Alewine”) and U.S. Patent App. Pub. No. 20170111737 (Painter et al., hereinafter “Painter”).
With regard to Claim 33, Ramirez in view of Hanazawa, Li, and Jung does not explicitly describe “the plurality of model input features comprises a smoothed energy measurement and a set of smoothed spectral coefficients.”
However, paragraph 33 of Alewine describes using a normalized energy measurement as a feature, and paragraphs 38 and 40 of Alewine describe using spectral coefficients as a feature.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the features of Alewine into the system of Ramirez in view of Hanazawa, Li, and Jung to minimize resource usage, as described in paragraph 40 of Alewine.
Ramirez in view of Hanazawa, Li, Jung, and Alewine does not explicitly describe using smoothed features.
However, paragraph 127 of Painter describes smoothing features such as energy measurements.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the smoothing of features of Painter into the system of Ramirez in view of Hanazawa, Li, Jung, and Alewine to reduce/remove audio artifacts, as described in paragraph 127 of Painter.
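For illustration only, smoothing of a per-frame feature such as an energy measurement can be sketched with an exponential moving average.  The exponential moving average, the smoothing factor alpha, and the name smooth are assumptions made for this sketch; the particular smoothing described in Painter is not reproduced here.

```python
def smooth(values, alpha=0.5):
    """Exponential moving average; alpha is an assumed smoothing factor."""
    smoothed = []
    prev = None
    for v in values:
        # The first value passes through; later values blend with the running average.
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed

energy = [1.0, 3.0, 2.0, 6.0]     # made-up per-frame energy measurements
smoothed_energy = smooth(energy)  # [1.0, 2.0, 2.0, 4.0]
```

The same function could be applied to each spectral coefficient over time to obtain a set of smoothed spectral coefficients.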

Conclusion
11.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
U.S. Patent No. 9,779,755 (Kay et al.) describes a device that removes redundant portions of audio data.
12.	 A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
13.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWARD TRACY whose telephone number is (571)272-8332. The examiner can normally be reached Monday-Friday, 9 AM-5 PM.  Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/EDWARD TRACY JR./Examiner, Art Unit 2656
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656