DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 09/06/2022 has been entered. 
Claims 2-13 and 15-16 are pending in the application and have been examined. Claims 1 and 14 have been cancelled.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The response filed on 09/06/2022 has been accepted and considered in this Office Action. Claims 2-13 and 15-16 have been examined. Claims 1 and 14 have been cancelled.
Response to Arguments
Applicant's arguments filed 09/06/2022 have been fully considered as follows:
Applicant’s arguments with respect to claim 2 on pg. 9 state that:
“The Office Action asserts, on page 15, that Bocklet teaches the aforementioned feature of acquiring a model, and also asserts that "background noise is interpreted to include non-speech like silence or the like."  However, by contrast, amended claim 2 recites that the keyword includes speech, the background noise includes non-speech, and that the keyword and the background noise do not include silence. As such, it is respectfully submitted that as a result of the present amendment, "background noise" cannot be "interpreted to include non-speech like silence or the like...”
	
Applicant’s arguments above with respect to claim 2 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
With respect to the art rejection(s) of the remaining dependent claims under 35 U.S.C. 103, to the extent those claims are discussed and/or argued for at least the same rationale presented in the Remarks filed 09/06/2022, Examiner respectfully notes as follows. For completeness, should those claims be traversed for reasons similar to those given for independent claims 2 and 13, Examiner respectfully directs Applicant to the response regarding claims 2 and 13 provided above. For at least those same reasons, Examiner respectfully disagrees; Applicant's arguments have been fully considered but are not persuasive.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 2-13 and 15-16 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Wu, Minhua, et al. "MONOPHONE-BASED BACKGROUND MODELING FOR TWO-STAGE ON-DEVICE WAKE WORD DETECTION." (cited in IDS).
Regarding claim 2, Wu teaches an apparatus comprising: a first acquisition processor configured to acquire speech data including a plurality of frames (see Wu, pg. 5495, sect. 2.1, where speech and non-speech frames are processed); a second acquisition processor configured to acquire a model trained to, upon input of a feature amount extracted from the speech data, output information indicative of likelihood of each of a plurality of classes including a component of a keyword and a component of background noise other than the keyword, the keyword including speech, the background noise including non-speech, the keyword and the background noise not including silence (see Wu, pg. 5495, sect. 2, which teaches that a foreground HMM consists of wake word phones and several non-speech frames at the beginning, while the background HMM consists of a loop over single-state speech and non-speech events); a first calculation processor configured to calculate a keyword score indicative of occurrence probability of the component of the keyword, based on the information output from the model, by extracting the feature amount for each of the frames of the speech data and inputting the feature amount to the model (see Wu, pg. 5495, sect. 2.1, where the first-stage HMM decoding graph is used to compute posteriors for the keyword “Alexa” based on acoustic features); a second calculation processor configured to determine whether or not the speech data includes a candidate for the keyword based on the keyword score and a first threshold, and if the speech data is determined to include the candidate for the keyword, calculate a background noise score indicative of occurrence probability of the component of the background noise, based on the information output from the model, by extracting the feature amount for each of the frames corresponding to the candidate for the keyword and inputting the feature amount to the model (see Wu, pg. 5495, sect. 2: a wake word is hypothesized by the first stage if the final state of the foreground HMM is reached and the difference between foreground and background log-likelihoods during the candidate segment exceeds a threshold; the background log-likelihoods are interpreted as the background noise score); and a determination processor configured to determine whether or not the speech data includes the keyword based on at least the background noise score and a second threshold (see Wu, pg. 5495, sects. 2 and 3: a match score is computed to determine the real wake word; the match score processing based on the wake word and background scores is interpreted as the second threshold).
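For illustration only, and not as part of the grounds of rejection, the two-stage decision recited in claim 2 can be sketched as follows. The function name, the use of a mean over per-frame scores, and the direction of the second comparison (taken from claim 6) are assumptions made for this sketch; neither the claims nor Wu prescribe this exact form.

```python
from typing import Sequence


def detect_keyword(
    keyword_frame_scores: Sequence[float],
    background_frame_scores: Sequence[float],
    first_threshold: float,
    second_threshold: float,
) -> bool:
    """Hypothetical two-stage keyword decision over per-frame scores."""
    # Stage 1: a keyword candidate exists only if the keyword score
    # (assumed here to be the mean per-frame score) exceeds the first threshold.
    keyword_score = sum(keyword_frame_scores) / len(keyword_frame_scores)
    if keyword_score <= first_threshold:
        return False

    # Stage 2: the background noise score is computed only over the frames
    # belonging to the candidate and compared against the second threshold.
    background_score = sum(background_frame_scores) / len(background_frame_scores)
    return background_score < second_threshold
```

In this sketch the background noise score is never evaluated unless the first stage has produced a candidate, which mirrors the conditional ordering of the claim language.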
Regarding claim 3, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein the information includes correspondence between a phoneme as the component of the keyword and a first Hidden Markov Model, and correspondence between a phoneme as the component of the background noise and a second Hidden Markov Model (Wu, pg. 5495, sect 2.1 A wake word is hypothesized by the first stage if the final state of the foreground HMM is reached and the difference between foreground and background log-likelihoods during the candidate segment exceeds a threshold; phoneme of keyword and first HMM. Wu, pg. 5495, sect 3.2 teaches calculating a match score based on the background model which is learned from non-keyword audio, we would expect lower match score between the real wake word phone segment and background monophones, but higher match score between phone segment in false wake word hypothesis and background monophones).
Regarding claim 4, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein in calculating the keyword score, the first calculation processor calculates occurrence probability of correspondence between a phoneme as the component of the keyword and a Hidden Markov Model, and calculates a cumulative value of the occurrence probability of the correspondence by using Viterbi algorithm (Wu, pg. 5495, sect. 2.1, The foreground HMM consists of wake word phones and several non-speech frames at the beginning, while the background HMM consists of a loop over single-state speech and non-speech events. Viterbi decoding is performed on the HMM decoding graph using frames of acoustic features computed from the audio signal, and we are using a Deep Neural Network (DNN) based acoustic model to compute posteriors of different HMM states).
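For illustration only, a cumulative value obtained by Viterbi decoding over an HMM can be sketched as below. This is a generic log-domain Viterbi recursion, not Wu's decoding graph; the function name and the input layout (per-frame state log-likelihoods, a log transition matrix, and log initial probabilities) are assumptions made for this sketch.

```python
import math
from typing import List


def viterbi_score(
    log_emissions: List[List[float]],
    log_trans: List[List[float]],
    log_init: List[float],
) -> float:
    """Return the cumulative log-probability of the best state path."""
    n_states = len(log_init)
    # delta[s]: best cumulative log-probability of any path ending in state s.
    delta = [log_init[s] + log_emissions[0][s] for s in range(n_states)]
    for frame in log_emissions[1:]:
        delta = [
            max(delta[prev] + log_trans[prev][s] for prev in range(n_states))
            + frame[s]
            for s in range(n_states)
        ]
    return max(delta)


# Two-state left-to-right example; forbidden moves use log(0) = -inf.
NEG_INF = float("-inf")
example_score = viterbi_score(
    log_emissions=[
        [math.log(0.9), math.log(0.1)],
        [math.log(0.2), math.log(0.8)],
    ],
    log_trans=[[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]],
    log_init=[0.0, NEG_INF],
)
```

In the example, the best path starts in the first state and moves to the second, so the cumulative value is the log of 0.9 x 0.5 x 0.8.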
Regarding claim 5, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein in calculating the background noise score, the second calculation processor calculates occurrence probability of correspondence between a phoneme as the component of the background noise and a Hidden Markov Model and calculates a cumulative value of the occurrence probability of the correspondence by using Viterbi algorithm (Wu, pg. 5495, sect. 3.1, Figure 2 is a simplified version of the new background model, since we are actually using 3-state HMM topology for these background monophones. We should still be able to use Viterbi decoding and DNN based acoustic model on the new decoding graph, but output targets of the DNN will be expanded accordingly).
Regarding claim 6, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein if the keyword score is larger than a first threshold, and the background noise score is smaller than a second threshold, the determination processor determines that the speech data includes the keyword (see Wu, pg. 5495, sect. 2.1 & 3.2, When the first stage log-likelihood ratio exceeds the threshold, the corresponding audio segment X which runs through is treated as a candidate wake word (first threshold). Additionally, we come up with a new score (MatchScore p,q) measuring the degree of match between each candidate's wake word phone segment p and every background monophone q, as indicated in equation 1. For each frame X_t within one wake word phone p, we take the maximum log likelihood among the three states of each background monophone q, and average these log likelihoods over the phone duration of p. Since the background model is only learned from non-keyword audio, we would expect lower match score between the real wake word phone segment and background monophones, but higher match score between phone segment in false wake word hypothesis and background monophones; the match score is interpreted as comparison with background noise score and second threshold).
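For illustration only, the match score computation described in the quoted passage (per-frame maximum over the three states of a background monophone q, averaged over the duration of wake word phone p) can be sketched as follows. The function name and the input layout are assumptions made for this sketch and are not taken from Wu.

```python
from typing import Sequence


def match_score(state_log_likelihoods: Sequence[Sequence[float]]) -> float:
    """Hypothetical match score between phone segment p and monophone q.

    state_log_likelihoods[t] holds the log-likelihoods of frame X_t under
    each of the three HMM states of background monophone q.
    """
    if not state_log_likelihoods:
        raise ValueError("phone segment must contain at least one frame")
    # Take the best-matching state for every frame in the segment ...
    best_per_frame = [max(states) for states in state_log_likelihoods]
    # ... and average those maxima over the phone duration.
    return sum(best_per_frame) / len(best_per_frame)
```

Under the reasoning quoted above, a lower value would be expected for a real wake word phone segment and a higher value for a false wake word hypothesis.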
Regarding claim 7, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein if a difference between the keyword score and the background noise score is larger than a third threshold, the determination processor determines that the speech data includes the keyword (see Wu, pg. 5495, sect. 2, A wake word is hypothesized by the first stage if the final state of the foreground HMM is reached and the difference between foreground and background log-likelihoods during the candidate segment exceeds a threshold).
Regarding claim 8, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein if a ratio between the keyword score and the background noise score is larger than a fourth threshold, the processor determines that the speech data includes the keyword (Wu, pg. 5495, sect. 2.2, When the first stage log-likelihood ratio exceeds the threshold, the corresponding audio segment X which runs through is treated as a candidate wake word; the first stage is foreground and background log-likelihoods).
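For illustration only, the relationship between a difference of log-likelihoods and a log-likelihood ratio can be sketched as below. The function names and numeric values are hypothetical and do not appear in Wu or in the claims; the sketch relies only on the identity log(a/b) = log(a) - log(b).

```python
import math


def exceeds_difference_threshold(
    foreground_loglik: float, background_loglik: float, threshold: float
) -> bool:
    """Compare the difference of two log-likelihoods against a threshold."""
    return (foreground_loglik - background_loglik) > threshold


def log_likelihood_ratio(foreground_lik: float, background_lik: float) -> float:
    """Log of the ratio of two (positive) likelihoods."""
    return math.log(foreground_lik / background_lik)


# The log of a likelihood ratio equals the difference of the log-likelihoods.
ratio_form = log_likelihood_ratio(0.8, 0.2)
difference_form = math.log(0.8) - math.log(0.2)
```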
Regarding claim 9, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein if the keyword score is larger than the first threshold, the second calculation processor determines that the speech data includes the candidate for the keyword, and calculates the background noise score for the frames corresponding to the candidate for the keyword by using start information and end information of the candidate for the keyword, and the determination processor determines that the speech data includes the keyword if the background noise score is smaller than the second threshold (Wu, pg. 5495, sect. 2.2 and 3.2, When the first stage log-likelihood ratio exceeds the threshold, the corresponding audio segment X which runs through is treated as a candidate wake word. Specifically, the feature vector v for the candidate wake word X includes information from both the entire segment and individual phone segments. Segment-level features include the duration, keyword likelihood score, normalized likelihood score and posterior for the keyword. Additionally, we come up with a new score (MatchScore p,q) measuring the degree of match between each candidate's wake word phone segment p and every background monophone q, as indicated in equation 1. For each frame X_t within one wake word phone p, we take the maximum log likelihood among the three states of each background monophone q, and average these log likelihoods over the phone duration of p. Since the background model is only learned from non-keyword audio, we would expect lower match score between the real wake word phone segment and background monophones, but higher match score between phone segment in false wake word hypothesis and background monophones; the match score is interpreted as comparison with background noise score and second threshold).
Regarding claim 10, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein if the keyword score is larger than the first threshold, the second calculation processor determines that the speech data includes the candidate for the keyword, and calculates the background noise score for the frames corresponding to the candidate for the keyword by using start information and end information of the candidate for the keyword, and if a difference between the keyword score and the background noise score is larger than a third threshold, the determination processor determines that the speech data includes the keyword (Wu, pg. 5495, sect. 2.1 and 2.2, A wake word is hypothesized by the first stage if the final state of the foreground HMM is reached (interpreted as keyword score larger than first threshold) and the difference between foreground and background log-likelihoods during the candidate segment exceeds a threshold (interpreted as third threshold). When the first stage log-likelihood ratio exceeds the threshold, the corresponding audio segment X which runs through is treated as a candidate wake word).
Regarding claim 11, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein if the keyword score is larger than the first threshold, the second calculation processor determines that the speech data includes the candidate for the keyword, and calculates the background noise score for the frames corresponding to the candidate for the keyword by using start information and end information of the candidate for the keyword, and if a ratio between the keyword score and the background noise score is larger than a fourth threshold, the determination processor determines that the speech data includes the keyword (Wu, pg. 5495, sect. 2.2, A wake word is hypothesized by the first stage if the final state of the foreground HMM is reached (interpreted as keyword score larger than first threshold). When the first stage log-likelihood ratio exceeds the threshold (interpreted as fourth threshold), the corresponding audio segment X which runs through is treated as a candidate wake word; the first stage is foreground and background log-likelihoods).
Regarding claim 12, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein the classes include a plurality of components of the background noise, and the second calculation processor calculates the background noise score for each of the plurality of components of the background noise in each of the frames (see Wu, pg. 5495, sect 3.1, the Monophone Based Background model calculates the background score).
Regarding claim 15, Wu teaches the apparatus of claim 2. Furthermore, Wu teaches wherein the background noise includes speech (see Wu, pg. 5495, Fig. 1, where the background HMM includes a speech/non-speech loop).
Regarding claim 13, the claim is directed to a method corresponding to the apparatus of claim 2 and is rejected on the same grounds stated above regarding claim 2.
Regarding claim 16, the claim is directed to a method corresponding to the apparatus of claim 15 and is rejected on the same grounds stated above regarding claim 15.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Morin, US Patent 6,985,859 teaches a method for spotting words in a speech signal and further provides for calculating a first confidence score based on a matching ratio between a first minimum recognition value and a first background score. The spotting module continuously estimates the background score of each word. (see Morin, Fig. 4, Col 6, lines 6-19 and Col 4, lines 53-57).
Weiss et al., US Patent 8,131,543 teaches a classifier which includes a Gaussian mixture model for speech and a Gaussian mixture model for noise and uses a speech/noise probability (SNP) calculator to determine the probabilities that a frame is associated with noise, speech, or both (see Weiss, col 6, lines 14-24).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday, 12:00 pm - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta, can be reached at (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NANDINI SUBRAMANI/Examiner, Art Unit 2656
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656