DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 09/09/2022 has been entered. 
This office action is in response to Applicant’s submission filed on 09/09/2022. Claims 1-10 are pending in the application and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The response filed on 09/09/2022 has been correspondingly accepted and considered in this Office Action. Applicant’s amendments to claims 1, 2, 3, 5-7, 9 and 10 have been noted and claims 1-10 have been examined.
Response to Arguments
Applicant's arguments filed 09/09/2022 have been fully considered as follows:
Applicant’s arguments with respect to claim 1, on page 8, state that
“However, the technique noted in Heigold is completely different from the data-extraction technique recited in applicant's amended claim 1. Heigold differs from applicant's claim 1 in terms of the feature quantity to extract. Furthermore, Heigold does not describe or suggest "quantity" and "phoneme" recited in claim 1…. However, it is respectfully submitted that the "classifier" in Heigold depends on languages. On the other hand, the "keyword" in claim 1 does not change for each language. Thus, the Office's assertion is inconsistent with the amended claimed features…. However, "second training data" in claim 1 are not "parameters for the model" but rather data for training a model.”
Applicant’s arguments above with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. For independent claims 9 and 10, Examiner respectfully directs Applicant to the reasons provided above in the response directed to claim 1.
With respect to the remaining dependent claims rejected under 35 U.S.C. 103, to the extent those claims are argued for at least the same rationale presented in the Remarks filed 09/09/2022, Examiner respectfully notes as follows. For completeness, should those claims likewise be traversed for reasons similar to those given for independent claim 1, Examiner respectfully directs Applicant to the reasons provided above in the response directed to claim 1. For at least those reasons, Examiner likewise respectfully disagrees, and the arguments are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument regarding independent claim 1.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-4 and 7-10 are rejected under 35 U.S.C. §103 as being unpatentable over Fukuda et al., U.S. Patent 10,832,129, in view of Pearce, U.S. Patent 9,953,634.
Regarding claim 1, Fukuda teaches an information processing apparatus comprising: a processor configured to operate as (Fukuda, col 11 lines 46-55): a first data acquisition unit configured to acquire first training data including at least one combination of a voice feature quantity and a correct phoneme label of the voice feature quantity (Fukuda, Fig. 2 step 204, col 5 line 64-col 6 line 3: In step 204, the system inputs the obtained training data into the trained AM. In response to the input of the obtained training data, the system calculates each posterior probability of context-dependent states corresponding to each phoneme in a phoneme class of a phoneme to which each frame in the training data belongs); a training unit configured to train, using the first training data, an acoustic model in a manner to output the correct phoneme label in response to input of the voice feature quantity, the input being performed by the acoustic model, the acoustic model outputting the correct phoneme label (Fukuda, col 6 lines 50-53: In step 207, the system inputs the obtained training data, which was also input to the trained AM in step 204, into the NN and then updates the NN, using the soft label obtained in step 206; Fukuda, col 7 lines 21-45 teaches the training of the AM to output the correct phoneme label). However, Fukuda fails to teach an extraction unit configured to extract, from the first training data, second training data including one or more voice feature quantities of at least one of a preset keyword, a sub-word included in the preset keyword, a syllable included in the preset keyword, or a phoneme included in the preset keyword; and an adaptation processing unit configured to adapt, using the second training data, the trained acoustic model to a keyword model used for detection of the preset keyword, the keyword model being generated from the trained acoustic model, the keyword model adapted for the preset keyword being retrained based on the trained acoustic model.
However, Pearce teaches an extraction unit configured to extract, from the first training data, second training data including one or more voice feature quantities of at least one of a preset keyword, a sub-word included in the preset keyword, a syllable included in the preset keyword, or a phoneme included in the preset keyword (see Pearce, col 5 lines 30-36: The keyword sensing module 310 is operable to receive an utterance from a user (a spoken command or phrase) and determine whether the utterance includes a keyword or a key phrase. In response to that determination, the keyword sensing module 310 can pass along the recognized keyword or command for further processing; the keyword or key phrase is interpreted as the preset keyword); and an adaptation processing unit configured to adapt, using the second training data, the trained acoustic model to a keyword model used for detection of the preset keyword, the keyword model being generated from the trained acoustic model, the keyword model adapted for the preset keyword being retrained based on the trained acoustic model (Pearce, col 5 lines 60-65: The collected (and optionally processed) utterances are used to create and train a speaker-dependent keyword sensing model 330. The speaker-dependent keyword sensing model 330 can replace the speaker-independent keyword sensing model 320 after the training is complete or substantially complete (see FIG. 3B, for example)).
Fukuda and Pearce are considered to be analogous to the claimed invention because they relate to speech recognition training methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Fukuda on training acoustic models with the passive keyword training teachings of Pearce to improve the quality of training of the speaker-dependent model (see Pearce, col 1 lines 38-43).
Regarding claim 2, Fukuda in view of Pearce teaches the apparatus according to claim 1. Pearce further teaches the processor further configured to operate as a second data acquisition unit configured to acquire keyword utterance data including utterance voice of the preset keyword, wherein the adaptation processing unit adapts the acoustic model to the keyword model using the second training data and the keyword utterance data (Pearce, Fig. 4, steps 404 and 408 teach updating the trained speaker-dependent model based on detecting the keyword or the key phrase in the spoken utterance to create a speaker-dependent model for the keyword).
Regarding claim 3, Fukuda in view of Pearce teaches the apparatus according to claim 1. Pearce further teaches wherein the extraction unit extracts, as the second training data, a data piece in which a proportion in number of a letter of the preset keyword, a letter of the sub-word, the syllable, or the phoneme to the data piece is a predetermined value or more (Pearce, Fig. 3A and col 5 lines 51-60 teach sensing the user's/speaker's spoken words and/or phrases (utterances), and the collected (and optionally processed) utterances are used to create and train a speaker-dependent keyword sensing model 330).
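For illustration only, the proportion-based extraction recited in claim 3 may be sketched as follows. The sketch is not taken from Applicant's disclosure, Fukuda, or Pearce. It assumes that each data piece carries a sequence of phoneme labels and that the proportion is the fraction of those labels that also occur in the preset keyword; the names `keyword_proportion` and `extract_second_training_data` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical data piece: (phoneme labels of the utterance, identifier of its features)
DataPiece = Tuple[Sequence[str], str]


def keyword_proportion(labels: Sequence[str], keyword_phonemes: Sequence[str]) -> float:
    """Fraction of labels in a data piece that are phonemes of the keyword (assumed definition)."""
    if not labels:
        return 0.0
    targets = set(keyword_phonemes)
    hits = sum(1 for phoneme in labels if phoneme in targets)
    return hits / len(labels)


def extract_second_training_data(
    first_training_data: List[DataPiece],
    keyword_phonemes: Sequence[str],
    threshold: float,
) -> List[DataPiece]:
    """Keep only data pieces whose keyword proportion is the predetermined value or more."""
    return [
        piece
        for piece in first_training_data
        if keyword_proportion(piece[0], keyword_phonemes) >= threshold
    ]


first = [
    (["k", "a", "t", "o"], "utt1"),  # 3 of 4 labels belong to the keyword
    (["m", "i", "z", "u"], "utt2"),  # no label belongs to the keyword
    (["k", "a", "m", "i"], "utt3"),  # 2 of 4 labels belong to the keyword
]
second = extract_second_training_data(first, ["k", "a", "t"], threshold=0.5)
print([identifier for _, identifier in second])
```

Under these assumptions, a data piece is retained only when its proportion meets or exceeds the predetermined value; a letter-, sub-word-, or syllable-based proportion would follow the same pattern with different units.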
Regarding claim 4, Fukuda in view of Pearce teaches the apparatus according to claim 1. Pearce further teaches wherein the extraction unit extracts the second training data up to a predetermined number of data pieces (Pearce, col 7 lines 34-40: At block 402, the illustrated method 400 includes utilizing a speaker-independent model to detect a spoken keyword or a key phrase in spoken utterances. At block 404, the method includes passively training a speaker-dependent model to detect the spoken keyword or the key phrase in the spoken utterances using at least partially the spoken utterances; the keyword detection is interpreted as extracting the second training data up to a predetermined number of data pieces).
Regarding claim 7, Fukuda in view of Pearce teaches the apparatus according to claim 1. Pearce further teaches a preset keyword setting unit configured to receive setting of the keyword from a user (Pearce, col 5 lines 30-36: The keyword sensing module 310 is operable to receive an utterance from a user (a spoken command or phrase) and determine whether the utterance includes a keyword or a key phrase. In response to that determination, the keyword sensing module 310 can pass along the recognized keyword or command for further processing).
Regarding claim 8, Fukuda in view of Pearce teaches a keyword detecting apparatus configured to perform keyword detection using a keyword model adapted by the apparatus according to claim 1 (Fukuda, col 11 line 38-col 12 line 5). Claim 8 is rejected on the same grounds stated above regarding claim 1.
Regarding claim 9, the claim is directed to a method corresponding to the apparatus of claim 1 and is rejected on the same grounds stated above regarding claim 1.
Regarding claim 10, the claim is directed to a non-transitory computer readable medium including computer executable instructions corresponding to the apparatus of claim 1 and is rejected on the same grounds stated above regarding claim 1.
Claims 5 and 6 are rejected under 35 U.S.C. §103 as being unpatentable over Fukuda et al., U.S. Patent 10,832,129, in view of Pearce, U.S. Patent 9,953,634, and further in view of Liu (J. Liu, Z. Ling, S. Wei, G. Hu and L. Dai, "Cluster-based senone selection for the efficient calculation of deep neural network acoustic models," 2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2016, pp. 1-5).
Regarding claim 5, Fukuda in view of Pearce teaches the apparatus according to claim 1. Pearce teaches wherein the extraction unit extracts data pieces as the second training data up to a predetermined number of data pieces (Pearce, col 7 lines 34-40: At block 402, the illustrated method 400 includes utilizing a speaker-independent model to detect a spoken keyword or a key phrase in spoken utterances. At block 404, the method includes passively training a speaker-dependent model to detect the spoken keyword or the key phrase in the spoken utterances using at least partially the spoken utterances; the keyword detection is interpreted as extracting the second training data up to a predetermined number of data pieces). However, Fukuda in view of Pearce fails to teach in descending order according to a proportion in number of a letter of the preset keyword, a letter of the sub-word, the syllable, or the phoneme to a data piece.

However, Liu teaches in descending order according to a proportion in number of a letter of the preset keyword, a letter of the sub-word, the syllable, or the phoneme to a data piece (see Liu, pg. 3, col 2, lines 3-9, e.g., “Figure 1 demonstrates the DNN structure with cluster-based senone selection for output calculation. Original DNN structure and its weight parameters are kept intactly. A new cluster layer is added on the top hidden layer to predict selected senones.”; Liu, pg. 2, col 2, lines 11-14, “Mathematically, supposing the c-th cluster has Kc frames of acoustic features in the training data set, its average posterior vector can be calculated as (9). yL are sorted in descending order, and the top Nc senones, of which the accumulated posterior exceeds a predefined confidence α are determined as the selected senones of the c-th cluster”).
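For illustration only, the descending-order selection described in the quoted passage of Liu may be sketched as follows. The sketch is not Liu's implementation. Equation (9) is not reproduced in this action, so the average posterior vector is assumed here to be the arithmetic mean of the per-frame posterior vectors of a cluster; the function name `select_senones` and the toy values are hypothetical.

```python
from typing import List, Sequence


def select_senones(frame_posteriors: Sequence[Sequence[float]], alpha: float) -> List[int]:
    """Return senone indices, in descending order of average posterior,
    whose accumulated average posterior first exceeds alpha."""
    num_frames = len(frame_posteriors)
    num_senones = len(frame_posteriors[0])
    # Average posterior per senone over the frames of one cluster (assumed arithmetic mean).
    average = [
        sum(frame[s] for frame in frame_posteriors) / num_frames
        for s in range(num_senones)
    ]
    # Sort senone indices in descending order of average posterior.
    order = sorted(range(num_senones), key=lambda s: average[s], reverse=True)
    selected: List[int] = []
    accumulated = 0.0
    for s in order:
        selected.append(s)
        accumulated += average[s]
        if accumulated > alpha:
            break
    return selected


# Two frames over three senones; the averages are 0.5, 0.4, and 0.1.
cluster_frames = [[0.6, 0.3, 0.1], [0.4, 0.5, 0.1]]
print(select_senones(cluster_frames, alpha=0.8))
```

In this sketch the number of selected senones is not fixed in advance; it is determined by the confidence value alpha, consistent with the quoted description of the top Nc senones.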
Fukuda, Pearce and Liu are considered to be analogous to the claimed invention because they relate to speech recognition training methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Fukuda and Pearce on training keyword models with the descending-order, cluster-based selection taught by Liu to optimize DNN calculation methods and accelerate the calculations (see Liu, pg. 1, col 2, lines 27-30).
Regarding claim 6, Fukuda in view of Pearce teaches the apparatus according to claim 1. Pearce further teaches wherein the extraction unit extracts, as the second training data, data pieces in each of which a proportion in number of a letter of the preset keyword, a letter of the sub-word, the syllable, or the phoneme to a data piece is a predetermined value or more (Pearce, Fig. 3A and col 5 lines 51-60 teach sensing the user's/speaker's spoken words and/or phrases (utterances), and the collected (and optionally processed) utterances are used to create and train a speaker-dependent keyword sensing model 330). However, Fukuda in view of Pearce fails to teach in descending order according to the proportion.
However, Liu teaches in descending order according to a proportion (See Liu, pg. 2, col 2, lines 11-14, e.g.  “Figure 1 demonstrates the DNN structure with cluster-based senone selection for output calculation. Original DNN structure and its weight parameters are kept intactly. A new cluster layer is added on the top hidden layer to predict selected senones.”, Liu, pg. 3, col 2, lines 3-9, “Mathematically, supposing the c-th cluster has Kc frames of acoustic features in the training data set, its average posterior vector can be calculated  as in equation (9).  yL are sorted in descending order, and the top Nc senones, of which the accumulated posterior exceeds a predefined confidence α are determined as the selected senones of the c-th cluster”).
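For illustration only, the limitation at issue in claims 5 and 6 (extracting data pieces in descending order of proportion, subject to a predetermined value and a predetermined number) may be sketched as follows. The sketch is not taken from Applicant's disclosure or from any cited reference. It assumes that the proportion is the fraction of a data piece's phoneme labels that occur in the preset keyword; the name `rank_and_extract` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical data piece: (phoneme labels of the utterance, identifier of its features)
DataPiece = Tuple[Sequence[str], str]


def rank_and_extract(
    pieces: List[DataPiece],
    keyword_units: Sequence[str],
    min_proportion: float,
    max_pieces: int,
) -> List[str]:
    """Return identifiers of data pieces in descending order of keyword proportion,
    keeping only pieces at or above min_proportion and at most max_pieces of them."""
    targets = set(keyword_units)
    scored = []
    for labels, identifier in pieces:
        if not labels:
            continue
        proportion = sum(1 for unit in labels if unit in targets) / len(labels)
        if proportion >= min_proportion:
            scored.append((proportion, identifier))
    # Descending order according to the proportion; ties keep their input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [identifier for _, identifier in scored[:max_pieces]]


candidates = [
    (["k", "a", "m", "i"], "utt3"),  # proportion 0.5
    (["k", "a", "t", "o"], "utt1"),  # proportion 0.75
    (["m", "i", "z", "u"], "utt2"),  # proportion 0.0
    (["k", "a", "t"], "utt4"),       # proportion 1.0
]
print(rank_and_extract(candidates, ["k", "a", "t"], min_proportion=0.5, max_pieces=2))
```

Setting `min_proportion` to 0.0 corresponds to the claim 5 arrangement (descending order up to a predetermined number), while a nonzero value adds the predetermined-value condition of claim 6, again under the assumptions stated above.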
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Bocklet et al. (U.S. Patent Number 9,792,907) discloses techniques to update a key phrase model based on scores of sub-phonetic units from an acoustic model to generate a key phrase likelihood score and determining whether received audio input is associated with a predetermined key phrase based on the rejection likelihood score and the key phrase likelihood score (Bocklet et al., abstract, Fig. 7, Fig. 8).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday 12:00pm - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta, can be reached at (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NANDINI SUBRAMANI/Examiner, Art Unit 2656
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656