DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) filed May 31, 2019 fails to comply with 37 CFR 1.98(a)(2), which requires a legible copy of each non-patent literature publication or that portion which caused it to be listed; and all other information or that portion which caused it to be listed. Although most of the references in the May 31, 2019 IDS have been considered as indicated in the annotated PTO-1449 attached hereto, a copy of Non-Patent Literature (NPL) citation number 16, Li et al., “Robust Automatic Speech Recognition: A Bridge to Practical Applications,” was not found as filed. It is noted, however, that two copies of NPL citation 18 were filed.

Claim Objections
Claim 22 is objected to because of the following informalities:  claim 22 features a typographical error of ending with two periods.  Appropriate correction is required.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 17, and therefore claims 18-22 which depend therefrom, are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because claim 17 recites “a non-transient, computer-readable medium storing instructions” (emphasis added), and given the broadest reasonable interpretation of transient to mean “lasting only for a short time; impermanent,” a signal per se is still a potential interpretation of the claimed “non-transient, computer-readable medium.” Therefore, claim 17, and claims 18-22 which depend therefrom, are simply signals per se and do not fall within one of the statutory categories for subject matter eligibility.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 9-12, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. (US 2020/0034702 A1, herein “Fukuda”) in view of Li et al., “Certainty-Driven Consistency Loss for Semi-supervised Learning,” arXiv:1901.05657v1 [cs.CV], https://doi.org/10.48550/arXiv.1901.05657 (herein “Li NPL”).
Regarding claim 1, Fukuda teaches a system for conditional teacher-student model training, comprising (Fukuda paras. 15, 17 and 37, apparatus 10 implemented as one or more computers, that performs operations to train a student neural network): 
a computer processor (Fukuda para. 82, CPU performing the operations); and 
a memory storage device including instructions that when executed by the computer processor enable the system to (Fukuda paras. 82, 84, 86, operations performed by CPU designated by an instruction sequence of programs stored in computer readable media): 
access a trained teacher model configured to perform a task (Fukuda para. 48, a teacher neural network is selected (access) to be used for teacher-student training, where paras. 31 and 73 teach that the teacher neural network is for classification of audio data using an acoustic model to identify phonemes (a task)), 
create an untrained student model (Fukuda para. 56, a student neural network is trained, therefore the student model of the student neural network is created through the training), 
provide training data labeled with ground truths to the teacher model to produce teacher posteriors representing the training data (Fukuda paras. 54-56, teacher input data (training data labeled with ground truths) is input to the teacher neural networks, and soft label outputs (posteriors) are generated therefrom), 
when it is determined that a teacher posterior matches the associated ground truth label, conditionally use the teacher posterior to train the student model (Fukuda paras. 50 and 53-55, a selecting section (conditionally) selects a teacher neural network that outputs a soft label output closest (matches) to the correct data corresponding to the input data (associated ground truth label), where the selected teacher NN’s soft label output (teacher posterior) is the one selected to be used to train the student neural network (student model)), and 
conditionally use the ground truth label to train the student model (Fukuda para. 59, student training section may (conditionally) train the student neural network with both of soft label outputs and hard labels/teacher correct data (ground truth label)).
While Fukuda teaches that both hard labels and soft labels are used to train the student model, Fukuda does not explicitly teach the claimed “when it is determined that a teacher posterior does not match the associated ground truth label.”
Li NPL teaches when it is determined that a teacher posterior does not match the associated ground truth label (Li NPL page 5, sections 3.2 and 3.3, Abstract, in a certainty-driven consistency loss approach in teacher-student training, hard filtering is performed on a teacher’s soft outputs (teacher posterior) that are uncertain (are assumed to not match the associated ground truth label)).
Therefore, taking the teachings of Fukuda and Li NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teacher-student training system and operations of Fukuda to include considerations of when a teacher’s soft outputs should be disregarded for training due to uncertainty of being associated (matching) to a target (ground truth) as disclosed in Li NPL at least because doing so would improve performance of the student and improve the quality of the targets given to the student (see Li NPL end of section 1 on page 2).
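For illustration only, and not as a characterization of any reference of record or of the application, the conditional use of a teacher posterior versus a ground truth label recited in claim 1 can be sketched as follows. The helper name `select_target`, the use of the highest-probability class as the test for a "match," and the one-hot encoding of the ground truth label are all assumptions made for this sketch.

```python
from typing import List


def select_target(teacher_posterior: List[float], label: int) -> List[float]:
    """Return the training target for one sample (hypothetical helper).

    Assumption: a teacher posterior "matches" the ground truth label when
    its highest-probability class equals the label index.
    """
    predicted = max(range(len(teacher_posterior)), key=teacher_posterior.__getitem__)
    if predicted == label:
        # Match: use the teacher's soft posterior as the target.
        return list(teacher_posterior)
    # No match: fall back to a one-hot vector for the ground truth label.
    return [1.0 if i == label else 0.0 for i in range(len(teacher_posterior))]
```

Under these assumptions, a posterior of [0.1, 0.7, 0.2] with label 1 is passed through unchanged, while the same posterior with label 2 is replaced by the one-hot target for class 2.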
Regarding claim 2, Fukuda teaches wherein the teacher and student models are associated with at least one of: (i) domain adaptation, (ii) speaker adaptation, and (iii) model compression (Fukuda para. 60, student neural network is smaller than the teacher neural networks, thus the teacher model is compressed in being taught to the student).
Regarding claim 3, Fukuda teaches wherein the teacher and student models are further associated with at least one of: (i) a neural network model, and (ii) an acoustic model in an automatic speech recognition system (Fukuda paras. 56 and 60, teacher and students are neural networks, and moreover, para. 73 teaches that the teacher neural network is for classification of audio data using an acoustic model).
Regarding claims 4, 12, and 18, Fukuda teaches wherein the task is associated with automatic speech recognition and the training data is associated with audio data containing utterances (Fukuda paras. 29-31, training data is human speech as audio data that is processed through the neural network for identifying phonemes, thus “associated” with speech recognition).
Regarding claims 9 and 15, Fukuda teaches wherein the instructions further enable the system to: determine/determining whether student posteriors converge with the teacher posteriors (Fukuda para. 58, student training section determines when L(ϴ) is minimized, where qi is the soft label output (posterior) from the teacher and pi is the soft label output (posterior) from the student, such that a minimized L(ϴ) represents convergence of the student posteriors with the teacher posteriors), in response to determining that the student posteriors and the teacher posteriors converge, finalize the student model (Fukuda paras. 56, 58 and 62, first L(ϴ) is minimized in step 340, then in subsequent step 350 (in response to) the training is ended (so the student model is finalized)), and in response to determining that the student posteriors and the teacher posteriors do not converge (Fukuda paras. 56-58, operations of step 340 are repeated to determine L(ϴ), thus when L(ϴ) is not minimized, the student and teacher posteriors have not converged), conditionally update parameters of the student model (Fukuda para. 57, the student training section may (conditionally) repeat iterations at step 340 including adjusting the weights (parameters) of the student neural network).
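For illustration only, a convergence determination of the kind discussed for claims 9 and 15 can be sketched as follows. The cross-entropy form L = -sum_i q_i log(p_i) is assumed here as a common teacher-student objective and is not quoted from Fukuda; the function names and the tolerance-based stopping rule are likewise hypothetical.

```python
import math
from typing import List


def soft_label_loss(q: List[float], p: List[float], eps: float = 1e-12) -> float:
    """Cross-entropy between teacher posterior q and student posterior p.

    Assumed form: L = -sum_i q_i * log(p_i). eps guards against log(0).
    """
    return -sum(qi * math.log(max(pi, eps)) for qi, pi in zip(q, p))


def has_converged(prev_loss: float, curr_loss: float, tol: float = 1e-4) -> bool:
    """Hypothetical stopping rule: the loss changed by less than tol."""
    return abs(prev_loss - curr_loss) < tol
```

In such a sketch, training would end (the student model would be finalized) once `has_converged` returns True, and the student parameters would otherwise be updated for another iteration.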
Regarding claims 10 and 16, Fukuda teaches wherein parameters of the student model are updated according to a back propagation of the student posteriors (Fukuda para. 57, student training adjusting the plurality of weights (parameters) of a student neural network based on the soft label output of the student (student posteriors) using back propagation).
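For illustration only, a parameter update by back propagation as discussed for claims 10 and 16 can be sketched for a single-layer softmax student. The single-layer architecture, the cross-entropy objective, the learning rate, and the names `softmax` and `sgd_step` are assumptions for this sketch and are not drawn from Fukuda.

```python
import math
from typing import List


def softmax(z: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(
    weights: List[List[float]], x: List[float], q: List[float], lr: float = 0.1
) -> List[List[float]]:
    """One gradient step for a single-layer softmax student (illustrative).

    weights[i] holds the weights for class i, x is the input feature vector,
    and q is the target posterior (for example, a teacher soft label).
    """
    logits = [sum(w * xj for w, xj in zip(row, x)) for row in weights]
    p = softmax(logits)
    # For cross-entropy with softmax, dL/dlogit_i = p_i - q_i.
    return [
        [w - lr * (pi - qi) * xj for w, xj in zip(row, x)]
        for row, pi, qi in zip(weights, p, q)
    ]
```

Each step moves the student posterior toward the target posterior for the given input; a multi-layer student would propagate the same output-layer error backward through its hidden layers.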
Regarding claim 11, Fukuda teaches a computer implemented (Fukuda paras. 82, 84, 86, operations performed by CPU designated by an instruction sequence of programs stored in computer readable media) method for model training, comprising (Fukuda para. 37, operations for training a student neural network): 
accessing a trained teacher model configured to perform a task (Fukuda para. 48, a teacher neural network is selected (access) to be used for teacher-student training, where paras. 31 and 73 teach that the teacher neural network is for classification of audio data using an acoustic model to identify phonemes (a task)); 
creating an untrained student model (Fukuda para. 56, a student neural network is trained, therefore the student model of the student neural network is created through the training); 
providing training data labeled with ground truths to the teacher model to produce teacher posteriors representing the training data (Fukuda paras. 54-56, teacher input data (training data labeled with ground truths) is input to the teacher neural networks, and soft label outputs (posteriors) are generated therefrom); 
when it is determined that a teacher posterior matches the associated ground truth label, automatically using, by a model training platform, the teacher posterior to train the student model (Fukuda paras. 50 and 53-55, a teacher neural network that outputs a soft label output closest (matches) to the correct data corresponding to the input data (associated ground truth label) is used for its soft label output (teacher posterior) to train the student neural network (student model)); and 
automatically using, by a model training platform, the ground truth label to train the student model (Fukuda para. 59, student training section (according to an operational flow in fig. 3, which para. 37 teaches is performed by apparatus 10, which para. 17 teaches is a computer, thus automatically) trains the student neural network with both of soft label outputs and hard labels/teacher correct data (ground truth label)).
While Fukuda teaches that both hard labels and soft labels are used to train the student model, Fukuda does not explicitly teach the claimed “when it is determined that a teacher posterior does not match the associated ground truth label.”
Li NPL teaches when it is determined that a teacher posterior does not match the associated ground truth label (Li NPL page 5, sections 3.2 and 3.3, Abstract, in a certainty-driven consistency loss approach in teacher-student training, hard filtering is performed on a teacher’s soft outputs (teacher posterior) that are uncertain (are assumed to not match the associated ground truth label)).
Therefore, taking the teachings of Fukuda and Li NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teacher-student training system and operations of Fukuda to include considerations of when a teacher’s soft outputs should be disregarded for training due to uncertainty of being associated (matching) to a target (ground truth) as disclosed in Li NPL at least because doing so would improve performance of the student and improve the quality of the targets given to the student (see Li NPL end of section 1 on page 2).
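For illustration only, hard filtering of uncertain teacher outputs can be sketched as follows. The use of Shannon entropy as the uncertainty measure, the threshold parameter `max_entropy`, and the function names are assumptions made for this sketch; they are not represented to be the uncertainty estimate actually used in Li NPL.

```python
import math
from typing import List


def entropy(posterior: List[float]) -> float:
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def hard_filter(posteriors: List[List[float]], max_entropy: float) -> List[int]:
    """Return indices of teacher outputs considered certain enough to keep."""
    return [i for i, post in enumerate(posteriors) if entropy(post) <= max_entropy]
```

Under these assumptions, teacher outputs whose entropy exceeds the threshold are excluded from use as training targets for the student.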
Regarding claim 17, Fukuda teaches a non-transient, computer-readable medium storing instructions to be executed by a processor (Fukuda paras. 15, 17, 37, 82, 84 and 86, apparatus 10 implemented as one or more computers, that performs operations to train a student neural network, the operations performed by CPU designated by an instruction sequence of programs stored in computer readable media) to perform a method for automatic speech recognition (Fukuda paras. 29-31, training data is human speech as audio data that is processed through the neural network for identifying phonemes, thus for speech recognition), the method comprising: 
accessing a trained teacher model configured to perform a task (Fukuda para. 48, a teacher neural network is selected (access) to be used for teacher-student training, where paras. 31 and 73 teach that the teacher neural network is for classification of audio data using an acoustic model to identify phonemes (a task)), 
creating an untrained student model (Fukuda para. 56, a student neural network is trained, therefore the student model of the student neural network is created through the training), 
providing training data labeled with ground truths to the teacher model to produce teacher posteriors representing the training data (Fukuda paras. 54-56, teacher input data (training data labeled with ground truths) is input to the teacher neural networks, and soft label outputs (posteriors) are generated therefrom), 
when it is determined that a teacher posterior matches the associated ground truth label, conditionally using the teacher posterior to train the student model (Fukuda paras. 50 and 53-55, a selecting section (conditionally) selects a teacher neural network that outputs a soft label output closest (matches) to the correct data corresponding to the input data (associated ground truth label), where the selected teacher NN’s soft label output (teacher posterior) is the one selected to be used to train the student neural network (student model)), and 
conditionally using the ground truth label to train the student model (Fukuda para. 59, student training section may (conditionally) train the student neural network with both of soft label outputs and hard labels/teacher correct data (ground truth label)).
While Fukuda teaches that both hard labels and soft labels are used to train the student model, Fukuda does not explicitly teach the claimed “when it is determined that a teacher posterior does not match the associated ground truth label.”
Li NPL teaches when it is determined that a teacher posterior does not match the associated ground truth label (Li NPL page 5, sections 3.2 and 3.3, Abstract, in a certainty-driven consistency loss approach in teacher-student training, hard filtering is performed on a teacher’s soft outputs (teacher posterior) that are uncertain (are assumed to not match the associated ground truth label)).
Therefore, taking the teachings of Fukuda and Li NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teacher-student training system and operations of Fukuda to include considerations of when a teacher’s soft outputs should be disregarded for training due to uncertainty of being associated (matching) to a target (ground truth) as disclosed in Li NPL at least because doing so would improve performance of the student and improve the quality of the targets given to the student (see Li NPL end of section 1 on page 2).
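For illustration only, selecting the teacher neural network whose soft label output is closest to the correct data, as discussed above with respect to Fukuda, can be sketched as follows. The squared-distance measure against a one-hot encoding of the correct label and the name `select_teacher` are assumptions for this sketch; the measure of closeness is not specified in this Office action.

```python
from typing import List


def select_teacher(soft_outputs: List[List[float]], label: int) -> int:
    """Return the index of the teacher whose soft output is closest to the label.

    Assumption: "closest" means smallest squared distance to the one-hot
    encoding of the correct label.
    """
    def distance(output: List[float]) -> float:
        return sum(
            (p - (1.0 if i == label else 0.0)) ** 2 for i, p in enumerate(output)
        )

    return min(range(len(soft_outputs)), key=lambda t: distance(soft_outputs[t]))
```

The soft label output of the selected teacher would then serve as the target used to train the student neural network.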
Claims 5-7, 13, and 19-22 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda in view of Li NPL, as set forth above regarding claims 1, 11 and 17, from which claims 5-7, 13 and 19-22 respectively depend, further in view of Li et al., “Large-Scale Domain Adaptation via Teacher-Student Learning,” August 17, 2017, arXiv:1708.05466v1 [cs.CL] https://doi.org/10.48550/arXiv.1708.05466 (herein “Li NPL2”).
Regarding claims 5, 13 and 19, Fukuda does not teach, but Li NPL2 teaches wherein the task is associated with automatic speech recognition domain adaptation of a neural network-based model (Li NPL2 section 3, Teacher student learning is for (task) domain adaptation of a neural network processing speech, and where section 4 teaches using an acoustic model, and section 4.3 teaching specifically children’s speech recognition).
Therefore, taking the teachings of Fukuda and Li NPL2 together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teacher-student training system and operations of Fukuda to include the objective of the training with domain adaptation for a speech recognition model in a neural network as disclosed in Li NPL2 at least because doing so would result in improvements in accuracy and reduction in word error rate in speech recognition applications (see Li NPL2 Abstract).
Regarding claims 6 and 20, Fukuda does not teach, but Li NPL2 teaches wherein the teacher model is selected based on a selected language (Li NPL2 section 4.2, in adapting clean to noisy speech, the Wall Street Journal (WSJ) 3-gram language model is used for decoding (what is being trained by the teacher model to the student), where the WSJ model is in English, thus a selected language being English).
Therefore, taking the teachings of Fukuda and Li NPL2 together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teacher-student training system and operations of Fukuda to include the decoding using the WSJ language model as disclosed in Li NPL2 at least because doing so would result in improvements in accuracy and reduction in word error rate in speech recognition applications (see Li NPL2 Abstract and section 4.2).
Regarding claims 7 and 21, Fukuda does not teach, but Li NPL2 teaches wherein the system is further operable to (claim 7) / the method further comprises (claim 21): produce/producing target domain utterances by transforming source domain utterances according to at least one of (i) a Signal-to-Noise Ratio range, (ii) a codec by which the utterances are encoded, (iii) a frequency band for the utterances, (iv) a volume level, (v) an average speech frequency for the utterances, (vi) a room impulse response, (vii) a speaker and recorder distance, and (viii) a recording channel (Li NPL2 section 4.1, the target domain is a simulated noisy Cortana task, where Table 2 illustrates a breakdown in SNR ranges for the Cortana condition).
Therefore, taking the teachings of Fukuda and Li NPL2 together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teacher-student training system and operations of Fukuda to include the SNR considerations as disclosed in Li NPL2 at least because doing so would result in improvements in accuracy and reduction in word error rate in speech recognition applications (see Li NPL2 Abstract and section 4.1).
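For illustration only, transforming a source domain utterance into a target domain utterance at a chosen Signal-to-Noise Ratio can be sketched as follows. The function name `mix_at_snr`, the additive noise model, and the sample-list representation of audio are assumptions for this sketch and are not represented to be the simulation procedure used in Li NPL2.

```python
import math
from typing import List


def mix_at_snr(clean: List[float], noise: List[float], snr_db: float) -> List[float]:
    """Add noise to a clean signal, scaled so the mix has the requested SNR in dB."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = sum(s * s for s in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise must have non-zero power")
    # SNR_dB = 10 * log10(p_clean / (scale^2 * p_noise)), solved for scale.
    scale = math.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + scale * n for s, n in zip(clean, noise)]
```

Pairing each clean utterance with its noisy counterpart produced this way would yield parallel source/target data of the kind used in teacher-student domain adaptation.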
Regarding claim 22, Fukuda does not teach, but Li NPL2 teaches wherein the task is associated with at least one of: (i) automatic speech recognition speaker adaptation of a neural network based model, (ii) device personalization providing limited data from a target speaker, (iii) noise speech recognition using clean/noisy speech pair data, (iv) far field speech recognition using close-talk/far-talk speech pair data, (v) kids speech recognition using adults/kids speech pair data, (vi) narrow-band speech recognition using wide-band/narrow-band speech pair data, and (vii) audio-codec speech recognition using original/codec speech pair data (Li NPL2 section 4.3, the teacher student learning method used for children’s speech recognition).
Therefore, taking the teachings of Fukuda and Li NPL2 together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teacher-student training system and operations of Fukuda to include the task being associated with children’s speech recognition as disclosed in Li NPL2 at least because such a task is important for home entertainment applications (see Li NPL2 section 4.3), and thus doing so would be use of a known technique to improve similar devices (methods, or products) in the same way. See MPEP 2143(I)(C).
Claims 8 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda in view of Li NPL, as set forth above regarding claims 4 and 12, from which claims 8 and 14 respectively depend, further in view of Liao, "Speaker adaptation of context dependent deep neural networks," 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013, pp. 7947-7951, doi: 10.1109/ICASSP.2013.6639212 (herein “Liao NPL”).
Regarding claims 8 and 14, Fukuda does not teach, but Liao NPL teaches wherein the task is associated with automatic speech recognition speaker adaptation of a neural network-based model (Liao NPL section 1, considering a task of adapting a DNN for speaker adaptation using an acoustic model in speech recognition).
Therefore, taking the teachings of Fukuda and Liao NPL together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teacher-student training system and operations of Fukuda to include a task of speaker adaptation in automatic speech recognition as disclosed in Liao NPL at least because adapting a general acoustic model to new users (speaker adaptation) is common in automatic speech recognition systems (see Liao NPL Introduction), and so would be use of a known technique to improve similar devices (methods, or products) in the same way. See MPEP 2143(I)(C).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Fukuda et al., US 2019/0205748 A1, directed towards a teacher-student training system where both soft labels and hard labels are used to train a student model.
Li et al., US 2019/0051290 A1, directed towards domain adaptation in a teacher-student training system.
Li et al., US 2016/0078339 A1, directed towards details of a teacher-student training system including convergence criteria, and in particular the Kullback-Leibler divergence.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M KOETH whose telephone number is (571)272-5908. The examiner can normally be reached Monday-Friday, 09:30-18:30 EDT/EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MICHELLE M. KOETH
Primary Examiner
Art Unit 2656



/MICHELLE M KOETH/Primary Examiner, Art Unit 2656