Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-5, 10-12, and 15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Catanzaro (20170148433).

As per claim 1, Catanzaro (20170148433) teaches a method of constructing a translation model (as translation is ASR – para 0006) including at least one hidden layer (as using an RNN with hidden layers – para 0069), the method comprising: performing an imitation learning process of transferring learned knowledge of a pre-built reference model to the translation model to thereby miniaturize a size and a tree search structure of the reference model (as batch processing of the models – para 0050, wherein the RNN is modified to reduce computational size; the aforementioned RNN can be of variable size based on performance vs. accuracy – para 0079; the initial models are trained using large training sets – para 0035); where the imitation learning process comprises: imitation learning a parameter distribution with respect to a word probability distribution (as probability distribution calculation – para 0069, used in word recognition – para 0094) of a pre-built reference model (as trained models – para 0067); and imitation learning a tree search structure of the reference model (as using hardware when employing tree structured interconnect models – para 0133, and, as an example, tree pruning of n-gram models – para 0118).

As per claim 2, Catanzaro (20170148433) teaches the method of claim 1, wherein the imitation learning the parameter distribution comprises imitation learning a reference model parameter for determining the word probability distribution of the reference model using a loss function defined with respect to a word probability distribution of the at least one hidden layer of the translation model (as calculating a CTC loss function – para 0045, in the prediction of phonemes – para 0045, as well as word recognition – para 0067 showing the training on word error rates, reflecting back on para 0066 showing the CTC loss function, and in general, for word recognition – para 0007).

As per claim 3, Catanzaro (20170148433) teaches the method of claim 2, wherein the loss function comprises a first loss function corresponding to a cross entropy between a word probability distribution and a ground-truth distribution of the translation model (as performing a CTC loss function including cross entropy – para 0045 in an RNN, wherein the RNN also includes a pairing with ground truth distributions – para 0055).

As per claim 4, Catanzaro (20170148433) teaches the method of claim 2, wherein the loss function comprises a second loss function corresponding to a cross entropy between a word probability distribution of the translation model and the word probability distribution of the reference model (as in the probability calculation – para 0064, used in a CTC loss calculation, utilizing cross-entropy parameters – para 0045).

As per claim 5, Catanzaro (20170148433) teaches the method of claim 2, wherein the imitation learning the parameter distribution comprises adjusting a model parameter for determining the word probability distribution of the at least one hidden layer such that the loss function is minimized (as performing a CTC loss function calculation – para 0066, using the probability distribution – para 0064).
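
For illustration only, the two cross-entropy terms recited in claims 3 and 4, and the combined quantity minimized in claim 5, may be sketched as follows. The sketch is not drawn from the cited reference or from the application: the function names, the softmax normalization, and the weighting factor alpha are assumptions introduced here, and "student" and "teacher" are shorthand for the translation model and the reference model.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution over words."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted, eps=1e-12):
    """Cross entropy H(target, predicted) between two distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

def combined_loss(student_logits, teacher_logits, truth_index, alpha=0.5):
    """Return (total, first, second) for one word position.

    first  : cross entropy against the ground-truth (one-hot) distribution
    second : cross entropy against the reference (teacher) distribution
    alpha  : hypothetical weight mixing the two terms (an assumption)
    """
    student = softmax(student_logits)
    teacher = softmax(teacher_logits)
    one_hot = [1.0 if i == truth_index else 0.0 for i in range(len(student))]
    first = cross_entropy(one_hot, student)
    second = cross_entropy(teacher, student)
    total = (1.0 - alpha) * first + alpha * second
    return total, first, second
```

Under these assumptions, adjusting the student parameters so that `total` decreases moves the student distribution toward both the ground truth and the teacher distribution at the same time.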


As per claim 10, Catanzaro (20170148433) teaches the method of claim 1, wherein the translation model further comprises an input layer and an output layer, the method further comprising pruning parameters of the input layer, the at least one hidden layer, and the output layer according to an importance thereof (as performing on a multi-layer level, and maximizing efficiency – para 0131; including pruning techniques – para 0118, 0240); and quantizing the parameters for each of the input layer, the at least one hidden layer, and the output layer (as on the encoding end, converting the input to a fixed length vector and, on the decoding end, converting the fixed length vector into a sequence of output predictions – para 0044; these functions are performed on the multilayer DNN – para 0044, reflecting back on para 0049 – showing the multi-layers).

As per claim 11, Catanzaro (20170148433) teaches the method of claim 10, further comprising performing re-learning on the translation model on the basis of the pruned and quantized parameters (as training/learning based on the pruned models – para 0118, 0240).
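
For illustration only, the pruning and quantizing steps recited in claim 10 may be sketched as follows for a single layer's parameters. The sketch is not drawn from the cited reference or from the application: using absolute magnitude as the measure of "importance", the symmetric uniform quantizer, and all function names are assumptions introduced here.

```python
def prune_by_magnitude(weights, keep_ratio):
    """Zero out all but the largest-magnitude fraction of the weights.

    Magnitude stands in for "importance" here (an assumption).
    Ties at the threshold are kept, so slightly more than
    keep_ratio of the weights may survive.
    """
    k = max(1, int(round(len(weights) * keep_ratio)))
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def quantize_uniform(weights, num_bits=8):
    """Map weights to signed integer codes with one shared scale."""
    levels = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0] * len(weights), 1.0
    scale = max_abs / levels
    codes = [int(round(w / scale)) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate weights from integer codes."""
    return [c * scale for c in codes]

# Example: one layer's parameters, pruned to half and then quantized.
layer = [0.5, -0.05, 0.25, 0.01]
pruned = prune_by_magnitude(layer, keep_ratio=0.5)
codes, scale = quantize_uniform(pruned)
restored = dequantize(codes, scale)
```

The re-learning of claim 11 would then correspond to continuing training from `restored` rather than from the original parameters; that step is not shown.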

As per claim 12, Catanzaro (20170148433) teaches the method of claim 1, wherein each hidden layer is represented by a series of hidden state vectors, wherein the translation model comprises hidden layers of which the number is less than the number of hidden layers of the reference model, and the hidden state vector of the translation model is represented in a dimension lower than a dimension of the hidden state vector of the reference model (as shown in Table 1, the architecture of 1 layer of input, M RNN layers, and N total layers; in comparison to GRUs with fewer layers and faster computation speeds – para 0090-0093, and following Table 3).

Claim 15 is an apparatus claim whose components perform the steps in method claims 1-5, 10-14 above and, as such, claim 15 is similar in scope and content to claims 1-5, 10-14 above; therefore, claim 15 is rejected under similar rationale as presented against claims 1-5, 10-14 above. Furthermore, Catanzaro (20170148433) teaches an apparatus with processing units performing the aforementioned steps – para 0039.



Allowable Subject Matter

Claims 6-9 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. Claims 13 and 14 are allowable over the prior art of record. The prior art of record (representative Catanzaro (20170148433)) teaches ground truth datasets in training models, using an RNN structure, and multiple probability calculations for loss, but does not explicitly teach the claimed third probability loss.

Response to Arguments

Applicant's arguments filed 5/6/2022 have been fully considered but they are not persuasive. As to applicant's arguments found from the bottom of p. 7 through the first thirteen lines of p. 10 of the response, which are directed to the newly amended claim language, examiner notes the further citations to the Catanzaro reference meeting these claim limitations, as detailed above. As to the remaining arguments starting on the bottom of p. 10 (‘regarding claim 13’) to the top of p. 11 of the response, examiner notes the indication of allowable subject matter for claim 13.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Zhong (20190130248) teaches ground truths (para 0012) in neural networks (para 0020), using cross entropy loss functions (para 0049 – 0052).
Dirac (20150379430) teaches ground truths with decision trees (para 0324, para 0213).
Lee (20150379429) teaches the use of neural network algorithms (para 0089) using decision trees and pruning (para 0202-0204), with ground truths (para 0317) and cross entropy (para 0209, 0220).
Chandramouli (20120254333) teaches decision trees, neural networks (para 0109) using ground truth information (para 0116), with pruning and stemming (para 0245), and computing cross entropy values between the data string and target string (para 0627).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571)272-7623, who is available Monday-Friday, 9am-5pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
06/14/2022