DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to Applicant’s amendment filed on July 5, 2022, in which Applicant amended claims 6, 11, and 18.
By virtue of this communication, claims 1-20 are currently pending in this Office Action.

Double Patenting
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office Action. See the argument of paragraph 4 of page 8 in Remarks filed on July 5, 2022.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 6 and 18 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor(s) or joint inventor, or for pre-AIA the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 6 recites “wherein the metadata includes non-audio data comprising one or more of an application identifier, a speaker identifier, a device identifier, …, a dialogue state”, which is not supported by the original disclosure, including the specification, drawings, and original claims. For example, the claimed “speaker identifier”, “channel identifier”, etc. may correspond to the specific or “correlated utterances” (para [0070]) or to being “independent of requirements such as utterances from the same speaker” (para [0071]) (Note: “not utterances” or non-speech is not inherently the claimed “non-audio data”), which does not amount to support for the claimed “metadata includes non-audio data comprising …”.
Claim 18 is rejected for at least reasons similar to those described for claim 6 above, because claim 18 recites deficient features similar to those recited in claim 6.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-2, 4-5, 9-14, 16-17, 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Chengalvarayan et al (US 20080010057 A1, hereinafter Chengalvarayan).
Claim 1: Chengalvarayan teaches a method (title and abstract, ln 1-17, a method in fig. 4a-4b and a system in fig. 2) comprising: 
estimating, via a model (including acts or representation of procedures with identity matrices or adaptation parameters, including initialized adaptation routines being independent of speaker and environmental characteristics, mathematical transformation with the adaptation parameters prior to decoding at a first step, adjustment of matrix transformation upon the parameter training method at a second step, feedback of the adjusted or trained matrix of adaptation parameters for use at the first step, etc., para [0063]-[0064]) trained on at least metadata (suitable parameter training method, e.g., trained via maximum likelihood estimation, discriminative training to adjust the matrix of transformation or adaptation parameters with the acoustic feature vectors of the selected hypothesis as metadata, para [0064]) to prepare the model to output a set of parameters associated with a chosen acoustic environment (e.g., the trained transform A and bias b parameters being computed iteratively in FMLLR model, para [0065]; the adaptation parameters being trained as a low or a high SNR adaptation parameter in vehicle highway or vehicle idle condition, para [0085]; similarly in conventional approach, trained based on particular vehicle environment and particular speaker, para [0071]), the set of parameters (e.g., transform A and bias b above in FMLLR) being useful for performing automatic speech recognition (e.g., applying the adaptation parameters A and b via AY+b to the speech frame and iteratively computing the adaptation parameters A and b such that likelihood values of the transformed adaptation data are maximized to the selected acoustic model during the decoding or speech recognition, para [0065]; the trained adaptation parameters being stored and recalled for a later use, para [0085], i.e., being useful to the ASR in fig. 2 inherently) by an automatic speech recognition system (including the decoder module 214, applications of acoustic models 220, Grammar models 218, word models 222, language models 224 to the decoder module 214, pre-processor 212, post processor 216, etc., in fig. 2), wherein the estimating is performed dynamically during speech recognition of first speech received by the automatic speech recognition system (receiving via one or more microphones, para [0035]; including initialized ASR adaptation routines, performing transformation with the adaptation parameters prior to the decoding at step 335; the adaptation parameters or matrix are adjusted according to the recognized speech and fed back for future use, para [0063]-[0064]; the adaptation parameters being trained based on the selected acoustic model, grammar model, etc., para [0068], para [0078]); 
receiving second speech at the automatic speech recognition system (subsequent speech having the same characteristics is transformed by using one or more trained or fed-back adaptation parameters, para [0071]; similarly, adaptation parameters trained and stored for subsequent segments of speech, para [0082]-[0084], i.e., receiving the subsequent speech or subsequent segments of speech via the one or more microphones 132 is inherent in processing the subsequent speech or the subsequent segments of speech, including transforming the feature vectors of the subsequent speech above); 
applying, by the automatic speech recognition system, the set of parameters to recognize the second speech to yield text (transforming the feature vectors of the subsequent speech or subsequent segments of the speech prior to the decoding by using specific adaptation parameters, trained by selected acoustic model, language model, and/or grammar model, para [0078]-[0087]); and 
outputting the text from the automatic speech recognition system (via the post processor 216 for converting the recognized speech to a text for use with other aspects of the ASR system or other vehicle system in fig. 2, i.e., output the text, at step 345 in fig. 3, para [0055]).
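For illustration only, the affine feature transform AY+b referred to in the discussion of claim 1 above can be sketched as follows. This is a minimal sketch under stated assumptions, not a characterization of how Chengalvarayan or the present application implements the transform: the function names are hypothetical, feature vectors are assumed to be plain lists of floats, and the iterative maximum-likelihood estimation of A and b is not shown.

```python
# Illustrative sketch only: applies an affine transform A*y + b to feature
# vectors. All names here are hypothetical and do not come from the reference.

def apply_affine_transform(A, b, y):
    """Return A*y + b for one feature vector y (A is a list of rows)."""
    if len(A) != len(b) or any(len(row) != len(y) for row in A):
        raise ValueError("dimension mismatch between A, b, and y")
    return [
        sum(a_ij * y_j for a_ij, y_j in zip(row, y)) + b_i
        for row, b_i in zip(A, b)
    ]


def identity_parameters(dim):
    """Default parameters: identity matrix and zero bias (leaves y unchanged)."""
    A = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    b = [0.0] * dim
    return A, b


def transform_frames(A, b, frames):
    """Transform every feature vector of an utterance prior to decoding."""
    return [apply_affine_transform(A, b, y) for y in frames]
```

With the identity defaults, `transform_frames` returns the frames unchanged, which corresponds to the initialized state before any adaptation has taken place.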
Claim 13 has been analyzed and rejected according to claim 1 above and Chengalvarayan further teaches an automatic speech recognition system (an ASR system architecture 210 in fig. 2 and implementing the method steps of fig. 4a-4b) comprising:
a processor (including a processor 116 in fig. 1); and 
a non-transitory computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations of the method steps of claim 1 (a memory 122 having one or more computer programs 124, para [0028]-[0030]).
Claim 20 has been analyzed and rejected according to claim 1 above and Chengalvarayan further teaches, 
applying the set of parameters to the automatic speech recognition system (initialized ASR routines with default adaptation parameters like identity matrices, para [0063] or loaded and saved pre-trained adaptation parameters in the vehicle, para [0090]) to yield a tuned automatic speech recognition system (the ASR system of fig. 2 with adaptation parameters such as transform A and bias b being iteratively calculated and applied, para [0065]; the ASR with the adaptation parameters being trained and saved for later recall, para [0071], para [0082], i.e., tuned ASR system);
recognizing second speech received at the tuned automatic speech recognition system to yield text (the trained adaptation parameters used for transforming the feature vectors of the subsequent speech or subsequent segments of the speech, para [0078]-[0087] and the discussion in claim 1 above).
Claim 2: Chengalvarayan further teaches, according to claim 1 above, wherein the chosen acoustic environment is selected from a plurality of different acoustic environments (e.g., high noise environment background, e.g., driving on highway, and low noise environment background, e.g., vehicle idle, para [0076], para [0085]).
Claim 4: Chengalvarayan further teaches, according to claim 1 above, prior to estimating the set of parameters, applying an initial set of parameters for recognizing initial speech received at that automatic speech recognition system (ASR adaptation routines being initialized with default adaptation parameters, e.g., identity matrices, para [0063], para [0089]-[0090] and applied to a speech as initial speech, e.g., via trained transform A and bias b parameters in FMLLR model and the discussion in claim 1 above), wherein the set of parameters replaces the initial set of parameters (trained adaptation parameters being saved and recalled for the subsequent speech or subsequent segments of the speech, the discussion in claim 1 above).
Claim 5: Chengalvarayan further teaches, according to claim 1 above, wherein the model utilizes one or more of a signal-to-noise ratio estimate, reverberation time estimate, a short-term window frequency analysis, a mel-scale frequency cepstral analysis, time-domain signal audio signal directly and the metadata to estimate the set of parameters (low or high SNR adaptation parameters corresponding to high or low noise environment, para [0076]; the acoustic features with suitable training method to adjust the matrix of adaptation parameters, para [0064] and training the adaptation parameters based on acoustic model para [0078], grammar para [0079], as metadata).
Claim 9: Chengalvarayan further teaches, according to claim 1 above, wherein estimating and applying the set of parameters to processing the second speech to yield the text is performed in a batch mode with parameters estimated based on full utterances (the speech is processed by the ASR with the initial or pre-trained adaptation parameters in fig. 2 at the first step; acoustic features of the recognized speech with the hypothesis are used in training the adaptation parameters, so that the adaptation parameters are adjusted by likelihood estimation, discriminative training, etc., at the second step; and the adjusted adaptation parameters are fed back and used at the first step for the subsequent processing of an ensuing segment of speech, para [0064], or subsequent speech having the same characteristics, para [0071], or subsequent segments of speech, para [0080]).
Claim 10: Chengalvarayan further teaches, according to claim 9 above, wherein the applying of the set of parameters to processing the second speech to yield the text is performed in either a delayed decoding pass by the automatic speech recognition system (the adjusted or trained adaptation parameters used for subsequent speech to transform feature vectors of the subsequent speech, para [0071]; e.g., using FMLLR according to AY+b prior to decoding, para [0065]; the transformed feature vectors then delivered to the decoder for ASR in fig. 2, and thus, inherently, there is a delayed decoding pass due to the transformation and selection of adaptation parameters, 425, 440, etc., prior to the decoding 455 in fig. 4a), or in a rescoring that rescores result options from a first speech recognition pass given estimated hyper parameters (determining whether a segment is a short speech segment or a segment with a transition, i.e., whether the scoring is low or high, in order to determine whether the trained adaptation parameters are to be saved for recall for the subsequent speech, para [0071]).
Claim 11: Chengalvarayan further teaches, according to claim 10 above, wherein the estimated hyper parameters estimated on one utterance (the adaptation parameters adjusted based on the feature vectors of the recognized speech upon the input utterance received by one or more microphones 132, para [0064]) are only applied in decoding of a respectively next utterance (also used for subsequent process, para [0064] and for transforming feature vectors of subsequent speech having the same characteristics, para [0071]).
Claim 12: Chengalvarayan further teaches, according to claim 1 above, wherein estimating the set of parameters useful for performing automatic speech recognition yields (1) the set of parameters directly as target layer outputs or (2) the set of parameters as a predefined parameter configuration chosen from a group of predefined parameter configurations (selected from a group of predefined parameter configurations such as selected from a group containing low/high SNR parameters upon noise level at step 440, 445, 450 in fig. 4a and directly saved as adaptation parameter for digit or commands, in fig. 4a).
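For illustration only, choosing one predefined parameter configuration from a group according to the noise level, as discussed for claim 12 above, can be sketched as follows. This is a minimal sketch under stated assumptions: the configuration names, the 15 dB threshold, and the power-ratio SNR estimate are hypothetical and are not taken from Chengalvarayan or from the application.

```python
import math

# Hypothetical group of predefined configurations, keyed by noise condition.
PREDEFINED_CONFIGS = {
    "low_snr": {"label": "high-noise adaptation parameters"},
    "high_snr": {"label": "low-noise adaptation parameters"},
}

# Hypothetical decision threshold in decibels (an assumption for this sketch).
SNR_THRESHOLD_DB = 15.0


def estimate_snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB from average signal and noise power."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


def select_config(snr_db, threshold_db=SNR_THRESHOLD_DB):
    """Pick one predefined configuration from the group based on SNR."""
    key = "high_snr" if snr_db >= threshold_db else "low_snr"
    return key, PREDEFINED_CONFIGS[key]
```

The selection here is a lookup among fixed alternatives rather than a direct numerical estimate of each parameter, which is the distinction drawn by alternatives (1) and (2) of the claim language.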
Claim 14 has been analyzed and rejected according to claims 13, 2 above.
Claim 16 has been analyzed and rejected according to claims 13, 4 above.
Claim 17 has been analyzed and rejected according to claims 13, 5 above.
Claim 19 has been analyzed and rejected according to claims 13, 12 above.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 6, 8, 15, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chengalvarayan (above) in view of Ravindran et al. (US 20160284349 A1, hereinafter Ravindran).
Claim 3: Chengalvarayan teaches all the elements of claim 3, according to claim 1 above, including a language model scale (a bilingual decoder, with first and second parameters, para [0078]).
However, Chengalvarayan does not explicitly teach the set of parameters comprises one or more of a word insertion penalty, a silence prior, a word penalty, a beam pruning width, a language model scale, an acoustic model scale, duration model scale, other search pruning control parameters, and a language model interpolation vector.
Ravindran teaches an analogous field of endeavor by disclosing a method (title and abstract, ln 1-2 and fig. 4) in which a set of parameters is disclosed (including parameters being used for ASR in figs. 6-8), wherein the set of parameters comprises one or more of a word insertion penalty (word error rate representing a word insertion error or penalty, para [0023]), a silence prior (scaler for acoustic scores representing silence or error, para [0062]), a word penalty (errors due to deleted or substituted words, para [0023]), a beam pruning width (beamwidth is modified according to WER and RTF in fig. 6), a language model scale (token buffer size used on the language model, para [0045]; transition weights of the arcs, para [0031]), an acoustic model scale (acoustic scale factors to adjust the relative weighting between the acoustic model score and the language model score, para [0061]), duration model scale, other search pruning control parameters, and a language model interpolation vector (including beamwidth, token buffer size, weighting factors for a language model, etc., para [0024]), for the benefit of improving ASR performance by tuning ASR parameters to reduce the computational load, speed up processing, and raise the quality of the ASR (para [0002], para [0024]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied wherein the set of parameters comprises one or more of the word insertion penalty, the silence prior, the word penalty, the beam pruning width, the language model scale, the acoustic model scale, the duration model scale, other search pruning control parameters, and the language model interpolation vector, as taught by Ravindran, to the set of parameters in the method, as taught by Chengalvarayan, for the benefits discussed above.
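For illustration only, the general role of decoder parameters such as an acoustic model scale, a language model scale, a word insertion penalty, and a beam pruning width can be sketched as follows. This is a minimal sketch of a commonly used log-linear score combination, under stated assumptions: the function names, default values, and sign convention (a negative penalty value discourages additional words) are hypothetical and are not taken from Ravindran, Chengalvarayan, or the application.

```python
# Illustrative sketch only: combines log-domain scores with tunable weights
# and prunes hypotheses that fall outside a beam around the best score.

def combined_score(acoustic_logprob, lm_logprob, num_words,
                   acoustic_scale=1.0, lm_scale=1.0,
                   word_insertion_penalty=0.0):
    """Weighted sum of acoustic and language model log scores plus a per-word term."""
    return (acoustic_scale * acoustic_logprob
            + lm_scale * lm_logprob
            + word_insertion_penalty * num_words)


def prune_beam(hypotheses, beam_width):
    """Keep (text, score) pairs whose score is within beam_width of the best."""
    if not hypotheses:
        return []
    best = max(score for _, score in hypotheses)
    return [(text, score) for text, score in hypotheses
            if best - score <= beam_width]
```

A narrower `beam_width` keeps fewer hypotheses and reduces computation at the possible cost of accuracy, which is the kind of trade-off that tuning such parameters addresses.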
Claim 6: the combination of Chengalvarayan and Ravindran further teaches, according to claim 1 above,  wherein the metadata includes non-audio data comprising one or more of an application identifier, a speaker identifier, a device identifier, a channel identifier, a date/time, a geographic location (Chengalvarayan, SNR parameters in low noise level or high noise level in fig. 4a, Ravindran, from GPS to define a location of a user or a device, para [0037]), an application context (Chengalvarayan, acoustic model such as digital model 455a or command model 455b, para [0078], and Ravindran, indicating the user is running with/without the wind, biking without traffic noise, para [0063]), or a dialogue state.
Claim 8: the combination of Chengalvarayan and Ravindran further teaches, according to claim 1 above, wherein the metadata is incorporated into the model via one-hot encoding or embedding (Chengalvarayan, the speech type, as metadata, is an input used to determine the adaptation parameters at steps 425, 430, 435 in fig. 4a, i.e., in a one-hot encoding manner; similarly, SNR, as metadata, is an input used to determine adaptation parameters at steps 440, 445, 450 in fig. 4a; and Ravindran, the SNR, as metadata, is an input used to determine parameters such as beamwidth, acoustic scale factor, current token buffer size, etc., para [0040], and location/activity information, as metadata, is directly input to the parameter refinement unit 214 to determine the sub-vocabulary language model, e.g., running-based or not, etc., para [0041]).
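For illustration only, one-hot encoding of categorical metadata, as the term is generally understood, can be sketched as follows. This is a minimal sketch under stated assumptions: the field names and category labels are hypothetical and are only loosely modeled on the speech type and noise level discussed above; the sketch does not represent the implementation of either reference or of the claimed invention.

```python
# Illustrative sketch only: turns categorical metadata into a fixed-length
# numeric vector with exactly one 1.0 per metadata field.

def one_hot(value, vocabulary):
    """Return a vector with 1.0 at the position of value and 0.0 elsewhere."""
    if value not in vocabulary:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if v == value else 0.0 for v in vocabulary]


def encode_metadata(metadata, vocabularies):
    """Concatenate one-hot vectors for each field, in sorted field order."""
    vector = []
    for field in sorted(vocabularies):
        vector.extend(one_hot(metadata[field], vocabularies[field]))
    return vector


# Hypothetical vocabularies for two metadata fields (assumed for this sketch).
EXAMPLE_VOCABULARIES = {
    "speech_type": ["digit", "command"],
    "noise": ["low", "high"],
}
```

For example, `encode_metadata({"speech_type": "command", "noise": "low"}, EXAMPLE_VOCABULARIES)` yields a four-element vector in which the "noise" field occupies the first two positions and the "speech_type" field the last two.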
Claim 15 has been analyzed and rejected according to claims 13, 3 above.
Claim 18 has been analyzed and rejected according to claims 17, 6 above.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Chengalvarayan (above).
Claim 7: Chengalvarayan teaches all the elements of claim 7, according to claim 1 above, except wherein the model comprises one of a feedforward neural network, unidirectional or bidirectional recurrent neural network, a convolutional neural network or a support vector machine model.
Official Notice is taken that a model for estimating and outputting parameters in the automatic speech recognition field, comprising a feedforward neural network, a unidirectional or bidirectional recurrent neural network, a convolutional neural network, or a support vector machine model, was notoriously well known in the art before the effective filing date of the claimed invention for the benefits of improving efficiency and accuracy and simplifying implementation of the ASR (e.g., US 20160307565 A1, deep neural support vector model for improving accuracy by reducing noise interference, para [0002]; US 20170040016 A1, Convolutional Neural Network for simplified implementation, etc.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied the model comprising the deep neural support vector model, the convolutional neural network, etc., as is well known in the art, to the model for outputting and estimating the set of parameters in the method, as taught by Chengalvarayan, for the benefits discussed above.

The prior art made of record and not relied upon (US 20130268270 A1 by Jiang et al.) is considered pertinent to applicant's disclosure because Jiang discloses adaptive parameters used by an ASR for repeated utterances or free speaking to improve the accuracy of the ASR, which is part of the subject matter disclosed by the applicant.

Response to Arguments

Applicant's arguments filed on July 5, 2022 have been fully considered but are not persuasive.
With respect to the prior art rejection of independent claim 1 under 35 USC §102(a)(1), as set forth in the Office Action, the Applicant argued: “Chengalvarayan at best discloses training adaptation parameters rather than a model as recited in claim 1” because “the claimed model is patentably distinct from the adaptation parameters which are trained in Chengalvarayan” asserted in paragraph 2 of page 10 in Remarks filed on July 5, 2022.
In response to the argument cited above, the Office respectfully disagrees because claim 1 broadly recites “a model trained on at least metadata to prepare the model” with no recitation of what the claimed “model” is. Thus, Chengalvarayan’s “adaptation parameter” would be, as discussed in the office action above, interpreted as representative of a model because the “adaptation parameter” is further disclosed as being applied to perform ASR (para [0065]), which, as a whole, is consistent with the claimed “model”. Applicant is silent about this citation in the office action, and thus the argument above is not persuasive.
With respect to the prior art rejection of independent claim 1 under 35 USC §102(a)(1), as set forth in the Office Action, the Applicant further argued about the claimed “metadata”: “the adaptation parameters, in Chengalvarayan, are not trained on at least metadata as recited in claim 1” because “Chengalvarayan discloses observing speech segments for one or more certain characteristics and then adaptation parameters associated with the characteristics are trained, see paragraph [0071] of Chengalvarayan. Thus, even the adaptation parameters, in Chengalvarayan, are trained based on observing speech segments rather than metadata as recited in claim 1” and because the background of the application describes “A common practice is to tune these parameters on sample audio data with a goal of good automatic speech recognition performance within …” as a conventional approach, as asserted in paragraphs 3-6 of page 10 in Remarks filed on July 5, 2022.
In response to the argument above, the Office further respectfully disagrees because Chengalvarayan not only discloses that “speech segments are observed for one or more certain characteristics and then adaptation parameters associated with the characteristics are trained and saved … para [0071]”, but also discloses that training the “adaptation parameters” includes “loading SNR parameters”, which can be the disclosed metadata (steps 440, 445, 450 in fig. 4a), and further includes using Grammars (step 455c) and digital specific acoustic models and command specific acoustic models (steps 455a, 455b in fig. 4a), including “HMM engines” associated with “acoustic feature vectors” and SNR parameters (para [0051]-[0052]). “The acoustic feature vectors” are disclosed as being those of the selected hypothesis (para [0010]), and the hypothesis corresponds to the claimed metadata because the hypothesis can be best identified by a Bayesian HMM process (para [0052], etc.). It is clear that Chengalvarayan’s “hypothesis” is not the argued “speech segment”; however, Applicant is silent about this citation in the office action, and thus the argument above is also not persuasive. In addition, Ravindran also teaches training a “model” (represented by parameter refinement unit 34 in fig. 1) that is trained on metadata (data from sensors 31, such as a location of the audio device, an activity, etc., para [0026]), and Applicant is also silent about this further disclosure by Ravindran.
Therefore, on the basis of the above analysis and the evidence from the prior art, the prior art rejection of independent claim 1 under 35 USC §102(a)(1), as set forth in the Office Action above, is maintained. For at least similar reasons to those discussed above, the prior art rejection of the other independent claim 13 and dependent claims 2-12, 14-20 is also maintained.
In the response to this office action, the examiner respectfully requests that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line numbers in the specification and/or drawing figure(s). This will assist the Examiner in prosecuting this application.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESHUI ZHANG whose telephone number is (571)270-5589.  The examiner can normally be reached on Monday-Friday 6:30am-4:00pm EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin can be reached on 571-272-7848.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LESHUI ZHANG/
Primary Examiner, Art Unit 2654