DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the response to this office action, the Examiner respectfully requests that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line numbers in the specification and/or drawing figure(s). This will assist the Examiner in prosecuting this application.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.  A nonstatutory obviousness-type double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and  In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. 
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).

Claims 1-19 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1, 4-12, 16-17 of U.S. Patent No. 10,810,996 B2. The claims of the instant application are a broader version of the claims of U.S. Patent No. 10,810,996 B2. The following is a comparison between claims 1-19 of the instant application and the conflicting claims 1, 4-12, 16-17 of U.S. Patent No. 10,810,996 B2:
Claim(s) in the current application
Conflicting claim(s) in U.S. Patent No. 10,810,996 B2
1. A method comprising: estimating, via a model trained on at least metadata to prepare the model to output a set of parameters associated with a chosen acoustic environment, the set of parameters being useful for performing automatic speech recognition by an automatic speech recognition system, wherein the estimating is performed dynamically during speech recognition of first speech received by the automatic speech recognition system; receiving second speech at the automatic speech recognition system; applying, by the automatic speech recognition system, the set of parameters to recognize the second speech to yield text; and outputting the text from the automatic speech recognition system.

2. The method of claim 1, wherein the chosen acoustic environment is selected from a plurality of different acoustic environments.

3. The method of claim 1, wherein the set of parameters comprises one or more of a word insertion penalty, a silence prior, a word penalty, a beam pruning width, a language model scale, an acoustic model scale, a duration model scale, other search pruning 

4. The method of claim 1, further comprising, prior to estimating the set of parameters, applying an initial set of parameters for recognizing initial speech received at that automatic speech recognition system, wherein the set of parameters replaces the initial set of parameters.

5. The method of claim 1, wherein the model utilizes one or more of a signal-to-noise ratio estimate, reverberation time estimate, a short-term window frequency analysis, a mel-scale frequency cepstral analysis, time-domain signal audio signal directly and the metadata to estimate the set of parameters.


6. The method of claim 5, wherein the metadata comprises one or more of an applicationId, a speakerId, a deviceId, a channelId, a date/time, a geographic location, an application context, or a dialogue state.

7. The method of claim 1, wherein the model comprises one of a feedforward neural network, unidirectional or bidirectional recurrent neural network, a convolutional neural network or a support vector machine model.

8. The method of claim 1, wherein the metadata is incorporated into the model via one-hot encoding or embedding.

9. The method of claim 1, wherein estimating and applying the set of parameters to 

10. The method of claim 9, wherein the applying of the set of parameters to processing the second speech to yield the text is performed in either a delayed decoding pass by the automatic speech recognition system, or in a rescoring that rescores result options from a first speech recognition pass given estimated hyper parameters.

11. The method of claim 9, wherein estimated hyper parameters estimated on one utterance are only applied in decoding of a respectively next utterance.

12. The method of claim 1, wherein estimating the set of parameters useful for performing automatic speech recognition yields (1) the set of parameters directly as target layer outputs or (2) the set of parameters as a predefined parameter configuration chosen from a group of predefined parameter configurations.


13. An automatic speech recognition system comprising: a processor; and a non-transitory computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: estimating, via a model trained on at least metadata to prepare the model to output a set of parameters associated with a chosen acoustic environment, the set of parameters 

14. The automatic speech recognition system of claim 13, wherein the chosen acoustic environment is selected from a plurality of different acoustic environments.

15. The automatic speech recognition system of claim 13, wherein the set of parameters comprises one or more of a word insertion penalty, a silence prior, a word penalty, a beam pruning width, a language model scale, an acoustic model scale, a duration model scale, other search pruning control parameters, and a language model interpolation vector.

16. The automatic speech recognition system of claim 13, wherein the non-transitory computer-readable storage medium stores further instructions which, when executed by the processor, cause the processor to perform further operations comprising: prior to estimating the set of parameters, applying an initial set of parameters for recognizing initial speech received at that automatic speech recognition system, wherein the set of parameters replaces the initial set of parameters.

17. The automatic speech recognition system of claim 13, wherein the model utilizes one or more of a signal-to-noise ratio estimate, reverberation time estimate, a short-term window frequency analysis, a mel-scale frequency cepstral analysis, time-domain signal audio signal directly and the metadata to estimate the set of parameters.





18. The automatic speech recognition system of claim 17, wherein the metadata comprises one or more of an applicationId, a speakerId, a deviceId, a channelId, a date/time, a geographic location, an application context, or a dialogue state.

19. The automatic speech recognition system of claim 13, wherein estimating the set of parameters useful for performing automatic speech recognition yields (1) the set of parameters directly as target layer outputs or (2) the set of parameters as a predefined parameter configuration chosen from a group of predefined parameter configurations.

4. The method of claim 1, wherein the model utilizes one or more of a signal-to-noise ratio estimate, reverberation time estimate, a short-term window frequency analysis, a mel-scale frequency cepstral analysis, time-domain signal audio signal directly, and the metadata to estimate the second set of parameters.

5. The method of claim 4, wherein the metadata comprises one or more of an applicationId, a speakerId, a deviceId, a channelId, a date/time, a geographic location, an application context, or a dialogue state.

6. The method of claim 1, wherein the model comprises one of a feedforward neural network, unidirectional or bidirectional recurrent neural network, a convolutional neural network or a support vector machine model.

7. The method of claim 1, wherein the metadata is incorporated into the model via one-hot encoding or embedding.

8. The method of claim 1, wherein estimating and applying the second set of parameters to processing the second speech to yield the 

9. The method of claim 8, wherein the applying of the second set of parameters to processing the second speech to yield the second text is performed in either a delayed decoding pass by the automatic speech recognition system, or in a rescoring that rescores result options from a first speech recognition pass given estimated hyper parameters.

10. The method of claim 8, wherein estimated hyper parameters estimated on one utterance are only applied in decoding of a respectively next utterance.

11. The method of claim 1, wherein estimating the second set of parameters useful for performing automatic speech recognition yields (1) the second set of parameters directly as target layer outputs or (2) the second set of parameters as a predefined parameter configuration chosen from a group of predefined parameter configurations.

12. An automatic speech recognition system comprising: a processor; and a non-transitory computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: applying a first set of parameters for recognizing first speech to yield a first text, wherein the first set of parameters comprises one or more of a word insertion penalty, a silence prior, a word penalty and a beam pruning width; 

17. The automatic speech recognition system of claim 16, wherein the metadata features comprise one or more of an applicationId, a speakerId, a deviceId, a channelId, a date/time, a geographic location, an application context and a dialogue state. 

11. The method of claim 1, wherein estimating the second set of parameters useful for performing automatic speech recognition yields (1) the second set of parameters directly as target layer outputs or (2) the second set of parameters as a predefined parameter configuration chosen from a group of predefined parameter configurations.


Claim 20 is rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claim 1 of U.S. Patent No. 10,810,996 B2 in view of Chengalvarayan et al (US 20080010057 A1). The conflicting claim of U.S. Patent No. 10,810,996 B2 teaches all the features of claim 20 except the feature of “applying the set of parameters to the automatic speech recognition system to yield a tuned automatic speech recognition system; and recognizing second speech received at the tuned automatic speech recognition system to yield text.” Chengalvarayan teaches this feature by saving the adjusted adaptation parameters within the ASR (the ASR with the adaptation parameters being trained and saved for later recall, para [0071], para [0082], i.e., a tuned ASR system) and by recognizing second speech received at the tuned automatic speech recognition system to yield text (the trained adaptation parameters used for transforming the feature vectors of the subsequent speech or subsequent segments of the speech, para [0078]-[0087] and the discussion in claim 1 above), for the benefit of improving the operation and implementation of ASR by automatically and adaptively adjusting ASR parameters based on the environment and the specific user (para [0004]; using multiple adaptation parameters that are adaptively adjusted, para [0071], para [0074], para [0087]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to apply the steps of applying the set of parameters to the automatic speech recognition system to yield a tuned automatic speech recognition system, and recognizing second speech received at the tuned automatic speech recognition system to yield text, as taught by Chengalvarayan, to the applying for recognizing as recited in claim 1 of the conflicting U.S. Patent No. 10,810,996 B2, for the benefits discussed above. A comparison of the claims of the instant application with the conflicting claims in U.S. Patent No. 10,810,996 B2 is listed below for reference:
Claim(s) in the current application:


20. A method comprising: estimating, dynamically during speech recognition of first 




1. A method comprising: applying a first set of parameters for recognizing first speech .



Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-2, 4-5, 9-14, 16-17, 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Chengalvarayan et al (US 20080010057 A1, hereinafter Chengalvarayan).
Claim 1: Chengalvarayan teaches a method (title and abstract, ln 1-17, a method in fig. 4a-4b and a system in fig. 2) comprising: 
estimating, via a model (including acts or representation of procedures with identity matrices or adaptation parameters, including initialized adaptation routines being independent of speaker and environmental characteristics, mathematical transformation with the adaptation parameters prior to decoding at a first step, adjustment of matrix transformation upon the parameter training method at a second step, feedback of the adjusted or trained matrix of adaptation parameters for use at the first step, etc., para [0063]-[0064]) trained on at least metadata (suitable parameter training method, e.g., trained via maximum likelihood estimation, discriminative training to adjust the matrix of transformation or adaptation parameters with the acoustic feature vectors of the selected hypothesis as metadata, para [0064]) to prepare the model to output a set of parameters associated with a chosen acoustic environment (e.g., the trained transform A and bias b parameters being computed iteratively in FMLLR model, para [0065]; the adaptation parameters being trained as a low or a high SNR adaptation parameter in vehicle highway or vehicle idle condition, para [0085]; similarly in conventional approach, trained based on particular vehicle environment and particular speaker, para [0071]), the set of parameters (e.g., transform A and bias b above in FMLLR) being useful for performing automatic speech recognition (e.g., applying the adaptation parameters A and b via AY+b to the speech frame and iteratively  computing the adaptation parameters A and b such that likelihood values of the transformed adaptation data are maximized to the selected acoustic model during the decoding or speech recognition, para [0065]; the trained adaptation parameters being stored and recalled for a later use, para [0085], i.e., being useful to the ASR in 
receiving second speech at the automatic speech recognition system (subsequent speech having the same characteristics is transformed by using one or more trained or feedback adaptation parameters, para [0071]; similarly, adaptation parameters trained and stored for subsequent segments of speech, para [0082]-[0084], i.e., receiving the subsequent speech or subsequent segments of speech via the one or more microphones 132 is inherent in processing the subsequent speech or the subsequent segments of speech, including transforming the feature vectors of the subsequent speech above); 
applying, by the automatic speech recognition system, the set of parameters to recognize the second speech to yield text (transforming the feature vectors of the subsequent speech or subsequent segments of the speech prior to the decoding by using specific adaptation parameters, trained by selected acoustic model, language model, and/or grammar model, para [0078]-[0087]); and 

Claim 13 has been analyzed and rejected according to claim 1 above and Chengalvarayan further teaches an automatic speech recognition system (an ASR system architecture 210 in fig. 2 and implementing the method steps of fig. 4a-4b) comprising:
a processor (including a processor 116 in fig. 1); and 
a non-transitory computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations of the method steps of claim 1 (a memory 122 having one or more computer programs 124, para [0028]-[0030]).
Claim 20 has been analyzed and rejected according to claim 1 above and Chengalvarayan further teaches, 
applying the set of parameters to the automatic speech recognition system (initialized ASR routines with default adaptation parameters like identity matrices, para [0063] or loaded and saved pre-trained adaptation parameters in the vehicle, para [0090]) to yield a tuned automatic speech recognition system (the ASR system of fig. 2 with adaptation parameters such as transform A and bias b being iteratively calculated and applied, para [0065]; the ASR with the adaptation parameters being trained and saved for later recall, para [0071], para [0082], i.e., tuned ASR system);

Claim 2: Chengalvarayan further teaches, according to claim 1 above, wherein the chosen acoustic environment is selected from a plurality of different acoustic environments (e.g., high noise environment background, e.g., driving on highway, and low noise environment background, e.g., vehicle idle, para [0076], para [0085]).
Claim 4: Chengalvarayan further teaches, according to claim 1 above, prior to estimating the set of parameters, applying an initial set of parameters for recognizing initial speech received at that automatic speech recognition system (ASR adaptation routines being initialized with default adaptation parameters, e.g., identity matrices, para [0063], para [0089]-[0090] and applied to a speech as initial speech, e.g., via trained transform A and bias b parameters in FMLLR model and the discussion in claim 1 above), wherein the set of parameters replaces the initial set of parameters (trained adaptation parameters being saved and recalled for the subsequent speech or subsequent segments of the speech, the discussion in claim 1 above).
Claim 5: Chengalvarayan further teaches, according to claim 1 above, wherein the model utilizes one or more of a signal-to-noise ratio estimate, reverberation time estimate, a short-term window frequency analysis, a mel-scale frequency cepstral analysis, time-domain signal audio signal directly and the metadata to estimate the set of parameters (low or high SNR adaptation parameters corresponding to high or low noise environment, para [0076]; the acoustic features with suitable training method to adjust the matrix of adaptation parameters, 
Claim 9: Chengalvarayan further teaches, according to claim 1 above, wherein estimating and applying the set of parameters to processing the second speech to yield the text is performed in a batch mode with parameters estimated based on full utterances (the speech is processed by the ASR with the initial or pre-trained adaptation parameters in fig. 2 at the first step; acoustic features of the recognized speech with the hypothesis are used in training the adaptation parameters so that the adaptation parameters are adjusted by likelihood estimation, discriminative training, etc., at the second step; and the adjusted adaptation parameters are fed back and used at the first step for the subsequent processing of an ensuing segment of speech, para [0064], or subsequent speech having the same characteristics, para [0071], or subsequent segments of speech, para [0080]).
Claim 10: Chengalvarayan further teaches, according to claim 9 above, wherein the applying of the set of parameters to processing the second speech to yield the text is performed in either a delayed decoding pass by the automatic speech recognition system (the adjusted or trained adaptation parameters used for subsequent speech to transform feature vectors of the subsequent speech, para [0071]; e.g., using FMLLR according to AY+b prior to decoding, para [0065]; the transformed feature vectors then delivered to the decoder for ASR in fig. 2, and thus, inherently, there is a delayed decoding pass due to the transformation and selection of adaptation parameters, 425, 440, etc., prior to the decoding 455 in fig. 4a), or in a rescoring that rescores result options from a first speech recognition pass given estimated hyper parameters (determined whether short speech segment or segment with transition or 
Claim 11: Chengalvarayan further teaches, according to claim 9 above, wherein estimated hyper parameters estimated on one utterance (the adaptation parameters adjusted based on the feature vectors of the recognized speech upon the input utterance received by one or more microphones 132, para [0064]) are only applied in decoding of a respectively next utterance (also used for subsequent process, para [0064] and for transforming feature vectors of subsequent speech having the same characteristics, para [0071]).
Claim 12: Chengalvarayan further teaches, according to claim 1 above, wherein estimating the set of parameters useful for performing automatic speech recognition yields (1) the set of parameters directly as target layer outputs or (2) the set of parameters as a predefined parameter configuration chosen from a group of predefined parameter configurations (selected from a group of predefined parameter configurations such as selected from a group containing low/high SNR parameters upon noise level at step 440, 445, 450 in fig. 4a and directly saved as adaptation parameter for digit or commands, in fig. 4a).
Claim 14 has been analyzed and rejected according to claims 13, 2 above.
Claim 16 has been analyzed and rejected according to claims 13, 4 above.
Claim 17 has been analyzed and rejected according to claims 13, 5 above.
Claim 19 has been analyzed and rejected according to claims 13, 12 above.
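
For illustration of the technique discussed in the rejections above, the feature-space transformation AY+b (para [0065]) and the selection between low-SNR and high-SNR adaptation parameters (steps 440, 445, 450 in fig. 4a) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the 15 dB threshold, and the parameter values are hypothetical and are not taken from Chengalvarayan or from the application.

```python
# Hypothetical sketch: choose a predefined adaptation parameter set by SNR,
# then apply the affine feature transform A*y + b before decoding.

def apply_affine_transform(A, b, y):
    """Return A*y + b for a square matrix A (list of rows) and vectors b, y."""
    return [
        sum(a_ij * y_j for a_ij, y_j in zip(row, y)) + b_i
        for row, b_i in zip(A, b)
    ]

def select_adaptation_params(snr_db, configs, threshold_db=15.0):
    """Pick one of two predefined (A, b) pairs; the threshold is an assumption."""
    key = "high_snr" if snr_db >= threshold_db else "low_snr"
    return configs[key]

# Two predefined configurations; the identity/zero pair mirrors the
# "default adaptation parameters, e.g., identity matrices" idea.
configs = {
    "high_snr": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "low_snr": ([[2.0, 0.0], [0.0, 2.0]], [1.0, 1.0]),
}

feature_vector = [1.0, 3.0]
A, b = select_adaptation_params(5.0, configs)   # noisy condition
transformed = apply_affine_transform(A, b, feature_vector)
```

With the identity matrix and zero bias the feature vector passes through unchanged, which corresponds to an un-adapted initial state; any other (A, b) pair changes the features that the decoder receives.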

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 6, 8, 15, 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chengalvarayan (above) in view of Ravindran et al (US 20160284349 A1, hereinafter Ravindran).
Claim 3: Chengalvarayan teaches all the elements of claim 3, according to claim 1 above, including a language model scale (a bilingual decoder, with a first and a second parameters, para [0078]).
However, Chengalvarayan does not explicitly teach the set of parameters comprises one or more of a word insertion penalty, a silence prior, a word penalty, a beam pruning width, a language model scale, an acoustic model scale, duration model scale, other search pruning control parameters, and a language model interpolation vector.
Ravindran teaches an analogous field of endeavor by disclosing a method (title and abstract, ln 1-2 and fig. 4) and wherein a set of parameters is disclosed (including parameters being used for ASR in figs. 6-8) and wherein the set of parameters comprises one or more of a word insertion penalty (word error rate representing a word insertion error or penalty, para [0023]), a silence prior (scaler for acoustic scores representing silence or error, para [0062]), a word penalty (error due to delete words, substitution word, para [0023]), a beam pruning width  
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied wherein the set of parameters comprises one or more of the word insertion penalty, the silence prior, the word penalty, the beam pruning width, the language model scale, the acoustic model scale, the duration model scale, other search pruning control parameters, and the language model interpolation vector, as taught by Ravindran, to the set of parameters in the method, as taught by Chengalvarayan, for the benefits discussed above.
Claim 6: the combination of Chengalvarayan and Ravindran further teaches, according to claim 5 above,  wherein the metadata comprises one or more of an applicationId, a speakerId, a deviceId, a channelId, a date/time, a geographic location (Ravindran, from GPS to define a location of a user or a device, para [0037]), an application context (Ravindran, indicating the user is running with/without the wind, biking without traffic noise, para [0063]), or a dialogue state.
Claim 8: the combination of Chengalvarayan and Ravindran further teaches, according to claim 1 above, wherein the metadata is incorporated into the model via one-hot encoding or embedding (Chengalvarayan, the speech type as metadata is input to determine the adaptation parameters at steps 425, 430, 435 in fig. 4a, i.e., in a one-hot encoding manner; similarly, SNR as metadata is input to determine adaptation parameters at steps 440, 445, 450 in fig. 4a; and Ravindran, the SNR as metadata is an input to determine parameters such as beamwidth, acoustic scale factor, current token buffer size, etc., para [0040], and location/activity information as metadata is directly input to the parameter refinement unit 214 to determine the sub-vocabulary language model for running based or not, etc., para [0041]).
Claim 15 has been analyzed and rejected according to claims 13, 3 above.
Claim 18 has been analyzed and rejected according to claims 17, 6 above.
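
For reference, the claim term "one-hot encoding" addressed in the rejection of claim 8 above can be illustrated with a minimal sketch. The vocabulary of deviceId values, the function names, and the feature values below are hypothetical and are not taken from the application, Chengalvarayan, or Ravindran.

```python
# Hypothetical sketch: one-hot encode a categorical metadata field and
# append it to an acoustic feature vector to form a model input.

def one_hot(value, vocabulary):
    """Return a vector with 1.0 at the index of value and 0.0 elsewhere."""
    vec = [0.0] * len(vocabulary)
    vec[vocabulary.index(value)] = 1.0  # raises ValueError for unknown values
    return vec

def build_model_input(acoustic_features, device_id, device_vocab):
    """Concatenate acoustic features with the one-hot encoded deviceId."""
    return list(acoustic_features) + one_hot(device_id, device_vocab)

device_vocab = ["handset", "car_kit", "headset"]  # assumed categories
model_input = build_model_input([0.2, 0.7], "car_kit", device_vocab)
```

An embedding differs in that each category is mapped to a learned dense vector rather than to a fixed indicator vector.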

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Chengalvarayan (above).
Claim 7: Chengalvarayan teaches all the elements of claim 7, according to claim 1 above, except wherein the model comprises one of a feedforward neural network, unidirectional or bidirectional recurrent neural network, a convolutional neural network or a support vector machine model.
Official Notice is taken that a model for estimating and outputting parameters in the automatic speech recognition field, comprising a feedforward neural network, a unidirectional or bidirectional recurrent neural network, a convolutional neural network, or a support vector machine model, was notoriously well known in the art before the effective filing date of the 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied the model comprising the neural network, the support vector machine model, the convolutional neural network, etc., as is well known in the art, to the model for estimating and outputting the set of parameters in the method, as taught by Chengalvarayan, for the benefits discussed above.

The prior art made of record and not relied upon (US 20130268270 A1 by Jiang et al) is considered pertinent to applicant's disclosure because Jiang discloses adaptive parameters used by an ASR system for repeated utterances or free speech to improve the accuracy of the ASR system, which corresponds to part of applicant's disclosure.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESHUI ZHANG whose telephone number is (571)270-5589.  The examiner can normally be reached on Monday-Friday 6:30am-4:00pm EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin, can be reached on 571-272-7848.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/LESHUI ZHANG/
Primary Examiner, Art Unit 2654