DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 16-17, 19-22, 24, 28, 30-31, 33-34, 42-44 and 47 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Sharifi (US 9,589,564).
Regarding claim 1:  Sharifi discloses obtaining a speech signal (Fig. 1, user utterance 180); and performing a recognition of the speech signal (Fig. 1, automated speech recognizer 150), including generating a dialect parameter (The classifiers 112A-C may classify the acoustic features in parallel to reduce latency. To reduce processing, the system 100 may include the classifiers 112A-C in a hierarchy where classifiers 112A-C may be in a top level and classifiers 122A-B may be in a sub level that may depend from the top level classifiers 112A-C. The top level classifiers 112A-C may be specific to particular languages and the sub level classifiers 122A-B may be specific to particular dialects of the particular languages. For example, the top 
Regarding claim 16:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses in response to a user input received by a user (Fig. 1, user utterance 180), determining dialect information indicated by the user input to be the input dialect data (The classifiers 112A-C may classify the acoustic features in parallel to reduce latency. To reduce processing, the system 100 may include the classifiers 112A-C in a hierarchy where classifiers 112A-C may be in a top level and classifiers 122A-B may be in a sub level that may depend from the top level classifiers 112A-C. The top level classifiers 112A-C may be specific to particular languages and the sub level classifiers 122A-B may be specific to particular dialects of the particular languages. For example, the top level English classifier 112B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in English, whether American English or British English, a first sub level classifier 122A may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in American English, and a second sub level classifier 122B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken 
Regarding claim 17:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses wherein the generating of the dialect parameter comprises:  calculating the input dialect data from the speech signal (Fig. 1, user utterance 180) using a dialect classification model (Fig. 1, 112A-112C).
Regarding claim 19:  Sharifi satisfies all the elements of claim 17.  Sharifi further discloses wherein the calculating of the input dialect data comprises:  determining an output of at least one layer (Spanish, English or French) of the dialect classification model (Fig. 1, 112A-112C) to be 
Regarding claim 20:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses wherein the generating of the dialect parameter comprises:  calculating the input dialect data from an output of at least one implemented layer (Spanish, English or French) of the dialect speech recognition model (The locale-specific parameter selector 140 may receive an indication of the speech locale selected by the speech locale selector 130. For example, the locale-specific parameter selector 140 may receive an indication that the speech locale selector 130 has selected to use American English as the speech locale. Based on the indication, the locale-specific parameter selector 140 may select parameters for automated speech recognition. For example, the locale-specific parameter selector 140 may select to use parameters for automated speech recognition that correspond to parameters that more accurately recognize American English speech. The parameters may specify using a speech recognition model that corresponds to the selected speech locale. For example, the locale-specific parameter selector 140 may select parameters that specify using an American English speech recognition model for recognizing speech in an utterance when the American English speech locale is selected., col. 7, ln. 21-38).
Regarding claim 21:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses wherein the input dialect data is the speech signal (Fig. 1, user utterance 180).
Regarding claim 22:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses wherein the parameter generation model (The locale-specific parameter selector 140 may receive an indication of the speech locale selected by the speech locale selector 130. For example, the locale-specific parameter selector 140 may receive an indication that the speech locale selector 130 has selected to use American English as the speech locale. Based on the indication, the 
Regarding claim 24:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses wherein the generating of the dialect parameter comprises:  obtaining, as the input dialect data, 
Regarding claim 28:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses retraining the parameter generation model (The locale-specific parameter selector 140 may receive an indication of the speech locale selected by the speech locale selector 130. For example, the locale-specific parameter selector 140 may receive an indication that the speech locale selector 130 has selected to use American English as the speech locale. Based on the indication, the locale-specific parameter selector 140 may select parameters for automated speech recognition. For example, the locale-specific parameter selector 140 may select to use parameters for automated speech recognition that correspond to parameters that more accurately recognize American English speech. The parameters may specify using a speech recognition model that corresponds to the selected speech locale. For example, the locale-specific parameter selector 140 may select parameters that specify using an American English speech recognition model for recognizing speech in an utterance when the American English speech locale is selected., col. 7, ln. 21-38) based on the speech signal (Fig. 1, user utterance 180) and the input dialect data (The classifiers 112A-C may classify the acoustic features in parallel to reduce latency. To reduce processing, the system 100 may include the classifiers 112A-C in a hierarchy where classifiers 112A-C may be in a top level and classifiers 122A-B may be in a sub level that may depend from the top level classifiers 112A-C. The top level classifiers 112A-C may be specific to particular languages and the sub level classifiers 122A-B may be specific to particular dialects of the particular languages. For example, the top level English classifier 112B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in English, whether American English or British English, a first sub level classifier 122A may 
Regarding claim 30:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses identifying a language of a user (Spanish, English or French) and selecting a trained speech recognition model (Fig. 1, automated speech recognizer 150), from among plural respective different language trained speech recognition models (Fig. 1, automated speech recognizer 150) stored in a memory (Fig. 1, database 160), corresponding to the identified language (Spanish, English or French), wherein the applying of the dialect parameter (The classifiers 112A-C may classify the acoustic features in parallel to reduce latency. To reduce processing, the system 100 may include the classifiers 112A-C in a hierarchy where classifiers 112A-C may be in a top level and classifiers 122A-B may be in a sub level that may depend from the top level classifiers 112A-C. The top level classifiers 112A-C may be specific to particular languages and the sub level classifiers 122A-B may be specific to particular dialects of the particular languages. For example, the top level English classifier 112B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in English, whether American English or British English, a first sub level classifier 122A may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in American English, and a second sub level classifier 122B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in British English. The hotword "OK COMPUTER" for American English may include the same 
Regarding claim 31:  Sharifi satisfies all the elements of claim 1.  Sharifi further discloses wherein the generating of the dialect parameter comprises:  dynamically generating a dialect parameter (The classifiers 112A-C may classify the acoustic features in parallel to reduce latency. To reduce processing, the system 100 may include the classifiers 112A-C in a hierarchy where classifiers 112A-C may be in a top level and classifiers 122A-B may be in a sub level that may depend from the top level classifiers 112A-C. The top level classifiers 112A-C may be specific to particular languages and the sub level classifiers 122A-B may be specific to particular dialects of the particular languages. For example, the top level English classifier 112B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in English, whether American English or British English, a first sub level classifier 122A may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in American English, and a second sub level classifier 122B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in British English. The hotword "OK COMPUTER" for American English may include the same terms as the hotword "OK COMPUTER" for British English, but may be pronounced slightly differently., col. 5, ln. 30-49) each time a speech signal (Fig. 1, user utterance 180) is obtained.
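The two-level classifier hierarchy described in the cited passage of Sharifi (col. 5, ln. 30-49) can be sketched roughly as follows. This is a minimal illustration only: the function name `select_speech_locale`, the dictionary layout, and the assumption that each classifier is a callable returning a confidence score are all hypothetical and do not appear in Sharifi. Only the sub-level classifiers belonging to the winning top-level language are evaluated, which is the processing reduction the passage refers to.

```python
def select_speech_locale(features, language_classifiers, dialect_classifiers):
    """Pick a language with the top-level classifiers, then a dialect with the
    sub-level classifiers that depend from the winning language.

    language_classifiers: {language: callable(features) -> score}   (hypothetical layout)
    dialect_classifiers:  {language: {dialect: callable(features) -> score}}
    """
    # Top level: score every language-specific classifier and keep the best one.
    language = max(language_classifiers,
                   key=lambda name: language_classifiers[name](features))

    # Sub level: only the dialect classifiers under the selected language run.
    sub_level = dialect_classifiers.get(language, {})
    if not sub_level:
        # Assumption: a language with no sub-level classifiers yields no dialect.
        return language, None

    dialect = max(sub_level, key=lambda name: sub_level[name](features))
    return language, dialect
```

Under these assumptions, an utterance whose features score highest for the English classifier would then be scored only by the American English and British English classifiers, and the Spanish and French branches would not be descended.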
Regarding claim 33:  Sharifi satisfies all the elements of claim 1.  Arguments analogous to those stated in the rejection of claim 1 are applicable.  A non-transitory computer-readable 
Regarding claim 34:  Sharifi discloses one or more memories storing a parameter generation model (The locale-specific parameter selector 140 may receive an indication of the speech locale selected by the speech locale selector 130. For example, the locale-specific parameter selector 140 may receive an indication that the speech locale selector 130 has selected to use American English as the speech locale. Based on the indication, the locale-specific parameter selector 140 may select parameters for automated speech recognition. For example, the locale-specific parameter selector 140 may select to use parameters for automated speech recognition that correspond to parameters that more accurately recognize American English speech. The parameters may specify using a speech recognition model that corresponds to the selected speech locale. For example, the locale-specific parameter selector 140 may select parameters that specify using an American English speech recognition model for recognizing speech in an utterance when the American English speech locale is selected., col. 7, ln. 21-38), a trained speech recognition model (The locale-specific parameter selector 140 may receive an indication of the speech locale selected by the speech locale selector 130. For example, the locale-specific parameter selector 140 may receive an indication that the speech locale selector 130 has selected to use American English as the speech locale. Based on the indication, the locale-specific parameter selector 140 may select parameters for automated speech recognition. For example, the locale-specific parameter selector 140 may select to use parameters for automated speech recognition that correspond to parameters that more accurately recognize American English speech. The parameters may specify using a speech recognition model that corresponds to the selected speech locale. For example, the locale-specific parameter selector 140 may select 
Regarding claim 42:  Sharifi satisfies all the elements of claim 34.  Sharifi further discloses wherein the processor (Fig. 4, processor 402) is configured to determine, to be the input dialect data (The classifiers 112A-C may classify the acoustic features in parallel to reduce latency. To reduce processing, the system 100 may include the classifiers 112A-C in a hierarchy where classifiers 112A-C may be in a top level and classifiers 122A-B may be in a sub level that may depend from the top level classifiers 112A-C. The top level classifiers 112A-C may be specific to particular languages and the sub level classifiers 122A-B may be specific to particular dialects of the particular languages. For example, the top level English classifier 112B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in English, whether American English or British English, a first sub level classifier 122A may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in American English, and a second sub level classifier 122B may correspond to a classifier that has been trained to detect "OK COMPUTER" as spoken in British English. The hotword "OK COMPUTER" for American English may include the same terms as the hotword "OK COMPUTER" for British English, but 
Regarding claim 43:  Sharifi satisfies all the elements of claim 34.  Sharifi further discloses wherein the processor (Fig. 4, processor 402) is configured to calculate the input dialect data from the speech signal (Fig. 1, user utterance 180) using a dialect classification model (Fig. 1, 112A-112C).
Regarding claim 44:  Sharifi satisfies all the elements of claim 34.  Sharifi further discloses wherein the input dialect data is the speech signal (Fig. 1, user utterance 180).
Regarding claim 47:  Sharifi satisfies all the elements of claim 34.  Sharifi further discloses a microphone (The acoustic feature extractor 110 may receive sounds corresponding to an utterance (in the figure, "OK COMPUTER") said by a user 180, where the sounds may be captured by an audio capture device, e.g., a microphone that converts sounds into an electrical signal. The acoustic feature extractor 110 may extract acoustic features from the utterance. The acoustic features may be Mel-frequency cepstrum coefficients (MFCCs) or filter bank energies computed over windows of an audio signal., col. 4, ln. 22-30), wherein the processor (Fig. 4, processor 402) is further configured to control the microphone (The acoustic feature extractor 110 may receive sounds corresponding to an utterance (in the figure, "OK COMPUTER") said by a user 180, where the sounds may be captured by an audio capture device, e.g., a microphone that converts sounds into an electrical signal. The acoustic feature extractor 110 may extract acoustic features from the utterance. The acoustic features may be Mel-frequency cepstrum coefficients (MFCCs) or filter bank energies computed over windows .
Reasons for Allowance
Claim 48 is allowed. 
The present invention is directed to speech recognition. Independent claim 48 recites the following uniquely distinct features:
one or more memories storing a parameter generation model, a dialect classification model, a trained speech recognition model, and instructions, where the trained speech recognition model is a neural network model with at least the one or more layers, each of the one or more layers including at least a node connected to one or more hierarchically previous layer nodes and/or one or more temporally previous nodes according to respective weighted connections; and
a processor, which by executing the instructions is configured to:
generate an input dialect data, by using the dialect classification model with respect to an obtained speech signal, where the input dialect data is a determined indication of a classified dialect of the speech signal or probabilistic data of a complex dialect of the speech signal;
generate respective dialect parameters from the input dialect data using the parameter generation model;
apply the respective dialect parameters to the trained speech recognition model to generate a dialect speech recognition model; and
generate a speech recognition result through an implementation, with respect to the speech signal, of the dialect speech recognition model to generate the speech recognition result for the speech signal,

wherein the applying of the respective dialect parameters includes inserting a connection weighting or setting, replacing, or modifying respective connection weights in each of the one or more layers, less than all of the respective weighted connections.

The closest prior art, US 9,589,564 (“Sharifi”), fails to anticipate or render obvious at least the above limitations.
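The sequence of operations recited in the limitations above (classify the dialect, generate dialect parameters, apply them to fewer than all of the weighted connections in each layer, then recognize) can be sketched as below. This is a toy illustration under stated assumptions, not the applicant's implementation: every function name is hypothetical, the dialect classification model is stood in for by a softmax over distances to per-dialect prototype vectors, the parameter generation model by a probability-weighted blend of per-dialect weight rows, and "less than all" is modeled by replacing a single row of each layer's weight matrix.

```python
import copy
import math


def classify_dialect(features, prototypes):
    """Toy dialect classification model: softmax over negative squared
    distances to per-dialect prototypes, giving probabilistic dialect data."""
    scores = [-sum((f - p) ** 2 for f, p in zip(features, proto))
              for proto in prototypes]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generate_dialect_parameters(dialect_probs, per_dialect_rows):
    """Toy parameter generation model: blend per-dialect replacement rows.
    per_dialect_rows[d][l] is the row proposed for layer l under dialect d."""
    blended = []
    for layer in range(len(per_dialect_rows[0])):
        width = len(per_dialect_rows[0][layer])
        blended.append([
            sum(p * per_dialect_rows[d][layer][k]
                for d, p in enumerate(dialect_probs))
            for k in range(width)
        ])
    return blended


def apply_dialect_parameters(base_layers, dialect_rows, row_index=0):
    """Replace one row of connection weights in each layer, leaving the other
    rows (and the original trained model) untouched."""
    adapted = copy.deepcopy(base_layers)
    for weights, new_row in zip(adapted, dialect_rows):
        weights[row_index] = list(new_row)
    return adapted


def recognize(layers, features):
    """Forward pass through tanh layers; returns the index of the largest output."""
    activations = list(features)
    for weights in layers:
        activations = [math.tanh(sum(w * a for w, a in zip(row, activations)))
                       for row in weights]
    return max(range(len(activations)), key=activations.__getitem__)


def run_pipeline(features, prototypes, per_dialect_rows, base_layers):
    """Chain the four steps for one speech signal's feature vector."""
    probs = classify_dialect(features, prototypes)
    rows = generate_dialect_parameters(probs, per_dialect_rows)
    dialect_model = apply_dialect_parameters(base_layers, rows)
    return probs, dialect_model, recognize(dialect_model, features)
```

In this sketch the base model is deep-copied before modification, so the trained speech recognition model remains available for the next speech signal while the dialect speech recognition model differs from it only in the replaced rows.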

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Allowable Subject Matter
Claims 2-15, 18, 23, 25, 27, 29, 32, 35-41 and 45-46 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLOTTE M BAKER whose telephone number is (571)272-7459.  The examiner can normally be reached on Mon - Fri 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, NAY A MAUNG, can be reached at (571)272-7778.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CHARLOTTE M BAKER/Primary Examiner, Art Unit 2664