DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 24 September 2019 in reference to application 16/481,076.  Claims 1-20 are pending and have been examined.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 

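As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 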
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
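Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.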
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “a data transcription module,” “a corpus generation module,” “an acoustic model generation module,” and “a speech recognition engine” in claim 11, “a first module,” “a second module,” and “a third module” in claim 12, “a data collection module” in claim 13, and “a feature extraction module,” “a deep learning module,” “a core regional dialect item extraction module,” and “a corpus standardization module” in claim 15.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
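A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).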
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-20 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of copending Application No. 16/575,317 in view of Chen et al. (US PAP 2020/0258497) as laid out in the chart below. 
This is a provisional nonstatutory double patenting rejection.
Instant Application
US Application 16/575,317
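Claim 1: A regional dialect phoneme adaptive training method, performed by a regional dialect phoneme adaptive training system, the regional dialect phoneme adaptive training method comprising: 
Claim 1: A language modeling method, performed by a language modeling system, the language modeling method comprising: 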

transcribing text data by sorting regional dialect-containing speech data from collected speech data; 
generating a regional dialect corpus using the text data and the regional dialect-containing speech data; and 
generating a regional dialect corpus using the text data and the regional dialect-containing speech data; and 
generating an acoustic model and a language model using the regional dialect corpus, 
generating an acoustic model and a language model using the regional dialect corpus, 
wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item and training a phoneme adaptive model based on the extracted phonemes.
wherein the generating an acoustic model and a language model comprises marking speech data on word spacing of a regional dialect sentence using a tag, and training the language model based on the speech data.
Claim 2: The regional dialect phoneme adaptive training method of claim 1, further comprising collecting speech data through a speech recognition service domain.
Claim 2: The language modeling method of claim 1, further comprising collecting speech data of a user through a speech recognition service domain.
Claim 3: The regional dialect phoneme adaptive training method of claim 2, wherein, in the collecting speech data, speech data of users using different regional dialects is collected through speech input/output interfaces of various electronic devices.
Claim 3: The language modeling method of claim 2, wherein, in the collecting speech data, speech data of users using different regional dialects is collected through speech input/output interfaces of various electronic devices.
Claim 4: The regional dialect phoneme adaptive training method of claim 1, wherein the transcribing text data comprises: 
Claim 4: The language modeling method of claim 1, wherein the transcribing text data comprises: 
removing an abnormal vocalization from the collected speech data; 
removing an abnormal vocalization from the collected speech data; 
selecting regional dialect-containing speech data using a reliability measurement of the speech data; and 
selecting regional dialect-containing speech data using a reliability measurement of the speech data; and 
obtaining transcription data from the regional dialect-containing speech data.
obtaining transcription data from the regional dialect-containing speech data.
Claim 5: The regional dialect phoneme adaptive training method of claim 1, wherein the generating a regional dialect corpus comprises: 
Claim 5: The language modeling method of claim 1, wherein the generating a regional dialect corpus comprises: 

extracting a feature from the regional dialect-containing speech data; 
performing clustering of similar regional dialect items in the regional dialect-containing speech data using the extracted feature; 
performing clustering of similar regional dialect items in the regional dialect-containing speech data using the extracted feature; 
extracting a core regional dialect item from a similar regional dialect item cluster; and 
extracting a core regional dialect item from a similar dialect item cluster; and 
standardizing a regional dialect corpus using the extracted core regional dialect item…
standardizing a regional dialect corpus using the extracted core regional dialect item.
Claim 6: The regional dialect phoneme adaptive training method of claim 5, wherein, in the extracting a feature from the regional dialect-containing speech data, at least one among pronunciation string features, lexical features, domain features, and frequency features of a regional dialect item is extracted.
Claim 6: The language modeling method of claim 5, wherein, in the extracting a feature from the regional dialect-containing speech data, at least one among pronunciation string features, lexical features, domain features, and frequency features of a regional dialect item is extracted.
Claim 7: The regional dialect phoneme adaptive training method of claim 6, wherein the domain features comprise information on a type of an electronic apparatus providing a speech recognition service for the user, information on a region in which the electronic apparatus is located, and information on an age group of the user of the electronic apparatus.
Claim 7: The language modeling method of claim 6, wherein the domain features comprise information on a type of an electronic apparatus providing a speech recognition service for the user, information on a region in which the electronic apparatus is located, and information on an age group of the user of the electronic apparatus.
Claim 8: The regional dialect phoneme adaptive training method of claim 5, wherein, in the performing clustering of similar regional dialect items, a degree of similarity between features is measured through a weight calculation between the features according to an unsupervised learning method, and regional dialect items having a degree of similarity higher than a threshold are clustered.
Claim 8: The language modeling method of claim 5, wherein, in the performing clustering of similar dialect items, a degree of similarity between features is measured through a weight calculation between the features according to an unsupervised learning method, and regional dialect items having a degree of similarity higher than a threshold are clustered.
Claim 9: The regional dialect phoneme adaptive training method of claim 5, wherein, in the extracting a core regional dialect item from the similar regional dialect item cluster, N number of objects having the highest frequency features in a cluster are extracted, and a core object is extracted through a feature similarity calculation with other objects in the cluster.
Claim 9: The language modeling method of claim 5, wherein, in the extracting a core regional dialect item from the similar dialect item cluster, N number of objects having the highest frequency features in a cluster are extracted, and a core object is extracted through a feature similarity calculation with other objects in the cluster.
Claim 10: The regional dialect phoneme adaptive training method of claim 5, wherein, in the standardizing a regional dialect corpus, an existing regional dialect item is replaced with a core object regional dialect item, and verification is performed through a similarity measurement between an original regional dialect sentence and a replaced sentence.
Claim 10: The language modeling method of claim 5, wherein, in the standardizing a regional dialect corpus, an existing regional dialect item is replaced with a core object regional dialect item, and verification is performed through a similarity measurement between an original dialect sentence and a replaced sentence.
Claim 11: A regional dialect phoneme adaptive training system, comprising: 
Claim 11: A language modeling system, comprising: 
a data transcription module transcribing text data from regional dialect-containing speech data of collected speech data; 
a data transcription module transcribing text data from regional dialect-containing speech data of collected speech data; 
a corpus generation module generating a regional dialect corpus using the text data and the regional dialect-containing speech data; 
a corpus generation module generating a regional dialect corpus using the text data and the regional dialect-containing speech data; 
an acoustic model generation module and a language model generation module generating an acoustic model and a language model, respectively, using the regional dialect corpus; and 
an acoustic model generation module and a language model generation module generating an acoustic model and a language model, respectively, using the regional dialect corpus; and 
a speech recognition engine recognizing speech using the trained acoustic model and the trained language model, 
a speech recognition engine recognizing speech using the trained acoustic model and the trained language model, 
wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item and training a phoneme adaptive model based on the extracted phonemes.
wherein the generating an acoustic model and a language model comprises marking speech data on word spacing of a regional dialect sentence using a tag, and training the language model based on the speech data.
Claim 12: The regional dialect phoneme adaptive training system of claim 11, wherein the acoustic model generation module comprises: 
Claim 12: The language modeling system of claim 11, wherein the language model generation module comprises: 
a first module extracting phonemes of a regional dialect item from regional dialect-containing speech data; 
a first module marking a silent syllable using a tag on speech data extracted from the regional dialect-containing speech data; 

a second module segmenting the speech data into word-phrases; and 
a third module training a phoneme adaptive model using the extracted phonemes and the extracted frequency.
a third module extracting a frequency of units of the word-phrases.
Claim 13: The regional dialect phoneme adaptive training system of claim 11, further comprising a data collection module collecting speech data of users using different regional dialects through speech input/output interfaces of various electronic devices.
Claim 13: The language modeling system of claim 11, further comprising a data collection module collecting speech data of users using different regional dialects through speech input/output interfaces of various electronic devices.
Claim 14: The regional dialect phoneme adaptive training system of claim 11, wherein the data transcription module removes an abnormal vocalization from collected speech data, selects regional dialect-containing speech data using a reliability measurement of the speech data, and generates transcription data from the regional dialect-containing speech data.
Claim 14: The language modeling system of claim 11, wherein the data transcription module removes an abnormal vocalization from collected speech data, selects regional dialect- containing speech data using a reliability measurement of the speech data, and generates transcription data from the regional dialect-containing speech data.
Claim 15: The regional dialect phoneme adaptive training system of claim 11, wherein the corpus generation module comprises: 
Claim 15: The language modeling system of claim 11, wherein the corpus generation module comprises: 
a feature extraction module extracting a feature from the regional dialect-containing speech data; 
a feature extraction module extracting a feature from the regional dialect-containing speech data; 
a deep learning module performing clustering of similar regional dialect items in the regional dialect-containing speech data using the extracted feature; 
a deep learning module performing clustering of similar dialect items in the regional dialect-containing speech data using the extracted feature; 
a core regional dialect item extraction module extracting a core regional dialect item from a similar regional dialect item cluster; and 
a core regional dialect item extraction module extracting a core regional dialect item from a similar dialect item cluster; and 
a corpus standardization module standardizing a regional dialect corpus using the extracted core regional dialect item.
a corpus standardization module standardizing a regional dialect corpus using the extracted core regional dialect item.
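Claim 16: The regional dialect phoneme adaptive training system of claim 15, wherein the feature extraction module extracts at least one among pronunciation string features, lexical features, domain features, and frequency features of a regional dialect item.
Claim 16: The language modeling system of claim 15, wherein the feature extraction module extracts at least one among pronunciation string features, lexical features, domain features, and frequency features of a regional dialect item.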
Claim 17: The regional dialect phoneme adaptive training system of claim 16, wherein the domain features comprise information on a type of an electronic apparatus providing a speech recognition service for the user, information on a region in which the electronic apparatus is located, and information on an age group of the user of the electronic apparatus.
Claim 17: The language modeling system of claim 16, wherein the domain features comprise information on a type of an electronic apparatus providing a speech recognition service for the user, information on a region in which the electronic apparatus is located, and information on an age group of the user of the electronic apparatus.
Claim 18: The regional dialect phoneme adaptive training system of claim 15, wherein the deep learning module measures a degree of similarity between features through a weight calculation between the features according to an unsupervised learning method, and clusters regional dialect items having a degree of similarity higher than a threshold.
Claim 18: The language modeling system of claim 15, wherein the deep learning module measures a degree of similarity between features through a weight calculation between the features according to an unsupervised learning method, and clusters regional dialect items having a degree of similarity higher than a threshold.
Claim 19: The regional dialect phoneme adaptive training system of claim 15, wherein the core regional dialect item extraction module extracts N number of objects having the highest frequency features in a cluster, and extracts a core object through a feature similarity calculation with other objects in the cluster.
Claim 19: The language modeling system of claim 15, wherein the core regional dialect item extraction module extracts N number of objects having the highest frequency features in a cluster, and extracts a core object through a feature similarity calculation with other objects in the cluster.
Claim 20: The regional dialect phoneme adaptive training system of claim 15, wherein the corpus standardization module replaces an existing regional dialect item with a core regional dialect item, and performs verification through a similarity measurement between an original regional dialect sentence and a replaced sentence.
Claim 20: The language modeling system of claim 15, wherein the corpus standardization module replaces an existing regional dialect item with a core regional dialect item, and performs verification through a similarity measurement between an original regional dialect sentence and a replaced sentence.


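Application 16/575,317 does not specifically teach wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item and a frequency of the phonemes of the regional dialect item, and training a phoneme adaptive model based on the extracted phonemes and the extracted frequency.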
In the same field of building acoustic and language models, Chen et al. (US PAP 2020/0258497) teaches wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item and a frequency of the phonemes of the regional dialect item, and training a phoneme adaptive model based on the extracted phonemes and the extracted frequency (0050-55, and claim 7, phoneme frequency values used to generate language models).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to generate models using phoneme frequencies as taught by Chen in the system of 16/575,317 in order to increase recognizer accuracy (Chen 0009).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over Biadsy et al. (US Patent 8,583,432) in view of Chen et al. (US PAP 2020/0258497).

Consider claim 1, Biadsy teaches a language modeling method, performed by a language modeling system (abstract), the language modeling method comprising: 
transcribing text data by sorting regional dialect-containing speech data from collected speech data (col 12 lines 15-31, dialect recognition performed on training data set, Col 9, lines 28-35, Col 12 lines 48-56, using GALE training corpus, which consists of speech audio and its transcription, regional dialects within); 
generating a regional dialect corpus using the text data and the regional dialect-containing speech data (col 12 lines 15-31 portions containing each dialect determined); and 
generating an acoustic model and a language model using the regional dialect corpus (col 7 lines 20-30, generating acoustic models for each dialect, col 12 lines 32-47, language models built for each dialect), 
wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item, and training a phoneme adaptive model based on the extracted phonemes (col 7 lines 20-30, generating acoustic models for each dialect, phoneme extraction, col 12 lines 32-47, language models built for each dialect based on portions determined to contain dialect).
Biadsy does not specifically teach wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item and a frequency of the phonemes of the regional dialect item, and training a phoneme adaptive model based on the extracted phonemes and the extracted frequency.
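In the same field of building acoustic and language models, Chen et al. (US PAP 2020/0258497) teaches wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item and a frequency of the phonemes of the regional dialect item, and training a phoneme adaptive model based on the extracted phonemes and the extracted frequency (0050-55, and claim 7, phoneme frequency values used to generate language models).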
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to generate models using phoneme frequencies as taught by Chen in the system of Biadsy in order to increase recognizer accuracy (Chen 0009).

Claims 2 and 3 are rejected under 35 U.S.C. 103 as being unpatentable over Biadsy and Chen as applied to claim 1 above, and further in view of Kapralova et al. (US PAP 2016/0093294).

Consider claim 2, Biadsy and Chen teach the language modeling method of claim 1, but do not specifically teach further comprising collecting speech data of a user through a speech recognition service domain.
In the same field of model building, Kapralova teaches collecting speech data of a user through a speech recognition service domain (figure 1, 0022-26, speech recognition inputs and results collected and used to generate new recognition models).
Therefore it would have been obvious to one of ordinary skill in the art to collect training data from speech input as taught by Kapralova in the system of Biadsy and Chen in order to quickly generate a large amount of training data (Kapralova 0007).

Consider claim 3, Kapralova and Biadsy teach the language modeling method of claim 2, wherein, in the collecting speech data, speech data of users using different regional dialects (Biadsy Col 9, lines 28-35 training data contains different dialects) is collected through speech input/output interfaces of various electronic devices (Kapralova 0018, speech from many users and devices may be collected).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Biadsy and Chen as applied to claim 1 above, and further in view of Faizakov et al. (US PAP 2008/0240396).

Consider claim 4, Biadsy and Chen teach the language modeling method of claim 1, wherein the transcribing text data comprises: 
selecting regional dialect-containing speech data using a reliability measurement of the speech data (Biadsy col 9 lines 15-30, probabilities determined that an utterance conforms to a dialect); and 
obtaining transcription data from the regional dialect-containing speech data (Col 9, lines 28-35, Col 12 lines 48-56, using GALE training corpus, which consists of speech audio and its transcription, regional dialects within).
Biadsy and Chen do not specifically teach removing an abnormal vocalization from the collected speech data.
In the same field of training models, Faizakov teaches removing an abnormal vocalization from the collected speech data (0053, disfluencies such as “umm” are removed).
.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Biadsy in view of Kapralova and further in view of Chen.

Consider claim 11, Biadsy teaches a language modeling system (abstract), comprising: 
a corpus generation module generating a regional dialect corpus using the text data and the regional dialect-containing speech data (col 12 lines 15-31, dialect recognition performed on training data set, Col 9, lines 28-35, Col 12 lines 48-56, using GALE training corpus, which consists of speech audio and its transcription, regional dialects within); 
an acoustic model generation module and a language model generation module generating an acoustic model and a language model, respectively, using the regional dialect corpus (col 7 lines 20-30, generating acoustic models for each dialect, col 12 lines 32-47, language models built for each dialect); and
a speech recognition engine recognizing speech using the trained acoustic model and the trained language model (Figures 4 and 5, speech recognition using trained models).

Biadsy does not specifically teach a data transcription module transcribing text data from regional dialect-containing speech data of collected speech data.
In the same field of model building, Kapralova teaches a data transcription module transcribing text data from regional dialect-containing speech data of collected speech data (figure 1, 0022-26, speech recognition inputs and results collected and used to generate new recognition models).
Therefore it would have been obvious to one of ordinary skill in the art to collect training data from speech input as taught by Kapralova in the system of Biadsy in order to quickly generate a large amount of training data (Kapralova 0007).
Biadsy and Kapralova do not specifically teach wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item and a frequency of the phonemes of the regional dialect item, and training a phoneme adaptive model based on the extracted phonemes and the extracted frequency.
In the same field of building acoustic and language models, Chen et al. (US PAP 2020/0258497) teaches wherein the generating an acoustic model and a language model comprises extracting phonemes of a regional dialect item and a frequency of the phonemes of the regional dialect item, and training a phoneme adaptive model based on the extracted phonemes and the extracted frequency (0050-55, and claim 7, phoneme frequency values used to generate language models).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to generate models using phoneme frequencies as taught by Chen in the system of Biadsy and Kapralova in order to increase recognizer accuracy (Chen 0009).

Consider claim 12, Chen teaches the language modeling system of claim 11, wherein the acoustic model generation module comprises: 
a first module extracting phonemes of a regional dialect item from regional dialect-containing speech data (0039, determining phoneme sequences from training data, which in combination with Biadsy would be dialect specific); 
a second module extracting a frequency of the phonemes of the regional dialect item (0050-55, and claim 7, phoneme frequency values used to generate language models); and 
a third module training a phoneme adaptive model using the extracted phonemes and the extracted frequency (0050-55, and claim 7, phoneme frequency values used to generate language models).

Consider claim 13, Kapralova and Biadsy teach the language modeling system of claim 11, further comprising a data collection module collecting speech data of users using different regional dialects (Biadsy Col 9, lines 28-35 training data contains different dialects) through speech input/output interfaces of various electronic devices (Kapralova 0018, speech from many users and devices may be collected).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Biadsy, Kapralova, and Chen as applied to claim 11 above, and further in view of Faizakov et al. (US PAP 2008/0240396).

Consider claim 14, Biadsy, Kapralova, and Chen teach the language modeling system of claim 11, wherein the data transcription module: 
selects regional dialect-containing speech data using a reliability measurement of the speech data (Biadsy col 9 lines 15-30, probabilities determined that an utterance conforms to a dialect); and 
generates transcription data from the regional dialect-containing speech data (Biadsy Col 9, lines 28-35, Col 12 lines 48-56, using GALE training corpus, which consists of speech audio and its transcription, regional dialects within).
Biadsy, Kapralova, and Chen do not specifically teach removing an abnormal vocalization from the collected speech data.
In the same field of training models, Faizakov teaches removing an abnormal vocalization from the collected speech data (0053, disfluencies such as “umm” are removed).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to remove disfluencies from speech data as taught by Faizakov in the .

Allowable Subject Matter
Claims 5-10 and 15-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if a terminal disclaimer is filed to overcome the double patenting rejections set forth in this Office action. The following is a statement of reasons for the indication of allowable subject matter:

Consider claim 5, Biadsy and Chen teach the language modeling method of claim 1, wherein the generating a regional dialect corpus comprises: 
extracting a feature from the regional dialect-containing speech data (col 12 lines 15-25, distinguishing phones of dialect).
However, the prior art of record does not teach or fairly suggest the limitations of
“performing clustering of similar regional dialect items in the regional dialect-containing speech data using the extracted feature; 
extracting a core regional dialect item from a similar dialect item cluster; and 
standardizing a regional dialect corpus using the extracted core regional dialect item” when combined with each and every other limitation of the claim and the base claim.  Therefore claim 5 contains allowable subject matter.

Claim 15 contains similar subject matter as claim 5 and therefore contains allowable subject matter as well.

Claims 6-10 and 16-20 depend on and further limit claims 5 and 15 and therefore contain allowable subject matter as well.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and is listed on the Notice of References Cited.  
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451.  The examiner can normally be reached on 7:30-12 Monday and Friday, 7:30-6 Tuesday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for 


DOUGLAS GODBOLD
Examiner
Art Unit 2658



/DOUGLAS GODBOLD/           Primary Examiner, Art Unit 2658