DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 1/6/2021 has been entered.
Response to Amendment
In response to the Office action of 11/3/2020, the applicant filed an RCE amending claims 1-11 and 13-20 and traversing the prior art rejections. Prior art was found that reads on the latest amendments; however, after determining allowable subject matter, the examiner recommended an examiner’s amendment to claims 1, 8 and 13 to place the case in condition for allowance. Accordingly, claims 1-4, 6-8, 10-13 and 15-20, with the examiner’s amendments that follow, are allowable over the prior art of record for the reasons for allowance provided below.
EXAMINER’S AMENDMENT
The examiner has changed the title of the invention to “ACTIVE LEARNING FOR LARGE-SCALE SEMI-SUPERVISED CREATION OF SPEECH RECOGNITION TRAINING CORPORA BASED ON NUMBER OF TRANSCRIPTION MISTAKES AND NUMBER OF WORD OCCURRENCES” so as to be more descriptive of the invention.
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in an interview with the attorney of record, Mr. Kyle Schlueter, on 3/11/2021.

Amend claims 1, 8 and 13; cancel claims 5, 9 and 14.

As Per Claim 1:

Claim 1 (proposed amendment):  A method for generating training data for an automatic speech recognition (ASR) system, said method comprising:
receiving a user command to perform speech corpus gathering for an ASR system training database;
 first textual content from said ASR system training database based on identification of textual content for which said ASR system makes a number of transcription mistakes exceeding a first threshold, wherein said transcription mistakes are mistakes that said ASR system makes in recognizing and/or transcribing received audible speech into text;
at least one of (i) selecting, from said ASR system training database, second textual content having fewer than a second threshold number of occurrences that is greater than one in said ASR system training database, and (ii) selecting third textual content based upon an n-gram distribution on a language;
presenting a textual representation of  that is selected from amongst said first, second, and third textual content;
receiving, from a user, a speech utterance that is spoken by said user and that includes said selected textual content;
providing said speech utterance to said ASR system, wherein said speech utterance is represented as an audio file; and
storing said audio file in said ASR system training database, wherein said ASR system training database associates said audio file that represents said speech utterance received from said user with said selected textual content.

As Per Claim 5:

Cancel.

As Per Claim 8:
Claim 8 (proposed amendment):  A system for generating training data for an automatic speech recognition (ASR) system, said system comprising:
a selection circuit to
select textual content based on identification of first textual content for which said ASR system makes a number of transcription mistakes exceeding a first threshold, wherein said transcription mistakes are mistakes that said ASR system makes in recognizing and/or transcribing received audible speech into text,
select one or more of (i) from an ASR system training database, second textual content having fewer than a second threshold number of occurrences that is greater than one in said ASR system training database, and (ii) selecting third textual content based upon an n-gram distribution on a language text;
an audio input circuit to receive a speech utterance in response to a textual representation of  that is selected from amongst said first, second, and third textual content, and store said speech utterance as an audio file;
an application programming interface (API) interaction circuit to interact with an ASR API so as to retrieve a transcript of said audio file; and
a data augmentation circuit to process said audio file thereby generating a modified audio file;
wherein said system training database stores 

As Per Claim 9:

Cancel.

As Per Claim 13:

Claim 13 (proposed amendment):  A computer program product including one or more non-transitory machine-readable mediums encoded with instructions that when executed by one or more processors cause a process to be carried out for generating training data for an automatic speech recognition (ASR) system, said process comprising:

in response to receiving said command, extracting first textual content stored in said ASR system training database based on identification of textual content for which said ASR system makes a number of transcription mistakes exceeding a first threshold, wherein said transcription mistakes are mistakes that said ASR system makes in recognizing and/or transcribing received audible speech into text;
extracting, from said ASR system training database, second textual content having fewer than a second threshold number of occurrences greater than one in said ASR system training database;
presenting, via said user interface, a textual representation of extracted  that is selected from amongst said first and second textual content extracted from said ASR system training database;
after presenting said textual representation of said extracted textual content, receiving, from said user, a speech utterance that is spoken by said user and that includes said extracted textual content;
providing said speech utterance to said ASR system, wherein said speech utterance is represented as an audio file;
receiving a transcript for said speech utterance from said ASR system;
extracted textual content, and after receiving said speech utterance, presenting said transcript for review in said user interface;
receiving, via said user interface, a modification to said transcript that is provided by said user;
generating a modified transcript based on said received modification; and
storing said modified transcript and said audio file in said ASR system training database.

As Per Claim 14:

Cancel.
Allowable Subject Matter
The following is an examiner’s statement of reasons for allowance: Independent claims 1, 8, and 13 concern construction of a “corpus” or “training database” for training a speech recognition system. The method relies on three distinct criteria for choosing which words and/or phrases are stored in said database. The first criterion requires the “number of transcription mistakes” for a word to “exceed” “a first threshold”. The second criterion requires the “number of occurrences” of a word to fall below a “second threshold” (also called 
If a word or phrase satisfies any of said criteria, it is subsequently stored in said “database” along with the corresponding “speech utterance” associated with that word or phrase.
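
For illustration only, the "any of said criteria" test and the subsequent storing step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `TrainingDatabase`, `satisfies_any_criterion`, `mistake_counts`, and `occurrence_counts` are hypothetical, the counts are assumed to have been computed elsewhere, and the third criterion (the n-gram distribution) is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingDatabase:
    """Hypothetical stand-in for the claimed ASR system training database."""
    records: list = field(default_factory=list)

    def store(self, text: str, audio_file: str) -> None:
        # Associate the audio file of the spoken utterance with the
        # textual content that prompted it.
        self.records.append({"text": text, "audio_file": audio_file})


def satisfies_any_criterion(text, mistake_counts, occurrence_counts,
                            first_threshold, second_threshold):
    """Return True if the text meets the first or the second criterion."""
    # First criterion: transcription mistakes exceed the first threshold.
    too_many_mistakes = mistake_counts.get(text, 0) > first_threshold
    # Second criterion: fewer occurrences than the second threshold.
    underrepresented = occurrence_counts.get(text, 0) < second_threshold
    return too_many_mistakes or underrepresented


# Made-up counts, for illustration only.
mistake_counts = {"schedule": 5, "hello": 0}
occurrence_counts = {"schedule": 40, "hello": 50, "quinoa": 2}

db = TrainingDatabase()
for text in ("schedule", "hello", "quinoa"):
    if satisfies_any_criterion(text, mistake_counts, occurrence_counts,
                               first_threshold=3, second_threshold=3):
        # The audio file name is a placeholder for a recorded utterance.
        db.store(text, f"{text}.wav")
```

With these made-up counts, "schedule" qualifies under the first criterion and "quinoa" under the second, while "hello" satisfies neither and is not stored.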
Kahn et al. (US 2006/0149558) teach in ¶ 0281, sentence 2: “training session file” (ASR system training database) “include” “audio files (one file per utterance)” (audio file) “and verbatim text corresponding to each audio file” (and its selected associated textual content). The file is populated as follows: “If there is a misrecognition” “the disclosure further teaches that correction may be saved and submitted with training session file” (¶ 0088); “A training session file 310 (FIG. 3) may be generated that includes audio-aligned text 350. The user may embed audio 538 or compress embedded audio 539 in the file” (¶ 0248, last sentence). These passages teach a “training session file”, which maps to the claims’ “corpus” or “database”, populated by words that were initially “misrecogni[zed]” (words with transcription mistakes) but subsequently corrected, along with their associated audio files.
Kahn et al., however, are silent on using any criterion not based on “misrecognition” in deciding whether to store a word in the database. Note that the misrecognition criterion is particularly useful for storing words whose transcription mistakes are attributable to the ASR system not properly decoding a pronunciation.

Although Ganong III teaches two of the criteria above, it always “adds” (stores) any “idiom” (underrepresented word) that occurs even once, or whose occurrence count is lower than a threshold of one word occurrence.
The instant application, by contrast, selects words for the database based on frequency of occurrence using thresholds greater than one. In this way the database is filled more selectively and its storage is utilized more efficiently.
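
The role of a second threshold greater than one can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the claimed implementation: the function name `select_underrepresented` and the toy corpus are hypothetical, and textual content is simplified to single words.

```python
from collections import Counter


def select_underrepresented(texts, second_threshold):
    """Return, sorted, the texts occurring fewer than second_threshold times."""
    # The amended claims require the second threshold to be greater than one.
    if second_threshold <= 1:
        raise ValueError("second_threshold must be greater than one")
    counts = Counter(texts)
    return sorted(t for t, n in counts.items() if n < second_threshold)


# Toy corpus: "yes" occurs four times, "no" twice, "maybe" once.
corpus = ["yes", "yes", "yes", "yes", "no", "no", "maybe"]

print(select_underrepresented(corpus, second_threshold=3))  # ['maybe', 'no']
print(select_underrepresented(corpus, second_threshold=2))  # ['maybe']
```

Raising or lowering the threshold changes how selectively the database is filled, which is the distinction drawn above against a policy that reacts to a single occurrence.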
Further search did not produce any reference teaching these criteria, and these claims are therefore allowable. Claims 2-4 and 6-7 (dependent on claim 1), 10-12 (dependent on claim 8), and 15-20 (dependent on claim 13) further limit their allowed independent parent claims and are thus allowable under similar rationale.

Conclusion
 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARZAD KAZEMINEZHAD whose telephone number is (571)270-5860.  The examiner can normally be reached on 10:30 am to 11:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DANIEL C WASHBURN, can be reached on (571)272-5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status 






/Farzad Kazeminezhad/
Art Unit 2657
March 11th 2021.