DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 16-18, 20-25, 27-28 and 30-35 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by McGraw et al. (US 2013/0110492).

Claim 16,
McGraw teaches a method to transcribe communications, the method comprising: obtaining a plurality of hypothesis transcriptions of audio data ([Fig. 1] [0025] ASR 108 receives audio signal; the incremental recognizer buffer 118 keeps track of top incremental speech recognition hypotheses 124); 
determining a plurality of consistent words that are included in two or more of the plurality of hypothesis transcriptions ([Fig. 1] [0025-0026] the stability evaluator 120 then incrementally identifies segments of the top incremental speech recognition hypotheses and determines the stability of each segment (a segment refers to a word and stability of a word refers to a particular word that is persisted in the top hypotheses 124 without changing over time)); 
in response to determining the plurality of consistent words, directing the plurality of consistent words to a device for presentation of the plurality of consistent words; and presenting the plurality of consistent words in a rolling fashion, a pace of the presentation of the plurality of consistent words in the rolling fashion being variable ([Fig. 1] [0027-0028] incrementally displaying only the segments within the top hypotheses that the stability evaluator has determined to be stable; stable segments from output 122 are communicated to the user device as transcriptions 130; the transcriptions are sent to the user device at pre-determined time intervals (1, 10, 20 ms, etc.) as stable segments within the top incremental speech recognition hypotheses are identified by the stable segment generation system of the ASR).
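For illustration only, the "consistent words" determination mapped above can be sketched in Python. This is a simplified, hypothetical model and not code from McGraw or from the application: each hypothesis is assumed to be a list of words, and a word is treated as consistent when at least `min_count` hypotheses agree on it at the same position. The function name and the `min_count` parameter are assumptions introduced for this sketch.

```python
from typing import Dict, List


def consistent_words(hypotheses: List[List[str]], min_count: int = 2) -> List[str]:
    """Return the leading words on which at least `min_count` hypotheses agree.

    Hypothetical simplification: agreement is checked position by position,
    and the scan stops at the first position without enough agreement.
    """
    stable: List[str] = []
    longest = max((len(h) for h in hypotheses), default=0)
    for i in range(longest):
        # Count how many hypotheses propose each word at position i.
        counts: Dict[str, int] = {}
        for h in hypotheses:
            if i < len(h):
                counts[h[i]] = counts.get(h[i], 0) + 1
        word, count = max(counts.items(), key=lambda kv: kv[1])
        if count < min_count:
            break
        stable.append(word)
    return stable


hypotheses = [
    ["call"],
    ["call", "mom"],
    ["call", "mom", "at"],
    ["call", "mom", "and", "dad"],
]
print(consistent_words(hypotheses))  # ['call', 'mom']
```

In this sketch only "call" and "mom" would be directed to the device, because the hypotheses still disagree on the third word.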

Claim 17,
McGraw further teaches the method of claim 16, further comprising: determining an update word in a final transcription of the audio data that is different from any of the plurality of consistent words; and in response to determining the update word, directing an indication of the update word to the device, the update word replacing one or more of the plurality of consistent words in the presentation of the plurality of consistent words ([Figs. 2A-B] [0027-0028] [0033-0034] [0042] transcription 130 (final transcription); the user interface 104a, b incrementally displays only the segments within the top hypotheses 124 that the stability evaluator 120 has determined to be stable; see the example of Figs. 2A-B; unstable words are updated until the stability evaluator considers them stable based on a threshold; modifying the identified word to another word).
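For illustration only, the "update word" limitation can be sketched as a position-wise comparison between the words already presented and the final transcription. This is a hypothetical simplification, not code from McGraw or from the application: it assumes the two word lists are aligned by position, and the function name and the tuple layout are assumptions introduced for this sketch.

```python
from typing import List, Tuple


def update_indications(presented: List[str], final: List[str]) -> List[Tuple[int, str, str]]:
    """Return (position, presented_word, update_word) for each presented word
    that the final transcription replaces.

    Assumes (hypothetically) that both lists are aligned by position.
    """
    updates: List[Tuple[int, str, str]] = []
    for i, (old, new) in enumerate(zip(presented, final)):
        if old != new:
            updates.append((i, old, new))
    return updates


presented = ["eye", "scream", "is"]
final = ["ice", "cream", "is", "good"]
print(update_indications(presented, final))
# [(0, 'eye', 'ice'), (1, 'scream', 'cream')]
```

Each tuple stands in for an indication sent to the device so that the displayed word at that position can be replaced.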

Claim 18,
McGraw further teaches the method of claim 16, wherein the presentation of the plurality of consistent words by the device is configured to occur before a final transcription of the audio data is provided to the device ([0006] [0027-0028] output partial incremental speech recognition hypotheses that each represent an incremental speech recognizer's top incremental speech recognition hypothesis at a different point in time; the transcriptions 130 are sent to the user device 106; the user interface 104a, b incrementally displays only the segments within the top hypotheses 124 that the stability evaluator 120 has determined to be stable).

Claim 20,
At least one non-transitory computer-readable media configured to store one or more instructions that when executed by at least one processor cause or direct a system to perform the method of claim 16. (Claim 20 contains subject matter similar to claim 16, and thus is rejected under similar rationale)

Claim 21,
A method to transcribe communications, the method comprising: obtaining a plurality of hypothesis transcriptions of audio data; determining one or more consistent words that are included in two or more of the plurality of hypothesis transcriptions; in response to determining the one or more consistent words, directing the one or more consistent words to a device for presentation of the one or more consistent words; determining an update word in a final transcription of the audio data that is different from any of the one or more consistent words; and in response to determining the update word, directing an indication of the update word to the device, the update word replacing the one or more consistent words in the presentation of the one or more consistent words. (Claim 21 contains subject matter similar to claims 16 and 17, and thus is rejected under similar rationale)

Claim 22,
McGraw further teaches the method of claim 21, further comprising obtaining the audio data during a communication session between a second device ([Fig. 1] user device) and the device ([Fig. 1] ASR 108), the audio data originating at the second device ([Fig. 1] user device).

Claim 23,
The method of claim 21, wherein the presentation of the one or more consistent words by the device is configured to occur before the final transcription of the audio data is provided to the device. (Claim 23 contains subject matter similar to claim 18, and thus is rejected under similar rationale)

Claim 24,
The method of claim 21, wherein the one or more consistent words are presented in a rolling fashion, a pace of the presentation of the one or more consistent words in the rolling fashion being variable. (Claim 24 contains subject matter similar to claim 16, and thus is rejected under similar rationale)

Claim 25,
McGraw further teaches the method of claim 21, wherein the plurality of hypothesis transcriptions are obtained sequentially over time ([0025] the incremental recognizer buffer 118 keeps track of top incremental speech recognition hypotheses 124 as they become available from the recognizer 116 over time) and determining the one or more consistent words includes comparing a first hypothesis transcription of the plurality of hypothesis transcriptions with a second hypothesis transcription of the plurality of hypothesis transcriptions ([Figs. 1 and 2A] [0026] the stability evaluator 120 can use a timer 128 to measure how long a particular word has persisted (i.e. persistence or age) in the top hypothesis, and a stability metric can then be assigned based on this measurement), the second hypothesis transcription directly following the first hypothesis transcription among the plurality of hypothesis transcriptions ([Fig. 2A] plurality of hypotheses over different time intervals).
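For illustration only, the comparison of each hypothesis with the one directly preceding it can be sketched as follows. This is a hypothetical simplification, not code from McGraw: the cited stability evaluator 120 measures persistence with a timer 128, whereas this sketch uses the number of consecutive hypotheses in which a word is unchanged as a stand-in for age. The function name is an assumption introduced for this sketch.

```python
from typing import List, Tuple


def word_ages(hypotheses: List[List[str]]) -> List[Tuple[str, int]]:
    """For the latest hypothesis, return each word with the number of
    consecutive hypotheses in which it has persisted at the same position.

    Each hypothesis is compared only with the one directly preceding it.
    """
    ages: List[int] = []
    prev: List[str] = []
    for hyp in hypotheses:
        new_ages: List[int] = []
        for i, word in enumerate(hyp):
            if i < len(prev) and prev[i] == word:
                new_ages.append(ages[i] + 1)  # word persisted unchanged
            else:
                new_ages.append(1)            # new or changed word
        ages, prev = new_ages, hyp
    return list(zip(prev, ages))


hypotheses = [
    ["eye"],
    ["eye", "scream"],
    ["ice", "cream", "is"],
    ["ice", "cream", "is", "good"],
]
print(word_ages(hypotheses))
# [('ice', 2), ('cream', 2), ('is', 2), ('good', 1)]
```

A stability threshold (for example, an age of at least 2) could then be applied to decide which words are treated as consistent; the threshold value is likewise an assumption.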

Claim 27,
At least one non-transitory computer-readable media configured to store one or more instructions that when executed by at least one processor cause or direct a system to perform the method of claim 21. (Claim 27 contains subject matter similar to claim 21, and thus is rejected under similar rationale)

Claim 28,
A method to transcribe communications, the method comprising: obtaining a plurality of hypothesis transcriptions of audio data, each of the plurality of hypothesis transcriptions including one or more words determined to be a transcription of portions of the audio data; determining one or more consistent words that are included in two or more of the plurality of hypothesis transcriptions, each of the two or more of the plurality of hypothesis transcriptions including words from a first portion of the audio data; and in response to determining the one or more consistent words, providing the one or more consistent words to a device for presentation of the one or more consistent words by the device, the presentation of the one or more consistent words configured to occur before a final transcription of the audio data is provided to the device. (Claim 28 contains subject matter similar to claims 16 and 18, and thus is rejected under similar rationale)

Claim 30,
McGraw further teaches the method of claim 28, wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a single speech engine configured to recognize speech ([Fig. 1] recognizer 116).

Claim 31,
McGraw further teaches the method of claim 28, wherein a first portion of the audio data associated with a first one of the plurality of hypothesis transcriptions includes all of the audio data associated with at least one of the plurality of hypothesis transcriptions obtained previous to obtaining the first one of the plurality of hypothesis transcriptions ([Fig. 2A] [0032-0033] plurality of transcriptions of the speech input at different time intervals; the current transcription of the speech input includes the previous hypothesized transcription in order to determine the stability of the segment).

Claim 32,
The method of claim 28, wherein the plurality of hypothesis transcriptions are obtained sequentially over time and the determining the one or more consistent words includes comparing a first hypothesis transcription of the plurality of hypothesis transcriptions with a second hypothesis transcription of the plurality of hypothesis transcriptions, the second hypothesis transcription directly following the first hypothesis transcription among the plurality of hypothesis transcriptions. (Claim 32 contains subject matter similar to claim 25, and thus is rejected under similar rationale)

Claim 33,
The method of claim 28, wherein the presentation of the one or more consistent words occurs in a rolling fashion, a pace of the presentation of the one or more consistent words in the rolling fashion being variable. (Claim 33 contains subject matter similar to claim 16, and thus is rejected under similar rationale)

Claim 34,
The method of claim 28, further comprising: determining an update word in the final transcription that is different from any of the one or more consistent words; and in response to determining the update word, directing an indication of the update word to the device, the update word replacing the one or more consistent words in the presentation of the one or more consistent words. (Claim 34 contains subject matter similar to claim 17, and thus is rejected under similar rationale)

Claim 35,
At least one non-transitory computer-readable media configured to store one or more instructions that when executed by at least one processor cause or direct a system to perform the method of claim 28. (Claim 35 contains subject matter similar to claim 28, and thus is rejected under similar rationale)

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 19, 26 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over McGraw et al. (US 2013/0110492) in view of Zhou et al. (US 2018/0096678).

Claim 19,
McGraw teaches all the limitations in claim 16. The difference between the prior art and the claimed invention is that McGraw does not explicitly teach wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a plurality of speech engines configured to recognize speech.
Zhou teaches wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a plurality of speech engines configured to recognize speech ([Fig. 1] general-purpose and domain-specific speech engines; a first plurality of candidate speech recognition results using a first general-purpose speech recognition engine; a second plurality of the candidate speech recognition results using at least one domain-specific speech recognition engine).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the enhanced stability prediction for incrementally generated speech recognition hypotheses taught by McGraw so that the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a plurality of speech engines configured to recognize speech, as taught by Zhou, for the benefit of improving the operation of speech recognition systems that utilize multiple speech recognition engines (Zhou [0001]).
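For illustration only, the proposed combination can be sketched as hypotheses gathered from more than one speech engine and then checked for agreement. This is a hypothetical sketch; neither McGraw nor Zhou discloses this code. The `Engine` interface, the two stub engines, and their fixed outputs are assumptions introduced here and stand in for a general-purpose engine and a domain-specific engine.

```python
from typing import Callable, List, Sequence

# Hypothetical engine interface: audio bytes in, list of words out.
Engine = Callable[[bytes], List[str]]


def collect_hypotheses(audio: bytes, engines: Sequence[Engine]) -> List[List[str]]:
    """Obtain one hypothesis transcription from each speech engine."""
    return [engine(audio) for engine in engines]


def agreed_prefix(hypotheses: List[List[str]]) -> List[str]:
    """Return the leading words on which every hypothesis agrees."""
    prefix: List[str] = []
    for words in zip(*hypotheses):
        if len(set(words)) != 1:
            break
        prefix.append(words[0])
    return prefix


def general_engine(audio: bytes) -> List[str]:
    # Stub standing in for a general-purpose engine (fixed output).
    return ["play", "jazz", "radio"]


def domain_engine(audio: bytes) -> List[str]:
    # Stub standing in for a domain-specific engine (fixed output).
    return ["play", "jazz", "playlist"]


hyps = collect_hypotheses(b"", [general_engine, domain_engine])
print(agreed_prefix(hyps))  # ['play', 'jazz']
```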

Claim 26,
The method of claim 21, wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a plurality of speech engines configured to recognize speech. (Claim 26 contains subject matter similar to claim 19, and thus is rejected under similar rationale)

Claim 29,
The method of claim 28, wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a plurality of speech engines configured to recognize speech. (Claim 29 contains subject matter similar to claim 19, and thus is rejected under similar rationale)

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
CN 109243461 teaches a voice recognition method, first-region device, device and storage medium, by obtaining the terminal device carried by the audio collecting device collects the voice signal obtained, and collecting to obtain the voice signal when the terminal device is located; using the second speech recognition model of the first speech recognition model and the preset region stored in advance corresponding to the first region of the universal speech recognition processing the voice signal, so that based on the second recognition result of the first speech recognition model of the first recognition result and the second speech recognition model, determining and outputting a recognition result of the target output. The technical solution of the embodiment of the invention can improve the accuracy of speech recognition and improve the user experience.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHREYANS A PATEL whose telephone number is (571)270-0689. The examiner can normally be reached Monday-Friday 8am-5pm PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached on 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

SHREYANS A. PATEL
Examiner
Art Unit 2657



/SHREYANS A PATEL/               Examiner, Art Unit 2656