DETAILED ACTION

Introduction

1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. A response was filed in this application on 08/10/2022 after the non-final rejection of 05/12/2022. Claims 11 and 20 have been amended in this submission, no new claims have been added, and claim 17 has been cancelled. Thus claims 1-16 and 18-20 are currently pending for reconsideration by the Examiner and are examined below.

Response to amendments

2.	With regards to claims 11-20, the Applicants have acknowledged the allowable subject matter (claim 17) indicated in the last office action and have accordingly amended independent claims 11 and 20 to include said allowable subject matter. The Applicants have also cancelled claim 17. The rejection of claims 11-16 and 18-20 is therefore withdrawn.

Response to arguments

3.	With regards to the prior art rejection of claims 1-10, the Applicants' arguments have been fully considered but are unpersuasive for at least the following reasons.

Regarding independent claim 1, the Applicants argue that nothing in columns 11 and 12 of the Examiner-cited prior art Talieh (U.S. Patent # 11315569 B1) appears to discuss anything regarding a transcription of a second communication session generated by a second transcription generation technique in response to an indication from a first device. According to the Applicants, everything in columns 11 and 12 of Talieh appears to disclose concepts regarding a single transcription. As such, the Applicants contend that Talieh fails to teach or suggest "directing a transcription of a second communication session to a second device for presentation to the user, the transcription being generated by the second transcription generation technique in response to the indication from the first device," as recited by claim 1.

The Examiner respectfully disagrees. Talieh teaches, in columns 11-12, an updated, user-corrected (second) transcript that can then be either stored or displayed to all the speakers of the meeting (via a second communication session). Although the Applicants are allowed to be their own lexicographers, the Examiner is required to give the claim language its broadest reasonable interpretation. Under that interpretation, the second transcription is the corrected and updated transcript, which is then displayed to all the participants of a meeting (which is the second communication session). The Applicants have not responded to these specific findings and interpretations of the Examiner; hence the metes and bounds of the instant claim are met. The Applicants have not presented any other arguments with regards to the dependent claims, and therefore they too are deemed addressed by the discussion so far.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

4.	Claims 1-2, 4-6 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Endo (U.S. Patent # 7228275 B1) in view of Talieh (U.S. Patent # 11315569 B1).

With regards to claim 1, Endo teaches a method to transcribe communications, the method comprising obtaining a performance of a first transcription generation technique with respect to generating transcriptions of audio of a first communication session associated with a user (Col. 2, lines 28-67 and figure 2, teach a speech recognition system that recognizes an input speech signal by using a plurality of speech recognizers each outputting recognized speech texts and associated confidence scores);

obtaining a performance of a second transcription generation technique with respect to generating transcriptions of the audio of the first communication session (Col. 2, lines 28-67 and figure 2, teach a speech recognition system that recognizes an input speech signal by using a plurality of speech recognizers each outputting recognized speech texts and associated confidence scores);

determining a report based on the performance of the first transcription generation technique and the performance of the second transcription generation technique (Col. 2, lines 28-67 and figure 2, further teach that the recognized speech texts and their confidence scores are sent to a decision module which generates a report of which confidence score is higher);

directing the report to a first device associated with the user (Col. 2, lines 28-67 along with figure 2 and Col. 11, lines 45-67, further teach that the speech recognition text with a higher raw or adjusted confidence score is selected and sent as an input to a user device such as a car navigation system or home appliance);

However, Endo may not explicitly detail, in response to the report, obtaining an indication from the first device. This is taught by Talieh (Columns 11-12, teach that multiple speaker specific transcripts can be merged to create a meeting transcript. A user can then correct any errors in this meeting transcript);

Talieh also teaches directing a transcription of a second communication session to a second device for presentation to the user, the transcription being generated by the second transcription generation technique in response to the indication from the first device (Columns 11-12, teach also that once the user has corrected the meeting transcript, this updated transcript can then be either stored or displayed to all the speakers of the meeting).

Endo and Talieh can be considered as analogous art as they belong to a similar field of endeavor in speech transcription services. It would thus have been obvious to one having ordinary skill in the art to advantageously combine the teachings of Talieh (User directed display of a transcription to a particular device) with those of Endo (Use of a transcription comparator and decision module to pick best transcription) so as to provide high accuracy transcription in case of multiple speakers (Talieh, col. 1). 

With regards to claim 2, Endo teaches the method of claim 1, wherein the performance of the second transcription generation technique is based on one or more of the following transcription accuracy, transcription latency, and number of transcription corrections (Col. 2, lines 28-67 and figure 2, further teach that the first speech recognizer recognizes the input speech signal and generates a first speech text and a first confidence score indicating the level of accuracy of the first speech text. Likewise, the second speech recognizer also recognizes the input speech signal and generates a second speech text and a second confidence score indicating the level of accuracy of the second speech text. The decision module selects either the first speech text or the second speech text as the output speech text depending upon which of the first and second confidence scores is higher).
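
For illustration only, the selection performed by the decision module as characterized above can be sketched in Python as follows. The names (RecognitionResult, select_output), the sample texts, the scores, and the tie-breaking behavior are assumptions made for this sketch and are not taken from Endo.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RecognitionResult:
    recognizer: str    # hypothetical label identifying the recognizer
    text: str          # recognized speech text
    confidence: float  # confidence score reported for that text


def select_output(results: Sequence[RecognitionResult]) -> RecognitionResult:
    """Return the result whose confidence score is highest."""
    if not results:
        raise ValueError("at least one recognition result is required")
    # max() keeps the first maximal element, so a tie goes to the
    # earlier recognizer; this tie rule is an assumption of the sketch.
    return max(results, key=lambda r: r.confidence)


# Hypothetical outputs of two recognizers for the same input speech.
first = RecognitionResult("first", "call home", 0.82)
second = RecognitionResult("second", "call Rome", 0.64)

chosen = select_output([first, second])
print(chosen.recognizer, chosen.text)
```

The sketch returns the whole result rather than only the text, so that the selected recognizer and its score remain available to whatever consumes the output.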

With regards to claim 4, Endo teaches the method of claim 1, wherein the report includes a recommendation for the second transcription generation technique and the indication includes a selection of the second transcription generation technique (Col. 2, lines 28-67 and figure 2, further teach that the first speech recognizer recognizes the input speech signal and generates a first speech text and a first confidence score indicating the level of accuracy of the first speech text. Likewise, the second speech recognizer also recognizes the input speech signal and generates a second speech text and a second confidence score indicating the level of accuracy of the second speech text. The decision module selects either the first speech text or the second speech text as the output speech text depending upon which of the first and second confidence scores is higher).

With regards to claim 5, Endo may not explicitly detail the limitation wherein the first device and the second device are the same device. However, Talieh teaches this (Claim 14, teaches providing at the first client device a graphical user interface for user selection of one or more of the speaker-specific transcripts, receiving a user selection of the first speaker-specific transcript and displaying the first speaker-specific transcript at the first client device).

Endo and Talieh can be considered as analogous art as they belong to a similar field of endeavor in speech transcription services. It would thus have been obvious to one having ordinary skill in the art to advantageously combine the teachings of Talieh (User directed display of a transcription to a particular device) with those of Endo (Use of a transcription comparator and decision module to pick best transcription) so as to provide high accuracy transcription in case of multiple speakers (Talieh, col. 1). 

With regards to claim 6, Endo teaches the method of claim 1, further comprising before determining the report, directing a second transcription of the first communication session, the second transcription generated by the first transcription generation technique (Col. 8, lines 50-67 and figure 4, teach an example of the speech recognition system of the present invention attempting to recognize the input speech "Ten University Avenue, Palo Alto" using two grammar-based speech recognizers and a statistical speech recognizer. The first grammar-based speech recognizer may recognize the input speech as "Ten University Avenue, Palo Alto" with a confidence score of 66. The second grammar-based speech recognizer may recognize the input speech as "Ten University Avenue, Palo Cedro" with a confidence score of 61. The statistical speech recognizer may recognize the input speech as "When University Avenue, Palo Alto" with a confidence score of 60. The speech recognition system will select the speech recognition result "Ten University Avenue Palo Alto" from the first grammar-based speech recognizer, since it has the highest confidence score);

However, Endo may not explicitly detail that the first communication session involves the second device or that the second transcription is directed to the second device. Talieh teaches this (Columns 11-12, teach that multiple speaker specific transcripts can be merged to create a meeting transcript. A user can then correct any errors in this meeting transcript. Once the user has corrected the meeting transcript, this updated transcript can then be either stored or displayed to all the speakers of the meeting. Claim 14, teaches providing at the first client device a graphical user interface for user selection of one or more of the speaker-specific transcripts, receiving a user selection of the first speaker-specific transcript and displaying the first speaker-specific transcript at the first client device).

Endo and Talieh can be considered as analogous art as they belong to a similar field of endeavor in speech transcription services. It would thus have been obvious to one having ordinary skill in the art to advantageously combine the teachings of Talieh (User directed display of a transcription to a particular device) with those of Endo (Use of a transcription comparator and decision module to pick best transcription) so as to provide high accuracy transcription in case of multiple speakers (Talieh, col. 1). 

With regards to claim 10, this is a computer readable medium (CRM) claim for the corresponding method claim 1. These two claims are related as method and CRM of using the same, with each claimed CRM element's function corresponding to the claimed method step. Accordingly, claim 10 is similarly rejected under the same rationale as applied above with respect to method claim 1.

5.	Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Endo in view of Talieh and further in view of Engelke (U.S. Patent Application Publication # 2001/0005825 A1).

With regards to claim 9, Endo and Talieh may not explicitly detail the limitation wherein one of the first transcription generation technique and the second transcription generation technique includes a revoicing of audio before transcription generation. However, Engelke teaches this (Para 24, teaches revoicing before transcription). 

Endo, Talieh and Engelke can be considered as analogous art as they belong to a similar field of endeavor in speech transcription services. It would thus have been obvious to one having ordinary skill in the art to advantageously combine the teachings of Engelke (Use of revoicing before transcription) with those of Endo and Talieh (Use of transcription services) so as to provide speech recognition programs that don’t need to be trained to a particular speaker and thus can handle direct translation of speech from a variety of users (Engelke, para 4). 

Allowable Subject Matter

6.	Claims 3 and 7-8 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The prior art of record, alone or in combination, does not currently teach or suggest the invention as outlined in these claims. The Examiner shall outline more detailed reasons for allowance if and when the application is allowed. Claims 11-16 and 18-20 are allowable. The prior art of record, alone or in combination, does not currently teach or suggest the invention as outlined in these claims. The Examiner shall outline more detailed reasons for allowance if and when the application is allowed.

Conclusion

7.	THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  The following prior art, made of record but not relied upon, is considered pertinent to applicant's disclosure: Chen (U.S. Patent Application Publication # 2020/0394258 A1), Ha (U.S. Patent # 8682672 B1). These references are also included in the PTO-892 form attached with this office action.

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. If you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). In case you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NEERAJ SHARMA, whose contact information is given below. The examiner can normally be reached Monday to Friday, 8 am to 5 pm. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Louis-Desir, can be reached at 571-272-7799 (Direct Phone). The fax number for the organization where this application or proceeding is assigned is 571-273-8300.


/NEERAJ SHARMA/
Primary Examiner, Art Unit 2659
571-270-5487 (Direct Phone)
571-270-6487 (Direct Fax)
neeraj.sharma@uspto.gov (Direct Email)