DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
In response to the amendment filed 6/29/2022, claims 1 – 19 are pending.

	
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1 – 19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.  
Step 1: Is the claimed invention a statutory category of invention?
Claims 1, 14 and 19 are directed to a method for captioning a hearing user’s voice (Step 1, Yes).

Step 2A, Prong 1: Does the claim recite an abstract idea?
The limitations of the following steps:
… (c) identifying an HU device identifier associated with the HU device used to initiate the incoming call … (e) comparing the HU voice signal to HU voice profiles associated with the HU device identifier to identify a current HU voice profile associated with the HU voice signal; (f) selecting the voice model that is associated with the current HU voice profile as a current voice model; (g) using the current voice model to transcribe the HU voice signal to text … repeating steps (d) …  to continually identify a current HU voice model and use the current voice model to transcribe (Claim 1).
storing the caption text in a memory device without initially presenting the text captions; receiving a caption activation signal from the assisted user at a first time during the voice signal; and presenting the caption text corresponding to a period prior to the first time to the AU via an AU communication device display screen (Claim 14).
receiving the CA generated text at the AU communication device; prior to receiving the CA generated text at the AU communication device, presenting the ASR text via an AU communication device display screen; and subsequent to receiving the CA generated text at the AU communication device, presenting the CA generated text via the AU communication device display screen (Claim 19), as drafted, recite processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components.  These mental processes can be practically performed in the human mind; for instance, a human can mentally compare an HU profile (e.g., language preference, gender, accent, etc.) associated with the HU device identifier (e.g., user name, telephone number, etc.), mentally select a voice model, and continually refine a current HU voice model.  Thus, the claims recite managing personal behavior or relationships or interactions between people (including social activities, teaching, and following rules or instructions) as well as a concept performed in the human mind (including an observation, evaluation, judgment, opinion). Therefore, the claims recite a judicial exception. (STEP 2A, PRONG 1: YES).

Step 2A, Prong 2: Does the claim recite additional elements that integrate the judicial exception into a practical application? 
Per the 2019 Revised Patent Subject Matter Eligibility Guidance, if a claim as a whole integrates the recited judicial exception into a practical application of that exception, a claim is not "directed to" a judicial exception. Alternatively, a claim that does not integrate a recited judicial exception into a practical application is directed to the exception. Evaluating whether a claim integrates an abstract idea into a practical application is performed by a) identifying whether there are any additional elements recited in the claim beyond the abstract idea, and b) evaluating those additional elements individually and in combination to determine whether they integrate the abstract idea into a practical application, using one or more of the considerations laid out by the Supreme Court and the Federal Circuit. Exemplary considerations indicative that an additional element (or combination of elements) has or has not been integrated into a practical application are set forth in the 2019 PEG.
With respect to the instant claims, claims 1, 14, 19 recite the additional elements of: HU device, AU communication device, a memory device, a voice recognition database, a display screen, memory voice recognition database, a remote relay, a speaker.  It is particularly noted that the use of a computing device "as a tool" to perform an abstract method and steps that only amount to extra-solution activity are indicated in the 2019 PEG as examples that an additional element has not been integrated into a practical application.  Even in combination, the recited additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits, such as an improvement to a computing system, on practicing the abstract idea.  Although the claims recite the components identified above for performing at least some of the recited functions, these elements are recited at a high level of generality in a conventional arrangement for performing their basic computer functions (i.e., collecting, processing, outputting data). For instance, the claims identify that a nondescript "Automated Speech Recognition (ASR) engine" performs the majority of the claimed transcription functions.  Further evidence that these elements do not amount to a particular machine is provided by the Applicant's own specification, which identifies these components as a real time transcription software program (e.g., software run on a third party server in the "cloud" (e.g., Watson, Google Voice, etc.)) to obtain an initial text transcription substantially in real time (STEP 2A, Prong 2: NO).

Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?
Claims 1, 14, 19 recite the additional elements of: HU device, AU communication device, a memory device, a voice recognition database, a display screen, memory voice recognition database, a remote relay, a speaker, as set forth above for Step 2A, Prong 2.  Applicant’s specification describes these features only in a highly generic manner by stating that … a communication device (e.g., a telephone) including a keyboard for dialing phone numbers and a handset including a speaker and a microphone for communication with other devices. In other embodiments device 14 may include a computer, a smart phone, a smart tablet, etc., that can facilitate audio communications with other devices.  Devices 12 and 14 may use any of several different communication protocols including analog or digital protocols, a VOIP protocol or others (Applicant’s published application, [0118]). There is no indication in the specification that Applicant has achieved an advancement or improvement in ASR technology. Dependent claims 2 – 13 and 15 – 18 inherit the deficiencies of their respective parent claims through their dependencies and do not recite additional limitations sufficient to direct the claims to more than the claimed abstract idea, and are thus rejected for the same reasons.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 – 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kahn et al. (US 2006/0149558 A1) in view of Engelke et al. (US 2011/0170672 A1).
Re claims 1, 14:
1. Kahn teaches a method for captioning a hearing user's (HU's) voice during a call with an assisted user (AU) (Kahn, [0615] – [0616], “the audio file 416 (FIG. 4) may include speech input 412, corresponding to 301 of FIG. 3, from two or more speakers at a meeting, legal proceeding, or telephone conversation”), the method comprising the steps of: 
(a) storing a plurality of HU voice profiles and associated voice models for each of a plurality of HU device identifiers in a voice recognition database (Kahn, [0137], “one or more databases”; fig. 7, “Speech User Profile”; [0204], “An audio file 416 may also be converted by two or more instances of the same speech recognition program 450 using two or more speech user profiles or different configurable options”; [0083], “output comparison of two or more synchronized session files using different speaker models or different configurable options may also be used to rapidly determine best speaker model and settings for a particular speaker”); 
(b) subsequent to receiving an incoming call at an AU communication device (Kahn, [0152]; [0531], “a telephone call-in system”; [0616]); 
(c) identifying an HU device identifier associated with the HU device used to initiate the incoming call (Kahn, fig. 3; fig. 8, “Speaker Recognition (Identification … ”; [0232], “In select speech user 404, the process selects speaker name or other identification. Work type is selected 406 to assist the transcriptionist in formatting completed document, letter, or report. Each speaker may have different preferences”; [0297]); 
(d) receiving HU voice signal during the call (Kahn, [0152]; [0531]; [0616]);
(e) comparing the HU voice signal to HU voice profiles associated with the HU device identifier to identify a current HU voice profile associated with the HU voice signal (Kahn, [0320], “Depending upon the speakers present, a different multispeaker speech user profile 312 (FIG. 3) may be selected. This permits selection of appropriate speech user profile 312 (FIG. 3) based upon actual speakers present”; [0652], “The source acoustic data 301 may be from one or more speakers or computer sources, with data selection further determined by other factors that affect acoustic characteristics, such as language and dialect, speech user medical condition, speech rate, background noise, recording device, type of sound card or telephony card, audio postprocessing, and segmentation techniques. As pronunciation may be dependent upon vocabulary and context, different acoustic models may be based upon topic or work type”); 
(f) selecting the voice model that is associated with the current HU voice profile as a current voice model (Kahn, [0320], “Depending upon the speakers present, a different multispeaker speech user profile 312 (FIG. 3) may be selected. This permits selection of appropriate speech user profile 312 (FIG. 3) based upon actual speakers present”; [0652], “The source acoustic data 301 may be from one or more speakers or computer sources, with data selection further determined by other factors that affect acoustic characteristics, such as language and dialect, speech user medical condition, speech rate, background noise, recording device, type of sound card or telephony card, audio postprocessing, and segmentation techniques. As pronunciation may be dependent upon vocabulary and context, different acoustic models may be based upon topic or work type”); 
(g) using the current voice model to transcribe the HU voice signal to text (Kahn, [0204]; [0286]); 
(h) presenting the text on a display screen (Kahn, [0518]; [0537]); and 
(i) repeating steps (d) through (h) to continually identify a current HU voice model and use the current voice model to transcribe (Kahn, fig. 3, 311 – Model Builders (Figs. 7, 8); figs. 7 – 8; figs. 7 – 8 show a loop structure for updating the speech profile (acoustic model)). 
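
For illustration only, the sequence recited in steps (c) through (i) of claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the Applicant's specification, Kahn, or Engelke: the names VoiceProfile, select_model, and caption_call, the use of a feature vector with cosine similarity, and the 0.8 match threshold are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class VoiceProfile:
    features: list[float]  # hypothetical acoustic feature vector for one speaker
    model_name: str        # identifier of the voice model tied to this profile


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_model(database, device_id, signal_features, threshold=0.8):
    """Steps (e)-(f): compare the signal to the profiles stored for this
    device identifier and return the best match's model, or None."""
    best, best_score = None, threshold
    for profile in database.get(device_id, []):
        score = cosine(profile.features, signal_features)
        if score >= best_score:
            best, best_score = profile, score
    return best.model_name if best else None


def caption_call(database, device_id, segments, transcribe):
    """Steps (d)-(i): for each received segment, re-identify the current
    model and transcribe with it; the returned list stands in for display."""
    shown = []
    for features, audio in segments:
        model = select_model(database, device_id, features)
        shown.append(transcribe(model, audio))
    return shown
```

In this sketch the database is a dictionary keyed by device identifier (step (a)), and a caller supplies the transcribe function, so the loop itself stays independent of any particular speech recognition engine.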

Kahn does not explicitly describe (h) presenting the text on a display screen of the AU communication device.   Engelke teaches a method of operating a captioned telephone call in which an assisted user is connected by a captioned telephone device which is connected both by one line to a remote user and a second line to a relay providing captioning for a conversation.   Engelke remedies Kahn’s deficiency; specifically, Engelke teaches (h) presenting the text on a display screen of the AU communication device (Engelke, [0014]).  Therefore, in view of Engelke, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method described in Kahn by providing the text on a display screen of the AU communication device as taught by Engelke, in order to allow a hearing impaired individual to communicate via a telephone (Engelke, [0003]).

14. A method for captioning a hearing user's (HU's) voice during a call with an assisted user (AU) (Kahn, [0615] – [0616], “the audio file 416 (FIG. 4) may include speech input 412, corresponding to 301 of FIG. 3, from two or more speakers at a meeting, legal proceeding, or telephone conversation”), the method comprising the steps of: 
during a voice call between an HU communication device and an AU communication device, receiving an HU voice signal (Kahn, [0615] – [0616], “the audio file 416 (FIG. 4) may include speech input 412, corresponding to 301 of FIG. 3, from two or more speakers at a meeting, legal proceeding, or telephone conversation”); 
using an automated speech recognition (ASR) engine to generate caption text for the HU voice signal (Kahn, fig. 3, 304; [0092], “The common text segmentation module may also be used by automatic processes converting, analyzing, or interpreting text, such as text-to-speech … ”);
storing the caption text in a memory device (Kahn, fig. 3, “One more Session files 305”); and 
presenting the caption text via a communication device display screen (Kahn, fig. 3, “Output Session Files 309”).

Kahn does not explicitly disclose storing the caption text in a memory device without initially presenting the text captions; receiving a caption activation signal from the assisted user at a first time during the voice signal; and presenting the caption text corresponding to a period prior to the first time to the AU via an AU communication device display screen.   Engelke teaches a method of operating a captioned telephone call in which an assisted user is connected by a captioned telephone device which is connected both by one line to a remote user and a second line to a relay providing captioning for a conversation (Engelke, Abstract).  Engelke teaches the limitation: storing the caption text in a memory device without initially presenting the text captions (Engelke, [0045], “The transcription of the message is transmitted by the relay to the captioned telephone device and is stored as well as a text message. When the assisted user returns, the text message is stored in memory of the captioned telephone device 12 and the voice message and/or number of the calling party can be stored as well”); receiving a caption activation signal from the assisted user at a first time during the voice call (Engelke, [0046]); and presenting the caption text corresponding to a period prior to the first time to the AU via an AU communication device display screen (Engelke, [0024]; [0036]; [0046], “the assisted user returns, he or she lifts the handset of the telephone and presses a button on the captioned telephone device”).   Therefore, in view of Engelke, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method described in Kahn, by providing the text on a display screen of the AU communication device as taught by Engelke, in order to allow a hearing impaired individual to communicate via a telephone (Engelke, [0003]) and provide a call answering service with a time delay (Engelke, [0045]).
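
For illustration only, the claim 14 limitations of storing caption text without initially presenting it, and then presenting text for a period prior to the activation time, can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the Applicant's specification or from Engelke: the class name CaptionBuffer, its methods, and the default 20-second look-back window are hypothetical.

```python
from collections import deque


class CaptionBuffer:
    """Holds caption text unseen until the assisted user activates captions."""

    def __init__(self, lookback_seconds=20.0):
        self.lookback = lookback_seconds  # assumed length of the prior period
        self._pending = deque()           # (timestamp, text) not yet presented
        self.active = False
        self.displayed = []               # stands in for the display screen

    def add(self, timestamp, text):
        """Store new caption text; present it only if captions are active."""
        if self.active:
            self.displayed.append(text)
            return
        self._pending.append((timestamp, text))
        # Discard stored text that has aged out of the look-back window.
        while self._pending and timestamp - self._pending[0][0] > self.lookback:
            self._pending.popleft()

    def activate(self, now):
        """Handle the activation signal at time `now`: present the stored
        text that falls within the look-back window, then go live."""
        self.active = True
        recent = [text for ts, text in self._pending if now - ts <= self.lookback]
        self._pending.clear()
        self.displayed.extend(recent)
        return recent
```

Under these assumptions, text generated before activation is never shown unless it falls inside the window ending at the activation time, and text generated afterward is shown as it arrives.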

Re claims 2 – 6, 11:
2. The method of claim 1 further including the steps of, determining that the HU voice signal does not match any of the stored HU voice profiles, using a default voice model to generate text and training the default voice model to generate a new voice model (Kahn, [0040], “The enrollment may consist of reading from a prepared script to adapt the speaker-independent model to the speaker's speaking style, input device, and background noise”; the speaker-independent model is a default model; Kahn, [0137], “one or more databases”; fig. 7, “Speech User Profile”; [0204], “An audio file 416 may also be converted by two or more instances of the same speech recognition program 450 using two or more speech user profiles or different configurable options”; [0083], “output comparison of two or more synchronized session files using different speaker models or different configurable options may also be used to rapidly determine best speaker model and settings for a particular speaker”).

3. The method of claim 2 further including using the HU voice signal to generate a new voice profile and storing the new voice profile and the new voice model in the memory voice recognition database for subsequent use (Kahn, [0320], “Depending upon the speakers present, a different multispeaker speech user profile 312 (FIG. 3) may be selected. This permits selection of appropriate speech user profile 312 (FIG. 3) based upon actual speakers present”; [0652], “The source acoustic data 301 may be from one or more speakers or computer sources, with data selection further determined by other factors that affect acoustic characteristics, such as language and dialect, speech user medical condition, speech rate, background noise, recording device, type of sound card or telephony card, audio postprocessing, and segmentation techniques. As pronunciation may be dependent upon vocabulary and context, different acoustic models may be based upon topic or work type”; fig. 3, 311 – Model Builders (Figs. 7, 8); figs. 7 – 8). 

4. The method of claim 3 wherein the new voice model and new voice profile are stored along with the HU device identifier (Kahn, [0320], “Depending upon the speakers present, a different multispeaker speech user profile 312 (FIG. 3) may be selected. This permits selection of appropriate speech user profile 312 (FIG. 3) based upon actual speakers present”; [0652], “The source acoustic data 301 may be from one or more speakers or computer sources, with data selection further determined by other factors that affect acoustic characteristics, such as language and dialect, speech user medical condition, speech rate, background noise, recording device, type of sound card or telephony card, audio postprocessing, and segmentation techniques. As pronunciation may be dependent upon vocabulary and context, different acoustic models may be based upon topic or work type” ; fig. 3, 311 – Model Builders (Figs. 7, 8); figs. 7 – 8). 

5. The method of claim 2 further including, upon determining that the HU voice signal does not match any of the stored HU voice profiles, having a call assistant (CA) transcribe the HU voice signal to text which is presented via the display screen while the new voice model is trained (Kahn, fig. 3, “Training Session File”; “Manual”; fig. 8, “Model Builders”). 

6. The method of claim 5 further including monitoring accuracy of the new voice model during training and, once accuracy exceeds a threshold level, switching from the CA generated text to use the new voice model to generate the text that is presented via the display screen (Kahn, fig. 19 – 1, “Target Accuracy”; [0390]; [0411], “it may be convenient to assume the accuracy of computer-generated results where there is agreement by a threshold number of transcribed session files 436 of FIG. 4 concerning the transcription of a phrase or word without post-human review. In these situations, it may also be practical for the unvalidated results of matched transcription to be exported as paired text and audio 534”; [0671]; [0714], “If it is determined 1977 that target accuracy is achieved, the new acoustic model 1979 may be used”). 
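
For illustration only, the claim 6 limitation of monitoring accuracy during training and switching from CA generated text to model generated text once accuracy exceeds a threshold can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the Applicant's specification or from Kahn: the function names, the position-by-position word match used as the accuracy measure, and the 0.9 threshold are hypothetical.

```python
def word_accuracy(reference, hypothesis):
    """Fraction of reference words matched at the same position in the
    hypothesis (a crude stand-in for a real word error rate measure)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 1.0
    matches = sum(1 for r, h in zip(ref, hyp) if r == h)
    return matches / len(ref)


def choose_text_source(pairs, threshold=0.9):
    """For each (ca_text, model_text) segment, report which text would be
    presented. CA text is used until the model's accuracy on a segment,
    measured against the CA text, exceeds the threshold; after that the
    model's text is used for all later segments."""
    sources = []
    use_model = False
    for ca_text, model_text in pairs:
        sources.append("model" if use_model else "CA")
        if not use_model and word_accuracy(ca_text, model_text) > threshold:
            use_model = True
    return sources
```

The sketch treats the CA text as the reference against which the model in training is scored, which is one possible reading of the limitation and is labeled here as an assumption.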

11. The method of claim 2 wherein the step of using a default voice model includes identifying HU voice signal characteristics and selecting one of a plurality of default voice models based on the identified HU voice signal characteristics (Kahn, [0320], “Depending upon the speakers present, a different multispeaker speech user profile 312 (FIG. 3) may be selected. This permits selection of appropriate speech user profile 312 (FIG. 3) based upon actual speakers present”; [0652], “The source acoustic data 301 may be from one or more speakers or computer sources, with data selection further determined by other factors that affect acoustic characteristics, such as language and dialect, speech user medical condition, speech rate, background noise, recording device, type of sound card or telephony card, audio postprocessing, and segmentation techniques. As pronunciation may be dependent upon vocabulary and context, different acoustic models may be based upon topic or work type”). 

Re claims 7 – 8:
7. The method of claim 1 wherein the AU communication device links to a remote relay for captioning services and wherein the HU voice profiles and voice models are stored at the relay.  8. The method of claim 1 wherein the HU voice profiles and voice models are stored in the AU communication device (Kahn, [0137], “one or more databases”; fig. 7, “Speech User Profile”; [0204], “An audio file 416 may also be converted by two or more instances of the same speech recognition program 450 using two or more speech user profiles or different configurable options”; [0083], “output comparison of two or more synchronized session files using different speaker models or different configurable options may also be used to rapidly determine best speaker model and settings for a particular speaker”). 

Re claims 9 – 10:
9. The method of claim 1 wherein each HU voice model is periodically modified as additional HU voice signal is processed to generate text (Kahn, fig. 3, 311 – Model Builders (Figs. 7, 8); figs. 7 – 8; figs. 7 – 8 show a loop structure for updating the speech profile (acoustic model)). 

10. The method of claim 9 wherein a call assistant CA corrects errors in the text and the system automatically modifies an HU voice model based on CA error corrections (Kahn, [0313], “For quality control, the transcript may be reviewed in the session file editor 500 (FIG. 5) to determine accuracy, correct errors, and remove non-dictated items before further processing”; [0088]). 

Re claims 12 – 13:
Kahn does not explicitly disclose the method wherein the HU communication device identifier is a phone number / a network address.   Engelke remedies Kahn’s deficiency (Engelke, [0016]; [0034]; [0046]; [0048]).  Therefore, in view of Engelke, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method described in Kahn by providing the identifiers as taught by Engelke, since it was known in the art to connect telecommunication devices via a telephone number and an Internet Protocol (IP) address.

Re claim 15:
15. The method of claim 14 wherein the AU communication device includes a user interface that includes a caption activation feature that the AU may use to generate the caption activation signal (Engelke, [0024]; [0036]; [0046]).

Re claim 16:
Kahn does not explicitly disclose that the period prior to the first time includes a duration of 20 seconds or less.   Engelke remedies Kahn’s deficiency (Engelke, [0033]).  Therefore, in view of Engelke, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method described in Kahn by providing the duration as taught by Engelke, since Engelke states “Normally it will take anywhere from a few seconds to tens of seconds before the captioning service is set up though the relay, depending on how busy the relay is at that moment” (Engelke, [0033]).

Re claims 17 – 18:
17. The method of claim 14 further including broadcasting the HU voice signal in essentially real time to the AU via a speaker (Kahn, [0615] – [0616], “the audio file 416 (FIG. 4) may include speech input 412, corresponding to 301 of FIG. 3, from two or more speakers at a meeting, legal proceeding, or telephone conversation”; [0075]; Engelke, Abstract, “which is connected both by one line to a remote user and a second line to a relay providing captioning for a conversation”; [0040], “The voice or voices of the other people on the line are also sent to the relay for captioning”). 

18. The method of claim 17 further including, upon receiving the caption activation signal, generating caption text for the HU voice signal as the HU voice signal is received and presenting the continuing caption text via the device display as that text is generated (Kahn, fig. 3; fig. 8, “Speaker Recognition (Identification … ”; [0232], “In select speech user 404, the process selects speaker name or other identification. Work type is selected 406 to assist the transcriptionist in formatting completed document, letter, or report. Each speaker may have different preferences”; [0297]; Engelke, Abstract, “which is connected both by one line to a remote user and a second line to a relay providing captioning for a conversation”; [0040], “The voice or voices of the other people on the line are also sent to the relay for captioning”). 

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Kahn et al. (US 2006/0149558 A1) in view of Cloran et al. (US 2010/0063815 A1).
Re claim 19:
19. A method for captioning a hearing user's (HU's) voice during a call with an assisted user (AU) (Kahn, [0615] – [0616], “the audio file 416 (FIG. 4) may include speech input 412, corresponding to 301 of FIG. 3, from two or more speakers at a meeting, legal proceeding, or telephone conversation”), the method comprising the steps of: 
during a voice call between an HU communication device and an AU communication device, receiving an HU voice signal (Kahn, [0615] – [0616], “the audio file 416 (FIG. 4) may include speech input 412, corresponding to 301 of FIG. 3, from two or more speakers at a meeting, legal proceeding, or telephone conversation”); 
using an automated speech recognition (ASR) engine to generate ASR text for the HU voice signal (Kahn, fig. 3, 304; [0092], “The common text segmentation module may also be used by automatic processes converting, analyzing, or interpreting text, such as text-to-speech … ”); 
forming a link to a call assistant (CA) at a remote relay (Kahn, fig. 3, “MANUAL”; [0180], “manual transcription”); 
transmitting the HU voice signal to the CA for transcription to CA generated text (Kahn, [0073], “Again, this may be of particular assistance to the novice transcriptionist in training who can review errors in the transcribed text and quickly listen to the associated audio to learn how a speaker pronounced a particular word”; [0545]);  
receiving the CA generated text at the AU communication device; prior to receiving the CA generated text at the AU communication device, presenting the ASR text via an AU communication device display screen (Kahn, fig. 15, “Text Compare Text 1, Text 2”; [0072], “To facilitate rapid job completion and preparation of final text, the transcriptionist may use text comparison with a text file or text in one or more manually or automatically transcribed session files”; [0078]; figs. 5; [0066], “session file editor permit the transcriptionist to rapidly complete routine transcription jobs using the editor's text processor and built-in controls for audio playback using, for instance, a transcriptionist foot control”); and 
subsequent to receiving the CA generated text at the AU communication device, presenting the text via the communication device display screen (Kahn, fig. 3, “Output Session Files 309”).

Kahn does not explicitly disclose broadcasting the HU voice signal via a speaker to the AU; receiving a caption activation signal from the assisted user at a first time; and, in response to receiving the caption activation signal, presenting the CA generated text via the AU communication device display screen.

Cloran teaches an invention that relates to telephonic communications (Cloran, Abstract).  Cloran teaches the limitations: broadcasting the HU voice signal via a speaker to the AU (Cloran, Abstract, [0018], “the caller's audio is sent to that main conference, and the audio from others in the conference is sent to the caller”); receiving a caption activation signal from the assisted user at a first time (Cloran, [0033], “The system preferably offers users a continuously variable level approach to managing the fidelity of the transcription, either automatically or at the request of the user. Using this approach, the web portal allows users to view the transcript while the conference is occurring”; [0039], “requests a transcription update from web portal 580 using javascript”); in response to receiving the caption activation signal, using an automated speech recognition (ASR) engine to generate ASR text for the HU voice signal (Cloran, [0009], “this system relates to automated processing of human speech”); subsequent to receiving the CA generated text at the AU communication device, presenting the CA generated text via the AU communication device display screen (Cloran, [0032], “The automated transcription subsystem converts the analyst's speech into text that is then associated with the speech of the conference call user”; [0034], “it can be corrected in any of a variety of ways discussed herein, and the user interface is updated accordingly”; [0055]).  Therefore, in view of Cloran, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method described in Kahn by providing a caption request as taught by Cloran, in order to provide participants a user interface that is updated in substantially real time with the transcribed text from the audio stream(s) (Cloran, Abstract) and increase the fidelity level of the transcription (Cloran, [0033]).

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA  as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA /25, or PTO/AIA /26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claim 1 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 – 61 of U.S. Patent No. 10389876 (‘876) in view of Kahn et al. (US 2006/0149558 A1). 
Re Claim 1 
‘876 teaches a plurality of HU voice profiles (‘876, claims 16 – 19, “voice model”), a plurality of HU identifiers (‘876, claim 17, “each stored voice model is associated with a hearing user's device identifier in the database”) and selecting the voice model (‘876, claim 17, “the device identifier can be used to identify the voice model associated with the hearing user”).  ‘876 does not explicitly disclose (e) comparing the HU voice signal to HU voice profiles associated with the HU device identifier to identify a current HU voice profile associated with the HU voice signal; (f) selecting the voice model that is associated with the current HU voice profile as a current voice model; (g) using the current voice model to transcribe the HU voice signal to text; (h) presenting the text on a display screen of the AU communication device; and (i) repeating steps (d) through (h) to continually identify a current HU voice model and use the current voice model to transcribe.  

Kahn teaches ‘876’s deficiency (Kahn, [0320], “Depending upon the speakers present, a different multispeaker speech user profile 312 (FIG. 3) may be selected. This permits selection of appropriate speech user profile 312 (FIG. 3) based upon actual speakers present”; [0652], “The source acoustic data 301 may be from one or more speakers or computer sources, with data selection further determined by other factors that affect acoustic characteristics, such as language and dialect, speech user medical condition, speech rate, background noise, recording device, type of sound card or telephony card, audio postprocessing, and segmentation techniques. As pronunciation may be dependent upon vocabulary and context, different acoustic models may be based upon topic or work type”; [0204]; [0286]; [0518]; [0537]; fig. 3, 311 – Model Builders (Figs. 7, 8); figs. 7 – 8, which show a loop structure for updating the speech profile (acoustic model)).  
Therefore, in view of Kahn, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method described in ‘876, by providing the comparison of voice models as taught by Kahn, since Kahn states “determining the word error rate with the different models, and selecting the acoustic model for use with highest accuracy” (Kahn, [0049]; [0199]). 


Claim 1 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 – 27 of U.S. Patent No. 10542141 (‘141) in view of Kahn et al. (US 2006/0149558 A1).
Re Claim 1:
‘141 teaches a plurality of HU voice profiles (‘141, claims 1 – 4 and 8, “voice-to-text model”, “voice profile”), a plurality of HU identifiers (‘141, claim 3, “by associating the voice-to-text model with the unique identifier”), selecting the voice model (‘141, claim 8, “an identity of the HU includes identifying a voice profile associated with the HU”) and comparing voice profiles (‘141, claim 18, “comparing the HU voice signal to at least a subset of the voice recognition profiles to identify”).  ‘141 does not explicitly disclose (g) using the current voice model to transcribe the HU voice signal to text; (h) presenting the text on a display screen of the AU communication device; and (i) repeating steps (d) through (h) to continually identify a current HU voice model and use the current voice model to transcribe.  

Kahn teaches ‘141’s deficiency (Kahn, [0320], “Depending upon the speakers present, a different multispeaker speech user profile 312 (FIG. 3) may be selected. This permits selection of appropriate speech user profile 312 (FIG. 3) based upon actual speakers present”; [0652], “The source acoustic data 301 may be from one or more speakers or computer sources, with data selection further determined by other factors that affect acoustic characteristics, such as language and dialect, speech user medical condition, speech rate, background noise, recording device, type of sound card or telephony card, audio postprocessing, and segmentation techniques. As pronunciation may be dependent upon vocabulary and context, different acoustic models may be based upon topic or work type”; [0204]; [0286]; [0518]; [0537]; fig. 3, 311 – Model Builders (Figs. 7, 8); figs. 7 – 8, which show a loop structure for updating the speech profile (acoustic model)).  
Therefore, in view of Kahn, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method described in ‘141, by providing the comparison of voice models as taught by Kahn, since Kahn states “determining the word error rate with the different models, and selecting the acoustic model for use with highest accuracy” (Kahn, [0049]; [0199]). 

Claim 4 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 - 61 of U.S. Patent No. 10389876 (‘876).  Although the claims at issue are not identical, they are not patentably distinct from each other because the subject matter claimed in the instant application is fully disclosed in the more specific claims of ‘876.
‘876 teaches presenting the caption text corresponding to a period prior to the first time (‘876, claims 1 – 2, 6 - 7) and a caption activation signal (‘876, claim 5).   

Claim 4 is provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 - 20 of copending Application No. 17/321222 (‘222). Although the claims at issue are not identical, they are not patentably distinct from each other because the subject matter claimed in the instant application is fully disclosed in the more specific claims of ‘222 (See claims 1 – 20).
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim 19 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 - 61 of U.S. Patent No. 10389876 (‘876). 
The subject matter claimed in the instant application is fully disclosed in the more specific claims of ‘876 (claims 1 – 61).

Claim 19 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 – 20 of U.S. Patent Application Publication No. US 2019/0312973 A1 (‘973).
The subject matter claimed in the instant application is fully disclosed in the more specific claims of ‘973 (claims 1 – 20).

Response to Arguments
Applicant's arguments filed 6/29/2022 have been fully considered but they are not persuasive. 
Applicant argues:
In example 39, the Office discusses a method for training a neural network for facial detection in which digital facial images are received, transformed, and then used with digital non-facial images for analytical purposes … In particular, "the claim does not recite a mental process because the steps are not practically performed in the human mind" and "the claim does not recite any method of organizing human activity such as a fundamental economic concept or managing interactions between people."
The Office respectfully submits that the claimed method can be practically performed in the human mind.  As stated in the 101 rejection above, the Examiner presents a scenario in which this type of mental process can be practically performed in the human mind: for instance, a human can mentally compare an HU profile (i.e., language preference, gender, accent, … etc.) associated with the HU device identifier (i.e., user name, telephone number … etc.), mentally select a voice model to match the language preference, and continually refine a current HU voice model.    Applicant has not explained how such a scenario would require a level of complexity that is impossible for a human to perform. 

Applicant argues:
As in Example 39, those steps indicate that Applicant's claims are not directed to a judicial exception under Prong One. As with that example, Applicant's claims do not recite a mental process, because the steps of comparing the hearing user's voice signal to hearing user voice profiles associated with a hearing user device identifier to determine a current hearing user voice model and then using that current voice model to transcribe that voice signal to text are not steps that can be practically performed in the mind.
The Office respectfully disagrees.  Transcriptionists routinely performed transcription for many applications before the filing date of the instant application; for example, a TV captionist and an interpreter have been known to mentally perform transcription without using computer ASR. 

Applicant argues:
although the Office action alleges that this claim recites a mental process because "a human can mental [sic] compare HU profile associated with the HU device identifier (telephone number, language preference ... etc.)," this statement does not accurately reflect what is recited in claim 1. The claim does not recite comparing a HU profile against an HU device identifier. Instead, it recites comparing the HU voice signal against HU voice profiles. Thus, the method requires an analysis of the electrical voice signal against different voice profiles and not merely comparing some form of HU profile against, e.g., a phone number. That former analysis, like the analysis in example 39, is not one that can practically be performed in the mind.
The Office respectfully disagrees.  According to MPEP 2111 [R-5], during patent examination, the pending claims must be “given their broadest reasonable interpretation consistent with the specification.” The Federal Circuit’s en banc decision in Phillips v. AWH Corp., 415 F.3d 1303, 75 USPQ2d 1321 (Fed. Cir. 2005) expressly recognized that the USPTO employs the “broadest reasonable interpretation” standard.  Based on BRI, an HU profile (i.e., language preference, gender, accent, … etc.) is associated with the HU device identifier (i.e., user name, telephone number … etc.).   Applicant has not provided an argument that a language preference, gender of the transcription, or accent of the voice model cannot be selected by a human.  There is no indication that a person cannot mentally observe an AU user and pick a language preference for the AU user. 

Applicant argues:
The MPEP provides several examples of ways to show whether a judicial exception has been integrated into a practical application, including "an improvement in the functioning of a computer, or an improvement to other technology or technical field" and "implementing the judicial exception with, or using the judicial exception in conjunction with, a particular machine or manufacture that is integral to the claim." MPEP § 2106.04(d)(1) … In this way, the claims recite an improvement in the technical field of call transcription by providing a voice model to serve as the basis for transcription that is tied or more closely tied to the hearing user/the hearing user's voice signal, as opposed, e.g., to a one-size-fits-all approach that does not use voice models to aid in transcription or that uses a single voice model for transcription for all possible HU voice signals that are received.
The Office respectfully disagrees.   Claims 1, 14 and 19 do not explicitly recite any statutory product to perform the claimed steps (i.e., storing, identifying, receiving, comparing, ….).  Claims 1, 14 and 19 merely recite a few communication devices: AU communication device, HU device and a non-descriptive ASR engine.  Indeed, the Specification concedes that the ASR is well-known, routine, and conventional. See, e.g., Applicant’s published application, para. [0131] (“Other ways of altering or training the voice-to-text software are well known in the art and any way of training the software may be used at block 92.").  None of the listed devices are recited to perform the claimed method steps.   Hence, there is no improvement to a device that is not claimed.

Applicant argues:
regardless of whether Kahn discloses storing multiple voice profiles, none of the portions of Kahn cited in the Office action with regard to the elements of claim 1 discussed above teach or suggest device identifiers, generally, or storing the voice profiles and associated models for each of a plurality of HU device identifiers or identifying an HU device identifier associated with the HU device used to initiate the incoming call. As such, the cited art as relied upon in the Office action fails to teach or suggest all elements of claim 1. Applicant, therefore, respectfully requests reconsideration and withdrawal of the rejection of claim 1, as well as allowance of claim 1 and its respective dependent claims.
The Office respectfully disagrees.  Kahn teaches a plurality of HU device identifiers (Kahn, [0137], “one or more databases”; fig. 7, “Speech User Profile”; [0204], “An audio file 416 may also be converted by two or more instances of the same speech recognition program 450 using two or more speech user profiles or different configurable options”; [0083], “output comparison of two or more synchronized session files using different speaker models or different configurable options may also be used to rapidly determine best speaker model and settings for a particular speaker”).

Applicant argues:
regardless of whether Kahn discloses storing multiple voice profiles, none of the portions of Kahn cited in the Office action with regard to the elements of claim 1 discussed above teach or suggest device identifiers, generally, or storing the voice profiles and associated models for each of a plurality of HU device identifiers or identifying an HU device identifier associated with the HU device used to initiate the incoming call. As such, the cited art as relied upon in the Office action fails to teach or suggest all elements of claim 1. Applicant, therefore, respectfully requests reconsideration and withdrawal of the rejection of claim 1, as well as allowance of claim 1 and its respective dependent claims.
The Office respectfully disagrees.  Kahn teaches a plurality of HU device identifiers (Kahn, [0629], “SAPI Phone ID”), a plurality of voice profiles (Kahn, [0020], “the representational model may be termed a speech user profile, or speaker model, and may consist of an acoustic model, language model, lexicon, and other speaker-related data. Other types of speech and language applications may share some or all of these components of the speech user profile.”; [0137], “one or more databases”; fig. 7, “Speech User Profile”; [0083], “output comparison of two or more synchronized session files using different speaker models or different configurable options may also be used to rapidly determine best speaker model and settings for a particular speaker”) and receiving an incoming call for real time conversation (Kahn, [0126], “The telephone system”; [0152], “The speech input may be an audio file recorded from live dictation into microphone … telephone”; [0531], “a telephone call-in system, this may be accomplished by line-tap techniques that record channel conversations”; [0083], “use for real-time and server-based speech recognition”; [0616], “from two or more speakers at a meeting, legal proceeding, or telephone conversation”; [0158], “In the session file editor step 306, the one or more session files produced by automatic or manual processes may be reviewed in one or more instances of a session file editor 500 …  The person reviewing the session file may be the original speaker or another party”)).  

Applicant argues:
The process referred to in Engelke occurs because the assisted user did not answer the voice call but instead had the caller leave a message. Thus, Engelke does not teach or suggest receiving a caption activation signal from the assisted user at a first time during the voice call, as recited in amended claim 14, and it would not have been obvious to modify Kahn in view of Engelke to include this highlighted element, as doing so would improperly change the principle of operation of the call answering service disclosure relied upon in Engelke. See MPEP § 2143.01 (VI) ("proposed modification cannot change the principle of operation of a reference").
The Office respectfully disagrees.  It is unclear how Engelke’672 changes the principle of operation of Kahn, since both inventions are related to a method of operating a captioned telephone call.  Engelke’672 also teaches “a captioned telephone device which is connected both by one line to a remote user and a second line to a relay providing captioning for a conversation” (Engelke’672, Abstract).  Engelke’672 teaches the steps performed when the assisted user initiates a call by dialing a telephone number on the first telephone line (Engelke’672, [0006]).  The test for obviousness is not whether the features of a secondary reference may be bodily incorporated into the structure of the primary reference; nor is it that the claimed invention must be expressly suggested in any one or all of the references.  Rather, the test is what the combined teachings of the references would have suggested to those of ordinary skill in the art.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981).  

Applicant argues:
First, the Office action cites FIG. 15 of Khan, but that figure describes "an exemplary embodiment of resegmenting an asynchronized text from a session file or text file to create an identical number of segments with audio-tagged text from the reference session file." Khan at, [0117]. Nothing in that description relates to the elements of claim 19 discussed above. Second, the Office action cites paragraph [0072], but that paragraph discusses a transcriptionist comparing their transcribed text against another manually or automatically transcribed file. Nothing in that paragraph discusses what text is displayed on the assisted user's device and when. Third, the Office action cites FIG. 5 and related paragraph [0066] of Khan, but that figure is a flow diagram of a session file editor process and the related paragraph discuss how the transcriptionist may complete transcription jobs. Again, they do not teach or suggest the timing and type of text presented on the assisted user's display screen recited in claim 19. Finally, the Office action cites the "output session file 309" element of FIG. 3 of Khan, but again, that only discloses the outputting of a file. It, too, does not teach or suggest the timing and type of text presented on the assisted user's display screen recited in claim 19. For at least the foregoing reasons, the cited art as relied upon by the Office does not teach or suggest at least these elements of claim 19. 
The Office respectfully disagrees.  Kahn teaches a transcription system that allows a speaker to review the transcription (Kahn, [0158], “In the session file editor step 306, the one or more session files produced by automatic or manual processes may be reviewed in one or more instances of a session file editor 500 (as will be described in association with FIGS. 5, 5A, 5B, 5C). In each case, the one or more transcribed session files may be opened in one or more of the buffered read/write windows of the session file editor 500 (FIG. 5). The person reviewing the session file may be the original speaker or another party”); manual processing can further modify one or more previously created session files (Kahn, [0159], “transformations in one or more instances of a session file editor 500 (FIG. 5) before postprocessing 307, and may involve both manual processing and automatic subroutines that modify one or more previously created session files”).  
Cloran also teaches the limitation: transmitting the HU voice signal to the CA for transcription to CA generated text; receiving the CA generated text at the AU communication device; prior to receiving the CA generated text at the AU communication device, presenting the ASR text via an AU communication device display screen; and subsequent to receiving the CA generated text at the AU communication device, presenting the CA generated text via the AU communication device display screen (Cloran, Abstract, “The system provides for participants a user interface that is updated in substantially real time with the transcribed text from the audio stream(s). A single audio line can be used for simple transcription, and multiple audio lines are used to provide a real-time transcript of a conference call, deposition, or the like”; [0035], “Newly transcribed passages and newly issued corrections are provided as updates to the user interface in substantially real time”).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACK YIP whose telephone number is (571)270-5048. The examiner can normally be reached Monday thru Friday; 9:00 AM - 5:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, XUAN THAI can be reached on (571) 272-7147. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/JACK YIP/Primary Examiner, Art Unit 3715