DETAILED ACTION
1.	This communication is in response to the Amendments and Arguments filed on 6/23/2022. Claims 58-77 are pending and have been examined. Claims 1-57 are cancelled. 
Response to Amendments and Arguments
2.	(1) The applicant has not corrected the typos in Claims 71 and 72 noted in the previous Office action: Claim 71 is dependent on claim 71 and Claim 72 is dependent on claim 72. (2) The applicant has not corrected Claims 63, 70, 75 for “the transformed conversation,” which lacks antecedent basis and is subject to rejection under 35 U.S.C. 112(b), as noted in the previous Office action. (3) The applicant has chosen to postpone addressing the double patenting rejections until later.
Applicant's arguments with respect to claim rejections under 35 USC 103 have been fully considered, but they are not persuasive. In particular, the applicant argues that the references do not teach “automatically segmenting the live audio-form conversation .. when a speaker change occurs .. such that each segment .. is spoken by only one speaker ..” In response, the examiner respectfully disagrees. Note that PALAKODETY teaches: [Abstract] “determine whether there is a change of one speaker to another speaker within an audio sequence ..” and [0055-0057] “The segmentation module 340 divides the audio recording into short audio segments. For example, an audio segment may be of a length between tens and hundreds of milliseconds, depending on the desired temporal resolution .. the segmentation module 340 extracts one or more audio segments and sends to the determination module 350 to determine a speaker for each audio segment … once the speaker for every audio segment has been determined by the determination module 350, the combination module 360 combines continuous audio segments of the same speaker ..” The examiner requests that the applicant clarify the “segmentation” step, such as how the start and end of a segment are determined (for example, when a speaker change occurs), to distinguish it from PALAKODETY’s teaching. There is no description in the Specification for this critical processing step.
As for the limitation “wherein in near-real time is a time delay less than one minute,” the examiner notes that the processing time delay is a system design choice dependent on the selected processor’s computing power and what other tasks are being processed at the same time. It is not considered a patentable limitation.
Claim Rejections - 35 USC § 103
3.	Claims 58-63 are rejected under 35 U.S.C. 103 as being unpatentable over Bastide, et al. (US 20130311177; hereinafter BASTIDE) in view of Palakodety, et al. (US 20180197548; hereinafter PALAKODETY), and further in view of Bobbitt, et al. (US 20090307189; hereinafter BOBBITT).
As per claim 58, BASTIDE (Title: Automated Collaborative Annotation of Converged Web Conference Objects) discloses “A system for processing and presenting a conversation (BASTIDE, [0006], computer system; Title: .. Web Conference ..), the system comprising:
a sensor configured to, [ upon receipt of a user instruction ], capture a live audio-form conversation; a processor configured to, upon receiving the live audio-form conversation when the live audio-form conversation is being captured (BASTIDE, [0038], user interface adapter 622 (for connecting .. microphone <read on sensor> and/or other user interface device ..); [0006], receiving .. spoken input from at least one of the two people <read on conversation. Also see PALAKODETY [0008], [0013] for conversation>):
automatically transcribe, in real time or in near-real time with the live audio-form conversation (<real-time or near-real time is a system design choice dependent on the selected processor computing power and what other tasks are being processed at the same time>), the live audio-form conversation into a live synchronized text, the live synchronized text being synchronized with the live audio-form conversation (BASTIDE, [0006], transcribing .. the received spoken input into transcribed text <read on a ready mechanism for transcribing, real-time or not>. Also see PALAKODETY [0008] for automatic transcribing, and LI [0005] for synchronized text);  
automatically generate, in real time or in near-real time with the live audio-form conversation, one or more segments of the live audio-form conversation and one or more segments of the live synchronized text by at least [ automatically segmenting the live audio-form conversation and the live synchronized text when a speaker change occurs or a natural pause occurs such that each segment of the one or more segments of the live audio-form conversation is spoken by only one speaker ] and is synchronized with only one segment of the one or more segments of the live synchronized text; and
[ automatically assign, in real time or in near-real time with the live audio-form conversation, only one speaker label to each segment ] of the one or more segments of the live synchronized text, each one speaker label representing one speaker; and a presenter configured to [ present, in real time or in near-real time with the live audio-form conversation, the labeled live synchronized text and the live audio-form conversation ]; wherein in near-real time is a time delay less than one minute (BASTIDE, [0006], annotating, with the computer system, the page by displaying in the vicinity of the location of the feature the correlated text element; [0039], the Presenter displays Slide1 in the web conference. <Examiner’s Note: real-time or near-real time is a system design choice dependent on the selected processor computing speed and what other tasks are being processed at the same time>).”
BASTIDE does not expressly disclose “upon receipt of a user instruction .. automatically segmenting the live audio-form conversation and the live synchronized text when a speaker change occurs or a natural pause occurs such that each segment of the one or more segments of the live audio-form conversation is spoken by only one speaker ... automatically assign, in real time or in near-real time with the live audio-form conversation, only one speaker label to each segment ..” However, this feature is taught by PALAKODETY (Title: System and method for diarization of speech, automated generation of transcripts, and automatic information extraction).
In the same field of endeavor, PALAKODETY teaches: [0036] “the client device 170 provides a user interface (UI) .. with which the user may interact with the client device 170 to perform functions <read on receipt of a user instruction> .. For example, the client device 170 may be a device used in doctor's office for record patient's health information or history,” [0008] “automatically generate a text transcript corresponding to an audio conversation <read on live conversation>,” [0013] “in a doctor's office to automatically generate a transcript of a patient encounter and to, based on information verbally supplied in the encounter,” [Abstract] “determine whether there is a change of one speaker to another speaker within an audio sequence ..” [0055-0057] “The segmentation module 340 divides <read on automatically> the audio recording into short audio segments .. the segmentation module 340 extracts one or more audio segments and sends to the determination module 350 to determine a speaker for each audio segment … The combination module 360 combines continuous audio segments with the same identified speaker <read on automatic speaker label assignment> and. For example, once the speaker for every audio segment has been determined by the determination module 350, the combination module 360 combines continuous audio segments of the same speaker. This way, the original input audio sequence may be organized into blocks for each of which the speaker has been identified.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of PALAKODETY in the system taught by BASTIDE to provide a ready mechanism for automatic audio/text segmenting and speaker-label assigning for presentation.
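For illustration only, the segment-then-combine flow quoted above from PALAKODETY [0055-0057] can be sketched as follows. This is a minimal sketch and not the implementation of any cited reference: the names split_into_segments, combine_same_speaker, Block and toy_speaker_of, the 100 ms window and the speaker labels "A" and "B" are all hypothetical, and the speaker decision is a toy stand-in for a determination step.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Block:
    """A run of consecutive audio carrying exactly one speaker label."""
    speaker: str
    start_ms: int
    end_ms: int


def split_into_segments(total_ms: int, segment_ms: int) -> List[Tuple[int, int]]:
    """Divide [0, total_ms) into consecutive short windows (hypothetical helper)."""
    return [(s, min(s + segment_ms, total_ms)) for s in range(0, total_ms, segment_ms)]


def combine_same_speaker(
    segments: List[Tuple[int, int]],
    speaker_of: Callable[[int, int], str],
) -> List[Block]:
    """Label each short segment, then merge adjacent segments with the same label."""
    blocks: List[Block] = []
    for start, end in segments:
        speaker = speaker_of(start, end)
        if blocks and blocks[-1].speaker == speaker and blocks[-1].end_ms == start:
            # Same speaker continues: extend the current block.
            blocks[-1].end_ms = end
        else:
            # Speaker change: a new block begins here.
            blocks.append(Block(speaker, start, end))
    return blocks


def toy_speaker_of(start_ms: int, end_ms: int) -> str:
    """Toy stand-in for a speaker decision; assumes speaker A talks before 300 ms."""
    return "A" if start_ms < 300 else "B"


blocks = combine_same_speaker(split_into_segments(500, 100), toy_speaker_of)
for block in blocks:
    print(block)
```

In this sketch a block boundary falls wherever the label of one short segment differs from the label of the next, so each resulting block carries exactly one speaker label.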
BASTIDE in view of PALAKODETY does not expressly disclose “present .. the labeled live synchronized text and the live audio-form conversation ..” However, this feature is taught by BOBBITT (Title: ASYNCHRONOUS WORKFLOW PARTICIPATION WITHIN AN IMMERSIVE COLLABORATION ENVIRONMENT).
In the same field of endeavor, BOBBITT teaches: [0033] “all aspects of the virtual workspace can be captured and incorporated into the time-based record <read on presentation>. For instance, data including, but not limited to, text, audio, video, files, sources, contacts, discussions, etc. can be maintained in a chronological format within the time-based record.” Also see LI for synchronized text, and the Examiner’s Note above for real-time or near-real-time.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of BOBBITT in the system taught by BASTIDE and PALAKODETY, to present a complete recording for a collaboration environment such as an audio conference.
As per claim 59 (dependent on Claim 58), BASTIDE in view of PALAKODETY and BOBBITT further discloses “wherein the live audio-form conversation includes a human-to-human conversation in audio form (PALAKODETY, [0013], in a doctor's office to automatically generate a transcript of a patient encounter and to, based on information verbally supplied in the encounter; BASTIDE, Title: .. Web Conference; [0006], receiving .. spoken input from at least one of the two people).”
As per claim 60 (dependent on Claim 58), BASTIDE in view of PALAKODETY and BOBBITT further discloses “wherein the live human-to-human conversation includes a meeting conversation (PALAKODETY, [0013], in a doctor's office to automatically generate a transcript of a patient encounter and to, based on information verbally supplied in the encounter).”
As per claim 61 (dependent on Claim 58), BASTIDE in view of PALAKODETY and BOBBITT further discloses “wherein the live human-to-human conversation includes a phone conversation (BASTIDE, [0040], a Presenter starts a web conference using, e.g., IBM SmartCloud Meeting or any other appropriate mechanism <read on mobile phone communication>).”
As per claim 62 (dependent on Claim 58), BASTIDE in view of PALAKODETY and BOBBITT further discloses “automatically present, in real time or in near-real time with the live audio-form conversation, the speaker-assigned segmented live synchronized text and the corresponding segmented live audio-form conversation (BOBBITT, [0033], all aspects of the virtual workspace can be captured and incorporated into the time-based record <read on a ready mechanism for presentation>. For instance, data including, but not limited to, text, audio, video, files, sources, contacts, discussions, etc. can be maintained in a chronological format within the time-based record; PALAKODETY, [0055-0057], The combination module 360 combines continuous audio segments with the same identified speaker <read on speaker label assignment>. Also see LI [0005] for synchronized text).”
As per claim 63 (dependent on Claim 58), BASTIDE in view of PALAKODETY and BOBBITT further discloses “present the transformed conversation to be both navigable and searchable (Examiner’s Note: This claim is unclear because “the transformed conversation” lacks antecedent basis and is subject to rejection under 35 U.S.C. 112(b). BOBBITT, [0036], The record management component enables a user to search and view all, or some portion, of the time-based record <Note that ‘navigable and searchable’ can be broadly interpreted>).”

4.	Claims 64-65 are rejected under 35 U.S.C. 103 as being unpatentable over BASTIDE in view of PALAKODETY and BOBBITT, and further in view of Li, et al. (US 20120275761; hereinafter LI).
As per claim 64 (dependent on Claim 63), BASTIDE in view of PALAKODETY and BOBBITT further discloses “present one or more matches of a searched text in a first [ highlighted state ], the one or more matches being one or more parts of the synchronized text (BOBBITT, [0066], a comprehensive, organized and searchable record of events, actions and other information that take place within the virtual environment <where ‘search’ reads on ‘match’>).”
BASTIDE in view of PALAKODETY and BOBBITT does not expressly disclose “highlighted state ..” However, this feature is taught by LI (Title: Utilizing subtitles in multiple languages to facilitate second-language learning).
In the same field of endeavor, LI teaches: [0005] “these subtitles can be synchronized with spoken words in the video file, such that words in both sets of subtitles can be highlighted as such words are spoken in the video file.” LI also teaches: [0056] “subtitles in the source language that are synchronized in time with the audibly spoken words.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of LI in the system taught by BASTIDE, PALAKODETY and BOBBITT, to provide a ready mechanism for highlighting text for any purpose.
As per claim 65 (dependent on Claim 64), BASTIDE in view of PALAKODETY, BOBBITT and LI further discloses “wherein the presenter is further configured to highlight the live audio-form conversation at one or more timestamps, the one or more timestamps corresponding to the one or more matches of the searched text respectively (BOBBITT, [0033], all aspects of the virtual workspace can be captured and incorporated into the time-based record. For instance, data including, but not limited to, text, audio, video, files, sources, contacts, discussions, etc. can be maintained in a chronological format within the time-based record; LI, [0005], these subtitles can be synchronized with spoken words in the video file, such that words in both sets of subtitles can be highlighted <read on highlighting at any time or at any timestamp> as such words are spoken in the video file).”

5.	Claims 66-68, 69-72 (similar in scope to claims 58-60, 62-65) are rejected under the same rationale and the same references as applied above for claims 58-60, 62-65.
The dependencies of Claim 71 (currently dependent on claim 71) and Claim 72 (currently dependent on claim 72) must be corrected.
Claims 73, 74-77 (similar in scope to claims 58, 62-65) are rejected under the same rationale and the same references as applied above for claims 58, 62-65.  
Double Patenting

6.	Independent claims 58, 66, 73 are rejected on the ground of non-statutory double patenting as being unpatentable over corresponding independent claim 1 of U.S. patent 10978073 (original application 16/027511) in view of the prior art cited in this Office action. Dependent claims are similarly rejected based on their dependency on rejected independent claims.
The present application, Claim 58: 
A system for processing and presenting a conversation, the system comprising:
a sensor configured to, upon receipt of a user instruction, capture a live audio-form conversation; a processor configured to, upon receiving the live audio-form conversation as the live audio-form conversation is captured:
automatically transcribe, in real time or in near-real time with the live audio-form conversation, the live audio-form conversation into a live synchronized text, the live synchronized text being synchronized with the live audio-form conversation;  
automatically generate, in real time or in near-real time with the live audio-form conversation, one or more segments of the live audio-form conversation and one or more segments of the live synchronized text by at least automatically segmenting the live audio-form conversation and the live synchronized text when a speaker change occurs or a natural pause occurs such that each segment of the one or more segments of the live audio-form conversation is spoken by only one speaker and is synchronized with only one segment of the one or more segments of the live synchronized text; and
automatically assign, in real time or in near-real time with the live audio-form conversation, only one speaker label to each segment of the one or more segments of the live synchronized text, each one speaker label representing one speaker; and a presenter configured to present, in real time or in near-real time with the live audio-form conversation, the labeled live synchronized text and the live audio-form conversation; wherein in near-real time is a time delay less than one minute.
Patent #10978073, Claim 1: 
A system for processing and presenting a conversation, the system comprising:
a sensor configured to capture an audio-form conversation;
a controller configured to switch the sensor between a capturing state and an idling state;
an interface configured to receive a user instruction to instruct the controller to switch the sensor between the capturing state and the idling state;
a processor configured to:
automatically transform the audio-form conversation into a transformed conversation, the transformed conversation including a synchronized text, the synchronized text being synchronized with the audio-form conversation;
automatically generate one or more segments of the audio-form conversation and one or more segments of the synchronized text by at least automatically segmenting the audio-form conversation and the synchronized text when a speaker change occurs or a natural pause occurs such that each segment of the one or more segments of the audio-form conversation is spoken by only one speaker in audio form and is synchronized with only one segment of the one or more segments of the synchronized text; and
automatically assign only one speaker label to each segment of the one or more segments of the synchronized text, each one speaker label representing one speaker; and
a presenter configured to present the transformed conversation including the synchronized text and the audio-form conversation.

Conclusion
7.	THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).   
	A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 		
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FENG-TZER TZENG whose telephone number is (571)272-4609. The examiner can normally be reached on M-F (8:30-5:00). The fax phone number where this application or proceeding is assigned is 571-273-4609.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir (SPE) can be reached on 571-272-7799. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/FENG-TZER TZENG/		7/7/2022
Primary Examiner, Art Unit 2659