DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification

The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 30-37 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being incomplete for omitting essential elements, such omission amounting to a gap between the elements.  See MPEP § 2172.01.  The omitted elements are: independent claim 30 ends with the phrase “; and.”, which suggests that a further element is missing and renders the claim vague and indefinite.  Claims 31-37 do not remedy the indefiniteness of claim 30 and are therefore rejected as well, based on their dependency on claim 30.  For prior art examination purposes only, the examiner will interpret the claim as omitting the ending phrase “; and.”.  Correction is required.


Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159.  See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).


Claims 21-37 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-16 of U.S. Patent 10,522,153 (reference patent). Although the claims at issue are not identical, they are not patentably distinct from each other because the extra processing steps of the ‘153 Patent are not necessary to realize the functionality of the claims of the instant application.  Further, the method claims of the ‘153 Patent recite the same structure of database servers and processors that is recited in the system claims of the instant application.  See table below.

Instant Application 16/703030
21. A non-transitory computer-readable medium having instructions stored thereon for facilitating diarization of audio files from a customer service interaction, wherein the instructions, when executed by a processing system, direct the processing system to: 

receive a set of textual transcripts from a transcription server and a set of audio files associated with the set of textual transcripts from an audio database server; 

perform a blind diarization on the set of textual transcripts and the set of audio files to segment and cluster the textual transcripts into a plurality of textual speaker clusters, wherein the number of textual speaker clusters is at least equal to a number of speakers in the textual transcript; 
automatedly apply at least one heuristic to the textual speaker clusters to select textual speaker clusters likely to be associated with an identified group of speakers; 

analyze the selected textual speaker clusters to create at least one linguistic model; 

apply the linguistic model to transcribed audio data to label a portion of the transcribed 

save the at least one linguistic model to a linguistic database server and associating it with the labeled speaker; 



and apply the saved at least one linguistic model from the linguistic database server to a new audio file transcript from an audio source to perform diarization of the new audio file by blind diarizing the new audio file, comparing each new textual speaker cluster to the at least 

22. The non-transitory computer-readable medium of claim 21, wherein the identified group of speakers are customer service agents and the audio data is audio data of a customer service interaction between at least one customer service agent and at least one customer. 

23. The non-transitory computer-readable medium of claim 21, further directing the processing system to: receive a set of recorded audio data; and transcribe the set of recorded audio data to produce the set of textual transcripts. 

24. The non-transitory computer-readable medium of claim 21, wherein the at least one 

25. The non-transitory computer-readable medium of claim 21, wherein the analysis of the selected textual speaker clusters includes determining word use frequencies for words in the selected textual speaker clusters with the processor, determining word use frequencies for words in the non-selected textual speaker clusters with the processor, and comparing the word use frequencies for words in the selected textual speaker clusters to the word use frequencies for words in the non-selected textual speaker clusters with the processor to identify a plurality of discriminating words for use in the at least one linguistic model. 

26. The non-transitory computer-readable medium of claim 21, wherein the analysis of the selected textual speaker clusters includes receiving a plurality of scripts associated with 

27. The non-transitory computer-readable medium of claim 26, further directing the processing system to: calculate a difference between the word use frequencies for each word in the selected textual speaker clusters and the non-selected textual speaker clusters; and compare the difference to a predetermined selection threshold, wherein if the difference is greater than the predetermined selection threshold, the word is identified as a discriminating word. 



29. The non-transitory computer-readable medium of claim 21, wherein the audio data is streaming audio data. 

30. A system for diarization and labeling of audio data, the system comprising: an audio database server comprising a plurality of audio files; a transcription server that transcribes the audio files into textual transcripts; a processor that: receives a set of textual transcripts from the transcription server and a set of audio files associated with 

32. The system of claim 31, further comprising applying the saved at least one linguistic model from the linguistic database server to a new audio file transcript from an audio source to perform diarization of the new audio file by blind diarizing the new audio file, comparing each new textual speaker cluster to the at least one linguistic model, and labeling each textual speaker cluster as belonging to a customer service agent or belonging to a customer. 

33. The system of claim 30, wherein the textual speaker clusters are associated in groups of at least two, wherein the group of at least two includes a textual speaker cluster originating from the identified group of speakers and at least one textual speaker 

34. The system of claim 30, wherein the identified group of speakers are customer service agents and the audio files are audio files of a customer service interaction between at least one customer service agent and at least one customer. 

35. The system of claim 30, wherein the processor further: receives a set of recorded audio files; and transcribes the set of recorded audio files to produce the set of textual transcripts. 


36. The system of claim 30, wherein the at least one heuristic is detection of a script associated with the identified group of speakers. 

37. The system of claim 30, wherein the processor further: calculates a difference between the word use frequencies for each word in the selected transcripts and the non-selected transcripts; and compares the difference to a predetermined selection threshold, wherein if the difference is greater than the predetermined selection threshold, the word is identified as a discriminating word.  

U.S. Patent 10,522,153


9. A non-transitory computer-readable medium having instructions stored thereon for facilitating diarization of audio files from a customer service interaction, wherein the instructions, when executed by a processing system, direct the processing system to: 

receive a set of textual transcripts from a transcription server; receive a set of audio files associated with the set of textual transcripts from an audio database server; 

perform a blind diarization on the set of textual transcripts and the set of audio files to segment and cluster the textual transcripts into a plurality of textual speaker clusters, wherein the number of textual speaker clusters is at least equal to a number of speakers in the textual transcript;
 automatedly apply at least one heuristic to the textual speaker clusters with a processor to select textual speaker clusters likely to be associated with an identified group of speakers, 

wherein the at least one heuristic is a comparison of a plurality of scripts associated with the identified group of speakers to each set of the textual speaker clusters and a 

 analyze the selected textual speaker clusters with the processor to create at least one linguistic model, wherein the analysis includes determining word use frequencies for words in the selected textual speaker clusters with the processor, determining word use frequencies for words in the non-selected textual speaker clusters with the processor, and comparing the word use frequencies for words in the selected transcripts to the word use frequencies for words in the non-selected transcripts with the processor to identify a plurality of discriminating words for use in the at least one linguistic model; and apply the linguistic model to transcribed audio data with the processor to label a portion of the 
(Examiner notes that the system claims of the ‘297 Patent contain a storage device, which one of ordinary skill in the art would readily recognize as reading on the claimed computer-readable medium.)

10. The non-transitory computer-readable medium of claim 29, further directing the processing system to save the at least one linguistic model to a linguistic database server and associating it with the labeled speaker. 

11. The non-transitory computer-readable medium of claim 29, further directing the processing system to apply the saved at least one linguistic model from the linguistic database server to a new audio file transcript from an audio source to perform diarization of the new audio file by blind diarizing the new audio file, comparing each new textual speaker cluster to the at least one linguistic model, and labeling each textual speaker 

13. The non-transitory computer-readable medium of claim 9, wherein the identified group of speakers are customer service agents and the audio files are audio files of a customer service interaction between at least one customer service agent and at least one customer. 

14. The non-transitory computer-readable medium of claim 9, further directing the processing system to: receive a set of recorded audio files; and transcribe the set of recorded audio files to produce the set of textual transcripts. 

15. The non-transitory computer-readable medium of claim 29, wherein the at least one heuristic is detection of a script associated with the identified group of speakers. 



determining word use frequencies for words in the selected textual speaker clusters with the processor, determining word use frequencies for words in the non-selected textual speaker clusters with the processor, and comparing the word use frequencies for words in the selected transcripts to the word use frequencies for words in the non-selected transcripts with the processor to identify a plurality of discriminating words for use in the at least one linguistic model;


(from claim 9)… wherein the at least one heuristic is a comparison of a plurality of scripts associated with the identified group of speakers to each set of the textual speaker clusters and a correlation score between each of the textual speaker clusters and the plurality of scripts is calculated and the 



16. The non-transitory computer-readable medium of claim 9, further directing the processing system to: calculate a difference between the word use frequencies for each word in the selected transcripts and the non-selected transcripts; and compare the difference to a predetermined selection threshold, wherein if the difference is greater than the predetermined selection threshold, the word is identified as a discriminating word.

12…wherein the textual speaker clusters are associated in groups of at least two, wherein the group of at least two includes a textual speaker cluster originating from the identified 


(from claim 9)…to label a portion of the transcribed audio data as having been spoken by the identified group of speakers.

1. A method of diarization, the method comprising: 



receiving a set of textual transcripts from a transcription server and a set of audio files associated with the set of textual transcripts from an audio database server; performing a blind diarization on the set of textual transcripts and the set of audio files to segment and cluster the textual transcripts 
(from claim 9) … and apply the linguistic model to transcribed audio data with the processor to label a portion of the transcribed audio data as having been spoken by the identified group of speakers.

2. The method of claim 1, further comprising saving the at least one linguistic model to a linguistic database server and associating it with the labeled speaker. 

3. The method of claim 2, further comprising applying the saved at least one linguistic model from the linguistic database server to a new audio file transcript from an audio source to perform diarization of the new audio file by blind diarizing the new audio file, comparing each new textual speaker cluster to the at least one linguistic model, and labeling each textual speaker cluster as belonging to a customer service agent or belonging to a customer. 

4. The method of claim 1, wherein the textual speaker clusters are associated in groups of at least two, wherein the group of at least two includes a textual speaker cluster originating from the identified group of speakers and at least one textual speaker cluster originating from an other speaker, and wherein the non-selected speaker clusters are assumed to have originated from the other speaker. 


5. The method of claim 1, wherein the identified group of speakers are customer service agents and the audio files are audio files of a customer service interaction between at least one customer service agent and at least one customer. 

6. The method of claim 1, further comprising: receiving a set of recorded audio files; and transcribing the set of recorded audio files to produce the set of textual transcripts. 



7. The method of claim 1, wherein the at least one heuristic is detection of a script associated with the identified group of speakers. 


8. The method of claim 1, further comprising: calculating a difference between the word use frequencies for each word in the selected 
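
For illustration only, the word use frequency comparison recited in claims 27 and 37 of the instant application and in claim 16 of the ‘153 Patent can be sketched as follows. This is a minimal sketch and not the applicant's implementation. The function names, the whitespace tokenization, the use of relative frequencies, the direction of the difference (selected minus non-selected), and the example threshold of 0.1 are all assumptions that do not appear in the claims.

```python
from collections import Counter


def word_use_frequencies(cluster_texts):
    """Relative frequency of each word across a list of cluster texts.

    Assumption: words are lowercased and split on whitespace.
    """
    counts = Counter(
        word for text in cluster_texts for word in text.lower().split()
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def discriminating_words(selected, non_selected, threshold):
    """Words whose frequency difference exceeds the selection threshold.

    Assumption: the difference is (selected frequency) minus
    (non-selected frequency); the claims do not state a direction.
    """
    selected_freq = word_use_frequencies(selected)
    other_freq = word_use_frequencies(non_selected)
    return sorted(
        word
        for word, freq in selected_freq.items()
        if freq - other_freq.get(word, 0.0) > threshold
    )


# Hypothetical clusters: agent-like text versus customer-like text.
selected_clusters = [
    "thank you for calling how may i help you",
    "thank you for holding",
]
non_selected_clusters = [
    "my order is late",
    "i need help with my order",
]

words = discriminating_words(selected_clusters, non_selected_clusters, 0.1)
print(words)
```

In this hypothetical data, only words that are noticeably more frequent in the selected clusters than in the non-selected clusters exceed the threshold; words used at similar rates by both groups are not identified as discriminating words.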








Allowable Subject Matter

Claims 21-37 are allowable over the prior art of record.

The following is a statement of reasons for the indication of allowable subject matter:
.


Conclusion

Please see related art listed on the PTO-892 form.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571)272-7623, who is available Monday-Friday, 9am-5pm. 

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
11/16/2021