Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Acknowledgement  
Acknowledgement is made of applicant’s amendment filed on 04/19/2022. Applicant’s submission has been entered and made of record.
Status of the Claims
Claims 1, 5-8, 15-18, 22-33, and 36-38 are pending. 
Allowable Subject Matter
Claims 1, 5-8, 15-18, 22-33, and 36-38 are allowed for the following reasons: 
Claim 1 recites a method comprising: 
selecting, by a computer system, to identify voices of a representative and different customers, recordings of conversations between multiple speakers, each of the conversations being between the representative and the different customers, wherein the selected recordings include voice data and metadata that provide information regarding i) the representative, ii) a first customer, and iii) a second customer, and wherein the selected recordings comprise i) a first recording of a first conversation between the representative and the first customer and ii) a second recording of a second conversation between the representative and the second customer; 
generating, by the computer system, multiple sets of voice segments from the selected recordings by:
applying a statistical function to the selected recordings to mitigate background noise, 
removing portions of the selected recordings that do not include any voices so as to create the voice segments, and 
clustering the voice segments into the multiple sets, the clustering based at least in part on the voice segments having similar features, wherein each set includes one or more voice segments corresponding to one of the multiple speakers, and wherein each voice segment corresponds to a portion of the selected recordings that includes a voice of one of the multiple speakers, a first voice segment including the voice of the representative in the first conversation with the first customer and a second voice segment including the voice of the representative in the second conversation with the second customer; 
identifying, by the computer system, a given set of voice segments associated with a given speaker of the multiple speakers; 
determining, by the computer system, 
a first speaker-identification parameter by establishing, based on the voice data associated with the given set of voice segments, a vocal characteristic of the given speaker, and 
a second speaker-identification parameter by generating, based on the voice data associated with the given set of voice segments, a language model that specifies terms and/or phrases used by the given speaker in the conversations; 
determining, by the computer system, an identity of the given speaker by: 
comparing the first speaker-identification parameter to a series of fingerprints to discover a matching fingerprint, wherein the matching fingerprint is representative of the vocal characteristic of the given speaker derived from a past recording of a past conversation involving the given speaker and at least one other speaker, and 
establishing the identity based on the matching fingerprint and the language model; 
retrieving, by the computer system, identification information related to the given speaker from a network-accessible source; and 
assigning, by the computer system, the identification information to the given set of voice segments associated with the given speaker.
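For illustration only, the fingerprint comparison recited in claim 1 can be sketched as a nearest-match search over stored vectors. This is a minimal sketch under stated assumptions and is not taken from the application: the vector representation, the cosine-similarity measure, the `threshold` value, and the names `cosine` and `match_fingerprint` are all hypothetical, and the language-model step of the claim is omitted.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)

def match_fingerprint(feature, fingerprints, threshold=0.8):
    """Return the id of the stored fingerprint most similar to `feature`,
    or None when no fingerprint reaches the (assumed) similarity threshold."""
    best_id, best_sim = None, threshold
    for speaker_id, fingerprint in fingerprints.items():
        sim = cosine(feature, fingerprint)
        if sim >= best_sim:
            best_id, best_sim = speaker_id, sim
    return best_id

# Hypothetical fingerprints derived from past recordings.
fingerprints = {"rep-1": [1.0, 0.0, 0.2], "cust-7": [0.0, 1.0, 0.1]}
match = match_fingerprint([0.9, 0.1, 0.2], fingerprints)
```

Returning `None` below the threshold keeps an unmatched speaker unidentified rather than forcing the closest, possibly wrong, fingerprint.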
Claim 23 recites a non-transitory computer-readable storage medium storing instructions that are executable by a speaker identification system, the instructions comprising: 
instructions for selecting, to identify voices of a representative and different customers, real-time call data of conversations between multiple speakers, each of the conversations being between the representative and the different customers, wherein the real-time call data includes audio data, video data, and metadata that provides information regarding i) the representative, ii) a first customer, and iii) a second customer, and wherein the real-time call data comprises i) first real-time call data of a first conversation between the representative and the first customer and ii) second real-time call data of a second conversation between the representative and the second customer; 
instructions for generating multiple groups of voice segments from the real-time call data, wherein each group of voice segments includes one or more voice segments corresponding to a voice of one of the multiple speakers, a first voice segment including the voice of the representative in the first conversation with the first customer and a second voice segment including the voice of the representative in the second conversation with the second customer; 
instructions for identifying a given group of voice segments associated with a given speaker of the multiple speakers; 
instructions for determining multiple speaker-identification parameters by analyzing the audio data and the video data associated with the given group of voice segments, wherein a first speaker-identification parameter of the multiple speaker-identification parameters is representative of an audio feature to be used to identify the given speaker, and wherein a second speaker-identification parameter of the multiple speaker-identification parameters is representative of a video feature to be used to identify the given speaker; 
instructions for determining an identity of the speaker by comparing the first speaker-identification parameter to data from a first information system that is internal to an organization where the speaker identification system is deployed, and the second speaker-identification parameter to data from a second information system that is external to the organization but accessible to the speaker identification system via a network; and 
instructions for assigning identification information of the given speaker to the given group of voice segments associated with the given speaker.
Claim 36 recites a computer-implemented method comprising: 
selecting, to identify voices of a representative and different customers, recordings of conversations between multiple speakers, each of the conversations being between the representative and the different customers, wherein the selected recordings include voice data, video data, and metadata that provide information regarding i) the representative, ii) a first customer, and iii) a second customer, and wherein the selected recordings comprise i) a first recording of a first conversation between the representative and the first customer and ii) a second recording of a second conversation between the representative and the second customer; 
generating multiple voice segments by analyzing the voice data, wherein each voice segment of the multiple voice segments represents a portion of the voice data that corresponds to a single speaker; 
forming multiple sets of voice segments by grouping the multiple voice segments by speaker, wherein each set of voice segments is associated with a different speaker of the multiple speakers, a first voice segment including the voice of the representative in the first conversation with the first customer and a second voice segment including the voice of the representative in the second conversation with the second customer; 
generating, based on the video data, a facial image of a speaker for the purpose of establishing an identity of the speaker; 
obtaining contact information for each speaker of the multiple speakers that is extracted or derived from corresponding invitations to the conversations; 
retrieving, based on the contact information, at least one facial image of each speaker of the multiple speakers from a social networking service, wherein for each speaker of the multiple speakers, the corresponding contact information is associated with a profile from which the at least one facial image is retrieved; 
comparing the facial image of the speaker to the facial images retrieved from the social networking service to find a matching image; and 
establishing the identity of the speaker based on identification information associated with the matching image.
US 2018/0254051 A1 discloses a system for role modeling in a call center scenario in which the roles for the speakers comprise a customer and a customer service agent (¶12). By utilizing speaker diarization to distinguish one speaker from another, speaker recognition, and text classification, roles for each of the speakers can be determined when analyzing a large volume of calls for a call center where a customer service agent is speaking on multiple calls (¶12). Customer service agents can be identified by utilizing the assumption that the customer service agents speak on multiple calls at a call center, in contrast to a customer, who speaks on one or two calls when calling into a call center. By using speaker diarization and speaker recognition, labels can be applied to agents who appear on multiple calls to provide labelled training data. This labelled training data is utilized by a supervised or unsupervised classifier to develop a role classification model (¶12). 
In particular, the system 200 receives audio data 202 that includes either single speaker audio recordings or audio recordings of a conversation between two or more speakers. For example, an audio conversation in a customer service call center can be between a customer service representative (agent) and a customer calling in for customer service support (customer) (¶19).
In the call center example, the speaker recognition engine can start with a training set of k=10 audio conversations where there is a single agent that speaks on all k calls, and there are k different customers, one speaking on each of the k calls (¶21). Using agglomerative clustering, the 10 closest models are found after a constraint is considered. A constraint, for example, can be that only one i-vector from each call can be assigned in the 10 closest models group. This i-vector representation is used to directly detect (using speaker recognition techniques across a database of conversations) which speaker is the agent in the diarized text files (¶21). For a call center, a customer service agent (Agent) can be distinguished from customers because agents speak on many calls, unlike a customer, who tends to speak on just one call (¶22).
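For illustration only, the one-vector-per-call constraint described above can be sketched as follows. This is a simplified greedy stand-in under stated assumptions, not the agglomerative clustering of the reference: each call is assumed to supply one embedding per diarized speaker, cosine similarity is an assumed distance measure, and the names `cosine` and `find_agent` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)

def find_agent(calls):
    """calls: list of dicts mapping a per-call speaker label to an embedding.
    Seeds a group with each speaker of the first call, adds the single closest
    speaker from every other call (one vector per call), and returns the labels
    of the tightest group, which is taken to be the recurring agent."""
    best_score, best_group = -math.inf, None
    for seed_label, seed_vec in calls[0].items():
        group, total = [seed_label], 0.0
        for call in calls[1:]:
            label, vec = max(call.items(), key=lambda kv: cosine(seed_vec, kv[1]))
            group.append(label)
            total += cosine(seed_vec, vec)
        score = total / max(len(calls) - 1, 1)
        if score > best_score:
            best_score, best_group = score, group
    return best_group

# Hypothetical two-dimensional embeddings for three calls.
calls = [
    {"s0": [1.0, 0.0], "s1": [0.0, 1.0]},
    {"s0": [-1.0, 0.2], "s1": [0.9, 0.1]},
    {"s0": [0.95, -0.05], "s1": [-0.3, -1.0]},
]
agent_labels = find_agent(calls)
```

The sketch reflects the stated assumption that only the agent's voice recurs across calls, so the group with the highest average similarity marks the agent in each call.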
However, because US 2018/0254051 A1 focuses on role modeling rather than on determining an identity of the speaker as required by claims 1, 23, and 36, the prior art of record does not disclose or render obvious the combination of limitations set forth in claims 1, 23, and 36. Therefore, claims 1, 5-8, 15-18, 22-33, and 36-38 are allowed.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be labeled “comments on statement of reasons for allowance”. 
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to examiner Richard Z. Zhu, whose telephone number is 571-270-1587, or to the examiner’s supervisor, King Poon, whose telephone number is 571-272-7440. Examiner Richard Zhu can normally be reached M-Th, 0730-1700.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RICHARD Z ZHU/
Primary Examiner, Art Unit 2675
04/23/2022