DETAILED ACTION
This action is in response to the initial filing of Application no. 17/099,803 on 11/17/2020.
Claims 1-43 are still pending in this application, with claims 1, 12, 23, 29, 31, 33, 35, 37, 39, 41 and 43 being independent.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Aside from the non-prior art rejections, it has been determined that the prior art fails to teach or suggest in reasonable combination the limitations recited in independent claims 1 (with dependent claims 2 – 11) and 12 (with dependent claims 13 – 22). Furthermore,
Cohen et al. (US 2018/0342250) discloses the limitations recited in claims 1 and 12 (Fig.6; [0034 – 0037] [0046 – 0054]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, wherein successively assessing the voiceprint for each data chunk comprises computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dalmasso et al. (US 2017/0061968) (“Dalmasso”) discloses the limitations recited in claims 1 and 12 ([0039 – 0046]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, wherein successively assessing the voiceprint for each data chunk comprises computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dimitriadis et al. (US 2018/0166067) (“Dimitriadis”) discloses the limitations recited in claims 1 and 12 (Fig.7 and Fig.8; [0057 – 0065]), except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Krause (US 2014/0348308) discloses the limitations recited in claims 1 and 12 ([0027 – 0033]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, wherein successively assessing the voiceprint for each data chunk comprises computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Wasserblat et al. (US 2013/0246064) discloses the limitations recited in claims 1 and 12 (Fig.5 and Fig.6; [0088 – 0098]) except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
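For illustration only, the distinguishing limitations of claims 1 and 12 noted above (a first and a second similarity score computed per data chunk, followed by an accumulated voiceprint built from the verified chunks) can be sketched as follows. This is a minimal sketch and not the applicant's implementation: the cosine similarity, the averaging used to form the accumulated voiceprint, the rule that a chunk is verified when its first score exceeds its second score, the 0.8 acceptance threshold, and all function and parameter names are assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed scoring method)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def authenticate_speaker(chunk_voiceprints, rep_target, rep_second,
                         accept_threshold=0.8):
    """Assess chunk voiceprints in stream order, then authenticate.

    chunk_voiceprints: voiceprints of the data chunks in which speech was
    detected, ordered from the start of the audio stream.
    rep_target / rep_second: representative voiceprints of the speaker to be
    authenticated and of the second speaker.
    """
    verified = []
    for vp in chunk_voiceprints:
        first_score = cosine(vp, rep_target)    # speaker to be authenticated
        second_score = cosine(vp, rep_second)   # second speaker
        # Assumed decision rule: keep the chunk when it is closer to the target.
        if first_score > second_score:
            verified.append(vp)

    if not verified:
        return False, None

    # Accumulated voiceprint: element-wise mean of the verified chunks (assumption).
    dim = len(rep_target)
    accumulated = [sum(vp[i] for vp in verified) / len(verified)
                   for i in range(dim)]

    # Compare the accumulated voiceprint to the representative voiceprint.
    return cosine(accumulated, rep_target) >= accept_threshold, accumulated
```

In this sketch a chunk that scores higher against the second speaker is simply skipped, so only chunks attributed to the speaker to be authenticated contribute to the accumulated voiceprint.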
Aside from the non-prior art rejections, it has been determined that the prior art fails to teach or suggest in reasonable combination the limitations recited in independent claims 29 (with dependent claim 30) and 37 (with dependent claim 38). Furthermore,
Cohen et al. (US 2018/0342250) discloses the limitations recited in claims 29 and 37 (Fig.6; [0034 – 0037] [0046 – 0054]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, wherein successively assessing the voiceprint for each data chunk comprises applying a similarity algorithm to the voiceprint for each data chunk; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dalmasso et al. (US 2017/0061968) (“Dalmasso”) discloses the limitations recited in claims 29 and 37 ([0039 – 0046]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, wherein successively assessing the voiceprint for each data chunk comprises applying a similarity algorithm to the voiceprint for each data chunk; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dimitriadis et al. (US 2018/0166067) (“Dimitriadis”) discloses the limitations recited in claims 29 and 37 (Fig.7 and Fig.8; [0057 – 0065]), except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Krause (US 2014/0348308) discloses the limitations recited in claims 29 and 37 ([0027 – 0033]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, wherein successively assessing the voiceprint for each data chunk comprises applying a similarity algorithm to the voiceprint for each data chunk; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Wasserblat et al. (US 2013/0246064) discloses the limitations recited in claims 29 and 37 (Fig.5 and Fig.6; [0088 – 0098]) except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Aside from the non-prior art rejections, it has been determined that the prior art fails to teach or suggest in reasonable combination the limitations recited in independent claims 31 (with dependent claim 32) and 39 (with dependent claim 40). Furthermore,
Cohen et al. (US 2018/0342250) discloses the limitations recited in claims 31 and 39 (Fig.6; [0034 – 0037]  [0046– 0054]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dalmasso et al. (US 2017/0061968) (“Dalmasso”) discloses the limitations recited in claims 31 and 39  ([0039 – 0046]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dimitriadis et al. (US 2018/0166067) (“Dimitriadis”) discloses the limitations recited in claims 31 and 39  (Fig.7 and Fig.8; [0057 – 0065]), except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Krause (US 2014/0348308) discloses the limitations recited in claims 31 and 39 ([0027 – 0033]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Wasserblat et al. (US 2013/0246064) discloses the limitations recited in claims 31 and 39 (Fig.5 and Fig.6; [0088 – 0098]) except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Aside from the non-prior art rejections, it has been determined that the prior art fails to teach or suggest in reasonable combination the limitations recited in independent claims 33 (with dependent claim 34) and 41 (with dependent claim 42). Furthermore,
Cohen et al. (US 2018/0342250) discloses the limitations recited in claims 33 and 41 (Fig.6; [0034 – 0037]  [0046– 0054]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dalmasso et al. (US 2017/0061968) (“Dalmasso”) discloses the limitations recited in claims 33 and 41 ([0039 – 0046]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dimitriadis et al. (US 2018/0166067) (“Dimitriadis”) discloses the limitations recited in claims 33 and 41 (Fig.7 and Fig.8; [0057 – 0065]), except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Krause (US 2014/0348308) discloses the limitations recited in claims 33 and 41 ([0027 – 0033]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Wasserblat et al. (US 2013/0246064) discloses the limitations recited in claims 33 and 41 (Fig.5 and Fig.6; [0088 – 0098]) except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Aside from the non-prior art rejections, it has been determined that the prior art fails to teach or suggest in reasonable combination the limitations recited in independent claims 35 (with dependent claim 36) and 43 (with dependent claim 44). Furthermore,
Cohen et al. (US 2018/0342250) discloses the limitations recited in claims 35 and 43 (Fig.6; [0034 – 0037]  [0046– 0054]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dalmasso et al. (US 2017/0061968) (“Dalmasso”) discloses the limitations recited in claims 35 and 43  ([0039 – 0046]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Dimitriadis et al. (US 2018/0166067) (“Dimitriadis”) discloses the limitations recited in claims 35 and 43  (Fig.7 and Fig.8; [0057 – 0065]), except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Krause (US 2014/0348308) discloses the limitations recited in claims 35 and 43  ([0027 – 0033]) except for the following: successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; and generating by the processor the accumulated voiceprint using the verified data chunks of the speaker to be authenticated.
Wasserblat et al. (US 2013/0246064) discloses the limitations recited in claims 35 and 43 (Fig.5 and Fig.6; [0088 – 0098]) except for the following: generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel.
Aside from the non-prior art rejection, it has been determined that the prior art fails to teach or suggest in reasonable combination the limitations recited in independent claim 23 (with dependent claims 24 – 28). Furthermore, Wasserblat et al. (US 2013/0246064) discloses a method for generating a representative voiceprint of a speaker from audio stream data with speech over an audio channel (Abstract), the method comprising: in a processor (speaker segmentation component and controller, Fig.1, 148, Fig.3, Fig.7, 705; [0019] [0105]) receiving audio stream data of an audio stream with speech from a first speaker (customer) and a second speaker (agent) speaking over an audio channel (Fig.1, 112) and a reference voiceprint of the first speaker (customer model, Fig.1, 146; [0028] [0033] [0055 – 0058]) (Fig.3, 300 and Fig.6, 610; [0032] [0059 – 0061] [0088]); distinguishing by the processor parts of the audio stream with speech from the first speaker or the second speaker (as shown by voice activity detection (VAD) 302 voice activity may be detected... the areas around the detected pitch harmonics may be classified as speech areas and the rest of the signal may be classified as non-speech, [0062]); dividing by the processor the audio stream data into a plurality of data chunks (frames) having a predefined time interval (an RT buffer containing audio signals captured from an interaction may be sliced into frames, typically one second in length, [0065] [0089]); generating by the processor a voiceprint from the speech of each data chunk (acoustic features may be extracted from a frame, [0063] [0065] [0095]); assigning by the processor a similarity score to the voiceprint generated for each speech data chunk by applying a similarity algorithm that compares a voiceprint to the reference voiceprint of the first speaker, wherein the similarity score is indicative of the speech in the voiceprint of the speech data chunk belonging to the first speaker (a first probability score or value may be computed to represent the probability that the acoustic feature was produced by a customer, [0065] [0089] [0095]); and generating by the processor a representative voiceprint (model) of the second speaker using voiceprints of the speech data chunks (in some cases a caller may not be associated with a model... in such or other scenarios, embodiments of the invention may analyze one or more segments to produce an analysis result and may generate, in real time a model for a source, [0093]).
Yet, Wasserblat fails to teach and/or suggest the following: distinguishing by the processor data chunks from the plurality of data chunks with speech from the first speaker or the second speaker in the predefined time interval; upon detecting that the audio stream ended, identifying by the processor the speech data chunks with voiceprints having respective similarity scores lower than a predefined threshold; and generating by the processor the representative voiceprint of the second speaker using voiceprints of the identified speech data chunks.
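For illustration only, the limitations of claim 23 that Wasserblat is found not to teach (after the stream ends, selecting the speech data chunks whose similarity scores against the first speaker's reference voiceprint fall below a predefined threshold, and building the second speaker's representative voiceprint from them) can be sketched as follows. This is a minimal sketch under stated assumptions: cosine similarity stands in for the similarity algorithm, the representative voiceprint is taken as the element-wise mean of the selected voiceprints, and the function and parameter names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed scoring method)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def second_speaker_voiceprint(speech_chunk_voiceprints, reference_first, threshold):
    """Run once the audio stream has ended.

    speech_chunk_voiceprints: one voiceprint per speech data chunk.
    reference_first: reference voiceprint of the first speaker.
    threshold: predefined threshold on the similarity score.
    Returns the representative voiceprint of the second speaker, or None when
    no chunk scores below the threshold.
    """
    # Similarity score of every speech chunk against the first speaker.
    scored = [(cosine(vp, reference_first), vp) for vp in speech_chunk_voiceprints]

    # Chunks scoring below the threshold are attributed to the second speaker.
    low = [vp for score, vp in scored if score < threshold]
    if not low:
        return None

    # Representative voiceprint: element-wise mean (assumption).
    dim = len(reference_first)
    return [sum(vp[i] for vp in low) / len(low) for i in range(dim)]
```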

Double Patenting
A rejection based on double patenting of the “same invention” type finds its support in the language of 35 U.S.C. 101 which states that “whoever invents or discovers any new and useful process... may obtain a patent therefor...” (Emphasis added). Thus, the term “same invention,” in this context, means an invention drawn to identical subject matter. See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957).
A statutory type (35 U.S.C. 101) double patenting rejection can be overcome by canceling or amending the claims that are directed to the same invention so they are no longer coextensive in scope. The filing of a terminal disclaimer cannot overcome a double patenting rejection based upon 35 U.S.C. 101.

Claims 2, 13, 23 – 28, 30, 32, 34, 36, 38, 40, 42 and 44 are rejected under 35 U.S.C. 101 as claiming the same invention as that of claims 1, 11 and 21 – 34 of prior U.S. Patent No. 10,885,920. This is a statutory double patenting rejection.

The claim mapping is as follows.
Current Application

1. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein successively assessing the voiceprint for each data chunk comprises computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker.

2. The method according to claim 1, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.


12. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to successively assess the voiceprint for each data chunk by computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker.

13. The computerized system according to claim 12, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.

23. A method for generating a representative voiceprint of a speaker from audio stream data with speech over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a first speaker and a second speaker speaking over an audio channel and a reference voiceprint of the first speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; distinguishing by the processor data chunks from the plurality of data chunks with speech from the first speaker or the second speaker in the predefined time interval; generating by the processor a voiceprint for the speech in each speech data chunk; assigning by the processor a similarity score to the voiceprint generated for each speech data chunk by applying a similarity algorithm that compares each generated voiceprint to the reference voiceprint of the first speaker, wherein the similarity score is indicative of the speech in the voiceprint of the speech data chunk belonging to the first speaker; upon detecting that the audio stream ended, identifying by the processor the speech data chunks with voiceprints having respective similarity scores lower than a predefined threshold; and generating by the processor a representative voiceprint of the second speaker using voiceprints of the identified speech data chunks.

24. The method according to claim 23, further comprising storing the representative voiceprint of the second speaker in a database with representative voiceprints of multiple speakers.

25. The method according to claim 23, wherein identifying the speech data chunks from the plurality of data chunks comprises applying a voice activity detection algorithm to the plurality of data chunks.

26. The method according to claim 23, wherein the voiceprint comprises an i-vector.

27. The method according to claim 23, wherein the similarity algorithm uses a log likelihood ratio.

28. The method according to claim 23, further comprising calculating the predefined threshold from a decision boundary of a distribution of the similarity scores for voiceprints generated from the speech data chunks.

29. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein successively assessing the voiceprint for each data chunk comprises applying a similarity algorithm to the voiceprint for each data chunk.

30. The method according to claim 29, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.

31. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the voiceprint comprises an i-vector.

32. The method according to claim 31, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.

33. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein comparing the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated comprises computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

34. The method according to claim 33, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.

35. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel; and detecting if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval. does not belong to the speaker to be authenticated or to the second speaker.

36. The method according to claim 35, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.

37. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to successively assess the voiceprint for each data chunk by applying a similarity algorithm to the voiceprint for each data chunk.

38. The computerized system according to claim 37, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.

39. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the voiceprint comprises an i-vector.

40. The computerized system according to claim 39, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.

41. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated by computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

42. The computerized system according to claim 41, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.

43. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to detect if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.

44. The computerized system according to claim 43, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.
US 10,885,920

1. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein successively assessing the voiceprint for each data chunk comprises computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker.


11. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to successively assess the voiceprint for each data chunk by computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker.


21. A method for generating a representative voiceprint of a speaker from audio stream data with speech over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a first speaker and a second speaker speaking over an audio channel and a reference voiceprint of the first speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; distinguishing by the processor data chunks from the plurality of data chunks with speech from the first speaker or the second speaker in the predefined time interval; generating by the processor a voiceprint for the speech in each speech data chunk; assigning by the processor a similarity score to the voiceprint generated for each speech data chunk by applying a similarity algorithm that compares each generated voiceprint to the reference voiceprint of the first speaker, wherein the similarity score is indicative of the speech in the voiceprint of the speech data chunk belonging to the first speaker; upon detecting that the audio stream ended, identifying by the processor the speech data chunks with voiceprints having respective similarity scores lower than a predefined threshold; and generating by the processor a representative voiceprint of the second speaker using voiceprints of the identified speech data chunks.

22. The method according to claim 21, further comprising storing the representative voiceprint of the second speaker in a database with representative voiceprints of multiple speakers.

23. The method according to claim 21, wherein identifying the speech data chunks from the plurality of data chunks comprises applying a voice activity detection algorithm to the plurality of data chunks.

24. The method according to claim 21, wherein the voiceprint comprises an i-vector.

25. The method according to claim 21, wherein the similarity algorithm uses a log likelihood ratio.

26. The method according to claim 21, further comprising calculating the predefined threshold from a decision boundary of a distribution of the similarity scores for voiceprints generated from the speech data chunks.

27. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein successively assessing the voiceprint for each data chunk comprises applying a similarity algorithm to the voiceprint for each data chunk.

28. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the voiceprint comprises an i-vector.

29. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein comparing the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated comprises computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

30. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel; and detecting if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.

31. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to successively assess the voiceprint for each data chunk by applying a similarity algorithm to the voiceprint for each data chunk.

32. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the voiceprint comprises an i-vector.

33. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated by computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

34. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to detect if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.
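
For orientation only, the flow recited in the independent claims listed above (per-chunk scoring against two representative voiceprints, a clock counter incremented by the chunk interval, and an accumulated voiceprint compared to the representative voiceprint) can be sketched as follows. This is a minimal illustration and not the implementation of either the application or the reference patent. The function names, the use of cosine similarity, the averaging used to accumulate voiceprints, and all numeric thresholds are assumptions made for the sketch; the claims recite only "a similarity algorithm" and "a similarity score".

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed scoring function)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(vectors):
    """Element-wise mean, used here as an assumed way to accumulate voiceprints."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def authenticate(chunk_voiceprints, rep_target, rep_second,
                 chunk_seconds=2.0, min_speech_seconds=4.0, accept_score=0.8):
    """Hypothetical helper following the claimed sequence of steps.

    chunk_voiceprints: one vector per data chunk, or None where no speech
    was detected. Returns True/False once enough verified speech has been
    accumulated, or None if the stream ends before that point.
    """
    clock = 0.0      # clock counter, in seconds of verified speech
    verified = []    # voiceprints attributed to the speaker to be authenticated
    for vp in chunk_voiceprints:
        if vp is None:
            continue
        first = cosine(vp, rep_target)    # first similarity score
        second = cosine(vp, rep_second)   # second similarity score
        if first > second:
            verified.append(vp)
            clock += chunk_seconds
        if clock > min_speech_seconds:
            accumulated = mean_vector(verified)
            return cosine(accumulated, rep_target) >= accept_score
    return None
```

With a 2-second chunk interval and a 4-second threshold, this sketch reaches a decision only after the third chunk attributed to the speaker to be authenticated; chunks attributed to the second speaker do not advance the counter.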


The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA  as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA /25, or PTO/AIA /26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1, 3 – 12, 14 – 22, 29, 31, 33, 35, 37, 39, 41 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 3 – 20 and 27 – 34 of U.S. Patent No. 10,885,920. Although the claims at issue are not identical, they are not patentably distinct from each other.

Current Application

1. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein successively assessing the voiceprint for each data chunk comprises computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker.

2. The method according to claim 1, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.


12. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to successively assess the voiceprint for each data chunk by computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker.

13. The computerized system according to claim 12, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.

23. A method for generating a representative voiceprint of a speaker from audio stream data with speech over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a first speaker and a second speaker speaking over an audio channel and a reference voiceprint of the first speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; distinguishing by the processor data chunks from the plurality of data chunks with speech from the first speaker or the second speaker in the predefined time interval; generating by the processor a voiceprint for the speech in each speech data chunk; assigning by the processor a similarity score to the voiceprint generated for each speech data chunk by applying a similarity algorithm that compares each generated voiceprint to the reference voiceprint of the first speaker, wherein the similarity score is indicative of the speech in the voiceprint of the speech data chunk belonging to the first speaker; upon detecting that the audio stream ended, identifying by the processor the speech data chunks with voiceprints having respective similarity scores lower than a predefined threshold; and generating by the processor a representative voiceprint of the second speaker using voiceprints of the identified speech data chunks.

24. The method according to claim 23, further comprising storing the representative voiceprint of the second speaker in a database with representative voiceprints of multiple speakers.

25. The method according to claim 23, wherein identifying the speech data chunks from the plurality of data chunks comprises applying a voice activity detection algorithm to the plurality of data chunks.

26. The method according to claim 23, wherein the voiceprint comprises an i-vector.

27. The method according to claim 23, wherein the similarity algorithm uses a log likelihood ratio.

28. The method according to claim 23, further comprising calculating the predefined threshold from a decision boundary of a distribution of the similarity scores for voiceprints generated from the speech data chunks.
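
For illustration only, the selection step recited in claim 23 above (keeping the speech data chunks whose similarity to the first speaker falls below a threshold, then building the second speaker's representative voiceprint from them) can be sketched in Python as follows. This is a sketch under stated assumptions rather than the claimed implementation: cosine similarity is substituted for the log likelihood ratio of claim 27, the threshold is passed in as a fixed number rather than derived from a decision boundary as in claim 28, the representative voiceprint is assumed to be an element-wise mean, and the name `second_speaker_voiceprint` is hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity, an assumed stand-in for the claimed similarity algorithm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def second_speaker_voiceprint(speech_voiceprints, reference_first, threshold=0.5):
    """Average the chunk voiceprints that score below `threshold` against the first speaker.

    Intended to run after the audio stream has ended, on speech chunks only.
    Returns None when no chunk scores below the threshold.
    """
    selected = [vp for vp in speech_voiceprints
                if cosine(vp, reference_first) < threshold]
    if not selected:
        return None
    n = len(selected)
    return [sum(col) / n for col in zip(*selected)]

# Usage with toy two-dimensional voiceprints (hypothetical data).
voiceprints = [[1.0, 0.1], [0.0, 1.0], [0.2, 1.0], [0.9, 0.0]]
representative = second_speaker_voiceprint(voiceprints, reference_first=[1.0, 0.0])
```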

29. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein successively assessing the voiceprint for each data chunk comprises applying a similarity algorithm to the voiceprint for each data chunk.

30. The method according to claim 29, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.

31. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the voiceprint comprises an i-vector.

32. The method according to claim 31, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.

33. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein comparing the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated comprises computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

34. The method according to claim 33, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.

35. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints; generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel; and detecting if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.

36. The method according to claim 35, wherein after the assessing of the voiceprint, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval, and when the clock counter has a time value greater than a predefined threshold, performing the generating of the accumulated voiceprint.

37. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to successively assess the voiceprint for each data chunk by applying a similarity algorithm to the voiceprint for each data chunk.

38. The computerized system according to claim 37, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.

39. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the voiceprint comprises an i-vector.

40. The computerized system according to claim 39, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.

41. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated by computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

42. The computerized system according to claim 41, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.

43. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to detect if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.

44. The computerized system according to claim 43, wherein after the assessing of the voiceprint, the processor is further configured, upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval; and when the clock counter has a time value greater than a predefined threshold, to perform the generating of the accumulated voiceprint.
US 10,885,920

1. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein successively assessing the voiceprint for each data chunk comprises computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker.

2. The method according to claim 1, wherein verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated comprises assessing that the first similarity score is greater than the second similarity score.

3. The method according to claim 1, wherein successively assessing the voiceprint for each data chunk comprises applying a similarity algorithm to the voiceprint for each data chunk.

4. The method according to claim 1, wherein the voiceprint comprises an i-vector.

5. The method according to claim 4, wherein generating the i-vector for each data chunk in said plurality of data chunks comprises: dividing each data chunk into frames; extracting Mel-Frequency Cepstrum (MFCC) features for each of the frames; and extracting the i-vector for each data chunk from the MFCC features for each frame using a universal background model (UBM) and a total variability matrix (TVM).

6. The method according to claim 1, wherein comparing the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated comprises computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

7. The method according to claim 6, further comprising authenticating the speaker speaking with the second speaker over the audio channel by assessing that the computed similarity score for the accumulated voiceprint is greater than a predefined threshold.

8. The method according to claim 1, further comprising detecting if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.

9. The method according to claim 8, wherein detecting the additional speaker on the audio channel comprises computing, for each of the voiceprints of the successive data chunks, similarity scores for both the speaker to be authenticated and the second speaker and assessing that both similarity scores for each of the voiceprints of the successive data chunks are less than a preset similarity score threshold.

10. The method according to claim 8, further comprising generating a representative voiceprint of the additional speaker by merging the voiceprints of the successive data chunks with the speech of the detected additional speaker.
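
For illustration only, the additional-speaker detection recited in claims 8 to 10 of the patent above can be sketched in Python as follows. This is a sketch under stated assumptions, not the patented implementation: cosine similarity stands in for the unspecified similarity scores, "merging" in claim 10 is assumed to be an element-wise mean, every input chunk is assumed to contain speech, and the names `detect_additional_speaker`, `chunk_seconds`, `second_interval` and `score_threshold` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity, an assumed stand-in for the similarity score."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def detect_additional_speaker(chunk_voiceprints, rep_target, rep_other,
                              chunk_seconds=2.0, second_interval=6.0,
                              score_threshold=0.5):
    """Return a merged voiceprint of a detected additional speaker, else None."""
    run = []  # successive chunks matching neither known speaker
    for vp in chunk_voiceprints:
        below_both = (cosine(vp, rep_target) < score_threshold
                      and cosine(vp, rep_other) < score_threshold)
        if below_both:
            run.append(vp)
            # The run of unmatched chunks spans the second predefined time interval.
            if len(run) * chunk_seconds >= second_interval:
                n = len(run)
                return [sum(col) / n for col in zip(*run)]  # merge by averaging
        else:
            run = []  # chunk matched a known speaker, so the run is broken
    return None

# Usage with toy three-dimensional voiceprints (hypothetical data).
chunks = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 1.0], [0.1, 0.0, 1.0]]
third = detect_additional_speaker(chunks, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```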

11. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to successively assess the voiceprint for each data chunk by computing for the voiceprint of each data chunk, a first similarity score for the speaker to be authenticated and a second similarity score for the second speaker.

12. The computerized system according to claim 11, wherein the processor is configured to verify that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated by assessing that the first similarity score is greater than the second similarity score.

13. The computerized system according to claim 11, wherein the processor is configured to successively assess the voiceprint for each data chunk by applying a similarity algorithm to the voiceprint for each data chunk.

14. The computerized system according to claim 11, wherein the voiceprint comprises i-vector.

15. The computerized system according to claim 14, wherein the processor is configured to generate the i-vector for each data chunk in said plurality of data chunks by dividing each data chunk into frames, extracting Mel-Frequency Cepstrum (MFCC) features for each of the frames, and extracting the i-vector for each data chunk from the MFCC features for each frame using a universal background model (UBM) and a total variability matrix (TVM).

16. The computerized system according to claim 11, wherein the processor is configured to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated by computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

17. The computerized system according to claim 16, wherein the processor is configured to authenticate the speaker speaking with the second speaker over the audio channel by assessing that the computed similarity score for the accumulated voiceprint is greater than a predefined threshold.

18. The computerized system according to claim 11, wherein the processor is configured to detect if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.

19. The computerized system according to claim 18, wherein the processor is configured to detect the additional speaker on the audio channel by computing, for each of the voiceprints of the successive data chunks, similarity scores for both the speaker to be authenticated and the second speaker and assessing that both similarity scores for each of the voiceprints of the successive data chunks are less than a preset similarity score threshold.

20. The computerized system according to claim 18, wherein the processor is configured to generate a representative voiceprint of the additional speaker by merging the voiceprints of the successive data chunks with the speech of the detected additional speaker.

21. A method for generating a representative voiceprint of a speaker from audio stream data with speech over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a first speaker and a second speaker speaking over an audio channel and a reference voiceprint of the first speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; distinguishing by the processor data chunks from the plurality of data chunks with speech from the first speaker or the second speaker in the predefined time interval; generating by the processor a voiceprint for the speech in each speech data chunk; assigning by the processor a similarity score to the voiceprint generated for each speech data chunk by applying a similarity algorithm that compares each generated voiceprint to the reference voiceprint of the first speaker, wherein the similarity score is indicative of the speech in the voiceprint of the speech data chunk belonging to the first speaker; upon detecting that the audio stream ended, identifying by the processor the speech data chunks with voiceprints having respective similarity scores lower than a predefined threshold; and generating by the processor a representative voiceprint of the second speaker using voiceprints of the identified speech data chunks.

22. The method according to claim 21, further comprising storing the representative voiceprint of the second speaker in a database with representative voiceprints of multiple speakers.

23. The method according to claim 21, wherein identifying the speech data chunks from the plurality of data chunks comprises applying a voice activity detection algorithm to the plurality of data chunks.

24. The method according to claim 21, wherein the voiceprint comprises an i-vector.

25. The method according to claim 21, wherein the similarity algorithm uses a log likelihood ratio.

26. The method according to claim 21, further comprising calculating the predefined threshold from a decision boundary of a distribution of the similarity scores for voiceprints generated from the speech data chunks.

27. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein successively assessing the voiceprint for each data chunk comprises applying a similarity algorithm to the voiceprint for each data chunk.

28. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the voiceprint comprises an i-vector.

29. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein comparing the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated comprises computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

30. A method for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the method comprising: in a processor, receiving audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker; dividing by the processor the audio stream data into a plurality of data chunks having a predefined time interval; generating by the processor a voiceprint for each data chunk in said plurality of data chunks in which speech is detected; successively assessing by the processor the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, incrementing a clock counter by the predefined time interval; when the clock counter has a time value greater than a predefined threshold, generating by the processor an accumulated voiceprint using the verified data chunks of the speaker to be authenticated; and comparing by the processor the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel; and detecting if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.

31. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to successively assess the voiceprint for each data chunk by applying a similarity algorithm to the voiceprint for each data chunk.

32. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the voiceprint comprises an i-vector.

33. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated by computing a similarity score for the accumulated voiceprint using the representative voiceprint of the speaker to be authenticated.

34. A computerized system for separating and authenticating speech of a speaker on an audio stream of speakers over an audio channel, the computerized system comprising: a memory; and a processor configured to receive audio stream data of an audio stream with speech from a speaker to be authenticated speaking with a second speaker over an audio channel, and representative voiceprints of the speaker to be authenticated and the second speaker, to divide the audio stream data into a plurality of data chunks having a predefined time interval, to generate a voiceprint for each data chunk in said plurality of data chunks in which speech is detected, to successively assess the voiceprint for each data chunk from a start of the audio stream data as to whether the voiceprint for each data chunk has speech belonging to the speaker to be authenticated or to the second speaker using the representative voiceprints, and upon verifying that the voiceprint for the assessed data chunk has speech belonging to the speaker to be authenticated, to increment a clock counter by the predefined time interval, to generate an accumulated voiceprint using the verified data chunks of the speaker to be authenticated when the clock counter has a time value greater than a predefined threshold, and to compare the accumulated voiceprint to the representative voiceprint of the speaker to be authenticated, so as to authenticate the speaker speaking with the second speaker over the audio channel, wherein the processor is configured to detect if the speech in the audio stream data belongs to an additional speaker on the audio channel by verifying that the speech in the voiceprints of successive data chunks in a second predefined time interval does not belong to the speaker to be authenticated or to the second speaker.


	As shown above, the limitations recited in claims 1, 3 – 20 and 27 – 34 of U.S. Patent No. 10,885,920 anticipate the limitations recited in claims 1, 3 – 12, 14 – 22, 29, 31, 33, 35, 37, 39, 41 and 43 of the currently pending application. Therefore, claims 1, 3 – 12, 14 – 22, 29, 31, 33, 35, 37, 39, 41 and 43 of the currently pending application are obvious variants of claims 1, 3 – 20 and 27 – 34 of U.S. Patent No. 10,885,920.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SONIA L GAY whose telephone number is (571)270-1951. The examiner can normally be reached Monday-Friday 9-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SONIA L GAY/
Primary Examiner, Art Unit 2657