Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in an interview with Praveer Kumar Gupta (Reg. No. 75977) on 2/9/2022.

The application has been amended as follows: 
1.	A method for automatic speaker diarization, the method comprising: 
removing, at a call analytics server (CAS), non-speech portions from a call audio to produce a pre-processed audio, the call audio comprising speech from at least two speakers; 
dividing, at the CAS, the pre-processed audio to a plurality of audio segments, each segment of the plurality of segments corresponding to speech from a single speaker of the at least two speakers, wherein the dividing comprises: 
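selecting a first time-window of the call audio at position i-1, and a second time-window of the call audio at position i, the second time-window equal to the first time-window in duration, and the second time-window next and adjacent to the first time-window, where i is an integer greater than 0; 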
calculating the Kullback-Leibler (KL) divergence measure for each of the first time-window at i-1 and the second-time window at i; 
shifting the first time-window and the second window by a duration of a pre-defined time resolution to positions i+1 and i+2, respectively; 
calculating the Kullback-Leibler (KL) divergence measure (D) for each of the first time-window at i+1 and the second-time window at i+2; and detecting a change point if the following conditions are met: 
Condition 1: D (i , i+1) > D (i+1 , i+2), and
Condition 2: D (i , i+1) > D (i-1 , i), and
clustering, at the CAS, the plurality of segments into at least two groups corresponding the at least two speakers.
5. 	The method of claim 1, wherein the clustering comprises: deriving the mel frequency cepstral coefficients (MFCC) values for each audio segment of the plurality of audio segments; calculating numerical array with MFCC values for each audio segment; and perform a clustering technique to yield the at least two groups of audio segments.
6. 	An apparatus for automatic speaker diarization, the apparatus comprising: 
a processor; and 
a memory communicably coupled to the processor, wherein the memory comprises computer-executable instructions, which when executed using the processor, perform a method comprising: 

dividing, at the CAS, the pre-processed audio to a plurality of audio segments, each segment of the plurality of segments corresponding to speech from a single speaker of the at least two speakers, wherein the dividing comprises: 
selecting a first time-window of the call audio at position i-1, and a second time-window of the call audio at position i, the second time-window equal to the first time-window in duration, and the second time-window next and adjacent to the first time-window, where i is an integer greater than 0, 
calculating the Kullback-Leibler (KL) divergence measure (D) for each of the first time-window at i-1 and the second-time window at i, 
shifting the first time-window and the second window by a duration of a pre-defined time resolution to positions i+1 and i+2, respectively, 
calculating the Kullback-Leibler (KL) divergence measure for each of the first time-window at i+1 and the second-time window at i+2, and detecting a change point if the following conditions are met:
Condition 1: D (i , i+1) > D (i+1 , i+2), and
Condition 2: D (i , i+1) > D (i-1 , i), and
clustering, at the CAS, the plurality of segments into at least two groups corresponding the at least two speakers.
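10.	The apparatus of claim 1, wherein the clustering comprises: deriving the mel frequency cepstral coefficients (MFCC) values for each audio segment of the plurality of audio segments; calculating numerical array with MFCC values for each audio segment; and perform a clustering technique to yield the at least two groups of audio segments.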
11.	A non-transitory computer-readable storage medium, the computer-readable storage medium including instructions that when executed by a computer, cause the computer to: 
remove, at a call analytics server (CAS), non-speech portions from a call audio to produce a pre-processed audio, the call audio comprising speech from at least two speakers; 
divide, at the CAS, the pre-processed audio to a plurality of audio segments, each segment of the plurality of segments corresponding to speech from a single speaker of the at least two speakers, wherein the dividing comprises: 
select a first time-window of the call audio at position i-1, and a second time-window of the call audio at position i, the second time-window equal to the first time-window in duration, and the second time-window next and adjacent to the first time-window, where i is an integer greater than 0; 
calculate the Kullback-Leibler (KL) divergence measure for each of the first time-window at i-1 and the second-time window at i; shift the first time-window and the second window by a duration of a pre-defined time resolution to positions i+1 and i+2, respectively;
calculate the Kullback-Leibler (KL) divergence measure (D) for each of the first time-window at i+1 and the second-time window at i+2; and detect a change point if the following conditions are met:
Condition 1: D (i , i+1) > D (i+1 , i+2), and
Condition 2: D (i , i+1) > D (i-1 , i), and

14.	The computer-readable storage medium of claim 11, wherein the clustering comprises: derive the mel frequency cepstral coefficients (MFCC) values for each audio segment of the plurality of audio segments; calculate numerical array with MFCC values for each audio segment; and perform a clustering technique to yield the at least two groups of audio segments.
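
For illustration only, the change-point test recited in Conditions 1 and 2 can be read as a strict local-maximum test on the divergence between adjacent time-windows. The sketch below is not taken from the application: it assumes one scalar feature per frame and a one-dimensional Gaussian fitted to each window, whereas the claims do not specify a distribution model. The function names and the `window` and `step` parameters are hypothetical.

```python
import math
from statistics import fmean, pvariance


def gaussian_kl(p, q, eps=1e-9):
    """KL(P || Q) for 1-D Gaussians fitted to the samples p and q.

    Assumption: each window is modeled as a single Gaussian; eps keeps
    the variances strictly positive.
    """
    mu_p, mu_q = fmean(p), fmean(q)
    var_p = pvariance(p, mu_p) + eps
    var_q = pvariance(q, mu_q) + eps
    return 0.5 * (
        var_p / var_q
        + (mu_q - mu_p) ** 2 / var_q
        - 1.0
        + math.log(var_q / var_p)
    )


def adjacent_divergences(features, window, step):
    """Divergence between each window and the equal-length window next to it.

    Entry k compares features[k*step : k*step + window] with the
    adjacent window that follows it; both windows shift by `step`
    (the pre-defined time resolution) at each position.
    """
    divs = []
    start = 0
    while start + 2 * window <= len(features):
        first = features[start:start + window]
        second = features[start + window:start + 2 * window]
        divs.append(gaussian_kl(first, second))
        start += step
    return divs


def change_points(divs):
    """Positions k where divs[k] exceeds both neighbors (Conditions 1 and 2)."""
    return [
        k
        for k in range(1, len(divs) - 1)
        if divs[k] > divs[k + 1] and divs[k] > divs[k - 1]
    ]


# Synthetic example: the feature level jumps at frame 20.
features = [0.0, 0.1] * 10 + [5.0, 5.1] * 10
divs = adjacent_divergences(features, window=4, step=2)
points = change_points(divs)
boundaries = [k * 2 + 4 for k in points]  # frame where the second window starts
```

In this synthetic example the only position satisfying both conditions is the one whose window boundary coincides with the jump in the feature values. A practical system would more likely operate on multi-dimensional features such as MFCC vectors.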

Reasons for Allowance
The following is an examiner’s statement of reasons for allowance: The record is clear on the reasons for allowance.

Related Art
Wang et al. (WO 2020146042) discloses segmentation of a speech utterance into a plurality of segments using a probabilistic generative model and d-vectors. This reference fails to disclose the recited limitations of this application.

Zhao et al. (WO 2019227672A1) discloses segmentation of speech into a plurality of windows by moving a first sliding window and a second sliding window simultaneously along the time axis for a preset period of time, producing a plurality of segmentation points of the sliding windows to obtain a plurality of first speech fragments and a plurality of second speech fragments. This reference fails to disclose the recited limitations of this application.


Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINDA WONG whose telephone number is (571)272-6044. The examiner can normally be reached 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571) 272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LINDA WONG/Primary Examiner, Art Unit 2655