DETAILED ACTION
1.	This communication is in response to the Amendments and Arguments filed on 2/19/2021. Claims 1-20 are pending and have been examined.  
Response to Amendments and Arguments
2.	 With respect to claim rejections under 35 USC 103, in particular, independent claims 1, 10, 14 have been amended to recite “receiving speech data being partitioned into a plurality of utterances from a plurality of speakers.”
In response, note that RODRIGUEZ teaches: Fig. 1 and Fig. 3 and [Abstract] “detecting a target speaker's utterance locally,” which reads on “speech data being partitioned into a plurality of utterances,” a limitation that is subject to broad interpretation. If the applicant considers this limitation different from what the references have taught, the applicant is requested to make the amended limitation more specific (such as the details of the “partitioning” step, including speech end-pointing and possibly speaker-specific partition/collection, and so on) for further consideration.
Claim Rejections - 35 USC § 103

3.	Claims 1, 3-6, 9-17, 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Rodriguez, et al. (US 20140379332; hereinafter RODRIGUEZ) in view of Yao (US 20070033044; hereinafter YAO) and further in view of Dehak, et al. (InterSpeech, ISCA 2011; hereinafter DEHAK).
As per claim 1, RODRIGUEZ (Title: Identification of a local speaker) discloses “A method for speaker recognition (RODRIGUEZ, Abstract, speaker identification), comprising:
receiving speech data being partitioned into a plurality of utterances from a plurality of speakers that include a plurality of voice features (RODRIGUEZ, Fig. 1 and Fig. 3; Abstract, <read on ‘speech data being partitioned into a plurality of utterances’, which can be broadly interpreted; also reads on a plurality of speakers, see Fig. 3> .. extracting features from the detected utterance locally);
extracting a plurality of variability factors from the speech data based on a trained probabilistic model of the voice features of the plurality of speakers, [ the variability factors having a particular dimensionality ], wherein the variability factors reflect speech characteristics of individual speakers as defined by the trained probabilistic model (RODRIGUEZ, [0004], Gaussian Mixture Model <read on trained probabilistic model. Also see YAO below>; Abstract, extracting features from the detected utterance <where ‘voice features’ read on ‘variability factors’ which are not defined in the Specification and can be broadly interpreted>);
[ reducing the dimensionality of the plurality of variability factors using a non-parametric analysis, thereby generating dimensionality reduced variability factors ]; and
defining a score space based at least on [ the dimensionality reduced variability factors ] (RODRIGUEZ, Abstract, analyzing the extracted features <read on the original or the dimensionality-reduced variability factors> in the local device to obtain information on the speaker identification; [0065], compare one audio against N models, so that N scores are obtained <read on ‘score space’ which is not defined in the Specification and can be broadly interpreted>. We sort them and we take the highest one (s); [0027], the scores have a certain distribution i.e., Gaussian).”  
RODRIGUEZ does not expressly disclose “the variability factors having a particular dimensionality ..” However, this feature is taught by YAO (Title: System and method for creating generalized tied-mixture hidden Markov models for automatic speech recognition).
In the same field of endeavor, YAO teaches: “GTM-HMMs for acoustic modeling .. features are 10-dimensional mel-frequency cepstral coefficient, or MFCC ..” where the “10-dimensional MFCC” reads on “the variability factors having a particular dimensionality.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of YAO in a speaker identification system (as taught by RODRIGUEZ) for improved identification performance, such as by using MFCC as identification features.
RODRIGUEZ in view of YAO does not expressly disclose “reducing the dimensionality of the plurality of variability factors using a non-parametric analysis, thereby generating dimensionality reduced variability factors.” However, this feature is taught by DEHAK (Title: Language Recognition via Ivectors and Dimensionality Reduction).
In the same field of endeavor, DEHAK teaches: Title (where “Ivectors” also reads on “the variability factors having a particular dimensionality”) and [Sec. 4.1, Para 1] “We evaluated several dimensionality reduction approaches .. We also compared the i-vector approach with two other well known language identification systems” and [Sec. 2.4 and 2.5, Para 1] “Linear discriminant analysis (LDA) is a very popular technique for dimension reduction in the machine learning field .. Neighborhood component analysis (NCA) is a dimension reduction technique.” Both read on “non-parametric analysis.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of DEHAK in a speaker identification system (as taught by RODRIGUEZ and YAO) for improved identification performance with reduced-dimension variability factors.
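For illustration only, the kind of dimension reduction cited from DEHAK Sec. 2.4 (LDA driven by between-class and within-class statistics) can be sketched as follows. This is a minimal two-class, two-dimensional Fisher discriminant, not DEHAK's multi-class procedure; the function names and the 2-D restriction are assumptions made to keep the sketch self-contained.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float]


def mean(points: Sequence[Vec]) -> Vec:
    """Component-wise mean of a set of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def within_scatter(classes: Sequence[Sequence[Vec]]) -> Tuple[float, float, float]:
    """Pooled within-class scatter, returned as (sxx, sxy, syy) of a symmetric 2x2 matrix."""
    sxx = sxy = syy = 0.0
    for pts in classes:
        mx, my = mean(pts)
        for x, y in pts:
            dx, dy = x - mx, y - my
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    return sxx, sxy, syy


def fisher_direction(class_a: Sequence[Vec], class_b: Sequence[Vec]) -> Vec:
    """Two-class Fisher direction w = Sw^{-1} (mean_a - mean_b)."""
    sxx, sxy, syy = within_scatter([class_a, class_b])
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    ma, mb = mean(class_a), mean(class_b)
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Closed-form inverse of the symmetric 2x2 matrix applied to the mean difference.
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


def project(points: Sequence[Vec], w: Vec) -> List[float]:
    """Reduce each 2-D point to one dimension by projecting onto w."""
    return [x * w[0] + y * w[1] for x, y in points]
```

Projecting each class onto the returned direction reduces every 2-D vector to a single value while keeping the class means apart relative to the within-class spread.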
As per claim 3 (dependent on claim 1), RODRIGUEZ in view of YAO and DEHAK further discloses “receiving subsequent speech data from a target speaker; scoring multiple variability factors of the target speaker using the score space; and identifying the target speaker based at least on a score of the multiple variability factors (RODRIGUEZ, Abstract, Method for speaker identification includes detecting a target speaker's utterance locally; extracting features from the detected utterance locally, analyzing the extracted features <read on multiple variability factors> in the local device to obtain information on the speaker identification; [0065], compare one audio against N models, so that N scores are obtained. We sort them and we take the highest one (s); [0027], the scores have a certain distribution i.e., Gaussian).”
As per claim 4 (dependent on claim 1), RODRIGUEZ in view of YAO and DEHAK further discloses “wherein the non-parametric analysis is a Nearest Neighbor Discriminant Analysis (NNDA) (DEHAK, Sec. 2.5, Para 1, the nearest neighbor classifier; RODRIGUEZ, [0020], the extracted features can be encoded by vector quantization).” 
As per claim 5 (dependent on claim 4), RODRIGUEZ in view of YAO and DEHAK further discloses “using a nearest neighbor rule which maintains within-class and between-class variations of the plurality of variability factors to reduce dimensionality (DEHAK, Sec. 2.5, Para 1, the nearest neighbor classifier; Sec. 2.4, Para 1, The matrices Σb and Σw correspond to the between classes and within class covariance matrices, respectively).”
As per claim 6 (dependent on claim 1), RODRIGUEZ in view of YAO and DEHAK further discloses “comprising defining the score space using a probabilistic discriminant analysis of the dimensionality reduced variability factors (RODRIGUEZ, [0065], compare one audio against N models, so that N scores are obtained. We sort them and we take the highest one (s)).”
As per claim 9 (dependent on claim 1), RODRIGUEZ in view of YAO and DEHAK further discloses “wherein the plurality of voice features are determined using Mel frequency cepstral coefficients (MFCC) (RODRIGUEZ, [0054], MEL frequency cepstral coefficients (MFCC)).”
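For illustration only, the scoring step cited from RODRIGUEZ [0065] for claims 1, 3 and 6 (compare one audio against N models, sort the N scores, take the highest one or ones) can be sketched as follows. The use of cosine similarity as the per-model score and the names `score_against_models` and `top_speakers` are assumptions; RODRIGUEZ does not specify this scoring function.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (assumed score function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length vector cannot be scored")
    return dot / (norm_a * norm_b)


def score_against_models(
    features: Sequence[float], models: Dict[str, Sequence[float]]
) -> Dict[str, float]:
    """Compare one feature vector against N speaker models, giving N scores."""
    return {name: cosine(features, model) for name, model in models.items()}


def top_speakers(scores: Dict[str, float], k: int = 1) -> List[Tuple[str, float]]:
    """Sort the scores in descending order and keep the k highest."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]
```

Calling `top_speakers(score_against_models(features, models))` returns the best-matching model name together with its score; a larger `k` keeps several candidates.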
Claims 10, 11-12, 13 (device claims, similar in scope to method claims 1 and 6, 3-4, 6) are rejected under the same rationale as applied above for claims 1 and 6, 3-4, 6.  
Claims 14, 15, 16, 17, 19, 20 (computer-readable medium claims, similar in scope to method claims 1, 3, 4, 6, 2, 4) are rejected under the same rationale as applied above for claims 1, 3, 4, 6, 2, 4. Note that Specification [0037] also states “The computer-readable media may enable persistent and/or non-transitory storage of information in contrast to mere signal transmission, carrier waves or signals.”

4.	Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over RODRIGUEZ in view of YAO and DEHAK, and further in view of Vogt, et al. (Elsevier ScienceDigest, 2008; hereinafter VOGT).
As per claim 2 (dependent on claim 1), RODRIGUEZ in view of YAO and DEHAK further discloses “wherein the variability factors include [ speaker-dependent factors and session-dependent factors ] (RODRIGUEZ, Abstract, extracting features from the detected utterance).”
RODRIGUEZ in view of YAO and DEHAK does not expressly disclose “.. speaker-dependent factors and session-dependent factors ..” However, this feature is taught by VOGT (Title: Explicit modelling of session variability for speaker verification).
In the same field of endeavor, VOGT teaches: [Sec. 2, Para 4] “as the observed feature vectors are assumed to be conditional on both an explicit session-dependent part and a session-independent speaker part ..” and [Sec. 2, Para 9] “Adaptation must be used for a subspace to describe the relationships of the component means, that is the speaker mean should comprise a shared speaker-independent mean plus a speaker-dependent offset.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of VOGT in a speaker identification system (as taught by RODRIGUEZ, YAO and DEHAK) for improved identification performance by addressing modelling mismatch in speaker recognition.

5.	Claims 7-8, 18 are rejected under 35 U.S.C. 103 as being unpatentable over RODRIGUEZ in view of YAO and DEHAK, and further in view of Chi, et al. (CN 102663409B; hereinafter CHI).
As per claim 7 (dependent on claim 1), RODRIGUEZ in view of YAO and DEHAK further discloses “comprising extracting the plurality of variability factors using a total variability matrix [trained by a Universal Background Model (UBM) trained by a Gaussian Mixture Model (GMM)] (RODRIGUEZ, Abstract, extracting features from the detected utterance <where ‘voice features’ read on ‘variability factors’ which are not defined in the Specification and can be broadly interpreted>; [0004], Gaussian Mixture Model; [0019], The extracted features may be present in form of feature (characteristic) vectors <as elements of matrix> comprising ..).”
RODRIGUEZ in view of YAO and DEHAK does not expressly disclose “.. trained by a Universal Background Model (UBM) trained by a Gaussian Mixture Model (GMM) ..” However, this feature is taught by CHI (Title: Pedestrian tracking method based on HOG-LBP).
In the same field of endeavor, CHI teaches: “Gaussian mixture model-universal background model ..”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of CHI in a speaker identification system (as taught by RODRIGUEZ, YAO and DEHAK) for improved identification performance using advanced speaker-independent identification models.
As per claim 8 (dependent on claim 7), RODRIGUEZ in view of YAO, DEHAK and CHI further discloses “wherein the total variability matrix is further trained using Baum-Welch statistics of the plurality of voice features (YAO, [0035], the well-known Baum-Welch EM algorithm (see, e.g., L. R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” in proceedings of the IEEE, 77(2), 1989, pp. 257-286); [0040], trains all HMM parameters using the Baum-Welch E-M algorithm).”  	 
Claim 18 (computer-readable medium claim, similar in scope to method claim 7) is rejected under the same rationale as applied above for claim 7. Note that Specification [0037] also states “The computer-readable media may enable persistent and/or non-transitory storage of information in contrast to mere signal transmission, carrier waves or signals.”


Conclusion 
6.	THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).   
	A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 		
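For illustration only, the reply-period rules stated above can be sketched as date arithmetic. This is a simplified reading, not legal advice: it ignores weekend and holiday adjustments and extension fees, it assumes the six-month limit caps the advisory-action case, and the function names are hypothetical.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the last day of the month."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def shortened_period_end(
    mailed: date,
    first_reply: Optional[date] = None,
    advisory_mailed: Optional[date] = None,
) -> date:
    """End of the shortened statutory period under the rules quoted in the action."""
    two_months = add_months(mailed, 2)
    three_months = add_months(mailed, 3)
    six_months = add_months(mailed, 6)
    if (
        first_reply is not None
        and advisory_mailed is not None
        and first_reply <= two_months
        and advisory_mailed > three_months
    ):
        # Reply filed within two months and advisory action mailed late:
        # the period runs to the advisory mailing date, never past six months.
        return min(advisory_mailed, six_months)
    return three_months
```

The dates passed in are supplied by the caller; the mailing date of this action is not assumed here.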
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FENG-TZER TZENG whose telephone number is (571)272-4609. The examiner can normally be reached on M-F (8:30-5:00). The fax phone number where this application or proceeding is assigned is 571-273-4609.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir (SPE) can be reached on 571-272-7799. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 

/FENG-TZER TZENG/		2/22/2021
Primary Examiner, Art Unit 2659