DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 28 June 2021 in reference to application 16/091,926.  Claims 1, 3-5, 11, 13-16, and 18-20 are pending and have been examined.

Response to Amendment
The amendment filed 28 June 2021 has been accepted and considered in this office action.  Claims 1, 3, 11, 13, 16, and 18-20 have been amended, and claims 2, 12, and 17 cancelled.

Response to Arguments
Applicant’s arguments, see Remarks, filed 28 June 2021, with respect to rejections made under 35 U.S.C. 101 have been fully considered and are persuasive.  The 101 rejection of claims 16-20 has been withdrawn. 

Applicant's arguments filed 28 June 2021 with regard to the prior art rejections have been fully considered but they are not persuasive. Applicant argues, see Remarks pages 13-15, that Wasserblat and Ghaemmaghami fail to teach the equation
$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x).$$

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Claims 1, 3, 11, 13, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Wasserblat et al. (US PAP 2013/0246064) in view of Ghaemmaghami et al. (US PAP 2019/0304470).

Consider claim 1, Wasserblat teaches a method for voiceprint recognition (Abstract), comprising: 
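establishing and training a universal recognition model, wherein the universal recognition model is indicative of a distribution of voice features under a preset communication medium (0029, creating UBM for all agents at call center which includes shared channel acoustics) and wherein the step of establishing and training a universal recognition model comprises: 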
establishing an initial recognition model (0044, establishing UBM); and 
training the initial recognition model according to an iterative algorithm to obtain the universal recognition model (0044, training UBM using agent audio collection using Expectation-maximization, which is an iterative algorithm); 
acquiring voice data under the preset communication medium (0043, 0047, individual agents audio may be used to create individual agent models); 
creating a corresponding voiceprint vector according to the voice data (0043, 0047-48, extracting features from voice for processing, features must be in same format as those represented by UBM Gaussians for MAP adaptation); and 
determining a voiceprint feature corresponding to the voiceprint vector according to the universal recognition model (0047-48, using particular agent audio features to generate user model from UBM using MAP adaptation.  Models are representation of representative speech characteristics, i.e. voiceprint, see 0044).
Wasserblat does not specifically teach wherein the step of training the initial recognition model according to an iterative algorithm to obtain the universal recognition model comprises: 
acquiring likelihood probability p corresponding to a current voiceprint vector represented by a plurality of normal distributions according to the initial recognition model:
  
$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x)$$
  -5-PATENTSZP-1034US

In the same field of building Universal Background Models using EM,  Ghaemmaghami teaches wherein the step of training the initial recognition model according to an iterative algorithm to obtain the universal recognition model comprises: 
acquiring likelihood probability p corresponding to a current voiceprint vector represented by a plurality of normal distributions according to the initial recognition model:
  
$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x)$$
wherein, x represents current voice data, λ represents model parameters which include ω, μ, and Σ, ω represents a weight of an i-th normal distribution, μ represents a mean value of the i-th normal distribution, Σ represents a covariance matrix of the i-th normal distribution, p_i represents a probability of generating the current voice data by the i-th normal distribution, and M is the number of sampling points (see equation 5 and para 0117);
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use the well-known methods of training taught by Ghaemmaghami to carry out the Expectation-maximization approach specified by Wasserblat in order to make use of well-known methods that can accurately adapt models (Ghaemmaghami 0114).
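
As a non-limiting illustration of the recited likelihood, the following Python sketch evaluates a weighted sum of normal densities, p(x | λ) = Σ ω_i p_i(x). It is not taken from Wasserblat, Ghaemmaghami, or the application. The function names are hypothetical, and a diagonal covariance is assumed for brevity, whereas the claim recites a covariance matrix Σ.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a D-dimensional normal distribution.

    Assumption: diagonal covariance, given as a list of per-dimension variances.
    """
    d = len(x)
    log_det = sum(math.log(v) for v in var)
    mahal = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    return math.exp(-0.5 * (d * math.log(2 * math.pi) + log_det + mahal))

def mixture_likelihood(x, weights, means, variances):
    """p(x | lambda): weighted sum of the component densities p_i(x)."""
    return sum(
        w * gaussian_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    )
```

For example, `mixture_likelihood([0.0], [0.5, 0.5], [[0.0], [0.0]], [[1.0], [1.0]])` evaluates two equally weighted, identical one-dimensional components, so the result equals the density of a single standard normal at zero.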

Consider claim 3,  Ghaemmaghami teaches wherein the step of training the initial recognition model according to an iterative algorithm to obtain the universal recognition model comprises: 
calculating a probability of the i-th normal distribution according to the equation:  

$$p_i(x) = \frac{1}{(2\pi)^{D/2}\,|\Sigma_i|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)\right)$$
 
wherein, D represents the dimension of the current voiceprint vector (Equation 6, para 0117); 
selecting parameter values of ω, μ, and Σ, to maximize the log-likelihood function L: 
 
[Equation image media_image3.png: log-likelihood function L]
 (0117-0126, maximization of lambda function by selecting ω, μ, and Σ)
 acquiring updated model parameters in each iterative update:  

[Equation image media_image4.png: update equations for ω, μ, and Σ]
 
wherein, i represents the i-th normal distribution, ω represents an updated weight of the i-th normal distribution, μ represents an updated mean value, Σ represents an updated covariance matrix, and θ is an included angle between the voiceprint vector and the horizontal line (equations 11-13, paragraph 0129); and 

 
[Equation image media_image5.png: posterior probability]
wherein, the sum of posterior probabilities of the plurality of normal distributions is defined as the iterated universal recognition model (equation 9, paragraph 0128).
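
By way of a hedged illustration only, the iterative training discussed for claim 3 can be sketched as one textbook Expectation-maximization pass over a one-dimensional Gaussian mixture. The sketch below is not the claimed update equations, which are not reproduced in this action, and it does not use the angle θ recited in the claim. The function names, the one-dimensional simplification, and the variance floor are assumptions.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    """Sum over samples of log p(x | lambda) for the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )

def em_step(data, weights, means, variances):
    """One textbook EM iteration; returns updated (weights, means, variances)."""
    count = len(weights)
    # E-step: posterior probability of each component for each sample.
    posteriors = []
    for x in data:
        joint = [weights[i] * normal_pdf(x, means[i], variances[i])
                 for i in range(count)]
        total = sum(joint)
        posteriors.append([j / total for j in joint])
    # M-step: re-estimate each component from its posterior-weighted samples.
    new_w, new_mu, new_var = [], [], []
    for i in range(count):
        n_i = sum(p[i] for p in posteriors)
        mu = sum(p[i] * x for p, x in zip(posteriors, data)) / n_i
        var = sum(p[i] * (x - mu) ** 2 for p, x in zip(posteriors, data)) / n_i
        new_w.append(n_i / len(data))
        new_mu.append(mu)
        new_var.append(max(var, 1e-6))  # assumed floor to avoid a zero variance
    return new_w, new_mu, new_var
```

Repeating `em_step` until `log_likelihood` stops increasing is the usual stopping rule for this kind of training; the choice of initial parameters is left to the caller.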

Consider claim 11, Wasserblat teaches a device for voiceprint recognition (abstract), comprising a memory and a processor, wherein a computer readable instruction capable of running on the processor is stored in the memory (0110-11, memory and CPU), and when executing the computer readable instruction, the processor implements following steps of: 
establishing and training a universal recognition model, wherein the universal recognition model is indicative of a distribution of voice features under a preset communication medium (0029, creating UBM for all agents at call center which includes shared channel acoustics) and wherein the step of establishing and training a universal recognition model comprises: 
establishing an initial recognition model (0044, establishing UBM); and 
training the initial recognition model according to an iterative algorithm to obtain the universal recognition model (0044, training UBM using agent audio collection using Expectation-maximization, which is an iterative algorithm); 
acquiring voice data under the preset communication medium (0043, 0047, individual agents audio may be used to create individual agent models); 

determining a voiceprint feature corresponding to the voiceprint vector according to the universal recognition model (0047-48, using particular agent audio features to generate user model from UBM using MAP adaptation.  Models are representation of representative speech characteristics, i.e. voiceprint, see 0044).
Wasserblat does not specifically teach wherein the step of training the initial recognition model according to an iterative algorithm to obtain the universal recognition model comprises: 
acquiring likelihood probability p corresponding to a current voiceprint vector represented by a plurality of normal distributions according to the initial recognition model:
  
$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x)$$
wherein, x represents current voice data, λ represents model parameters which include ω, μ, and Σ, ω represents a weight of an i-th normal distribution, μ represents a mean value of the i-th normal distribution, Σ represents a covariance matrix of the i-th normal distribution, p_i represents a probability of generating the current voice data by the i-th normal distribution, and M is the number of sampling points;
In the same field of building Universal Background Models using EM,  Ghaemmaghami teaches wherein the step of training the initial recognition model according to an iterative algorithm to obtain the universal recognition model comprises: 

  
$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x)$$
wherein, x represents current voice data, λ represents model parameters which include ω, μ, and Σ, ω represents a weight of an i-th normal distribution, μ represents a mean value of the i-th normal distribution, Σ represents a covariance matrix of the i-th normal distribution, p_i represents a probability of generating the current voice data by the i-th normal distribution, and M is the number of sampling points (see equation 5 and para 0117);
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use the well-known methods of training taught by Ghaemmaghami to carry out the Expectation-maximization approach specified by Wasserblat in order to make use of well-known methods that can accurately adapt models (Ghaemmaghami 0114).
 
Claim 13 contains limitations similar to those of claim 3 and is therefore rejected for the same reasons.

Consider claim 16, Wasserblat teaches a computer readable storage medium which stores a computer readable instruction, wherein when executing the computer readable instruction, at least one processor implements (0110-11, memory storing instructions for CPU) the following steps of:

establishing an initial recognition model (0044, establishing UBM); and 
training the initial recognition model according to an iterative algorithm to obtain the universal recognition model (0044, training UBM using agent audio collection using Expectation-maximization, which is an iterative algorithm);
acquiring voice data under the preset communication medium (0043, 0047, individual agents audio may be used to create individual agent models); 
creating a corresponding voiceprint vector according to the voice data (0043, 0047-48, extracting features from voice for processing, features must be in same format as those represented by UBM Gaussians for MAP adaptation); and 
determining a voiceprint feature corresponding to the voiceprint vector according to the universal recognition model (0047-48, using particular agent audio features to generate user model from UBM using MAP adaptation.  Models are representation of representative speech characteristics, i.e. voiceprint, see 0044).
Wasserblat does not specifically teach wherein the step of training the initial recognition model according to an iterative algorithm to obtain the universal recognition model comprises: 

  
$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x)$$
wherein, x represents current voice data, λ represents model parameters which include ω, μ, and Σ, ω represents a weight of an i-th normal distribution, μ represents a mean value of the i-th normal distribution, Σ represents a covariance matrix of the i-th normal distribution, p_i represents a probability of generating the current voice data by the i-th normal distribution, and M is the number of sampling points;
In the same field of building Universal Background Models using EM,  Ghaemmaghami teaches wherein the step of training the initial recognition model according to an iterative algorithm to obtain the universal recognition model comprises: 
acquiring likelihood probability p corresponding to a current voiceprint vector represented by a plurality of normal distributions according to the initial recognition model:
  
$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x)$$
wherein, x represents current voice data, λ represents model parameters which include ω, μ, and Σ, ω represents a weight of an i-th normal distribution, μ represents a mean value of the i-th normal distribution, Σ represents a covariance matrix of the i-th normal distribution, p_i represents a probability of generating the current voice data by the i-th normal distribution, and M is the number of sampling points (see equation 5 and para 0117);


Claim 18 contains limitations similar to those of claim 3 and is therefore rejected for the same reasons.

Claims 4, 5, 14, 15, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wasserblat and Ghaemmaghami as applied to claims 1, 11, and 16 above, in view of Azhari et al. (Fast Universal Background Model Training on GPUs using Compute Unified Device Architecture).

Consider claim 4, Wasserblat and Ghaemmaghami teach the method according to claim 1, wherein the step of creating a corresponding voiceprint vector according to the voice data comprises: 
performing fast Fourier transform on the voice data (0043, features may be Fourier coefficients extracted from signal), but do not specifically teach the fast Fourier transform equation is formulated as:
  
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad 0 \le k \le N-1$$
 
wherein, x(n) represents input voice data, and N represents the number of Fourier transform points.
Azhari teaches the Fourier transform equation formulated as:

  
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad 0 \le k \le N-1$$
 
wherein, x(n) represents input voice data, and N represents the number of Fourier transform points (section II A, equation 1).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to use the textbook DFT equation of Azhari in the system of Wasserblat and Ghaemmaghami in order to use an extremely well known textbook method of calculating Fourier transform coefficients. 
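
For illustration, and assuming the equation at issue is the standard discrete Fourier transform in which X(k) is the sum over n of x(n) multiplied by e^(-j2πkn/N), a direct evaluation can be sketched in Python as follows. The function name is hypothetical. This direct form costs on the order of N squared operations; a fast Fourier transform computes the same coefficients with fewer operations.

```python
import cmath

def dft(samples):
    """Directly evaluate the N-point discrete Fourier transform of samples."""
    n_points = len(samples)
    return [
        sum(
            samples[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
            for n in range(n_points)
        )
        for k in range(n_points)
    ]
```

A unit impulse such as `[1, 0, 0, 0]` yields coefficients that are all equal to one, and a constant input concentrates all energy in the k = 0 coefficient.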

Consider claim 5, Wasserblat and Ghaemmaghami teach the method according to claim 1, but do not specifically teach wherein the step of determining a voiceprint feature corresponding to the voiceprint vector according to the universal recognition model comprises: 
decoupling the voiceprint vector; 
processing in parallel the voiceprint vector using a plurality of graphics processing units to obtain a plurality of processing results; and 
combining the plurality of processing results to determine the voiceprint feature.
In the same field of model training using UBMs, Azhari teaches 
decoupling the voiceprint vector (i.e. Section IV A, splitting into ranges of windows); 
processing in parallel the voiceprint vector using a plurality of graphics processing units to obtain a plurality of processing results (Section IV A, MFCC 
combining the plurality of processing results to determine the voiceprint feature (Section IV A, MFCC reassembled…  also see Section IV where EM of modeling is performed by breaking up tasks into parallel processing steps).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to use parallel processing as taught by Azhari in building models in the system of Wasserblat and Ghaemmaghami in order to more efficiently compute voiceprint models (Azhari introduction).
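
The decouple, process in parallel, and combine pattern discussed for claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions and is not Azhari's implementation: CPU threads from the Python standard library stand in for the plurality of graphics processing units, and `process_chunk` is a hypothetical placeholder for the per-portion computation.

```python
from concurrent.futures import ThreadPoolExecutor

def split(vector, parts):
    """Decouple the vector into at most `parts` contiguous chunks."""
    if not vector:
        return []
    size = -(-len(vector) // parts)  # ceiling division
    return [vector[i:i + size] for i in range(0, len(vector), size)]

def process_chunk(chunk):
    """Hypothetical per-chunk work; here each element is squared."""
    return [value * value for value in chunk]

def parallel_feature(vector, workers=4):
    """Process the chunks concurrently and recombine results in input order."""
    chunks = split(vector, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_chunk, chunks))  # map preserves order
    return [value for part in results for value in part]
```

Because `ThreadPoolExecutor.map` returns results in the order of its inputs, concatenating the partial results reproduces what a single sequential pass would give.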

Claim 14 contains limitations similar to those of claim 4 and is therefore rejected for the same reasons.

Claim 15 contains limitations similar to those of claim 5 and is therefore rejected for the same reasons.

Claim 19 contains limitations similar to those of claim 4 and is therefore rejected for the same reasons.

Claim 20 contains limitations similar to those of claim 5 and is therefore rejected for the same reasons.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451.  The examiner can normally be reached on 7:30-12 Monday and Friday, 7:30-6 Tuesday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602.  The fax phone 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


DOUGLAS GODBOLD
Examiner
Art Unit 2658



/DOUGLAS GODBOLD/Primary Examiner, Art Unit 2658