DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365(c) is acknowledged.  

Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file. 

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/11/2019 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
	
Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Independent claims 1, 10 and 19 recite “the current voice data”. The preceding limitations only mention “acquiring historical voice data” and never mention any step related to “current voice data”. The limitation “the current voice data” therefore has insufficient antecedent basis.

The dependent claims fail to remedy the deficiencies of their corresponding independent claims.

In addition, dependent claims 3 and 12 recite “acquiring all the historical voice data”. Because independent claims 1 and 10 recite “acquiring historical voice data”, it is unclear whether the term “historical voice data” refers to a portion of “all the historical voice data” or to different historical voice data from different users.

Upon review of the specification, it appears that applicant intended “all the historical voice data” to express voice data from many users, whereas “the historical voice data” appears to express voice data from one user. The claimed “all the historical voice data” is ambiguous because the term could refer to all voice data from the same user or from different users. The examiner suggests clarifying the meaning of the limitation recited in claims 3 and 12. Dependent claims 4, 6-9, 13 and 15-18 depend from claims 3 and 12, respectively.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3, 5, 10, 12, 14 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. (US PG Pub. 2003/0088414) in view of Gomar (US PG Pub. 2013/0225128).

Gomar discloses a speaker recognition system in which a user can provide training utterances to create or train his voice model. The user can also evaluate the existing voice model by providing test utterances (Abstract, Fig. 17, [0147]).

Regarding claims 1, 10 and 19, Huang discloses a method, a device and a non-transitory computer-readable medium, performed by a voice interaction device (Fig. 1, [0063]), comprising: 
acquiring historical voice data, acquiring historical voice feature vectors corresponding to the historical voice data ([0008], [0011-0012], collecting speech data in the background without prior knowledge of the speakers who spoke the respective utterances; [0035], extracting feature vectors from the speech data), and performing clustering on the historical voice feature vectors to obtain a voice feature cluster (Fig. 2, #340, vector clustering algorithm based on likelihood), the voice feature cluster comprising at least one historical voice feature vector with a similar feature ([0038], vectors from the same speaker are similar); 
after it is determined that the voice feature cluster matches a high-frequency user condition, training a corresponding user voice model according to the historical voice feature vectors contained in the voice feature cluster ([0012], when enough data are collected, training speaker-specific models using self-tagged data; [0076], when not enough data are available, asking the user to speak more utterances; Examiner note: the claimed “a high-frequency user condition” is interpreted as being met when enough speech data have been collected from a particular user); 
after it is detected that a current voice feature vector of the current voice data matches the user voice model, initiating a user identity association request associated with the current voice data (Fig. 1, #130, [0018], [0032], [0059], after generating a speaker-specific model, asking a user to provide a test utterance to evaluate whether the model is sufficient to identify the current speaker; Examiner note: a user-provided test utterance corresponds to the claimed “current voice data”); and 
after it is determined that a response message corresponding to the user identity association request is received, binding user identity information in the response message to the user voice model ([0063-0066], performing speaker identification experiments using all test utterances to evaluate the trained speaker-specific models; Examiner note: when a test utterance is identified as coming from a particular speaker, the model specific to that speaker is linked to the speaker’s identity information, which corresponds to the claimed binding of user identity information in the response message to the user voice model).
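
For illustration only, the last two recited steps (matching a current voice feature vector against a trained user voice model, then binding identity information from a response) can be sketched as below. This is a minimal sketch and is not taken from the application, Huang, or Gomar: the cosine-similarity match, the 0.9 threshold, the `VoiceModel` class, and the `user_identity` response field are all hypothetical assumptions chosen to make the sequence of steps concrete.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


class VoiceModel:
    """Hypothetical user voice model: one representative feature vector."""

    def __init__(self, mean_vector):
        self.mean_vector = mean_vector
        self.identity = None  # stays unbound until a response message arrives


def should_request_identity(model, current_vector, threshold=0.9):
    """True when the current vector matches an as-yet-unbound model.

    The similarity measure and threshold are assumptions, not claim language.
    """
    return (model.identity is None
            and cosine(model.mean_vector, current_vector) >= threshold)


def bind_identity(model, response):
    """Bind the identity carried in the (hypothetical) response message."""
    model.identity = response["user_identity"]


model = VoiceModel([1.0, 0.0])
if should_request_identity(model, [0.9, 0.1]):
    # In a real device the request would go to the user; the response is stubbed here.
    bind_identity(model, {"user_identity": "alice"})
```

Once an identity is bound, `should_request_identity` returns False for that model, so the sketch issues at most one association request per model.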

Huang discloses generating and training speaker-specific models based on speech data collected in the background without the user’s intervention. Huang also discloses evaluating the generated speaker-specific models by performing speaker identification experiments using all test utterances (Fig. 1, [0063-0067]). Although Huang meets the last two limitations as broadly recited, after reviewing the specification with respect to the limitations “a current voice feature vector of the current voice data matches the user voice model, initiating a user identity association request associated with the current voice data” and “a response message corresponding to the user identity association request is received, binding user identity information in the response message to the user voice model”, the examiner further cites Gomar to show the claimed features. 

	Gomar discloses a speaker recognition system that identifies a speaker based on a biometric voice print (Abstract, [0022]). Gomar discloses that a user can enroll in the speaker recognition system by providing enrollment utterances (Gomar, [0081], Fig. 17, #1710). The user can also test the existing voice prints for identifying a speaker by providing a test utterance (Gomar, [0147]). The test utterance corresponds to the claimed “current voice data”, and the speaker identification result (e.g., correctly identifying a speaker) corresponds to the claimed “binding user identity information in the response message to the user voice model”. 

	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Huang’s teaching with Gomar’s teaching to test the speaker recognition system and see whether the system could correctly recognize a test utterance (i.e., “binding user identity information … to the user voice model”). One having ordinary skill in the art would have been motivated to make such a modification to provide a cost-effective voice biometric capability for use on mobile devices (Gomar, [0020]). In addition, all the claimed elements were known in the prior art, one skilled in the art could have combined the elements as claimed by known methods, and in the combination each element merely would have performed the same function as it did separately.

Regarding claims 5 and 14, Huang in view of Gomar further discloses determining a mean value OR an interpolation of the historical voice feature vectors contained in the voice feature cluster, to obtain a target historical voice feature vector, and using the target historical voice feature vector as a model parameter of the user voice model corresponding to the voice feature cluster (Huang, [0051-0055], using a k-means clustering algorithm. Please note that the k-means algorithm partitions the data into K clusters, each represented by the mean value of its members. For background information on the k-means clustering algorithm, please see https://en.wikipedia.org/wiki/K-means_clustering. Examiner note: the cited reference only needs to teach ONE of the alternatives connected by “OR”).
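
As background on the point above, the following minimal sketch shows how a generic k-means procedure yields one mean vector per cluster, which could then serve as a model parameter. It is not Huang's implementation: the two-dimensional toy vectors, the fixed iteration count, and the choice of the first k vectors as starting centroids are assumptions made only for illustration.

```python
from math import dist


def mean_vector(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors, k, iterations=10):
    """Plain k-means. Centroids start at the first k vectors (an arbitrary choice)."""
    centroids = [list(v) for v in vectors[:k]]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            # Assign each vector to its nearest centroid (Euclidean distance).
            nearest = min(range(k), key=lambda i: dist(v, centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid as its cluster mean; keep the old one if the cluster is empty.
        centroids = [mean_vector(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical 2-D "voice feature vectors" from two speakers.
features = [[0.0, 0.0], [10.0, 10.0], [0.0, 2.0], [10.0, 12.0]]
centroids, clusters = kmeans(features, k=2)
# centroids -> [[0.0, 1.0], [10.0, 11.0]]
```

Each returned centroid is the mean value of the vectors in its cluster, which is the alternative of the "mean value OR an interpolation" limitation that the rejection relies on.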

Regarding claims 3 and 12, Huang discloses training Gaussian mixture models and applying a clustering algorithm to generate speaker-specific models (Huang, Fig. 3, [0035-0037]). Huang does not disclose “a global difference space matrix” and projecting voice data to reduce the dimensionality of the voice feature vectors. By reviewing 

Gomar discloses applying the iVector technique to reduce the dimension of feature vectors by using a matrix T (Gomar, [0091], [0130-0133]. Please note that the total variability matrix, T, corresponds to the claimed “a global difference space matrix”).

	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Huang’s teaching with Gomar’s teaching to use the i-vector method to reduce the feature vector dimension. One having ordinary skill in the art would have been motivated to make such a modification to provide a cost-effective voice biometric capability for use on mobile devices (Gomar, [0020]), so that the model could be stored on a mobile device (Gomar, [0142]). In addition, all the claimed elements were known in the prior art, one skilled in the art could have combined the elements as claimed by known methods, and in the combination each element merely would have performed the same function as it did separately. “A combination of familiar elements according to known methods is likely to be obvious when it does no more than yield predictable results.” KSR, 550 U.S. ___, 82 USPQ2d at 1395 (2007). One of ordinary skill in the art would have recognized that the results of the combination were predictable.
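
To make the dimensionality-reduction idea concrete, the sketch below projects a high-dimensional vector onto a lower-dimensional space with a matrix T, computing w = T^T x. This is a deliberately simplified linear projection and an assumption for illustration only: a full i-vector extractor estimates w as a posterior mean using per-utterance statistics and covariances, which is omitted here, and the matrix and vector values are hypothetical.

```python
def project(matrix_t, vector):
    """Project a D-dimensional vector to R dimensions using a D x R matrix.

    Computes w = T^T x. This is a plain linear projection, not the full
    i-vector posterior estimate.
    """
    rows, cols = len(matrix_t), len(matrix_t[0])
    if len(vector) != rows:
        raise ValueError("vector length must equal the number of rows of T")
    return [sum(matrix_t[d][r] * vector[d] for d in range(rows))
            for r in range(cols)]


# Hypothetical 4 x 2 matrix standing in for a total variability matrix T.
T = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.0, 0.0],
]
x = [2.0, 3.0, 1.0, 5.0]   # hypothetical 4-dimensional voice feature vector
w = project(T, x)          # 2-dimensional representation
# w -> [3.0, 4.0]
```

The reduced vector has as many components as T has columns, which is what allows a smaller model to be stored on a device with limited memory.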


Allowable Subject Matter
Claims 2, 4, 6-9, 11, 13, 15-18 and 20 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The examiner discovered several references related to a key concept of generating speaker specific models using a non-supervised method to collect speech data (also called “implicit enrollment”, “stealthy enrollment”, or “enrollment free”). These relevant references are included in the attached PTO-892 form. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jialong He, whose telephone number is (571) 270-5359.  The examiner can normally be reached on Monday – Friday, 8:00AM – 4:30PM, EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached on (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for 

/JIALONG HE/
Primary Examiner, Art Unit 2659