DETAILED ACTION
Claims 1-26 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 2, 3, 16, 17, 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Colibro et al (US 2014/0244257 A1: hereafter – Colibro).
For claim 1, Colibro discloses a processor-implemented method of personalizing a speech recognition model (Colibro: [0014] — used for user-specific speech recognition, [0038] — a processor), the method comprising:
obtaining statistical information of first scaling vectors combined with a base model for speech recognition (Colibro: [0014] — a Universal Background Model (UBM) supervector representing statistical parameters (UBM generally known to represent a base model) indicated by $m$);
obtaining utterance data of a user (Colibro: [0016] — “extract feature coefficients from a received speech signal” from an individual user); and
generating a personalized speech recognition model by modifying a second scaling vector combined with the base model based on the utterance data of the user and the statistical information (Colibro: [0014], [0016] — the equation $s = m + T \cdot w$, which modifies a second scaling vector – $w$, combined with the base model – $m$, to obtain the vector – $s$ that represents the statistical parameters that correspond to the individual user).
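For illustration only, the relation s = m + T·w cited above can be sketched numerically as follows. The symbols s, m, T and w are those of the cited equation; the function name, the dimensions and the numeric values are assumptions chosen for the example and are not taken from Colibro.

```python
def speaker_supervector(m, T, w):
    """Return s = m + T*w for a mean supervector m, a matrix T and a vector w.

    m: list of floats (length D), the UBM mean supervector.
    T: list of D rows, each a list of R floats.
    w: list of floats (length R), the speaker-specific vector.
    """
    if len(T) != len(m):
        raise ValueError("T must have one row per element of m")
    if any(len(row) != len(w) for row in T):
        raise ValueError("each row of T must have one entry per element of w")
    # Each element of s is the base value plus a speaker-dependent offset.
    return [
        m_i + sum(t_ij * w_j for t_ij, w_j in zip(row, w))
        for m_i, row in zip(m, T)
    ]


# Illustrative values only (hypothetical, not from the reference).
m = [1.0, 2.0, 3.0]
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [0.5, -0.5]
s = speaker_supervector(m, T, w)
```

In this sketch, changing only w while holding m and T fixed yields a different s, which is the sense in which the mapping above treats w as the user-specific vector combined with the base model m.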
For claim 2, claim 1 is incorporated and Colibro discloses the method, wherein:
the first scaling vectors correspond to a plurality of speakers (Colibro: [0014] — a UBM supervector (a UBM generally corresponds to a plurality of speakers)); and
the second scaling vector corresponds to the user (Colibro: [0016] — the equation $s = m + T \cdot w$ with $w$ being an i-vector corresponding to the individual user).
For claim 3, claim 1 is incorporated and Colibro discloses the method, wherein the generating comprises:
initializing the second scaling vector (Colibro: [0015] — “the i-vector based speaker verification module 210 employs background statistical parameters, usually determined prior to system deployment, previously generated i-vectors representing speakers’ voice characteristics, and i-vectors generated based on speech signal(s)” (indicating an initialisation of the i-vector taken as the second scaling vector)); and
training the second scaling vector based on the utterance data of the user and the statistical information (Colibro: [0016] — the equation $s = m + T \cdot w$ indicating a training of the second scaling vector using the utterance data of the user and the statistical information).
As for claim 16, computer program product claim 16 and method claim 1 are related as computer program product storing executable instructions required for performing the claimed method steps on a computer. Colibro in [0042] provides teaching for a non-transitory machine-readable medium suitable to read upon this claim. Accordingly, claim 16 is similarly rejected under the same rationale as applied above with respect to method claim 1.
For claim 17, Colibro discloses a processor-implemented method of personalizing a speech recognition model (Colibro: [0014] — used for user-specific speech recognition, [0038] — a processor), the method comprising:
obtaining a base model for speech recognition using speech data corresponding to a plurality of speakers (Colibro: [0014] — a Universal Background Model (UBM) supervector representing statistical parameters (UBM generally known to represent a base model) for a plurality of speakers);
generating statistical information of scaling vectors combined with the base model by applying datasets including the speech data to the scaling vectors (Colibro: [0014], [0016] — the equation $s = m + T \cdot w$, which modifies a second scaling vector – $w$, combined with the base model – $m$, to obtain the vector – $s$ that represents the statistical parameters that correspond to the individual user); and
providing the statistical information to generate a personalized speech recognition model (Colibro: [0014], [0016] — the equation $s = m + T \cdot w$, which modifies a vector – $w$, combined with the base model – $m$, to obtain the vector – $s$ that represents the statistical parameters that correspond to the individual user).
For claim 19, claim 17 is incorporated and Colibro discloses the method of claim 17, wherein the generating of the statistical information comprises:
generating per-speaker datasets based on the speech data (Colibro: [0032] — obtaining representations of voice characteristics associated with corresponding speakers);
training the scaling vectors using each of the per-speaker datasets (Colibro: [0016] — the equation $s = m + T \cdot w$ indicating a training of the second scaling vector using the utterance data of the user and the statistical information); and
generating the statistical information of the scaling vectors based on a result of training the scaling vectors (Colibro: [0014], [0016] — the equation $s = m + T \cdot w$, which modifies a second scaling vector – $w$, combined with the base model – $m$, to obtain the vector – $s$ that represents the statistical parameters that correspond to the individual user).
Claim 25 is rejected under 35 U.S.C. 102(a)(1) as being anticipated by Jagatheesan et al (US 2015/0025890 A1: hereafter – Jagatheesan).
For claim 25, Jagatheesan discloses an apparatus for personalizing a speech recognition model, the apparatus comprising:
a communication interface configured to obtain statistical information of first scaling vectors combined with a base model for speech recognition (Jagatheesan: [0093] — communication interface);
a sensor configured to obtain utterance data of a user (Jagatheesan: [0007] — microphone); and
one or more processors configured to generate a personalized speech recognition model by modifying a second scaling vector combined with the base model based on the a processor).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 7, 8 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Colibro (US 2014/0244257 A1) as applied to claim 1, in view of Cumani et al (US 20140222428 A1: hereafter – Cumani).
For claim 7, Colibro fails to explicitly disclose the limitations of this claim, for which Cumani is now introduced to teach as the method, wherein the statistical information includes either one or both of a mean and a variance generated by approximating a Gaussian distribution of the first scaling vectors corresponding to a plurality of speakers (Cumani: [0018] — a GMM-UBM whereby a statistical background model is represented by a UBM supervector that uses feature vectors from extracted speech signals associated with a plurality of speakers, with the sub-vectors of the supervector representing means of their corresponding Gaussian components).
The reference of Colibro provides teaching for obtaining statistical information of first scaling vectors. It differs from the claimed invention in that the claimed invention further provides that the statistical information includes a mean or variance generated by approximating a Gaussian distribution of the scaling vectors. This isn’t new to the 
For claim 8, claim 7 is incorporated and the combination of Colibro in view of Cumani discloses the method, wherein:
each of the first scaling vectors comprises a plurality of elements (Cumani: [0018] — a supervector comprising a pack stack of sub-vectors); and
the mean and the variance is calculated for each of the plurality of elements (Cumani: [0018] — the subvectors represent means of the corresponding Gaussian component (Gaussian distributions are indicative of both its mean and its variance)).
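For illustration only, a per-element mean and variance over a set of vectors, as recited in the limitation above, can be sketched as follows. The function name and the sample values are assumptions made for the example; the sketch uses the population (maximum-likelihood) variance and is not taken from Cumani.

```python
def per_element_stats(vectors):
    """Return (means, variances), one value per element position.

    vectors: non-empty list of equal-length lists of floats, e.g. one
    vector per speaker. The variance is the population variance, which
    is the maximum-likelihood estimate for a Gaussian fit per element.
    """
    n = len(vectors)
    if n == 0:
        raise ValueError("at least one vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    means = [sum(v[k] for v in vectors) / n for k in range(dim)]
    variances = [
        sum((v[k] - means[k]) ** 2 for v in vectors) / n for k in range(dim)
    ]
    return means, variances


# Two hypothetical speakers, two elements per vector.
vectors = [[1.0, 4.0], [3.0, 4.0]]
means, variances = per_element_stats(vectors)
```

Here the first element varies between the two vectors and the second does not, so the first variance is positive and the second is zero.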
For claim 24, claim 17 is incorporated, and Cumani teaches the method, wherein the generating of the statistical information comprises generating a mean and a variance of the scaling vectors by approximating a Gaussian distribution of the scaling vectors (Cumani: [0018] — a GMM-UBM whereby a statistical background model is represented by a UBM supervector that uses feature vectors from extracted speech signals associated with a plurality of speakers, with the sub-vectors of the supervector representing means of their corresponding Gaussian components (Gaussian distributions are indicative of both its mean and its variance)).
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Colibro (US 2014/0244257 A1) as applied to claim 1, in view of SUMMERFIELD (US 2016/0372116 A1).
For claim 15, claim 1 is incorporated but the reference of Colibro fails to teach the limitations of this claim, for which Summerfield is now introduced to teach as the method, further comprising:
recognizing a speech of the user using the speech recognition model (Summerfield: [0018] — speech recognition models for performing speech recognition of individual users).
The reference of Colibro provides teaching for generating a personalised speech recognition model and also teaches of speech recognition, but differs from the claimed invention in that the claimed invention further provides that the speech recognition model is used to recognise the speech of the user. This isn’t new to the art as the reference of Summerfield is seen to teach above. Hence, at the time the application was effectively filed, one of ordinary skill in the art would have found it obvious to incorporate the teaching of Summerfield into that of Colibro, given the predictable result of providing an improved speech recognition model that is able to adapt to recognising the speaker’s utterances.
Claims 20 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Colibro (US 2014/0244257 A1) as applied to claim 19, in view of Visser et al (US 2019/0341026 A1: hereafter – Visser).
For claim 20, claim 19 is incorporated but the reference of Colibro fails to teach the limitations of this claim, for which Visser is now introduced to teach as the method, wherein:
creating a new label for a speaker, such as ‘Josh,’ which gets associated with the speaker’s utterances); and
the generating of the per-speaker datasets comprises:
classifying per-speaker speech data using the speaker identifier included in the speech data (Visser: [0034] — causing the speaker with the speaker label of ‘Josh’ to provide certain training words to be used to generate a recognition model for the indicated speaker), and
generating the per-speaker datasets using the per-speaker speech data (Visser: [0034] — storing audio data of Josh speaking which may be used to train a speech recognition model to recognise speech by Josh (the trained model being a generated per-speaker dataset of the speaker)).
The reference of Colibro provides teaching for generating per-speaker datasets based on the speech data, but differs from the claimed invention in that the claimed invention further provides teaching for including a speaker identifier with the speech data. This is however not new to the art as the reference of Visser is seen to teach above. Hence, at the time the application was effectively filed, one of ordinary skill in the art would have found it obvious to incorporate the teaching of Visser into that of Colibro, providing labelling for speech data, given the predictable result of adapting speech recognition models to individual speakers.
For claim 21, claim 20 is incorporated and the combination of Colibro in view of Visser discloses the method, wherein the generating of the per-speaker datasets using the per-speaker speech data comprises either one or both of:
generating a single dataset using all the per-speaker speech data (Visser: [0034] — ‘[a]dditionally, the processor 102 may store audio data of Josh speaking as the audio data 124, which may be used as training data to train a speech recognition model to recognize speech by Josh’); and
generating a single dataset using a portion selected at random from the per-speaker speech data.
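For illustration only, the two alternatives recited above (a dataset built from all per-speaker speech data, or from a randomly selected portion of it) can be sketched as follows, after first grouping the speech data by a speaker identifier. The function names, the `fraction` and `seed` parameters and the sample records are assumptions made for the example and are not taken from Visser; only the label ‘Josh’ appears in the cited passage.

```python
import random
from collections import defaultdict


def group_by_speaker(utterances):
    """Group (speaker_id, audio) pairs into a dict of per-speaker lists."""
    grouped = defaultdict(list)
    for speaker_id, audio in utterances:
        grouped[speaker_id].append(audio)
    return dict(grouped)


def sample_portion(items, fraction, seed=None):
    """Return a random portion of items without replacement.

    fraction: share of items to keep, between 0 and 1.
    seed: optional seed so that the selection can be reproduced.
    """
    if not items:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(items) * fraction))
    return rng.sample(items, k)


# Hypothetical records: a speaker label paired with an audio reference.
utterances = [
    ("Josh", "clip_a"),
    ("speaker_2", "clip_b"),
    ("Josh", "clip_c"),
    ("Josh", "clip_d"),
    ("Josh", "clip_e"),
]
per_speaker = group_by_speaker(utterances)
full_dataset = per_speaker["Josh"]                           # all of the data
partial_dataset = sample_portion(full_dataset, 0.5, seed=0)  # random portion
```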
Claim 23 is rejected under 35 U.S.C. 103 as being unpatentable over Colibro (US 2014/0244257 A1) as applied to claim 19, in view of Weinstein et al (US 2015/0039299 A1: hereafter – Weinstein).
For claim 23, claim 19 is incorporated but the reference of Colibro fails to effectively teach the limitation of this claim, for which Weinstein is now introduced to teach as the method, wherein the training comprises training the scaling vectors independently for each of the per-speaker datasets (Weinstein: [0049] — training data including i-vectors; [0059] — discriminative training for individual speaker i-vector (the i-vectors being taken as the scaling vector)).
The reference of Colibro provides teaching for obtaining scaling vectors used in speech recognition, but differs from the claimed invention in that the claimed invention further teaches training scaling vectors independently for each per-speaker dataset. This isn’t new to the art as the reference of Weinstein is shown to teach above. Hence, at the time the application was effectively filed one of ordinary skill in the art would have found it obvious to incorporate the teaching of Weinstein in that of Colibro, given the predictable result of adapting speech recognition to individual speakers.
Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Colibro (US 2014/0244257 A1) in view of Reynolds (U.S. 7,379,868 B2).
For claim 26, Colibro discloses a processor-implemented method of personalizing a speech recognition model (Colibro: [0014] — used for user-specific speech recognition, [0038] — a processor), the method comprising:
obtaining [[distribution variance information of elements of]] first scaling vectors, wherein the first scaling vectors are combined with a speech recognition base model and were previously-trained based on speech datasets of a plurality of speakers (Colibro: [0014] — a Universal Background Model (UBM) supervector representing statistical parameters (UBM generally known to represent a base model for a plurality of speakers) indicated by $m$; [0034] — performing supplemental retraining by using previously used enrolment and adaptation data); and
generating a personalized speech recognition model by training a second scaling vector combined with the base model based on utterance data of a user [[and the distribution variance information]] (Colibro: [0014], [0016] — the equation $s = m + T \cdot w$, which modifies a second scaling vector – $w$, combined with the base model – $m$, to obtain the vector – $s$ that represents the statistical parameters that correspond to the individual user).
The reference of Colibro fails to teach of obtaining distribution variance information of elements of the first scaling vectors. This is however not new to the art as is shown to be taught by Reynolds.
Reynolds teaches implementing a baseline model as Gaussian mixture models, with the speaker model parameters including a variance vector (Reynolds: Col 3 lines 32-40), the background/baseline model being generated from a collective group of speakers (Reynolds: Col 2 lines 4-7).
Hence, at the time the application was effectively filed, one of ordinary skill in the art would have found the incorporation of the teaching of Reynolds into that of Colibro, .
Allowable Subject Matter
Claims 4, 5, 6, 9, 10, 11, 12, 13, 14, 18 and 22 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure:
GAO (US 2021/0020161 A1) teaches about obtaining trained parameters that represent a speaker, wherein the parameters comprise a mean and a standard deviation of a Gaussian probability distribution, the features in a speaker vector having been generated using Gaussian distributions [0066].
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to OLUWADAMILOLA M OGUNBIYI whose telephone number is (571)272-4708. The Examiner can normally be reached on Monday - Thursday (8:00 AM - 5:30 PM Eastern Standard Time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s Supervisor, DANIEL C WASHBURN can be reached on (571)272-5551. The fax phone 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/OLUWADAMILOLA M OGUNBIYI/Examiner, Art Unit 2657

/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657