DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 22 December 2020 in reference to application 17/131,702. Claims 1-19 are pending and have been examined.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1, 2, 4-6, and 8-19 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Krishnamoorthy et al. (US Patent 10,490,195).

Consider claim 1, Krishnamoorthy teaches a method for enrolling a user in a multi-stage enrollment voice authentication or identification system (abstract), comprising: 
capturing a plurality of speech samples of the user's speech to obtain user speech samples, wherein the user speech samples include the user speaking at least two different sentence types (col 5 line 52 - col 6 line 12, selecting utterances from different domains, i.e., sentence types, to be uttered by the user for enrollment); 
computing feature-space representations for each of the plurality of user speech samples (col 4 lines 5-15, audio data converted to energy information vectors, col 23 lines 40-60, col 25 lines 18-35 for details); 
generating a user enrollment voiceprint by aggregating the feature-space representations (col 4 lines 5-17, generating speaker profile vector for the speaker by combining energy information vectors of multiple utterances); 
associating the user enrollment voiceprint with user information (col 3 lines 50-55, log in credentials, col 2 lines 35-45, personalization information); and 
storing the user enrollment voiceprint and associated user information in a database of enrolled users (col 6 lines 25-35, speaker profile vectors stored for later comparison).
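
For illustration only, the enrollment steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Krishnamoorthy or from the application: the element-wise mean as the aggregation rule, the in-memory dictionary standing in for the database of enrolled users, and the names enroll, database, and user_info are all hypothetical.

```python
def enroll(database, user_info, sample_features):
    """Aggregate per-sample feature vectors into one voiceprint and store it.

    database        -- dict standing in for the database of enrolled users (assumption)
    user_info       -- dict of user information; must contain a "user_id" key (assumption)
    sample_features -- list of equal-length feature vectors, one per speech sample
    """
    if not sample_features:
        raise ValueError("at least one speech sample is required")
    dim = len(sample_features[0])
    if any(len(f) != dim for f in sample_features):
        raise ValueError("all feature vectors must have the same dimension")

    # Aggregation by element-wise mean is an assumption made for illustration;
    # the claim only requires that the representations be aggregated.
    count = len(sample_features)
    voiceprint = [sum(f[i] for f in sample_features) / count for i in range(dim)]

    # Associate the voiceprint with the user information and store both.
    database[user_info["user_id"]] = {"voiceprint": voiceprint, "info": user_info}
    return voiceprint
```

Under these assumptions, calling enroll once per user with the feature vectors of that user's samples leaves one stored voiceprint per enrolled user, keyed by the hypothetical user_id field.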

Consider claim 2, Krishnamoorthy teaches the method of claim 1, wherein the two different sentence types include any two of: (1) a declarative sentence; (2) an imperative sentence; (3) an interrogative sentence; (4) an exclamatory sentence (col 7 lines 60-67, enrollment utterances may include imperative (play classic rock) and interrogatory sentences (is it hot outside)).

Consider claim 4, Krishnamoorthy teaches the method of claim 2, further comprising: 
capturing a speech sample of the user requesting authentication or identification (col 6 lines 26-29, receiving utterance); 
generating a voiceprint for the requesting user (col 6 lines 26-29 converting to voiceprint); 
determining whether a requesting user voiceprint matches any enrolled-user voiceprints in the database of enrolled users (col 6 lines 29-35, comparing vector to stored profiles).

Consider claim 5, Krishnamoorthy teaches the method of claim 4, wherein determining whether the requesting user voiceprint matches any enrolled-user voiceprints in the database of enrolled users further comprises: 
computing correlations between a requesting user voiceprint vector and enrolled-user voiceprint vectors (col 9 lines 45-50, comparison of vectors); and 
comparing the correlations to a threshold (col 9 lines 45-50, comparison to threshold).

Consider claim 6, Krishnamoorthy teaches the method of claim 5, further comprising: 
determining that the requesting user voiceprint and at least one of the enrolled-user voiceprints is a match if one of the correlations exceeds the threshold (col 9 lines 45-50, if match exceeds threshold, then user identified).

Consider claim 8, Krishnamoorthy teaches the method of claim 5, further comprising: 
determining that no match exists between the requesting user voiceprint and the enrolled-user voiceprints if none of the correlations exceeds the threshold (col 29, lines 35-47, if threshold not met, system may indicate that a user could not be identified).
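
For illustration only, the comparison recited in claims 5, 6, and 8 can be sketched as follows. This is a minimal sketch under stated assumptions: cosine similarity is used as one possible "correlation" measure, and the names cosine_similarity and find_matches are hypothetical. Nothing here is represented as the method of Krishnamoorthy or of the application.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def find_matches(request_voiceprint, enrolled, threshold):
    """Return {user_id: score} for every enrolled voiceprint whose score exceeds the threshold.

    enrolled -- dict mapping a hypothetical user_id to that user's voiceprint vector.
    An empty result corresponds to the "no match exists" outcome of claim 8.
    """
    scores = {
        user_id: cosine_similarity(request_voiceprint, voiceprint)
        for user_id, voiceprint in enrolled.items()
    }
    return {user_id: s for user_id, s in scores.items() if s > threshold}
```

A non-empty result corresponds to the match determination of claim 6, and an empty result to the no-match determination of claim 8.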

Consider claim 9, Krishnamoorthy teaches a method for matching a speaker in a multi-stage enrollment voice authentication or identification system (abstract), comprising: 
capturing a plurality of speech samples from the speaker during an enrollment process, the plurality of speech samples including the speaker speaking at least two different sentence types (col 5 line 52 - col 6 line 12, selecting utterances from different domains, i.e., sentence types, to be uttered by the user for enrollment), further comprising: 
generating an enrollment voiceprint for the speaker from the plurality of speech samples to obtain a speaker enrollment voiceprint (col 4 lines 5-17, generating speaker profile vector for the speaker by combining energy information vectors of multiple utterances); 
associating the speaker enrollment voiceprint with information about the speaker gathered during enrollment (col 3 lines 50-55, log in credentials, col 2 lines 35-45, personalization information); 
inputting an input audio signal containing speech samples of the speaker (col 6 lines 26-29, receiving utterance); 
determining a target speech signal by identifying and retaining segments of the input audio signal that contain speech and identifying and discarding segments of the input audio signal which do not contain speech (col 24, lines 1-10, VAD may be used and audio segments with no speech activity may be discarded); 
computing an authentication voiceprint for the speaker from the target speech signal (col 6 lines 26-29 converting to voiceprint) and comparing the speaker authentication voiceprint to a set of one or more enrolled-user voiceprints corresponding to one or more enrolled users to obtain a comparison (col 6 lines 29-35 comparing vector to stored profiles); 
making an output determination based on the comparison to determine whether the speaker is a match with an enrolled user (col 9 lines 45-50, comparison to threshold); and 
making a decision based on whether the speaker is a match (col 6 lines 30-45, command execution may be customized based on user identification).
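
For illustration only, the step in claim 9 of retaining speech segments and discarding non-speech segments can be sketched with a simple energy-based voice activity detector. This is a minimal sketch under stated assumptions: fixed-length frames, a mean-squared-energy criterion, and the name retain_speech are hypothetical, and practical VAD techniques are generally more elaborate. It is not represented as the VAD of Krishnamoorthy.

```python
def retain_speech(samples, frame_len, energy_threshold):
    """Return the concatenation of frames whose mean squared energy exceeds the threshold.

    samples          -- sequence of audio sample values
    frame_len        -- number of samples per frame (assumed fixed)
    energy_threshold -- frames at or below this mean energy are treated as non-speech
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > energy_threshold:
            kept.extend(frame)  # retain segment treated as speech
        # otherwise the frame is discarded as non-speech
    return kept
```

The returned samples play the role of the target speech signal from which the authentication voiceprint is then computed.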

Claim 10 contains similar limitations as claim 2 and is therefore rejected for the same reasons. 

Consider claim 11, Krishnamoorthy teaches the method of claim 9, wherein the set of one or more enrolled-user voiceprints includes the speaker enrollment voiceprint (col 6 lines 25-35, speaker profile vectors stored for later comparison).

Consider claim 12, Krishnamoorthy teaches the method of claim 11, further comprising: 
determining that the comparison is a match between the speaker authentication voiceprint and the speaker enrollment voiceprint (col 6 lines 29-35 comparing vector to stored profiles which were previously enrolled); and
 wherein making a decision based on whether the speaker is a match further comprises making the decision based on there being a positive match (col 6 lines 30-45, command execution may be customized based on user identification if user identified).

Consider claim 13, Krishnamoorthy teaches the method of claim 9, wherein computing an authentication voiceprint for the speaker from the target speech signal further comprises providing the target speech signal as input to a processing system which includes a deep neural network (DNN) (col 25 lines 18-65, using an encoder to compute feature vectors and matching for speaker recognition, see line 60, DNN).

Consider claim 14, Krishnamoorthy teaches the method of claim 13, wherein the deep neural network is configured to compute a vector representation of the target speech signal and wherein the vector representation of the target speech signal is the speaker authentication voiceprint (col 25 lines 18-65, using an encoder to compute feature vectors and matching for speaker recognition).

Consider claim 15, Krishnamoorthy teaches the method of claim 13, further comprising normalizing the output of the deep neural network such that the vector representation has unit norm (col 26 lines 2-6, output of encoder may be speaker recognition; col 29 lines 3-15, user recognition scores may be scaled to appropriate scoring scheme).
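
For illustration only, the "unit norm" limitation of claim 15 can be sketched as scaling a vector by its Euclidean length. This is a minimal sketch under stated assumptions: the choice of the Euclidean (L2) norm and the name l2_normalize are hypothetical, and the sketch is not a characterization of the cited passages.

```python
import math

def l2_normalize(vector):
    """Scale a vector so that its Euclidean (L2) norm equals 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / norm for x in vector]
```

When both vectors being compared have unit norm, their dot product equals their cosine similarity, which is one reason such normalization is commonly applied before a vector comparison.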

Consider claim 16, Krishnamoorthy teaches the method of claim 9, wherein comparing the speaker authentication voiceprint to a set of one or more enrolled-user voiceprints corresponding to one or more enrolled users further comprises computing a correlation between a vector representation of the speaker authentication voiceprint and vector representations of each of the enrolled-user voiceprints (col 9 lines 45-50, comparison of vectors – input vector to those of enrolled speakers).

Consider claim 17, Krishnamoorthy teaches the method of claim 16, further comprising: 
determining that the correlation exceeds a threshold (col 9 lines 45-50, comparison to threshold); and 
determining that the speaker authentication voiceprint is a match with at least one of the enrolled-user voiceprints for which the correlation exceeds a threshold (col 9 lines 45-50, comparison to threshold and match determined if threshold exceeded).

Consider claim 18, Krishnamoorthy teaches the method of claim 16, further comprising: 
determining that the speaker is positively authenticated as an enrolled user (col 9 lines 45-50, comparison to threshold to determine authentication); and 
making a decision based on the authentication of the speaker (col 6 lines 30-45, command execution may be customized based on user identification).

Consider claim 19, Krishnamoorthy teaches the method of claim 16, further comprising: 
determining that the speaker is positively identified as an enrolled user (col 9 lines 45-50, comparison to threshold to determine authentication); and 
making a decision based on the identity of the speaker (col 6 lines 30-45, command execution may be customized based on user identification).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 3 is/are rejected under 35 U.S.C. 103 as being unpatentable over Krishnamoorthy in view of Pearce (US Patent 9,990,926).

Consider claim 3, Krishnamoorthy teaches the method of claim 1, but does not specifically teach wherein the plurality of speech samples includes a free-response example wherein the user is not reading aloud a specified sentence.
In the same field of voice recognition, Pearce teaches wherein the plurality of speech samples includes a free-response example wherein the user is not reading aloud a specified sentence (col 3 lines 45-60, speech for enrollment is collected passively… no enrollment questions asked or prompts given).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use passive enrollment as taught by Pearce in the system of Krishnamoorthy in order to allow the user to enroll without providing specific speech, thereby increasing user convenience.

Claim(s) 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Krishnamoorthy in view of Milavsky et al. (US PAP 2019/032724).

Consider claim 7, Krishnamoorthy teaches the method of claim 6, but does not specifically teach further comprising: 
determining that two or more of the correlations exceeds the threshold; and 
determining that a one of the enrolled-user voiceprints having a maximum correlation is a match to the requesting user voiceprint.
In the same field of speaker identification, Milavsky teaches 
determining that two or more of the correlations exceeds the threshold (0029, determining multiple speakers exceed threshold); and 
determining that a one of the enrolled-user voiceprints having a maximum correlation is a match to the requesting user voiceprint (0029, selecting speaker with the highest confidence score as the matched speaker).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use the highest score when the threshold is exceeded by several comparisons, as taught by Milavsky, in the system of Krishnamoorthy in order to allow for increased accuracy of speaker identification.
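
For illustration only, the selection recited in claim 7 can be sketched as follows. This is a minimal sketch under stated assumptions: the correlations are taken as already computed and supplied as a dictionary keyed by a hypothetical user identifier, and the name best_match is hypothetical. It is not represented as the implementation of Milavsky or of the application.

```python
def best_match(correlations, threshold):
    """Return the user_id with the maximum correlation among those exceeding the threshold.

    correlations -- dict mapping a hypothetical user_id to a precomputed correlation score
    Returns None when no correlation exceeds the threshold.
    """
    above = {user_id: c for user_id, c in correlations.items() if c > threshold}
    if not above:
        return None
    # When two or more candidates exceed the threshold, keep the one with the highest score.
    return max(above, key=above.get)
```

With two or more enrolled users above the threshold, only the one with the highest score is reported as the match.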

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Stafylakis (US PAP 2020/0312337) also teaches training voice recognition with multiple utterances.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451. The examiner can normally be reached 6:30am-5pm Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

DOUGLAS GODBOLD
Examiner
Art Unit 2655

/DOUGLAS GODBOLD/
Primary Examiner, Art Unit 2655