DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 3/8/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Status of Claims
Claims 1-20 are pending in this application.  

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-3, 7, 11-13 and 16-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ramasubramanian (U.S. Patent 10,706,857).
As per claims 1, 11 and 16, Ramasubramanian discloses:
A system comprising: 
at least one processor (Column 6, line 63 – Column 7, line 8 – the invention is embodied via a processor and software stored in a computer memory); and 
a memory storing instructions that, when executed by the at least one processor, cause the system to perform a method (Column 6, line 63 – Column 7, line 8 – the invention is embodied via a processor and software stored in a computer memory) comprising: 
providing audio waveform data that corresponds to a voice sample to a temporal convolutional network for evaluation, wherein the temporal convolutional network pre-processes the audio waveform data and outputs an identity embedding associated with the audio waveform data (Figure 1, item 2 and Column 11, line 54 – Column 12, line 16 – audio of a speaker is fed into the multi time frequency resolution CNN module); 
obtaining the identity embedding associated with the voice sample from the temporal convolutional network (Figure 1, items 2 & 4 and Column 11, line 54 – Column 12, line 16 – the 2D CNN layers output a flattened feature vector); and 
determining information describing a speaker associated with the voice sample based at least in part on the identity embedding (Figure 1, item 6 and Column 11, line 54 – Column 12, line 16 – the classifier determines which speaker or class of speaker spoke in the audio).
Claim 1 is directed to the method of using the system of claim 11, so is rejected for similar reasons.
Claim 16 is directed to a non-transitory computer readable medium, which could be a memory, containing instructions causing a processor to perform as the system of claim 11, so is rejected for similar reasons.

As per claims 2, 12 and 17, Ramasubramanian teaches all of the limitations of claims 1, 11 and 16 above. Ramasubramanian further discloses:
determining, by the computing system, an identity of the speaker associated with the voice sample based at least in part on the identity embedding (Figure 1, item 6 and Column 11, line 54 – Column 12, line 16 – the classifier determines which speaker spoke in the audio).

As per claims 3, 13 and 18, Ramasubramanian teaches all of the limitations of claims 1, 11 and 16 above. Ramasubramanian further discloses:
determining, by the computing system, that the speaker associated with the voice sample matches a known speaker based at least in part on the identity embedding (Figures 1 & 2 and Column 11, line 54 – Column 12, line 45 – the output of the system can either be speaker identification or speaker verification (i.e. verifying the speaker is a known speaker)).
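For illustration only, the verification step recited in this limitation can be sketched as a similarity test between two identity embeddings. The cosine-similarity measure, the function names, and the 0.8 threshold are assumptions made for the sketch; they are not drawn from Ramasubramanian or from the claims, and the embeddings are assumed to be non-zero vectors of equal length.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def matches_known_speaker(embedding, enrolled_embedding, threshold=0.8):
    """Hypothetical verification rule: accept when the similarity between the
    sample embedding and the enrolled (known) speaker's embedding meets the
    threshold. The threshold value is an assumption."""
    return cosine_similarity(embedding, enrolled_embedding) >= threshold
```

Under this sketch, an embedding pointing in the same direction as the enrolled embedding is accepted, and an orthogonal embedding is rejected.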

As per claim 7, Ramasubramanian teaches all of the limitations of claim 1 above. Ramasubramanian further discloses:
the temporal convolutional network embeds the voice sample in a continuous high-dimensional identity space (Figure 3 and Column 13, line 55 – Column 15, line 34).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 4, 14 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Ramasubramanian (U.S. Patent 10,706,857) in view of Zhang et al. (Non-Patent Literature “Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings”).
As per claims 4, 14 and 19, Ramasubramanian discloses all of the limitations of claims 3, 13 and 18 above. Ramasubramanian further discloses:
the temporal convolutional network is trained based on a triplet loss function (Column 13, lines 22 – the system can be trained using a triplet loss function).
Ramasubramanian fails to explicitly disclose, but Zhang et al. in the same field of endeavor teaches:
a triplet loss function that evaluates a plurality of triplets, wherein a triplet includes an anchor voice sample, a positive voice sample that has a threshold level of similarity to the anchor voice sample, and a negative voice sample that does not have a threshold level of similarity to the anchor voice sample (Section IIB – Triplet Loss – anchor embeddings are more similar to the positive embeddings than to the negative embeddings).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the system, method and computer readable medium of Ramasubramanian with the triplet loss function of Zhang et al. because it is a case of simple substitution of one known element for another to obtain predictable results.
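For illustration only, a triplet loss of the general kind recited in the limitation above can be sketched as follows. The Euclidean distance, the function names, and the margin value of 0.2 are assumptions made for the sketch and are not taken from Zhang et al. or Ramasubramanian.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least the margin (an assumed value)."""
    d_ap = euclidean_distance(anchor, positive)
    d_an = euclidean_distance(anchor, negative)
    return max(0.0, d_ap - d_an + margin)
```

The loss is zero when the anchor is already closer to the positive sample than to the negative sample by the margin, and grows as the negative sample moves closer to the anchor than the positive sample.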

Claims 5, 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ramasubramanian (U.S. Patent 10,706,857) and Zhang et al. (Non-Patent Literature “Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings”) in view of Turpault et al. (Non-Patent Literature “SEMI-SUPERVISED TRIPLET LOSS BASED LEARNING OF AMBIENT AUDIO EMBEDDINGS”).
As per claims 5, 15 and 20, the combination of Ramasubramanian and Zhang et al. discloses all of the limitations of claims 4, 14 and 19 above. The combination fails to disclose but Turpault et al. in the same field of endeavor teaches:
a triplet of voice samples is selected based on semi-hard triplet mining (Section 2.3 - semi-hard mining is used to select the audio samples).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the system, method and computer readable medium of Ramasubramanian and Zhang et al. with the triplet mining of Turpault et al. because it is a case of combining prior art elements according to known methods to yield predictable results.
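For illustration only, semi-hard triplet mining is commonly understood as selecting negatives that are farther from the anchor than the positive, but still within the margin. The sketch below follows that common definition; the function names, the Euclidean distance, and the margin value are assumptions and are not taken from Turpault et al.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def semi_hard_negatives(anchor, positive, candidates, margin=0.2):
    """Return the candidate negatives that are semi-hard for this anchor:
    farther from the anchor than the positive, but by less than the margin,
    i.e. d(a, p) < d(a, n) < d(a, p) + margin."""
    d_ap = euclidean_distance(anchor, positive)
    return [
        c for c in candidates
        if d_ap < euclidean_distance(anchor, c) < d_ap + margin
    ]
```

Negatives closer to the anchor than the positive (hard negatives) and negatives beyond the margin (easy negatives) are both excluded by this rule.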

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Ramasubramanian (U.S. Patent 10,706,857) in view of Qian et al. (Non-Patent Literature “SoftTriple Loss: Deep Metric Learning Without Triplet Sampling”, listed in IDS dated 3/8/2021).
As per claim 6, Ramasubramanian discloses all of the limitations of claim 1 above. Ramasubramanian fails to disclose but Qian et al. in the same field of endeavor teaches:
the temporal convolutional network is trained based on at least one of: contrastive loss, lifted structures loss, N-pair loss, angular loss, or SoftTriple loss (Abstract – the invention utilizes SoftTriple loss).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the system, method and computer readable medium of Ramasubramanian with the SoftTriple loss of Qian et al. because it is a case of combining prior art elements according to known methods to yield predictable results.
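For illustration only, the core of a SoftTriple-style loss can be sketched as a softmax-type loss in which each class has several learnable centers and the similarity to a class is a softmax-weighted average over that class's centers. The sketch omits the center regularizer, and the function names and the parameter values (lam, gamma, delta) are assumptions rather than a reproduction of the implementation in Qian et al.

```python
import math

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def relaxed_similarity(x, centers, gamma=0.1):
    """Softmax-weighted average of the similarities between embedding x and
    the centers of one class."""
    sims = [dot(x, w) for w in centers]
    peak = max(s / gamma for s in sims)  # subtract the peak for stability
    weights = [math.exp(s / gamma - peak) for s in sims]
    total = sum(weights)
    return sum(w / total * s for w, s in zip(weights, sims))

def soft_triple_loss(x, class_centers, label, lam=20.0, gamma=0.1, delta=0.01):
    """Negative log-probability of the true class, with a margin delta
    subtracted from the true class's relaxed similarity."""
    logits = []
    for c, centers in enumerate(class_centers):
        s = relaxed_similarity(x, centers, gamma)
        logits.append(lam * (s - delta if c == label else s))
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]
```

Because the class centers stand in for sampled positives and negatives, a loss of this form needs no explicit triplet sampling step.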

Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Ramasubramanian (U.S. Patent 10,706,857) in view of Alzantot et al. (Non-Patent Literature “Deep Residual Neural Networks for Audio Spoofing Detection”).
As per claim 8, Ramasubramanian discloses all of the limitations of claim 1 above. Ramasubramanian fails to disclose but Alzantot et al. in the same field of endeavor teaches:
the temporal convolutional network includes a plurality of temporal convolutional neural residual blocks (TCN blocks) that pre-process the audio waveform data (Section 3.2 and Figures 1 & 2 – each of the convolutional residual blocks contains 32 filters).

As per claim 9, the combination of Ramasubramanian and Alzantot et al. discloses all of the limitations of claim 8 above. Alzantot et al. in the combination further discloses:
a TCN block applies a set of temporal causal convolutions to the audio waveform data (Section 3.2 and Figures 1 & 2 – each of the convolutional residual blocks contains 32 filters).
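For illustration only, a temporal causal convolution is generally understood as a convolution in which the output at time t depends only on the input at time t and earlier. The sketch below shows that property for a single one-dimensional filter using left zero-padding; the function name and the padding choice are assumptions and are not taken from Alzantot et al. or from the application.

```python
def causal_conv1d(signal, kernel):
    """Causal 1-D convolution: out[t] = sum_j kernel[j] * signal[t - j],
    treating samples before the start of the signal as zero."""
    k = len(kernel)
    # Left-pad with k - 1 zeros so no output ever reads a future sample.
    padded = [0.0] * (k - 1) + list(signal)
    return [
        sum(kernel[j] * padded[t + k - 1 - j] for j in range(k))
        for t in range(len(signal))
    ]
```

Changing a later sample of the input leaves all earlier outputs unchanged, which is the property that distinguishes a causal convolution from an ordinary centered one.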

Allowable Subject Matter
Claim 10 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Examiner Notes
The Examiner cites particular columns and line numbers in the references as applied to the claims above for the convenience of the Applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or as disclosed by the Examiner.
Communications via Internet e-mail are at the discretion of the applicant and require written authorization. Should the Applicant wish to communicate via e-mail, including the following paragraph in their response will allow the Examiner to do so:
“Recognizing that Internet communications are not secure, I hereby authorize the USPTO to communicate with me concerning any subject matter of this application by electronic mail. I understand that a copy of these communications will be made of record in the application file.”
Should e-mail communication be desired, the Examiner can be reached at Edwin.Leland@USPTO.gov

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWIN S LELAND III whose telephone number is (571)270-5678. The examiner can normally be reached 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy Goddard, can be reached at (571) 272-7773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/EDWIN S LELAND III/Primary Examiner, Art Unit 2677