Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Objections
1.	Claim 14 is objected to because of the following informality: claim 14 recites “The method of claim 1,” which should read “The method of claim 8.” Appropriate correction is required.
Claim Rejections - 35 USC § 103
2.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

3.	Claims 1-3, 7-10, and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over the submitted prior art references Jianfeng et al. (“Training Multi-Task Adversarial Network For Extracting Noise-Robust Speaker Embedding”) in view of Xin et al. (“Channel adversarial training for cross channel text independent speaker recognition”).
As to claims 1 and 15, Jianfeng teaches a system and a computer readable medium comprising: a processing unit; and a memory storage device including program code that when executed by the processing unit (abstract – an encoder, a classifier, and a discriminator; speech recognition systems that obviously comprise a processor and memory storage including program code in order to perform multi-task training and extract noise-robust speaker embedding) causes the system to: instantiate a feature extractor to receive speech frames and extract features from the speech frames based on first parameters of the feature extractor (Fig. 1, p. 1 section I – an encoder extracts noise-robust speaker embedding); instantiate a speaker classifier to identify a speaker based on the features extracted by the feature extractor and on second parameters of the speaker classifier (Fig. 1, p. 1 section I – a classifier that classifies the speakers); instantiate a condition classifier to identify a noise condition based on the features extracted by the feature extractor and on third parameters of the condition classifier (Fig. 1; p. 1 section I – a discriminator that discriminates the noise type of the speaker embedding; p. 2 section 2.2 – the discriminator is just an output layer with M (the number of noise types in training data) nodes); determine a speaker classification loss associated with the speaker classifier (Fig. 1 – Cross Entropy Loss and p. 2 section 2.3 Loss Function); determine a condition classification loss associated with the condition classifier (section 2.3 – Cross Entropy Loss and FL-Loss or Anti-Loss; section 2.3 Ladv according to equation 1); train the first parameters of the feature extractor to maximize the condition classification loss (equation 1, maxE of Ladv); and train the third parameters of the condition classifier to minimize the condition classification loss (section 2.3 minD of Ladv according to equation 1). Jianfeng does not explicitly discuss 
	Xin teaches training the first parameters of the feature extractor and the second parameters of the speaker classifier to minimize the speaker classification loss (p. 3, 1st column, Section 4 – optimizing “theta sub G” by minimizing the speaker label prediction loss; equation 9).
	It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of Xin into the teachings of Jianfeng for the purpose of using a cross-entropy loss function as a measure of performance, with the general aim of the training being to minimize that loss function, as is generally known in the field of neural networks.
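The allocation of training objectives relied upon in the combination above can be written compactly as follows. This is only an illustrative summary of the claim limitations as mapped; the symbols are assumptions introduced here and are not the notation of Jianfeng, Xin, or the claims: \theta_f, \theta_y, and \theta_c denote the first, second, and third parameters (feature extractor, speaker classifier, condition classifier), and L_spk and L_cond denote the speaker and condition classification losses.

```latex
\begin{aligned}
(\hat{\theta}_f, \hat{\theta}_y) &= \arg\min_{\theta_f,\,\theta_y} L_{\mathrm{spk}}(\theta_f, \theta_y) \\
\hat{\theta}_f &= \arg\max_{\theta_f} L_{\mathrm{cond}}(\theta_f, \theta_c) \\
\hat{\theta}_c &= \arg\min_{\theta_c} L_{\mathrm{cond}}(\theta_f, \theta_c)
\end{aligned}
```

The first line corresponds to the limitation for which Xin is relied upon; the second and third lines correspond to the limitations mapped to equation 1 of Jianfeng.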
	As to claims 2, 9, and 16, Jianfeng does not explicitly discuss the system of claim 1, the method of claim 8, and the medium of claim 15, wherein training of the first parameters of the feature extractor and the second parameters of the speaker classifier to minimize the speaker classification loss, training of the first parameters of the feature extractor to maximize the condition classification loss, and training of the third parameters of the condition classifier to minimize the condition classification loss occur substantially simultaneously. However, Jianfeng teaches that the whole framework of Fig. 1 is implemented by a single network (multi-task adversarial network) to learn the network parameters at the same time and simultaneously improve its noise robustness. It would have been obvious to train the first parameters of the feature extractor and the second parameters of the speaker classifier to minimize the speaker classification loss, to train the first parameters of the feature extractor to maximize the condition classification loss, 
As to claims 3, 10, and 17, Jianfeng does not explicitly discuss the system of claim 1, the method of claim 8, and the medium of claim 15, wherein identification of a noise condition comprises determination of a posterior associated with each of a plurality of noise environments, and wherein determination of a speaker classification loss comprises determination of a posterior associated with each of a plurality of test speakers. However, Jianfeng teaches at least a posterior (p. 2 section 2.2 – the probability of assigning a correct label), and classifier and discriminator networks generally output posterior probabilities. It would have been obvious that the classifier and discriminator layers of Jianfeng output respective posteriors.
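The statement that classifier and discriminator output layers generally produce posterior probabilities can be illustrated with a minimal sketch. It assumes a softmax output layer and a cross-entropy loss, which is a common arrangement but is not confirmed here as the exact implementation of either reference; the function names and the example scores are hypothetical.

```python
import math

def posteriors(logits):
    """Softmax: convert raw output-layer scores into posterior probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(post, true_index):
    """Cross-entropy loss: negative log posterior of the correct label."""
    return -math.log(post[true_index])

# Hypothetical scores: three candidate speakers, two noise environments.
speaker_post = posteriors([2.0, 0.5, -1.0])  # one posterior per speaker
noise_post = posteriors([0.1, 0.1])          # one posterior per noise type

speaker_loss = cross_entropy(speaker_post, 0)
noise_loss = cross_entropy(noise_post, 0)
```

Each list holds one posterior per class and sums to one, so the loss for a given label is computed directly from the posterior assigned to that label.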
As to claims 7 and 14, Jianfeng does not explicitly discuss the system of claim 1 and the method of claim 8, further comprising: input speech frames of a plurality of enrollment speakers to the feature extractor to extract features associated with each of the plurality of enrollment speakers based on the trained first parameters; input speech frames of a test speaker to the feature extractor to extract features associated with the test speaker based on the trained first parameters; and determine an identity of the test speaker based on similarities between the features associated with the test speaker and the features associated with each of the plurality of enrollment speakers. However, Jianfeng teaches that the i-vector system achieved significant success in modeling speaker identity and channel variability in the i-vector space, and that the use of neural networks to verify speakers has been explored for its potential capability in speaker recognition tasks (p. 1 section 
As to claim 8, Jianfeng teaches a computer-implemented method comprising: receiving speech frames at a feature extractor capable of extracting features from the speech frames based on first parameters of the feature extractor (Fig. 1, p. 1 section I – an encoder extracts noise-robust speaker embedding); receiving features extracted by the feature extractor at a speaker classifier capable of identifying a speaker based on the received features and on second parameters of the speaker classifier (Fig. 1, p. 1 section I – a classifier that classifies the speakers); receiving features extracted by the feature extractor at a condition classifier capable of identifying a noise condition based on the received features and on third parameters of the condition classifier (Fig. 1; p. 1 section I – a discriminator that discriminates the noise type of the speaker embedding; p. 2 section 2.2 the discriminator is just an output layer with M (the number of noise types in training data) nodes); training the first parameters of the feature extractor to maximize a condition classification loss associated with the condition classifier (equation 1, maxE of Ladv); and training the third parameters of the condition classifier 
Xin teaches training the first parameters of the feature extractor and the second parameters of the speaker classifier to minimize the speaker classification loss (p. 3, 1st column, Section 4 – optimizing “theta sub G” by minimizing the speaker label prediction loss; equation 9).
It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of Xin into the teachings of Jianfeng for the purpose of using a cross-entropy loss function as a measure of performance, with the general aim of the training being to minimize that loss function, as is generally known in the field of neural networks.
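The three training objectives recited in claim 8 can be sketched as a single parameter update. This is a minimal illustration under stated assumptions, not the procedure of either reference: gradients are taken as already computed, plain gradient descent is used, and the function name, the dictionary keys, the learning rate lr, and the weight lam are all hypothetical.

```python
def adversarial_step(theta_f, theta_y, theta_c, grads, lr=0.1, lam=1.0):
    """Apply one simultaneous update to all three parameter sets.

    grads maps (hypothetical keys):
      "spk_f"  -> gradient of the speaker loss w.r.t. the feature extractor
      "spk_y"  -> gradient of the speaker loss w.r.t. the speaker classifier
      "cond_f" -> gradient of the condition loss w.r.t. the feature extractor
      "cond_c" -> gradient of the condition loss w.r.t. the condition classifier
    """
    # Feature extractor: descend the speaker loss, ascend the condition loss
    # (the sign of the condition gradient is reversed).
    new_f = [p - lr * (gs - lam * gc)
             for p, gs, gc in zip(theta_f, grads["spk_f"], grads["cond_f"])]
    # Speaker classifier: descend the speaker loss.
    new_y = [p - lr * g for p, g in zip(theta_y, grads["spk_y"])]
    # Condition classifier: descend the condition loss.
    new_c = [p - lr * g for p, g in zip(theta_c, grads["cond_c"])]
    return new_f, new_y, new_c

# Hypothetical one-dimensional example.
example_grads = {"spk_f": [2.0], "spk_y": [1.0], "cond_f": [1.0], "cond_c": [4.0]}
new_f, new_y, new_c = adversarial_step([1.0], [0.0], [2.0], example_grads, lr=0.5)
```

Because the condition-loss gradient enters the feature extractor update with the opposite sign, the same step moves the extractor toward a higher condition classification loss while the condition classifier moves toward a lower one.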
Allowable Subject Matter
4.	Claims 4, 11, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. Claims 5-6, 12-13, and 19-20 are objected to because they depend on objected claims 4, 11, and 18, respectively.
	As to dependent claims 4, 11, and 18, the prior art of record fails to teach or render obvious, alone or in combination, the system of claim 1, the method of claim 8, and the medium of claim 15 further comprising: receiving features extracted by the feature extractor at a condition valuation network; training the first parameters of the extractor to maximize a condition regression loss associated with the condition valuation network; 
Conclusion
5.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Roh et al. (2020/0168230) teaches a method and apparatus for processing voice data of speech.

6.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Quynh H. Nguyen whose telephone number is (571)272-7489.  The examiner can normally be reached on Monday-Friday 7AM-3PM.  If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar can be reached on 571-272-7488.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Any response to this action should be mailed to:
                        Commissioner of Patents and Trademarks
                        P.O. Box 1450
                        Alexandria, VA  22313-1450

Or faxed to:

                    (571) 273-8300, for formal communications intended for entry and for 
                          informal or draft communications (please label these “PROPOSED” or “DRAFT”).
                             
 Hand-delivered responses should be brought to: 

                         Customer Service Window 
                         Randolph Building 
                         401 Dulany Street 
                         Alexandria, VA 22314



/QUYNH H NGUYEN/Primary Examiner, Art Unit 2652