DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant's arguments with respect to the 35 U.S.C. 103 rejection of claims 1, 6 and 11 have been considered and are found persuasive in view of the amendments, and the rejection has been withdrawn. See the detailed reasons for allowance below.
 
Allowable Subject Matter
Claims 1, 3-6, 8-11 and 13-15 are allowed.
The following is an examiner’s statement of reasons for allowance: Korjani (US 10,706,856) in view of Deng et al. (“Deep Convex Net: A Scalable Architecture for Speech Pattern Classification,” Aug. 28-31, 2011) in view of Vanhoucke (US 8,484,022 B1) teach an audio signal encoding method, comprising: applying an audio signal to a training model including N autoencoders provided in a cascade structure such that the N autoencoders are each connected in series; encoding an output result derived through the training model; and generating a bitstream with respect to the audio signal based on the encoded output result, wherein the training model is derived by connecting the N autoencoders in a cascade form, and training a subsequent autoencoder using a residual signal not learned by a previous
Korjani (US 10,706,856) teaches a speaker identification/verification system comprising at least one feature extractor for extracting a plurality of audio features from speaker voice data, a plurality of speaker-specific subsystems, and a decision module. Each of the speaker-specific subsystems comprises: a neural network configured to generate an estimate of the plurality of extracted audio features based on the plurality of extracted audio features, and an error module. Each of the plurality of neural networks is associated with one of a plurality of speakers, and the one speaker associated with each of the plurality of neural networks is different for all neural networks. The error module is configured to estimate an error based on the plurality of extracted audio features and the estimate of the plurality of extracted audio features generated by the associated neural network. The neural networks are speaker-specific auto-encoders, each trained for one user and therefore calibrated on that particular user's speech. As a result, that speaker-specific neural network is highly tuned for the particular user and out of tune for all other users. Thus, the error associated with the speaker-specific neural network is relatively small for that user and useful for purposes of identification or verification ([Fig. 1] [col. 3 line 1 to col. 4 line 47]).
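For illustration only, the decision logic described for Korjani can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the reference: each speaker-specific auto-encoder is replaced by a hypothetical rank-1 linear auto-encoder, and the names reconstruct, reconstruction_error, identify, verify, and the sample feature values are invented for the example.

```python
def reconstruct(weights, features):
    """Hypothetical stand-in for a speaker-specific auto-encoder:
    encode the features to one scalar, then decode with the same weights."""
    code = sum(w * f for w, f in zip(weights, features))
    return [code * w for w in weights]


def reconstruction_error(weights, features):
    """Error module: squared distance between the features and their estimate."""
    estimate = reconstruct(weights, features)
    return sum((f - e) ** 2 for f, e in zip(features, estimate))


def identify(models, features):
    """Decision module (identification): pick the speaker whose model
    reconstructs the features with the smallest error."""
    errors = {name: reconstruction_error(w, features) for name, w in models.items()}
    return min(errors, key=errors.get), errors


def verify(models, claimed_speaker, features, threshold):
    """Decision module (verification): accept only if the claimed speaker's
    model reconstructs the features with an error below the threshold."""
    return reconstruction_error(models[claimed_speaker], features) < threshold


# Assumed toy models: one weight vector per enrolled speaker.
models = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
features = [0.9, 0.1]

speaker, errors = identify(models, features)  # speaker is "alice"
accepted = verify(models, "bob", features, threshold=0.05)  # False
```

The model tuned to the matching speaker yields the smallest error, which is the property the reference relies on for identification or verification.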
Deng et al. (“Deep Convex Net: A Scalable Architecture for Speech Pattern Classification,” Aug. 28-31, 2011) teaches a context-dependent DNN-HMM for large vocabulary speech recognition, and a plurality of stacked DCN modules, each module having bottom, hidden and top linear layers ([Fig. 1] [Section 2-3]).
Vanhoucke (US 8,484,022) teaches that training is accomplished by computing an error signal 1005 in the form $\Delta = X - \tilde{X}$, and providing it to the auto-encoder 1000 as feedback during a training process carried out over a body of training data, representing a variety of examples of input X. By adjusting the auto-encoder 1000 so as to minimize the magnitude of the error signal 1005 (or reduce ([col. 27 lines 41-60]).
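For illustration only, training against an error signal of the form X minus its reconstruction can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the reference: the auto-encoder is a hypothetical rank-1 linear model with separate encoder and decoder weights, it is adjusted by plain batch gradient descent, and all function names, the learning rate, and the training data are invented for the example.

```python
def encode(enc, x):
    """Compress the input vector to a single scalar code."""
    return sum(e * xi for e, xi in zip(enc, x))


def error_signal(enc, dec, x):
    """Delta = X - X_tilde, where X_tilde is the decoded reconstruction."""
    code = encode(enc, x)
    return [xi - code * d for xi, d in zip(x, dec)]


def mean_squared_magnitude(enc, dec, data):
    """Average squared magnitude of the error signal over the training data."""
    total = 0.0
    for x in data:
        total += sum(d * d for d in error_signal(enc, dec, x))
    return total / len(data)


def train(enc, dec, data, learning_rate=0.05, epochs=200):
    """Adjust the weights to shrink the error signal (batch gradient descent)."""
    enc, dec = list(enc), list(dec)
    n = len(data)
    for _ in range(epochs):
        grad_enc = [0.0] * len(enc)
        grad_dec = [0.0] * len(dec)
        for x in data:
            code = encode(enc, x)
            delta = error_signal(enc, dec, x)
            delta_dot_dec = sum(dl * d for dl, d in zip(delta, dec))
            for j in range(len(x)):
                # Derivatives of |Delta|^2 with respect to each weight.
                grad_dec[j] += -2.0 * code * delta[j] / n
                grad_enc[j] += -2.0 * delta_dot_dec * x[j] / n
        enc = [e - learning_rate * g for e, g in zip(enc, grad_enc)]
        dec = [d - learning_rate * g for d, g in zip(dec, grad_dec)]
    return enc, dec


# Assumed training data: examples of X that all lie along one direction.
data = [[0.6 * t, 0.8 * t] for t in (-2.0, -1.0, 1.0, 2.0)]
initial_enc, initial_dec = [0.5, 0.5], [0.5, 0.5]

initial_loss = mean_squared_magnitude(initial_enc, initial_dec, data)
enc, dec = train(initial_enc, initial_dec, data)
final_loss = mean_squared_magnitude(enc, dec, data)
```

Because the toy data lie along a single direction, a one-scalar code suffices and the magnitude of the error signal can be driven close to zero.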
The difference between the prior art and the claimed invention is that neither Korjani, Deng, nor Vanhoucke teaches wherein the training model is trained for an initialization of the training model based on a process training each baseline model in a first round, and fine-tuning all training models at the same time in a second round.
Therefore, it would not have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Korjani, Deng and Vanhoucke to include wherein the training model is trained for an initialization of the training model based on a process training each baseline model in a first round, and fine-tuning all training models at the same time in a second round. Accordingly, the claimed invention is deemed novel.
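For illustration only, the distinguishing two-round training over a cascade of auto-encoders can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: each stage is a hypothetical rank-1 linear auto-encoder, gradients are approximated by central finite differences to keep the example short, and every name, hyperparameter, and data value is invented for the example.

```python
def stage_output(stage, signal):
    """One hypothetical rank-1 linear auto-encoder: encode to a scalar, decode back."""
    enc, dec = stage
    code = sum(e * s for e, s in zip(enc, signal))
    return [code * d for d in dec]


def cascade_residual(stages, signal):
    """Run the stages in series; each stage sees the residual left by the
    stages before it. Returns the residual after the last stage."""
    residual = list(signal)
    for stage in stages:
        estimate = stage_output(stage, residual)
        residual = [r - e for r, e in zip(residual, estimate)]
    return residual


def mean_residual_energy(stages, data):
    """Average squared magnitude of the final residual over the data."""
    total = 0.0
    for x in data:
        total += sum(r * r for r in cascade_residual(stages, x))
    return total / len(data)


def descend(stages, trainable, data, learning_rate=0.2, steps=300, h=1e-5):
    """Gradient descent on the residual energy, updating only the stages whose
    indices are listed in `trainable` (finite-difference gradients)."""
    stages = [[list(enc), list(dec)] for enc, dec in stages]  # work on a copy
    slots = [(i, part, j) for i in trainable for part in (0, 1)
             for j in range(len(stages[i][part]))]
    for _ in range(steps):
        grads = []
        for i, part, j in slots:
            original = stages[i][part][j]
            stages[i][part][j] = original + h
            loss_plus = mean_residual_energy(stages, data)
            stages[i][part][j] = original - h
            loss_minus = mean_residual_energy(stages, data)
            stages[i][part][j] = original
            grads.append((loss_plus - loss_minus) / (2 * h))
        for (i, part, j), g in zip(slots, grads):
            stages[i][part][j] -= learning_rate * g
    return stages


def train_two_rounds(stages, data):
    # Round 1: train each stage in turn on the residual left by the
    # already-trained stages, keeping those earlier stages frozen.
    trained = []
    for stage in stages:
        trained = descend(trained + [stage], [len(trained)], data)
    round1 = trained
    # Round 2: fine-tune every stage at the same time on the final residual.
    round2 = descend(round1, list(range(len(round1))), data)
    return round1, round2


# Assumed toy data and initial weights for a two-stage cascade.
data = [[1.0, 0.2], [0.8, -0.1], [-0.9, 0.1],
        [-1.0, -0.3], [0.1, 0.5], [-0.2, -0.4]]
initial_stages = [([0.3, 0.3], [0.3, 0.3]), ([0.3, -0.3], [0.3, -0.3])]

round1, round2 = train_two_rounds(initial_stages, data)
input_energy = mean_residual_energy([], data)   # no stages: residual is the input
loss_round1 = mean_residual_energy(round1, data)
loss_round2 = mean_residual_energy(round2, data)
```

In this sketch the first round serves only to initialize the cascade stage by stage, and the second round starts from that initialization and updates all stages jointly.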
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHREYANS A PATEL whose telephone number is (571)270-0689. The examiner can normally be reached Monday-Friday 8am-5pm PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

SHREYANS A. PATEL
Examiner
Art Unit 2657



/SHREYANS A PATEL/Examiner, Art Unit 2656