DETAILED ACTION
Status of the Claims
This action is in response to the application filed on 5/21/2018 for application 15/984,754. Claims 1 – 20 are pending and have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 5/21/2018 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

The information disclosure statement filed on 4/6/2020 fails to comply with 37 CFR 1.97(c) because it lacks a statement as specified in 37 CFR 1.98. Applicant fails to provide an explanation of the relevance of multiple foreign references, and applicant does not provide English-language translations. The references have been placed in the application file, but the information referred to therein has not been fully considered.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA  35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA  35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.


Claims 10 – 16 are rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends.
Claim 10 recites an apparatus according to Claim 8; however, Claim 10 does not further limit Claim 8, because Claim 8 already inherits the limitations recited in Claim 10 through its dependency on Claim 2. Claims 11 – 16 have a similar deficiency.
Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
Upon review, the dependency of Claim 10 appears to be a typographical error. Changing the dependency from Claim 8 to Claim 9 would overcome the rejection. For examination purposes, the examiner has interpreted Claim 10 as depending on Claim 9.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 9 and 17 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Caruana, Model Compression, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2006.

Regarding Claim 1, Caruana discloses: A method for building a machine learning based network model (Caruana, sec. 1, para. 3, ln. 3 – 4, where we show how to train compact artificial neural nets), comprising:
obtaining, by processing circuitry of an information processing apparatus, a data processing procedure of a first network model (Caruana, sec. 3, where the experiment is carried out on the UCI Repository using a computing device with instructions) and a reference dataset that is generated by the first network model in the data processing procedure, the data processing procedure including a first data processing step; building, in a second network model, a first sub-network to perform the first data processing step, the second network model being the machine learning based network model of a neural network type; and performing optimization training on the first sub-network by using the reference dataset (Caruana, sec. 1, para. 3, ln. 4 – 6, where a compact artificial neural net [first sub-network in second network model] is trained to mimic the function [first data processing step] learned by ensemble [first network model] selection; ln. 11 – 17, where instead of training the neural net on the original training set used to train the ensemble, we use the ensemble to label a large unlabeled data set and then train the neural net [performing optimization training] on this much larger, ensemble labeled data set [a reference dataset that is generated by the first network model]).
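For illustration only, the mapping above can be sketched in code. The following is a minimal sketch under stated assumptions, not Caruana's experimental code: a hypothetical three-voter ensemble stands in for the first network model, a single sigmoid unit stands in for the compact neural net, and all function names, thresholds, and hyperparameters are invented for the example. It shows the ensemble labeling unlabeled inputs to produce a reference dataset, and the compact model being trained on that dataset.

```python
import math
import random

def teacher_predict(x):
    """Hypothetical 'first network model': majority vote of three threshold voters."""
    votes = [x > 0.4, x > 0.5, x > 0.6]
    return 1.0 if sum(votes) >= 2 else 0.0

def build_reference_dataset(n, seed=0):
    """Label unlabeled inputs with the teacher to obtain the reference dataset."""
    rng = random.Random(seed)
    inputs = [rng.random() for _ in range(n)]
    return [(x, teacher_predict(x)) for x in inputs]

def train_student(dataset, epochs=500, lr=2.0):
    """Fit a single sigmoid unit (the compact 'student') by full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(dataset)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in dataset:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x   # gradient of the cross-entropy loss w.r.t. w
            grad_b += (p - y)       # gradient of the cross-entropy loss w.r.t. b
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def student_predict(params, x):
    """Threshold the student's pre-activation at zero."""
    w, b = params
    return 1.0 if w * x + b > 0 else 0.0

reference = build_reference_dataset(500)
student = train_student(reference)
agreement = sum(student_predict(student, x) == y for x, y in reference) / len(reference)
```

The value `agreement` measures how often the compact model reproduces the ensemble's labels on the reference dataset, which is the sense of "mimic" used in the mapping.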

Regarding Claim 9, Claim 9 is the apparatus claim corresponding to Claim 1. Caruana further discloses: an apparatus comprising processing circuitry (Caruana, sec. 3, where the experiment is carried out on the UCI Repository using a computing device with instructions).
Claim 9 is rejected for the same reasons as Claim 1.

Regarding Claim 17, Claim 17 is the non-transitory computer-readable medium claim corresponding to Claim 1. Caruana further discloses: a non-transitory computer-readable medium storing instructions which, when executed by a computer, cause the computer to perform the method (Caruana, sec. 3, where the experiment is carried out on the UCI Repository using a computing device with instructions stored in memory).
Claim 17 is rejected for the same reasons as Claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2 – 3, 5, 10 – 11, 13 and 18 – 19 are rejected under 35 U.S.C. 103 as being unpatentable over Caruana, Model Compression, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2006, in view of Berniker, Deep networks for Motor Control Functions, Frontiers in Computational Neuroscience, Vol 9, 2015, further in view of Makhzani, Adversarial Autoencoders, arXiv, May 2016.

Regarding Claim 2, depending on Claim 1, Caruana discloses the method of Claim 1. Caruana does not explicitly disclose:
wherein the data processing procedure includes the first data processing step and a second data processing step that follows the first data processing step, and the method further comprises
building a second sub-network in the second network model to perform the second data processing step;
performing optimization training on the second sub-network by using the reference dataset;
and merging the first sub-network with the second sub-network into the second network model.
Berniker explicitly discloses:
wherein the data processing procedure includes the first data processing step and a second data processing step that follows the first data processing step, and the method further comprises building a second sub-network in the second network model to perform the second data processing step (Berniker, fig. 2A, where the model [second network model] comprises a first data processing step from Y to V and a second data processing step from V to W that follows the first; VWV^ [second sub-network] is built in the model to perform the second data processing step);
and merging the first sub-network with the second sub-network into the second network model (Berniker, fig. 2A, where YVY^ [first sub-network] is merged with VWV^ [second sub-network] into the model [second network model]).
Caruana and Berniker both teach neural network models for data processing and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Caruana’s disclosure of compressing a model by mimicking another model with Berniker’s disclosure of a deep autoencoder architecture to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to excel at finding hidden low-dimensional features (Berniker, page 2, para. 2, ln. 3 – 4).
Caruana in view of Berniker does not explicitly disclose:
performing optimization training on the second sub-network by using the reference dataset;
Makhzani explicitly discloses:
performing optimization training on the second sub-network by using the reference dataset (Makhzani, fig. 8, where the autoencoder is trained [optimization training] on labeled data [reference dataset] in the semi-supervised setting).
Caruana (in view of Berniker) and Makhzani both teach autoencoder models and training for data processing and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Caruana (in view of Berniker)’s disclosure of a stacked autoencoder training model with Makhzani’s disclosure of autoencoder training on labeled data to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to gain better performance (Makhzani, page 10, para. 1, ln. 4 – 7).

Regarding Claim 3, depending on Claim 2, Caruana in view of Berniker and Makhzani discloses the method of Claim 2. Caruana in view of Berniker and Makhzani further discloses:
extracting first input and output data for the first data processing step; extracting second input and output data for the second data processing step; and including the first input and output data and the second input and output data in the reference data set (Caruana combined with Berniker and Makhzani discloses a multiple-step data processing system with mimic learning at each step; the resulting system collects [extracts] input/output data from the first model for the first processing step, to be used to train a mimicking model of the first processing step, and also collects input/output data from the second model for the second processing step, to be used to train a mimicking model of the second processing step; the training data can be stored in a dataset [reference data set]).
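For illustration only, the per-step extraction described in the parenthetical above can be sketched as follows. This is a minimal sketch under stated assumptions: `step_one` and `step_two` are hypothetical stand-ins for the two data processing steps of the first network model, and the dictionary layout of the reference data set is invented for the example; none of these names come from the cited references or the application.

```python
def step_one(x):
    """Hypothetical first data processing step: scale the input."""
    return 2 * x

def step_two(v):
    """Hypothetical second data processing step: shift the intermediate value."""
    return v + 1

def run_and_record(inputs):
    """Run both steps in sequence and record each step's (input, output) pairs."""
    reference = {"step_one": [], "step_two": []}
    for x in inputs:
        v = step_one(x)
        reference["step_one"].append((x, v))   # first input and output data
        w = step_two(v)
        reference["step_two"].append((v, w))   # second input and output data
    return reference

reference_dataset = run_and_record([0, 1, 2])
```

Each entry of `reference_dataset` can then serve as the training data for the sub-network that mimics the corresponding step.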

Regarding Claim 5, depending on Claim 3, Caruana in view of Berniker and Makhzani discloses the method of Claim 3. Caruana in view of Berniker and Makhzani further discloses:
wherein the performing optimization training on the first sub-network by using the reference dataset comprises: reading, from the reference dataset, the first input and output data corresponding to the first data processing step; and performing optimization adjustment on a parameter of the first sub-network for performing the first data processing step based on the first input and output data and according to a neural network (NN) training optimization algorithm, wherein the parameter includes: one or a combination of a network node, a weight, and a training rate (Makhzani, fig. 8 & Appendix A.2, where labeled data [first input and output data in reference dataset corresponding to the first data processing step] is used to train the autoencoder q() [performing optimization adjustment on a parameter of the first sub-network] to perform data processing [first data processing step] based on the cost function [optimization algorithm]; the training changes the weights of the autoencoder).

Regarding Claims 10, 11 and 13, Claims 10, 11 and 13 are the apparatus claims corresponding to Claims 2, 3 and 5, respectively.
Claims 10, 11 and 13 are rejected for the same reasons as Claims 2, 3 and 5.

Regarding Claims 18 and 19, Claims 18 and 19 are the non-transitory computer-readable medium claims corresponding to Claims 2 and 3, respectively.
Claims 18 and 19 are rejected for the same reasons as Claims 2 and 3.

Claims 4, 6, 12, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Caruana, Model Compression, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2006, in view of Berniker, Deep networks for Motor Control Functions, Frontiers in Computational Neuroscience, Vol 9, 2015, and Makhzani, Adversarial Autoencoders, arXiv, May 2016, further in view of Wright, Neural Network Architecture Selection Analysis with Application to Cryptography Location, WCCI 2010 IEEE World Congress on Computational Intelligence, 2010.

Regarding Claim 4, depending on Claim 3, Caruana in view of Berniker and Makhzani discloses the method of Claim 3.
Caruana in view of Berniker and Makhzani further discloses:
wherein the building, in the second network model, the first sub-network to perform the first data processing step and the building the second sub-network in the second network model to perform the second data processing step comprises: determining, according to the first input and output data corresponding to the first data processing step, a first input layer structure and a first output layer structure of the first sub-network that performs the first data processing step; building the first sub-network, according to the first main network structure, the first input layer structure, and the first output layer structure; determining, according to the second input and output data corresponding to the second data processing step, a second input layer structure and a second output layer structure of the second sub-network that performs the second data processing step; and building the second sub-network, according to the second main network structure, the second input layer structure, and the second output layer structure (Berniker, fig. 2A, where in the model [second network model], the left autoencoder [first sub-network] performs the first data processing step and has a Y layer [first input layer structure], a V layer [first main network structure] and a Y^ layer [first output layer structure], and the middle autoencoder [second sub-network] performs the second data processing step and has a V layer [second input layer structure], a W layer [second main network structure] and a V^ layer [second output layer structure]; the numbers of nodes in the input layer and the output layer [input and output layer structure] are based on the widths of the input data and the output data).
Caruana in view of Berniker and Makhzani does not explicitly disclose:
searching a preset equivalent correspondence table for a main network structure of the sub-network that performs the data processing step, the preset equivalent correspondence table associating main network structure types with data processing types;
determining, according to the input and output data corresponding to the data processing step, an input layer structure and an output layer structure of the sub-network that performs the first data processing step;
Wright explicitly discloses:
searching a preset equivalent correspondence table for a main network structure of the sub-network that performs the data processing step, the preset equivalent correspondence table associating main network structure types with data processing types (Wright, fig. 1 & page 2942, col. 2, para. 2, where the architecture selection process creates a table of candidate architectures with different mean values; the different mean values represent the fitness of a model for different types of task [data processing type]);
Caruana (in view of Berniker and Makhzani) and Wright both teach neural network models for data processing and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Caruana (in view of Berniker and Makhzani)’s disclosure of a multi-step data processing model with Wright’s disclosure of architecture selection based on the dataset to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to reduce the total error (Wright, abs., ln. 1 – 2).
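For illustration only, the claimed table search and layer sizing discussed above can be sketched as follows. This is a minimal sketch of the claim limitation as quoted, not of Wright's architecture selection process: the table entries, the `SubNetworkSpec` type, and the function name are all hypothetical, and the input and output layer widths are simply assumed to equal the widths of one input/output sample.

```python
from dataclasses import dataclass

# Hypothetical preset correspondence table: data processing type -> main network
# structure type. The entries are illustrative only.
CORRESPONDENCE_TABLE = {
    "classification": "fully_connected",
    "image_filtering": "convolutional",
    "sequence_prediction": "recurrent",
}

@dataclass
class SubNetworkSpec:
    main_structure: str   # looked up in the preset table
    input_width: int      # derived from the step's input data
    output_width: int     # derived from the step's output data

def build_sub_network_spec(processing_type, samples):
    """Search the table for the main structure and size the layers from the data."""
    main_structure = CORRESPONDENCE_TABLE[processing_type]
    first_input, first_output = samples[0]
    return SubNetworkSpec(main_structure, len(first_input), len(first_output))

spec = build_sub_network_spec("classification", [([0.1, 0.2, 0.3], [1.0])])
```

In this sketch an unknown data processing type raises `KeyError`; how a missing table entry should be handled is not addressed in the text above.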

Regarding Claim 6, depending on Claim 4, Caruana in view of Berniker, Makhzani and Wright discloses the method of Claim 4. Caruana in view of Berniker, Makhzani and Wright further discloses:
wherein the performing optimization training on the first sub-network by using the reference dataset comprises: reading, from the reference dataset, the first input and output data corresponding to the first data processing step; and performing optimization adjustment on a parameter of the first sub-network for performing the first data processing step based on the first input and output data and according to a neural network (NN) training optimization algorithm, wherein the parameter includes: one or a combination of a network node, a weight, and a training rate (Makhzani, fig. 8 & Appendix A.2, where labeled data [first input and output data in reference dataset corresponding to the first data processing step] is used to train the autoencoder q() [performing optimization adjustment on a parameter of the first sub-network] to perform data processing [first data processing step] based on the cost function [optimization algorithm]; the training changes the weights of the autoencoder).

Regarding Claims 12 and 14, Claims 12 and 14 are the apparatus claims corresponding to Claims 4 and 6, respectively.
Claims 12 and 14 are rejected for the same reasons as Claims 4 and 6.

Regarding Claim 20, Claim 20 is the non-transitory computer-readable medium claim corresponding to Claim 4.
Claim 20 is rejected for the same reasons as Claim 4.

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Caruana, Model Compression, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2006, in view of Berniker, Deep networks for Motor Control Functions, Frontiers in Computational Neuroscience, Vol 9, 2015, Makhzani, Adversarial Autoencoders, arXiv, May 2016, and Wright, Neural Network Architecture Selection Analysis with Application to Cryptography Location, WCCI 2010 IEEE World Congress on Computational Intelligence, 2010, further in view of Bengio, Greedy layer-wise training of deep networks, Advances in Neural Information Processing Systems, 2007.

Regarding Claim 7, depending on Claim 6, Caruana in view of Berniker, Makhzani and Wright discloses the method of Claim 6.
Caruana in view of Berniker, Makhzani and Wright further discloses:
wherein the merging the first sub-network with the second sub-network into the second network model comprises: selecting one of the first sub-network and the second sub-network as a seed network, and the other one of the first sub-network and the second sub-network as a to-be-merged network; removing the second input layer structure and the first output layer structure between the seed network and the to-be-merged network; merging the seed network with the to-be-merged network to form a grown seed network (Berniker, fig. 1, where, when YVY^ [first sub-network; seed network] and VWV^ [second sub-network; to-be-merged network] are merged, the Y^ layer [first output layer structure] of the YVY^ sub-network and the V layer [second input layer structure] of the VWV^ sub-network are removed and the sub-networks are merged into the model on the right [grown seed network]);
Caruana in view of Berniker, Makhzani and Wright does not explicitly disclose:
and performing, when the merging of the grown seed network succeeds, optimization adjustment on a parameter of the grown seed network based on the first input and output data and the second input and output data.
Bengio explicitly discloses:
and performing, when the merging of the grown seed network succeeds, optimization adjustment on a parameter of the grown seed network based on the first input and output data and the second input and output data (Bengio, alg. 5, where after each layer is pretrained, supervised optimization is performed using the respective target samples x, y; in this case, the combined target samples from the sequential processing units are the inputs from the first input/output data set and the corresponding outputs from the second input/output data set).
Caruana (in view of Berniker, Makhzani and Wright) and Bengio both teach layerwise training of neural network models and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Caruana (in view of Berniker, Makhzani and Wright)’s disclosure of layerwise training with Bengio’s disclosure of fine-tuning a merged model after layerwise training to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to minimize reconstruction error (Bengio, page 5, ln. 7 – 9).
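For illustration only, the merge and the construction of the combined fine-tuning samples discussed above can be sketched structurally. This is a minimal sketch under stated assumptions: each sub-network is represented only by an ordered list of layer labels taken from the mapping above (Y, V, Y^ and V, W, V^), the function names are hypothetical, and the gradient-based fine-tuning itself is omitted. It is not code from Berniker or Bengio.

```python
def merge(seed, to_be_merged):
    """Drop the seed's output layer and the other network's input layer, then join."""
    if seed[-2] != to_be_merged[0]:
        raise ValueError("seed's last hidden layer must match the other network's input")
    return seed[:-1] + to_be_merged[1:]

def end_to_end_pairs(first_io, second_io):
    """Pair each first-step input with the second-step output for the same intermediate value."""
    second_lookup = dict(second_io)
    return [(x, second_lookup[v]) for x, v in first_io if v in second_lookup]

# Layer labels follow the mapping above: YVY^ is the seed, VWV^ is to be merged.
grown_seed = merge(["Y", "V", "Y^"], ["V", "W", "V^"])

# Combined samples for adjusting the grown seed network: inputs from the first
# input/output data, targets from the second input/output data.
fine_tune_samples = end_to_end_pairs([(0, 0), (1, 2), (2, 4)],
                                     [(0, 1), (2, 3), (4, 5)])
```

The list `fine_tune_samples` corresponds to the combined target samples described in the Bengio mapping: each pair spans the whole merged network rather than a single step.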

Regarding Claim 15, Claim 15 is the apparatus claim corresponding to Claim 7.
Claim 15 is rejected for the same reasons as Claim 7.

Claims 8 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Caruana, Model Compression, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2006, in view of Berniker, Deep networks for Motor Control Functions, Frontiers in Computational Neuroscience, Vol 9, 2015, Makhzani, Adversarial Autoencoders, arXiv, May 2016, Wright, Neural Network Architecture Selection Analysis with Application to Cryptography Location, WCCI 2010 IEEE World Congress on Computational Intelligence, 2010, and Bengio, Greedy layer-wise training of deep networks, Advances in Neural Information Processing Systems, 2007, further in view of Montanez, Unveiling the Hidden layers of Deep learning, Scientific American, May 2016.

Regarding Claim 8, depending on Claim 7, Caruana in view of Berniker, Makhzani, Wright and Bengio discloses the method of Claim 7.
Caruana in view of Berniker, Makhzani, Wright and Bengio does not explicitly disclose:
adding an intermediate hidden layer structure between the seed network and the to-be-merged network when the merging fails;
and merging the seed network and the to-be-merged network using the intermediate hidden layer.
Montanez explicitly discloses:
adding an intermediate hidden layer structure between the seed network and the to-be-merged network when the merging fails; and merging the seed network and the to-be-merged network using the intermediate hidden layer (Montanez, page 5, where user can … add new hidden layers … help the machine to complete the task more successfully; i.e., in the situation where the merge is less successful [fails], adding a new hidden layer can improve the chance of success).
Caruana (in view of Berniker, Makhzani, Wright and Bengio) and Montanez both teach deep learning in neural network models and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Caruana (in view of Berniker, Makhzani, Wright and Bengio)’s disclosure of merging neural network models after layerwise training with Montanez’s disclosure of adding hidden layers to help the machine complete tasks successfully to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to increase the chance of success (Montanez, page 5, ln. 8).

Regarding Claim 16, Claim 16 is the apparatus claim corresponding to Claim 8.
Claim 16 is rejected for the same reasons as Claim 8.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIEN MING CHOU whose telephone number is (571) 272-9354. The examiner can normally be reached Monday - Friday, 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KAKALI CHAKI, can be reached on (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/S.C./Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122