Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This action is in reply to the amendments and remarks filed on 12/16/2021.
Claims 1-17 and 21-23 are pending.
Claims 1, 6-8, and 13-15 have been amended.
Claims 18-20 have been canceled.
Claims 21-23 have been added.

Response to Arguments
Applicant’s arguments, with respect to the rejection(s) of claim(s) 1, 8 and 15 under 35 U.S.C. 103, have been considered but are not persuasive. More specifically, the applicant argues that Yin does not teach the amended claim language of claims 1, 8, and 15 since “Yin does not appear to select an optimal candidate ensemble from a group but, rather, Yin combines ensemble models into an average”; and while Yin teaches pruning of “classifiers”, “classifiers are not ensemble models”. The examiner respectfully disagrees with all presented arguments.
Due to the broadness of the claim language, the references fairly teach all requirements of the claim limitations. The Examiner concurs that Yin does teach combining interim ensembles, but Yin also teaches weighting the ensembles according to their performances, including favorably weighting better performing ensembles from “their series performances” (selecting the optimal candidate ensemble model is based on performance evaluation results). Therefore, Yin weights the best performing ensembles’ outputs higher (selects from a group) and keeps the lower performing ensembles at a lower weighting, and thus can fairly be interpreted as choosing a higher performing ensemble’s results over the remaining ensemble results (selecting an ensemble from a group). Regarding applicant’s argument that Yin’s “classifiers are not ensemble models”, see the above reasoning, as Yin was shown to teach weighting a group of interim ensembles based on their performances for use in a DE2; further, the argued “pruning” is not recited in the claims. See the 35 U.S.C. 103 section for the full mapping of claim limitations necessitated by applicant’s amendments.
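The reading applied above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each interim ensemble's weight is one minus its error rate, normalized over the group. This is not Yin's actual weighting rule, and the names `performance_weights` and `error_rates` are hypothetical.

```python
def performance_weights(error_rates):
    """Map each ensemble's error rate to a normalized weight.

    Assumed rule for illustration only: raw weight = 1 - error rate.
    """
    raw = {name: 1.0 - err for name, err in error_rates.items()}
    total = sum(raw.values())
    return {name: score / total for name, score in raw.items()}


# Hypothetical error rates for three interim ensembles.
error_rates = {"ensemble_a": 0.10, "ensemble_b": 0.25, "ensemble_c": 0.40}
weights = performance_weights(error_rates)

# The ensemble with the lowest error receives the largest weight.
top = max(weights, key=weights.get)
```

Under this assumed rule the best performing ensemble always carries the largest weight, which is the sense in which performance-based weighting is read as favoring one ensemble's results over the rest of the group.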

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-6, 8-13, 15-17, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Yin et al (“DE2: Dynamic ensemble of ensembles for learning nonstationary data”, 2015) hereinafter Yin, in view of Li et al (US Pub 20160078339) hereinafter Li.
Regarding claims 1, 8, and 15, Yin teaches a computer-implemented method, a non-transitory, computer readable medium storing one or more instructions executable by a computer system, and a computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising (abstract, sections 1 and 4 teach a process for iteratively expanding an existing “Dynamic Ensemble of Ensembles (DE2)” by forming interim ensembles of classifiers for improving “computational efficiency”, which is well-known in the art to be performed on a computer system including one or more processors communicatively coupled to memory (CRM)): 
obtaining a current ensemble model and a plurality of untrained candidate submodels (section 1, last paragraph, and section 4 teach a process for iteratively expanding an existing DE2 (obtaining a current ensemble model) with interim ensembles, wherein the interim ensembles include classifiers to be trained (and a plurality of untrained candidate submodels)); 
generating a first candidate ensemble model, wherein the first candidate ensemble model comprises the plurality of untrained candidate submodels (section 1, last paragraph, and section 4 teach creating one or more interim ensembles (generating a first candidate ensemble model) with classifiers to be trained (wherein the first candidate ensemble model comprises the plurality of untrained candidate submodels)); 
training, by at least one processor, the first candidate ensemble model to obtain a second candidate ensemble model (sections 1 and 4 teach interim ensembles including classifiers to be trained, and then “we train interim ensembles” (training…the first candidate ensemble model), thus giving trained one or more interim ensembles (to obtain a second candidate ensemble model) for improving “computational efficiency”, which is well-known in the art to be performed on a computer); 
generating second candidate performance evaluation results for the second candidate ensemble model (sections 1 and 4 teach the trained “interim ensembles” are combined “according to their series performances” (generating second candidate performance evaluation results for the second candidate ensemble model) to determine interim ensemble weighting in the DE2); 
selecting an optimal candidate ensemble model from a group of candidate ensemble models, wherein the group of candidate ensemble models comprises the second candidate ensemble model, and wherein selecting the optimal candidate ensemble model is based on performance evaluation results of each candidate ensemble model in the group of candidate ensemble models, respectively (sections 1 and 4 teach the trained “interim ensembles” are combined “according to their series performances” to determine interim ensemble weighting in the DE2 (selecting an optimal candidate ensemble model from a group of candidate ensemble models, wherein the group of candidate ensemble models comprises the second candidate ensemble model), wherein dynamic weighted majority for interim ensembles includes favorably weighting better performing ensembles from “their series performances” more than that of other ensembles in the DE2 (wherein selecting the optimal candidate ensemble model is based on performance evaluation results of each candidate ensemble model in the group of candidate ensemble models, respectively) ); and 
updating the current ensemble model with the optimal candidate ensemble model, wherein the optimal performance of the optimal candidate ensemble model satisfies a predetermined condition (sections 1 and 4 teach “updating” the DE2 (updating the current ensemble model) with the newly weighted “interim ensembles” that are weighted “according to their series performances” (with the optimal candidate ensemble model), wherein dynamic weighted majority for interim ensembles includes favorably weighting better performing ensembles from “their series performances” more than that of other ensembles in the DE2 (wherein the optimal performance of the optimal candidate ensemble model satisfies a predetermined condition)).
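The sequence of operations recited in claims 1, 8, and 15 can be sketched in Python as follows. This is a minimal illustration of the claim language only, not an implementation taken from Yin, Li, or the application. The `Ensemble` class, the `update_ensemble` function, the toy training and evaluation functions, and the use of a fixed score threshold as the "predetermined condition" are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Ensemble:
    """Hypothetical container: a list of submodel names plus a trained flag."""
    submodels: List[str] = field(default_factory=list)
    trained: bool = False


def update_ensemble(
    current: Ensemble,
    untrained_submodels: List[str],
    train: Callable[[Ensemble], Ensemble],
    evaluate: Callable[[Ensemble], float],
    threshold: float,
) -> Ensemble:
    # Generate the first candidate ensemble from the untrained submodels.
    first_candidate = Ensemble(submodels=list(untrained_submodels))
    # Train it to obtain the second candidate ensemble.
    second_candidate = train(first_candidate)
    # Evaluate each candidate in the group (a group of one in this sketch).
    group = [second_candidate]
    scores = [evaluate(candidate) for candidate in group]
    best_score = max(scores)
    optimal = group[scores.index(best_score)]
    # Assumed predetermined condition: the best score meets a fixed threshold.
    return optimal if best_score >= threshold else current


# Toy stand-ins for training and evaluation (assumptions, not from any reference).
def toy_train(ensemble: Ensemble) -> Ensemble:
    return Ensemble(submodels=ensemble.submodels, trained=True)


def toy_evaluate(ensemble: Ensemble) -> float:
    return len(ensemble.submodels) / 4


current = Ensemble(submodels=["old_model"], trained=True)
updated = update_ensemble(current, ["dnn_a", "dnn_b", "dnn_c"], toy_train, toy_evaluate, 0.5)
kept = update_ensemble(current, ["dnn_a"], toy_train, toy_evaluate, 0.5)
```

In the first call the trained candidate scores 0.75 and replaces the current ensemble; in the second it scores 0.25, the assumed condition is not met, and the current ensemble is returned unchanged.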

Yin at least implies a computer-implemented method, a non-transitory, computer readable medium storing one or more instructions executable by a computer system, and a computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising (see mapping above) and by at least one processor (see mapping above); however, Li teaches a computer-implemented method, a non-transitory, computer readable medium storing one or more instructions executable by a computer system, and a computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising; and by at least one processor (paragraphs 0016, 0079-0084 and Fig. 7 teach a computer system with one or more processors and memories, such as CRM storing “computer-executable instructions”, to perform the methods disclosed in the embodiments including creating an “ensemble teacher model” and creating “a plurality of sub-DNN models” taught to be “untrained”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Li’s teachings of a computer system for filling an ensemble teacher model with “untrained” sub-DNN models and training the ensemble into Yin’s teaching of training and ranking interim ensembles for use in a combined DE2 in order to utilize the most efficient computer system hardware for ensemble training and creation (Li, paragraphs 0016, 0032, 0070-0073, 0079-0084, and Figs. 4 and 6-7).

Regarding claims 2, 9, and 16, the combination of Yin and Li teach all the claim limitations of claims 1, 8, and 15 above; and further teach wherein the first candidate ensemble model comprises at least two models that are based on different types of neural networks (Li, paragraphs 0045-0047 and Fig. 4 teach “the ensemble sub-DNNs (wherein the first candidate ensemble model comprises at least two models)” can be “DNNs with…different structures (e.g., standard feedforward DNN, convolutional neural network (CNN), recurrent neural network (RNN), long short-term memory RNN, or other structures) (that are based on different types of neural networks)”).


Regarding claims 3, 10, and 16, the combination of Yin and Li teach all the claim limitations of claims 1, 8, and 15 above; and further teach the plurality of untrained candidate submodels comprise a first candidate submodel and a second candidate submodel, and wherein the first candidate submodel and the second candidate submodel are based on the same type of neural networks and have different hyperparameters for the same type of neural networks (Li, paragraphs 0045-0046, 0070, and Fig. 4 teach the initialized ensemble of sub-DNN models (plurality of untrained candidate submodels) include multiple DNNs (first and second candidate submodels), where these models can be the same structure (same type), but with “different topologies (varying in number of layers and nodes, for example) (different hyperparameters)”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Li’s teachings of filling an ensemble teacher model with “untrained” sub-DNN models and training the ensemble into Yin’s teaching of training and ranking interim ensembles for use in a combined DE2 in order 

Regarding claims 4, 11, and 17, the combination of Yin and Li teach all the claim limitations of claims 3, 10, and 16 above; and further teach the same type of neural networks are deep neural networks (DNN), and the different hyperparameters comprise a quantity of hidden layers in a DNN network structure, a quantity of neural units of each hidden layer in a plurality of hidden layers, and a manner of connection between any two of the plurality of hidden layers (Li, paragraphs 0045-0046, 0070, and Fig. 4 teach the initialized ensemble of sub-DNN models include multiple DNNs (same type of neural networks are deep neural networks (DNN)), where these models can be the same structure (same type), but with “different topologies (varying in number of layers and nodes, for example) (different hyperparameters comprise a quantity of hidden layers in a DNN network structure, a quantity of neural units of each hidden layer in a plurality of hidden layers)”, and shown to have connection between all layers (manner of connection between any two of the plurality of hidden layers)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Li’s teachings of filling an ensemble teacher model with “untrained” sub-DNN models of different types and training the ensemble into Yin’s teaching of training and ranking interim ensembles for use in a combined DE2 in order to create ensembles with specific types and structured machine learning algorithms (Li, paragraphs 0045-0046, 0070, and Fig. 4).

Regarding claims 5 and 12, the combination of Yin and Li teach all the claim limitations of claims 1 and 8 above; and further teach training the first candidate ensemble model comprises: 
determining that the first candidate ensemble model is not empty (Yin, sections 1 and 4 teach including classifiers to be trained into interim ensembles (determining that the first candidate ensemble model is not empty)); and 
responsive to determining the first candidate ensemble model is not empty, training the first candidate ensemble model (Yin, sections 1 and 4 teach interim ensembles including classifiers to be trained (responsive to determining the first candidate ensemble model is not empty), and then “we train interim ensembles” (training the first candidate ensemble model), thus giving trained one or more interim ensembles).

Regarding claims 6 and 13, the combination of Yin and Li teach all the claim limitations of claims 1 and 8 above; and further teach wherein the performance evaluation results of each candidate ensemble model of the group of candidate ensemble models comprise a function value of a loss function corresponding to each ensemble model in the group of candidate ensemble models (Yin, sections 1 and 4 teach the trained “interim ensembles” are combined “according to their series performances” (wherein the performance evaluation results of each candidate ensemble model of the group of candidate ensemble models) to determine interim ensemble weighting in the DE2, where the performances include determining “the classification error rate as the loss of the interim ensemble” (loss function) for each ensemble), and 
selecting the optimal candidate ensemble model from the group of candidate ensemble models further comprises selecting a candidate ensemble model corresponding to a minimum function value of the loss function as the optimal candidate ensemble model (Yin, sections 1 and 4 teach the trained “interim ensembles” are combined “according to their series performances” (selecting the optimal candidate ensemble model from the group of candidate ensemble models) to determine interim ensemble weighting in the DE2, where the performances include determining “the classification error rate as the loss of the interim ensemble” (loss function) for each ensemble, and wherein favorably weighting better performing interim ensembles from “their series performances” more than that of other ensembles in the DE2 (further comprises selecting a candidate ensemble model corresponding to a minimum function value of the loss function as the optimal candidate ensemble model)).
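Selection by minimum loss, as recited in claims 6 and 13, can be sketched as follows. The sketch uses the classification error rate as the loss, following the quoted passage from Yin, but the function names, candidate names, and data are hypothetical, and the hard `min` selection reflects the claim language rather than Yin's weighting scheme.

```python
def classification_error_rate(predictions, labels):
    """Fraction of predictions that disagree with the true labels."""
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)


def select_min_loss(candidate_predictions, labels):
    """Return the candidate with the smallest error rate, plus all losses."""
    losses = {
        name: classification_error_rate(preds, labels)
        for name, preds in candidate_predictions.items()
    }
    best = min(losses, key=losses.get)
    return best, losses


# Hypothetical labels and per-candidate predictions.
labels = [1, 0, 1, 1, 0]
candidate_predictions = {
    "interim_1": [1, 0, 0, 1, 0],  # one mistake
    "interim_2": [0, 0, 0, 1, 1],  # three mistakes
    "interim_3": [1, 0, 1, 1, 0],  # no mistakes
}
best, losses = select_min_loss(candidate_predictions, labels)
```

Here `interim_3` has the minimum loss value and would be taken as the optimal candidate ensemble model.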

Regarding claim 23, the combination of Yin and Li teach all the claim limitations of claim 1 above; and further teach wherein updating the current ensemble model with the optimal candidate ensemble model comprises: 
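updating the current ensemble model to the optimal candidate ensemble model (Yin, sections 1 and 4 teach “updating” the DE2 (updating the current ensemble model) with the newly weighted “interim ensembles” that are weighted “according to their series performances” (to the optimal candidate ensemble model), wherein dynamic weighted majority for interim ensembles includes favorably weighting better performing ensembles from “their series performances” more than that of other ensembles in the DE2).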


Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Yin et al (“DE2: Dynamic ensemble of ensembles for learning nonstationary data”, 2015) hereinafter Yin, in view of Li et al (US Pub 20160078339) hereinafter Li, in view of Lakhani et al (“Deep Learning at Chest Radiology: Automated Classification of Pulmonary Tuberculosis by Using Convolutional Neural Networks”, 2017) hereinafter Lakhani.
Regarding claims 7 and 14, the combination of Yin and Li teach all the claim limitations of claims 1 and 8 above. However the combination does not explicitly teach wherein the performance evaluation results comprise an area under a receiver operation characteristic (ROC) curve (AUC) value corresponding to each candidate ensemble model of the group of candidate ensemble models, and selecting the optimal candidate ensemble model from the group of candidate ensemble models further comprises selecting a candidate ensemble model corresponding to a maximum AUC value as the optimal candidate ensemble model.
Lakhani teaches wherein the performance evaluation results comprise an area under a receiver operation characteristic (ROC) curve (AUC) value corresponding to each candidate ensemble model of the group of candidate ensemble models (page 577, section Statistical and Data Analysis, paragraph 2 teaches using multiple “[e]nsembles” of different weighting biases between “the classifiers” consisting of “AlexNet and GoogLeNet”, and page 574, section Materials and Methods, and pages 577-578 further teach comparing AUC results for each pre-trained neural network ensemble (corresponding to each candidate ensemble model of the group of candidate ensemble models) vs. the AUC results for each “untrained” neural network ensemble (corresponding to each candidate ensemble model of the group of candidate ensemble models), where it is further taught these AUC measurements are ROC AUCs), and 
selecting the optimal candidate ensemble model from the group of candidate ensemble models further comprises selecting a candidate ensemble model corresponding to a maximum AUC value as the optimal candidate ensemble model (page 577, section Statistical and Data Analysis, paragraph 2 teaches using multiple “[e]nsembles” of different weighting biases between “the classifiers” consisting of “AlexNet and GoogLeNet”, and page 574, section Materials and Methods, and pages 577-578 further teach comparing AUC results for each pre-trained neural network ensemble (from the group of candidate ensemble models) vs. the AUC results for each “untrained” neural network ensemble (from the group of candidate ensemble models) to find which result is the highest (maximum AUC value) and declare (select) the highest result (maximum AUC value) as the “best performing ensemble (as the optimal candidate ensemble model)”. It is further taught these AUC measurements are also ROC AUCs.).
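Selection by maximum AUC, as recited in claims 7 and 14, can be sketched as follows. The sketch computes ROC AUC with the standard pairwise (Mann-Whitney) formulation, counting ties as one half. The function name `roc_auc`, the candidate names, and the scores are hypothetical and are not taken from Lakhani.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and per-candidate output scores.
labels = [1, 1, 0, 0]
candidate_scores = {
    "ensemble_a": [0.9, 0.8, 0.3, 0.1],
    "ensemble_b": [0.9, 0.2, 0.3, 0.1],
    "ensemble_c": [0.4, 0.4, 0.4, 0.4],
}
aucs = {name: roc_auc(s, labels) for name, s in candidate_scores.items()}

# The candidate with the maximum AUC is taken as the optimal candidate.
best = max(aucs, key=aucs.get)
```

With these toy scores `ensemble_a` separates the two classes perfectly and is selected.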


Claims 21 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Yin et al (“DE2: Dynamic ensemble of ensembles for learning nonstationary data”, 2015) hereinafter Yin, in view of Li et al (US Pub 20160078339) hereinafter Li, in view of Chen et al (US Pub 20080228680) hereinafter Chen.
Regarding claim 21, the combination of Yin and Li teach all the claim limitations of claim 1 above; however the combination does not explicitly teach determining the optimal performance of the optimal candidate ensemble model satisfies the predetermined condition, and wherein determining the optimal performance of the optimal candidate ensemble model satisfies the predetermined condition comprises: comparing performance evaluation results of the current ensemble model and performance evaluation results of the optimal candidate ensemble model.
Chen teaches determining the optimal performance of the optimal candidate ensemble model satisfies the predetermined condition, and wherein determining the optimal performance of the optimal candidate ensemble model satisfies the predetermined condition comprises: 
comparing performance evaluation results of the current ensemble model and performance evaluation results of the optimal candidate ensemble model (paragraphs 0072-0075 teach updating a “global ensemble” (current ensemble model) with “local ensembles” by determining “the best candidate local ensembles” to include through many performance evaluations (determining the optimal performance of the optimal candidate ensemble model satisfies the predetermined condition), including (comprises) measuring “the degree by which the member network predictions (performance evaluation results of the optimal candidate ensemble model) deviate (comparing) from the (global) ensemble's prediction (performance evaluation results of the current ensemble model)”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify training and ranking interim ensembles for use in a combined DE2 as taught by Yin, as modified by filling an ensemble teacher model with “untrained” sub-DNN models and training the ensemble as taught by Li, to include comparing global ensemble results to local ensemble results as taught by Chen in order to determine the best local ensembles to include in a global ensemble and obtain an optimal ambiguity measure value (Chen, paragraphs 0072-0075).

Regarding claim 22, the combination of Yin, Li, and Chen teach all the claim limitations of claim 21 above; and further teach wherein updating the current ensemble model with the optimal candidate ensemble model comprises: 
updating the current ensemble model to the current ensemble model based on the performance evaluation results of the current ensemble model being more optimal than the performance evaluation results of the optimal candidate ensemble model (Chen, paragraphs 0072-0075 teach updating a “global ensemble” (updating the current ensemble model to the current ensemble model) with “the best candidate local ensembles” (with the optimal candidate ensemble model) to include through many performance evaluations, including (comprises) measuring “the degree by which the member network predictions (results of the optimal candidate ensemble model) deviate from the (global) ensemble's prediction (performance evaluation results of the current ensemble model being more optimal)”; in other words, the global result is held as the standard (more optimal) to measure the local ensemble results against to obtain an “ambiguity” measure).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify training and ranking interim ensembles for use in a combined DE2 as taught by Yin, as modified by filling an ensemble teacher model with “untrained” sub-DNN models and training the ensemble as taught by Li, to include comparing global ensemble results to local ensemble results as taught by Chen in order to determine the best local ensembles to include in a global ensemble and obtain an optimal ambiguity measure value (Chen, paragraphs 0072-0075).
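The comparison recited in claims 21 and 22 can be sketched as follows. This illustrates the claim language only: the current ensemble is retained when its evaluation result is better than that of the optimal candidate. It does not reproduce Chen's ambiguity measure, and the function name `choose_ensemble` and the `higher_is_better` flag are hypothetical.

```python
def choose_ensemble(current_score, candidate_score, higher_is_better=True):
    """Return which ensemble to keep after comparing evaluation results."""
    if higher_is_better:
        candidate_wins = candidate_score > current_score
    else:
        candidate_wins = candidate_score < current_score
    # When the candidate does not win, the current ensemble is kept as-is.
    return "candidate" if candidate_wins else "current"


# Hypothetical AUC-style scores (higher is better).
kept = choose_ensemble(current_score=0.90, candidate_score=0.85)
replaced = choose_ensemble(current_score=0.80, candidate_score=0.85)

# Hypothetical loss-style scores (lower is better).
replaced_by_loss = choose_ensemble(0.20, 0.10, higher_is_better=False)
```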

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLINT MULLINAX whose telephone number is 571-272-3241.  The examiner can normally be reached on Mon - Fri 8:00-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov.




/C.M./Examiner, Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123