DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
The present application is being examined under the claims filed 01/30/2018. 
Claims 1-20 are pending.

Claim Objections
Claim 19 is objected to because of the following informality: “cause” is misspelled as “casue” in the limitation “the instructions further casue deriving.”  Appropriate correction is required.

Claim Rejections - 35 USC § 101 – Abstract Idea
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-4 and 9-19 are rejected under 35 U.S.C. 101 as being directed to an abstract idea without significantly more.

Regarding Claim 1:
Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is a method.
Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“deriving a plurality of meta-feature values from an inference dataset by, for each meta-feature of a plurality of meta-features, deriving a respective meta-feature value from an inference dataset;” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“for each algorithm of a plurality of trainable algorithms: for each meta-model of a plurality of regression meta-models that are respectively associated with the algorithm, calculating a respective score by invoking the meta-model based on at least one of: a) a respective subset of meta-feature values of the plurality of meta-feature values, or b) hyperparameter values of a respective subset of a plurality of hyperparameters of the algorithm;” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“selecting, based on the respective scores, one or more algorithms of the plurality of trainable algorithms;” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“invoking, based on the inference dataset, the one or more algorithms to obtain a result.” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, there are no additional elements that integrate the judicial exception into a practical application. 
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
	No, there are no additional elements that amount to significantly more than the judicial exception.
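
For illustration only, the four limitations of Claim 1 analyzed above (deriving meta-feature values, scoring each algorithm with its regression meta-models, selecting, and invoking) might be sketched as follows. This is a minimal sketch under stated assumptions: every function name, meta-feature, meta-model, and score below is hypothetical and does not appear in the claims or the specification, and averaging the per-meta-model scores is only one possible way of selecting "based on the respective scores."

```python
from statistics import mean

def derive_meta_features(dataset):
    """Derive one value per (hypothetical) meta-feature from a list of numeric rows."""
    n_cols = len(dataset[0])
    col_means = [mean(row[j] for row in dataset) for j in range(n_cols)]
    return {"n_rows": len(dataset), "n_cols": n_cols, "mean_of_means": mean(col_means)}

def select_algorithms(meta_values, meta_models, top_k=1):
    """Invoke every meta-model of every algorithm, then keep the top_k algorithms.

    Assumption: the per-meta-model scores are combined by a plain average.
    """
    scores = {
        algo: mean(model(meta_values) for model in models)
        for algo, models in meta_models.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores

# Hypothetical regression meta-models: each maps meta-feature values to a score.
meta_models = {
    "svm": [lambda mf: 0.5 + 0.01 * mf["n_cols"], lambda mf: 0.6],
    "random_forest": [lambda mf: 0.7, lambda mf: 0.9 - 0.1 * mf["mean_of_means"]],
}

# Hypothetical stand-ins for the trainable algorithms themselves.
algorithms = {
    "svm": lambda ds: ("svm", len(ds)),
    "random_forest": lambda ds: ("random_forest", len(ds)),
}

dataset = [[1.0, 2.0], [3.0, 4.0]]
meta_values = derive_meta_features(dataset)
selected, scores = select_algorithms(meta_values, meta_models)

# Invoke only the selected algorithm(s) on the same dataset to obtain a result.
results = {name: algorithms[name](dataset) for name in selected}
```

The sketch is intended only to make the sequence of claimed steps concrete; it takes no position on whether those steps can practically be performed in the human mind.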

Regarding Claim 2:
Claim 2 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). This claim merely recites further limitations on the selecting limitation which was considered a mental process. The claim cites an additional abstract idea:
“selecting the one or more algorithms comprises ranking the plurality of trainable algorithms based on the respective scores.” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
The judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The additional limitation: 
“the one or more algorithms comprises multiple algorithms;” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely limits the number of algorithms. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 3:
Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). This claim merely recites further limitations on the calculating limitation which was considered a mental process. The additional limitation: 
“wherein said hyperparameter values are default values.” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely limits the hyperparameter values to default values. 
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 4:
Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The claim cites additional abstract ideas: 
“the method further comprises deriving, for each algorithm of the plurality of trainable algorithms, a respective ensemble score that is based on the respective scores of the plurality of regression meta-models that are associated with the algorithm;” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“selecting the one or more algorithms based on the respective scores comprises selecting the one or more algorithms based on the respective ensemble scores.” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
The judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The additional limitation: 
“wherein: each algorithm of the plurality of trainable algorithms is associated with a respective ensemble that contains said plurality of regression meta-models that are associated with the algorithm;” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely limits the field of algorithm. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 9:
Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). This claim merely recites further limitations on the calculating limitation which was considered a mental process. The additional limitation: 
“wherein each algorithm of the plurality of trainable algorithms comprises one of a support vector machine (SVM), a random forest, a decision tree, or an artificial neural network.” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely links the algorithms to certain types of machine learning algorithms. 
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 10:
Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). This claim merely recites further limitations on the calculating limitation which was considered a mental process. The additional limitation: 
“wherein each algorithm of the plurality of trainable algorithms comprises one of: classification, regression, or anomaly detection.” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely links the algorithms to certain types of machine learning algorithms. 
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 11:
Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). This claim merely recites further limitations on the calculating limitation which was considered a mental process. The additional limitation: 
“wherein each meta-model of the plurality of regression meta-models comprises a distinct artificial neural network.” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely links the model to neural networks. 
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 12:
Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). This claim merely recites further limitations on the calculating limitation which was considered a mental process. The additional limitation: 
“wherein calculating the respective score comprises a softmax function.” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely links the calculation to softmax functions. 
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
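
For reference, the softmax function named in Claim 12 can be sketched as below. This is a minimal sketch, assuming the function is applied to a list of raw scores; the function name and the example scores are hypothetical and are not taken from the claims or the specification.

```python
import math

def softmax(scores):
    """Map raw scores to positive values that sum to 1.

    Subtracting the maximum first is a standard numerical-stability step
    and does not change the result.
    """
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three candidate algorithms.
probs = softmax([2.0, 1.0, 0.1])
```

The ordering of the inputs is preserved, so the highest raw score receives the largest share.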

Regarding Claim 13:
Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The claim cites a further abstract idea: 
“further comprising converting values of a non-numeric meta-feature of said plurality of meta-features by at least one encoding scheme of: one-hot or one-cold.” – This limitation is directed to the abstract idea of mathematical concepts (see MPEP 2106.04(a)(2) I.) as an encoding scheme is being used to convert values.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
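
As an illustration of the two encoding schemes named in Claim 13, a minimal sketch follows. The function names and the category list are hypothetical; the sketch assumes the usual meaning of the terms, in which one-hot marks the matching category with 1 and all others with 0, and one-cold is the inverse.

```python
def one_hot(value, categories):
    """Return a vector with 1 at the matching category and 0 elsewhere."""
    return [1 if c == value else 0 for c in categories]

def one_cold(value, categories):
    """Return a vector with 0 at the matching category and 1 elsewhere."""
    return [0 if c == value else 1 for c in categories]

# Hypothetical non-numeric meta-feature with three possible values.
categories = ["a", "b", "c"]
hot = one_hot("b", categories)
cold = one_cold("b", categories)
```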

Regarding Claim 14:
Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The claim cites a further abstract idea: 
“further comprising converting values of a numeric meta-feature of said plurality of meta-features by at least one encoding scheme of: zero mean or unit variance.” – This limitation is directed to the abstract idea of mathematical concepts (see MPEP 2106.04(a)(2) I.) as an encoding scheme is being used to convert values.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
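
As an illustration of the numeric conversion named in Claim 14, a minimal sketch follows. It assumes that "zero mean" and "unit variance" are applied together as ordinary standardization; the claim recites them in the alternative, and the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Shift values to zero mean and scale them to unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant meta-feature carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical numeric meta-feature values.
z = standardize([1.0, 2.0, 3.0])
```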

Regarding Claim 15:
Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The claim cites a further abstract idea: 
“further comprising, for each algorithm of the plurality of trainable algorithms, optimizing at least one of: a percentage of hyperparameters and/or meta-features injected as inputs into each meta-model of the plurality of regression meta-models that are associated with the algorithm, or a count of meta-models in the plurality of regression meta-models that are associated with the algorithm.” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 16:
Claim 16 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is ultimately dependent on claim 1 which included an abstract idea (see rejection for claim 1). This claim merely recites further limitations on the optimizing limitation from Claim 15 which was considered a mental process. The additional limitation: 
“wherein said optimizing comprises at least one of: gradient descent, Bayesian optimization, SVM, or a decision tree.” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely links the optimization to certain types of algorithms. 
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 17:
Claim 17 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The additional limitations: 
“further comprising assigning hyperparameters and/or meta-features as inputs for each meta-model of the plurality of regression meta-models [that are associated with each algorithm of the plurality of trainable algorithms by at least one of: sample bagging, feature bagging, or boosting.]” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“[further comprising assigning hyperparameters and/or meta-features as inputs for each meta-model of] the plurality of regression meta-models that are associated with each algorithm of the plurality of trainable algorithms by at least one of: sample bagging, feature bagging, or boosting.” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely links the optimization to certain types of algorithms. 
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 18:
Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is a product.
Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“deriving a plurality of meta-feature values from an inference dataset by, for each meta-feature of a plurality of meta-features, deriving a respective meta-feature value from an inference dataset;” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“for each algorithm of a plurality of trainable algorithms: for each meta-model of a plurality of regression meta-models that are respectively associated with the algorithm, calculating a respective score by invoking the meta-model based on at least one of: a) a respective subset of meta-feature values of the plurality of meta-feature values, or b) hyperparameter values of a respective subset of a plurality of hyperparameters of the algorithm;” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“selecting, based on the respective scores, one or more algorithms of the plurality of trainable algorithms;” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“invoking, based on the inference dataset, the one or more algorithms to obtain a result.” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, there are no additional elements that integrate the judicial exception into a practical application. The additional element, “One or more non-transitory computer-readable media storing instructions that, when executed by one or more processors, cause,” merely recites performing the concept on a generic computer (see MPEP 2106.04(d)).
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
	No, there are no additional elements that amount to significantly more than the judicial exception.

Regarding Claim 19:
Claim 19 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 18 which included an abstract idea (see rejection for claim 18). The claim cites additional abstract ideas: 
“the instructions further casue deriving, for each algorithm of the plurality of trainable algorithms, a respective ensemble score that is based on the respective scores of the plurality of regression meta-models that are associated with the algorithm;” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
“selecting the one or more algorithms based on the respective scores comprises selecting the one or more algorithms based on the respective ensemble scores.” – This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgment, opinion) which can be performed in the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.).
The judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A Prong 2. The additional limitation: 
“wherein: each algorithm of the plurality of trainable algorithms is associated with a respective ensemble that contains said plurality of regression meta-models that are associated with the algorithm;” – This limitation is directed to merely generally linking the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)) as it merely limits the field of algorithm. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2, 3, and 11 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding Claim 2:
	Claim 2 cites “the one or more algorithms comprises multiple algorithms.” It is unclear how one algorithm comprises multiple algorithms. The specification does not provide any clarification. For examination purposes, this limitation is interpreted to be at least one algorithm. 

Regarding Claim 3:
	Claim 3 recites “wherein said hyperparameter values are default values.” It is unclear what “default values” are: they could be values that are standard for the claimed method, or values preset by the system on which the claimed method is implemented. In the latter case, “default values” is a relative term. The specification does not provide a standard for “default values”; the term appears only in ¶46 of the specification, “In an embodiment, default hyperparameter values (not shown) are instead used during inferencing.” For examination purposes, “default values” are interpreted to be any assumed values. 

Regarding Claim 11: 
	Claim 11 recites “wherein each meta-model of the plurality of regression meta-models comprises a distinct artificial neural network.” It is unclear what the standard for “distinct” is. The models could be distinct in a variety of ways, such as weight values, bias values, model architecture, or training data. The specification does not provide any standard for distinctness for neural networks; the specification only provides a standard for algorithms in ¶29. For examination purposes, “distinct artificial neural network” is interpreted to be an artificial neural network. 

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-3, 5, 7, 9-11, and 14-17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Wichard (“Model Selection in an Ensemble Framework”) (hereinafter Wichard).

Regarding Claim 1:
	Wichard teaches a method for model selection for ensemble learning. Wichard teaches:
A method comprising: (Wichard teaches a method, disclosed in the Abstract, “We like to present a method to build ensemble models based on an extended cross-validation approach.”)
deriving a plurality of meta-feature values from an inference dataset by, for each meta-feature of a plurality of meta-features, deriving a respective meta-feature value from an inference dataset; (Examiner notes that deriving is defined to be “to take, receive, or obtain especially from a specified source” as per Merriam Webster Dictionary. Wichard discloses deriving features from a dataset in sec. I ¶2, “Let us consider a supervised learning problem with n training examples of the form {(x1, y1),(x2, y2), . . . ,(xn, yn)} from an unknown function y = f(x). The x values are usually d-dimensional vectors that are called ’input-features’.” Wichard further discloses that the dataset is used in the process of inference in sec. IV. B ¶1, “The linear ridge model is a simple multivariate linear regression that takes the N features {xi}i=1,...,N as input.”)
for each algorithm of a plurality of trainable algorithms: for each meta-model of a plurality of regression meta-models that are respectively associated with the algorithm, calculating a respective score by invoking the meta-model based on at least one of: a) a respective subset of meta-feature values of the plurality of meta-feature values, or b) hyperparameter values of a respective subset of a plurality of hyperparameters of the algorithm; (Wichard discloses calculating a score of classification error for each model in sec. III. B ¶1, “This means, that all models have to compete with each other in a fair tournament because they are trained and validated on the same data set. The models with the lowest classification error in each CV-fold are taken out and added to the final ensemble.” Wichard discloses that the models are regression models in sec. III. B ¶2, “We applied this approach to several regression problems related to time series prediction and we achieved convincing results.” Wichard discloses that calculating the score involves hyperparameter values in sec. IV ¶1, “In this section we give a short overview of the models that we use for ensemble building and the model parameters (hyperparameters) that are optimized in during the CV. All models belong to the classical collection of machine learning algorithms for classification and regression. ”)
selecting, based on the respective scores, one or more algorithms of the plurality of trainable algorithms; (Wichard discloses selecting one or more algorithms associated with a model based on the score of classification error in sec. III. B ¶1, “This leads to k training validation rounds and in every round we select only one model to become a member of the final ensemble (namely the best model with respect to the validation set). In every round we train several different models classes with a variety of model parameters (see Section IV for an overview of the models and the related model parameters). This means, that all models have to compete with each other in a fair tournament because they are trained and validated on the same data set. The models with the lowest classification error in each CV-fold are taken out and added to the final ensemble.”) 
invoking, based on the inference dataset, the one or more algorithms to obtain a result. (Wichard discloses invoking the selected one or more algorithms in sec. VIII ¶1 and Table II, “In this section we will report, which preprocessing, parameter settings and base models were used to build classifier ensembles for the 5 data sets (ADA, GINA, HIVA, NOVA and SYLVA, see Table VII). In all cases we used the validation set, that was provided on the challenge web-site as our ’hold out test set’ (see Section III). This ’test set’ was 10% of the size of the training set in all cases. The results of our best challange entry are reported in Table II.”) 
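
For illustration only, the selection procedure quoted from Wichard sec. III. B (k training and validation rounds, with the best model of each round added to the final ensemble) might be sketched as follows. This is a minimal sketch under stated assumptions: the two constant predictors, the interleaved fold split, the squared-error measure, and the averaging of ensemble members at prediction time are all hypothetical stand-ins and are not taken from Wichard.

```python
from statistics import mean, median

def mse(preds, targets):
    """Mean squared error between two equally long sequences."""
    return mean((p - t) ** 2 for p, t in zip(preds, targets))

def fit_mean(train_y):
    """Hypothetical model class: always predicts the training mean."""
    c = mean(train_y)
    return lambda x: c

def fit_median(train_y):
    """Hypothetical model class: always predicts the training median."""
    c = median(train_y)
    return lambda x: c

def cv_tournament(xs, ys, trainers, k):
    """Run k validation rounds and keep the lowest-error model from each round."""
    n = len(xs)
    ensemble = []
    for fold in range(k):
        val_idx = [i for i in range(n) if i % k == fold]
        train_y = [ys[i] for i in range(n) if i % k != fold]
        best = None
        for name, trainer in trainers.items():
            model = trainer(train_y)
            err = mse([model(xs[i]) for i in val_idx], [ys[i] for i in val_idx])
            if best is None or err < best[0]:
                best = (err, name, model)
        ensemble.append((best[1], best[2]))
    return ensemble

def predict(ensemble, x):
    """Assumed combination rule: average the selected models' outputs."""
    return mean(model(x) for _, model in ensemble)

xs = list(range(6))
ys = [1.0, 1.0, 1.0, 1.0, 1.0, 10.0]
ensemble = cv_tournament(xs, ys, {"mean": fit_mean, "median": fit_median}, k=3)
```

Every candidate is trained and validated on the same split within a round, which corresponds to the "fair tournament" language in the quoted passage.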

Regarding Claim 2:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
wherein: the one or more algorithms comprises multiple algorithms; (Wichard discloses multiple models to be selected from in sec. IV ¶1, “In this section we give a short overview of the models that we use for ensemble building and the model parameters (hyperparameters) that are optimized in during the CV.” Wichard discloses that the models have multiple algorithms in sec. IV. A-G. For example, Wichard discloses multiple algorithms in sec. IV. B, “The regression coefficients minimize a penalized residual sum of squares                                 
$\alpha = \arg\min_{\alpha} \sum_{i=1}^{N} (y_i - \alpha_0 - x_i \alpha)$
                                                            2
                                                        
                                                    
                                                    +
                                                    λ
                                                    
                                                        
                                                            ∑
                                                            
                                                                j
                                                                =
                                                                0
                                                            
                                                            
                                                                d
                                                            
                                                        
                                                        
                                                            
                                                                
                                                                    α
                                                                
                                                                
                                                                    j
                                                                
                                                                
                                                                    2
                                                                
                                                            
                                                        
                                                    
                                                
                                            
                                            }
                                        
                                    
                                     
                                
                            , where                                 
                                    
                                        
                                            α
                                        
                                        
                                            0
                                        
                                    
                                
                             denotes the constant term in the regression and                                 
                                    
                                        
                                            ∙
                                        
                                        
                                            ∙
                                        
                                    
                                
                             is the scalar product defined as                                 
                                    
                                        
                                            x
                                        
                                        
                                            α
                                        
                                    
                                    =
                                     
                                    
                                        
                                            ∑
                                            
                                                k
                                                =
                                                1
                                            
                                            
                                                d
                                            
                                        
                                        
                                            
                                                
                                                    x
                                                
                                                
                                                    k
                                                
                                            
                                            
                                                
                                                    α
                                                
                                                
                                                    k
                                                
                                            
                                        
                                    
                                
                             . The ‘linear discriminant function’ is given by                                 
                                    f
                                    
                                        
                                            x
                                        
                                    
                                    =
                                    s
                                    i
                                    g
                                    n
                                    (
                                    
                                        
                                            α
                                        
                                        
                                            0
                                        
                                    
                                    +
                                    
                                        
                                            x
                                        
                                        
                                            α
                                        
                                    
                                    )
                                
                            .”)
selecting the one or more algorithms comprises ranking the plurality of trainable algorithms based on the respective scores. (Wichard discloses ranking the scores of the models during selection in sec. III. B ¶1, “This means, that all models have to compete with each other in a fair tournament because they are trained and validated on the same data set. The models with the lowest classification error in each CV-fold are taken out and added to the final ensemble, receiving the weight $\omega_i = \frac{1}{k}$ (see Equation 1). All other models in this CV-fold are deleted.”)
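For illustration only, the selection step described in the passage of Wichard quoted above (the lowest-error model of each CV-fold joins the ensemble with weight 1/k) can be sketched as follows. This is a minimal sketch and not Wichard's implementation; the function name `select_ensemble`, the model names, and the error values are hypothetical and were made up for the example.

```python
def select_ensemble(fold_errors):
    """Pick the lowest-error model of each CV fold and give it weight 1/k.

    fold_errors: list with one dict per fold, mapping a (hypothetical)
    model name to its classification error on that fold's validation set.
    """
    k = len(fold_errors)
    ensemble = []
    for errors in fold_errors:
        # Rank the competing models of this fold by ascending error.
        ranked = sorted(errors, key=errors.get)
        # Only the best-ranked model survives; the others are discarded.
        ensemble.append((ranked[0], 1.0 / k))
    return ensemble


# Made-up error values for three folds and three model classes.
fold_errors = [
    {"ridge": 0.21, "svm": 0.18, "mlp": 0.25},
    {"ridge": 0.19, "svm": 0.22, "mlp": 0.20},
    {"ridge": 0.24, "svm": 0.17, "mlp": 0.23},
]
ensemble = select_ensemble(fold_errors)
```

Because every fold contributes exactly one model with weight 1/k, the weights sum to one by construction.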

Regarding Claim 3:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
wherein said hyperparameter values are default values. (Examiner notes that the term “default values” is not well defined and is interpreted to mean values that are assumed. Wichard teaches hyperparameters, which are also referred to as model parameters, in sec. IV ¶1, “In this section we give a short overview of the models that we use for ensemble building and the model parameters (hyperparameters) that are optimized in during the CV.” Wichard discloses default values (i.e. assumed model parameters) in sec. II ¶1, “we assume that the model weights $\omega_i$ sum to one $\sum_{i=1}^{K} \omega_i = 1$. There are several suggestions concerning the choice of the model weights (see Perrone et al. [3] or Hashem et al. [11]). We decided to use uniform weights with $\omega_i = \frac{1}{K}$ for the sake of simplicity and not to run into over-fitting problems as reported by Krogh et al. [8].”)

Regarding Claim 5:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
further comprising: storing a plurality of testing datasets; (Wichard teaches using stored datasets for testing in sec. VII ¶1, “The Performance Prediction Challenge [32] provides five data sets from different sources divided in a training set, a validation set and a test set each.”)
for each algorithm of the plurality of trainable algorithms: for each model of a plurality of models that are based on the algorithm: configuring the model based on respective particular values for said plurality of hyperparameters of the algorithm; (Wichard teaches optimizing the hyperparameters (i.e. configuring the model) for each of the algorithms of the models in sec. IV ¶1, “In this section we give a short overview of the models that we use for ensemble building and the model parameters (hyperparameters) that are optimized in during the CV.”) 
and for each testing dataset of the plurality of testing datasets: testing the model based on the testing dataset to calculate a respective test score; (Wichard teaches testing models for each dataset to calculate respective test scores in Table 2:

    [Image: Wichard, Table 2 (media_image1.png)])

and recording a distinct tuple that references: the respective particular values for said plurality of hyperparameters, the testing dataset, the respective test score, and the algorithm; (Examiner notes that a tuple is equivalent to a list as per Wolfram MathWorld. Wichard implements the models with an open source toolbox, disclosed in sec. IV ¶1, “In this section we give a short overview of the models that we use for ensemble building and the model parameters (hyperparameters) that are optimized in during the CV. […] The implementation of these models in an open source toolbox together with a more detailed description can be found in [16].” Reference [16] is a toolbox created by Wichard et al. called ENTOOL, and its documentation discloses a list that references hyperparameters, testing dataset, test score, and algorithm in sec. 3.1, “train(model, X, Y , sampleclass, trainparams, eps, varargin).”)
and for each meta-model of the plurality of regression meta-models that are associated with the algorithm, training the meta-model based on at least one of said distinct tuples recorded for the algorithm. (Wichard teaches training the model based on the tuple in the ENTOOL documentation sec. 3.1, “[model, trainerr] = train(model, X, Y , sampleclass, trainparams, eps, varargin).”)

Regarding Claim 7:
Wichard teaches “The method of Claim 5” as seen above.  
Wichard further teaches: 
further comprising cross validating the plurality of models with an original training dataset that is partitioned into: a plurality of training datasets and said plurality of testing datasets. (Wichard discloses cross validating models in sec. III ¶1, “In order to select models for the final ensemble we use cross validation (CV) for model training.” Wichard further discloses that the original training set is partitioned into training sets and testing sets in sec. III. A ¶2, “First of all we isolate a ’test set’ that is hold out from the training procedure and only used for the final evaluation (usually 10% to 25% of the entire data set). For a k-fold CV the data is divided k-times into a ’training set’ and a ’validation set’ (see Figure 1), both sets containing randomly drawn subsets of the data without replications.” Wichard further discloses a plurality of training sets and test sets in sec. VIII ¶1, “The Performance Prediction Challenge [32] provides five data sets from different sources divided in a training set, a validation set and a test set each.”) 
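For illustration only, the partitioning described in the passages of Wichard quoted above (a held-out test set, followed by k random divisions of the remaining data into a training set and a validation set) can be sketched as follows. This is a minimal sketch and not Wichard's implementation; the function name `k_fold_partition`, the 20% hold-out and validation percentages, and the fixed seed are assumptions chosen for the example.

```python
import random


def k_fold_partition(n_samples, k, test_percent=20, val_percent=20, seed=0):
    """Hold out a test set, then draw k random train/validation splits.

    Returns (test_indices, folds), where folds is a list of
    (train_indices, validation_indices) pairs. All parameter names and
    default percentages are illustrative assumptions.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)

    # Isolate the test set, which is never used during training.
    n_test = n_samples * test_percent // 100
    test = indices[:n_test]
    remainder = indices[n_test:]

    # Divide the remaining data k times into training and validation sets,
    # each drawn at random without replacement.
    n_val = len(remainder) * val_percent // 100
    folds = []
    for _ in range(k):
        shuffled = remainder[:]
        rng.shuffle(shuffled)
        folds.append((shuffled[n_val:], shuffled[:n_val]))
    return test, folds


test, folds = k_fold_partition(n_samples=100, k=5)
```

Within each of the k divisions the training and validation indices are disjoint, and neither overlaps the held-out test set.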

Regarding Claim 9:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
wherein each algorithm of the plurality of trainable algorithms comprises one of a support vector machine (SVM), a random forest, a decision tree, or an artificial neural network. (Wichard discloses that each algorithm of the plurality of trainable algorithms is of a particular type in sec. IV, “In this section we give a short overview of the models that we use for ensemble building and the model parameters (hyperparameters) that are optimized in during the CV.” Wichard discloses support vector machines in sec. IV. F, “Support Vector Machines. Over the last decade Support Vector Machines (SVMs) have become very powerful tools in machine learning. […].” Wichard discloses decision trees in sec. IV. D, “Trees. […] For our purpose we use the ’classification and regression trees’ (CART) as described in Breiman et al.” Wichard discloses artificial neural networks in sec. IV. E, “Neural Networks. We use a multilayer feed-forward Neural Network (MLP: Multi Layer Perceptron) with the tanh(x) as activation function.”)

Regarding Claim 10:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
wherein each algorithm of the plurality of trainable algorithms comprises one of: classification, regression, or anomaly detection. (Wichard discloses each algorithm of the trainable algorithms being for classification in sec. I. ¶3, “In the next sections we will present our ensemble approach, the model selection scheme and the method we used to estimate the performance of our classifier ensemble regarding the five classification tasks of the ’Performance Prediction Challenge’.” Wichard also discloses regression in sec. III. B. ¶2, “We applied this approach to several regression problems related to time series prediction and we achieved convincing results [10], [13], [14].”) 

Regarding Claim 11:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
wherein each meta-model of the plurality of regression meta- models comprises a distinct artificial neural network. (Examiner notes that “a distinct artificial neural network” is interpreted to be an artificial neural network. Wichard discloses regression models in sec. III. B. ¶1-2, “In every round we train several different models classes with a variety of model parameters (see Section IV for an overview of the models and the related model parameters). We applied this approach to several regression problems related to time series prediction and we achieved convincing results [10], [13], [14].” Wichard discloses the models being artificial neural networks in sec. IV. E. ¶1, “Neural Networks. We use a multilayer feed-forward Neural Network (MLP: Multi Layer Perceptron) with the tanh(x) as activation function.”) 

Regarding Claim 14:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
further comprising converting values of a numeric meta-feature of said plurality of meta-features by at least one encoding scheme of: zero mean or unit variance. (Wichard discloses converting numeric features via normalizing the data to have zero mean and unit variance in sec. V. A. ¶1, “If we substract the mean from the features and divide them with their variance, we call this normalizing the data.”)
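For illustration only, converting a numeric feature to zero mean and unit variance can be sketched as follows. This is a minimal sketch, not Wichard's implementation; the function name `standardize` and the sample values are hypothetical. The sketch divides by the standard deviation, which is the usual way to obtain unit variance; that choice is an assumption of the sketch.

```python
import statistics


def standardize(values):
    """Rescale a list of numbers to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # A constant feature carries no spread; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Made-up sample values with mean 5 and population standard deviation 2.
scaled = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```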

Regarding Claim 15:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
further comprising, for each algorithm of the plurality of trainable algorithms, optimizing at least one of: a percentage of hyperparameters and/or meta-features injected as inputs into each meta- model of the plurality of regression meta-models that are associated with the algorithm, or a count of meta-models in the plurality of regression meta-models that are associated with the algorithm. (Wichard teaches for each algorithm of the plurality of trainable algorithms optimizing hyperparameters injected as inputs into regression models that are associated with the algorithm. Wichard discloses regression models in sec. III. B. ¶1-2, “In every round we train several different models classes with a variety of model parameters (see Section IV for an overview of the models and the related model parameters). We applied this approach to several regression problems related to time series prediction and we achieved convincing results [10], [13], [14].” Wichard discloses optimizing hyperparameters in sec. IV ¶1, “In this section we give a short overview of the models that we use for ensemble building and the model parameters (hyperparameters) that are optimized in during the CV.”)

Regarding Claim 16:
Wichard teaches “The method of Claim 15” as seen above. 
Wichard further teaches:
wherein said optimizing comprises at least one of: gradient descent, Bayesian optimization, SVM, or a decision tree. (Wichard discloses optimizing hyperparameters in sec. IV ¶1, “In this section we give a short overview of the models that we use for ensemble building and the model parameters (hyperparameters) that are optimized in during the CV.” Wichard discloses optimizing neural networks using gradient descent in sec. IV. E. ¶1, “The weights are trained with a gradient descend based on the Rprop Algorithm [20] with the improvements given in [21].” Wichard further discloses optimizing SVM models in sec. IV. F. ¶1, “The parameters of the model are with respect to the kernel type the polynomial degree d, the width of the rbf σ2 and the value concerning the cost of constrain violation during the SVM training.”)

Regarding Claim 17:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard further teaches:
further comprising assigning hyperparameters and/or meta- features as inputs for each meta-model of the plurality of regression meta-models that are associated with each algorithm of the plurality of trainable algorithms by at least one of: sample bagging, feature bagging, or boosting. (Wichard discloses regression models in sec. III. B. ¶1-2, “In every round we train several different models classes with a variety of model parameters (see Section IV for an overview of the models and the related model parameters). We applied this approach to several regression problems related to time series prediction and we achieved convincing results [10], [13], [14].” Wichard discloses boosting in sec. IV. G. ¶1, “In our approach we used the Adaboost.M1 boosting scheme as described by Friedman et al. [28] (see Hastie et al. [17] for a detailed overview and description) and applied it to the ridge model in Section IV-B. So the base model of the ensemble is a boosted linear ridge model.” Wichard further discloses assigning features as inputs for the linear ridge model in sec. IV. B. ¶1, “The linear ridge model is a simple multivariate linear regression that takes the N features {xi}i=1,...,N as input.”)

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Wichard in view of Heinermann et al. (“Machine learning ensembles for wind power prediction”) (hereinafter Heinermann).

Regarding Claim 4:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard does not teach “wherein: each algorithm of the plurality of trainable algorithms is associated with a respective ensemble that contains said plurality of regression meta-models that are associated with the algorithm; the method further comprises deriving, for each algorithm of the plurality of trainable algorithms, a respective ensemble score that is based on the respective scores of the plurality of regression meta-models that are associated with the algorithm; selecting the one or more algorithms based on the respective scores comprises selecting the one or more algorithms based on the respective ensemble scores.”
Heinermann discloses a comparison of machine learning ensembles. Heinermann teaches:
wherein: each algorithm of the plurality of trainable algorithms is associated with a respective ensemble that contains said plurality of regression meta-models that are associated with the algorithm; (Heinermann teaches ensembles containing regression models in sec. 1 ¶3, “In this work, we discuss the practical use of regression ensembles for the task of wind power forecasting, aiming at optimal regression accuracy as well as maintaining a reasonable computation time.” Heinermann teaches algorithms associated with each of the ensembles in sec. 1 ¶3, “In the first step we compare homogeneous ensemble predictors consisting of either decision trees (DT), k-NN, or support vector regressors as base algorithms. As diversity among the ensemble members is crucial for the accuracy of the ensemble, we propose the use of heterogeneous ensemble predictors consisting of different types of base predictors for wind power prediction.”) 
the method further comprises deriving, for each algorithm of the plurality of trainable algorithms, a respective ensemble score that is based on the respective scores of the plurality of regression meta-models that are associated with the algorithm; (Heinermann teaches calculating, for each of the algorithms, a respective ensemble score in the form of mean squared error, based on the regression model associated with the algorithm, in sec. 3 ¶5 and Table 2, “We experimentally compare different regression algorithms composed to ensembles: Table 1 shows a comparison of decision trees, SVR, and k-NN used as weak predictors.”

    [Image: media_image2.png])

selecting the one or more algorithms based on the respective scores comprises selecting the one or more algorithms based on the respective ensemble scores. (Heinermann discloses choosing the best ensemble approach based on the score in sec. 7 ¶1, “After analyzing different types of ensemble predictors, we propose a heterogeneous ensemble approach utilizing both DT and SVR.” Heinermann discloses the score is mean squared error in Table 2: 

    [Image: Heinermann, Table 2 (media_image3.png)])

Wichard, Heinermann, and the instant application are analogous art because they are all directed to machine learning ensembles.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the ensemble framework disclosed by Wichard to include ensemble selection as taught by Heinermann. One would be motivated to do so to decrease computation time, as suggested by Heinermann (Heinermann sec. 7 ¶1: “In our comprehensive experimental evaluation, we show that our approach yields better results within a shorter computation time than state-of-the-art machine learning algorithms.”).

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Wichard in view of Kwak et al. (“Statistical data preparation: management of missing values and outliers”) (hereinafter Kwak).

Regarding Claim 6:
Wichard teaches “The method of Claim 5” as seen above.  Wichard discloses preprocessing data in sec. V ¶1, “In some cases it is usefull to apply a kind of data preprocessing in order to select the most discriminant features and reduce the dimensionality of the leaning problem, in particular if we have more features than training examples.” Removing data with missing values is a part of data preprocessing but is not explicitly taught by Wichard. 
Wichard does not teach “wherein at least one of: said plurality of meta-features excludes meta-features that are missing a value in a percentage of the distinct tuples that exceeds a threshold, or said at least one of the distinct tuples excludes tuples that are missing a value.”
Kwak teaches data preprocessing. Kwak teaches:
wherein at least one of: said plurality of meta-features excludes meta-features that are missing a value in a percentage of the distinct tuples that exceeds a threshold, or said at least one of the distinct tuples excludes tuples that are missing a value. (Examiner notes that tuples are equivalent to lists as per Wolfram MathWorld. Kwak teaches at least one of the distinct tuples excluding tuples that are missing a value. Kwak discloses excluding missing values in sec. “Methods for Handling Missing Values” ¶2, “This method uses only the data of variables observed at each time point for analysis after removing all missing values.” The data consists of tuples as disclosed by Kwak in sec. “Types of Missing Values” ¶3, “Yij: The ‘j’th measurement value for the ‘i’th patient, i = i, … , I, j = 1, … , J.”)
Wichard, Kwak, and the instant application are analogous art because they are all directed to data preprocessing.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the ensemble framework disclosed by Wichard to include missing data exclusion as taught by Kwak. One would be motivated to do so to reduce biased results, as suggested by Kwak (Kwak sec. “Examples of Missing Values and Outliers” ¶2: “Thus, the treatment of a missing value and outlier does not cause under- or over-estimation of the statistics, with neither a change in the sample size nor a bias in the results.”, sec. Conclusion ¶1: “This review paper underlines the efforts to minimize common problems associated with data analysis, including biased results and subsequent under- or over-estimation, by handling missing data and outliers properly during the pretreatment process.”).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Wichard in view of Haixiang et al. (“Learning from class-imbalanced data: Review of methods and applications”) (hereinafter Haixiang).

Regarding Claim 8:
Wichard teaches “The method of Claim 5” as seen above.  
Wichard does not teach parallelization and does not teach “further comprising generating, in parallel, multiple tuples of said distinct tuples.”
Haixiang teaches:
further comprising generating, in parallel, multiple tuples of said distinct tuples. (Haixiang discloses training in parallel in sec. 3.2.1.2. ¶1 and Fig. 5, “Parallel based ensembles. In this study, parallel based ensembles refer to ensemble models in which each base classifier can be trained in parallel. Parallel based ensemble schemes include bagging, re-sampling based ensembles and feature selection based ensembles. The basic framework for parallel ensemble methods is shown in Fig. 5, in which the dashed boxes and lines represent optional processes.”

    [Image: Haixiang, Fig. 5 (media_image4.png)]

Examiner notes that the combination of Wichard, which teaches tuples, and Haixiang teaches this limitation.)
Wichard, Haixiang, and the instant application are analogous art because they are all directed to ensemble learning.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the ensemble framework disclosed by Wichard to include parallelization as taught by Haixiang. One would be motivated to do so to save training time, as suggested by Haixiang (Haixiang sec. 3.2.1.2 ¶1: “Since parallel ensembles have time-saving and ease-of-development advantages, they are recommended for solving practical problems.”).
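For illustration only, the parallel-based ensemble scheme quoted from Haixiang (each base classifier trained independently on its own resample) can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Wichard, Haixiang, or the instant application: the function names `train_base_model` and `train_ensemble_in_parallel`, the toy mean-predictor base learner, and the sample data are all hypothetical. A thread pool is used only to show the independent-task structure; a process pool would typically be used for CPU-bound training.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def train_base_model(job):
    """Fit a toy base learner on one bootstrap resample (hypothetical helper)."""
    data, seed = job
    rng = random.Random(seed)  # per-model seed keeps each resample reproducible
    sample = [rng.choice(data) for _ in data]  # bootstrap: draw with replacement
    return mean(sample)  # the toy "model" is just the resample mean


def train_ensemble_in_parallel(data, n_models=4):
    """Train n_models base learners concurrently, one per bootstrap resample."""
    jobs = [(data, seed) for seed in range(n_models)]
    with ThreadPoolExecutor(max_workers=n_models) as pool:
        # Each job is independent of the others, so they can run in parallel;
        # pool.map returns results in the same order as the submitted jobs.
        return list(pool.map(train_base_model, jobs))


data = [1.0, 2.0, 3.0, 4.0, 5.0]  # assumed toy training data
models = train_ensemble_in_parallel(data)
ensemble_prediction = mean(models)  # combine base models by simple averaging
```

Because no base learner depends on another, the wall-clock training time is bounded by the slowest single learner rather than the sum of all learners, which is the time-saving advantage Haixiang describes.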

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Wichard in view of Sedghi et al. (“Provable Methods for Training Neural Networks with Sparse Connectivity”) (herein thereafter Sedghi).

Regarding Claim 12:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard teaches calculating the respective score involving neural networks. However, Wichard does not teach “wherein calculating the respective score comprises a softmax function.”
Sedghi teaches:
wherein calculating the respective score comprises a softmax function. (Examiner notes that the combination of Wichard and Sedghi teaches this limitation. Wichard teaches the calculation of the respective score via the invoking of a neural network. Sedghi teaches using softmax functions for neural networks. Sedghi discloses a softmax function for neural networks in sec. 2.1, “We first consider a feedforward network with one hidden layer. […] For multiclass classification σ2 is the softmax function.”)
Wichard, Sedghi, and the instant application are analogous art because they are all directed to neural networks.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the ensemble framework disclosed by Wichard to include softmax functions as taught by Sedghi. One would be motivated to do so to reduce the number of computations, as suggested by Sedghi (Sedghi sec. 1.1 ¶1: “In practice, the output of our method can be used for dimensionality reduction for back propagation, resulting in reduced computation.”).
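For illustration only, a softmax function of the kind referenced in Sedghi sec. 2.1 maps a vector of raw output scores to non-negative values that sum to one. The following is a minimal sketch, not code from either reference; the function name `softmax` and the example scores are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Map raw scores to non-negative values that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed raw output-layer scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
```

The ordering of the inputs is preserved in the outputs, so the largest raw score always receives the largest normalized value.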

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Wichard in view of Chauhan (“Categorical Encoding, One Hot Encoding and why use it?”) (herein thereafter Chauhan). 

Regarding Claim 13:
Wichard teaches “The method of Claim 1” as seen above. 
Wichard does not explicitly teach “further comprising converting values of a non-numeric meta- feature of said plurality of meta-features by at least one encoding scheme of: one-hot or one- cold.” 
Chauhan teaches: 
further comprising converting values of a non-numeric meta- feature of said plurality of meta-features by at least one encoding scheme of: one-hot or one- cold. (Chauhan discloses converting non-numeric features to numeric features through one-hot encoding in ¶8, “One Hot Encoding does exactly the same. It takes distinct values from the feature and convert into a feature itself to improve the relationship with overall data. So if we choose One Hot Encoding to the “Sex” feature the dataset will look like as below.”)
Wichard, Chauhan, and the instant application are analogous art because they are all directed to data preprocessing.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the ensemble framework disclosed by Wichard to include one-hot encoding as taught by Chauhan. One would be motivated to do so to improve learning time by handling data sparsity properly and to allow the user to handle more data types as algorithms cannot take non-numeric features, as suggested by Chauhan (Chauhan ¶1: “In the data science categorical values are encoded as enumerator so the algorithms can use them numerically when processing the data,” ¶13: “As you can see, using One Hot encoding, sparsity of data is included into original data set which is more memory friendly and improve learning time if algorithm is designed to handle data sparsity properly.”).
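For illustration only, the one-hot conversion described by Chauhan, and the complementary one-cold scheme recited in the claim, can be sketched as below. This is a minimal sketch under stated assumptions: the function names `one_hot` and `one_cold` are hypothetical, the alphabetical column order is an arbitrary choice, and the “Sex” values are assumed sample data in the spirit of Chauhan's example rather than data taken from the reference.

```python
def one_hot(values):
    """Encode each categorical value as a vector with a single 1."""
    categories = sorted(set(values))  # one output column per distinct value
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column matching this value
        rows.append(row)
    return categories, rows


def one_cold(values):
    """Encode each categorical value as a vector with a single 0."""
    categories, rows = one_hot(values)
    # One-cold is the bitwise complement of one-hot.
    return categories, [[1 - bit for bit in row] for row in rows]


sex = ["male", "female", "male"]  # assumed sample values of a "Sex" feature
columns, hot_rows = one_hot(sex)
_, cold_rows = one_cold(sex)
```

With the assumed input, the columns are ordered `["female", "male"]`, so `"male"` encodes to `[0, 1]` under one-hot and to `[1, 0]` under one-cold.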

Claims 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wichard in view of Bonissone et al. (US20140188768) (herein thereafter Bonissone).

Regarding Claim 18:
Claim 18 is a product claim, corresponding to method claim 1. The only difference is that claim 18 recites one or more non-transitory computer readable media storing instructions. Bonissone teaches:
One or more non-transitory computer-readable media storing instructions that, when executed by one or more processors, cause: (Bonissone discloses in ¶26, “Therefore, the methods described herein may be encoded as executable instructions embodied in a tangible, non-transitory, computer readable medium, including, with out limitation, a storage device and/or a memory device. Such instructions, when executed by a processor, cause the processor to perform at least a portion of the methods described herein.”)
Wichard, Bonissone, and the instant application are analogous art because they are all directed to ensemble learning.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the ensemble framework disclosed by Wichard to include “One or more non-transitory computer-readable media storing instructions that, when executed by one or more processors cause” as taught by Bonissone. One would be motivated to do so to improve accuracy, as suggested by Bonissone (Bonissone ¶4: “To improve accuracy, at least some known approaches to machine learning implement model ensembles, i.e., collections of models, to obtain better predictive performance over any single model within the ensemble.”).
The remaining limitations of claim 18 are rejected for the same reasons as claim 1.

Regarding Claim 20:
Claim 20 is a product claim, corresponding to method claim 5. The only difference is that claim 20 recites one or more non-transitory computer readable media storing instructions, taught by Bonissone. Claim 20 is rejected for the same reasons as claim 5.

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Wichard in view of Bonissone and further in view of Heinermann. 

Regarding Claim 19:
Claim 19 is a product claim, corresponding to method claim 4. The only difference is that claim 19 recites one or more non-transitory computer readable media storing instructions, taught by Bonissone. Claim 19 is rejected for the same reasons as claim 4.

Prior Art of Record
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Caruana et al. in “Ensemble Selection from Libraries of Models” discloses model selection for constructing ensembles using scores (Caruana sec. I. ¶2-4: “Here we generate diverse sets of models by using many different algorithms. We use Support Vector Machines (SVMs), artificial neural nets (ANNs), memory-based learning (KNN), decision trees (DT), bagged decision trees (BAG-DT), boosted decision trees (BST-DT), and boosted stumps (BST-STMP). For each algorithm we train models using many different parameter settings. […] We evaluate the performance of ensemble selection on ten performance metrics.”)



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Somie Park whose telephone number is (571)272-1056. The examiner can normally be reached 9:00am - 5:00pm, Monday-Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571)272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SOMIE PARK/Examiner, Art Unit 2126        
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126