Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-5 and 7-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1
Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Step 1: The claim recites a system; therefore, it falls into the statutory category of machines.
Step 2A Prong 1: 
The limitations of 
“An information processing system for generating an analysis pipeline model by using an analysis pipeline, the analysis pipeline including a pre-process and a learning process for data to be analyzed, a value of a pipeline parameter being a parameter related to at least one of the pre-process and the learning process being applied to the analysis pipeline, the analysis pipeline model including the pre-process and a learned model being learned with the learning process, the information processing system comprising: 
a memory storing instructions; and 
one or more processors configured to execute the instructions to:
receive an input of a validation module that generates the analysis pipeline model and calculates an evaluation value of the generated analysis pipeline model by using an input analysis pipeline in accordance with a predetermined validation method for the validation module, and outputs the generated analysis pipeline model and the calculated evaluation value;
generate a function that executes the input validation module inputting the analysis pipeline to which a value of the pipeline parameter included in an input parameter set is applied, and outputs the analysis pipeline model and the evaluation value obtained by executing the input validation module;
execute a search module inputting the generated function to the search module that executes the input generated function, searches for a value of the parameter set for which the evaluation value obtained by executing the input generated function is optimized within a search range of the parameter set and in accordance with a predetermined search method for the search module, and outputs the analysis pipeline model for which the evaluation value is optimized; and
output the analysis pipeline model obtained by executing the search module.”, as drafted, recite a machine that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. That is, other than reciting “system”, “memory”, “processor”, and “instructions”, nothing in the claim elements precludes the steps from practically being performed in the mind. 
For example, but for the “system”, “memory”, “processor”, and “instructions” language, the limitations in the context of this claim encompass a user, mentally or with a physical aid (e.g., pencil and paper), generating a series of models with a pre-processing step and a learning step; obtaining a validation method that generates a series of models, evaluates them, and outputs the generated series of models; preparing a function for executing the validation method and outputting the generated series of models; searching for appropriate parameters based on the function within a given parameter set; and outputting the generated series of models.

If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. 
In particular, the claim recites an additional element – the act of receiving data. The claim is adding an insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of receiving data is recited at a high level of generality (i.e., as a generic act of receiving that performs a generic function of receiving data) such that it amounts to no more than a mere act to apply the exception using a generic act of receiving. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea. 
In particular, the claim recites an additional element – using a device to process data. The device in each step is recited at a high level of generality (i.e., as a generic computer performing a generic computer function of processing data) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. 
The claim is appending a well-understood, routine, conventional activity previously known to the industry, specified at a high level of generality, to the judicial exception - see MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network, e.g., using the Internet to gather data” is Well-Understood, Routine, and Conventional Activity (MPEP 2106.05(d)). As discussed above with respect to integration of the abstract idea into a practical application, the additional element of the act of receiving/transmitting data amounts to no more than a mere act to apply the exception using a generic act of receiving/transmitting. A mere act to apply an exception using a generic act of receiving/transmitting cannot provide an inventive concept. The claim is not patent eligible.
As discussed above, with respect to integration of the abstract idea into a practical application, the additional elements of using a generic computer to perform each step amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Similarly, claims 7 and 9 are rejected under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without adding significantly more than the judicial exception.

Regarding claim 2
Claim 2 is rejected under 35 U.S.C. 101 because it only modifies the abstract idea by getting and executing a search method, which also does not add significantly more or provide a specific application of the judicial exception.

Similarly, claims 8 and 10 are rejected under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without adding significantly more than the judicial exception.

Regarding claim 3
Claim 3 is rejected under 35 U.S.C. 101 because it only modifies the abstract idea by using a series of models having an identifier when a validation method is executed, which also does not add significantly more or provide a specific application of the judicial exception.

Regarding claim 4
Claim 4 is rejected under 35 U.S.C. 101 because it only modifies the abstract idea by using a parameter related to a validation method in addition to a series of models when the validation method is executed, which also does not add significantly more or provide a specific application of the judicial exception.

Regarding claim 5
Claim 5 is rejected under 35 U.S.C. 101 because it only modifies the abstract idea by narrowing learning data based on a narrowing ratio parameter, which also does not add significantly more or provide a specific application of the judicial exception.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-4, 7-10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bergstra et al. (Hyperopt: a Python library for model selection and hyperparameter optimization) in view of Louppe et al. (An introduction to Machine Learning with Scikit-Learn) 

Regarding claim 1
Bergstra teaches
An information processing system for generating an analysis pipeline model by using an analysis pipeline, the analysis pipeline including a pre-process and a learning process for data to be analyzed, a value of a pipeline parameter being a parameter related to at least one of the pre-process and the learning process being applied to the analysis pipeline, the analysis pipeline model including the pre-process and a learned model being learned with the learning process, the information processing system comprising:
(Bergstra, [fig(s) 1-2] [sec(s) Scikit-learn model selection as a search problem] “Scikit-learn includes many algorithms for classification (classifiers), as well as many algorithms for preprocessing data into the vectors expected by classification algorithms. Classifiers include for example, K-Neighbors, SVM and RF algorithms. Preprocessing algorithms include things like component-wise Z-scaling (Normalizer) and principle components analysis (PCA). A full classification algorithm typically includes a series of preprocessing steps followed by a classifier. … Hyperopt-Sklearn provides a parameterization of a search space over pipelines, that is, of sequences of preprocessing steps and classifiers.” [sec(s) Example usage] “# Return instances of the classifier and # preprocessing steps model = estim.best_model()” [sec(s) Abs] “Following Auto-Weka, we take the view that the choice of classifier and even the choice of preprocessing module can be taken together to represent a single large hyperparameter optimization problem.”; e.g., “parameterization of a search space over pipelines, that is, of sequences of preprocessing steps and classifiers” may read on “a value of a pipeline parameter being a parameter related to at least one of the pre-process and the learning process being applied to the analysis pipeline”. In addition, e.g., “# Return instances of the classifier and # preprocessing steps” may read on “the analysis pipeline model including the pre-process and a learned model being learned with the learning process”.)

a memory storing instructions; and 
one or more processors configured to execute the instructions to:
(Bergstra, [fig(s) 1-2] [sec(s) Example usage] “Here is the simplest example of using this software.”; e.g., the example code may read on “memory” and “one or more processors” since code runs on a computer.)

receive an input of a validation module that generates the analysis pipeline model and calculates an evaluation value of the generated analysis pipeline model by using an input analysis pipeline in accordance with a predetermined validation method for the validation module, and outputs the generated analysis pipeline model and the calculated evaluation value;
(Bergstra, [fig(s) 1-2] [sec(s) Scikit-learn model selection as a search problem] “The basic approach is to set up a search space with random variable hyperparameters, use Scikit-learn to implement the objective function that performs model training and model validation, and use Hyperopt to optimize the hyperparamters. … Hyperopt-Sklearn provides a parameterization of a search space over pipelines, that is, of sequences of preprocessing steps and classifiers.” [sec(s) Example usage] “Following Scikit-learn’s convention, Hyperopt-Sklearn provides an Estimator class with a fit method and a predict method. The fit method of this class performs hyperparameter optimization, and after it has completed, the predict method applies the best model to test data. Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls. … # Return instances of the classifier and # preprocessing steps model = estim.best_model()” [sec(s) Discussion] “Table 1 lists the test set scores of the best models found by cross-validation, as well as some points of reference from previous work.”; e.g., “Each evaluation during optimization” may read on “validation module”. In addition, e.g., “At the end of search, the best configuration is retrained on the whole data set to produce the classifier” and “# Return instances of the classifier and # preprocessing steps” may read on “analysis pipeline model”. Furthermore, e.g., “estimates test set accuracy on a validation set” may read on “evaluation value”.)

generate a function that executes the input validation module inputting the analysis pipeline to which a value of the pipeline parameter included in an input parameter set is applied, and outputs the analysis pipeline model and the evaluation value obtained by executing the input validation module; 
(Bergstra, [fig(s) 1-2] [sec(s) Scikit-learn model selection as a search problem] “The basic approach is to set up a search space with random variable hyperparameters, use Scikit-learn to implement the objective function that performs model training and model validation, and use Hyperopt to optimize the hyperparamters. … Hyperopt-Sklearn provides a parameterization of a search space over pipelines, that is, of sequences of preprocessing steps and classifiers.” [sec(s) Example usage] “Following Scikit-learn’s convention, Hyperopt-Sklearn provides an Estimator class with a fit method and a predict method. The fit method of this class performs hyperparameter optimization, and after it has completed, the predict method applies the best model to test data. Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls.” [sec(s) Getting started with hyperopt] “To summarize, these are the steps to using Hyperopt: (1) implement an objective function that maps configuration points to a real-valued loss value, (2) define a configuration space of valid configuration points, and then (3) call fmin to search the space to optimize the objective function.”; e.g., “Scikit-learn to implement the objective function that performs model training and model validation” and “implement an objective function that maps configuration points to a real-valued loss value” may read on “generate a function that executes the input validation module”.)

execute a search module inputting the generated function to the search module that executes the input generated function, searches for a value of the parameter set for which the evaluation value obtained by executing the input generated function is optimized within a search range of the parameter set and in accordance with a predetermined search method for the search module, and outputs the analysis pipeline model for which the evaluation value is optimized; and 
(Bergstra, [fig(s) 1-2] [sec(s) Scikit-learn model selection as a search problem] “The basic approach is to set up a search space with random variable hyperparameters, use Scikit-learn to implement the objective function that performs model training and model validation, and use Hyperopt to optimize the hyperparamters. … Hyperopt-Sklearn provides a parameterization of a search space over pipelines, that is, of sequences of preprocessing steps and classifiers.” [sec(s) Example usage] “Following Scikit-learn’s convention, Hyperopt-Sklearn provides an Estimator class with a fit method and a predict method. The fit method of this class performs hyperparameter optimization, and after it has completed, the predict method applies the best model to test data. Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls.” [sec(s) “Getting started with hyperopt”] “Assigning the algo keyword argument to hyperopt.fmin is recommended way to choose a search algorithm. Currently supported search algorithms are random search (hyperopt.rand.suggest), annealing (hyperopt.anneal.suggest), and TPE (hyperopt.tpe.suggest). 
… To summarize, these are the steps to using Hyperopt: (1) implement an objective function that maps configuration points to a real-valued loss value, (2) define a configuration space of valid configuration points, and then (3) call fmin to search the space to optimize the objective function.” see also [sec(s) “Configuration spaces”]; e.g., “The basic approach is to set up a search space with random variable hyperparameters, use Scikit-learn to implement the objective function that performs model training and model validation, and use Hyperopt to optimize the hyperparamters” along with “call fmin to search the space to optimize the objective function” may read on “execute a search module inputting the generated function to the search module that executes the input generated function” since the search module executes the objective function to find the optimal hyperparameters. In addition, e.g., “supported search algorithms” may read on “predetermined search method for the search module”.)

output the analysis pipeline model obtained by executing the search module.
(Bergstra, [fig(s) 1-2] [sec(s) Scikit-learn model selection as a search problem] “# Return instances of the classifier and # preprocessing steps model = estim.best_model()”;)
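
For illustration only, the three steps quoted above from Bergstra (implement an objective function, define a configuration space, call a search routine) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not Hyperopt's implementation: `objective`, `space`, and `random_search` are hypothetical names, the loss is a toy formula standing in for model training and validation, and plain random search stands in for a call such as fmin.

```python
import random

# Hypothetical stand-in for "model training and model validation":
# maps one configuration point to a real-valued loss (lower is better).
def objective(config):
    # Toy loss whose minimum lies at scale=2.0, depth=3.
    return (config["scale"] - 2.0) ** 2 + abs(config["depth"] - 3)

# Hypothetical configuration space: each entry draws one hyperparameter value.
space = {
    "scale": lambda rng: rng.uniform(0.0, 4.0),
    "depth": lambda rng: rng.randint(1, 6),
}

def random_search(objective, space, max_evals, seed=0):
    """Plain random search; an assumed stand-in for a search call, not fmin itself."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(max_evals):
        # Sample one configuration point from the space.
        config = {name: draw(rng) for name, draw in space.items()}
        loss = objective(config)
        # Keep the configuration with the lowest loss seen so far.
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

best_config, best_loss = random_search(objective, space, max_evals=200)
```

In this sketch the search routine receives the function as an argument and executes it once per sampled configuration, which is the relationship the mapping above relies on.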

In the alternative, Louppe can also be interpreted to teach the following limitations:
Louppe teaches 
An information processing system for generating an analysis pipeline model by using an analysis pipeline, 
(Louppe, [sec Pipelines] 
“[Image: media_image1.png] … [Image: media_image2.png] … [Image: media_image3.png]”; e.g., “make_pipeline” may read on “analysis pipeline”. In addition, e.g., “grid” may read on “analysis pipeline model”.)
the analysis pipeline including a pre-process and a learning process for data to be analyzed, a value of a pipeline parameter being a parameter related to at least one of the pre-process and the learning process being applied to the analysis pipeline, the analysis pipeline model including the pre-process and a learned model being learned with the learning process, the information processing system comprising:
(Louppe, [sec Pipelines] 
“[Image: media_image3.png]”; e.g., “StandardScaler” may read on “pre-process”. In addition, e.g., “RandomForestClassifier” may read on “learning process”.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the parameter optimization system of Bergstra with the analysis pipeline of Louppe. 
Doing so would lead to achieving better performance and selecting the best model by finding good parameters.
(Louppe, [sec Model evaluation and selection] “Finding good hyper-parameters is crucial to control under- and over-fitting, hence achieving better performance. The estimated generalization error can be used to select the best model.” [sec(s) Transformers, pipelines and feature unions] “Classification (or regression) is often only one or the last step of a long and complicated process; In most cases, input data needs to be cleaned, massaged or extended before being fed to a learning algorithm”)

Regarding claim 2
The combination of Bergstra and Louppe teaches claim 1.

the one or more processors is further configured to execute the instructions to: (see the rejections of claim 1)

Bergstra further teaches 
receive an input of the search module; and
execute the input search module inputting the generated function to the search module.
(Bergstra, [fig(s) 1-2] [sec(s) Scikit-learn model selection as a search problem] “The basic approach is to set up a search space with random variable hyperparameters, use Scikit-learn to implement the objective function that performs model training and model validation, and use Hyperopt to optimize the hyperparamters.” [sec(s) Example usage] “Following Scikit-learn’s convention, Hyperopt-Sklearn provides an Estimator class with a fit method and a predict method. The fit method of this class performs hyperparameter optimization, and after it has completed, the predict method applies the best model to test data. Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls.” [sec(s) “Getting started with hyperopt”] “Assigning the algo keyword argument to hyperopt.fmin is recommended way to choose a search algorithm. Currently supported search algorithms are random search (hyperopt.rand.suggest), annealing (hyperopt.anneal.suggest), and TPE (hyperopt.tpe.suggest). 
… To summarize, these are the steps to using Hyperopt: (1) implement an objective function that maps configuration points to a real-valued loss value, (2) define a configuration space of valid configuration points, and then (3) call fmin to search the space to optimize the objective function.” see also [sec(s) “Configuration spaces”]; e.g., “The basic approach is to set up a search space with random variable hyperparameters, use Scikit-learn to implement the objective function that performs model training and model validation, and use Hyperopt to optimize the hyperparamters” along with “call fmin to search the space to optimize the objective function” may read on “execute the input search module inputting the generated function to the search module” since the search module executes the objective function to find the optimal hyperparameters. In addition, e.g., “supported search algorithms” may read on “receive an input of the search module”.)
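
As a hedged illustration of a search method being received as an input, in the manner of the `algo` keyword argument quoted above, the following plain-Python sketch passes the search method to a driver as an argument. All names (`grid_suggest`, `random_suggest`, `run_search`) and the toy evaluation formula are assumptions made for this sketch and are not taken from either reference.

```python
import itertools
import random

def objective(params):
    # Hypothetical evaluation value (lower is better) for one parameter set.
    return (params["x"] - 1) ** 2 + (params["y"] + 2) ** 2

# Search range of the parameter set (illustrative values only).
search_range = {"x": [-2, -1, 0, 1, 2], "y": [-3, -2, -1, 0]}

def grid_suggest(search_range, rng):
    """Yield every combination in the search range (rng is unused here)."""
    names = list(search_range)
    for values in itertools.product(*(search_range[n] for n in names)):
        yield dict(zip(names, values))

def random_suggest(search_range, rng, n_draws=50):
    """Yield n_draws independently sampled combinations."""
    for _ in range(n_draws):
        yield {n: rng.choice(vals) for n, vals in search_range.items()}

def run_search(objective, search_range, algo, seed=0):
    """Execute whichever search method was received as `algo`."""
    rng = random.Random(seed)
    return min(algo(search_range, rng), key=objective)

best_grid = run_search(objective, search_range, algo=grid_suggest)
best_random = run_search(objective, search_range, algo=random_suggest)
```

The driver is identical for both calls; only the search method supplied as input differs.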

Regarding claim 3
The combination of Bergstra and Louppe teaches claim 1.

Bergstra further teaches 
the parameter set further includes an identifier of the analysis pipeline, and, 
when the validation module is executed, the analysis pipeline indicated by an identifier of the analysis pipeline to which a value of the pipeline parameter included in the parameter set is applied is input.
(Bergstra, [fig(s) 1-2] [sec(s) ] “The full search space is illustrated in figure 1. The preprocessing algorithms were (by class name, followed by n. hyperparameters + n. unused hyperparameters): PCA(2), StandardScaler(2), MinMaxScaler(1), Normalizer(1), None, and TFIDF(0+9). The first four preprocessing algorithms were for dense features. … The classification algorithms were (by class name (used + unused hyperparameters)): SVC(23), KNN(4+5), RandomForest(8), ExtraTrees(8), SGD(8 +4), and MultinomialNB(2) . The SVC module is a fork of LibSVM, and our wrapper has 23 hyperparameters because we treated each possible kernel as a different classifier, with its own set of hyperparameters: Linear(4), RBF(5), Polynomial(7) and Sigmoid(6).” 
[sec(s) Configuration example: sklearn classifiers] “[Image: media_image4.png] [Image: media_image5.png]” 
[sec(s) Example usage] 
“Following Scikit-learn’s convention, Hyperopt-Sklearn provides an Estimator class with a fit method and a predict method. The fit method of this class performs hyperparameter optimization, and after it has completed, the predict method applies the best model to test data. Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls. … 
[Image: media_image6.png]”; e.g., “preprocessing algorithms were (by class name” and “classification algorithms were (by class name” may read on “identifier of the analysis pipeline”. In addition, e.g., “Each evaluation during optimization” may read on “validation module”.)
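
To make concrete how a parameter set can carry an identifier that selects which pipeline is validated, the following is a minimal plain-Python sketch. The registry `PIPELINES`, the key `pipeline_id`, the toy threshold models, and the data are all hypothetical and serve only to illustrate the limitation as interpreted; they are not drawn from Bergstra.

```python
# Hypothetical registry: identifier -> factory building a pipeline from parameters.
def make_scaled_threshold(params):
    scale, cut = params["scale"], params["cut"]
    # Pre-process (scale the feature), then classify by threshold.
    return lambda x: int(x * scale > cut)

def make_plain_threshold(params):
    cut = params["cut"]
    # No pre-process; classify by threshold only.
    return lambda x: int(x > cut)

PIPELINES = {"scaled": make_scaled_threshold, "plain": make_plain_threshold}

# Toy labelled data: (feature, label).
DATA = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

def validate(parameter_set):
    """Build the pipeline named by the identifier, then score it on DATA."""
    build = PIPELINES[parameter_set["pipeline_id"]]
    model = build(parameter_set)
    accuracy = sum(model(x) == y for x, y in DATA) / len(DATA)
    return model, accuracy

candidates = [
    {"pipeline_id": "plain", "cut": 0.5},
    {"pipeline_id": "plain", "cut": 0.8},
    {"pipeline_id": "scaled", "scale": 2.0, "cut": 1.9},
]
scores = [validate(c)[1] for c in candidates]
```

Here the identifier is searched over like any other entry of the parameter set, so one search covers both the choice of pipeline and its parameter values.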

Regarding claim 4
The combination of Bergstra and Louppe teaches claim 1.

Bergstra further teaches
the parameter set further includes a parameter related to the predetermined validation method, 
(Bergstra, [fig(s) 1-2] [sec(s) Example usage] “Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls.” [sec(s) Discussion] “Table 1 lists the test set scores of the best models found by cross-validation, as well as some points of reference from previous work.”; e.g., a parameter which is associated with validations may read on “parameter related to the predetermined validation method”.)

the validation module generates the analysis pipeline model and calculates the evaluation value of the analysis pipeline model in accordance with the predetermined validation method associated with an input value of the parameter related to the predetermined validation method, and, 
(Bergstra, [fig(s) 1-2] [sec(s) Scikit-learn model selection as a search problem] “The basic approach is to set up a search space with random variable hyperparameters, use Scikit-learn to implement the objective function that performs model training and model validation, and use Hyperopt to optimize the hyperparamters. … Hyperopt-Sklearn provides a parameterization of a search space over pipelines, that is, of sequences of preprocessing steps and classifiers.” [sec(s) Example usage] “Following Scikit-learn’s convention, Hyperopt-Sklearn provides an Estimator class with a fit method and a predict method. The fit method of this class performs hyperparameter optimization, and after it has completed, the predict method applies the best model to test data. Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls. … # Return instances of the classifier and # preprocessing steps model = estim.best_model()” [sec(s) Discussion] “Table 1 lists the test set scores of the best models found by cross-validation, as well as some points of reference from previous work.”; e.g., a parameter which is associated with validations may read on “parameter related to the predetermined validation method”. In addition, e.g., “Each evaluation during optimization” may read on “validation module”. Furthermore, e.g., “At the end of search, the best configuration is retrained on the whole data set to produce the classifier” and “# Return instances of the classifier and # preprocessing steps” may read on “analysis pipeline model”. Furthermore, e.g., “estimates test set accuracy on a validation set” may read on “evaluation value”.)

when the validation module is executed, the value of the parameter related to the predetermined validation method is input in addition to the analysis pipeline to which the value of the pipeline parameter included in the parameter set is applied.
(Bergstra, [fig(s) 1-2] [sec(s) Scikit-learn model selection as a search problem] “The full search space is illustrated in figure 1. The preprocessing algorithms were (by class name, followed by n. hyperparameters + n. unused hyperparameters): … The classification algorithms were (by class name (used + unused hyperparameters)): SVC(23), KNN(4+5), RandomForest(8), ExtraTrees(8), SGD(8 +4), and MultinomialNB(2) .” [sec(s) Configuration example: sklearn classifiers] 
“[Image: media_image4.png] [Image: media_image5.png]” [sec(s) Example usage] 
“Following Scikit-learn’s convention, Hyperopt-Sklearn provides an Estimator class with a fit method and a predict method. The fit method of this class performs hyperparameter optimization, and after it has completed, the predict method applies the best model to test data. Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls. … 
[Image: media_image6.png]”; e.g., “preprocessing algorithms” and “classification algorithms” may read on “analysis pipeline”. In addition, e.g., “Each evaluation during optimization” may read on “validation module”.)
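
As an illustrative sketch of a parameter set that contains a validation-method parameter alongside a pipeline parameter, the plain-Python example below passes a fold count to a k-fold cross-validation routine. The names `n_folds`, `scale`, `k_fold_indices`, and `validate`, the toy mean-threshold model, and the data are assumptions made for illustration and do not appear in the cited references.

```python
def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous (train, test) index pairs."""
    sizes = [n_samples // n_folds + (i < n_samples % n_folds) for i in range(n_folds)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def validate(data, parameter_set):
    """Cross-validate a toy model; both kinds of parameter come from the parameter set."""
    scale = parameter_set["scale"]      # pipeline parameter (hypothetical)
    n_folds = parameter_set["n_folds"]  # validation-method parameter (hypothetical)
    correct = 0
    for train, test in k_fold_indices(len(data), n_folds):
        # "Learning": the decision threshold is the scaled mean training feature.
        threshold = scale * sum(data[i][0] for i in train) / len(train)
        # "Testing": count correct predictions on the held-out fold.
        correct += sum(int(data[i][0] > threshold) == data[i][1] for i in test)
    return correct / len(data)

# Toy labelled data: (feature, label).
data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1), (0.3, 0), (0.7, 1)]
score = validate(data, {"scale": 1.0, "n_folds": 3})
```

Changing `n_folds` changes how the data are divided without altering the pipeline itself, which is the distinction between the two kinds of parameter.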

Claim(s) 5 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bergstra et al. (Hyperopt: a Python library for model selection and hyperparameter optimization) in view of Louppe et al. (An introduction to Machine Learning with Scikit-Learn) further in view of Pedregosa et al. (Scikit-learn)

Regarding claim 5
The combination of Bergstra, Louppe teaches claim 4.

(Note: Hereinafter, if a limitation has brackets (i.e. [·]) around claim languages, the bracketed claim languages indicate that they have not been taught yet by the current prior art reference but they will be taught by another prior art reference afterwards.)

the parameter related to the predetermined validation method is a parameter for specifying a [narrowing ratio] of data for learning, and 
(Bergstra, [fig(s) 1-2] [sec(s) Example usage] “Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls.” [sec(s) Discussion] “Table 1 lists the test set scores of the best models found by cross-validation, as well as some points of reference from previous work.”; e.g., a parameter which is associated with validations may read on “parameter related to the predetermined validation method”.)

the validation module, when [dividing] the data to be analyzed into data for learning for generating the analysis pipeline model and data for testing for calculating the evaluation value of the analysis pipeline model, further [narrows] the data for learning obtained by [dividing in accordance with a value of the parameter for specifying a narrowing ratio of data for learning].
(Bergstra, [fig(s) 1-2] [sec(s) Example usage] “Following Scikit-learn’s convention, Hyperopt-Sklearn provides an Estimator class with a fit method and a predict method. The fit method of this class performs hyperparameter optimization, and after it has completed, the predict method applies the best model to test data. Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls. … 

[image: media_image7.png]
”; e.g., “Each evaluation during optimization” may read on “validation module”.)

(Note: Hereinafter, if a limitation has one or more underlines, the one or more underlined claim languages indicate that they are taught by the current prior art reference, while the one or more non-underlined claim languages indicate that they have been taught already by one or more previous art references.)

Louppe further teaches 
the parameter related to the predetermined validation method is a parameter for specifying a narrowing ratio of data for learning, and
the validation module, when dividing the data to be analyzed into data for learning for generating the analysis pipeline model and data for testing for calculating the evaluation value of the analysis pipeline model, further narrows the data for learning obtained by dividing in accordance with a value of the parameter for specifying a narrowing ratio of data for learning.
(Louppe, [sec(s) Model evaluation and selection] “Split L into K small disjoint folds. Train on K-1 folds, evaluate the test error one the held-out fold. Repeat for all combinations and average the K estimates of the generalization error. … 
[image: media_image8.png]
”; e.g., “n_splits” may read on “parameter for specifying a narrowing ratio of data for learning” and “dividing”. In addition, e.g., “5” may read on “a value of the parameter for specifying a narrowing ratio of data for learning”. Furthermore, e.g., “Split L into K small disjoint folds. Train on K-1 folds, evaluate the test error one the held-out fold” may read on “dividing”. Moreover, e.g., “cross_val_score” may read on “validation module” as well. Note that the combination of Bergstra, Louppe teaches “the parameter related to the predetermined validation method is a parameter for specifying a [narrowing ratio] of data for learning, and the validation module, when [dividing] the data to be analyzed into data for learning for generating the analysis pipeline model and data for testing for calculating the evaluation value of the analysis pipeline model, further [narrows] the data for learning obtained by [dividing in accordance with a value of the parameter for specifying a narrowing ratio of data for learning]”.)
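
For illustration only, the K-fold procedure quoted above from Louppe (split into K disjoint folds, train on K-1 folds, evaluate on the held-out fold, repeat, and average) can be sketched in Python as follows. This is a minimal sketch and not the scikit-learn implementation; the function names `k_fold_indices` and `cross_val_mean` and the `score_fold` callback are hypothetical. With 10 samples and 5 splits, each training portion holds (K-1)/K = 4/5 of the samples.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for contiguous, disjoint folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_mean(n_samples, n_splits, score_fold):
    """Average the per-fold scores returned by the hypothetical `score_fold` callback."""
    scores = [score_fold(tr, te) for tr, te in k_fold_indices(n_samples, n_splits)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(10, 5))
train_fraction = len(folds[0][0]) / 10  # share of samples used for training in each round
```

Changing the number of splits changes the share of the data available for training in each round, which is the relationship the mapping above relies on.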

Bergstra is combinable with Louppe for the same rationale as set forth above with respect to claim 1.

In the alternative, Pedregosa can also be interpreted to teach the following limitation:
the parameter related to the predetermined validation method is a parameter for specifying a narrowing ratio of data for learning, and
the validation module, when dividing the data to be analyzed into data for learning for generating the analysis pipeline model and data for testing for calculating the evaluation value of the analysis pipeline model, further narrows the data for learning obtained by dividing in accordance with a value of the parameter for specifying a narrowing ratio of data for learning.
(Pedregosa, [fig(s) ] [sec(s) 3.1] “A model is trained using k-1 of the folds as training data; the resulting model is validated on the remaining part of the data (i.e., it is used as a test set to compute a performance measure such as accuracy). …

[image: media_image9.png] … [image: media_image10.png]
”; e.g., “cv” may read on “parameter for specifying a narrowing ratio of data for learning”.  In addition, e.g., “5” may read on “a value of the parameter for specifying a narrowing ratio of data for learning”. Furthermore, e.g., “A model is trained using k-1 of the folds as training data; the resulting model is validated on the remaining part of the data” may read on “dividing”. Moreover, e.g., “cross_validation” may read on “validation module” as well. Note that the combination of Bergstra, Louppe teaches “the parameter related to the predetermined validation method is a parameter for specifying a [narrowing ratio] of data for learning, and the validation module, when [dividing] the data to be analyzed into data for learning for generating the analysis pipeline model and data for testing for calculating the evaluation value of the analysis pipeline model, further [narrows] the data for learning obtained by [dividing in accordance with a value of the parameter for specifying a narrowing ratio of data for learning]”.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the parameter optimization system of Bergstra, Louppe with the narrowing of Pedregosa. 
Doing so would prevent the estimation results from depending on a particular random choice for the pair of (train, validation) sets while maintaining the number of samples that can be used for learning the model.
(Pedregosa, [sec 3.1] “However, by partitioning the available data into three sets, we drastically reduce the number of samples which can be used for learning the model, and the results can depend on a particular random choice for the pair of (train, validation) sets. A solution to this problem is a procedure called cross-validation (CV for short). A test set should still be held out for final evaluation, but the validation set is no longer needed when doing CV. In the basic approach, called k-fold CV, the training set is split into k smaller sets (other approaches are described below, but generally follow the same principles).”)


Claim(s) 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bergstra et al. (Hyperopt: a Python library for model selection and hyperparameter optimization) in view of Louppe et al. (An introduction to Machine Learning with Scikit-Learn) further in view of Gerard (US 2016/0283861 A1)

Regarding claim 6
The combination of Bergstra, Louppe teaches claim 4.

the parameter related to the predetermined validation method is a parameter for [specifying] relearning, and 
(Bergstra, [fig(s) 1-2] [sec(s) Example usage] “Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls.” [sec(s) Discussion] “Table 1 lists the test set scores of the best models found by cross-validation, as well as some points of reference from previous work.”; e.g., a parameter which is associated with validations may read on “parameter related to the predetermined validation method”. In addition, e.g., “the best configuration is retrained on the whole data set to produce the classifier” may read on “relearning”.)

the validation module generates the analysis pipeline model by the learning process using data for learning among the data to be analyzed, calculates the evaluation value of the analysis pipeline model by using data for testing among the data to be analyzed, and then updates the analysis pipeline model by further performing the learning process using the data for learning and the data for testing in accordance with a value of the parameter for [specifying] the relearning.
(Bergstra, [fig(s) 1-2] [sec(s) Example usage] “Each evaluation during optimization performs training on a large fraction of the training set, estimates test set accuracy on a validation set and returns that validation set score to the optimizer. At the end of search, the best configuration is retrained on the whole data set to produce the classifier that handles subsequent predict calls.” [sec(s) Discussion] “Table 1 lists the test set scores of the best models found by cross-validation, as well as some points of reference from previous work. … 

[image: media_image11.png]
”; e.g., “estimates test set accuracy on a validation set and returns that validation set score to the optimizer” may read on “calculates the evaluation value of the analysis pipeline model by using data for testing among the data to be analyzed”. In addition, e.g., “the best configuration is retrained on the whole data set to produce the classifier” may read on “updates the analysis pipeline model by further performing the learning process using the data for learning and the data for testing”. Furthermore, e.g., “# Return instances of the classifier and # preprocessing steps” may read on “analysis pipeline model”.)

However, the combination of Bergstra, Louppe does not appear to distinctly disclose:
the parameter related to the predetermined validation method is a parameter for [specifying] relearning, and 
the validation module generates the analysis pipeline model by the learning process using data for learning among the data to be analyzed, calculates the evaluation value of the analysis pipeline model by using data for testing among the data to be analyzed, and then updates the analysis pipeline model by further performing the learning process using the data for learning and the data for testing in accordance with a value of the parameter for [specifying] the relearning.

Gerard teaches
the parameter related to the predetermined validation method is a parameter for specifying relearning, and 
the validation module generates the analysis pipeline model by the learning process using data for learning among the data to be analyzed, calculates the evaluation value of the analysis pipeline model by using data for testing among the data to be analyzed, and then updates the analysis pipeline model by further performing the learning process using the data for learning and the data for testing in accordance with a value of the parameter for specifying the relearning.
(Gerard, [fig(s) 9-10] [par(s) 60-68] “On the other hand, if the distribution difference is greater than the distribution difference threshold, then decision 1040 branches to the 'yes' branch. At step 1050, the process generates a retraining indicator to retrain machine learning model. In one embodiment, the process may generate a notification to a system administrator, informing the system administrator to commence retraining the machine-learning model. In another embodiment, the process may automatically commence retraining the machine-learning model. … At step 1070, the process retrains the machine-learning model and generates an updated hyperplane (e.g., hyperplane 800 shown in FIG. 8) using the labeled subsequent feature vectors and the labeled baseline feature vectors.” [claim 1] “generating an indicator in response to determining that a distribution difference between the second distribution and the first distribution reaches a distribution difference threshold, wherein the machine-learning model is retrained based on the generated indicator.”; e.g., “retraining indicator to retrain machine learning model” may read on “performing the learning process using the data … in accordance with a value of the parameter for specifying the relearning”. Note that the combination of Bergstra, Louppe teaches “the parameter related to the predetermined validation method is a parameter for [specifying] relearning, and the validation module generates the analysis pipeline model by the learning process using data for learning among the data to be analyzed, calculates the evaluation value of the analysis pipeline model by using data for testing among the data to be analyzed, and then updates the analysis pipeline model by further performing the learning process using the data for learning and the data for testing in accordance with a value of the parameter for [specifying] the relearning”.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the parameter optimization system of Bergstra, Louppe with the relearning-specifying parameter of Gerard. 
Doing so would improve the performance of the estimator by adapting it to up-to-date information that changes over time.
(Gerard, [par(s) 60-68] “Over time, the knowledge manager incrementally ingests subsequent source documents that may include enough up-to-date information to require a retraining of the machine-learning model. In order to determine a time at which the machine-learning model requires retraining, the process computes and monitors the distributions of feature vectors corresponding to the subsequent source documents in combination with the baseline feature vectors (discussed below)”)
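
For illustration only, the threshold test quoted above from Gerard (generate a retraining indicator when the distribution difference is greater than a distribution difference threshold) can be sketched in Python as follows. This is a minimal sketch and not Gerard's method: the use of total variation distance over categorical values as the difference measure, the function names `total_variation` and `retraining_indicator`, the toy data, and the 0.2 threshold are all assumptions made for the example.

```python
from collections import Counter


def total_variation(baseline, current):
    """Half the L1 distance between two empirical distributions (an assumed measure)."""
    p, q = Counter(baseline), Counter(current)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p[k] / len(baseline) - q[k] / len(current)) for k in keys)


def retraining_indicator(baseline, current, threshold):
    """Return True when the distribution difference is greater than the threshold."""
    return total_variation(baseline, current) > threshold


# Toy observations (assumed): the share of "a" drops from 80% to 30%.
baseline = ["a"] * 8 + ["b"] * 2
shifted = ["a"] * 3 + ["b"] * 7

needs_retraining = retraining_indicator(baseline, shifted, threshold=0.2)
unchanged = retraining_indicator(baseline, baseline, threshold=0.2)
```

A True indicator would correspond to the point at which retraining is commenced, whether by notifying an administrator or automatically.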

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Adams et al. (US 2014/0358831 A1) teaches finding hyper-parameters based on evaluations using an objective function.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEHWAN KIM whose telephone number is (571)270-7409. The examiner can normally be reached Mon - Thu 7:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley, can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/S.K./Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129