Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the application papers dated 12/31/2019.
Claims 1-20 are pending in the application.  
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA  35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA  35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claim 20 is rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. The limitations recited in claim 20 are already recited in parent claim 19, so claim 20 does not further limit claim 19. Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4-6, 13-15, 19, and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Per claims 4-6, 13-15, and 19, the recited set is selected from an open list of alternatives; it is therefore unclear what other alternatives are intended to be encompassed by the claim. See MPEP 2173.05(h). It is recommended to replace “comprise” or “comprises” with “is” or “consists of.”
Per claim 20, the claim is rejected because it depends from claim 19.
Claim Rejections - 35 USC § 101
	35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


 Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Specifically, claims 1-20 are directed to an abstract idea. 
	Per claim 1, the claim is directed to an idea of itself, namely mental processes that can be performed in the human mind or by a human using pen and paper. The steps of training a first machine learning model, determining a first performance metric, determining one or more past performance metrics, selecting a second machine learning model, and selecting a second data processing package can be pure mental processes. The training based on data can be done mentally, as the amount of data to train on or process does not need to be large, and no particular manner of training or model implementation is recited that could be considered an additional limitation that is significantly more. The additional limitations, the processors and the memory, are recited at a high level of generality for applying or performing the abstract idea. Furthermore, the term "automatically" does not make the selecting step less abstract, as the selection can be made merely on a generic computer without reciting a particular manner of such automatic selection. Hence, they do not indicate any integration of the abstract idea into a practical application, as the mental steps are merely applied on and performed using a generic computer. See MPEP 2106.05(f) and 2106.05(h). It is noted that employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more, similar to how limiting the abstract idea in Flook to petrochemical and oil-refining industries was insufficient. Therefore, the additional limitations do not integrate the abstract idea into a practical application. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). Viewing the limitations individually and as a combination, the additional elements merely perform the mental steps on a generic computer without integrating the abstract idea into a practical application. For at least these reasons, claim 1 is not patent eligible.
 	Per claims 2-9, these claims are directed to the same abstract idea as claim 1 and recite further details of that abstract idea without adding any additional element that amounts to significantly more. Therefore, the claims are rejected for the same reasons as claim 1.
 	Per claim 10, the claim is directed to an idea of itself, namely mental processes that can be performed in the human mind or by a human using pen and paper. The steps of iteratively selecting different machine learning models and selecting a first data package can be pure mental processes. The additional limitation, the step of receiving by a service provider system, is mere data gathering for the selection steps; specifically, the receiving step is performed in order to gather data so that the selections based on the received data can be made, and it is therefore a necessary precursor to the selections. Therefore, the additional limitation does not integrate the abstract idea into a practical application. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). At most, the receiving is not found to include anything more than what is well-understood, routine, conventional activity in the field. In this case, it is noted that the claimed extra-solution activity of data gathering, or of outputting a result from gathered and analyzed data, is acknowledged to be well-understood, routine, conventional (WURC) activity, as recognized by the courts in the examples of MPEP 2106.05(d)(II) (for example, data gathering and retrieving, storing data, and displaying a result: Symantec, Versata Dev., Content Extraction, Electric Power Group). Insignificant extra-solution activities or mere instructions to apply an exception using generic computer components cannot provide an inventive concept.
Viewing the limitations individually and as a combination, the additional element merely performs data gathering for the mental steps without integrating the abstract idea into a practical application. For at least these reasons, claim 10 is not patent eligible. 
 	Per claims 11-17, these claims are directed to the same abstract idea as claim 10 and recite further details of that abstract idea without adding any additional element that amounts to significantly more. Therefore, the claims are rejected for the same reasons as claim 10.
 	Per claim 18, the claim is directed to an idea of itself, namely mental processes that can be performed in the human mind or by a human using pen and paper. The steps of iteratively selecting different machine learning models and selecting a first data package can be pure mental processes. One additional limitation, the step of receiving by a service provider system, is mere data gathering for the selection steps; specifically, the receiving step is performed in order to gather data so that the selections based on the received data can be made, and it is therefore a necessary precursor to the selections. The other additional limitation, a non-transitory computer readable medium, is recited in the preamble and described at a high level of generality for applying or performing the abstract idea on a generic computer, without reciting a particular manner of making the selections. Hence, they do not indicate any integration of the abstract idea into a practical application, as the mental steps are merely applied on and performed using a generic computer. See MPEP 2106.05(f) and 2106.05(h). It is noted that employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more, similar to how limiting the abstract idea in Flook to petrochemical and oil-refining industries was insufficient. Therefore, the additional limitations do not integrate the abstract idea into a practical application. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process).
At most, the receiving is not found to include anything more than what is well-understood, routine, conventional activity in the field. In this case, it is noted that the claimed extra-solution activity of data gathering, or of outputting a result from gathered and analyzed data, is acknowledged to be well-understood, routine, conventional (WURC) activity, as recognized by the courts in the examples of MPEP 2106.05(d)(II) (for example, data gathering and retrieving, storing data, and displaying a result: Symantec, Versata Dev., Content Extraction, Electric Power Group). Insignificant extra-solution activities or mere instructions to apply an exception using generic computer components cannot provide an inventive concept. Viewing the limitations individually and as a combination, the additional elements merely perform data gathering for the mental steps performed on a generic computer without integrating the abstract idea into a practical application. For at least these reasons, claim 18 is not patent eligible.
 	Per claims 19 and 20, these claims are directed to the same abstract idea as claim 18 and recite further details of that abstract idea without adding any additional element that amounts to significantly more. Therefore, the claims are rejected for the same reasons as claim 18.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 2, 4-15, and 18 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Volodarskiy et al. (US20200175354, hereafter Volodarskiy).
 	1. A system, comprising: one or more hardware processors; and at least one memory storing computer-executable instructions, that in response to execution by the one or more hardware processors, causes the system to perform operations comprising:
 training a first machine learning model based on a set of training data, the first machine learning model being trained using a first data processing package from a set of data processing packages, a first feature selection package from a set of feature selection packages, and a first subset of model parameters from a set of model parameters (Volodarskiy, see at least [0008] select a plurality of machine-learning algorithms; generate a batch of trials from the plurality of machine-learning algorithms; begin executing at least a portion of the batch of trials; and, during execution of the batch of trials, provide intermediate evaluation results by, in each of one or more iterations, estimating a training time and model accuracy for two or more of models represented in the batch of trials, selecting at least one of the two or more models with a highest estimated model accuracy and an estimated training time within a predefined training time; [0026] Selection module 113 is able to offer a plurality of available algorithms for selection; [0049] the method may use a combination of the estimated training time and the a priori (pre-training) estimated accuracy of the algorithm to choose the next best algorithm to train; [0062] Specifically, the optimization algorithm may select, for the next batch of models to be trialed, those candidates with the highest accuracy based on previous trials. Then, a train-time estimation algorithm selects the models with the fastest train times that fit the user-specified time frame from among these candidates. The selected models are then evaluated; [0066] the plurality of machine-learning algorithms may be selected at random and/or in random order. In addition, selection module 113 may generate a random set of hyperparameters to be used with each of the machine-learning algorithms. 
For example, selection module 113 may generate a trial by selecting a machine-learning algorithm and a set of hyperparameters to be used in a trial with the selected machine-learning algorithm. It should be understood that selection module 113 may generate a plurality of trials for any given machine-learning algorithm by selecting different sets of hyperparameters; [0083] Estimation service 420 receives the batch of models 460 from trial-based optimization service 410, and selects a subset of the batch of models 460, having the fastest estimated train times, to produce a set of sampled models 465. Estimation service 420 provides the sampled models 465 to model evaluation service 430 for evaluation. In an embodiment, one or more models may be randomly selected and added to the set of sampled models 465 for evaluation, in order to prevent local minima; [0054] the application determines the features to be used by the machine-learning algorithms and/or the target feature to be predicted by the machine-learning algorithms …Each field name may be associated with one or more inputs, including, without limitation, inputs for selecting a data type (e.g., integer, categorical, etc.) to be used for the field, specifying a filter to be used for the values in the field… select a type of machine-learning algorithm to be used (e.g., regression or classification) and initiate the automated evaluation of a plurality of available machine-learning algorithms of the selected type); [0025] the application may enable a user to select one or more algorithms, optimize hyperparameters for the algorithm(s), and deploy the selected algorithm(s) with the optimized hyperparameters to the user's cloud services;[0049] select a plurality of machine-learning algorithms).  
 	based on the training, determining a first performance metric corresponding to the first machine learning model; determining one or more past performance metrics corresponding to one or more machine learning models that were previously trained based on the set of training data (Volodarskiy, see at least [0062] Specifically, the optimization algorithm may select, for the next batch of models to be trialed, those candidates with the highest accuracy based on previous trials; [0074] steps 422-426 may be performed iteratively, as new data is accumulated (e.g., in database(s) 114) by trial-based optimization service 410; [0063] This time-based and accuracy-based model selection can reduce the time between the start of the evaluation process (e.g., represented by step 350 in process 300) and the presentation of at least an initial leaderboard (e.g., represented by step 360 in process 300) from which a user can select one or more models. Selection module 113 may also provide a continually updated view of the progress of the evaluation process; [0068]; [0074]; [0080] the user can iteratively track the progress of the evaluation results as they are posted and/or updated; [0056] provide statistics about the evaluation (e.g., number of worker threads used, CPU usage for each worker thread, memory usage for each worker thread, etc.) within the graphical user interface; [0069] In step 422, train-time and/or accuracy estimation service 420 (or trial-based optimization service 410) determines whether or not there are sufficient data to perform an intermediate selection of models to be evaluated. The data in this case may comprise the trial statistics stored (e.g., in database(s) 114) by trial-based optimization service 410; [0070]).
based on the first performance metric and the one or more past performance metrics, automatically selecting a second machine learning model to train based on the set of training data, the automatically selecting further comprising  (Volodarskiy, see at least [0025]; [0059] server application 112 may access the user's cloud service to directly deploy the selected model(s) on the user's cloud service; [0069]; [0071] selects one or more models with the fastest estimated training times. The training time for each model may be determined based on the execution times of the trial executed for that model. Estimation service 420 may select any models for which the estimated training time is below a predetermined threshold amount of time (e.g., a user-specified or system-wide threshold amount of time) … select a predetermined number or percentage of models with the fastest estimated training times. In any case, the selected model(s) are provided to model evaluation service 430. Specifically, process 400 proceeds to step 432A. In this manner, slower models are initially filtered out from evaluation by model evaluation service 430 in order to provide faster results (e.g., in the leaderboard);[0062] Specifically, the optimization algorithm may select, for the next batch of models to be trialed, those candidates with the highest accuracy based on previous trials. Then, a train-time estimation algorithm selects the models with the fastest train times that fit the user-specified time frame from among these candidates; [0072] In an embodiment, in step 426, estimation service 420 also selects the one or more models based on accuracy, in addition to their estimated training times. Specifically, estimation service 420 may select a subset of models with the highest estimated accuracy and an estimated training time within a predefined time frame (e.g., a user-specified time frame). 
For example, each model in the batch of sampled models may be ranked according to a score, and a subset of models with the highest score (e.g., one model with the highest score, three models having the three highest scores, etc.) may be selected for the next evaluation. The score for each model may comprise a product of the probability of the model having a training time within the predefined time frame multiplied by the estimated accuracy of the model. Advantageously, the selection of models based on both accuracy and training time appropriately manages the tradeoff between time and efficacy; [0073] For the sake of illustration, a non-limiting example of an estimation service 420 that selects model(s) based on accuracy; [0082]; [0087]; [0089] model evaluation service 430 may select a model with the highest evaluation score (e.g., representing the highest accuracy). Model evaluation service 430 provides the selected train-time and/or accuracy model to estimation service 420. Estimation service 420 may then use the train-time and/or accuracy model and meta-features to predict the training times and/or accuracy for models in the future, for example, as trials for those models are executed by trial-based optimization service 410).
selecting a second data processing package from the set of data processing packages, a second feature selection package from the set of feature selection packages, and a second subset of model parameters from the set of model parameters (Volodarskiy, see at least [0066] In step 405, selection module 113 generates a batch of trials for a plurality of machine-learning algorithms to be evaluated. In an embodiment, the plurality of machine-learning algorithms may be selected at random and/or in random order. In addition, selection module 113 may generate a random set of hyperparameters to be used with each of the machine-learning algorithms. For example, selection module 113 may generate a trial by selecting a machine-learning algorithm and a set of hyperparameters to be used in a trial with the selected machine-learning algorithm. It should be understood that selection module 113 may generate a plurality of trials for any given machine-learning algorithm by selecting different sets of hyperparameters; [0083] Estimation service 420 receives the batch of models 460 from trial-based optimization service 410, and selects a subset of the batch of models 460, having the fastest estimated train times, to produce a set of sampled models 465. Estimation service 420 provides the sampled models 465 to model evaluation service 430 for evaluation. In an embodiment, one or more models may be randomly selected and added to the set of sampled models 465 for evaluation, in order to prevent local minima). 
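For illustration only, the score-based ranking quoted from Volodarskiy [0072] in the mapping of claim 1 (a score equal to the probability that a model trains within the predefined time frame multiplied by its estimated accuracy, with the highest-scoring models selected for the next evaluation) can be sketched as follows. This sketch is not taken from the reference; the names Candidate, p_within_time, est_accuracy, and select_top, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one model in a batch of sampled models."""
    name: str
    p_within_time: float  # assumed: probability training finishes in the time frame
    est_accuracy: float   # assumed: estimated accuracy of the model

    @property
    def score(self) -> float:
        # Product of time-frame probability and estimated accuracy, per [0072].
        return self.p_within_time * self.est_accuracy


def select_top(candidates: List[Candidate], k: int = 1) -> List[Candidate]:
    """Return the k candidates with the highest score, best first."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:k]


# Made-up example values, chosen only to show the ranking.
candidates = [
    Candidate("deep_neural_network", 0.25, 0.8),
    Candidate("logistic_regression", 0.9, 0.5),
    Candidate("random_forest", 0.5, 0.8),
]
selected = select_top(candidates, k=2)
```

With these made-up values, the slower but nominally more accurate candidate is ranked last, which reflects the time-versus-accuracy tradeoff the cited passage describes.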
2. The system of claim 1, wherein a previously trained machine learning model, of the one or more machine learning models that were previously trained based on the set of training data, was randomly selected (Volodarskiy, see at least [0062] Specifically, the optimization algorithm may select, for the next batch of models to be trialed, those candidates with the highest accuracy based on previous trials. Then, a train-time estimation algorithm selects the models with the fastest train times that fit the user-specified time frame from among these candidates. The selected models are then evaluated; [0066] the plurality of machine-learning algorithms may be selected at random and/or in random order. In addition, selection module 113 may generate a random set of hyperparameters to be used with each of the machine-learning algorithms. For example, selection module 113 may generate a trial by selecting a machine-learning algorithm and a set of hyperparameters to be used in a trial with the selected machine-learning algorithm. It should be understood that selection module 113 may generate a plurality of trials for any given machine-learning algorithm by selecting different sets of hyperparameters; [0083] Estimation service 420 receives the batch of models 460 from trial-based optimization service 410, and selects a subset of the batch of models 460, having the fastest estimated train times, to produce a set of sampled models 465. Estimation service 420 provides the sampled models 465 to model evaluation service 430 for evaluation. In an embodiment, one or more models may be randomly selected and added to the set of sampled models 465 for evaluation, in order to prevent local minima).
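For illustration only, the random trial generation quoted from Volodarskiy [0066] (a trial formed by selecting a machine-learning algorithm at random together with a random set of hyperparameters) can be sketched as follows. This sketch is not taken from the reference; SEARCH_SPACE, generate_trials, and the listed hyperparameter names and values are hypothetical.

```python
import random
from typing import Any, Dict, List, Tuple

# Hypothetical search space: algorithm name -> hyperparameter name -> allowed values.
SEARCH_SPACE: Dict[str, Dict[str, List[Any]]] = {
    "logistic_regression": {"C": [0.01, 0.1, 1.0, 10.0]},
    "random_forest": {"n_estimators": [50, 100, 200], "max_depth": [4, 8, 16]},
    "k_nearest_neighbor": {"n_neighbors": [3, 5, 11]},
}

Trial = Tuple[str, Dict[str, Any]]


def generate_trials(n_trials: int, seed: int) -> List[Trial]:
    """Build a batch of trials, each a randomly chosen algorithm plus
    randomly chosen hyperparameter values for that algorithm."""
    rng = random.Random(seed)  # seeded so a batch can be reproduced
    trials: List[Trial] = []
    for _ in range(n_trials):
        algorithm = rng.choice(sorted(SEARCH_SPACE))
        hyperparameters = {
            name: rng.choice(values)
            for name, values in SEARCH_SPACE[algorithm].items()
        }
        trials.append((algorithm, hyperparameters))
    return trials


batch = generate_trials(n_trials=5, seed=0)
```

Because each draw is independent, the same algorithm may appear in several trials with different hyperparameter sets, consistent with the statement in [0066] that a plurality of trials may be generated for any given algorithm.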
 	4. The system of claim 1, wherein the set of data processing packages comprise at least one of a z-scale transformation package, a categorical variable binning package, or a variable multiplication package (Volodarskiy, see at least [0054] the application determines the features to be used by the machine-learning algorithms and/or the target feature to be predicted by the machine-learning algorithms. For example, the application may generate one or more screens of the graphical user interface to include a list of all of the field names identified in the raw data. Each field name may be associated with one or more inputs, including, without limitation, inputs for selecting a data type (e.g., integer, categorical, etc.) to be used for the field, specifying a filter to be used for the values in the field, specifying a default value to be used for missing values in the field, selecting the field as a feature to be used in each machine-learning algorithm).
5. The system of claim 1, wherein the set of feature selection packages comprise at least one of an information value feature selection package, a correlation based feature selection package, a feature importance package, or a T-test features selection package (Volodarskiy, see at least [0054] In step 330, the application determines the features to be used by the machine-learning algorithms and/or the target feature to be predicted by the machine-learning algorithms … Each field name may be associated with one or more inputs, including, without limitation, inputs for selecting a data type (e.g., integer, categorical, etc.) to be used for the field, specifying a filter to be used for the values in the field, specifying a default value to be used for missing values in the field, selecting the field as a feature to be used in each machine-learning algorithm, selecting the field as a target feature to be predicted by each machine-learning algorithm, viewing actual values of the field in the dataset … including, without limitation, a feature correlation, the number of unique values for the field, a range of values for the field, a number of missing values for the field, and/or the like … select a type of machine-learning algorithm to be used (e.g., regression or classification) and initiate the automated evaluation of a plurality of available machine-learning algorithms of the selected type). 
 	6. The system of claim 1, wherein the set of model parameters comprise at least one of a set of machine learning platforms, a set of model algorithms, and a set of model hyperparameters (Volodarskiy, see at least [0025] the application may enable a user to select one or more algorithms, optimize hyperparameters for the algorithm(s), and deploy the selected algorithm(s) with the optimized hyperparameters to the user's cloud services. The combination of the algorithm(s) and associated hyperparameters will be referred to herein as a “model.” [0026] Selection module 113 is able to offer a plurality of available algorithms for selection. These available algorithms may comprise basic regression and/or classification algorithms, including, without limitation, logistic regression, linear regression, polynomial regression, k-nearest neighbor, and/or random forest algorithms. The available algorithms may also comprise more complex algorithms, such as deep-learning algorithms or deep neural networks. In addition, selection module 113 may enable users to set appropriate hyperparameters for the training process, and allows users to combine a plurality of algorithms into an ensemble algorithm; [0049] select a plurality of machine-learning algorithms, generate a batch of trials from the plurality of machine-learning algorithms, begin executing at least a portion of the batch of trials, and, during execution of the batch of trials, provide intermediate evaluation results. It may involve estimating a training time and accuracy for two or more models represented in the batch of trials, and then selecting the best algorithm and hyperparameter settings to train from an available set of algorithm/hyperparameter setting combinations based on a time constraint set by the user … to choose the next best algorithm to train).  
7. The system of claim 1, wherein the operations further comprise: based on training the second machine learning model, determining a second performance metric corresponding to the second machine learning model; and in response to determining that the second performance metric satisfies a metric threshold, selecting the second machine learning model as a final model to use in a live operating environment (Volodarskiy, see at least [0025]; [0059] server application 112 may access the user's cloud service to directly deploy the selected model(s) on the user's cloud service; [0069]; [0071] selects one or more models with the fastest estimated training times. The training time for each model may be determined based on the execution times of the trial executed for that model. Estimation service 420 may select any models for which the estimated training time is below a predetermined threshold amount of time (e.g., a user-specified or system-wide threshold amount of time) … select a predetermined number or percentage of models with the fastest estimated training times. In any case, the selected model(s) are provided to model evaluation service 430. Specifically, process 400 proceeds to step 432A. In this manner, slower models are initially filtered out from evaluation by model evaluation service 430 in order to provide faster results (e.g., in the leaderboard);[0062] Specifically, the optimization algorithm may select, for the next batch of models to be trialed, those candidates with the highest accuracy based on previous trials. Then, a train-time estimation algorithm selects the models with the fastest train times that fit the user-specified time frame from among these candidates; [0072] In an embodiment, in step 426, estimation service 420 also selects the one or more models based on accuracy, in addition to their estimated training times. 
Specifically, estimation service 420 may select a subset of models with the highest estimated accuracy and an estimated training time within a predefined time frame (e.g., a user-specified time frame). For example, each model in the batch of sampled models may be ranked according to a score, and a subset of models with the highest score (e.g., one model with the highest score, three models having the three highest scores, etc.) may be selected for the next evaluation. The score for each model may comprise a product of the probability of the model having a training time within the predefined time frame multiplied by the estimated accuracy of the model. Advantageously, the selection of models based on both accuracy and training time appropriately manages the tradeoff between time and efficacy; [0073] For the sake of illustration, a non-limiting example of an estimation service 420 that selects model(s) based on accuracy; [0082]; [0087]; [0089] model evaluation service 430 may select a model with the highest evaluation score (e.g., representing the highest accuracy). Model evaluation service 430 provides the selected train-time and/or accuracy model to estimation service 420. Estimation service 420 may then use the train-time and/or accuracy model and meta-features to predict the training times and/or accuracy for models in the future, for example, as trials for those models are executed by trial-based optimization service 410).
 	8. The system of claim 1, wherein the automatically selecting the second machine learning model to train is performed in response to determining that the first performance metric corresponding to the first machine learning model fails to satisfy a metric threshold (Volodarskiy, see at least [0071] In step 426, estimation service 420 selects one or more models with the fastest estimated training times. The training time for each model may be determined based on the execution times of the trial executed for that model. Estimation service 420 may select any models for which the estimated training time is below a predetermined threshold amount of time (e.g., a user-specified or system-wide threshold amount of time) … slower models are initially filtered out from evaluation by model evaluation service 430 in order to provide faster results (e.g., in the leaderboard); [0062] select, for the next batch of models to be trialed, those candidates with the highest accuracy based on previous trials. Then, a train-time estimation algorithm selects the models with the fastest train times that fit the user-specified time frame from among these candidates. The selected models are then evaluated; [0072] In an embodiment, in step 426, estimation service 420 also selects the one or more models based on accuracy, in addition to their estimated training times. Specifically, estimation service 420 may select a subset of models with the highest estimated accuracy and an estimated training time within a predefined time frame (e.g., a user-specified time frame). …three models having the three highest scores, etc.) may be selected for the next evaluation; [0089]).   
9. The system of claim 1, wherein at least one of the first data processing package, the first features selection package, or the first subset of model parameters is different than the second data processing package, the second features selection package, or the second subset of model parameters, respectively (Volodarskiy, see at least [0066] selection module 113 may generate a random set of hyperparameters to be used with each of the machine-learning algorithms. For example, selection module 113 may generate a trial by selecting a machine-learning algorithm and a set of hyperparameters to be used in a trial with the selected machine-learning algorithm. It should be understood that selection module 113 may generate a plurality of trials for any given machine-learning algorithm by selecting different sets of hyperparameters; [0084] select additional machine-learning algorithms and/or hyperparameters (e.g., based on the evaluation results), and provide those additional models to trial-based optimization service 410 to generate new trials. In an embodiment, this loop may continue until stopped by a user;[0086] algorithm generation service 510, dataset generation service 520, and/or feature extraction service 530 may be implemented by selection module 113 of server application 112; [0089] Model evaluation service 430 evaluates the machine-learning algorithms, and selects the machine-learning algorithm(s) and hyperparameters to be used as the train-time and/or accuracy model. For example, model evaluation service 430 may select a model with the highest evaluation score (e.g., representing the highest accuracy). Model evaluation service 430 provides the selected train-time and/or accuracy model to estimation service 420. 
Estimation service 420 may then use the train-time and/or accuracy model and meta-features to predict the training times and/or accuracy for models in the future, for example, as trials for those models are executed by trial-based optimization service 410).
 	10. A method, comprising:  receiving, by a service provider system comprising one or more hardware processors, a set of training data; and iteratively selecting different machine learning models to train using the set of training data until a performance metric corresponding to a final machine learning model satisfies a metric threshold, wherein the iteratively selecting comprises selecting a first machine learning model of the different machine learning models based on respective performance metrics corresponding to previously trained machine learning models of the different machine learning models (Volodarskiy, see at least [0074] steps 422-426 may be performed iteratively, as new data is accumulated (e.g., in database(s) 114) by trial-based optimization service 410; [0063] This time-based and accuracy-based model selection can reduce the time between the start of the evaluation process (e.g., represented by step 350 in process 300) and the presentation of at least an initial leaderboard (e.g., represented by step 360 in process 300) from which a user can select one or more models. Selection module 113 may also provide a continually updated view of the progress of the evaluation process; [0068]; [0074] In an embodiment, steps 422-426 may be performed iteratively, as new data is accumulated (e.g., in database(s) 114) by trial-based optimization service 410. 
In other words, after step 426, estimation service 420 may return to step 422 to check whether or not there is sufficient new data to estimate training times for another sample of models; [0080] the user can iteratively track the progress of the evaluation results as they are posted and/or updated; [0062] Specifically, the optimization algorithm may select, for the next batch of models to be trialed, those candidates with the highest accuracy based on previous trials; [0074] steps 422-426 may be performed iteratively, as new data is accumulated (e.g., in database(s) 114) by trial-based optimization service 410; [0063] This time-based and accuracy-based model selection can reduce the time between the start of the evaluation process (e.g., represented by step 350 in process 300) and the presentation of at least an initial leaderboard (e.g., represented by step 360 in process 300) from which a user can select one or more models. Selection module 113 may also provide a continually updated view of the progress of the evaluation process; [0068]; [0074]; [0080] the user can iteratively track the progress of the evaluation results as they are posted and/or updated; [0056] provide statistics about the evaluation (e.g., number of worker threads used, CPU usage for each worker thread, memory usage for each worker thread, etc.) within the graphical user interface; [0069] In step 422, train-time and/or accuracy estimation service 420 (or trial-based optimization service 410) determines whether or not there are sufficient data to perform an intermediate selection of models to be evaluated. The data in this case may comprise the trial statistics stored (e.g., in database(s) 114) by trial-based optimization service 410; [0070]; [0089]).
  and wherein the selecting the first machine learning model further comprises: selecting a first data processing package from a set of data processing packages, a first feature selection package from a set of feature selection packages, and a first subset of model parameters from a set of model parameters (Volodarskiy, see at least [0066] In step 405, selection module 113 generates a batch of trials for a plurality of machine-learning algorithms to be evaluated. In an embodiment, the plurality of machine-learning algorithms may be selected at random and/or in random order. In addition, selection module 113 may generate a random set of hyperparameters to be used with each of the machine-learning algorithms. For example, selection module 113 may generate a trial by selecting a machine-learning algorithm and a set of hyperparameters to be used in a trial with the selected machine-learning algorithm. It should be understood that selection module 113 may generate a plurality of trials for any given machine-learning algorithm by selecting different sets of hyperparameters; [0083] Estimation service 420 receives the batch of models 460 from trial-based optimization service 410, and selects a subset of the batch of models 460, having the fastest estimated train times, to produce a set of sampled models 465. Estimation service 420 provides the sampled models 465 to model evaluation service 430 for evaluation. In an embodiment, one or more models may be randomly selected and added to the set of sampled models 465 for evaluation, in order to prevent local minima). 
  
 	11. The method of claim 10, further comprising: prior to the iteratively selecting the different machine learning models, randomly selecting a model to train using the set of training data (Volodarskiy, see at least [0063] This time-based and accuracy-based model selection can reduce the time between the start of the evaluation process (e.g., represented by step 350 in process 300) and the presentation of at least an initial leaderboard (e.g., represented by step 360 in process 300) from which a user can select one or more models. Selection module 113 may also provide a continually updated view of the progress of the evaluation process; [0068]; [0074] In an embodiment, steps 422-426 may be performed iteratively, as new data is accumulated (e.g., in database(s) 114) by trial-based optimization service 410. In other words, after step 426, estimation service 420 may return to step 422 to check whether or not there is sufficient new data to estimate training times for another sample of models; [0080] the user can iteratively track the progress of the evaluation results as they are posted and/or updated;  [0066] In step 405, selection module 113 generates a batch of trials for a plurality of machine-learning algorithms to be evaluated. In an embodiment, the plurality of machine-learning algorithms may be selected at random and/or in random order. In addition, selection module 113 may generate a random set of hyperparameters to be used with each of the machine-learning algorithms …one or more models may be randomly selected and added to the set of sampled models 465 for evaluation, in order to prevent local minima. This provides a tradeoff between exploration and exploitation. The amount (e.g., number or percentage) of random models to be included in the sampled models 465 may be defined by the user and/or the system).
12. The method of claim 11, wherein the randomly selected model is associated with at least one of a randomly selected data processing package from the set of data processing packages, randomly selected feature selection package from of the set of feature selection packages, or randomly selected model parameters from the set of model parameters (Volodarskiy, see at least [0066] In step 405, selection module 113 generates a batch of trials for a plurality of machine-learning algorithms to be evaluated. In an embodiment, the plurality of machine-learning algorithms may be selected at random and/or in random order. In addition, selection module 113 may generate a random set of hyperparameters to be used with each of the machine-learning algorithms. For example, selection module 113 may generate a trial by selecting a machine-learning algorithm and a set of hyperparameters to be used in a trial with the selected machine-learning algorithm. It should be understood that selection module 113 may generate a plurality of trials for any given machine-learning algorithm by selecting different sets of hyperparameters; [0083] Estimation service 420 receives the batch of models 460 from trial-based optimization service 410, and selects a subset of the batch of models 460, having the fastest estimated train times, to produce a set of sampled models 465. Estimation service 420 provides the sampled models 465 to model evaluation service 430 for evaluation. In an embodiment, one or more models may be randomly selected and added to the set of sampled models 465 for evaluation, in order to prevent local minima).
Per claim 13, it is the method version of claim 6 and is rejected for the same reasons set forth in connection with the rejection of claim 6 above.
	14. The method of claim 13, wherein the set of model algorithms comprises at least one of a neural network, a recurrent neural network, a logistic regression, a gradient boosted tree, or a random forest (Volodarskiy, see at least [0026] Selection module 113 is able to offer a plurality of available algorithms for selection. These available algorithms may comprise basic regression and/or classification algorithms, including, without limitation, logistic regression, linear regression, polynomial regression, k-nearest neighbor, and/or random forest algorithms).
 	15. The method of claim 13, wherein the set of model hyperparameters comprises at least one of a learning rate, an activation function, a number of iterations, number of trees, a maximum depth, a dropout rate, a number of hidden layers, or a number of hidden nodes (Volodarskiy, see at least [0056] over k iterations, a single subset is selected for testing the model, while the remaining k−1 subsets are used for training the model, such that, across all k iterations, each subset is used once for testing the model … represent its progress (e.g., status, percentage complete, etc.) … predetermined number of trials (e.g., one, two, five, ten, etc.) and/or models have been successfully executed).
Per claim 18, it is the medium version of claim 10 and is rejected for the same reasons set forth in connection with the rejection of claim 10 above.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Volodarskiy in view of Dent et al. (US20200012962, hereafter Dent).
 	Per claim 3:
Volodarskiy does not explicitly teach wherein the determining the one or more past performance metrics comprises an area under curve calculation. Dent teaches wherein the determining the one or more past performance metrics comprises an area under curve calculation (Dent, see at least [0032] the performance module 120 can be configured to generate a prediction score for a machine learning model 110 which can be used to evaluate performance of the machine learning model 110. Various methods can be used to generate a prediction score, including: F1 score (also F-score or F-measure), area under curve (AUC) metric, receiver operating characteristic (ROC) curve metric, relevance and ranking (in information retrieval), as well as other scoring methods; [0042]; [0056] As in block 1150, the datasets can be input to the machine learning model to train the machine learning model to generate predictions of the target metric. In one example, the target metric can be analyzed to determine a machine learning type (e.g., classification, linear regression, etc.) associated with the target metric and a machine learning model that corresponds to the machine learning type can be selected. In another example, a listing of available machine learning models of various machine learning types can be provided to a user and the user can select a machine learning model that corresponds to a target metric. In some examples, multiple machine learning models can be trained using various machine learning algorithms and the datasets. Performance of the multiple machine learning models to predict the target metric can be determined and the automated ML platform can be used to select one of the machine learning models).
It would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Volodarskiy with Dent’s area under curve calculation to modify Volodarskiy’s system to include the AUC calculation as taught by Dent, with a reasonable expectation of success, since they are analogous art because they are from the same field of endeavor related to machine learning. Combining Dent’s functionality with that of Volodarskiy results in a system that allows the AUC metric to be used for performance metrics. The modification would be obvious because one having ordinary skill in the art would be motivated to make this combination to enable use of the known AUC metric to accurately estimate or optimize the model (Dent, see at least [0032] the performance module 120 can be configured to generate a prediction score for a machine learning model 110 which can be used to evaluate performance of the machine learning model 110. Various methods can be used to generate a prediction score, including: F1 score (also F-score or F-measure), area under curve (AUC) metric, receiver operating characteristic (ROC) curve metric, relevance and ranking (in information retrieval), as well as other scoring methods; [0042]; [0056] As in block 1150, the datasets can be input to the machine learning model to train the machine learning model to generate predictions of the target metric. In one example, the target metric can be analyzed to determine a machine learning type (e.g., classification, linear regression, etc.) associated with the target metric and a machine learning model that corresponds to the machine learning type can be selected. In another example, a listing of available machine learning models of various machine learning types can be provided to a user and the user can select a machine learning model that corresponds to a target metric. In some examples, multiple machine learning models can be trained using various machine learning algorithms and the datasets. Performance of the multiple machine learning models to predict the target metric can be determined and the automated ML platform can be used to select one of the machine learning models).
 	Claims 16, 17, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Volodarskiy in view of Beaudoin (US 20200065708).
	Per claim 16:
Volodarskiy does not explicitly teach wherein a selection of a model algorithm from the set of model algorithms is dependent upon a selection of a machine learning platform from the set of machine learning platforms. Beaudoin teaches wherein a selection of a model algorithm from the set of model algorithms is dependent upon a selection of a machine learning platform from the set of machine learning platforms (Beaudoin, see at least [0009] determining which hardware platform to use when implementing a machine learning model; [0010] a) determining configurations of multiple hardware platforms, each of said multiple hardware platforms having different hardware configurations from each other; [0011] b) selecting a specific selected hardware platform from said multiple hardware platforms; [0012] c) training a specific machine learning model on said selected hardware platform to result in a first trained model; [0013] d) adjusting said first trained model to operate on another of said multiple hardware platforms to result in at least one second trained model; [0014] e) determine performance data of said at least one second trained model to determine efficacy and latency data for said at least one second trained model; [0016] g) determining an optimal hardware platform to use when operating said machine learning model based on said trade-off data; [0023] f) training an improved version of said specific machine learning model on said selected hardware platform to result in an improved first trained model, said first trained model and said improved first trained model having parameters that are as similar as possible to each other).
It would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Volodarskiy with Beaudoin’s machine learning platform selection to modify Volodarskiy’s system to include the platform selection as taught by Beaudoin, with a reasonable expectation of success, since they are analogous art because they are from the same field of endeavor related to machine learning. Combining Beaudoin’s functionality with that of Volodarskiy results in a system that allows machine learning platform selection. The modification would be obvious because one having ordinary skill in the art would be motivated to make this combination to enable selection of an optimal platform to use for the machine learning model to achieve execution efficacy and optimization (Beaudoin, see at least [0009] determining which hardware platform to use when implementing a machine learning model; [0010] a) determining configurations of multiple hardware platforms, each of said multiple hardware platforms having different hardware configurations from each other; [0011] b) selecting a specific selected hardware platform from said multiple hardware platforms; [0012] c) training a specific machine learning model on said selected hardware platform to result in a first trained model; [0013] d) adjusting said first trained model to operate on another of said multiple hardware platforms to result in at least one second trained model; [0014] e) determine performance data of said at least one second trained model to determine efficacy and latency data for said at least one second trained model; [0016] g) determining an optimal hardware platform to use when operating said machine learning model based on said trade-off data; [0023] f) training an improved version of said specific machine learning model on said selected hardware platform to result in an improved first trained model, said first trained model and said improved first trained model having parameters that are as similar as possible to each other).
 	17. The method of claim 16, wherein a selection of one or more model hyperparameters from the set of model hyperparameters is dependent upon the selection of the model algorithm (Volodarskiy, see at least [0026] Selection module 113 is able to offer a plurality of available algorithms for selection. These available algorithms may comprise basic regression and/or classification algorithms, including, without limitation, logistic regression, linear regression, polynomial regression, k-nearest neighbor, and/or random forest algorithms. The available algorithms may also comprise more complex algorithms, such as deep-learning algorithms or deep neural networks. In addition, selection module 113 may enable users to set appropriate hyperparameters for the training process, and allows users to combine a plurality of algorithms into an ensemble algorithm).
 	19. The non-transitory computer readable medium of claim 18, wherein the set of model parameters comprise at least one of a set of machine learning platforms, a set of model algorithms, and a set of model hyperparameters, and wherein a selection of a model algorithm from the set of model algorithms is dependent upon a selection of a machine learning platform from the set of machine learning platforms (Volodarskiy, see at least [0025] the application may enable a user to select one or more algorithms, optimize hyperparameters for the algorithm(s), and deploy the selected algorithm(s) with the optimized hyperparameters to the user's cloud services. The combination of the algorithm(s) and associated hyperparameters will be referred to herein as a “model.” [0026] Selection module 113 is able to offer a plurality of available algorithms for selection. These available algorithms may comprise basic regression and/or classification algorithms, including, without limitation, logistic regression, linear regression, polynomial regression, k-nearest neighbor, and/or random forest algorithms. The available algorithms may also comprise more complex algorithms, such as deep-learning algorithms or deep neural networks. In addition, selection module 113 may enable users to set appropriate hyperparameters for the training process, and allows users to combine a plurality of algorithms into an ensemble algorithm; [0049] select a plurality of machine-learning algorithms, generate a batch of trials from the plurality of machine-learning algorithms, begin executing at least a portion of the batch of trials, and, during execution of the batch of trials, provide intermediate evaluation results. 
It may involve estimating a training time and accuracy for two or more models represented in the batch of trials, and then selecting the best algorithm and hyperparameter settings to train from an available set of algorithm/hyperparameter setting combinations based on a time constraint set by the user … to choose the next best algorithm to train).  
 	Volodarskiy does not explicitly teach wherein a selection of a model algorithm from the set of model algorithms is dependent upon a selection of a machine learning platform from the set of machine learning platforms. Beaudoin teaches wherein a selection of a model algorithm from the set of model algorithms is dependent upon a selection of a machine learning platform from the set of machine learning platforms (Beaudoin, see at least [0009] determining which hardware platform to use when implementing a machine learning model; [0010] a) determining configurations of multiple hardware platforms, each of said multiple hardware platforms having different hardware configurations from each other; [0011] b) selecting a specific selected hardware platform from said multiple hardware platforms; [0012] c) training a specific machine learning model on said selected hardware platform to result in a first trained model; [0013] d) adjusting said first trained model to operate on another of said multiple hardware platforms to result in at least one second trained model; [0014] e) determine performance data of said at least one second trained model to determine efficacy and latency data for said at least one second trained model; [0016] g) determining an optimal hardware platform to use when operating said machine learning model based on said trade-off data; [0023] f) training an improved version of said specific machine learning model on said selected hardware platform to result in an improved first trained model, said first trained model and said improved first trained model having parameters that are as similar as possible to each other).
It would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Volodarskiy with Beaudoin’s machine learning platform selection to modify Volodarskiy’s system to include the platform selection as taught by Beaudoin, with a reasonable expectation of success, since they are analogous art because they are from the same field of endeavor related to machine learning. Combining Beaudoin’s functionality with that of Volodarskiy results in a system that allows machine learning platform selection. The modification would be obvious because one having ordinary skill in the art would be motivated to make this combination to enable selection of an optimal platform to use for the machine learning model to achieve execution efficacy and optimization (Beaudoin, see at least [0009] determining which hardware platform to use when implementing a machine learning model; [0010] a) determining configurations of multiple hardware platforms, each of said multiple hardware platforms having different hardware configurations from each other; [0011] b) selecting a specific selected hardware platform from said multiple hardware platforms; [0012] c) training a specific machine learning model on said selected hardware platform to result in a first trained model; [0013] d) adjusting said first trained model to operate on another of said multiple hardware platforms to result in at least one second trained model; [0014] e) determine performance data of said at least one second trained model to determine efficacy and latency data for said at least one second trained model; [0016] g) determining an optimal hardware platform to use when operating said machine learning model based on said trade-off data; [0023] f) training an improved version of said specific machine learning model on said selected hardware platform to result in an improved first trained model, said first trained model and said improved first trained model having parameters that are as similar as possible to each other).
 	Per claim 20, see the rejection of claim 19.
Examiner’s Note
 	The Examiner has pointed out particular references contained in the prior art of record within the body of this action for the convenience of the Applicant.  Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply.  Applicant, in preparing the response, should consider fully the entire reference as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US20210166117 is related to selecting processors of a training platform for a machine learning model training;
US20200097851 is related to selection of a particular model from machine learning models;
US20200034710 is related to selecting, based on performance of untrained models, a subset of models to be trained, and selecting a single model for deployment;
 	US20190147371 is related to selecting a trained model from trained models based on model metrics and scores;
US20190102700 is related to selecting models from a model group based on a score determined using scoring metrics;
US20210065053 is related to selecting a set of features and a particular type of model from among models;
US20210012239 is related to selecting features to train a particular machine learning model based on a type of evaluation;
US20200401886 is related to selecting input features and joining with output data according to common features;
US20190095756 is related to selecting machine learning algorithms based on performance predictions by trained regressors;
US20200242511 is related to a machine learning model with dynamic data selection.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to INSUN KANG whose telephone number is (571)272-3724. The examiner can normally be reached M-F 10 am-6 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat Do, can be reached at 571-272-3721. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/INSUN KANG/               Primary Examiner, Art Unit 2193