DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 01/10/2018 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Specification
Applicant is reminded of the proper language and format for an abstract of the disclosure.
The abstract should be in narrative form and generally limited to a single paragraph on a separate sheet within the range of 50 to 150 words in length. The abstract should describe the disclosure sufficiently to assist readers in deciding whether there is a need for consulting the full patent text for details.
The language should be clear and concise and should not repeat information given in the title. It should avoid using phrases which can be implied, such as, “The disclosure concerns,” “The disclosure defined by this invention,” “The disclosure describes,” etc.  In addition, the form and legal phraseology often used in patent claims, such as “means” and “said,” should be avoided.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6, 8, 13, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Estrada (US 2017/0220407), in view of Gibiansky (US 2016/0110657), and further in view of Streit (Maximum Likelihood Training of Probabilistic Neural Networks – 1994).

Regarding claim 1 Estrada teaches A method comprising: 
Identifying, by a processing device, a group of parameters for configuring a machine-learning model ([0023-0024]: The training phase 100 includes the data collection and transformation operation 106, which may include operations such as data preprocessing 200, data imputation 202, and discretization 204. The examiner notes that the claim does not describe how these parameters are identified or what they represent. The examiner considers the data used in the training operation of Estrada to be these parameters).
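determining, by the processing device, descriptor values ([0017]: Prediction errors against observed performance provide insight as to whether the model needs retraining (model self-validation in the execution phase 102) to guard against loss of accuracy over time. The examiner notes that the claim does not describe how these descriptor values are calculated. The examiner considers the prediction errors used in the training models of Estrada to be these descriptor values).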
for multiple versions of the machine-learning model by ([0020]-[0022]: In the adaptation phase 104 the application using the model is notified (operation 114) and an adaptation to the model is triggered (operation 116). The flow moves back to the instrumented entity 118 (e.g., a node in a network), which provides data to the training phase 100 and execution phase. The examiner notes that Estrada teaches a model induction and adaptation phase where multiple iterations of the model are created based on data that includes, but is not limited to, performance metrics).
determining, by the processing device, that a particular version of the machine- learning model has a lowest descriptor value among the descriptor values by comparing the descriptor values to one another; and ([0039]: In an embodiment, to implement the automatic verification operation, the model manager 502 is to compare an observed value from the telemetry data to a predicted value from the performance model and declare the performance model invalid when the observed value deviates from the predicted value by more than a threshold amount. The examiner notes that the claim does not describe how the descriptor values are determined. The examiner considers the observed and predicted data used in the verification process of Estrada to be these descriptor values).
executing, by the processing device, the particular version of the machine-learning model to perform a task in a computing environment based on the particular version of the machine-learning model having the lowest descriptor value ([0021]: The execution phase 102 deploys and executes the selected model (operation 110) and continually or periodically tests for new events (decision operation 112). The examiner notes that the claim does not describe how descriptor values are determined. The examiner considers [0024] Estrada’s accuracy measure used in the model selection process to be the descriptor value).
Estrada, however, fails to explicitly teach performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model. Estrada also fails to explicitly teach training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model. Furthermore, Estrada also fails to explicitly teach determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model.
On the other hand, Gibiansky teaches performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model ([0010]: determine a first candidate machine learning method; tune one or more parameters (i.e. adjusting a value of the parameter) of the first candidate machine learning method; determine that the first candidate machine learning method and a first parameter configuration for the first candidate machine learning method are the best based on a measure of fitness subsequent to satisfaction of a stop condition; and output the first candidate machine learning method and the first parameter configuration (i.e. generate a modified version of the machine-learning model) for the first candidate machine learning method. The examiner notes that Estrada and Gibiansky are considered analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking model to incorporate performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model as taught by Gibiansky to [0009] optimize the machine learning models).
	Furthermore, Streit teaches training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model ([Page770, Para2]: The GF training algorithm (i.e. training the modified version of the machine-learning model) gives explicit maximum likelihood estimates (i.e. determine a likelihood function) for the class a priori probabilities {ay} without iteration. The examiner notes that Estrada/Gibiansky and Streit are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model as taught by Streit to [Page770, Para2] eliminate data anomalies).
Furthermore, Akaike teaches determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model  ([Page 1, Para. 4]: Suppose that we have a statistical model of some data. Let k be the number of estimated parameters in the model (i.e. (i) a number of parameters in the group of parameters). Let L be the maximum value of the likelihood function for the model (i.e. (ii) the likelihood function for the modified version of the machine-learning model). Then the AIC value of the model is the following AIC= 2*k – 2*ln(L)). The examiner notes that Estrada/Gibiansky/Streit and Akaike are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model as taught by Akaike to [Page1, Para1] estimate the relative quality of statistical models).
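For illustration only, the combination mapped above (adjust one parameter at a time, train the modified version, score it with AIC = 2k - 2 ln(L), and keep the version with the lowest score) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from any cited reference or from the claims: the names `aic`, `score_versions`, and `toy_train`, the fixed adjustment `step`, and the toy log-likelihood are all hypothetical.

```python
from typing import Callable, Dict, Tuple


def aic(num_params: int, max_log_likelihood: float) -> float:
    """AIC = 2k - 2 ln(L); takes ln(L) directly to avoid numeric underflow."""
    return 2 * num_params - 2 * max_log_likelihood


def score_versions(
    params: Dict[str, float],
    step: float,
    train: Callable[[Dict[str, float]], float],
) -> Tuple[str, Dict[str, float]]:
    """Adjust each parameter in turn, train, and score the resulting version.

    `train` is a hypothetical stand-in that returns the maximized
    log-likelihood ln(L) of the model configured with the given values.
    """
    scores: Dict[str, float] = {}
    for name in params:
        modified = dict(params)                # copy, then adjust one value
        modified[name] = params[name] + step
        scores[name] = aic(len(params), train(modified))
    best = min(scores, key=scores.get)         # lowest descriptor value wins
    return best, scores


def toy_train(p: Dict[str, float]) -> float:
    """Hypothetical log-likelihood that peaks at a = 1, b = 0."""
    return -((p["a"] - 1.0) ** 2) - p["b"] ** 2


best, scores = score_versions({"a": 0.0, "b": 0.0}, step=1.0, train=toy_train)
print(best, scores)
```

Because every version in this sketch shares the same k, the ranking reduces to the likelihood term; the parameter-count penalty only changes the outcome when the compared versions differ in k.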

Regarding claim 6 Estrada/Gibiansky/Streit teach The method of claim 1, wherein the machine-learning model is a first type of machine-learning model prior to executing the particular version of the machine-learning model to perform the task: determining another descriptor value for a second type of machine-learning model that is different from the first type of machine-learning model determining that the lowest descriptor value associated with the first type of machine-learning model is lower than the other descriptor value associated with the second type of machine-learning model based on determining that the lowest descriptor value is lower than the other descriptor value, selecting the particular version of the machine-learning model for performing the task. ([0024 on Estrada]: Model selection may be performed using various methods, Such as Akaike information criterion (AIC), Bayesian information criterion (BIC), or minimum description length (MDL). The examiner notes that AIC (descriptor value) is an estimator of prediction error, therefore, a lower number indicates a better model. The selection of a better machine learning model is done by comparing the descriptor values of the machine learning models involved and the model with the lowest descriptor value is the best model, therefore, that is the one selected).
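As a hedged illustration of the comparison described above, the following sketch scores two model types with AIC and selects the one with the lower value. The parameter counts and log-likelihoods are hypothetical numbers chosen for the example, and the names `aic` and `candidates` are assumptions, not part of Estrada or the claims.

```python
def aic(num_params: int, max_log_likelihood: float) -> float:
    """AIC = 2k - 2 ln(L), with ln(L) supplied directly."""
    return 2 * num_params - 2 * max_log_likelihood


# Hypothetical values: the second type fits slightly better (higher ln(L))
# but uses twice as many parameters, so its AIC is higher.
candidates = {
    "first_type": aic(num_params=3, max_log_likelihood=-10.0),
    "second_type": aic(num_params=6, max_log_likelihood=-9.0),
}

# The type with the lowest descriptor value is selected.
selected = min(candidates, key=candidates.get)
print(selected, candidates)
```

The example shows why a lower value is preferred: the better raw fit of the second type does not offset the penalty for its additional parameters.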
Regarding claim 8 Estrada teaches A non-transitory computer-readable medium comprising program code that is executable by a processing device for causing the processing device to: 
Identify a group of parameters for configuring a machine-learning model ([0023-0024]: The training phase 100 includes the data collection and transformation operation 106, which may include operations such as data preprocessing 200, data imputation 202, and discretization 204. The examiner notes that the claim does not describe how these parameters are identified or what they represent. The examiner considers the data used in the training operation of Estrada to be these parameters).
determining descriptor values ([0017]: Prediction errors against observed performance provide insight as to whether the model needs retraining (model self-validation in the execution phase 102) to guard against loss of accuracy over time. The examiner notes that the claim does not describe how these descriptor values are calculated. The examiner considers the prediction errors used in the training models of Estrada to be these descriptor values).
for multiple versions of the machine-learning model by ([0020]-[0022]: In the adaptation phase 104 the application using the model is notified (operation 114) and an adaptation to the model is triggered (operation 116). The flow moves back to the instrumented entity 118 (e.g., a node in a network), which provides data to the training phase 100 and execution phase. The examiner notes that Estrada teaches a model induction and adaptation phase where multiple iterations of the model are created based on data that includes, but is not limited to, performance metrics).
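determining that a particular version of the machine-learning model has a lowest descriptor value among the descriptor values by comparing the descriptor values to one another; and ([0039]: In an embodiment, to implement the automatic verification operation, the model manager 502 is to compare an observed value from the telemetry data to a predicted value from the performance model and declare the performance model invalid when the observed value deviates from the predicted value by more than a threshold amount. The examiner notes that the claim does not describe how the descriptor values are determined. The examiner considers the observed and predicted data used in the verification process of Estrada to be these descriptor values).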
 executing, by the processing device, the particular version of the machine-learning model to perform a task in a computing environment based on the particular version of the machine-learning model having the lowest descriptor value ([0021]: The execution phase 102 deploys and executes the selected model (operation 110) and continually or periodically tests for new events (decision operation 112). The examiner notes that the claim does not describe how descriptor values are determined. The examiner considers [0024] Estrada’s accuracy measure used in the model selection process to be the descriptor value).
Estrada, however, fails to explicitly teach performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model. Estrada also fails to explicitly teach training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model. Furthermore, Estrada also fails to explicitly teach determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model 
On the other hand, Gibiansky teaches performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model ([0010]: determine a first candidate machine learning method; tune one or more parameters (i.e. adjusting a value of the parameter) of the first candidate machine learning method; determine that the first candidate machine learning method and a first parameter configuration for the first candidate machine learning method are the best based on a measure of fitness Subsequent to satisfaction of a stop condition; and output the first candidate machine learning method and the first parameter configuration (i.e. generate a modified version of the machine-learning model) for the first candidate machine learning method. The examiner notes that Estrada and Gibiansky are considered analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking model to incorporate performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model as taught by Gibiansky to [0009] optimize the machine learning models).
Furthermore, Streit teaches training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model ([Page770, Para2]: The GF training algorithm (i.e. training the modified version of the machine-learning model) gives explicit maximum likelihood estimates (i.e. determine a likelihood function) for the class a priori probabilities {ay} without iteration. The examiner notes that Estrada/Gibiansky and Streit are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model as taught by Streit to [Page770, Para2] eliminate data anomalies).
Furthermore, Akaike teaches determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model ([Page 1, Para. 4]: Suppose that we have a statistical model of some data. Let k be the number of estimated parameters in the model (i.e. (i) a number of parameters in the group of parameters). Let L be the maximum value of the likelihood function for the model (i.e. (ii) the likelihood function for the modified version of the machine-learning model). Then the AIC value of the model is the following AIC= 2*k – 2*ln(L)). The examiner notes that Estrada/Gibiansky/Streit and Akaike are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model as taught by Akaike to [Page1, Para1] estimate the relative quality of statistical models).

Regarding claim 13 Estrada/Gibiansky/Streit teach The non-transitory computer-readable medium of claim 8, wherein the machine-learning model is a first type of machine-learning model prior to executing the particular version of the machine-learning model to perform the task: determining another descriptor value for a second type of machine-learning model that is different from the first type of machine-learning model determining that the lowest descriptor value associated with the first type of machine-learning model is lower than the other descriptor value associated with the second type of machine-learning model based on determining that the lowest descriptor value is lower than the other descriptor value, selecting the particular version of the machine-learning model for performing the task. ([0024 on Estrada]: Model selection may be performed using various methods, such as Akaike information criterion (AIC), Bayesian information criterion (BIC), or minimum description length (MDL). The examiner notes that AIC (descriptor value) is an estimator of prediction error, therefore, a lower number indicates a better model. The selection of a better machine learning model is done by comparing the descriptor values of the machine learning models involved and the model with the lowest descriptor value is the best model, therefore, that is the one selected).

Regarding claim 15 Estrada teaches A system comprising a processing device; and a memory device on which instructions that are executable by the processing device are stored for causing the processing device to:
Identify a group of parameters for configuring a machine-learning model ([0023-0024]: The training phase 100 includes the data collection and transformation operation 106, which may include operations such as data preprocessing 200, data imputation 202, and discretization 204. The examiner notes that the claim does not describe how these parameters are identified or what they represent. The examiner considers the data used in the training operation of Estrada to be these parameters).
determining descriptor values ([0017]: Prediction errors against observed performance provide insight as to whether the model needs retraining (model self-validation in the execution phase 102) to guard against loss of accuracy over time. The examiner notes that the claim does not describe how these descriptor values are calculated. The examiner considers the prediction errors used in the training models of Estrada to be these descriptor values).
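for multiple versions of the machine-learning model by ([0020]-[0022]: In the adaptation phase 104 the application using the model is notified (operation 114) and an adaptation to the model is triggered (operation 116). The flow moves back to the instrumented entity 118 (e.g., a node in a network), which provides data to the training phase 100 and execution phase. The examiner notes that Estrada teaches a model induction and adaptation phase where multiple iterations of the model are created based on data that includes, but is not limited to, performance metrics).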
determining that a particular version of the machine- learning model has a lowest descriptor value among the descriptor values by comparing the descriptor values to one another; and ([0039]: In an embodiment, to implement the automatic verification operation, the model manager 502 is to compare an observed value from the telemetry data to a predicted value from the performance model and declare the performance model invalid when the observed value deviates from the predicted value by more than a threshold amount. The examiner notes that the claim does not describe how the descriptor values are determined. The examiner considers the observed and predicted data used in the verification process of Estrada to be these descriptor values).
executing, by the processing device, the particular version of the machine-learning model to perform a task in a computing environment based on the particular version of the machine-learning model having the lowest descriptor value ([0021]: The execution phase 102 deploys and executes the selected model (operation 110) and continually or periodically tests for new events (decision operation 112). The examiner notes that the claim does not describe how descriptor values are determined. The examiner considers [0024] Estrada’s accuracy measure used in the model selection process to be the descriptor value).
Estrada, however, fails to explicitly teach performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model. Estrada also fails to explicitly teach training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model. Furthermore, Estrada also fails to explicitly teach determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model
On the other hand, Gibiansky teaches performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model ([0010]: determine a first candidate machine learning method; tune one or more parameters (i.e. adjusting a value of the parameter) of the first candidate machine learning method; determine that the first candidate machine learning method and a first parameter configuration for the first candidate machine learning method are the best based on a measure of fitness subsequent to satisfaction of a stop condition; and output the first candidate machine learning method and the first parameter configuration (i.e. generate a modified version of the machine-learning model) for the first candidate machine learning method. The examiner notes that Estrada and Gibiansky are considered analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking model to incorporate performing a process that includes, for each parameter in the group of parameters: adjusting a value of the parameter to generate a modified version of the machine-learning model as taught by Gibiansky to [0009] optimize the machine learning models).
Furthermore, Streit teaches training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model ([Page770, Para2]: The GF training algorithm (i.e. training the modified version of the machine-learning model) gives explicit maximum likelihood estimates (i.e. determine a likelihood function) for the class a priori probabilities {ay} without iteration. The examiner notes that Estrada/Gibiansky and Streit are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate training the modified version of the machine-learning model to determine a likelihood function for the modified version of the machine-learning model as taught by Streit to [Page770, Para2] eliminate data anomalies).
Furthermore, Akaike teaches determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model  ([Page 1, Para. 4]: Suppose that we have a statistical model of some data. Let k be the number of estimated parameters in the model (i.e. (i) a number of parameters in the group of parameters). Let L be the maximum value of the likelihood function for the model (i.e. (ii) the likelihood function for the modified version of the machine-learning model). Then the AIC value of the model is the following AIC= 2*k – 2*ln(L)). The examiner notes that Estrada/Gibiansky/Streit and Akaike are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate determining a descriptor value for the modified version of the machine learning model using (i) a number of parameters in the group of parameters and (ii) the likelihood function for the modified version of the machine-learning model, the descriptor value being a number expressing a relationship between the number of parameters in the group of parameters and the accuracy of the modified version of the machine-learning model as taught by Akaike to [Page1, Para1] estimate the relative quality of statistical models).

Regarding claim 19 Estrada/Gibiansky/Streit teach The system of claim 15, wherein the machine-learning model is a first type of machine-learning model prior to executing the particular version of the machine-learning model to perform the task: determining another descriptor value for a second type of machine-learning model that is different from the first type of machine-learning model determining that the lowest descriptor value associated with the first type of machine-learning model is lower than the other descriptor value associated with the second type of machine-learning model based on determining that the lowest descriptor value is lower than the other descriptor value, selecting the particular version of the machine-learning model for performing the task. ([0024 on Estrada]: Model selection may be performed using various methods, Such as Akaike information criterion (AIC), Bayesian information criterion (BIC), or minimum description length (MDL). The examiner notes that AIC (descriptor value) is an estimator of prediction error, therefore, a lower number indicates a better model. The selection of a better machine learning model is done by comparing the descriptor values of the machine learning models involved and the model with the lowest descriptor value is the best model, therefore, that is the one selected).

Claims 2, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Estrada (US 2017/0220407), in view of Gibiansky (US 2016/0110657), further in view of Streit (Maximum Likelihood Training of Probabilistic Neural Networks – 1994), and still further in view of Akaike (Akaike Information Criterion - 2017).

Regarding claim 2, Estrada/Gibiansky/Streit teach The method of claim 1. However, Estrada/Gibiansky/Streit fails to explicitly teach further comprising determining the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a  second value. determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value.
On the other hand, Akaike teaches further comprising determining the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a second value. determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value ([Page1, Para4]: AIC = 2k - 2ln(L). The examiner notes that according to Akaike’s information criterion (AIC), k is the number of parameters, and L is the maximum value of the likelihood function. The examiner also notes that Estrada/Gibiansky/Streit and Akaike are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analysis to incorporate determining the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a second value. determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value as taught by Akaike to [Page1, Para1] estimate the relative quality of statistical models).
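The three recited steps (first value, second value, subtraction) can be traced in a short sketch, offered for illustration only. It assumes both constants equal 2, which reproduces the cited AIC = 2k - 2ln(L); the function name `descriptor_value` and the arguments `c1` and `c2` are hypothetical and do not appear in Akaike or the claims.

```python
import math


def descriptor_value(
    num_params: int,
    max_likelihood: float,
    c1: float = 2.0,
    c2: float = 2.0,
) -> float:
    """Compute c1 * k - c2 * ln(L); with c1 = c2 = 2 this is the AIC."""
    first_value = c1 * num_params                  # first constant times k
    second_value = c2 * math.log(max_likelihood)   # second constant times ln(L)
    return first_value - second_value              # subtract second from first


# Hypothetical inputs: k = 3 parameters and a maximum likelihood of e^-5,
# so ln(L) = -5 and the result is 2*3 - 2*(-5).
value = descriptor_value(3, math.exp(-5.0))
print(value)
```

Since L is at most 1 for a discrete likelihood, ln(L) is non-positive in that case and the subtraction increases the descriptor value as the fit worsens.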

Regarding claim 9, Estrada/Gibiansky/Streit teach The non-transitory computer-readable medium of claim 8. However, Estrada/Gibiansky/Streit fails to explicitly teach further comprising program code that is executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a second value. determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value.
On the other hand, Akaike teaches further comprising program code that is executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a second value; and determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value ([Page1, Para4]: AIC = 2k - 2ln(L). The examiner notes that according to Akaike’s information criterion (AIC), k is the number of parameters, and L is the maximum value of the likelihood function. The examiner also notes that Estrada/Gibiansky/Streit and Akaike are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analysis to incorporate program code that is executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a second value; and determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value as taught by Akaike to [Page1, Para1] estimate the relative quality of statistical models).

Regarding claim 16, Estrada/Gibiansky/Streit teach The system of claim 15. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein the memory device further comprises instructions that are executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a second value; and determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value.
On the other hand, Akaike teaches wherein the memory device further comprises instructions that are executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a second value; and determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value ([Page1, Para4]: AIC = 2k - 2ln(L). The examiner notes that according to Akaike’s information criterion (AIC), k is the number of parameters, and L is the maximum value of the likelihood function. The examiner also notes that Estrada/Gibiansky/Streit and Akaike are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analysis to incorporate the memory device further comprises instructions that are executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model by: multiplying a first constant value by the number of parameters to determine a first value; multiplying a second constant value by a logarithm of a maximum value of a likelihood function for the modified version of the machine-learning model to determine a second value; and determining the descriptor value for the modified version of the machine-learning model by subtracting the second value from the first value as taught by Akaike to [Page1, Para1] estimate the relative quality of statistical models).
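
As an illustrative sketch only, the relation AIC = 2k - 2ln(L) quoted from Akaike can be laid out in the same three steps the claim language recites: a first product, a second product, and a subtraction. The function name `aic_descriptor`, the argument names, and the defaults of 2 for both constants are assumptions chosen to match the quoted formula; they do not come from the application or from the cited references.

```python
import math


def aic_descriptor(num_params, max_likelihood, c1=2.0, c2=2.0):
    """Compute c1*k - c2*ln(L); with c1 = c2 = 2 this is the quoted AIC.

    num_params     -- k, the number of parameters of the model
    max_likelihood -- L, the maximum value of the likelihood function (> 0)
    c1, c2         -- hypothetical names for the first and second constants
    """
    if max_likelihood <= 0:
        raise ValueError("max_likelihood must be positive to take its logarithm")
    first_value = c1 * num_params                  # first constant times k
    second_value = c2 * math.log(max_likelihood)   # second constant times ln(L)
    return first_value - second_value              # second subtracted from first
```

Under this sketch, a model with k = 3 and L = 1 scores 6, because ln(1) = 0 and only the parameter-count term remains. When comparing candidate models under AIC, a lower score is preferred.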

Claims 3 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over Estrada (US 2017/0220407), in view of Gibiansky (US 2016/0110657), further in view of Streit (Maximum Likelihood Training of Probabilistic Neural Networks – 1994), further in view of Akaike (Akaike Information Criterion - 2017), and still further in view of Jordan (Hyperparameter tuning for machine learning models - 11/02/2017).

Regarding claim 3, Estrada/Gibiansky/Streit/Akaike teach The method of claim 2. However, Estrada/Gibiansky/Streit/Akaike fails to explicitly teach wherein determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters.
On the other hand, Jordan teaches wherein determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters ([Page2, Para1]:
How many trees should I include in my random forest?
How many neurons should I have in my neural network layer?
How many layers should I have in my neural network?
What should I set my learning rate to for gradient descent?
The examiner notes that Jordan teaches a group of multiple parameters which include number of trees, number of neurons, number of layers, and learning rate. The examiner also notes that Estrada/Gibiansky/Streit/Akaike and Jordan are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters as taught by Jordan to [Page1, Para1] find the ideal model).

Regarding claim 4, Estrada/Gibiansky/Streit/Akaike/Jordan teach The method of claim 3, wherein determining the descriptor values comprises performing the process for multiple parameter values for multiple combinations of parameters in the group of parameters. ([Page2, Para1 on Jordan]:
How many trees should I include in my random forest?
How many neurons should I have in my neural network layer?
How many layers should I have in my neural network?
What should I set my learning rate to for gradient descent? 
The examiner notes that Jordan teaches a group of multiple parameters which include number of trees, number of neurons, number of layers, and learning rate).
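
As an illustrative sketch only, the difference between trying multiple values for every parameter and trying multiple combinations of parameters can be shown with the four hyperparameters named in the Jordan passage. The dictionary keys, the candidate values, the baseline, and both helper names are hypothetical; none of them is taken from Jordan or from the application.

```python
from itertools import product

# Hypothetical candidate values for the four hyperparameters Jordan lists.
GRID = {
    "n_trees": [50, 100],
    "n_neurons": [16, 32],
    "n_layers": [1, 2],
    "learning_rate": [0.01, 0.1],
}

# Hypothetical baseline used when only one parameter is varied at a time.
BASELINE = {"n_trees": 50, "n_neurons": 16, "n_layers": 1, "learning_rate": 0.01}


def vary_each_parameter(grid, baseline):
    """Multiple values for every parameter, one parameter changed at a time."""
    configs = []
    for name, values in grid.items():
        for value in values:
            config = dict(baseline)
            config[name] = value
            configs.append(config)
    return configs


def vary_combinations(grid):
    """Every combination of candidate values across all parameters."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]
```

With two candidate values for each of four parameters, the first helper yields 8 configurations and the second yields 16; a descriptor value would then be computed once per configuration.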

Claims 10, 11, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Estrada (US 2017/0220407), in view of Gibiansky (US 2016/0110657), further in view of Streit (Maximum Likelihood Training of Probabilistic Neural Networks – 1994), and still further in view of Jordan (Hyperparameter tuning for machine learning models - 11/02/2017).

Regarding claim 10, Estrada/Gibiansky/Streit teaches The non-transitory computer-readable medium of claim 8. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters.

On the other hand, Jordan teaches wherein determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters ([Page2, Para1]: 
How many trees should I include in my random forest?
How many neurons should I have in my neural network layer?
How many layers should I have in my neural network?
What should I set my learning rate to for gradient descent?
The examiner notes that Jordan teaches a group of multiple parameters which include number of trees, number of neurons, number of layers, and learning rate. The examiner also notes that Estrada/Gibiansky/Streit and Jordan are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters as taught by Jordan to [Page1, Para1] find the ideal model).

Regarding claim 11, Estrada/Gibiansky/Streit teaches The non-transitory computer-readable medium of claim 8. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein determining the descriptor values comprises performing the process for multiple combinations of parameters in the group of parameters.

On the other hand, Jordan teaches wherein determining the descriptor values comprises performing the process for multiple combinations of parameters in the group of parameters ([Page2, Para1]:
How many trees should I include in my random forest?
How many neurons should I have in my neural network layer?
How many layers should I have in my neural network?
What should I set my learning rate to for gradient descent?
The examiner notes that Jordan teaches a group of multiple parameters which include number of trees, number of neurons, number of layers, and learning rate. The examiner also notes that Estrada/Gibiansky/Streit and Jordan are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate determining the descriptor values comprises performing the process for multiple combinations of parameters in the group of parameters as taught by Jordan to [Page1, Para1] find the ideal model).

Regarding claim 17, Estrada/Gibiansky/Streit teaches The system of claim 15. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters.

On the other hand, Jordan teaches wherein determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters ([Page2, Para1]:
How many trees should I include in my random forest?
How many neurons should I have in my neural network layer?
How many layers should I have in my neural network?
What should I set my learning rate to for gradient descent?
The examiner notes that Jordan teaches a group of multiple parameters which include number of trees, number of neurons, number of layers, and learning rate. The examiner also notes that Estrada/Gibiansky/Streit and Jordan are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate determining the descriptor values comprises performing the process for multiple parameter values for every parameter in the group of parameters as taught by Jordan to [Page1, Para1] find the ideal model).

Claims 5, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Estrada (US 2017/0220407), in view of Gibiansky (US 2016/0110657), further in view of Streit (Maximum Likelihood Training of Probabilistic Neural Networks – 1994), and still further in view of Jordan (Hyperparameter tuning for machine learning models - 11/02/2017).

Regarding claim 5, Estrada/Gibiansky/Streit teaches The method of claim 1. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model.

On the other hand, Jordan teaches wherein the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model ([Page2, Para1]: 
How many trees should I include in my random forest?
How many neurons should I have in my neural network layer?
How many layers should I have in my neural network?
What should I set my learning rate to for gradient descent?
The examiner notes that Jordan teaches a group of hyperparameters which include three topological hyperparameters (number of trees, number of neurons, number of layers), and one non-topological hyperparameter (learning rate). The examiner also notes that Estrada/Gibiansky/Streit and Jordan are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model as taught by Jordan to [Page1, Para1] find the ideal model).

Regarding claim 12, Estrada/Gibiansky/Streit teaches The non-transitory computer-readable medium of claim 8. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model.

On the other hand, Jordan teaches wherein the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model ([Page2, Para1]:
How many trees should I include in my random forest?
How many neurons should I have in my neural network layer?
How many layers should I have in my neural network?
What should I set my learning rate to for gradient descent?
The examiner notes that Jordan teaches a group of hyperparameters which include three topological hyperparameters (number of trees, number of neurons, number of layers), and one non-topological hyperparameter (learning rate). The examiner also notes that Estrada/Gibiansky/Streit and Jordan are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model as taught by Jordan to [Page1, Para1] find the ideal model).

Regarding claim 18, Estrada/Gibiansky/Streit teaches The system of claim 15. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model.

On the other hand, Jordan teaches wherein the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model ([Page2, Para1]:
How many trees should I include in my random forest?
How many neurons should I have in my neural network layer?
How many layers should I have in my neural network?
What should I set my learning rate to for gradient descent?
The examiner notes that Jordan teaches a group of hyperparameters which include three topological hyperparameters (number of trees, number of neurons, number of layers), and one non-topological hyperparameter (learning rate). The examiner also notes that Estrada/Gibiansky/Streit and Jordan are considered to be analogous because they are in the same field of Artificial Intelligence. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate the group of parameters are hyperparameters, and at least one parameter in the group of parameters does not affect a topology of the machine-learning model as taught by Jordan to [Page1, Para1] find the ideal model).

Claims 7, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Estrada (US 2017/0220407), in view of Gibiansky (US 2016/0110657), further in view of Streit (Maximum Likelihood Training of Probabilistic Neural Networks – 1994), and still further in view of Wikipedia (Multi-objective optimization  – 12/06/2017).

Regarding claim 7, Estrada/Gibiansky/Streit teaches The method of claim 1. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein determining the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point.
On the other hand, Wikipedia teaches wherein determining the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point ([Page11, Para2]: In the case of bi-objective problems, informing the decision maker concerning the Pareto front is usually carried out by its visualization (i.e. generating a Pareto surface in a graph): the Pareto front, often named the tradeoff curve in this case, can be drawn at the objective plane. The tradeoff curve gives full information on objective values and on objective tradeoffs, which inform how improving one objective is related to deteriorating the second one while moving along the tradeoff curve (i.e. having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis). The decision maker takes this information into account while specifying the preferred Pareto optimal objective point (i.e. determine the descriptor value based on the plot point). The examiner notes that Estrada/Gibiansky/Streit and Wikipedia are considered to be analogous because they are in the same field of data analysis. 
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate determining the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point as taught by Wikipedia to [Page1, Para1] make an optimal decision in the presence of trade-offs between two conflicting objectives).
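
As an illustrative sketch only, the tradeoff curve described in the quoted passage can be computed for two objectives, model accuracy (higher is better) and number of parameters (lower is better), by keeping the candidates that no other candidate dominates. The function name `pareto_front` and the (accuracy, parameter count) tuple layout are assumptions made for this sketch; they are not taken from the cited reference or from the application.

```python
def pareto_front(points):
    """Return the non-dominated (accuracy, num_params) points.

    A point is dominated when another point is at least as accurate and
    uses no more parameters, and is strictly better on at least one of
    the two objectives.
    """
    front = []
    for i, (acc_i, k_i) in enumerate(points):
        dominated = any(
            acc_j >= acc_i and k_j <= k_i and (acc_j > acc_i or k_j < k_i)
            for j, (acc_j, k_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((acc_i, k_i))
    # Sort along the parameter-count axis so the result traces the curve.
    return sorted(front, key=lambda point: point[1])
```

Given hypothetical candidates (0.90, 10), (0.92, 20), (0.91, 30), (0.95, 40), and (0.90, 15), the sketch keeps (0.90, 10), (0.92, 20), and (0.95, 40). A single plot point on that curve could then be selected and used as the basis for a descriptor value.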

Regarding claim 14, Estrada/Gibiansky/Streit teaches The non-transitory computer-readable medium of claim 8. However, Estrada/Gibiansky/Streit fails to explicitly teach further comprising program code that is executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point.
On the other hand, Wikipedia teaches further comprising program code that is executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point ([Page11, Para2]: In the case of bi-objective problems, informing the decision maker concerning the Pareto front is usually carried out by its visualization: the Pareto front, often named the tradeoff curve in this case, can be drawn at the objective plane. The tradeoff curve gives full information on objective values and on objective tradeoffs, which inform how improving one objective is related to deteriorating the second one while moving along the tradeoff curve. The decision maker takes this information into account while specifying the preferred Pareto optimal objective point. The examiner notes that Estrada/Gibiansky/Streit and Wikipedia are considered to be analogous because they are in the same field of data analysis. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate program code that is executable by the processing device for causing the processing device to determine the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point as taught by Wikipedia to [Page1, Para1] make an optimal decision in the presence of trade-offs between two conflicting objectives).

Regarding claim 20, Estrada/Gibiansky/Streit teaches The system of claim 15. However, Estrada/Gibiansky/Streit fails to explicitly teach wherein the memory device further comprises instructions that are executable by the processing device for causing the processing device to determine  the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point.
On the other hand, Wikipedia teaches wherein the memory device further comprises instructions that are executable by the processing device for causing the processing device to determine  the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point ([Page11, Para2]: In the case of bi-objective problems, informing the decision maker concerning the Pareto front is usually carried out by its visualization: the Pareto front, often named the tradeoff curve in this case, can be drawn at the objective plane. The tradeoff curve gives full information on objective values and on objective tradeoffs, which inform how improving one objective is related to deteriorating the second one while moving along the tradeoff curve. The decision maker takes this information into account while specifying the preferred Pareto optimal objective point. The examiner notes that Estrada/Gibiansky/Streit and Wikipedia are considered to be analogous because they are in the same field of data analysis. 
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Estrada’s analytical and ranking models to incorporate the memory device further comprises instructions that are executable by the processing device for causing the processing device to determine  the descriptor value for the modified version of the machine-learning model comprises: generating a Pareto surface in a graph having (i) model accuracy along a first axis and (ii) the number of parameters used to configure the machine-learning model along a second axis; determining a plot point on the Pareto surface in the graph; and determine the descriptor value based on the plot point as taught by Wikipedia to [Page1, Para1] make an optimal decision in the presence of trade-offs between two conflicting objectives).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAMCY ALGHAZZY whose telephone number is (571)272-8824.  The examiner can normally be reached on Monday-Friday 7:30am-4:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez can be reached on 571-272-2589.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  

/SHAMCY ALGHAZZY/Examiner, Art Unit 4193
/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128