DETAILED ACTION
Status of Claims
This action is in response to the applicant's amendment filed on 3/15/2022 for application 16/219,242, filed on 12/13/2018. Claims 1 – 20 are pending and have been examined.
Claims 2, 11 and 19 are amended.
The claim rejection under 35 U.S.C. 112(b) has been withdrawn in light of the applicant’s amendment.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in the Republic of India on 10/31/2018.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/31/2018 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

The information disclosure statement filed on 4/12/2021 fails to comply with 37 CFR 1.98(a)(2), which requires a legible copy of each cited foreign patent document; each non-patent literature publication or that portion which caused it to be listed; and all other information or that portion which caused it to be listed.  It has been placed in the application file, but the information referred to therein has not been considered.

Response to Argument
Applicant's remarks filed on 3/15/2022 regarding the prior art rejection have been fully considered but they are not persuasive.

Regarding the independent claims, applicant states that the adjustment of λ in Natekin is not based on anything in particular and thus does not disclose "adjusting a learning rate of the machine learning system based on the optimal boosting iterations hyper parameter of the selected candidate model". Examiner respectfully disagrees. Natekin expresses the considerations involved in selecting λ: the smaller parameter λ … the better generalization is achieved … the cost of improving the generalization properties is the convergence speed choosing a stronger λ value of  will increase the number of iteration M required for convergence (Natekin, page 9, column 1, paragraph 2, line 7 – 12). That is, the number of iterations M (the optimal boosting iterations hyper parameter) is a factor in choosing the value of λ. Natekin describes a tradeoff between the two parameters; thus each is chosen based on the other.
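
For illustration only, the tradeoff between λ and M discussed above can be sketched in a toy setting. The sketch is not taken from Natekin; it assumes squared-error boosting in which the base learner fits the current residual exactly, so that each step scales the residual by (1 − λ). The function name and the tolerance value are hypothetical.

```python
def iterations_to_converge(shrinkage, tolerance=1e-3):
    """Count boosting steps until the residual falls to `tolerance` or below.

    Toy assumption (not Natekin's experiment): the base learner fits the
    current residual exactly, so each shrunken update multiplies the
    residual by (1 - shrinkage).
    """
    residual = 1.0
    steps = 0
    while abs(residual) > tolerance:
        residual -= shrinkage * residual  # shrunken boosting update
        steps += 1
    return steps


if __name__ == "__main__":
    # Smaller shrinkage values require more iterations to converge.
    for lam in (0.5, 0.1, 0.01):
        print(lam, iterations_to_converge(lam))
```

In this toy setting the required number of iterations grows as λ shrinks, which is the dependence between the two parameters relied upon above.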

Applicant further states that Natekin does not disclose performing one or more additional "cycles of the machine learning system" as described in the claim. In particular, applicant states that the system of Natekin does not perform cycles of the machine learning system "by iteratively evaluating a plurality of generations of the machine learning system" as done in the first performance of one or more cycles of the machine learning system. Examiner respectfully disagrees. Claims 1 and 17 recite “performing one or more additional cycles of the machine learning system employing the adjusted learning rate”; Claim 11 recites “performing one or more second cycles employing the adjusted learning rate”. The claims do not require the additional cycle or second cycle to be performed "by iteratively evaluating a plurality of generations of the machine learning system". Under the broadest reasonable interpretation, examiner interprets the retraining after regularization as performing a cycle of the machine learning system recited in the independent claims of the instant application. Therefore, Natekin discloses “performing one or more additional cycles of the machine learning system employing the adjusted learning rate."

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1 – 3, 7 – 8, 11 – 12, 14 – 15, and 17 – 19 are rejected under 35 U.S.C. 103 as being unpatentable over Jain, Hyperparameter tuning in XGBoost using genetic algorithm, Towards Data Science, Sep. 2018, in view of Natekin, Gradient Boosting Machines, a Tutorial, Frontiers in Neurorobotics, 2013.

Regarding Claim 1, Jain discloses: 
A method for automatically optimizing hyper parameter and feature selection in a machine learning system (Jain, page. 4, where selected … parameters [hyper parameter] to be optimized: … colsample_bytree [feature selection]), the method comprising
identifying a training data source comprising a plurality of records, wherein each record of the plurality of records comprises data corresponding to a plurality of features (Jain, page 8 – 9, where dataset is from UCI Machine learning Repository … 6500 low-energy conformation .. 166 features [plurality of features]);
initializing a first cycle of the machine learning system by generating a first generation of candidate models, wherein generating each respective first candidate model of the first generation of candidate models comprises selecting a first subset of features, of the plurality of features, for use in the first candidate model (Jain, page 4 & fig. 1, where first step is initialization … first generation of the populations; the colsample_bytree parameter specifies the fraction of randomly selected features that will be used in each tree for XGBoost);
determining, for each first candidate model of the first generation of candidate models, a respective first optimal boosting iterations hyper parameter (Jain, page 4, Initialization, ln. 5 & page 10, ln. 18 – page 11, ln. 10, where each population of parents [candidate model] is trained to determine the optimized model; the n_estimators parameter of the optimized model is the optimized number of boosting iterations hyper parameter);
evaluating fitness values for each respective first candidate model in the first generation based on a corresponding subset of features and a corresponding optimal boosting iterations hyper parameter (Jain, page 4, Initialization & page 10, ln. 18 – page 11, ln. 10, where fitnessValue is generated for each population by training using the corresponding n_estimators parameter [optimal boosting iterations hyper parameter] and the randomly selected feature set specified with the colsample_bytree parameter);
performing one or more cycles of the machine learning system by iteratively evaluating a plurality of generations of the machine learning system (Jain, page 10, ln. 18 – page 11, ln. 10, where the machine learning and evaluation are performed for multiple generations [one or more cycles]), wherein evaluating a respective generation of the machine learning system comprises:
generating a second generation of candidate models by performing an evolution process on a respective subset of features associated with selected models of a current generation of candidate models to generate respective second candidate models (Jain, page. 2 – 3, & fig. 2 – 4, where second generation of candidate model is performed by crossover and mutation [evolution process]);
determining, for each second candidate model of the second generation of candidate models, a respective second optimal boosting iterations hyper parameter; and evaluating fitness values for each respective second candidate model in the second generation based on a corresponding subset of features and a corresponding optimal boosting iterations hyper parameter; subsequent to iteratively evaluating the plurality of generations, determining a selected candidate model of a final generation of candidate models associated with the one or more cycles (Jain, page 10, ln. 18 – page 11, ln. 10, where the loop "for generation in range (numberOfGenerations)" performs an iterative evaluation of the plurality of generations, including the final generation, with the fitness values of each candidate model corresponding to the trained/optimized n_estimators [optimal boosting iterations hyper parameter] and the features the system used for each population specified by colsample_bytree);
Jain does not explicitly disclose: 
adjusting a learning rate of the machine learning system based on the optimal boosting iterations hyper parameter of the selected candidate model
performing one or more additional cycles of the machine learning system employing the adjusted learning rate
and based on termination criteria, identifying a resulting candidate model of a final cycle of the machine learning system as an optimized model
Natekin explicitly discloses:
adjusting a learning rate of the machine learning system based on the optimal boosting iterations hyper parameter of the selected candidate model (Natekin, sec. 4.2, para. 3, where, the parameter λ [learning rate], the regularization is applied to the final step [adjust learning rate]);
performing one or more additional cycles of the machine learning system employing the adjusted learning rate (Natekin, fig. 5B & sec. 4.3, para. 1, where using regularization technique described above, … given a shrinkage parameter λ, the optimal number of iterations Mopt, … can be different; sec. 4.3, para. 3, where in practice one typically chooses the shrinkage parameter λ [learning rate] beforehand and varies the number of iterations M with respect to the chosen shrinkage [search optimal number of iterations for a learning rate]; i.e., Natekin suggests searching for the optimal number of iterations given a shrinkage parameter [learning rate], and changing the shrinkage parameter [learning rate] to regularize the model with an additional search for the optimal number of iterations);
and based on termination criteria, identifying a resulting candidate model of a final cycle of the machine learning system as an optimized model (Natekin, sec. 4, para. 2 – 3, where overfitting a GBM is possible … to decrease the overfitting effect in GBMS, … regularization techniques that are most frequently used; i.e., Natekin suggests performing regularization when there is overfitting; once the overfitting condition has dissipated, regularization is no longer needed).
Jain and Natekin both teach gradient boosting machine learning models and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Jain’s disclosure of a genetic algorithm for finding the optimal number of iterations and feature selection in a gradient boosting machine with Natekin’s disclosure of changing the learning rate to regularize the model with an additional search for the optimal number of iterations to teach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to reach a balance between accuracy requirements and computational cost (Natekin, sec. 4.2, para. 3, where smaller parameter … better generalization … cost … is the convergence speed [number of iterations; computational cost]).
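
For illustration only, the generational loop mapped to Jain above (initialization, fitness evaluation, selection, crossover, mutation) can be sketched as follows. This is not Jain's code: it assumes each candidate is an explicit boolean mask over features rather than a colsample_bytree fraction, and the function names, population size, mutation rate, and toy fitness function are all hypothetical.

```python
import random


def evolve_feature_subsets(n_features, fitness, pop_size=8, generations=5,
                           mutation_rate=0.1, seed=0):
    """Return the fittest feature mask after a fixed number of generations."""
    rng = random.Random(seed)
    # Initialization: each candidate is a random boolean mask over features.
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate fitness and keep the fitter half as parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation on each gene with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(mask):
    """Hypothetical fitness: reward features 0, 2, 5; penalize subset size."""
    return sum(mask[i] for i in (0, 2, 5)) - 0.1 * sum(mask)


if __name__ == "__main__":
    print(evolve_feature_subsets(8, toy_fitness))
```

In an actual system the fitness function would train and score a boosted model on the selected features; the toy function here only stands in for that step.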

Regarding Claim 2, Jain in view of Natekin discloses the method of Claim 1. Jain in view of Natekin further discloses:
performing at least one second cycle of the machine learning system, wherein a first generation of the at least one second cycle is generated based on a final generation associated with the first cycle, wherein the resulting candidate model is an output of the at least one second cycle (Jain, page 10, ln. 18 – page 11, ln. 10, where the machine learning and evaluation are performed for a number of generations including generation 2 [second cycle]; page 10, ln. 20 – 24, where the hyperparameters are trained/tuned to calculate fitnessValue of the current population [first generation of the second cycle], which is based on the population of the prior generation [final generation associated with the first cycle]).

Regarding Claim 3, Jain in view of Natekin discloses the method of Claim 1. Jain in view of Natekin further discloses:
wherein adjusting the learning rate of the machine learning system based on the optimal boosting iterations hyper parameter of the selected candidate model comprises: determining whether the selected candidate model satisfies an optimal boosting iterations constraint; and based on determining that the selected model does not satisfy the optimal boosting iterations constraint, adjusting the learning rate of the machine learning system (Natekin, sec. 4, para. 1, where model can easily overfit the data [not satisfy the optimal boosting iterations constraint]; sec. 4, para. 3, where to decrease the overfitting effect in GBMs a number of different approaches were introduced, including shrinkage, which changes a learning parameter λ [learning rate]; i.e., Natekin suggests performing regularization by changing the learning rate if the model overfits [not satisfy the optimal boosting iterations constraint]).
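
For illustration only, the constraint check recited above can be sketched as a function that changes the learning rate only when the optimal iteration count falls outside an allowed window. The window bounds, the scaling factor, and the function name are hypothetical; neither the claims as quoted here nor Natekin specifies them.

```python
def adjust_learning_rate(learning_rate, optimal_iterations,
                         min_iterations=50, max_iterations=500, factor=2.0):
    """Return a new learning rate if the iteration constraint is violated.

    The [min_iterations, max_iterations] window and `factor` are assumed
    values chosen only for this sketch.
    """
    if optimal_iterations > max_iterations:
        # Too many boosting rounds are needed: take larger steps.
        return learning_rate * factor
    if optimal_iterations < min_iterations:
        # Convergence is very fast: take smaller steps (stronger shrinkage).
        return learning_rate / factor
    return learning_rate  # constraint satisfied, no adjustment


if __name__ == "__main__":
    print(adjust_learning_rate(0.1, 1000))
    print(adjust_learning_rate(0.1, 100))
```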

Regarding Claim 7, Jain in view of Natekin discloses the method of Claim 1. Jain in view of Natekin further discloses:
wherein determining the first optimal boosting iterations hyper parameter for the first candidate model is based on a maximum number of boosting iterations (Natekin, sec. 4.3 Early Stopping, para. 5, where maximum number of iterations Mmax … is specified in the process of searching the optimal boosting iterations).

Regarding Claim 8, Jain in view of Natekin discloses the method of Claim 7. Jain in view of Natekin further discloses:
wherein determining the first optimal boosting iterations hyper parameter for the first candidate model comprises: for each number of boosting iterations between an initial number of boosting iterations and the maximum number of boosting iterations, evaluating fitness of the first candidate model using the number of boosting iterations; and selecting the number of boosting iterations that provides a highest fitness for the first candidate model as the first optimal boosting iterations hyper parameter (Natekin, fig. 5 & sec. 4.3, para. 1 – 2 & para. 7 – 8, where fig. 5C and 5D evaluate different fitness score from the first bootstrapping number of iterations [initial number of boosting iterations] to the maximum number of boosting iterations 3000 to determine the optimal boosting iterations).
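
For illustration only, the selection step recited above (evaluate fitness for each number of boosting iterations between an initial and a maximum value, then keep the best) can be sketched as follows. The callable `fitness_at` and the toy fitness curve are hypothetical stand-ins for a validation score; they are not taken from Natekin.

```python
def select_optimal_iterations(fitness_at, initial_iterations, max_iterations):
    """Return the iteration count in [initial, max] with the highest fitness.

    `fitness_at` is an assumed callable mapping an iteration count to a
    fitness value (for example, a validation score).
    """
    best_m = initial_iterations
    best_fit = fitness_at(initial_iterations)
    for m in range(initial_iterations + 1, max_iterations + 1):
        fit = fitness_at(m)
        if fit > best_fit:
            best_m, best_fit = m, fit
    return best_m


def toy_fitness_curve(m):
    """Toy curve that peaks at 120 iterations and then declines."""
    return -(m - 120) ** 2


if __name__ == "__main__":
    print(select_optimal_iterations(toy_fitness_curve, 1, 3000))
```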

Regarding Claim 11, Jain discloses: 
A machine learning system comprising: one or more processors and memory storing instructions that, when executed by the one or more processors (Jain, page. 4, where XGBoost is an … library … implements machine learning algorithm; the XGBoost is executed in a computer system with processor, memory and instructions which are executed by processors), cause the machine learning system to optimizing hyper parameter and feature selection in a machine learning system (Jain, page. 4, where selected … parameters [hyper parameter] to be optimized: … colsample_bytree [feature selection]), by causing the machine learning system to
identify a training data source comprising a plurality of records, wherein each record of the plurality of records comprises data corresponding to a plurality of features (Jain, page 8 – 9, where dataset is from UCI Machine learning Repository … 6500 low-energy conformation .. 166 features [plurality of features]);
initializing a first cycle by generating a first generation of candidate models, wherein generating each respective first candidate model of the first generation of candidate models comprises selecting a first subset of features, of the plurality of features, for use in the first candidate model (Jain, page 10, ln. 18 – page 11, ln. 10, where the algorithm creates multiple generations, and the first generation undergoes mutation to generate the second generation [first cycle]; page 4 & fig. 1, where first step is initialization … first generation of the populations; the colsample_bytree parameter specifies the fraction of randomly selected features that will be used in each tree for XGBoost);
determining, for each first candidate model of the first generation of candidate models, a respective first optimal boosting iterations hyper parameter (Jain, page 4, Initialization, ln. 5 & page 10, ln. 18 – page 11, ln. 10, where each population of parents [candidate model] is trained to determine the optimized model; the n_estimators parameter of the optimized model is the optimized number of boosting iterations hyper parameter);
evaluating fitness values for each respective first candidate model in the first generation based on a corresponding subset of features and a corresponding optimal boosting iterations hyper parameter (Jain, page 4, Initialization & page 10, ln. 18 – page 11, ln. 10, where fitnessValue is generated for each population by training using the corresponding n_estimators parameter [optimal boosting iterations hyper parameter] and the randomly selected feature set specified with the colsample_bytree parameter);
performing the first cycles by iteratively evaluating a plurality of generations of the machine learning system (Jain, page 10, ln. 18 – page 11, ln. 10, where the machine learning and evaluation are performed for multiple generations [one or more cycles]), wherein evaluating a respective generation of the machine learning system comprises:
generating a second generation of candidate models by performing an evolution process on a respective subset of features associated with selected models of a current generation of candidate models to generate respective second candidate models (Jain, page. 2 – 3, & fig. 2 – 4, where second generation of candidate model is performed by crossover and mutation [evolution process]);
determining, for each second candidate model of the second generation of candidate models, a respective second optimal boosting iterations hyper parameter; and evaluating fitness values for each respective second candidate model in the second generation based on a corresponding subset of features and a corresponding optimal boosting iterations hyper parameter; subsequent to iteratively evaluating the plurality of generations, determining a selected candidate model of a final generation of candidate models of the first cycle (Jain, page 10, ln. 18 – page 11, ln. 10, where the loop over each generation performs an iterative evaluation of the plurality of generations, including the final generation, with the fitness values of each candidate model corresponding to the trained/optimized n_estimators [optimal boosting iterations hyper parameter] and the features the system used for each population specified by colsample_bytree);
performing one or more second cycles … wherein a first generation of at least one second cycle of the one or more second cycles is generated based on the final generation of the first cycle (Jain, page 10, ln. 18 – page 11, ln. 10, where the third and the fourth generations go through a cycle of mutation [second cycle], where the third generation [first generation of the second cycle] is generated from the second generation [final generation of the first cycle]).
Jain does not explicitly disclose: 
adjusting a learning rate of the machine learning system based on the optimal boosting iterations hyper parameter of the selected candidate model
performing one or more second cycles employing the adjusted learning rate 
and based on termination criteria, identifying a resulting candidate model of a final cycle of the machine learning system as an optimized model
Natekin explicitly discloses:
adjusting a learning rate of the machine learning system based on the optimal boosting iterations hyper parameter of the selected candidate model (Natekin, sec. 4.2, para. 3, where, the parameter λ [learning rate], the regularization is applied to the final step [adjust learning rate]);
performing one or more additional cycles of the machine learning system employing the adjusted learning rate (Natekin, fig. 5B & sec. 4.3, para. 1, where using regularization technique described above, … given a shrinkage parameter λ, the optimal number of iterations Mopt, … can be different; sec. 4.3, para. 3, where in practice one typically chooses the shrinkage parameter λ [learning rate] beforehand and varies the number of iterations M with respect to the chosen shrinkage [search optimal number of iterations for a learning rate]; i.e., Natekin suggests searching for the optimal number of iterations given a shrinkage parameter [learning rate], and changing the shrinkage parameter [learning rate] to regularize the model with an additional search for the optimal number of iterations);
and based on termination criteria, identifying a resulting candidate model of a final cycle of the machine learning system as an optimized model (Natekin, sec. 4, para. 2 – 3, where overfitting a GBM is possible … to decrease the overfitting effect in GBMS, … regularization techniques that are most frequently used; i.e., Natekin suggests performing regularization when there is overfitting; once the overfitting condition has dissipated, regularization is no longer needed).
Jain and Natekin both teach gradient boosting machine learning models and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Jain’s disclosure of a genetic algorithm for finding the optimal number of iterations and feature selection in a gradient boosting machine with Natekin’s disclosure of changing the learning rate to regularize the model with an additional search for the optimal number of iterations to teach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to reach a balance between accuracy requirements and computational cost (Natekin, sec. 4.2, para. 3, where smaller parameter … better generalization … cost … is the convergence speed [number of iterations; computational cost]).

Regarding Claim 12, Jain in view of Natekin discloses the system of Claim 11. Jain in view of Natekin further discloses:
wherein adjusting the learning rate of the machine learning system based on the optimal boosting iterations hyper parameter of the selected candidate model comprises: determining whether the selected candidate model satisfies an optimal boosting iterations constraint; and based on determining that the selected model does not satisfy the optimal boosting iterations constraint, adjusting the learning rate of the machine learning system (Natekin, sec. 4, para. 1, where model can easily overfit the data [not satisfy the optimal boosting iterations constraint]; sec. 4, para. 3, where to decrease the overfitting effect in GBMs a number of different approaches were introduced, including shrinkage, which changes a learning parameter λ [learning rate]; i.e., Natekin suggests performing regularization by changing the learning rate if the model overfits [not satisfy the optimal boosting iterations constraint]).

Regarding Claim 14, Jain in view of Natekin discloses the system of Claim 11. Jain in view of Natekin further discloses:
wherein the instructions cause the machine learning system to determining the first optimal boosting iterations hyper parameter for the first candidate model is based on a maximum number of boosting iterations (Natekin, sec. 4.3 Early Stopping, para. 5, where maximum number of iterations Mmax … is specified in the process of searching the optimal boosting iterations).

Regarding Claim 15, Jain in view of Natekin discloses the system of Claim 14. Jain in view of Natekin further discloses:
wherein the instructions cause the machine learning system to determining the first optimal boosting iterations hyper parameter for the first candidate model by causing the machine learning system to: for each number of boosting iterations between an initial number of boosting iterations and the maximum number of boosting iterations, evaluating fitness of the first candidate model using the number of boosting iterations; and selecting the number of boosting iterations that provides a highest fitness for the first candidate model as the first optimal boosting iterations hyper parameter (Natekin, fig. 5 & sec. 4.3, para. 1 – 2 & para. 7 – 8, where fig. 5C and 5D evaluate different fitness score from the first bootstrapping number of iterations [initial number of boosting iterations] to the maximum number of boosting iterations 3000 to determine the optimal boosting iterations).

Regarding Claim 17, Claim 17 is the corresponding one or more non-transitory computer readable media claim of Claim 1. Jain further discloses: one or more non-transitory computer readable media storing instruction that, when executed by one or more processors, cause a machine learning system to perform steps (Jain, page 4, where XGBoost is an … library … implement machine learning algorithm; page 10 – 11, where the program code is executed in a computer system having memory to store the instructions). Claim 17 is rejected for the same reasons as Claim 1.

Regarding Claim 18, Jain in view of Natekin discloses the one or more non-transitory computer readable media of Claim 17. Jain in view of Natekin further discloses:
Wherein adjusting the learning rate of the machine learning system based on the optimal boosting iterations hyper parameter of the selected candidate model comprises: determining whether the selected candidate model satisfies at least one solution constraint and based on determining that the selected model does not satisfy the at least one solution constraint, adjust the learning rate of the machine learning system (Natekin, sec. 4, para. 1, where model can easily overfit the data [not satisfy the solution constraint]; sec. 4, para. 3, where to decrease the overfitting effect in GBMs a number of different approaches were introduced, including shrinkage, which changes a learning parameter λ [learning rate]; i.e., Natekin suggests performing regularization by changing the learning rate if the model overfits [not satisfy solution constraint]).

Regarding Claim 19, Jain in view of Natekin discloses the one or more non-transitory computer readable media of Claim 17. Jain in view of Natekin further discloses:
wherein the instructions cause the machine learning system to determine the first optimal boosting iterations hyper parameter for the first candidate model by causing the machine learning system to: for each number of boosting iterations between an initial number of boosting iterations and a maximum number of boosting iterations, evaluating fitness of the first candidate model using the number of boosting iterations; and selecting the number of boosting iterations that provides a highest fitness for the first candidate model as the first optimal boosting iterations hyper parameter (Natekin, fig. 5 & sec. 4.3, para. 1 – 2 & para. 7 – 8, where fig. 5C and 5D evaluate different fitness score from the first bootstrapping number of iterations [initial number of boosting iterations] to the maximum number of boosting iterations 3000 to determine the optimal boosting iterations).


Claims 4 – 6, 9 – 10, 13, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Jain, Hyperparameter tuning in XGBoost using genetic algorithm, Towards Data Science, Sep. 2018, in view of Natekin, Gradient Boosting Machines, a Tutorial, Frontiers in Neurorobotics, 2013, and further in view of Ferrucci, A Framework for Genetic Algorithms Based on Hadoop, arXiv, 2013.

Regarding Claim 4, Jain in view of Natekin discloses the method of Claim 1. Jain in view of Natekin does not explicitly disclose:
wherein the termination criteria comprises a predetermined number of cycles of the machine learning system.
Ferrucci explicitly discloses:
wherein the termination criteria comprises a predetermined number of cycles of the machine learning system (Ferrucci, sec. IV A, para. 8, ln. 4 – 7, where during each generation, every subset is evaluated by computing the accuracy value … until target accuracy is achieved or maximum number of generation is reached).
Jain (in view of Natekin) and Ferrucci both teach genetic algorithms for hyperparameter tuning and feature selection and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the disclosure of Jain (in view of Natekin) of a genetic algorithm for finding the optimal number of iterations and feature selection in a gradient boosting machine with Ferrucci’s disclosure of the implementation details of a genetic algorithm to teach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to develop and execute full applications (Ferrucci, sec. I, para. 5, ln. 2 – 3).

Regarding Claim 5, Jain in view of Natekin discloses the method of Claim 1. Jain in view of Natekin does not explicitly disclose:
wherein the termination criteria comprises a threshold fitness value for the resulting candidate model.
Ferrucci explicitly discloses:
wherein the termination criteria comprises a threshold fitness value for the resulting candidate model (Ferrucci, sec. IV A, para. 8, ln. 4 – 7, where during each generation, every subset is evaluated by computing the accuracy value … until target accuracy [threshold fitness value] is achieved or maximum number of generation is reached).
The rationale for the combination is the same as for Claim 4.
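
For illustration only, the two termination criteria addressed in Claims 4 and 5 (a predetermined number of cycles and a threshold fitness value) can be sketched as a single loop. The function name, the `step` callable, and the example values are hypothetical and are not taken from Ferrucci.

```python
def run_until_terminated(step, max_cycles=10, target_fitness=None):
    """Run cycles until the target fitness is reached or max_cycles elapse.

    `step` is an assumed callable taking the cycle index and returning the
    best fitness observed in that cycle.
    Returns (best fitness seen, number of cycles actually run).
    """
    best = float("-inf")
    cycles_run = 0
    for cycle in range(max_cycles):
        best = max(best, step(cycle))
        cycles_run += 1
        # Threshold criterion: stop early once the target is met.
        if target_fitness is not None and best >= target_fitness:
            break
    return best, cycles_run


if __name__ == "__main__":
    # Toy fitness that improves by 10 per cycle.
    print(run_until_terminated(lambda c: c * 10, max_cycles=10,
                               target_fitness=25))
```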

Regarding Claim 6, Jain in view of Natekin discloses the method of Claim 1. Jain in view of Natekin does not explicitly disclose:
wherein generating each respective first candidate model of the first generation of candidate models comprises: selecting the first subset of features randomly and based on a maximum allowed features constraint.
Ferrucci explicitly discloses:
wherein generating each respective first candidate model of the first generation of candidate models comprises: selecting the first subset of features randomly and based on a maximum allowed features constraint (Ferrucci, sec. IV. A, para. 9, ln. 2 – 4, where the algorithm generates the r random attribute subsets; sec. IV. A, para. 3, ln. 8 – 9, where it is clear that saving the number of collected features means saving money, i.e., the number of features is a desired constraint).
The rationale for the combination is the same as for Claim 4.

Regarding Claim 9, Jain in view of Natekin discloses the method of Claim 1. Jain in view of Natekin does not explicitly disclose:
wherein the evolution process employs a crossover function configured to repair candidate solutions that exceed a maximum number of allowed features.
Ferrucci explicitly discloses:
wherein the evolution process employs a crossover function configured to repair candidate solutions that exceed a maximum number of allowed features (Ferrucci, fig. 2 & sec. IV A, para. 8, Crossover, where splitting the parent of each couple into two parts … and then mixing the parts to obtaining two new children; as demonstrated in fig. 2, the first row of crossover children has feature 2, which does not exist in the first row parent. In the case where the parent does not have feature 2 due to the limited number of features, the crossover process can bring feature 2 to the children).
The rationale for the combination is the same as for Claim 4.
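
For illustration only, one way the recited limitation could look in code is a single-point crossover followed by a repair step that drops features at random until the limit is met. The repair strategy, the boolean-mask representation, and the function name are assumptions made for this sketch; they are not taken from Ferrucci or from the application.

```python
import random


def crossover_with_repair(parent_a, parent_b, max_features, rng):
    """Single-point crossover, then repair if too many features are selected.

    Candidates are boolean masks over features (an assumed representation).
    """
    cut = rng.randrange(1, len(parent_a))
    child = parent_a[:cut] + parent_b[cut:]  # new list; parents untouched
    selected = [i for i, used in enumerate(child) if used]
    excess = len(selected) - max_features
    if excess > 0:
        # Hypothetical repair: switch off randomly chosen surplus features.
        for i in rng.sample(selected, excess):
            child[i] = False
    return child


if __name__ == "__main__":
    rng = random.Random(1)
    print(crossover_with_repair([True] * 6, [True] * 6, 3, rng))
```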

Regarding Claim 10, Jain in view of Natekin discloses the method of Claim 1. Jain in view of Natekin does not explicitly disclose:
wherein the evolution process employs a mutation function configured to reduce a number of features selected in a given candidate model.
Ferrucci explicitly discloses:
wherein the evolution process employs a mutation function configured to reduce a number of features selected in a given candidate model (Ferrucci, fig. 2 & sec. IV A, para. 8, Mutation, where, according to a probability to mutate, during this step each subset may change the attribute into itself; as demonstrated in fig. 2, the third row of the mutation replaces feature 9 with feature 8 and thus reduces the number of features).
The reason for combination is the same as Claim 4.
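For illustration only, a mutation function that reduces the number of selected features can be sketched as follows. This is a minimal sketch, not the mutation operator of Ferrucci or of the applicant: the 0/1 mask representation, the name `reducing_mutation`, and the choice to deselect exactly one feature per mutation are assumptions.

```python
import random


def reducing_mutation(mask, mutation_prob, rng=None):
    """With probability mutation_prob, deselect one selected feature."""
    rng = rng or random.Random()
    mutated = list(mask)  # copy so the input candidate is left unchanged
    selected = [i for i, bit in enumerate(mutated) if bit]
    # Keep at least one feature so the candidate remains a usable model.
    if len(selected) > 1 and rng.random() < mutation_prob:
        mutated[rng.choice(selected)] = 0
    return mutated
```

Under these assumptions, a mutated candidate never has more selected features than the original and never has fewer than one.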

Regarding Claim 13, Jain in view of Natekin discloses the method of Claim 11. Jain in view of Natekin does not explicitly disclose:
wherein the termination criteria comprises a predetermined number of cycles of the machine learning system.
Ferrucci explicitly discloses:
wherein the termination criteria comprises a predetermined number of cycles of the machine learning system (Ferrucci, sec. IV A, para. 8, ln. 4 – 7, where during each generation, every subset is evaluated by computing the accuracy value … until the target accuracy is achieved or the maximum number of generations is reached).
Jain (in view of Natekin) and Ferrucci both teach genetic algorithms for hyperparameter tuning and feature selection and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Jain (in view of Natekin)’s disclosure of a genetic algorithm for finding the optimal number of iterations and feature selection in a gradient boosting machine with Ferrucci’s disclosure of the implementation details of a genetic algorithm to teach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to develop and execute full applications (Ferrucci, sec. I, para. 5, ln. 2 – 3).
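For illustration only, a termination criterion of the kind discussed for Claim 13 (stop when a target accuracy is achieved or a predetermined number of generations is reached) can be sketched as follows. The names `run_until_terminated` and `step`, and the convention that `step` returns the best accuracy of one generation, are assumptions made for this sketch and are not taken from the cited references.

```python
def run_until_terminated(step, target_accuracy, max_generations):
    """Run generations until the target accuracy or the generation cap.

    step(generation) is a hypothetical callback that evolves and evaluates
    one generation and returns that generation's best accuracy.
    Returns (generations_run, best_accuracy_seen).
    """
    best = float("-inf")
    for generation in range(1, max_generations + 1):
        best = max(best, step(generation))
        if best >= target_accuracy:
            return generation, best  # target accuracy achieved
    return max_generations, best     # predetermined number of cycles reached
```

In this sketch, the generation cap guarantees that the loop ends even when the target accuracy is never achieved.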

Regarding Claim 16, Jain in view of Natekin discloses the method of Claim 11. Jain in view of Natekin does not explicitly disclose: wherein the evolution process employs:
a first evolutionary operation configured to repair candidate solutions that exceed a maximum number of allowed features; and 
a second evolutionary operation configured to reduce a number of features selected in a given candidate model.
Ferrucci explicitly discloses:
a first evolutionary operation configured to repair candidate solutions that exceed a maximum number of allowed features (Ferrucci, fig. 2 & sec. IV A, para. 8, Crossover, where the crossover splits the parent of each couple into two parts … and then mixes the parts to obtain two new children; as demonstrated in fig. 2, the first row of crossover children has feature 2, which does not exist in the first-row parent. In the case where the parent does not have feature 2 due to the limited number of features, the crossover process can bring feature 2 to the children).
a second evolutionary operation configured to reduce a number of features selected in a given candidate model (Ferrucci, fig. 2 & sec. IV A, para. 8, Mutation, where, according to a probability to mutate, during this step each subset may change the attribute into itself; as demonstrated in fig. 2, the third row of the mutation replaces feature 9 with feature 8 and thus reduces the number of features).
The reason for combination is the same as Claim 13.

Regarding Claim 20, Jain in view of Natekin discloses the method of Claim 17. Jain in view of Natekin does not explicitly disclose: wherein the evolution process employs:
a first evolutionary operation configured to repair candidate solutions that exceed a maximum number of allowed features; and 
a second evolutionary operation configured to reduce a number of features selected in a given candidate model.
Ferrucci explicitly discloses:
a first evolutionary operation configured to repair candidate solutions that exceed a maximum number of allowed features (Ferrucci, fig. 2 & sec. IV A, para. 8, Crossover, where the crossover splits the parent of each couple into two parts … and then mixes the parts to obtain two new children; as demonstrated in fig. 2, the first row of crossover children has feature 2, which does not exist in the first-row parent. In the case where the parent does not have feature 2 due to the limited number of features, the crossover process can bring feature 2 to the children).
a second evolutionary operation configured to reduce a number of features selected in a given candidate model (Ferrucci, fig. 2 & sec. IV A, para. 8, Mutation, where, according to a probability to mutate, during this step each subset may change the attribute into itself; as demonstrated in fig. 2, the third row of the mutation replaces feature 9 with feature 8 and thus reduces the number of features).
Jain (in view of Natekin) and Ferrucci both teach genetic algorithms for hyperparameter tuning and feature selection and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Jain (in view of Natekin)’s disclosure of a genetic algorithm for finding the optimal number of iterations and feature selection in a gradient boosting machine with Ferrucci’s disclosure of the implementation details of a genetic algorithm to teach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to develop and execute full applications (Ferrucci, sec. I, para. 5, ln. 2 – 3).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIEN MING CHOU whose telephone number is (571)272-9354. The examiner can normally be reached Monday – Friday, 9 am – 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHAKI KAKALI can be reached on (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/S.C./Examiner, Art Unit 2122
/BRIAN M SMITH/Primary Examiner, Art Unit 2122