DETAILED ACTION
This is the response to applicant’s amendment action regarding application number 16/166,039, filed October 19, 2018.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
The amendment filed December 30, 2021 has been entered. Examiner acknowledges receipt of Amendments to Application 16/166,039, which include: Amendments to the Drawings and Appendix (1 page), Amendments to the Specification, Amendments to the Claims, and Remarks (containing applicant’s amendments). 
Regarding Applicant’s Remarks on p.18, Examiner acknowledges Claims 1-5, 7-16, and 18-22 have been amended. Claims 1-22 remain pending in the application. 
Regarding Applicant’s Amendments to the Claims, Examiner acknowledges the claim objections identified in Claims 11 and 22 have been resolved, and therefore the respective objections previously set forth in the Non-Final Office Action mailed September 3, 2021 are withdrawn. However, Examiner has noted that the Applicant’s Amendments to the Claims have resulted in new claim objections, which are identified in the relevant section indicated below.
Regarding Applicant’s Remarks on p.19, Examiner acknowledges Applicant has explicitly indicated that the double-arrows in Figure 2 are adequately supported in Applicant’s specification paragraphs [0058]-[0060]. Therefore, the respective drawing objection for Figure 2 previously set forth in the Non-Final Office Action mailed September 3, 2021 is withdrawn. Examiner also acknowledges Applicant’s Amendments to the Drawings include versions of Figures 3A and 3B that show a sharper version of the graphs within the two figures. Applicant has also indicated that Figure 3B contains only one graph showing a single percentage speedup over the reference variant for the same training data set, thus clarifying the ambiguous statement provided in the Applicant’s specification paragraph [0069]: “FIG. 3B is a graph that depicts speedup percentages of the same mini-ML variant as in FIG. 3A over the same reference variant as in FIG. 3A for training training data sets, in an embodiment.”. However, while the graphs in both Figures 3A and 3B are now visible, the text surrounding these graphs within each figure (e.g., the x- and y- plots and associated captions and legends) is still blurry and unreadable, which has been confirmed by the Office as being an inherent defect in the submission. Hence Examiner requests that the Applicant re-submit a new set of figures in which both graphs and text within the figures are sharply presented and readable. Accordingly, the respective drawing objections for Figures 3A and 3B previously set forth in the Non-Final Office Action mailed September 3, 2021 are still maintained.
Regarding Applicant’s Remarks on p.20, Examiner acknowledges Applicant’s Amendments to the Specification have resolved the respective specification objections, and therefore the respective objections previously set forth in the Non-Final Office Action mailed September 3, 2021 are withdrawn.
Regarding Applicant’s Remarks on p.18, Examiner acknowledges Applicant’s Amendments to the Claims have resolved the indefiniteness/lack of antecedent basis issues in Claims 1-22, and therefore the respective §112(b) rejections previously set forth in the Non-Final Office Action mailed September 3, 2021 for Claims 1-22 are withdrawn. 

Response to Arguments
Examiner acknowledges receipt of Arguments to Application 16/166,039, which include: Remarks (containing applicant’s arguments). 
Regarding Applicant’s Remarks on p.19 for the Abstract objection previously set forth in the Non-Final Office Action mailed September 3, 2021, Examiner acknowledges Applicant has declined to shorten the length of the Abstract to the range of 50-150 words. Examiner notes on the record that the current length of the Applicant’s Abstract is 14 lines and 193 words (not including the Abstract heading). Applicant has recited the contents of 37 C.F.R. 1.72 in MPEP 608.01(b), indicating that the abstract length is “preferably not exceeding 150 words in length”. Hence, based on the above statement contained in 37 C.F.R. 1.72, the respective objection to the Abstract is withdrawn. 
However, Examiner further notes that while 37 C.F.R. 1.72 does state that the abstract is “preferably not exceeding 150 words in length”, 37 C.F.R. 1.72 also provides the motivation behind this preference: “The abstract must be as concise as the disclosure permits, preferably not exceeding 150 words in length. The purpose of the abstract is to enable the Office and the public generally to determine quickly from a cursory inspection the nature and gist of the technical disclosure.”. As shown in the identified text below, MPEP 608.01(b) in general indicates that an applicant has a responsibility to preferably make the necessary changes prior to issue in an effort to bring the abstract (and the specification as a whole) into compliance with the provided guidelines, to facilitate prosecution prior to issue, as the concise contents provided in the Abstract will eventually be printed on the patent itself.
The Office of Patent Application Processing (OPAP) will review all applications filed under 35 U.S.C. 111(a)  for compliance with 37 CFR 1.72  and will require an abstract, if one has not been filed. In all other applications which lack an abstract, the examiner in the first Office action should require the submission of an abstract directed to the technical disclosure in the specification. Applicants may use either "Abstract" or "Abstract of the Disclosure" as a heading.

If the abstract contained in the application does not comply with the guidelines, the examiner should point out the defect to the applicant in the first Office action, or at the earliest point in the prosecution that the defect is noted, and require compliance with the guidelines. Since the abstract of the disclosure has been interpreted to be a part of the specification for the purpose of compliance with 35 U.S.C. 112  (In re Armbruster, 512 F.2d 676, 678-79, 185 USPQ 152, 154 (CCPA 1975)), it would ordinarily be preferable that the applicant make the necessary changes to the abstract to bring it into compliance with the guidelines. 
Replies to such actions requiring either a new abstract or amendment to bring the abstract into compliance with the guidelines should be treated under 37 CFR 1.111(b)  practice like any other formal matter. Any submission of a new abstract or amendment to an existing abstract should be carefully reviewed for introduction of new matter, 35 U.S.C. 132, MPEP § 608.04. The abstract will be printed on the patent. 

Upon passing the application to issue, the examiner should make certain that the abstract is an adequate and clear statement of the contents of the disclosure and generally in line with the guidelines. If the application is otherwise in condition for allowance except that the abstract does not comply with the guidelines, the examiner generally should make any necessary revisions by a formal examiner’s amendment after obtaining applicant’s authorization (see MPEP § 1302.04) rather than issuing an Ex parte Quayle action requiring applicant to make the necessary revisions.

Regarding Applicant’s Remarks on pp.20-23 for Claims 1-7, 9-17, and 20-22 under 35 U.S.C. §102(a)(1) as being anticipated by Kobayashi et al., U.S. PGPUB 2017/0061329, published 3/2/2017 [hereafter referred to as Kobayashi], Examiner acknowledges Applicant’s arguments, has considered them, and has found them to be not persuasive. Hence the existing 35 U.S.C. §102(a)(1) rejections are still maintained, and the updated claim mappings according to the applicant’s amended claims are provided in the sections indicated below.
Regarding applicant’s Remarks on pp.21-22:
“Kobayashi fails to describe at least these features of amended Claim 1. Instead, Kobayashi describes executing multiple machine learning algorithms and based on the execution results determining the "increase rates" of prediction performances. (Kobayashi, Abstract.) In Kobayashi, the "increase rate" is measured as the dependency of performance improvement from the training time (size of the training data set). The machine learning algorithm with the highest "increase rate" is selected. (Kobayashi, FIG. 1, par. [0055]-[0062].)
Unlike the selection based on the highest "increase rate" metric used in Kobayashi, Claim 1, as amended, features comparing the performance score to determine the differences in performance scores of a new variant and a reference for selecting a new hyper-parameter value for a low-cost variant of the machine learning algorithm.
Even if, for argument purpose only (although not explicitly disclosed in Kobayashi), Kobayashi's "increase rate" -based selection implicitly involves comparison of performance scores, such comparison cannot possibly describe accuracy criteria that determine acceptable measure for the differences in the performance scores of the new variant and of the reference variant of the machine learning algorithm; much less selecting a new hyper-parameter value for a low-cost variant of the machine learning algorithm based on the acceptable measure of the differences and the cost metric of the new variant, as featured in amended Claim 1.
For at least the reasons provided above, amended Claim 1 recites one or more features that are not anticipated by Kobayashi.”
Examiner has considered this argument, and finds the argument to be not persuasive. Examiner notes that the bulk of the Applicant’s arguments are directed to the newly added claim limitations not previously presented that are now recited in the respective independent claims, where these new claim limitations necessitate further examination and re-evaluation of the amended and related original claims. These updated claim mappings according to the Applicant’s amended claims are provided in the sections indicated below. However, Examiner has noted that the Applicant has introduced a sub-argument that the “increase rate” cited in the Kobayashi reference is not the same as the “performance score” recited in the Applicant’s claims. Applicant’s assertion is not persuasive in view of the following claim limitations:
“… determining a performance score for the new variant of machine learning algorithm using a training dataset, the performance score representing the accuracy of the new machine learning model for the training data set;
comparing the performance score of the new variant of machine learning algorithm for the training dataset with a performance score of the reference variant of machine learning algorithm for the training data set;…”
As indicated in the Non-Final Office Action mailed September 3, 2021, Examiner points to Kobayashi [0078]-[0079], which states that “The machine learning device 100 calculates the “prediction performance” of a learned model. The prediction performance is the capability of accurately predicting results of unknown cases and may be referred to as “accuracy” … The accuracy, precision, RMSE, or the like may be used as the index representing the prediction performance. … the RMSE is calculated by $(\mathrm{sum}(y - \hat{y})^{2}/N)^{1/2}$ if the result value and the predicted value of an individual case are represented by $y$ and $\hat{y}$, respectively.”, thereby indicating that a prediction performance is a measure of accuracy, as well as listing various measurement indices to determine (evaluate) prediction performance, including the use of a RMSE metric (which specifies a calculation involving a difference between prediction values as a way to measure increased accuracy). A person having ordinary skill in the art would understand that this comparison to identify the best prediction performance between the two prediction performance values will select a variant with the higher prediction performance value, where the higher value is indirectly based on measuring a difference between these two prediction performance values, and as such this comparison that indirectly uses a difference between these two prediction performance values (choosing the higher of the two prediction values) corresponds to a measurement of increased accuracy. Therefore, given the evidence shown above, Applicant’s prior art argument is not persuasive, and the prior art rejection is maintained.
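For illustration only, the RMSE expression quoted above and a comparison of two prediction performance values can be sketched in Python as follows. This is a minimal sketch and is not taken from Kobayashi or from Applicant’s disclosure: the function names `rmse` and `score_difference`, the variable names, and the sample values are all hypothetical. The sketch assumes, consistent with the usual meaning of RMSE, that a lower RMSE value indicates more accurate predictions.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: (sum((y - y_hat)^2) / N)^(1/2)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error_sum = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(squared_error_sum / len(actual))


def score_difference(reference_score, new_score):
    """Difference of two RMSE values; positive means the new variant is more accurate."""
    return reference_score - new_score


# Hypothetical result values and predictions, chosen only for this example.
actual_values = [1.0, 2.0, 3.0, 4.0]
reference_predictions = [1.0, 2.0, 3.0, 6.0]
new_predictions = [3.0, 4.0, 5.0, 6.0]

reference_rmse = rmse(actual_values, reference_predictions)
new_rmse = rmse(actual_values, new_predictions)
difference = score_difference(reference_rmse, new_rmse)

# Choosing the more accurate variant implicitly relies on the sign of the difference.
more_accurate = "new" if difference > 0 else "reference"
print(reference_rmse, new_rmse, difference, more_accurate)
```

With these assumed values the reference predictions yield the lower RMSE, so a selection based on this comparison would retain the reference variant.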
Regarding applicant’s Remarks on pp.22-23:
“Arriving at the Applicant's invention of amended Claim 1 based on Kobayashi also would require impermissible hindsight, at least on the present evidentiary record. There is no evidence of record that at the time of the invention, one of ordinary skill in the art would have thought of the features of amended Claim 1 discussed above on the basis of Kobayashi. Indeed, the only teaching for modifying Kobayashi to include any additional limitations exists in the present application, which the Examiner cannot use as a roadmap to find all of the claim features without relying on impermissible hindsight. Accordingly, withdrawal of this rejection is respectfully solicited.”.
Examiner has considered this argument, and finds the argument to be not persuasive. MPEP 2141 describes the examination guidelines for determining obviousness under 35 U.S.C. 103, and MPEP 2145 describes impermissible hindsight in the context of establishing a prima facie case of obviousness. However, according to the Non-Final Office Action mailed on September 3, 2021, Applicant’s independent claims 1 and 12 were rejected under §102(a)(1) as being anticipated by the Kobayashi reference (U.S. PGPUB 2017/0061329, published 3/2/2017, with foreign priority date 8/31/2015), with no other reference used in combination with the Kobayashi reference for rejecting the independent claims. Examiner notes that Applicant does not cite any case law or MPEP guidelines that indicate impermissible hindsight with regards to a §102(a)(1) anticipation rejection. Hence, Applicant’s argument to withdraw the existing §102(a)(1) anticipation rejection based on impermissible hindsight does not make sense, given the Examiner did not use a combination of references in the earlier rejection of the independent claims (i.e., no impermissible hindsight was used in the earlier §102(a)(1) anticipation rejection), and Applicant did not cite any case law or MPEP guidelines related to impermissible hindsight with regards to a §102(a)(1) anticipation rejection to support their argument. Furthermore, Examiner also reminds Applicant that it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning. See MPEP 2145. But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the applicant's disclosure, such a reconstruction is proper. See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971). 
Given the above points, Applicant’s argument concerning impermissible hindsight is not persuasive, and the prior art rejection is maintained.
As indicated earlier, Examiner notes that the remainder of the Applicant’s prior art arguments are directed to the newly added claim limitations not previously presented that are now recited in the 

Drawings
The drawings are objected to because of the following informality: Regarding the newly-submitted Figures 3A and 3B filed on December 30, 2021, the grainy quality of the text surrounding these two figures is still present, making it difficult to read the text within the figures (e.g., the x- and y- plots and associated captions and legends). Examiner requests that the Applicant re-submit a new set of figures in which both graphs and text within the figures are sharply presented and readable. Appropriate correction is required.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claims 9, 11, and 22 are objected to because of the following informalities:
Claim 9: Remove the word “the” from the following claim limitation: “wherein the new hyper-parameter value is based on a previous hyper-parameter value of a previous machine learning algorithm generated from the reference variant of the machine learning algorithm”. Appropriate correction is required.
Claims 11 and 22: The term “machine learning algorithms” needs to be changed to the singular form throughout these claims, given that the amended term “variants” (changed from “variant”) now establishes the plural form, as recited in the following limitations: 
“wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithm[[s]] for which a corresponding plurality of modified, less costly, variants of the reference variant of machine learning algorithm[[s]] are generated, …
for each reference variant of the plurality of reference variants of the machine learning algorithm[[s]], selecting a respective modified variant; …”
Appropriate correction is required.

Claim Rejections – 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-6, 9-17, and 20-22 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kobayashi et al., U.S. PGPUB 2017/0061329, published 3/2/2017 [hereafter referred to as Kobayashi].
Regarding amended Claim 1, 
Kobayashi teaches
(Currently Amended) A computer-implemented method comprising: 
selecting a reference variant of a machine learning algorithm, the reference variant of the machine learning algorithm indicating at least one hyper-parameter having at least one original hyper-parameter value (Examiner’s note: Under its broadest reasonable interpretation, a “reference variant of machine learning algorithm” broadly recites a machine learning model that is used as a reference (or baseline) for other machine learning models. Kobayashi teaches an embodiment (a “fifth embodiment” Kobayashi [0229]) that generates and determines different machine learning models, where each of these machine learning models is based on a machine learning algorithm and contains a set of hyperparameter vectors (representing a set of hyperparameters and associated values). Kobayashi additionally teaches a learning control unit 135c that specifies a machine learning algorithm $a_i$ with a sample size $s_j$ from a data set D (“training data set”) that manages the execution of a set of learning steps j based on the same machine learning algorithm $a_i$ (with each learning step j producing variants of the same machine learning algorithm) that includes at least one hyperparameter in its associated hyperparameter vector, where initial values for the vector are fixed (Kobayashi [0205]-[0206]), and also teaches a step execution unit performing cross-validation or random sub-sampling validation to identify and select the highest prediction performances out of H iterations to identify a representative machine learning model for the executed learning step. At learning step j=1, the step execution unit will select and output a model representing the highest prediction performances out of H iterations, where the model information (step number, prediction performance, algorithm information including hyperparameter vector and execution learning step time) is further stored in a management table to be used for comparison in additional learning steps j=2,3,4,…, which also generate additional representations of a machine learning model, all of which are being executed within an execution learning time $T_{i,j}^{q}$. Hence, a “reference variant of machine learning algorithm” is interpreted as the representative model selected by the step execution unit at this first learning step j=1, and the model selection process performed by the step execution unit represents the selection of the reference variant of machine learning algorithm (Kobayashi Figure 23, elements 135c, 138c and [0256]; Figure 24, steps S118, S119 and [0261], [0266]-[0269], where the step execution unit 138c refers to steps recited in the fourth embodiment and taught in Figure 19, [0216]-[0225]: “The step execution unit 138 recognizes the machine learning algorithm $a_i$ and sample size $s_j$ … In addition, the step execution unit 138 recognized the data set D held in the data storage unit 121 … (S71) The step execution unit requests the hyperparameter adjustment unit 137 for a hyperparameter vector to be used next. … (S75) The step execution unit 138 learns a model m by using the machine learning algorithm $a_i$, the hyperparameter vector $\theta_h$, and the training data $D_t$ … (S76) The step execution unit calculates the prediction performance p of the model m by using the learned model m and the test data $D_s$ … (S78) The step execution unit 138 calculates the average value of the K prediction performances p … as a prediction performance $p_h$ that corresponds to the hyperparameter vector $\theta_h$ … (S79) The step execution unit 138 executes cross validation … Next, the operation proceeds to step S80. … (S80) The step execution unit 138 compares the number of times of the repetition of the above steps S71 to S79 with a threshold H and determines whether the former is less than the latter. …”).); 
modifying the at least one hyper-parameter from the at least one original hyper-parameter value to a new hyper-parameter value thereby generating a new variant of the machine learning algorithm with the at least one hyper-parameter having the new hyper-parameter value (Examiner’s note: Under its broadest reasonable interpretation, a “new variant of the machine learning algorithm” is interpreted as a machine learning model that is produced and selected after the “reference variant of the machine learning algorithm”. As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, where each learning step performed by a learning step unit and step execution unit produces representative machine learning models based on the same machine learning algorithm using a sample size                         
                            
                                
                                    s
                                
                                
                                    j
                                
                            
                        
                     from a training data set (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]). Hence, the generated representative machine learning model produced by the step execution unit in learning step j=2 represents a “new variant of machine learning algorithm”. Kobayashi further teaches each learning step uses a search region determination unit that identifies a set of hyperparameter vectors to use in each learning step j based on methods such as grid search and random search, as well as a hyper-parameter adjustment unit that performs further selection of the set of hyperparameter vectors (also using grid search, random search, or other alternative methods). A person having ordinary skill in the art would understand that methods such as grid search and random search are used to determine a set of optimal hyperparameter values (where at least one of the hyperparameters values in the hyperparameter vector are changed as part of the grid search/random search), and hence, these steps of determining a hyperparameter vector in a search region and further selecting the set of hyperparameter vectors using these methods for each learning step j (e.g., learning step j=2) represents modifying hyperparameter values for generating a new variant of the machine learning algorithm containing these modified Kobayashi Figure 19 and [0209]: “… the hyperparameter adjustment unit 137 generates a hyperparameter vector applied to a machine learning algorithm to be executed by the step execution unit 138. 
Grid search or random search may be used to generate the hyperparameter vector.”; Figure 23, element 137c, 139, and [0250]: “… the search region determination unit 139 selects hyperparameter vectors … through random search, grid search, or the like …”; and [0265]-[0267]: “(S117) The search region determination unit 139 determines a search region that corresponds to the virtual algorithm $a_i^q$ … and the sample size $s_j$. Namely, the search region determination unit 139 determines the hyperparameter vector set $\Phi_{i,j}^q$ … (S118) The step execution unit 138c executes the j-th learning step of the virtual algorithm $a_i^q$. Namely, the hyperparameter adjustment unit 137c selects a hyperparameter vector included in the search region determined in step S117 or a hyperparameter vector near the hyperparameter vector. The step execution unit 138c applies the selected hyperparameter vector to the machine learning algorithm $a_i$ and learns a model by using training data having the sample size $s_j$. … The step execution unit 138c repeats the above processing for a plurality of hyperparameter vectors. The step execution unit 138c determines a model, the prediction performance $p_{i,j}^q$ and the execution time $T_{i,j}^q$ from the results of the learning not stopped. … (S119) The learning control unit 135c acquires the learned model, the prediction performance $p_{i,j}^q$ thereof, the execution time $T_{i,j}^q$ from the step execution unit 138c.”).);
determining a performance score for the new variant of the machine learning algorithm using a training data set, the performance score representing an accuracy of a new machine learning model generated by training the new variant of the machine learning algorithm with the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, where each learning step (controlled by a learning control unit and step execution unit) produces representative machine learning models based on the same machine learning algorithm using a sample size $s_j$ from a training data set (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]), and where the generated representative machine learning model produced by the step execution unit in learning step j=2 represents a “new variant of machine learning algorithm”. Kobayashi further teaches the step execution unit performing cross-validation or random sub-sampling methods to identify and learn a model based on the training data, the machine learning algorithm, and the associated hyperparameter vector in each of H iterations, and outputting the highest prediction performance (and its associated hyperparameter vector) out of the H iterations to identify the representative machine learning model for the executed learning step. Kobayashi teaches that each prediction performance value is a measure of accuracy and is determined through a measurement index (where this measurement index represents an accuracy criterion), such that each of these prediction performance values generated through H iterations of cross-validation or random sub-sampling is a representation of an accuracy value determined through one or more accuracy criteria (Kobayashi [0078]-[0079]: “The machine learning device 100 calculates the “prediction performance” of a learned model. The prediction performance is the capability of accurately predicting results of unknown cases and may be referred to as “accuracy” … The accuracy, precision, RMSE, or the like may be used as the index representing the prediction performance … the RMSE is calculated by $(\mathrm{sum}((y - \hat{y})^2)/N)^{1/2}$ …”; [0166]: “… The index that represents the prediction performance p may be set in advance in the step execution unit …”; and [0222]). Hence, the highest prediction performance value calculated and determined by the step execution unit at learning step j=2 represents a determination of a performance score representing an accuracy for a new variant of the machine learning algorithm (Kobayashi Figure 19 and [0216]-[0225]; and [0226]-[0227]: “… (S81) The step execution unit 138 outputs the highest prediction performance among the prediction performances $p_1$, $p_2$, … $p_H$ as the prediction performance $p_{i,j}$. In addition, the step execution unit 138 outputs a model that corresponds to the prediction performance $p_{i,j}$ among the models $m_1$, $m_2$, … $m_H$. In addition, the step execution unit 138 outputs a hyperparameter vector that corresponds to the prediction performance $p_{i,j}$ among the hyperparameter vectors $\theta_1$, $\theta_2$, … $\theta_H$. In addition, the step execution unit 138 calculates and outputs an execution time. The execution time may be the entire time needed to execute the single learning step from step S70 to step S81 or the time needed to execute steps S72 to S79 from which the outputted model is obtained.”).);
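For illustration only, and not as part of Kobayashi's disclosure, the RMSE index quoted from [0079] and the selection in step S81 of the highest prediction performance among H candidates can be sketched in Python as follows. The function names `rmse` and `select_best` and the list-based inputs are hypothetical; the sketch assumes the performance values are already expressed so that a larger value is better.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: (sum((y - y_hat)^2) / N)^(1/2)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    squared_error = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    return math.sqrt(squared_error / n)


def select_best(performances, models, thetas):
    """Return (p, m, theta) for the highest prediction performance.

    Hypothetical helper: the three lists hold p_1..p_H, m_1..m_H and
    theta_1..theta_H, and a larger performance value is assumed to be better.
    """
    if not performances or not (len(performances) == len(models) == len(thetas)):
        raise ValueError("inputs must be non-empty and of equal length")
    best_h = max(range(len(performances)), key=performances.__getitem__)
    return performances[best_h], models[best_h], thetas[best_h]
```

Because RMSE is an error measure, a caller using it as the index would need to convert it (for example by negation) before passing it to `select_best`; that conversion is an assumption of this sketch and is not drawn from the reference.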
comparing the performance score of the new variant of the machine learning algorithm for the training data set with a performance score of the reference variant of the machine learning algorithm for the training data set to determine at least one difference, of a plurality of differences in performance scores, between the performance score of the new variant of the machine learning algorithm and the performance score of the reference variant of the machine learning algorithm (Examiner’s note: As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, to produce representative machine learning models based on the same machine learning algorithm (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]), where the generated representative machine learning model at learning step j=2 represents a “new variant of machine learning algorithm”. Kobayashi further teaches that, at each learning step j, the learning control unit acquires the learned model from the output of the step execution unit, and compares the selected current learned model’s prediction performance $p_{i,j}^q$ with the achieved prediction performance P, where P represents the prediction performance achieved up to now (which is interpreted as being the highest prediction performance from earlier learning steps). As indicated earlier, Kobayashi teaches that each prediction performance value is an accuracy value based on a measurement index, where the measurement index includes an RMSE (root mean squared error) metric involving a calculation based on a difference between prediction performance results and is applied during each of the H iterations in the step execution unit to produce a set of prediction performances from which the current learned model’s prediction performance was selected as being the highest prediction performance (where this comparison within this set of prediction performances indirectly determines a set of differences among the prediction performances to select the highest one) (Kobayashi [0078]-[0079]; [0166]; [0222]; and [0226]-[0227]). Hence, in the context of learning step j=2, the comparison between the current learned model prediction performance and the achieved prediction performance P (i.e., the prediction performance from the learned model at learning step j=1, where learning step j=1 establishes a “reference variant of machine learning algorithm”) represents “comparing the performance score of the new variant … with a performance score of the reference variant …” (Kobayashi [0267]-[0269]: “(S119) The learning control unit 135c acquires the learned model, the prediction performance $p_{i,j}^q$ thereof, the execution time $T_{i,j}^q$ from the step execution unit 138c. … (S120) The learning control unit 135c compares the prediction performance $p_{i,j}^q$ acquired in step S119 with the achieved prediction performance P (the maximum prediction performance achieved up until now) and determines whether the former is larger than the latter. … If the prediction $p_{i,j}^q$ is larger than the achieved prediction performance P, the operation proceeds to step S121. Otherwise, the operation proceeds to step S122 … (S121) The learning control unit 135c updates the achieved prediction performance P to the prediction performance $p_{i,j}^q$ …”).);
determining a cost metric of the new variant of the machine learning algorithm by measuring usage of computing resources when training the new variant of the machine learning algorithm on the training data set (Examiner’s note: Kobayashi teaches a learning control unit using a learning execution time $T_{i,j}^q$ determined from a time estimation unit to determine whether a machine learning model is taking more computational resources to learn based on the hyperparameter vector being learned during the learning step, where it is interpreted that a machine learning model that completes training within a learning execution time and with a high predictive performance is considered less computationally costly than one that takes more learning execution time to train (Kobayashi [0231]), with the learning execution time representing a cost metric. The determination based on this cost metric is performed by calculating an improvement rate $r_i^q$, which is based on the learning execution time and a performance improvement amount, where a performance improvement amount based on execution time represents a measurement based on usage of computing resources (Kobayashi Figure 23, elements 133c, 134, 135c; Figure 25, step S128 and [0277]: “(S128) The learning control unit 135c updates the total time $t_{sum}$ to $t_{sum} + t_{i,j+1}^q$ on the basis of the execution time $t_{i,j+1}^q$ obtained from the time estimation unit 133c. In addition, the learning control unit 135c calculates the improvement rate $r_i^q = g_{i,j+1}^q / t_{sum}$, on the basis of the updated total time $t_{sum}$ and the performance improvement amount $g_{i,j+1}^q$ acquired from the performance improvement amount estimation unit 134.”).);
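For illustration only, the arithmetic of step S128 as quoted above (updating the total time and dividing the performance improvement amount by the updated total time) can be sketched in Python. The function name `update_improvement_rate` and its parameter names are hypothetical, and the guard against a non-positive total time is an assumption added for the sketch rather than something stated in the reference.

```python
def update_improvement_rate(t_sum, t_next, g_next):
    """Return (updated t_sum, improvement rate r).

    t_sum  : total execution time accumulated so far
    t_next : estimated execution time t_{i,j+1}^q of the next learning step
    g_next : estimated performance improvement amount g_{i,j+1}^q
    """
    updated_t_sum = t_sum + t_next
    if updated_t_sum <= 0:
        # Assumption of this sketch: a rate is only defined for positive time.
        raise ValueError("total time must be positive")
    return updated_t_sum, g_next / updated_t_sum
```

For example, with an accumulated time of 10.0, a next-step estimate of 5.0, and an improvement amount of 0.03, the updated total time is 15.0 and the improvement rate is 0.03 / 15.0.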
based on the cost metric of the new variant of the machine learning algorithm and based at least in part on the plurality of differences in performance scores meeting one or more accuracy criteria, determining whether to select the new hyper-parameter value for a modified, computationally less costly, variant of the machine learning algorithm, wherein the one or more accuracy criteria determine acceptable measure of the plurality of differences in the performance scores of the new variant of the machine learning algorithm and of the reference variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, the term “acceptable measure” broadly recites any measure or indication that uses a difference between performance scores as an accuracy criterion; the limitation “… based at least in part on the plurality of differences in performance scores meeting one or more accuracy criteria” broadly recites the earlier limitation involving comparison of prediction performance scores to determine the best prediction performance; and the limitation “determining … a modified, computationally less costly, variant of the machine learning algorithm” is interpreted as identifying whether the new variant of the machine learning algorithm is the modified variant based on satisfying the criterion of being computationally less costly. As indicated earlier, Kobayashi teaches applying measurement indices in the step execution unit to determine prediction performances for each model during the H iterations of random sub-sampling or cross validation, where the measurement index includes RMSE, and where the measurement index represents an accuracy criterion (Kobayashi [0078]-[0079]; [0166]; [0222]; and [0226]-[0227]).
Kobayashi additionally teaches performing a comparison between the selected current learned model prediction performance and the achieved prediction performance P to determine whether to update P with the learned model representing the best prediction performance (Kobayashi [0267]-[0269]), where this comparison to identify the best prediction performance between two values also represents a measurement of a difference between two prediction performances, and thus also represents another accuracy criteria. Based on the execution of the above two steps, Kobayashi further teaches the learning control unit using the improvement rate                         
                            
                                
                                    r
                                
                                
                                    i
                                
                                
                                    q
                                
                            
                        
                     to check against a threshold R, and performing a further determination (Kobayashi Figure 25, step S131) whether to further store the learned model with prediction performance P, and associated information including the hyper-parameter vector (step S132) or to continue with additional processing (step S114). Hence, in the context of learning step j=2, the learning control unit performing a determination whether the new variant of machine learning algorithm and its associated prediction performance and hyper-parameter is stored (i.e., selected) according to whether it has exceeded a learning execution time or completed within the learning execution time for each learning step j (e.g., learning step j=2) represents a determination of whether to select the new hyperparameter value for a modified, computationally less costly variant of the machine learning algorithm (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and  [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate                         
$r_{i}^{q}$ is less than the threshold R. If the improvement rate $r_{i}^{q}$ is less than the threshold R, the operation proceeds to step S130. If the improvement rate $r_{i}^{q}$
is equal to or more than the threshold R, the operation proceeds to step S131. … (S130) The learning control unit 135c updates j to j+1. Next, the operation returns to step S123. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector θ used to learn the model …”).);
Regarding amended Claim 2, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising: 
based on comparing the performance score of the new variant of the machine learning algorithm for the training data set with the performance score of the reference variant of the machine learning algorithm for the training data set, determining that the new variant of the machine learning algorithm meets the one or more accuracy criteria based on the performance score of the reference variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to the actions taken after the comparison step recited in Claim 1. As indicated earlier, Kobayashi teaches a comparison is performed between the selected current learned model prediction performance                         
$p_{i,j}^{q}$ (representing a “new variant of the machine learning algorithm”) and the achieved prediction performance P (i.e., the prediction performance from the learned model at learning step j=1, where learning step j=1 establishes a “reference variant of machine learning algorithm”), where this comparison to identify the best prediction performance between two values also represents a measurement of a difference between two prediction performances, and thus also represents an accuracy criterion. Kobayashi further teaches that the higher of the two prediction performance values is stored by updating P, where storing the current prediction performance $p_{i,j}^{q}$ and updating the achieved prediction performance P are taught in Kobayashi Figure 24, steps S120 and S121 (Kobayashi Figure 23, elements 135c, 138c; Figure 24, steps S120→S121; and [0266]-[0269]: “… (S120) The learning control unit 135c compares the prediction performance $p_{i,j}^{q}$ acquired in step S119 with the achieved prediction performance P (the maximum prediction performance achieved up until now) and determines whether the former is larger than the latter. If the prediction performance $p_{i,j}^{q}$ is larger than the achieved prediction performance P, the operation proceeds to step S121. … (S121) The learning control unit 135c updates the achieved prediction performance P to the prediction performance $p_{i,j}^{q}$ …”).);
determining to select the new hyper-parameter value for the modified variant of the machine learning algorithm based on the cost metric of the new variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, the limitation “determining … the modified, computationally less costly, variant of the machine learning algorithm based on the cost metric of the new variant of the machine learning algorithm” is interpreted as identifying the new variant of the machine learning algorithm as the modified variant based on satisfying the criteria of being computationally less costly. As indicated earlier, for each learning step j, Kobayashi teaches the learning control unit using the improvement rate                         
$r_{i}^{q}$
                     to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) whether to store the model, its prediction performance P, and associated information including the hyper-parameter vector (Kobayashi Figure 25, step S132) or to continue with additional processing (Kobayashi Figure 25, step E, which points to Figure 24, step S114). Hence, the learning control unit performing a determination to store the new variant of machine learning algorithm and its associated prediction performance and hyper-parameter given that it has completed its learning within the learning execution time                         
$T_{i,j}^{q}$
                    represents “… determining to select the new hyper-parameter value for the modified variant of machine learning algorithm based on the cost metric of the new variant of the machine learning algorithm” (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132, and [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate                         
$r_{i}^{q}$ is less than the threshold R. If the improvement rate $r_{i}^{q}$ is less than the threshold R, the operation proceeds to step S130. If the improvement rate $r_{i}^{q}$
                     is equal to or more than the threshold R, the operation proceeds to step S131. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. In addition, the learning control unit 135c stores the algorithm ID of the machine learning algorithm associated with the achieved prediction performance P and the sample size that corresponds to the step number associated with the achieved prediction performance P in the learning result storage unit 123. In addition, the learning control unit 135c stores the hyperparameter vector θ used to learn the model in the learning result storage unit 123.”).).  
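For illustration only, the branching quoted above from Kobayashi steps S129 through S132 can be sketched as follows. This is a minimal sketch and not Kobayashi's implementation; the function name `next_step`, its parameter names, and the returned step labels are hypothetical and are introduced here solely to make the control flow concrete.

```python
def next_step(improvement_rate, threshold_r, elapsed_time, time_limit):
    """Return the label of the step that follows S129 (hypothetical helper).

    The branches mirror the quoted passage only; all names are assumptions.
    """
    # S129: an improvement rate below the threshold R leads to S130,
    # which updates j to j+1 and returns to step S123.
    if improvement_rate < threshold_r:
        return "S130"
    # S131: otherwise, check whether the elapsed time has exceeded the limit.
    if elapsed_time > time_limit:
        # S132: store P, the model, and the hyperparameter vector.
        return "S132"
    # Time remains, so processing continues at step S114.
    return "S114"
```

Under these assumptions, a rate equal to the threshold R is routed to the time check of step S131, consistent with the quoted wording “equal to or more than the threshold R”.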
Regarding amended Claim 3,
Kobayashi teaches
(Currently Amended) The method of Claim 1, 
wherein determining the performance score for the new variant of the machine learning algorithm using the training data set further comprises performing cross-validation of the new variant of the machine learning algorithm on the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, to produce representative machine learning models based on the same machine learning algorithm (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]), where the generated representative machine learning model at learning step j=2 represents a “new variant of machine learning algorithm”. Kobayashi additionally teaches the step execution unit performs determination of the prediction performance values using K-fold cross-validation to learn and evaluate the model with the machine learning algorithm                         
$a_{i}$
                     and modified hyperparameter vector, where the execution of the K-fold cross validation method is performed at each learning step j, and hence this execution of the K-fold cross validation method at learning step j=2 represents a determination of the performance score for the new variant of the machine learning algorithm (Kobayashi Figure 23, element 138c and [0256]: “The step execution unit 138c executes learning steps one by one in the same way as in the fourth embodiment. …”; Figure 18, element 138, Figure 19, element S79 and [0221], [0225]: “(S79) The step execution unit 138 executes cross validation …”; and [0089]: “… every time a single sample size (a single learning step) is processed, a model is learned and the prediction performance thereof is evaluated. Examples of the validation method in each learning step include cross validation … In cross validation, the machine learning device 100 divides the sampled data into K blocks … The machine learning device 100 uses (K-1) blocks as the training data and 1 block as the test data. The machine learning device 100 repeatedly performs model learning and evaluating the prediction performance K times while changing the block used as the test data. As a result of a single learning step, for example, the machine learning device 100 outputs a model indicating the highest prediction performance among the K models and an average value of the K prediction performances.”).).  
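For illustration only, the K-fold cross-validation procedure quoted above from Kobayashi [0089] (divide the sampled data into K blocks, use K-1 blocks for training and 1 block for testing, repeat K times, and output the best model and the average prediction performance) can be sketched as follows. This is a minimal sketch under stated assumptions, not Kobayashi's implementation: the function names, the interleaved way of forming blocks, the majority-label "model", and the use of accuracy as the prediction performance are all hypothetical.

```python
from statistics import mean


def k_fold_cross_validation(samples, k, learn, evaluate):
    """Return (best_model, average_performance) over K train/test splits."""
    # Divide the sampled data into K blocks (interleaving is an assumption).
    blocks = [samples[i::k] for i in range(k)]
    results = []
    for test_index in range(k):
        test_data = blocks[test_index]
        # The remaining (K-1) blocks form the training data.
        training_data = [
            s for i, block in enumerate(blocks) if i != test_index for s in block
        ]
        model = learn(training_data)
        results.append((evaluate(model, test_data), model))
    # Output the model with the highest performance and the average of K scores.
    _, best_model = max(results, key=lambda r: r[0])
    return best_model, mean(perf for perf, _ in results)


def learn_majority(training_data):
    """Hypothetical learner: the 'model' is the most frequent label."""
    labels = [label for _, label in training_data]
    return max(set(labels), key=labels.count)


def accuracy(model, test_data):
    """Hypothetical prediction performance: fraction of labels matched."""
    return mean(1.0 if label == model else 0.0 for _, label in test_data)
```

A call such as `k_fold_cross_validation(data, 3, learn_majority, accuracy)` would run three learn-and-evaluate rounds, each holding out a different block as test data.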
Regarding amended Claim 4, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising 
determining the performance score for the reference variant by performing cross-validation of the reference variant on the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches the step execution unit performs determination of the prediction performance values using K-fold cross-validation to learn and evaluate the model with the machine learning algorithm                         
$a_{i}$
and modified hyperparameter vector, where the execution of the K-fold cross validation method is performed at each learning step j, and hence this execution of the K-fold cross validation method at learning step j=1 represents a determination of the performance score for the reference variant of the machine learning algorithm (Kobayashi Figure 23, element 138c and [0256]; Figure 18, element 138, Figure 19, element S79, and [0221], [0225]: “(S79) The step execution unit 138 executes cross validation instead of the above random sub-sampling validation. …”; and [0089]).).  
Regarding amended Claim 5, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising generating the reference variant by: 
selecting, for the machine learning algorithm, a distinct set of hyper-parameter values from a plurality of distinct sets of hyper-parameter values (Examiner’s note: Kobayashi teaches performing each learning step j based on methods such as grid search and random search, as well as a hyper-parameter adjustment unit that performs further selection of the set of hyperparameter vectors (also using grid search, random search, or other alternative methods). Kobayashi further teaches the search region determination unit divides a hyperparameter vector space into different regions, where each region is used to produce distinct groups of hyperparameter vectors (Kobayashi Figure 20 and [0232]-[0235]: “… the hyperparameter vector space 40 is divided into regions 41 to 44 … The regions 41 to 44 are examples obtained by dividing the hyperparameter vector space 40 when a machine learning algorithm
$a_{1}$ is executed by using training data having the sample size $s_{1}$. The region 41 corresponds to a hyperparameter vector set $\Delta\Phi_{1,1}^{1}$ … The region 44 corresponds to a hyperparameter vector set $\Delta\Phi_{1,1}^{4}$
                     …”; Figure 23, elements 139, 137c, 135c, and [0249]-[0250]: “The search region determination unit 139 determines a set of hyperparameter vectors (a search region) used in the next learning step in response to a request from the learning control unit 135c. The search region determination unit 139 determines                         
$\Phi_{i,j}^{q}$ as described above. Namely, among the hyperparameter vectors included in $\Phi_{i,j}^{q}$ the search region determination unit 139 adds the hyperparameter vectors used in the model learning completed to $\Phi_{i,j}^{q}$
                    . … when j=1 and q=1, the search region determination unit 139 selects hyperparameter vectors as many as possible from the hyperparameter vector space through random search, grid search, or the like and adds the selected hyperparameter vectors to                         
$\Phi_{1,1}^{1}$.”).);
performing cross-validation of the machine learning algorithm on one or more training data sets (Examiner’s note: As indicated earlier, Kobayashi teaches the step execution unit performs determination of the prediction performance values using K-fold cross-validation to learn and evaluate the model with the machine learning algorithm                         
$a_{i}$
and modified hyperparameter vector, where the execution of the K-fold cross validation method is performed at each learning step j, and hence this execution of the K-fold cross validation method at learning step j=1 represents a determination of the performance score for the reference variant of the machine learning algorithm (Kobayashi Figure 23, element 138c and [0256]; Figure 18, element 138, Figure 19, element S79, and [0221], [0225]: “(S79) The step execution unit 138 executes cross validation instead of the above random sub-sampling validation. …”; and [0089]).); 
based on performing cross-validation of the machine learning algorithm, determining whether to select the distinct set of hyper-parameter values for the reference variant (Examiner’s note: As indicated earlier, for each learning step j, Kobayashi teaches using the improvement rate                         
$r_{i}^{q}$
                     to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) whether to store the model, its prediction performance P (which was earlier determined as taught in Kobayashi [0267]-[0269]), and associated information including the hyper-parameter vector (step S132) or to continue with additional processing (step S114). Hence, the learning control unit performing a determination whether the new variant of machine learning algorithm and its associated prediction performance and hyper-parameter is stored (i.e., selected) according to whether it has exceeded a learning execution time or completed within the learning execution time (at learning step j=1) represents a determination of whether to select the distinct set of hyper-parameters values for the reference variant (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate                         
$r_{i}^{q}$ is less than the threshold R. If the improvement rate $r_{i}^{q}$ is less than the threshold R, the operation proceeds to step S130. If the improvement rate $r_{i}^{q}$
is equal to or more than the threshold R, the operation proceeds to step S131. … (S130) The learning control unit 135c updates j to j+1. Next, the operation returns to step S123. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector θ used to learn the model …”).).  
Regarding original Claim 6, 
Kobayashi teaches
(Original) The method of Claim 5, further comprising selecting the distinct set of hyper-parameter values from the plurality of distinct sets of hyper-parameter values based on one of: 
a Bayesian optimization, 
a random search (Examiner’s note: Kobayashi teaches the search region determination unit and hyperparameter adjustment unit generating a hyperparameter vector from a set of hyperparameters using a random search method (Kobayashi Figure 23, elements 139, 137c; Figure 18, element 137 and [0249]: “The search region determination unit 139 determines a set of hyperparameter vectors (a search region) used in the next learning step in response to a request from the learning control unit 135c.  … Namely, among the hyperparameter vectors included in                         
$\Phi_{i,j}^{q}$ the search region determination unit 139 adds the hyperparameter vectors used in the model learning completed to $\Phi_{i,j}^{q}$
                    . … However, when j=1 and q=1, the search region determination unit 139 selects hyperparameter vectors as many as possible from the hyperparameter vector space through random search, grid search, or the like and adds the selected hyperparameter vectors to                         
$\Phi_{1,1}^{1}$.”; and [0209]: “In response to a request from the step execution unit 138, the hyperparameter adjustment unit 137 generates a hyperparameter vector applied to a machine learning algorithm to be executed by the step execution unit 138. Grid search or random search may be used to generate the hyperparameter vector. …”).), 
a gradient-based search, 
a grid search (Examiner’s note: As indicated earlier, Kobayashi teaches the search region determination unit and hyperparameter adjustment unit generating a hyperparameter vector from a set of hyperparameters using a grid search method (Kobayashi Figure 23, elements 139, 137c; Figure 18, element 137; [0249] and [0209]).), or 
a Tree-structured Parzen Estimators (TPE) based selection.  
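For illustration only, the two selection methods mapped above (random search and grid search) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Kobayashi or from the application: the function names, the dictionary-based representation of a hyperparameter space, and the example hyperparameter names `depth` and `rate` are hypothetical.

```python
import itertools
import random


def grid_search(space):
    """Yield every combination of the candidate values (exhaustive grid)."""
    names = sorted(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_search(space, count, seed=0):
    """Draw `count` hyperparameter vectors uniformly from the candidates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    names = sorted(space)
    return [{name: rng.choice(space[name]) for name in names} for _ in range(count)]


# Hypothetical search space: two hyperparameters with 2 and 3 candidate values.
example_space = {"depth": [2, 4], "rate": [0.1, 0.01, 0.001]}
grid_vectors = list(grid_search(example_space))         # all 2 x 3 combinations
random_vectors = random_search(example_space, count=5)  # 5 sampled vectors
```

The grid variant enumerates every combination, whereas the random variant samples a fixed number of vectors, so its cost does not grow with the size of the grid.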
Regarding amended Claim 9,
Kobayashi teaches
(Currently Amended) The method of Claim 1, 
wherein the new hyper-parameter value is based on a previous hyper-parameter value of a previous (Examiner’s note: Kobayashi teaches the hyperparameter adjustment unit performing adjustment of a hyperparameter vector used in the last learning step performed by the step execution unit (where the last learning step corresponds to “wherein the new hyper-parameter value is based on a previous hyper-parameter value of a previous machine-learning algorithm …”), such that if this adjustment is executed for a learning step (e.g., j=2), the last learning step (j=1) performed by the step execution unit can be traced back to the reference variant of machine-learning algorithm
$a_{i}$
                     (corresponding to “… generated from the reference variant of machine learning algorithm”) (Kobayashi Figure 23, element 137c; Figure 18, element 137 and [0211]-[0212]: “The hyperparameter adjustment unit 137 may refer to a hyperparameter vector used in the last learning step of the same machine learning algorithm, to make the search for a preferable hyperparameter vector more efficient. For example, the hyperparameter adjustment unit 137 may perform the search by starting with a hyperparameter vector                         
$\theta_{j-i}$
                    , that achieved the best prediction performance in the last learning step. … assuming that the hyperparameter vectors that achieve the best prediction performance … are                         
                            
                                
                                    θ
                                
                                
                                    j
                                    -
                                    1
                                
                            
                        
                    and                         
                            
                                
                                    θ
                                
                                
                                    j
                                    -
                                    2
                                
                            
                        
                    , respectively, the hyperparameter adjustment unit may generate 2                        
                            
                                
                                    θ
                                
                                
                                    j
                                    -
                                    1
                                
                            
                            -
                            
                                
                                    θ
                                
                                
                                    j
                                    -
                                    2
                                
                            
                        
                     as the hyperparameter to be used next.”).).  
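For illustration only, the extrapolation quoted above from Kobayashi [0212] (generating 2θ_{j-1} - θ_{j-2} from the best hyperparameter vectors of the two preceding learning steps) can be sketched as follows. The function name and the representation of a hyperparameter vector as a list of floats are assumptions introduced here for readability; they do not appear in Kobayashi or in the claims.

```python
from typing import List, Sequence


def extrapolate_hyperparameters(
    theta_prev: Sequence[float], theta_prev2: Sequence[float]
) -> List[float]:
    """Return 2*theta_{j-1} - theta_{j-2}, computed element by element.

    theta_prev  : best vector of the last learning step (theta_{j-1})
    theta_prev2 : best vector of the step before that   (theta_{j-2})
    Both names are hypothetical and used only in this sketch.
    """
    if len(theta_prev) != len(theta_prev2):
        raise ValueError("hyperparameter vectors must have the same length")
    return [2 * a - b for a, b in zip(theta_prev, theta_prev2)]


# Hypothetical values: the new vector continues the trend between the two steps.
next_theta = extrapolate_hyperparameters([0.5, 8.0], [0.25, 6.0])
print(next_theta)  # [0.75, 10.0]
```

The sketch shows only that the next hyperparameter value is a function of previously used hyperparameter values; it does not model the search region or the step execution unit.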
Regarding amended Claim 10, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising: 
based on comparing the performance score of the new variant of the machine learning algorithm for the training data set with the performance score of the reference variant of the machine learning algorithm for the training data set, determining that the new variant of the machine learning algorithm fails to meet the one or more accuracy criteria based on the performance score of the reference variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to the actions taken after the comparison step recited in Claim 1. As indicated earlier, Kobayashi teaches that a comparison is performed between the selected current learned model prediction performance p_{i,j}^q (representing a “new variant of the machine learning algorithm”) and the achieved prediction performance P (i.e., the prediction performance from the learned model at learning step j=1, where learning step j=1 establishes a “reference variant of machine learning algorithm”), where this comparison to identify the best prediction performance between two values also represents a measurement of a difference between two prediction performances, and thus also represents an accuracy criterion. Kobayashi further teaches that the higher of the two prediction performance values is stored by updating P, where the scenario of not storing the current prediction performance p_{i,j}^q (i.e., the prediction performance of the new variant is not higher than the achieved prediction performance P) is taught in Kobayashi Figure 24, step S122, thus representing a determination that the new variant of the machine learning algorithm fails to meet the one or more accuracy criteria based on the performance score of the reference variant (Kobayashi [0266]-[0269]: “… (S120) The learning control unit 135c compares the prediction performance p_{i,j}^q acquired in step S119 with the achieved prediction performance P (the maximum prediction performance achieved up until now) and determines whether the former is larger than the latter. If the prediction performance p_{i,j}^q is larger than the achieved prediction performance P, the operation proceeds to step S121. Otherwise, the operation proceeds to step S122. … ”).); 
based on determining that the new variant of the machine learning algorithm fails to meet the one or more accuracy criteria, determining not to select the new hyper-parameter value for the modified variant of the machine learning algorithm (Examiner’s note: As indicated earlier, for each learning step j, Kobayashi teaches the learning control unit using the improvement rate r_i^q to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) whether to store the model, its prediction performance P, and associated information including the hyper-parameter vector (Kobayashi Figure 25, step S132) or to continue with additional processing (Kobayashi Figure 25, step E, which points to Figure 24, step S114). Based on the results of the preceding claim limitation where the prediction performance P was not updated with the prediction performance of the new variant (Kobayashi Figure 24, step S122 and [0266]-[0269]), it follows that the hyper-parameter vectors for the new variant are also not stored (selected) when executing the steps taught in Kobayashi Figure 25, steps S129-S132, thus representing a determination of not selecting the new hyper-parameter value for the modified variant of the machine learning algorithm (Kobayashi [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate r_i^q is less than the threshold R. If the improvement rate r_i^q is less than the threshold R, the operation proceeds to step S130. If the improvement rate r_i^q is equal to or more than the threshold R, the operation proceeds to step S131. … (S130) The learning control unit 135c updates j to j+1. Next, the operation returns to step S123. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector θ used to learn the model …”).); 
modifying the at least one hyper-parameter from the at least one original hyper-parameter value to a next hyper-parameter thereby generating a next machine learning algorithm from the reference variant of the machine learning algorithm with the at least one hyper-parameter having the next hyper-parameter value (Examiner’s note: Under its broadest reasonable interpretation, the limitation “modifying the at least one hyper-parameter … to a next hyper-parameter value thereby generating another machine learning algorithm …” is interpreted as generating additional variants of the machine learning algorithm containing a modified hyper-parameter value. Kobayashi teaches a condition where additional learning steps can be initiated if the current learning step is completed within the given learning execution time T_{i,j}^q, where these additional learning steps (e.g., j=3,4,…) also execute the search region determination unit and the hyperparameter adjustment unit to determine different sets of modified hyperparameter vectors using grid or random search methods (Kobayashi [0280]: “… (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit … If the elapsed time has exceeded the time limit … Otherwise, the operation returns to step S114 …”; Figure 24, steps S117, S118 and [0262]-[0266]; Kobayashi [0265]-[0267]: “(S114) The learning control unit 135c selects a virtual algorithm … (S117) The search region determination unit 139 determines a search region that corresponds to the virtual algorithm a_i^q (the machine learning algorithm a_i and the learning time level q) and the sample size s_j. Namely, the search region determination unit 139 determines the hyperparameter vector set Φ_{i,j}^q in accordance with the above method. … (S118) The step execution unit 138c executes the j-th learning step of the virtual algorithm a_i^q. Namely, the hyperparameter adjustment unit 137c selects a hyperparameter vector included in the search region determined in step S117 or a hyperparameter vector near the hyperparameter vector. The step execution unit 138c applies the selected hyperparameter vector to the machine learning algorithm a_i and learns a model by using training data having the sample size s_j. … The step execution unit 138c repeats the above processing for a plurality of hyperparameter vectors. The step execution unit 138c determines a model, the prediction performance p_{i,j}^q and the execution time T_{i,j}^q from the results of the learning not stopped. …”; and [0249]-[0250]).);
without performing cross-validation of the next machine learning algorithm on the training data set, determining that the next machine learning algorithm is less accurate than the new variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, the limitation “without performing cross-validation of the next machine learning algorithm …” broadly recites using any method other than cross validation to determine accuracy between different variants. As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, where each learning step (controlled by a learning control unit and step execution unit) produces representative machine learning models based on the same machine learning algorithm using a sample size s_j from a training data set (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]), and where the generated representative machine learning model produced by the step execution unit in learning step j=3 represents a “next variant of machine learning algorithm”. Kobayashi further teaches the step execution unit performing cross-validation or random sub-sampling methods to identify and learn a model based on the training data, the machine learning algorithm, and associated hyperparameter vector on each H iteration, and outputting the highest prediction performance (and its associated hyperparameter vector) out of H iterations to identify the representative machine learning model for the executed learning step. Kobayashi teaches that the choice between random sub-sampling and cross-validation is based on a determination of whether the sample size s_j is larger than 2/3 of the size of data set D, where the random sub-sampling method is selected for the scenario of having the sample size larger than 2/3 of the size of data set D (Kobayashi [0218]-[0219]; [0225]). As indicated earlier, Kobayashi teaches that each prediction performance value is a measure of accuracy and is determined through a measurement index (representing accuracy criteria), such that each of these prediction performance values generated through H iterations of random sub-sampling is a representation of an accuracy value determined through one or more accuracy criteria (Kobayashi [0078]-[0079]; [0166]; and [0222]). Hence, the highest prediction performance value identified in learning step j=3 represents a determination of a performance score representing an accuracy for a next variant of the machine learning algorithm (Kobayashi Figure 19 and [0216]-[0225]; and [0226]-[0227]), where this selected current prediction performance value p_{i,j}^q is further compared with the achieved prediction performance value P (storing the maximum prediction performance value up to this point), and where the scenario of not storing the current prediction performance p_{i,j}^q (i.e., the prediction performance of the new variant is not higher than the achieved prediction performance P) is taught in Kobayashi Figure 24, step S122, thus representing a determination that the next variant of the machine learning algorithm fails to meet the one or more accuracy criteria based on the performance score of the new variant (and hence being less accurate than the new variant) (Kobayashi [0266]-[0269]).).
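For illustration only, the two decisions relied upon in the Claim 10 mapping above can be sketched as follows: the choice of validation method based on whether the sample size s_j exceeds 2/3 of the size of data set D (as characterized above with reference to Kobayashi [0218]-[0219]), and the comparison of the current prediction performance p_{i,j}^q with the achieved prediction performance P (Kobayashi steps S120 to S122). All function names, return values, and numeric scores below are hypothetical and are introduced here only for readability.

```python
from typing import Tuple


def choose_validation_method(sample_size: int, dataset_size: int) -> str:
    """Pick the validation method from the sample size s_j and |D|.

    Integer arithmetic (3*s_j > 2*|D|) avoids floating-point rounding
    when testing s_j > (2/3)*|D|.
    """
    if 3 * sample_size > 2 * dataset_size:
        return "random_sub_sampling"
    return "cross_validation"


def update_achieved_performance(achieved: float, current: float) -> Tuple[float, bool]:
    """Compare the current score with the best score so far.

    Returns the (possibly updated) best score and a flag telling whether
    the current variant replaced it. A False flag corresponds to the
    "not stored" branch discussed in the note above.
    """
    if current > achieved:
        return current, True
    return achieved, False


# Hypothetical walk-through over learning steps j=2 and j=3.
P = 0.80                                          # score after step j=1
P, stored_j2 = update_achieved_performance(P, 0.85)
P, stored_j3 = update_achieved_performance(P, 0.83)
print(P, stored_j2, stored_j3)                    # 0.85 True False
print(choose_validation_method(800, 1000))        # random_sub_sampling
print(choose_validation_method(600, 1000))        # cross_validation
```

In the walk-through, the step j=3 score is not stored because it does not exceed the achieved value, which is the scenario the note maps to a variant determined to be less accurate.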
Regarding amended Claim 11, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, 
wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithm[[s]] for which a corresponding plurality of modified, less costly, variants of the reference variant of machine learning algorithm[[s]] are generated, the corresponding plurality of modified variants includes the new variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, “one of the plurality of reference variant of machine learning algorithms” broadly recites a process for producing a plurality of reference variants. Kobayashi teaches the step execution unit generating a plurality of variants during the cross-validation or random sub-sampling process involving H iterations (e.g., during learning step j=1 to produce a final reference variant, Kobayashi [0226]-[0227]), with the one of these variants having the highest prediction performance being selected as the variant for comparison in the next learning step j=2 (Kobayashi [0267]-[0269]), thus corresponding to “… the corresponding plurality of modified variants includes the new variant of the machine learning algorithm”. Hence, performing the steps described in the fifth embodiment (Kobayashi [0266]-[0269], [0278]-[0281] and the steps for the step execution unit described in Kobayashi [0218]-[0227]) for subsequent learning steps j=2,3,4,…, represents a process for generating a “ … corresponding plurality of modified, less costly, variants of the reference variant …”.), and the method further comprising: 
receiving a request to determine expected performances of the plurality of reference variants for a particular training data set (Examiner’s note: Kobayashi teaches the learning control unit providing the machine learning algorithm, sample size, and hyperparameter search region to the step execution unit, where the step execution unit performs the H iterations of K-fold cross-validation for a particular machine learning algorithm and sample size, with each H iteration during learning step j=1 producing a “plurality of reference variants” (Kobayashi Figure 23, elements 135c, 138c and [0254]: “The learning control unit 135c specifies the machine learning algorithm a_i, the sample size s_j, the search region (Φ_{i,j}^q) determined by the search region determination unit 139, and the stopping time ϕ_{i,j}^q to the step execution unit 138c.”; Figure 24, steps S118, S119, and [0266]-[0267]: “(S118) The step execution unit 138c executes the j-th learning step of the virtual algorithm a_i^q. Namely, the hyperparameter adjustment unit 137c selects a hyperparameter vector included in the search region determined in step S117 or a hyperparameter vector near the hyperparameter vector. The step execution unit 138c applies the selected hyperparameter vector to the machine learning algorithm a_i and learns a model by using training data having the sample size s_j. … (S119) The learning control unit 135c acquires the learned model, the prediction performance p_{i,j}^q thereof, the execution time T_{i,j}^q from the step execution unit 138c.”; [0256]: “The step execution unit 138c executes learning steps one by one in the same way as in the fourth embodiment. …”; and [0224]-[0227]: “…The step execution unit 138 calculates the average value of the K prediction performances p … as a prediction performance p_h that corresponds to the hyperparameter vector θ_h. … (S80) The step execution unit 138 compares the number of times of the repetition of the above steps S71 to S79 with a threshold H and determines whether the former is less than the latter.”).); 
for each reference variant of the plurality of reference variants of the machine learning algorithm[[s]], selecting a respective modified variant (Examiner’s note: Under its broadest reasonable interpretation, “the respective modified variant” is interpreted as an instance of a reference variant being produced and identified during an H iteration during learning step j=1 (with each H iteration of a reference variant corresponding to a selected “respective modified variant”, with each H iteration corresponding to “for each of the plurality of reference variant of machine learning algorithms…”, and the selection of an output model as “…selecting a respective modified variant”) (Kobayashi [0226]: “(S80) The step execution unit 138 compares the number of times of the repetition of the above steps S71 to S79 with a threshold H and determines whether the former is less than the latter. If the number of times of the repetition is less than the threshold H, the operation returns to step S71. If the number of times of the repetition reaches the threshold H, the operation proceeds to step S81. Note that h=1,2,…,H. H is a predetermined number, e.g., 30.”).); 
performing cross-validation of the respective modified variant thereby generating a corresponding particular performance score for said each reference variant (Examiner’s note: Under its broadest reasonable interpretation, “the respective modified variant” is interpreted as an instance of a reference variant being produced and identified during an H iteration during learning step j=1. The step execution unit performs cross-validation for H iterations, and selects the iteration with the best prediction performance as the model representing this learning step j=1, with the prediction performance being generated for each modified variant (represented by a H iteration) via K-fold cross-validation (thus representing a process for “performing cross-validation of the respective modified variant thereby generating a particular performance score for said reference variant”) (Kobayashi Figure 19, steps S72, S79, S80; [0224]-[0227]: “…The step execution unit 138 calculates the average value of the K prediction performances p … as a prediction performance $p_h$ that corresponds to the hyperparameter vector $\theta_h$. … (S80) The step execution unit 138 compares the number of times of the repetition of the above steps S71 to S79 with a threshold H and determines whether the former is less than the latter.”; Figure 18, element 138, Figure 19, step S79, and [0221]: “(S79) The step execution unit 138 executes cross validation …”; and [0089]: “… every time a single sample size (a single learning step) is processed, a model is learned and the prediction performance thereof is evaluated. Examples of the validation method in each learning step include cross validation … In cross validation, the machine learning device 100 divides the sampled data into K blocks … The machine learning device 100 uses (K-1) blocks as the training data and 1 block as the test data. The machine learning device 100 repeatedly performs model learning and evaluating the prediction performance K times while changing the block used as the test data. As a result of a single learning step, for example, the machine learning device 100 outputs a model indicating the highest prediction performance among the K models and an average value of the K prediction performances.”).); 
based on the corresponding particular performance score, determining whether said each reference variant of the plurality of reference variants, when trained by the particular training data set, yields most accuracy (Examiner’s note: Kobayashi teaches during learning step j=1, the step execution unit determines a set of prediction performance scores for a machine learning algorithm $a_i$ based on H iterations of performing K-fold cross-validation (each iteration corresponding to “the plurality of reference variants when trained by the particular training data set”), and selects the iteration with the highest prediction performance as the model representing this learning step j=1, where the prediction performance is a measure of accuracy (Kobayashi [0078]-[0079], thus corresponding to “based at least in part on the particular performance score, determining which of the plurality of reference variants … yields most accuracy”) (Kobayashi [0224]-[0227]: “…The step execution unit 138 calculates the average value of the K prediction performances p … as a prediction performance $p_h$ that corresponds to the hyperparameter vector $\theta_h$. … (S80) The step execution unit 138 compares the number of times of the repetition of the above steps S71 to S79 with a threshold H and determines whether the former is less than the latter. … (S81) The step execution unit 138 outputs the highest prediction performance among the prediction performances $p_1$, $p_2$, …, $p_H$ as the prediction performance $p_{i,j}$. In addition, the step execution unit 138 outputs a model that corresponds to the prediction performance $p_{i,j}$ among the models $m_1$, $m_2$, …, $m_H$.”).).  
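By way of illustration only, and not as part of the Kobayashi disclosure or the claims of record, the procedure mapped above (K-fold cross-validation repeated for H hyperparameter vectors, followed by selection of the vector with the highest average prediction performance) may be sketched as follows. All function names, the contiguous block split, and the toy scoring function are hypothetical assumptions introduced solely for this sketch.

```python
from statistics import mean
from typing import Any, Callable, Sequence


def k_fold_blocks(n: int, k: int) -> list[list[int]]:
    """Split indices 0..n-1 into k contiguous blocks of near-equal size."""
    base, extra = divmod(n, k)
    blocks, start = [], 0
    for b in range(k):
        size = base + (1 if b < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks


def cross_validate(train_and_score: Callable, data: Sequence[Any],
                   theta: Any, k: int) -> float:
    """Average the K prediction performances obtained for one vector theta."""
    scores = []
    for test_block in k_fold_blocks(len(data), k):
        held_out = set(test_block)
        # (K-1) blocks serve as training data, 1 block as test data.
        train = [data[i] for i in range(len(data)) if i not in held_out]
        test = [data[i] for i in test_block]
        scores.append(train_and_score(train, test, theta))
    return mean(scores)


def select_best(train_and_score: Callable, data: Sequence[Any],
                thetas: Sequence[Any], k: int) -> tuple[Any, float]:
    """Evaluate each of the H candidate vectors and keep the best scorer."""
    results = [(cross_validate(train_and_score, data, t, k), t) for t in thetas]
    best_score, best_theta = max(results, key=lambda r: r[0])
    return best_theta, best_score


def toy_train_and_score(train, test, theta) -> float:
    """Hypothetical stand-in: the 'model' is y = theta * x, and the score is
    the negative mean squared error on the test block (higher is better).
    The training block is ignored because this toy model has nothing to fit."""
    return -mean([(y - theta * x) ** 2 for x, y in test])
```

For instance, calling `select_best(toy_train_and_score, [(1, 2), (2, 4), (3, 6), (4, 8)], [1.0, 2.0, 3.0], k=2)` picks the candidate 2.0, since it reproduces every held-out point exactly.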
Regarding amended Claim 12, 
Claim 12 recites one or more non-transitory computer-readable media storing a sequence of instructions, which when executed by one or more hardware processors cause operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 1, and hence is rejected under similar rationale provided by Kobayashi as indicated in Claim 1. In addition, Kobayashi teaches the machine learning device containing one or more programs implementing the information processing of the machine learning device, where the one or more programs are recorded in a computer-readable recording medium (Kobayashi [0283]-[0284]: “…The information processing according to the fifth embodiment may be realized by causing the machine learning device 100c to execute a program. … An individual program may be recorded in a computer-readable recording medium (for example, the recording medium 113).”; and [0065]-[0066]: “The machine learning device 100 includes a CPU 101, a RAM 102, an HDD 103 … The CPU 101 is a processor which includes an arithmetic circuit that executes program instructions. The CPU 101 loads at least a part of programs or data held in the HDD 103 to the RAM 102 and executes the program. The CPU 101 may include a plurality of processor cores…”).
Regarding amended Claim 13, 
Claim 13 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 2, and hence is rejected under similar rationale provided by Kobayashi as indicated in amended Claim 2, in view of the rejections applied to Claim 12.
Regarding amended Claim 14, 
Claim 14 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 3, and hence is rejected under similar rationale provided by Kobayashi as indicated in amended Claim 3, in view of the rejections applied to Claim 12.
Regarding amended Claim 15, 
Claim 15 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 4, and hence is rejected under similar rationale provided by Kobayashi as indicated in amended Claim 4, in view of the rejections applied to Claim 12.
Regarding amended Claim 16, 
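Claim 16 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 5, and hence is rejected under similar rationale provided by Kobayashi as indicated in amended Claim 5, in view of the rejections applied to Claim 12.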
Regarding original Claim 17, 
Claim 17 recites the one or more non-transitory computer-readable media of Claim 16, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in original Claim 6, and hence is rejected under similar rationale provided by Kobayashi as indicated in original Claim 6, in view of the rejections applied to Claim 16.
Regarding amended Claim 20, 
Claim 20 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 9, and hence is rejected under similar rationale provided by Kobayashi as indicated in amended Claim 9, in view of the rejections applied to Claim 12.
Regarding amended Claim 21, 
Claim 21 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 10, and hence is rejected under similar rationale provided by Kobayashi as indicated in amended Claim 10, in view of the rejections applied to Claim 12.
Regarding amended Claim 22, 
Claim 22 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 11, and hence is rejected under similar rationale provided by Kobayashi as indicated in amended Claim 11, in view of the rejections applied to Claim 12.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 7-8 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Kobayashi et al., U.S. PGPUB 2017/0061329, published 3/2/2017 [hereafter referred to as Kobayashi] in view of Nguyen et al., U.S. PGPUB 2019/0042887, filed 8/6/2018 [hereafter referred to as Nguyen].
Regarding amended Claim 7, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising: 
determining a cost metric of the reference variant by measuring usage of computing resources for training of the reference variant on the training data set (Examiner’s note: As indicated earlier, $T_{i,j}^{q}$ (corresponding to a “cost metric”) is determined from a time estimation unit to determine whether a machine learning model is taking more computational resources to learn based on the hyperparameter vector being learned during the learning step, where it is interpreted that a machine learning model that completes training within a learning execution time and with a high predictive performance is considered less computationally costly than one that takes more learning execution time to train (Kobayashi [0231]). The determination based on this cost metric is performed by calculating an improvement rate $r_i^{q}$, which is based on the learning execution time and a performance improvement amount, where a performance improvement amount based on execution time represents a measurement based on usage of computing resources. Hence, performing a determination of this cost metric at learning step j=1 (corresponding to the “reference variant”) represents performing a determination of this cost metric of the reference variant (Kobayashi Figure 23, elements 133c, 134, 135c; Figure 25, element S128 and [0277]).) …
However, Kobayashi does not explicitly teach
… comparing the cost metric of the new variant of the machine learning algorithm with the cost metric of the reference variant; 
based on comparing the cost metric of the new variant of the machine learning algorithm with the cost metric of the reference variant, determining that the cost metric of the new variant of the machine learning algorithm is lower than the cost metric for the reference variant; 
based on determining that the cost metric of the new variant of the machine learning algorithm is lower than the cost metric for the reference variant, qualifying the new hyper-parameter value for the modified variant of the machine learning algorithm.  
Nguyen teaches
… comparing the cost metric of the new variant of the machine learning algorithm with the cost metric of the reference variant (Examiner’s note: Nguyen teaches a training management module performing comparison of metrics received from the training of multiple models, where the metrics are used to estimate the effectiveness of the trained model (which can be based on resource usage costs, thus corresponding to “cost metric”) for using the trained model (thus corresponding to “comparing the cost metric of the new variant … with the cost metric of the reference variant”) (Nguyen Figure 5, elements 510, 512, 514, 518, 520, 522, 525; Figure 1, elements 120, 160; [0089]-[0093]: “…At 510, a plurality of training sets are obtained … At step 512, initial hyper parameter sets are determined. … At 514, training systems are invoked to train models based on the training sets and hyper parameter sets. … At 518, training of a plurality of predictive models is initiated … At step 520, an estimate of the effectiveness of each trained model is determined. If a threshold estimated effectiveness is not reached by any of the trained models, a new hyper parameter set is determined (step 522). … At step 525, a hyper-parameter set of the plurality of sets of hyper-parameters is selected, based on a measure of estimated effectiveness of the trained predictive models. At 530, a production predictive model is generated by training a predictive model using the selected candidate hyper-parameter set … ”; and [0075]-[0076]: “… techniques other than, or in addition to, cross-validation can be used to estimate the effectiveness. In one example, the resource usage costs for using the trained model can be estimated and can be used as a factor to estimate the effectiveness of the trained model. 
… Training management module 120 can compare the metrics received from each training system 160 to determine if a model should be selected or if an additional round of model training should occur and new hyper parameters generated (212).”).); 
based on comparing the cost metric of the new variant of the machine learning algorithm with the cost metric of the reference variant, determining that the cost metric of the new variant of the machine learning algorithm is lower than the cost metric for the reference variant (Examiner’s note: As indicated earlier, Nguyen teaches a training management module performing comparison of metrics received from the training of multiple models, where the metrics are used to estimate the effectiveness of the trained model based on resource usage costs for using the trained model. Nguyen further teaches an estimate of the effectiveness of each model is determined based on a threshold estimated effectiveness, where the comparison between resource usage costs reaches a threshold level of effectiveness or when the change in effectiveness between two rounds drops below a predefined threshold (indicating that the cost metric from one of the trained models must be lower than the cost metric for another trained model) (Nguyen Figure 5, elements 510, 512, 514, 518, 520, 522, 525; Figure 1, elements 120, 160; [0089]-[0093]; [0075]-[0076]; and [0078]: “… The predictive model generated by each training system 160 or effectiveness metric of the predictive model generated by each training system 160 … can be evaluated as discussed above. Rounds of model training can be repeated using new hyper parameters until a model reaches a threshold level of effectiveness or other condition is met. In some embodiments, training rounds can be repeated until the change in effectiveness between two rounds drops below a pre-defined threshold. In any event, whether preformed in multiple rounds or a single round the most performant hyper parameter set may be selected …”).); 
based on determining that the cost metric of the new variant of the machine learning algorithm is lower than the cost metric for the reference variant, qualifying the new hyper-parameter value for the modified variant of the machine learning algorithm (Examiner’s note: As indicated earlier, Nguyen teaches a training management module performing comparison of metrics received from the training of multiple models, where the metrics are used to estimate the effectiveness of the trained model based on resource usage costs for using the trained model. Nguyen further teaches an estimate of the effectiveness of each model is determined based on a threshold estimated effectiveness, where the result of the comparison between resource usage costs is used to select the hyperparameter when an improved effectiveness is determined (Figure 5, elements 510, 512, 514, 518, 520, 522, 525; Figure 1, elements 120, 160; [0089]-[0093]; [0075]-[0076]; and [0078]: “… The predictive model generated by each training system 160 or effectiveness metric of the predictive model generated by each training system 160 … can be evaluated as discussed above. Rounds of model training can be repeated using new hyper parameters until a model reaches a threshold level of effectiveness or other condition is met. In some embodiments, training rounds can be repeated until the change in effectiveness between two rounds drops below a pre-defined threshold. In any event, whether preformed in multiple rounds or a single round the most performant hyper parameter set may be selected …”).).  
Both Kobayashi and Nguyen are analogous art since they both teach machine learning systems that train machine learning algorithms with sets of hyper-parameters.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the learning control unit as taught in Kobayashi and enhance it by directly comparing the cost metrics between two trained machine learning algorithms as taught in Nguyen as a (Nguyen [0025]: “ …the models may be trained with a specific set of hyper-parameters in a distributed way, and then performance of that specific set may be determined, and then the specific set may be adjusted, models may be re-trained using the adjusted set, and so on, until a stopping criterion is met, that may be based on amounts of improvements to the iteratively trained models (e.g., using a convergence criterion), in terms of performance of each iteration of trained models. In this way, the system may determine an optimal set of hyper-parameters that may yield the best predictions.”).
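For illustration only, and without attributing any particular implementation to Kobayashi or Nguyen, the comparison discussed above (measuring a resource-usage cost for a reference variant and a new variant, then qualifying the new hyper-parameter value when its cost is lower) may be sketched as follows. The use of wall-clock training time as the cost metric, the `max_score_drop` tolerance, and all names are hypothetical assumptions made for this sketch.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class VariantResult:
    """Outcome of training one variant of a machine learning algorithm."""
    hyper_params: dict
    score: float          # performance score (higher is better)
    cost_seconds: float   # assumed cost metric: wall-clock training time


def measure_training_cost(train: Callable[[dict], float],
                          hyper_params: dict) -> VariantResult:
    """Run a (hypothetical) training callable and time it as the cost metric."""
    start = time.perf_counter()
    score = train(hyper_params)
    elapsed = time.perf_counter() - start
    return VariantResult(hyper_params, score, elapsed)


def qualifies(new: VariantResult, reference: VariantResult,
              max_score_drop: float = 0.0) -> bool:
    """Qualify the new hyper-parameter value only if the new variant is
    cheaper than the reference and its score has not dropped by more than
    the assumed tolerance."""
    cheaper = new.cost_seconds < reference.cost_seconds
    accurate_enough = new.score >= reference.score - max_score_drop
    return cheaper and accurate_enough
```

Separating the measurement from the decision keeps the comparison deterministic: `qualifies` can be exercised with recorded costs, while `measure_training_cost` is only needed when the costs are being collected.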
Regarding amended Claim 8, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising: 
based on the cost metric of the new variant of the machine learning algorithm and comparing the performance score of the new variant of the machine learning algorithm for the training data set with the performance score of the reference variant of the machine learning algorithm for the training data set, qualifying the new hyper-parameter value for the modified variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, the limitation “qualifying the new hyper-parameter value for the modified variant of the machine learning algorithm” is interpreted as identifying the new variant of the machine learning algorithm as being the modified variant that satisfies the criteria of being computationally less costly, and then storing the associated model (and its hyperparameter value). As indicated earlier, Kobayashi teaches using the improvement rate to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) whether to store the model, its prediction performance P (which was earlier determined as taught in Kobayashi [0267]-[0269]), and associated information including the hyper-parameter vector (step S132) or to continue with additional processing (step S114). Hence, the learning control unit performing a determination to store (qualify) the new variant of machine learning algorithm and its associated prediction performance and hyper-parameter (e.g., at learning step j=2) represents the qualifying of the new hyperparameter value for a modified variant of the machine learning algorithm (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and [0278]-[0281]: “… (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector 𝛉 used to learn the model …”).); 
modifying the at least one hyper-parameter from the at least one original hyper-parameter value to another hyper-parameter value thereby generating another machine learning algorithm with the at least one hyper-parameter value having the other hyper-parameter value (Examiner’s note: Under its broadest reasonable interpretation, the limitation “modifying the at least one hyper-parameter … to another hyper-parameter value thereby generating another machine learning algorithm” is interpreted as generating another variant of the machine learning algorithm containing a modified hyper-parameter value. Kobayashi teaches a condition where additional learning steps can be initiated if the current learning step is completed within the given learning execution time                         
                            
                                
                                    T
                                
                                
                                    i
                                    ,
                                    j
                                
                                
                                    q
                                
                            
                             
                        
                    , where these additional learning steps (e.g., j=3,4,…) also execute the search region determination unit and the hyperparameter adjustment unit to determine different sets of modified hyperparameter vectors using grid or random search methods (Kobayashi [0280]: “… S131 The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit … If the elapsed time has exceeded the time limit … Otherwise, the operations returns to step S114 …”; Figure 24, steps S117, S118 and [0262]-[0266]: (Kobayashi [0265]-[0267]: (S114) The learning control unit 135c selects a virtual algorithm … (S117) The search region determination unit 139 determines a search region that corresponds to the virtual algorithm                         
                            
                                
                                    a
                                
                                
                                    i
                                
                                
                                    q
                                
                            
                        
                     (the machine learning algorithm                         
                            
                                
                                    a
                                
                                
                                    i
                                
                            
                        
and the learning time level q) and the sample size $s_j$. Namely, the search region determination unit 139 determines the hyperparameter vector set $\Phi_{i,j}^{q}$ in accordance with the above method. … (S118) The step execution unit 138c executes the j-th learning step of the virtual algorithm $a_i^{q}$. Namely, the hyperparameter adjustment unit 137c selects a hyperparameter vector included in the search region determined in step S117 or a hyperparameter vector near the hyperparameter vector. The step execution unit 138c applies the selected hyperparameter vector to the machine learning algorithm $a_i$ and learns a model by using training data having the sample size $s_j$. … The step execution unit 138c repeats the above processing for a plurality of hyperparameter vectors. The step execution unit 138c determines a model, the prediction performance $p_{i,j}^{q}$ and the execution time $T_{i,j}^{q}$ from the results of the learning not stopped. …”; and [0249]-[250]).);
comparing a performance score of the other machine learning algorithm for the training data set with the performance score of the reference variant of the machine learning algorithm for the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, where each learning step performed by a learning control unit and step execution unit produces representative machine learning models based on the same machine learning algorithm using a sample size $s_j$ from a training data set (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19, [0216]-[0225]), and where the generated representative machine learning model produced by the step execution unit in learning step j=3 represents “the other machine learning algorithm”. Kobayashi teaches at each learning step j, the learning control unit acquires the learned model from the output of the step execution unit, and performs a comparison between the current learned model’s prediction performance with the achieved performance P, where P represents the achieved prediction performance up to now (which is interpreted as being the highest prediction performance from earlier learning steps j=1 and j=2), and as such, this comparison represents a comparison with a “reference variant of machine learning algorithm” (Kobayashi [0267]-[0268]: “(S119) The learning control unit 135c acquires the learned model, the prediction performance $p_{i,j}^{q}$ thereof, the execution time $T_{i,j}^{q}$ from the step execution unit 138c. … (S120) The learning control unit 135c compares the prediction performance $p_{i,j}^{q}$ acquired in step S119 with the achieved prediction performance P (the maximum prediction performance achieved up until now) and determines whether the former is larger than the latter. …”).);
based on a cost metric of the other algorithm and comparing the performance score of the other machine learning algorithm with the performance score of the reference variant of the machine learning algorithm, qualifying the other hyper-parameter value for the modified variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, the limitation “qualifying the other hyper-parameter value for the modified variant of the machine learning algorithm” is interpreted as identifying the other variant of the machine learning algorithm (at learning step j=3) as being the modified variant that satisfies the criteria of being computationally less costly, and then storing the associated model (and its hyperparameter value). This claim limitation is functionally equivalent to the first claim limitation in this claim, except it is referencing a different learning step (j=3, instead of j=2) in which to create a different “other” variant of the machine learning algorithm, and hence it is also rejected under similar rationale and claim mappings identified in the earlier claim limitation (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and [0278]-[0281]).); … 
However, Kobayashi does not explicitly teach
… based on the cost metric of the other algorithm and the cost metric of the new variant of the machine learning algorithm, determining whether to select the new hyper-parameter value or the other hyper-parameter value for the modified variant of the machine learning algorithm.  
Nguyen teaches
… based on the cost metric of the other algorithm and the cost metric of the new variant of the machine learning algorithm, determining whether to select the new hyper-parameter value or the other hyper-parameter value for the modified variant of the machine learning algorithm (Examiner’s note: Nguyen teaches a training management module performing comparison of metrics received from the training of multiple models, where the metrics are used to estimate the effectiveness of the trained model (which can be based on resource usage costs, thus corresponding to “cost metric”) for using the trained model, and for selecting the hyperparameter set between one model and another based on the improved effectiveness (thus corresponding to “comparing the cost metric of the other algorithm … and the cost metric of the new variant, determining whether to select the new hyper-parameter value or the other hyper-parameter value …”) (Nguyen Figure 5, elements 510, 512, 514, 518, 520, 522, 525; Figure 1, elements 120, 160; [0089]-[0093]: “…At 510, a plurality of training sets are obtained … At step 512, initial hyper parameter sets are determined. … At 514, training systems are invoked to train models based on the training sets and hyper parameter sets. … At 518, training of a plurality of predictive models is initiated … At step 520, an estimate of the effectiveness of each trained model is determined. If a threshold estimated effectiveness is not reached by any of the trained models, a new hyper parameter set is determined (step 522). … At step 525, a hyper-parameter set of the plurality of sets of hyper-parameters is selected, based on a measure of estimated effectiveness of the trained predictive models. At 530, a production predictive model is generated by training a predictive model using the selected candidate hyper-parameter set … ”; and [0075]-[0076]: “… techniques other than, or in addition to, cross-validation can be used to estimate the effectiveness. 
In one example, the resource usage costs for using the trained model can be estimated and can be used as a factor to estimate the effectiveness of the trained model. … Training management module 120 can compare the metrics received from each training system 160 to determine if a model should be selected or if an additional round of model training should occur and new hyper parameters generated (212).”).).  
Both Kobayashi and Nguyen are analogous art since they both teach machine learning systems that train machine learning algorithms with sets of hyper-parameters.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the learning control unit as taught in Kobayashi and enhance it by directly comparing the cost metrics between two trained machine learning algorithms as taught in Nguyen as a way to select a trained machine learning algorithm using an associated set of hyperparameters that exhibits a higher prediction performance. The motivation to combine is taught in Nguyen, as provided in the prior art claim mapping of Claim 7 recited above.
Regarding amended Claim 18, 
Claim 18 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 7, and hence Claim 18 is rejected under similar rationale and motivations provided by Kobayashi and Nguyen as indicated in amended Claim 7, in view of the rejections applied to Claim 12.
Regarding amended Claim 19, 
Claim 19 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in amended Claim 8, and hence Claim 19 is rejected under similar rationale and motivations provided by Kobayashi and Nguyen as indicated in amended Claim 8, in view of the rejections applied to Claim 12.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332. The examiner can normally be reached Monday-Friday 8:00am - 4:30pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/WILLIAM WAI YIN KWAN/Examiner, Art Unit 2121


/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121