DETAILED ACTION
The applicant’s request for continued examination regarding application number 16/166,039, filed October 19, 2018 has been entered.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on June 20, 2022 (including documents filed on May 18, 2022) has been entered.

Response to Amendments
The amendment filed June 20, 2022 (including documents filed on May 18, 2022) has been entered. Examiner acknowledges receipt of Amendments to Application 16/166,039, which include: Amendments to the Claims (filed on June 20, 2022), Amendments to the Drawings (filed on May 18, 2022), and Remarks containing Applicant’s amendments (filed on June 20, 2022). 
Regarding Applicant’s Remarks and Amendments to the Claims, Examiner acknowledges Claims 1-4, 7-13, 15, 18-19, and 21-22 have been amended. Claims 1-22 remain pending in the application. 
Regarding Applicant’s Remarks and Amendments to the Claims, Examiner acknowledges the claim objection identified in Claim 9 has been resolved, and therefore the identified objection for Claim 9 previously set forth in the Final Office Action mailed March 18, 2022 is withdrawn.
Regarding Applicant’s Remarks and Amendments to the Drawings, Examiner acknowledges Applicant has provided replacement sheets filed on May 18, 2022 that contain sharper versions of Figures 3A and 3B (where the text denoting the x- and y- plots and associated captions/legends are now readable). These updated figures have already been identified as acceptable replacement sheets in the Advisory Action mailed May 27, 2022, and therefore the respective drawing objections for Figures 3A and 3B previously set forth in the Final Office Action mailed March 18, 2022 are withdrawn.

Response to Arguments
Examiner acknowledges receipt of Arguments to Application 16/166,039, which include: Remarks containing Applicant’s arguments (filed on June 20, 2022). 
Regarding Applicant’s Remarks and Amendments to the Claims, Examiner notes that Applicant has not resolved the identified claim objections for Claims 11 and 22. Applicant’s argument for not correcting this objection is based on indicating that the term “machine learning algorithms” specifies “a plural term introduced in respective Claims 11 and 22 along with its respective antecedent references”. Examiner has considered Applicant’s arguments for those claim objections, and finds the argument to be not persuasive. Examiner points out that both Claims 11 and 22 are dependent claims based on their respective independent Claims 1 and 12, which only recite a reference variant of a single machine learning algorithm throughout each claim limitation (e.g., “selecting a reference variant of a machine learning algorithm, the reference variant of the machine learning algorithm …”). Claims 11 and 22 have been amended to recite the following limitations: “wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithms for which a corresponding plurality of mini-ML, less costly, variants of the reference variant of machine learning algorithms are generated …”; and “for each reference variant of the plurality of reference variants of the machine learning algorithms, selecting a respective mini-ML variant …”. However, the existing terms “reference variants of machine learning algorithms” and “variants of the reference variant of machine learning algorithms” were left unchanged, and still exhibit a lack of antecedent basis with respect to their respective parent independent Claims 1 and 12, since neither the respective dependent Claims nor their respective parent independent Claims contain proper antecedents within their limitations that recite “a plurality of machine learning algorithms”. 
Examiner also points out that Applicant’s specification does not indicate that reference variants or other variants are based on a plurality of machine learning algorithms. For example, Applicant’s specification paragraphs [0043] and [0050] only describe reference variants and mini-ML variants as being variants of a single machine learning algorithm ([0043]: “… This variant of the SVM machine learning algorithm is then selected as a reference variant to generate a mini-ML variant”; and [0050]: “… a mini-ML variant is generated based on a reference variant … the mini-ML variant is a variant of the algorithm …”). Hence, as indicated in the Final Office Action mailed March 18, 2022, Examiner recommended that Applicant change the term “machine learning algorithms” to a singular form to correspond with the established antecedent basis of a single machine learning algorithm found in their respective parent independent Claims (i.e., “reference variants of machine learning algorithm[[s]]” and “variants of the reference variant of machine learning algorithm[[s]]”). Examiner notes that Applicant’s latest arguments only assert that these terms in the claims support the plural term “a plurality of machine learning algorithms”, but remain silent as to where the support for the plural term “a plurality of machine learning algorithms” is provided in the Applicant’s disclosure.
Examiner further points out Applicant’s specification only discusses “a plurality of machine learning algorithms” in the context of generally describing different types of machine learning algorithms in the “Background” and “Machine Learning Algorithms and Domains” sub-sections (paragraphs [0002]-[0007] and paragraphs [0027]-[0029] respectively), where these sub-sections only serve as general background and supplementary information, and the contents of these sub-sections are not further mentioned nor integrated with the description of “reference variant” or other variants that are described in the claimed invention (starting from Applicant’s specification paragraph [0039]). Hence, Applicant’s disclosure lacks any written description that would enable a person having ordinary skill in the art to understand that the plurality of reference variants (or in general, a plurality of variants) recited in the claimed invention are based on a plurality of machine learning algorithms. Given that these recited limitations found in dependent Claims 11 and 22 do not have proper antecedent basis with respect to their parent independent Claims, and Applicant has not provided evidence in their disclosure to support the term “reference variants of a plurality of machine learning algorithms”, the earlier interpretation that refers to a single machine learning algorithm is maintained, and these earlier claim objections will be marked as new 112(b) lack of antecedent and 112(a) lack of written description rejections in this round of prosecution. 
Regarding Applicant’s Remarks for Claims 1-7, 9-17, and 20-22 under 35 U.S.C. §102(a)(1) as being anticipated by Kobayashi et al., U.S. PGPUB 2017/0061329, published 3/2/2017 [hereafter referred to as Kobayashi], Examiner acknowledges Applicant’s arguments, has considered them, and finds them to be not persuasive. Hence the existing 35 U.S.C. §102(a)(1) rejections are maintained, and the updated claim mappings according to Applicant’s amended claims are provided in the sections indicated below.
Regarding Applicant’s Remarks:
“Claim 1, as amended, features determining whether to use, for new data sets, a new variant of a machine learning algorithm as, computationally less costly, a mini-ML variant to determine the performance of a reference variant on the new data sets, as highlighted above in amended Claim 1.
The Advisory Action, on page 3, provides an example of "mini-ML variant" as featured in amended Claim 1.
However in addition to this feature, the amended Claim 1 further features that a determination is made whether to use such a mini-ML variant for new data sets to determine the performance of the reference variant on the new data sets.
Kobayashi fails to describe this feature of Claim 1. Instead, as discussed in the interview, Kobayashi determines whether to select a new algorithm (derived from the original) by training and producing metrics of the new learning algorithm. (Kobayashi, FIG. 23-25, par. [0204]-[0281]).
Nothing in Kobayashi describes using one algorithm to determine the performance of another algorithm.
Therefore, Kobayashi fails to describe "based at least in part on the plurality of differences in performance scores meeting one or more accuracy criteria, determining whether to use, for new data sets, said new variant of the machine learning algorithm as, computationally less costly, a mini-ML variant to determine performance of the reference variant on the new data sets," as recited in amended Claim 1.”
Examiner has considered this argument, and finds the argument to be not persuasive. Examiner points out that the majority of Applicant’s above arguments is directed to the newly amended limitations that were not previously entered. However, Examiner notes that Applicant’s above arguments contain several assertions, each of which will be addressed in the following paragraphs.
Regarding Applicant’s usage of the term “mini-ML variant” in place of the term “modified variant”, Examiner finds this change does not affect the existing scope of the claimed invention, and hence is not persuasive. Examiner points out that the definition of the term “mini-ML variant” mentioned in the Advisory Action mailed May 27, 2022 is based on Applicant’s own definition provided in paragraph [0039]: “… The “mini-machine learning algorithm variant” (“mini-ML variant”) term refers to a computationally less costly-to-train variant of a machine learning algorithm.”, where this concept of a computationally less costly-to-train variant of a machine learning algorithm was already expressed in the Applicant’s earlier recited claim language (i.e., “a modified, computationally less costly, variant of the machine learning algorithm” recited in independent Claims 1 and 12; “modified, less costly, variants of the reference variant of machine learning algorithms” recited in Claims 11 and 22). Examiner further notes that although this term “mini-ML variant” contains the sub-term “mini”, the usage of this sub-term does not further imply any reduction in size or a limited representation of a machine learning variant, other than the fact that the “mini-ML variant” is computationally less costly. Hence, the usage of “mini-ML variant” in place of the “modified variant” term does not introduce any additional features that would further restrict the earlier scope of the “modified, [computationally] less costly, variant” such that it is different than the “mini-ML variant” now recited in the claims. As indicated in the Final Office Action mailed March 18, 2022, Kobayashi teaches a learning control unit using a learning execution time $T_{i,j}^{q}$ determined from a time estimation unit to determine whether a machine learning model is taking more computational resources to learn based on the hyperparameter vector being learned during the learning step, where a machine learning model that completes training within a learning execution time and with a high predictive performance is considered less computationally costly than that takes more learning execution time to train (Kobayashi [0231]). This learning execution time $T_{i,j}^{q}$ associated with a machine learning model that completes training within a learning execution time represents a cost metric associated with the machine learning model. The determination based on this cost metric is performed by calculating an improvement rate $r_{i}^{q}$, which is based on the learning execution time and a performance improvement amount that is based on execution time, and as such, this cost metric based on a performance improvement amount dependent on execution time represents a cost metric that is measured based on usage of computing resources (Kobayashi Figure 23, elements 133c, 134, 135c; Figure 25, step S128 and [0277]: “(S128) The learning control unit 135c updates the total time $t_{\mathrm{sum}}$ to $t_{\mathrm{sum}} + t_{i,j+1}^{q}$ on the basis of the execution time $t_{i,j+1}^{q}$ obtained from the time estimation unit 133c. In addition, the learning control unit 135c calculates the improvement rate $r_{i}^{q} = g_{i,j+1}^{q} / t_{\mathrm{sum}}$, on the basis of the updated total time $t_{\mathrm{sum}}$ and the performance improvement amount $g_{i,j+1}^{q}$ acquired from the performance improvement amount estimation unit 134.”). Kobayashi further teaches the learning control unit using the improvement rate $r_{i}^{q}$ to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) whether to further store the learned model with prediction performance P, and associated information including the hyper-parameter vector (step S132) or to continue with additional processing (step S114). Hence the learning control unit at learning step j=2 performs a determination whether the earlier selected new variant of machine learning algorithm and its associated prediction performance and hyper-parameter is stored (i.e., selected) based on this measured improvement rate and learning execution time, where the storing of the model and its hyper-parameter value represents the training completion for this new model variant within a certain measured improvement rate. This stored new variant (and its hyper-parameter value) completed within the measured improvement rate and learning execution time represents a computationally less costly-to-train model, and hence this earlier identification and selection of this new variant based on a comparison of prediction performances between the new variant (learned in learning step j=2) and the reference variant (learned in learning step j=1) corresponds to steps for determining whether to use a new variant of the machine learning algorithm as a “mini-ML variant” based on the cost metric identifying it as being computationally less costly, where the earlier selection of this new variant (compared against a reference variant) was based on one or more accuracy criteria (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate $r_{i}^{q}$ is less than the threshold R. If the improvement rate $r_{i}^{q}$ is less than the threshold R, the operation proceeds to step S130. If the improvement rate $r_{i}^{q}$ is equal to or more than the threshold R, the operation proceeds to step S131. … (S130) The learning control unit 135c updates j to j+1. Next, the operation returns to step S123. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114.” … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector 𝛉 used to learn the model …”). Hence, the claim mappings associated with the Kobayashi reference as indicated in the Final Office Action mailed March 18, 2022 which mapped the term “modified, computationally less costly, variant of the machine learning algorithm” still applies to the new term “mini-ML variant” being used in the amended claims. Given this above evidence, Applicant’s amended claim limitations related to replacing the term “modified variant” term with “mini-ML variant” is merely a term change, with the existing Kobayashi reference still applicable and within scope of this new term, and hence the existing prior art rejection is maintained.
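For illustration only, the control flow quoted above from Kobayashi steps S128 through S132 may be sketched as follows. This is a minimal sketch and not Kobayashi's implementation: the function name, parameter names, and the `StepOutcome` type are hypothetical, and the improvement rate is assumed to be the performance improvement amount divided by the updated total time, as the quoted passage describes.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Result of one pass through steps S128, S129, and S131 (hypothetical type)."""
    t_sum: float             # total time after the S128 update
    improvement_rate: float  # assumed form: improvement amount / total time
    next_step: str           # label of the step that control passes to


def control_step(t_sum, t_next, g_next, threshold_r, elapsed, time_limit):
    """Sketch of the branch structure quoted from steps S128 to S132."""
    # S128: add the estimated execution time of the next learning step
    # to the running total, then compute the improvement rate.
    t_sum = t_sum + t_next
    rate = g_next / t_sum

    # S129: a rate below the threshold R leads to S130 (j is incremented
    # and the operation returns to S123).
    if rate < threshold_r:
        return StepOutcome(t_sum, rate, "S130")

    # S131: otherwise check the time limit. If it has been exceeded, the
    # model, its performance, and its hyperparameter vector are stored
    # (S132); if not, the operation returns to S114.
    if elapsed > time_limit:
        return StepOutcome(t_sum, rate, "S132")
    return StepOutcome(t_sum, rate, "S114")
```

Under these assumptions, the storing step S132 is reached only when the improvement rate is at or above the threshold R and the elapsed time has exceeded the time limit.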
Regarding Applicant’s assertion that the Kobayashi reference does not teach “… using one algorithm to determine the performance of another algorithm”, Examiner finds this assertion to be not persuasive. Examiner points out that Applicant’s assertion is directed to a limitation within the amended limitation: “based on the cost metric of the new variant of the machine learning algorithm and based at least in part on the plurality of differences in performance scores meeting one or more accuracy criteria, determining whether to use … said new variant of the machine learning algorithm as, computationally less costly, a mini-ML variant to determine performance of the reference variant …”. Examiner points out that this amended limitation contains a change in scope that was not previously entered, where this change in scope will be addressed in the relevant section indicated below. Examiner reminds Applicant that MPEP 2145(VI) indicates that “Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.”. Examiner also cites the related guidelines in MPEP 2111.01(II), which caution against importing written description into a claim limitation that is broader than the cited embodiment: "Though understanding the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim. For example, a particular embodiment appearing in the written description may not be read into a claim when the claim language is broader than the embodiment.". 
Applicant’s argument points to “one algorithm” being used to determine the performance of “another algorithm”, but there is nothing in the amended claim language to suggest that the “reference variant of the machine learning algorithm” and the “new variant of the machine learning algorithm” represent different algorithms, especially since their antecedent basis “the machine learning algorithm” implies that both the reference variant and the new variant are based on the same machine learning algorithm. Examiner also points out that none of the recited limitations in the independent claims recite a plurality of machine learning algorithms such that one algorithm is used to determine the performance of another algorithm. Under its broadest reasonable interpretation, the general term “variant of a machine learning algorithm” refers to variation or change applied to a machine learning algorithm, such as a parameter (in the case of the claimed invention, a hyperparameter). Hence, one “variant of the machine learning algorithm” with respect to another “variant of the machine learning algorithm” may differ by a hyperparameter, but both variants are based on the same machine learning algorithm. This meaning is consistent with Applicant’s specification paragraph [0041], where Applicant indicates that the performance score is determined for a reference variant of a machine learning algorithm, and successive iterations are performed to generate modified variants (from the same machine learning algorithm), where the performance score and a computational cost for a modified variant and reference variant are compared to determine whether the modified variant is computationally less costly ([0041]: “… the accuracy and computational cost are measured in comparison with a reference variant of algorithm. A reference variant is selected as a baseline variant, of the machine learning algorithm, which is modified to generate a mini-ML variant. 
Through iterative evaluations of variants of the reference variant, a mini-ML variant is generated, and in each iteration, at least one hyper-parameter value of the reference variant is modified. The modified algorithm is evaluated using a training data set to determine a performance score and to determine the computational cost for training the modified algorithm using the training dataset. Based on the computational cost and the performance score, the system determines whether to select the modified hyper-parameter value for the mini-ML variant of the algorithm - the lesser is the computational cost, the more likely is the hyper-parameter value to be selected.”). Hence, given the above evidence, Applicant’s assertion is not persuasive, and the existing prior art rejection is maintained. 
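For illustration only, the iterative procedure quoted above from specification paragraph [0041] may be sketched as follows. This is a minimal sketch under stated assumptions, not Applicant's implementation: the function `select_mini_variant`, the caller-supplied `evaluate` callable, and the acceptance rule (lower cost, with a score drop relative to the reference variant no greater than `max_score_drop`) are all hypothetical. The point of the sketch is that every candidate differs from the reference variant only in a hyper-parameter value, so all variants derive from the same machine learning algorithm.

```python
def select_mini_variant(reference, candidates, evaluate, max_score_drop):
    """Iteratively modify hyper-parameter values of a reference variant.

    reference      : dict of hyper-parameter names to values (reference variant)
    candidates     : iterable of (name, value) modifications to try in order
    evaluate       : hypothetical callable returning (performance_score, cost)
                     for a hyper-parameter dict trained on a training data set
    max_score_drop : assumed accuracy criterion, the largest tolerated loss
                     in score relative to the reference variant
    """
    ref_score, ref_cost = evaluate(reference)
    current, current_cost = dict(reference), ref_cost

    for name, value in candidates:
        # Each iteration modifies one hyper-parameter value.
        trial = dict(current)
        trial[name] = value
        score, cost = evaluate(trial)

        # Keep the modified value only if training is cheaper and the
        # score stays within the tolerated distance of the reference.
        if cost < current_cost and ref_score - score <= max_score_drop:
            current, current_cost = trial, cost

    return current
```

The returned dictionary holds the same hyper-parameter names as the reference variant, with only the accepted values changed.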
Regarding Applicant’s Remarks:
“… Arriving at the Applicant's invention of amended Claim 1 based on Kobayashi also would require impermissible hindsight, at least on the present evidentiary record. There is no evidence of record that at the time of the invention, one of ordinary skill in the art would have thought of the features of amended Claim 1 discussed above on the basis of Kobayashi. Indeed, the only teaching for modifying Kobayashi to include any additional limitations exists in the present application, which the Examiner cannot use as a roadmap to find all of the claim features without relying on impermissible hindsight. Accordingly, withdrawal of this rejection is respectfully solicited.”.
Examiner has considered this argument, and finds the argument to be not persuasive. Examiner points out that Applicant’s above arguments are similar to earlier Remarks already presented and responded to in the Final Office Action mailed March 18, 2022. MPEP 2141 describes the examination guidelines for determining obviousness under 35 U.S.C. 103, and MPEP 2145 describes impermissible hindsight in the context of establishing a prima facie case of obviousness. According to the Final Office Action mailed on March 18, 2022, Applicant’s independent claims 1 and 12 were rejected under §102(a)(1) as being anticipated by the Kobayashi reference (U.S. PGPUB 2017/0061329, published 3/2/2017, with foreign priority date 8/31/2015), with no other reference used in combination with the Kobayashi reference for rejecting the independent claims. Examiner notes that Applicant does not cite any case law or MPEP guidelines that discuss impermissible hindsight with regards to a §102(a)(1) anticipation rejection. Hence, Applicant’s argument to withdraw the existing §102(a)(1) anticipation rejection based on impermissible hindsight does not make sense, given the Examiner did not use a combination of references in the earlier rejection of the independent claims (i.e., no impermissible hindsight was used in the earlier §102(a)(1) anticipation rejection), and Applicant did not cite any case law or MPEP guidelines related to impermissible hindsight with regards to a §102(a)(1) anticipation rejection to support their argument. Furthermore, Examiner also reminds Applicant that it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning. See MPEP 2145. But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the applicant's disclosure, such a reconstruction is proper. 
See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971). Given the above points, Applicant’s argument concerning impermissible hindsight is not persuasive, and the prior art rejection is maintained.
Regarding Applicant’s Remarks:
 “Claim 12 is directed to one or more non-transitory computer-readable media storing instructions, wherein the instructions include a sequence of instructions, which, when executed by one or more hardware processors, cause to perform operations analogous to the steps of the method recited in amended Claim 1. Therefore, Claim 12, as amended, is patentable over the art of record at least for the same reasons as amended Claim 1. Applicant respectfully requests withdrawal of the rejections of Claim 12.”
Examiner has considered this argument and finds the argument to be not persuasive. Examiner notes that Applicant does not provide any additional arguments other than referencing Applicant’s previous set of arguments made for the limitations recited in independent Claim 1. As established in response to the previous set of arguments in the above paragraphs, Applicant’s arguments concerning the identified limitations in independent Claim 1 were not persuasive, and hence Applicant’s arguments for the same limitations present in independent Claim 12 are also not persuasive, and thus the prior art rejections are maintained.
Regarding Applicant’s Remarks:
“The pending claims not discussed so far, Claims 2-11 and 13-22 are dependent claims that depend on the independent claims that are discussed above. Because each of the dependent claims includes the limitations of claims upon which the dependent claims depend, the dependent claims are patentable for at least those reasons that the claims upon which the dependent claims depend on, are patentable. Applicant respectfully requests withdrawal of the rejections with respect to the dependent claims and further requests allowance of the dependent claims.”
Examiner notes that Applicant does not provide any additional arguments for the respective dependent claims other than referencing Applicant’s previous set of arguments made for the limitations recited in the respective parent independent claims. As established in response to the previous set of arguments in the above paragraphs, Applicant’s arguments concerning the identified limitations in the respective parent independent claims were not persuasive, and hence these same arguments applied to the respective dependent claims are also not persuasive, and thus the existing prior art rejections are maintained.
As indicated earlier, Examiner notes that the remainder of the Applicant’s prior art arguments are directed to the newly added claim limitations not previously presented that are now recited in the respective independent claims, where these new claim limitations necessitate further examination and re-evaluation of the amended and related original claims. These updated claim mappings according to the Applicant’s amended claims are provided in the sections indicated below.

Claim Objections
Claims 1, 12, and 22 are objected to because of the following informalities:
Claims 1 and 12: The following amended limitation should be corrected for grammatical errors as follows: “… determining whether to use, for new data sets, said new variant of the machine learning algorithm as[[,]] a computationally less costly[[,]] ”. Appropriate correction is required.
Claim 22: The following recited limitation contains a typographical error and needs to be corrected as follows: “wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithms for which a corresponding plurality of modified, less costly, variants of the reference variant of machine learning algorithm[[s]] are generated …”. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 11 and 22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Regarding amended Claims 11 and 22,
	Both Claims 11 and 22 recite the following limitations: “wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithms for which a corresponding plurality of mini-ML, less costly, variants of the reference variant of machine learning algorithms are generated …”; and “for each reference variant of the plurality of reference variants of the machine learning algorithms, selecting a respective mini-ML variant …”. There is insufficient antecedent basis for these limitations in the respective claims, since the terms “reference variants of machine learning algorithms” and “variants of the reference variant of machine learning algorithms” do not contain proper antecedents within any limitations from their respective independent Claims or earlier limitations from their respective dependent Claims that recite a plurality of machine learning algorithms. Examiner points out that Applicant’s specification does not indicate that reference variants or other variants are based on a plurality of machine learning algorithms. For example, Applicant’s specification paragraphs [0043] and [0050] only describe reference variants and mini-ML variants as being variants of a single machine learning algorithm ([0043]: “… This variant of the SVM machine learning algorithm is then selected as a reference variant to generate a mini-ML variant”; and [0050]: “… a mini-ML variant is generated based on a reference variant … the mini-ML variant is a variant of the algorithm …”). 
Hence, for purposes of examination, Examiner will interpret these terms using the established antecedent basis of a single machine learning algorithm found in their respective parent independent Claims (i.e., “wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithm[[s]] for which a corresponding plurality of mini-ML, less costly, variants of the reference variant of machine learning algorithm[[s]] are generated …”; and “for each reference variant of the plurality of reference variants of the machine learning algorithm[[s]], selecting a respective mini-ML variant …”).
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 11 and 22 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. 
The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Regarding amended Claims 11 and 22,
Both Claims 11 and 22 recite the following limitations: “wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithms for which a corresponding plurality of mini-ML, less costly, variants of the reference variant of machine learning algorithms are generated …”; and “for each reference variant of the plurality of reference variants of the machine learning algorithms, selecting a respective mini-ML variant …”, but the specification fails to disclose an embodiment of the invention that recites a plurality of machine learning algorithms. Examiner points out that Applicant’s specification paragraphs [0043] and [0050] describe reference variants and mini-ML variants as being variants of a single machine learning algorithm ([0043]: “… This variant of the SVM machine learning algorithm is then selected as a reference variant to generate a mini-ML variant”; and [0050]: “… a mini-ML variant is generated based on a reference variant … the mini-ML variant is a variant of the algorithm …”). Examiner further points out that Applicant’s specification only discusses “a plurality of machine learning algorithms” in the context of generally describing different types of machine learning algorithms in the “Background” and “Machine Learning Algorithms and Domains” sub-sections (paragraphs [0002]-[0007] and paragraphs [0027]-[0029] respectively), where these sub-sections only serve as general background and supplementary information, and the contents of these sub-sections are neither further mentioned nor integrated with the description of “reference variant” or other variants that are described in the claimed invention (starting from Applicant’s specification paragraph [0039]). 
Hence, Applicant’s disclosure lacks any written description that would enable a person having ordinary skill in the art to understand that the plurality of reference variants (or in general, a plurality of variants) recited in the claimed invention are based on a plurality of machine learning algorithms. The specification must describe and support the claims such that the public is informed of the boundaries of what constitutes infringement of the patent, as well as determining whether the claimed invention meets all the criteria for patentability by distinctly claiming the subject matter which the inventor regards as the invention. See MPEP 2163. Given that there is no support for these limitations in the specification, these limitations in Claims 11 and 22 fail to comply with the written description requirement. For the purposes of examination, Examiner will interpret these terms using the established antecedent basis of a single machine learning algorithm found in their respective parent independent Claims, as indicated earlier with respect to the lack of antecedent basis found in these same claims.

Claim Rejections – 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-6, 9-17, and 20-22 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kobayashi et al., U.S. PGPUB 2017/0061329, published 3/2/2017 [hereafter referred to as Kobayashi].
Regarding amended Claim 1, 
Kobayashi teaches
(Currently Amended) A computer-implemented method comprising: 
selecting a reference variant of a machine learning algorithm, the reference variant of the machine learning algorithm indicating at least one hyper-parameter having at least one original hyper-parameter value (Examiner’s note: Under its broadest reasonable interpretation, a “reference variant of machine learning algorithm” broadly recites a machine learning model that is used as a reference (or baseline) for other machine learning models. Kobayashi teaches a “fifth embodiment” exhibiting characteristics inherited from a second and fourth embodiment (Kobayashi [0229]) that generates and determines different machine learning models, where each of these machine learning models are based on a machine learning algorithm containing a set of hyperparameter vectors, where these vectors represent at least one hyper-parameter having at least one original hyper-parameter value. Kobayashi additionally teaches a learning control unit and a step execution unit, where the learning control unit specifies a machine learning algorithm $a_i$ with a sample size $s_j$ from a data set D (representing a “training data set”) that manages the execution of a set of learning steps j based on the same machine learning algorithm $a_i$ (with each learning step j producing variants of the same machine learning algorithm) that includes at least one hyperparameter in its associated hyperparameter vector (where initial values for the vector are fixed, Kobayashi [0205]-[0206]), and where the step execution unit performs cross-validation or random sub-sampling validation to identify and select the highest prediction performances out of H iterations to identify a representative machine learning model for the executed learning step. At learning step j=1, the step execution unit will select and output a model representing the highest prediction performances out of H iterations, where the model information (step number, prediction performance, algorithm information including hyperparameter vector and execution learning step time) is further stored in a management table to be used for comparison in additional learning steps j=2,3,4,…, which also generate additional representations of a machine learning model, all of which are being executed within an execution learning time $T_{i,j}^{q}$. Hence, a “reference variant of machine learning algorithm” is interpreted as the representative model based on a machine learning algorithm $a_i$ (containing a set of hyperparameter vectors) selected by the step execution unit at this first learning step j=1, and the model selection process performed by the step execution unit represents the selection of the reference variant of machine learning algorithm (Kobayashi Figure 23, elements 135c, 138c and [0256]; Figure 24, steps S118, S119 and [0261], [0266]-[0269], where the step execution unit 138c refers to steps recited in the fourth embodiment and taught in Figure 19, [0216]-[0225]: “The step execution unit 138 recognizes the machine learning algorithm $a_i$ and sample size $s_j$ … In addition, the step execution unit 138 recognized the data set D held in the data storage unit 121 … (S71) The step execution unit requests the hyperparameter adjustment unit 137 for a hyperparameter vector to be used next. … (S75) The step execution unit 138 learns a model m by using the machine learning algorithm $a_i$, the hyperparameter vector $\theta_h$, and the training data $D_t$ … (S76) The step execution unit calculates the prediction performance p of the model m by using the learned model m and the test data $D_s$ … (S78) The step execution unit 138 calculates the average value of the K prediction performances p … as a prediction performance $p_h$ that corresponds to the hyperparameter vector $\theta_h$ … (S79) The step execution unit 138 executes cross validation … Next, the operation proceeds to step S80. … (S80) The step execution unit 138 compares the number of times of the repetition of the above steps S71 to S79 with a threshold H and determines whether the former is less than the latter. …”).); 
modifying the at least one hyper-parameter from the at least one original hyper-parameter value to a new hyper-parameter value thereby generating a new variant of the machine learning algorithm with the at least one hyper-parameter having the new hyper-parameter value (Examiner’s note: Under its broadest reasonable interpretation, a “new variant of the machine learning algorithm” is interpreted as a machine learning model that is produced and selected after the “reference variant of the machine learning algorithm”. As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, where each learning step performed by a learning step unit and step execution unit produces representative machine learning models based on the same machine learning algorithm using a sample size $s_j$ from a training data set (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]). Hence, the generated representative machine learning model produced by the step execution unit in learning step j=2 represents a “new variant of machine learning algorithm”. Kobayashi further teaches each learning step uses a search region determination unit that identifies a set of hyperparameter vectors to use in each learning step j based on methods such as grid search and random search, as well as a hyper-parameter adjustment unit that performs further selection of the set of hyperparameter vectors (also using grid search, random search, or other alternative methods). A person having ordinary skill in the art would understand that methods such as grid search and random search are used to determine a set of optimal hyperparameter values (where at least one of the hyperparameters values in the hyperparameter vector are changed as part of the grid search/random search), and hence, these steps of determining a hyperparameter vector in a search region and further selecting the set of hyperparameter vectors using these methods for each learning step j (e.g., learning step j=2) represents modifying hyperparameter values for generating a new variant of the machine learning algorithm containing these modified hyperparameter values (Kobayashi Figure 19 and [0209]: “… the hyperparameter adjustment unit 137 generates a hyperparameter vector applied to a machine learning algorithm to be executed by the step execution unit 138. Grid search or random search may be used to generate the hyperparameter vector.”; Figure 23, element 137c, 139, and [0250]: “… the search region determination unit 139 selects hyperparameter vectors … through random search, grid search, or the like …”; and [0265]-[0267]: “(S117) The search region determination unit 139 determines a search region that corresponds to the virtual algorithm $a_i^{q}$ … and the sample size $s_j$. Namely, the search region determination unit 139 determines the hyperparameter vector set $\Phi_{i,j}^{q}$ … (S118) The step execution unit 138c executes the j-th learning step of the virtual algorithm $a_i^{q}$. Namely, the hyperparameter adjustment unit 137c selects a hyperparameter vector included in the search region determined in step S117 or a hyperparameter vector near the hyperparameter vector. The step execution unit 138c applies the selected hyperparameter vector to the machine learning algorithm $a_i$ and learns a model by using training data having the sample size $s_j$. … The step execution unit 138c repeats the above processing for a plurality of hyperparameter vectors. The step execution unit 138c determines a model, the prediction performance $p_{i,j}^{q}$ and the execution time $T_{i,j}^{q}$ from the results of the learning not stopped. … (S119) The learning control unit 135c acquires the learned model, the prediction performance $p_{i,j}^{q}$ thereof, the execution time $T_{i,j}^{q}$ from the step execution unit 138c.”).); 
determining a performance score of the new variant of the machine learning algorithm using a training data set, the performance score representing an accuracy of a new machine learning model generated by training the new variant of the machine learning algorithm with the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, where each learning step (controlled by a learning control unit and step execution unit) produces representative machine learning models based on the same machine learning algorithm using a sample size $s_j$ from a training data set (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]), and where the generated representative machine learning model produced by the step execution unit in learning step j=2 represents a “new variant of machine learning algorithm”. Kobayashi further teaches the step execution unit performing cross-validation or random sub-sampling methods (Kobayashi Figure 23, element 138c and [0256]; Figure 18, element 138, Figure 19, element S79 and [0221], [0225]; and [0089]) to identify and learn a model based on the training data, the machine learning algorithm, and associated hyperparameter vector on each of the H iterations, and outputs the model with the highest prediction performance (and its associated hyperparameter vector) out of H iterations to identify a representative machine learning model for the executed learning step. Kobayashi teaches that each prediction performance value associated with the model is a measure of accuracy and is determined through a measurement index (where this measurement index represents an accuracy criterion), such that each of these models and their associated prediction performance values generated through H iterations of cross-validation or random sub-sampling represent accuracy values determined through one or more accuracy criteria (Kobayashi [0078]-[0079]: “The machine learning device 100 calculates the “prediction performance” of a learned model. The prediction performance is the capability of accurately predicting results of unknown cases and may be referred to as “accuracy” … The accuracy, precision, RMSE, or the like may be used as the index representing the prediction performance … the RMSE is calculated by $(\mathrm{sum}(y - \hat{y})^{2} / N)^{1/2}$ …”; [0166]: “… The index that represents the prediction performance p may be set in advance in the step execution unit …”; and [0222]). Hence, these calculations of prediction performance values for each model resulting in the selection of the highest prediction performance value calculated and determined by the step execution unit at learning step j=2 represents a determination of a performance score representing an accuracy for a new variant of the machine learning algorithm (Kobayashi Figure 19 and [0216]-[0225]; and [0226]-[0227]: “… (S81) The step execution unit 138 outputs the highest prediction performance among the prediction performances $p_1$, $p_2$, … $p_H$ as the prediction performance $p_{i,j}$. In addition, the step execution unit 138 outputs a model that corresponds to the prediction performance $p_{i,j}$ among the models $m_1$, $m_2$, … $m_H$. In addition, the step execution unit 138 outputs a hyperparameter vector that corresponds to the prediction performance $p_{i,j}$ among the hyperparameter vectors 
                            
                                
                                    θ
                                
                                
                                    1
                                
                            
                        
                    ,                         
                            
                                
                                    θ
                                
                                
                                    2
                                
                            
                        
                    , …                        
                            
                                
                                    θ
                                
                                
                                    H
                                
                            
                        
                    . In addition, the step execution unit 138 calculates and outputs an execution time. The execution time may be the entire time needed to execute the single learning step from step S70 to step S81 or the time needed to execute steps S72 to S79 from which the outputted model is obtained.”).); 
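The selection described in the quoted step S81 can be illustrated with a minimal sketch, assuming the H prediction performances, models, and hyperparameter vectors are held in three parallel sequences. The function name `select_step_output` and its parameters are hypothetical and do not appear in Kobayashi; the sketch only shows picking the entry with the highest prediction performance and is not the reference's implementation.

```python
from typing import Any, Sequence, Tuple


def select_step_output(
    performances: Sequence[float],
    models: Sequence[Any],
    thetas: Sequence[Any],
) -> Tuple[float, Any, Any]:
    """Return the highest prediction performance together with the model
    and hyperparameter vector at the same position (hypothetical helper)."""
    if not performances or not (len(performances) == len(models) == len(thetas)):
        raise ValueError("expected three non-empty sequences of equal length")
    # Index of the highest prediction performance; ties keep the first one.
    best = max(range(len(performances)), key=lambda h: performances[h])
    return performances[best], models[best], thetas[best]


# Example with H = 3 candidates (values chosen only for illustration).
p_ij, model, theta = select_step_output(
    [0.5, 0.75, 0.625], ["m1", "m2", "m3"], [(1,), (2,), (3,)]
)
```

In this example the second candidate is returned because its prediction performance is the largest of the three.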
comparing the performance score of the new variant of the machine learning algorithm for the training data set with a performance score of the reference variant of the machine learning algorithm for the training data set to determine at least one difference, of a plurality of differences in performance scores, between the performance score of the new variant of the machine learning algorithm and the performance score of the reference variant of the machine learning algorithm (Examiner’s note: As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, to produce representative machine learning models based on the same machine learning algorithm (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]), where the generated representative machine learning model at learning step j=2 represents a “new variant of machine learning algorithm”. Kobayashi further teaches at each learning step j, the learning control unit acquires the learned model from the output of the step execution unit, and performs a comparison of the selected current learned model’s prediction performance $p_{i,j}^{q}$ with the achieved performance P, where P represents the achieved prediction performance up to now (which is interpreted as being the highest prediction performance from earlier learning steps). As indicated earlier, Kobayashi teaches that each prediction performance value is an accuracy value based on a measurement index, where the measurement index includes an RMSE (root mean squared error) metric involving a calculation based on a difference between prediction performance results and is applied during each of the H iterations in the step execution unit to produce a set of prediction performances from which the current learned model’s prediction performance was selected as being the highest prediction performance (where this comparison within this set of prediction performances indirectly determines a set of differences between each prediction performance to select the highest one) (Kobayashi [0078]-[0079]; [0166]; [0222]; and [0226]-[0227]). Hence, in the context of learning step j=2, the comparison between the current learned model prediction performance and the achieved prediction performance P (i.e., the prediction performance from the learned model at learning step j=1, where learning step j=1 establishes a “reference variant of machine learning algorithm”) corresponds to the step of “comparing the performance score of the new variant … with a performance score of the reference variant …” (Kobayashi [0267]-[0269]: “(S119) The learning control unit 135c acquires the learned model, the prediction performance $p_{i,j}^{q}$ thereof, the execution time $T_{i,j}^{q}$ from the step execution unit 138c. … (S120) The learning control unit 135c compares the prediction performance $p_{i,j}^{q}$ acquired in step S119 with the achieved prediction performance P (the maximum prediction performance achieved up until now) and determines whether the former is larger than the latter. … If the prediction performance $p_{i,j}^{q}$ is larger than the achieved prediction performance P, the operation proceeds to step S121. Otherwise, the operation proceeds to step S122 … (S121) The learning control unit 135c updates the achieved prediction performance P to the prediction performance $p_{i,j}^{q}$ …”).); 
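The comparison and update described in the quoted steps S120 and S121 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `compare_and_update` and the returned tuple are hypothetical, and the explicit difference value is added only to make the "difference in performance scores" reading visible; Kobayashi itself only describes determining whether the former value is larger than the latter.

```python
from typing import Tuple


def compare_and_update(achieved_p: float, candidate_p: float) -> Tuple[float, float, bool]:
    """Compare a newly obtained prediction performance with the achieved
    prediction performance P (hypothetical helper).

    Returns (new value of P, candidate minus previous P, whether P was updated).
    """
    difference = candidate_p - achieved_p
    if candidate_p > achieved_p:
        # Corresponds to proceeding to S121: P is updated to the candidate.
        return candidate_p, difference, True
    # Otherwise P is left unchanged (the flow proceeds to S122).
    return achieved_p, difference, False


# Learning step j=1 gave P = 0.5; learning step j=2 gives 0.75 (illustrative values).
new_p, diff, updated = compare_and_update(0.5, 0.75)
```

With these illustrative values the candidate exceeds the achieved performance by 0.25, so P is updated to 0.75.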
determining a cost metric of the new variant of the machine learning algorithm by measuring usage of computing resources when training the new variant of the machine learning algorithm on the training data set (Examiner’s note: Kobayashi teaches a learning control unit using a learning execution time $T_{i,j}^{q}$ determined from a time estimation unit to determine whether a machine learning model is taking more computational resources to learn based on the hyperparameter vector being learned during the learning step, where a machine learning model that completes training within a learning execution time and with a high predictive performance is considered less computationally costly than one that takes more learning execution time to train (Kobayashi [0231]). Hence, this learning execution time $T_{i,j}^{q}$ associated with a machine learning model that completes training within a learning execution time represents a cost metric associated with the machine learning model. The determination based on this cost metric is performed by calculating an improvement rate $r_{i}^{q}$, which is based on the learning execution time and a performance improvement amount that is based on execution time, and as such, this cost metric based on a performance improvement amount dependent on execution time represents a cost metric that is measured based on usage of computing resources (Kobayashi Figure 23, elements 133c, 134, 135c; Figure 25, step S128 and [0277]: “(S128) The learning control unit 135c updates the total time $t_{sum}$ to $t_{sum} + t_{i,j+1}^{q}$ on the basis of the execution time $t_{i,j+1}^{q}$ obtained from the time estimation unit 133c. In addition, the learning control unit 135c calculates the improvement rate $r_{i}^{q} = g_{i,j+1}^{q} / t_{sum}$, on the basis of the updated total time $t_{sum}$ and the performance improvement amount $g_{i,j+1}^{q}$ acquired from the performance improvement amount estimation unit 134.”).); 
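The calculation in the quoted step S128 can be illustrated with a short sketch, assuming the improvement rate is the ratio of the estimated performance improvement amount to the updated total time. The function name `improvement_rate` and its parameter names are hypothetical; the estimated values themselves would come from the time estimation unit and the performance improvement amount estimation unit, which are not modeled here.

```python
from typing import Tuple


def improvement_rate(t_sum: float, t_next: float, g_next: float) -> Tuple[float, float]:
    """Update the total time and compute the improvement rate (hypothetical helper).

    t_sum  -- total time accumulated so far
    t_next -- estimated execution time of the next learning step
    g_next -- estimated performance improvement amount of the next learning step
    """
    updated_t_sum = t_sum + t_next
    if updated_t_sum <= 0:
        raise ValueError("total time must be positive")
    # Assumed reading: improvement amount per unit of accumulated time.
    return updated_t_sum, g_next / updated_t_sum


# Illustrative values only: 6 time units so far, 2 more estimated, improvement 0.5.
t_sum, rate = improvement_rate(6.0, 2.0, 0.5)
```

Under this reading, a step that needs more execution time for the same improvement amount yields a lower rate, which is why the rate can be treated as reflecting computational cost.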
based on the cost metric of the new variant of the machine learning algorithm and based at least in part on the plurality of differences in performance scores meeting one or more accuracy criteria, determining whether to use, for new data sets, said new variant of the machine learning algorithm as[[,]] a computationally less costly[[,]] (Examiner’s note: Under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is an alternate name for a variant that is computationally less costly-to-train, and the phrases “… based at least in part on the plurality of differences in performance scores meeting one or more accuracy criteria” and “acceptable measure of the plurality of differences in the performance scores of a new variant … and of the reference variant” broadly indicate performing the earlier limitation involving comparison of prediction performance scores and applying the same earlier accuracy criteria to compare the performance scores between a new variant and a reference variant. Furthermore, under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0074], the term “new data sets” broadly indicates data sets applied in cross-validation. Hence this limitation broadly recites steps for determining performance of the reference variant based on data sets applied in cross-validation, including steps for determining whether to identify and select a new variant of the machine learning algorithm as a “mini-ML variant” based on a cost metric that identifies it as being computationally less costly and based on an earlier selection process based on at least a plurality of differences in performance scores between the reference variant and the new variant according to one or more accuracy criteria. 
As indicated earlier, Kobayashi teaches the step execution unit applying measurement indices representing an accuracy criterion (Kobayashi [0078]-[0079]; [0166]; [0222]; and [0226]-[0227]), as well as performing a comparison to identify the best prediction performance between two values associated with respective models, where this comparison represents a measurement of a difference between two prediction performances, thus representing another accuracy criterion (Kobayashi [0267]-[0269]). Kobayashi teaches the step execution unit performing determination of the prediction performance values using K-fold cross-validation to learn and evaluate the model with the machine learning algorithm $a_{i}$ and modified hyperparameter vector, where the execution of the K-fold cross validation method is performed at each learning step j, and hence execution of K-fold cross validation at learning step j=1 represents a determination of the performance score of a reference variant based on new data sets (Kobayashi Figure 23, element 138c and [0256]: “The step execution unit 138c executes learning steps … in the same way as in the fourth embodiment. …”; Figure 18, element 138, Figure 19, element S79 and [0221], [0225]: “(S79) The step execution unit 138 executes cross validation …”; and [0089]). Following the execution of the above steps, Kobayashi further teaches the learning control unit using the improvement rate $r_{i}^{q}$ to check against a threshold R, and performing a further determination (Kobayashi Figure 25, step S131) whether to further store the learned model with prediction performance P (step S132) or to continue with additional processing (step S114). Hence the learning control unit (at learning step j=2) performs additional determination of whether the earlier selected new variant of machine learning algorithm and its associated prediction performance and hyper-parameter is stored (i.e., selected) based on this measured improvement rate and learning execution time, where the act of storing the model and its associated information represents the training completion for this new model variant within the learning execution time. This stored new variant completed within the learning execution time represents a computationally less costly-to-train model, and hence when combined with the earlier identification and selection of this new variant based on a comparison of prediction performances between the new variant (learned in learning step j=2) and the reference variant (learned in learning step j=1), this series of steps corresponds to steps for determining performance of the reference variant based on new data sets, including steps for determining whether to identify and select a new variant of the machine learning algorithm as a “mini-ML variant” based on a cost metric that identifies it as being computationally less costly and based on an earlier selection process based on at least a plurality of differences in performance scores between the reference variant and the new variant according to one or more accuracy criteria (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate $r_{i}^{q}$ is less than the threshold R. If the improvement rate $r_{i}^{q}$ is less than the threshold R, the operation proceeds to step S130. If the improvement rate $r_{i}^{q}$ is equal to or more than the threshold R, the operation proceeds to step S131. … (S130) The learning control unit 135c updates j to j+1. Next, the operation returns to step S123. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector 𝛉 used to learn the model …”).).
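The branching recited in the quoted steps S129 through S132 can be summarized in a minimal sketch that follows the quoted wording literally. The function name `next_step` and its parameters are hypothetical, and the returned strings are simply the step labels from Kobayashi Figures 24 and 25; the work performed inside each step is not modeled.

```python
def next_step(rate: float, threshold_r: float, elapsed: float, time_limit: float) -> str:
    """Return the label of the step the flow proceeds to (hypothetical helper)."""
    # S129: an improvement rate below the threshold R leads to S130,
    # which updates j to j+1 and returns to S123.
    if rate < threshold_r:
        return "S130"
    # S131: reached when the rate is equal to or more than R.
    if elapsed > time_limit:
        # S132: store the achieved prediction performance P, the model,
        # and the hyperparameter vector.
        return "S132"
    # Otherwise the operation returns to S114.
    return "S114"


# Illustrative values only.
step = next_step(rate=0.0625, threshold_r=0.05, elapsed=120.0, time_limit=100.0)
```

With these illustrative values the rate is not below the threshold and the time limit has been exceeded, so the flow reaches the storing step S132.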
Regarding amended Claim 2, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising: 
based on comparing the performance score of the new variant of the machine learning algorithm for the training data set with the performance score of the reference variant of the machine learning algorithm for the training data set, determining that the new variant of the machine learning algorithm meets the one or more accuracy criteria based on the performance score of the reference variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to the actions taken after the comparison step recited in Claim 1. As indicated earlier, Kobayashi teaches a comparison is performed between the selected current learned model prediction performance $p_{i,j}^{q}$ (representing a “new variant of the machine learning algorithm”) and the achieved prediction performance P (i.e., the prediction performance from the learned model at learning step j=1, where learning step j=1 establishes a “reference variant of machine learning algorithm”), where this comparison to identify the best prediction performance between two values also represents a measurement of a difference between two prediction performances, and thus also represents an accuracy criterion. Kobayashi further teaches the higher of the two prediction performance values is stored by updating P, where the scenario of storing the current prediction performance $p_{i,j}^{q}$ and updating the achieved prediction performance P is taught in Kobayashi Figure 24, steps S120 and S121, thus representing a determination that the new variant of the machine learning algorithm meets the one or more accuracy criteria based on the performance score of the reference variant (Kobayashi Figure 23, elements 135c, 138c; Figure 24, steps S120→S121; and [0266]-[0269]: “… (S120) The learning control unit 135c compares the prediction performance $p_{i,j}^{q}$ acquired in step S119 with the achieved prediction performance P (the maximum prediction performance achieved up until now) and determines whether the former is larger than the latter. If the prediction performance $p_{i,j}^{q}$ is larger than the achieved prediction performance P, the operation proceeds to step S121. … (S121) The learning control unit 135c updates the achieved prediction performance P to the prediction performance $p_{i,j}^{q}$ …”).); 
determining to select the new hyper-parameter value for the mini-ML variant of the machine learning algorithm based on the cost metric of the new variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is an alternate name for a variant that is computationally less costly-to-train, and hence this limitation broadly recites steps for identifying the new variant of the machine learning algorithm as the “mini-ML variant” based on satisfying the criteria of being computationally less costly. As indicated earlier, for each learning step j, Kobayashi teaches the learning control unit using the improvement rate $r_{i}^{q}$ to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) whether to store the model, its prediction performance P, and associated information including the hyper-parameter vector (Kobayashi Figure 25, step S132) or to continue with additional processing (Kobayashi Figure 25, step E, which points to Figure 24, step S114). Hence, the learning control unit that determines to store the new variant of machine learning algorithm and its associated prediction performance and hyper-parameter given that it has completed its learning within the measured improvement rate and learning execution time $T_{i,j}^{q}$ corresponds to steps for identifying the new variant of the machine learning algorithm as the “mini-ML variant” based on satisfying the criteria of being computationally less costly (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132, and [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate $r_{i}^{q}$ is less than the threshold R. If the improvement rate $r_{i}^{q}$ is less than the threshold R, the operation proceeds to step S130. If the improvement rate 
                            
                                
                                    r
                                
                                
                                    i
                                
                                
                                    q
                                
                            
                        
                     is equal to or more than the threshold R, the operation proceeds to step S131. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. In addition, the learning control unit 135c stores the algorithm ID of the machine learning algorithm associated with the achieved prediction performance P and the sample size that corresponds to the step number associated with the achieved prediction performance P in the learning result storage unit 123. In addition, the learning control unit 135c stores the hyperparameter vector θ used to learn the model in the learning result storage unit 123.”).).  
Regarding amended Claim 3,
Kobayashi teaches
(Currently Amended) The method of Claim 1, 
wherein determining the performance score of the new variant of the machine learning algorithm using the training data set further comprises performing cross-validation of the new variant of the machine learning algorithm on the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, to produce representative machine learning models based on the same machine learning algorithm (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]), where the generated representative machine learning model at learning step j=2 represents a “new variant of machine learning algorithm”. Kobayashi additionally teaches the step execution unit performs determination of the prediction performance values using K-fold cross-validation to learn and evaluate the model with the machine learning algorithm $a_i$ and modified hyperparameter vector, where the execution of the K-fold cross validation method is performed at each learning step j, and hence this execution of the K-fold cross validation method at learning step j=2 represents a determination of the performance score of the new variant of the machine learning algorithm (Kobayashi Figure 23, element 138c and [0256]: “The step execution unit 138c executes learning steps one by one in the same way as in the fourth embodiment. …”; Figure 18, element 138, Figure 19, element S79 and [0221], [0225]: “(S79) The step execution unit 138 executes cross validation …”; and [0089]: “… every time a single sample size (a single learning step) is processed, a model is learned and the prediction performance thereof is evaluated. Examples of the validation method in each learning step include cross validation … In cross validation, the machine learning device 100 divides the sampled data into K blocks … The machine learning device 100 uses (K-1) blocks as the training data and 1 block as the test data. The machine learning device 100 repeatedly performs model learning and evaluating the prediction performance K times while changing the block used as the test data. As a result of a single learning step, for example, the machine learning device 100 outputs a model indicating the highest prediction performance among the K models and an average value of the K prediction performances.”).).  
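For illustration only, the K-fold procedure quoted above from Kobayashi [0089] (K blocks, K-1 blocks for training, one block for testing, repeated K times, returning the best of the K models and the average of the K prediction performances) can be sketched as follows. This is a minimal sketch and not Kobayashi's implementation: the function names, the interleaved block split, and the toy model (the mean of the training values, scored by negative mean absolute error) are all assumptions introduced here.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def k_fold_step(
    samples: Sequence[float],
    k: int,
    train: Callable[[List[float]], float],
    evaluate: Callable[[float, List[float]], float],
) -> Tuple[float, float]:
    """Run one learning step with K-fold cross validation.

    Returns (best_model, average_score) over the K folds.
    """
    if not 2 <= k <= len(samples):
        raise ValueError("k must be between 2 and the number of samples")

    # Divide the sampled data into K blocks (interleaved split: an assumption).
    blocks = [list(samples[i::k]) for i in range(k)]

    results = []
    for test_idx in range(k):
        # One block is the test data; the other K-1 blocks are training data.
        test_block = blocks[test_idx]
        train_data = [x for i, b in enumerate(blocks) if i != test_idx for x in b]
        model = train(train_data)
        results.append((evaluate(model, test_block), model))

    # Output the model with the highest score and the average of the K scores.
    _, best_model = max(results, key=lambda r: r[0])
    average_score = mean(score for score, _ in results)
    return best_model, average_score


# Hypothetical toy learner: the "model" is the mean of the training values.
def toy_train(data: List[float]) -> float:
    return mean(data)


# Hypothetical score: negative mean absolute error (higher is better).
def toy_evaluate(model: float, data: List[float]) -> float:
    return -mean(abs(x - model) for x in data)


best_model, average_score = k_fold_step([1.0, 2.0, 3.0, 4.0], 2, toy_train, toy_evaluate)
```

The average score plays the role of the per-step prediction performance discussed in the mapping above; the best-of-K model is what a learning step would retain.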
Regarding amended Claim 4, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising 
determining the performance score of the reference variant by performing cross-validation of the reference variant on the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches the step execution unit performs determination of the prediction performance values using K-fold cross-validation to learn and evaluate the model with the machine learning algorithm $a_i$ and modified hyperparameter vector, where the execution of the K-fold cross validation method is performed at each learning step j, and hence this execution of the K-fold cross validation method at learning step j=1 represents a determination of the performance score of the reference variant of the machine learning algorithm (Kobayashi Figure 23, element 138c and [0256]; Figure 18, element 138, Figure 19, element S79, and [0221], [0225]: “(S79) The step execution unit 138 executes cross validation instead of the above random sub-sampling validation. …”; and [0089]).).  
Regarding previously presented Claim 5, 
Kobayashi teaches
(Previously Presented) The method of Claim 1, further comprising generating the reference variant by: 
selecting, for the machine learning algorithm, a distinct set of hyper-parameter values from a plurality of distinct sets of hyper-parameter values (Examiner’s note: Kobayashi teaches each learning step uses a search region determination unit that identifies a set of hyperparameter vectors to use in each learning step j based on methods such as grid search and random search, as well as a hyper-parameter adjustment unit that performs further selection of the set of hyperparameter vectors (also using grid search, random search, or other alternative methods). Kobayashi further teaches the search region determination unit divides a hyperparameter vector space into different regions, where each region is used to produce distinct groups of hyperparameter vectors (Kobayashi Figure 20 and [0232]-[0235]: “… the hyperparameter vector space 40 is divided into regions 41 to 44 … The regions 41 to 44 are examples obtained by dividing the hyperparameter vector space 40 when a machine learning algorithm $a_1$ is executed by using training data having the sample size $s_1$. The region 41 corresponds to a hyperparameter vector set $\Delta\Phi_{1,1}^{1}$ … The region 44 corresponds to a hyperparameter vector set $\Delta\Phi_{1,1}^{4}$ …”; Figure 23, elements 139, 137c, 135c, and [0249]-[0250]: “The search region determination unit 139 determines a set of hyperparameter vectors (a search region) used in the next learning step in response to a request from the learning control unit 135c. The search region determination unit 139 determines $\Phi_{i,j}^{q}$ as described above. Namely, among the hyperparameter vectors included in $\Phi_{i,j}^{q}$ the search region determination unit 139 adds the hyperparameter vectors used in the model learning completed to $\Phi_{i,j}^{q}$. … when j=1 and q=1, the search region determination unit 139 selects hyperparameter vectors as many as possible from the hyperparameter vector space through random search, grid search, or the like and adds the selected hyperparameter vectors to $\Phi_{1,1}^{1}$.”).);   
performing cross-validation of the machine learning algorithm on one or more training data sets (Examiner’s note: As indicated earlier, Kobayashi teaches the step execution unit performs determination of the prediction performance values using K-fold cross-validation to learn and evaluate the model with the machine learning algorithm $a_i$ and modified hyperparameter vector, where the execution of the K-fold cross validation method is performed at each learning step j, and hence this execution of the K-fold cross validation method at learning step j=1 represents a determination of the performance score for the reference variant of the machine learning algorithm (Kobayashi Figure 23, element 138c and [0256]; Figure 18, element 138, Figure 19, element S79, and [0221], [0225]: “(S79) The step execution unit 138 executes cross validation instead of the above random sub-sampling validation. …”; and [0089]).); 
based on performing cross-validation of the machine learning algorithm, determining whether to select the distinct set of hyper-parameter values for the reference variant (Examiner’s note: As indicated earlier, for each learning step j, Kobayashi teaches using the improvement rate $r_i^q$ to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) whether to store the model, its prediction performance P (which was earlier determined as taught in Kobayashi [0267]-[0269]), and associated information including the hyper-parameter vector (step S132) or to continue with additional processing (step S114). Hence, the learning control unit performing a determination whether the new variant of machine learning algorithm and its associated prediction performance and hyper-parameter is stored (i.e., selected) according to whether it has exceeded a learning execution time or completed within the learning execution time (at learning step j=1) represents a determination of whether to select the distinct set of hyper-parameter values for the reference variant (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate $r_i^q$ is less than the threshold R. If the improvement rate $r_i^q$ is less than the threshold R, the operation proceeds to step S130. If the improvement rate $r_i^q$ is equal to or more than the threshold R, the operation proceeds to step S131. … (S130) The learning control unit 135c updates j to j+1. Next, the operation returns to step S123. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector θ used to learn the model …”).).  
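For illustration only, the branching quoted above from Kobayashi steps S129 through S132 can be sketched as a small decision function. This is a minimal sketch that follows only the quoted passages: the function name, parameter names, and the use of step labels as return values are assumptions introduced here, and steps not quoted in this action are not modeled.

```python
def next_operation(
    improvement_rate: float,
    threshold_r: float,
    elapsed_seconds: float,
    time_limit_seconds: float,
) -> str:
    """Return the label of the step that follows S129, per the quoted text."""
    # (S129) An improvement rate below the threshold R leads to S130,
    # where j is updated to j+1 and the operation returns to step S123.
    if improvement_rate < threshold_r:
        return "S130"

    # (S131) Otherwise the elapsed time is checked against the time limit.
    # If the limit has been exceeded, S132 stores the achieved prediction
    # performance P, the model, and the hyperparameter vector.
    if elapsed_seconds > time_limit_seconds:
        return "S132"

    # Time remains, so the operation returns to step S114.
    return "S114"
```

For example, `next_operation(0.5, 0.1, 120.0, 60.0)` returns `"S132"`: the improvement rate is not below the threshold and the time limit has been exceeded, so the results are stored.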
Regarding original Claim 6, 
Kobayashi teaches
(Original) The method of Claim 5, further comprising selecting the distinct set of hyper-parameter values from the plurality of distinct sets of hyper-parameter values based on one of: 
a Bayesian optimization, 
a random search (Examiner’s note: Kobayashi teaches the search region determination unit and hyperparameter adjustment unit generating a hyperparameter vector from a set of hyperparameters using a random search method (Kobayashi Figure 23, elements 139, 137c; Figure 18, element 137 and [0249]: “The search region determination unit 139 determines a set of hyperparameter vectors (a search region) used in the next learning step in response to a request from the learning control unit 135c. … Namely, among the hyperparameter vectors included in $\Phi_{i,j}^{q}$ the search region determination unit 139 adds the hyperparameter vectors used in the model learning completed to $\Phi_{i,j}^{q}$. … However, when j=1 and q=1, the search region determination unit 139 selects hyperparameter vectors as many as possible from the hyperparameter vector space through random search, grid search, or the like and adds the selected hyperparameter vectors to $\Phi_{1,1}^{1}$.”; and [0209]: “In response to a request from the step execution unit 138, the hyperparameter adjustment unit 137 generates a hyperparameter vector applied to a machine learning algorithm to be executed by the step execution unit 138. Grid search or random search may be used to generate the hyperparameter vector. …”).), 
a gradient-based search, 
a grid search (Examiner’s note: As indicated earlier, Kobayashi teaches the search region determination unit and hyperparameter adjustment unit generating a hyperparameter vector from a set of hyperparameters using a grid search method (Kobayashi Figure 23, elements 139, 137c; Figure 18, element 137; [0249] and [0209]).), or 
a Tree-structured Parzen Estimators (TPE) based selection.  
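For illustration only, the two selection techniques mapped above (grid search and random search) can be sketched as generators of candidate hyperparameter vectors. This is a minimal sketch, not Kobayashi's implementation: the function names, the dictionary representation of a hyperparameter vector, and the example parameter names `C` and `gamma` are assumptions introduced here.

```python
import itertools
import random
from typing import Dict, List, Sequence, Tuple


def grid_search_vectors(axes: Dict[str, Sequence[float]]) -> List[Dict[str, float]]:
    """Enumerate every combination of the listed values (grid search)."""
    names = list(axes)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*(axes[name] for name in names))
    ]


def random_search_vectors(
    ranges: Dict[str, Tuple[float, float]], count: int, seed: int = 0
) -> List[Dict[str, float]]:
    """Draw `count` vectors uniformly at random from the given ranges."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(count)
    ]


# Hypothetical two-parameter search space.
grid = grid_search_vectors({"C": [0.1, 1.0], "gamma": [0.01, 0.1, 1.0]})
sampled = random_search_vectors({"C": (0.1, 1.0), "gamma": (0.01, 1.0)}, count=5)
```

Grid search covers the space exhaustively at a cost that grows with the product of the axis sizes, while random search fixes the number of candidates in advance; either one yields a plurality of distinct sets of hyper-parameter values from which one set is selected.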
Regarding amended Claim 9,
Kobayashi teaches
(Currently Amended) The method of Claim 1, 
wherein the new hyper-parameter value is based on a previous hyper-parameter value of a previous machine learning algorithm generated from the reference variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, the term “previous machine learning algorithm generated from the reference variant of the machine learning algorithm” broadly indicates a variant based on the reference variant of a machine learning algorithm. Kobayashi teaches the hyperparameter adjustment unit performing adjustment of a hyperparameter vector used in the last learning step performed by the step execution unit, where this hyperparameter for a learning step (e.g., j=2) is based on an adjustment of the hyperparameter from the last learning step (j=1) that is associated with the reference variant of machine-learning algorithm $a_i$. Hence this hyperparameter value in the latest learning step based on an adjusted hyperparameter value from the last learning step (with the last learning step representing a generation of a reference variant) corresponds to “wherein the new hyper-parameter value is based on a previous hyper-parameter value of a previous machine-learning algorithm generated from the reference variant of machine learning algorithm” (Kobayashi Figure 23, element 137c; Figure 18, element 137 and [0211]-[0212]: “The hyperparameter adjustment unit 137 may refer to a hyperparameter vector used in the last learning step of the same machine learning algorithm, to make the search for a preferable hyperparameter vector more efficient. For example, the hyperparameter adjustment unit 137 may perform the search by starting with a hyperparameter vector $\theta_{j-1}$, that achieved the best prediction performance in the last learning step. … assuming that the hyperparameter vectors that achieve the best prediction performance … are $\theta_{j-1}$ and $\theta_{j-2}$, respectively, the hyperparameter adjustment unit may generate $2\theta_{j-1} - \theta_{j-2}$ as the hyperparameter to be used next.”).).  
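For illustration only, the expression $2\theta_{j-1} - \theta_{j-2}$ quoted above from Kobayashi [0211]-[0212] is a linear extrapolation: it continues the trend between the best hyperparameter vectors of the two preceding learning steps. A minimal sketch follows; the function name and the list-of-floats representation of a hyperparameter vector are assumptions introduced here.

```python
from typing import List, Sequence


def extrapolate_next_vector(
    theta_prev: Sequence[float], theta_prev2: Sequence[float]
) -> List[float]:
    """Compute 2 * theta_{j-1} - theta_{j-2} element by element."""
    if len(theta_prev) != len(theta_prev2):
        raise ValueError("hyperparameter vectors must have the same length")
    return [2 * a - b for a, b in zip(theta_prev, theta_prev2)]


# The first component moved from 2.0 to 4.0, so the next guess is 6.0;
# the second moved from 20.0 to 10.0, so the next guess is 0.0.
next_theta = extrapolate_next_vector([4.0, 10.0], [2.0, 20.0])
```

The result depends on the previous step's vector by construction, which is the dependence the mapping above relies on for a new hyper-parameter value that is based on a previous hyper-parameter value.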
Regarding amended Claim 10, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising: 
based on comparing the performance score of the new variant of the machine learning algorithm for the training data set with the performance score of the reference variant of the machine learning algorithm for the training data set, determining that the new variant of the machine learning algorithm fails to meet the one or more accuracy criteria based on the performance score of the reference variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to the actions taken after the comparison step recited in Claim 1. As indicated earlier, Kobayashi teaches a comparison is performed between the selected current learned model prediction performance $p_{i,j}^{q}$ (representing a “new variant of the machine learning algorithm”) and the achieved prediction performance P (i.e., the prediction performance from the learned model at learning step j=1, where learning step j=1 establishes a “reference variant of machine learning algorithm”), where this comparison to identify the best prediction performance between two values also represents a measurement of a difference between two prediction performances, and thus also represents an accuracy criterion. Kobayashi further teaches the higher of the two prediction performance values is stored by updating P, where the scenario of not storing the current prediction performance $p_{i,j}^{q}$ (i.e., the prediction performance of the new variant is not higher than the achieved prediction performance P) is taught in Kobayashi Figure 24, step S122, thus representing a determination that the new variant of the machine learning algorithm fails to meet the one or more accuracy criteria based on the performance score of the reference variant (Kobayashi [0266]-[0269]: “… (S120) The learning control unit 135c compares the prediction performance $p_{i,j}^{q}$ acquired in step S119 with the achieved prediction performance P (the maximum prediction performance achieved up until now) and determines whether the former is larger than the latter. If the prediction performance $p_{i,j}^{q}$ is larger than the achieved prediction performance P, the operation proceeds to step S121. Otherwise, the operation proceeds to step S122. …”).); 
based on determining that the new variant of the machine learning algorithm fails to meet the one or more accuracy criteria, determining not to select the new hyper-parameter value for the mini-ML variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is an alternate name for a variant that is computationally less costly-to-train. As indicated earlier, for each learning step j, Kobayashi teaches the learning control unit using the improvement rate $r_i^q$ to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) whether to store the model, its prediction performance P, and associated information including the hyper-parameter vector (Kobayashi Figure 25, step S132) or to continue with additional processing (Kobayashi Figure 25, step E, which points to Figure 24, step S114). Based on the results of the preceding claim limitation where the prediction performance P was not updated with the prediction performance of the new variant (i.e., the stopping time has elapsed, Kobayashi Figure 24, step S122 and [0266]-[0269]), it follows that the hyper-parameter vectors for the new variant (as well as the new variant model) are also not stored (selected) when executing the steps taught in Kobayashi Figure 25, steps S129-S132, thus representing a determination of not selecting the new hyper-parameter value for the “mini-ML variant” of the machine learning algorithm (Kobayashi [0278]-[0281]: “(S129) The learning control unit 135c determines whether the improvement rate $r_i^q$ is less than the threshold R. If the improvement rate $r_i^q$ is less than the threshold R, the operation proceeds to step S130. If the improvement rate $r_i^q$ is equal to or more than the threshold R, the operation proceeds to step S131. … (S130) The learning control unit 135c updates j to j+1. Next, the operation returns to step S123. … (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector θ used to learn the model …”).); 
modifying the at least one hyper-parameter from the at least one original hyper-parameter value to a next hyper-parameter thereby generating a next machine learning algorithm from the reference variant of the machine learning algorithm with the at least one hyper-parameter having the next hyper-parameter value (Examiner’s note: Under its broadest reasonable interpretation, the limitation “modifying the at least one hyper-parameter … to a next hyper-parameter value thereby generating another machine learning algorithm …” is interpreted as generating additional variants of the machine learning algorithm containing a modified hyper-parameter value. Kobayashi teaches a condition where additional learning steps can be initiated if the current learning step is completed within the given learning execution time $T_{i,j}^q$, where these additional learning steps (e.g., j=3,4,…) also execute the search region determination unit and the hyperparameter adjustment unit to determine different sets of modified hyperparameter vectors using grid or random search methods (Kobayashi [0280]: “… (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit … If the elapsed time has exceeded the time limit … Otherwise, the operation returns to step S114 …”; Figure 24, steps S117, S118 and [0262]-[0266]; [0265]-[0267]: “(S114) The learning control unit 135c selects a virtual algorithm … (S117) The search region determination unit 139 determines a search region that corresponds to the virtual algorithm $a_i^q$ (the machine learning algorithm $a_i$ and the learning time level q) and the sample size $s_j$. Namely, the search region determination unit 139 determines the hyperparameter vector set $\Phi_{i,j}^q$ in accordance with the above method. … (S118) The step execution unit 138c executes the j-th learning step of the virtual algorithm $a_i^q$. Namely, the hyperparameter adjustment unit 137c selects a hyperparameter vector included in the search region determined in step S117 or a hyperparameter vector near the hyperparameter vector. The step execution unit 138c applies the selected hyperparameter vector to the machine learning algorithm $a_i$ and learns a model by using training data having the sample size $s_j$. … The step execution unit 138c repeats the above processing for a plurality of hyperparameter vectors. The step execution unit 138c determines a model, the prediction performance $p_{i,j}^q$ and the execution time $T_{i,j}^q$ from the results of the learning not stopped. …”; and [0249]-[0250]).);
without performing cross-validation of the next machine learning algorithm on the training data set, determining that the next machine learning algorithm is less accurate than the new variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation, the limitation “without performing cross-validation of the next machine learning algorithm …” broadly recites using any method other than cross-validation to determine accuracy between different variants. As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, where each learning step (controlled by a learning control unit and step execution unit) produces representative machine learning models based on the same machine learning algorithm using a sample size $s_j$ from a training data set (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19 and [0216]-[0227]), and where the generated representative machine learning model produced by the step execution unit in learning step j=3 represents a “next variant of machine learning algorithm”. Kobayashi further teaches the step execution unit performing cross-validation or random sub-sampling methods to identify and learn a model based on the training data, the machine learning algorithm, and associated hyperparameter vector on each H iteration, and outputting the highest prediction performance (and its associated hyperparameter vector) out of H iterations to identify a representative machine learning model for the executed learning step. Kobayashi teaches that the determination of choosing either random sub-sampling or cross-validation is based on a determination of whether the sample size $s_j$ is larger than $\frac{2}{3}$ of the size of data set D, where the random sub-sampling method is selected for the scenario of having the sample size larger than $\frac{2}{3}$ of the size of data set D (Kobayashi [0218]-[0219]; [0225]). As indicated earlier, Kobayashi teaches that each prediction performance value is a measure of accuracy and is determined through a measurement index (representing accuracy criteria), such that each of these prediction performance values generated through H iterations of random sub-sampling is a representation of an accuracy value determined through one or more accuracy criteria (Kobayashi [0078]-[0079]; [0166]; and [0222]). Hence, the highest prediction performance value calculated and determined by the step execution unit at learning step j=3 represents a determination of a performance score representing an accuracy for a next variant of the machine learning algorithm (Kobayashi Figure 19 and [0216]-[0225]; and [0226]-[0227]), where this selected current prediction performance value $p_{i,j}^q$ is further compared with the achieved prediction performance value P (storing the maximum prediction performance value up to this point), and where the scenario of not storing the current prediction performance $p_{i,j}^q$ (i.e., the prediction performance of the new variant is not higher than the achieved prediction performance P) is taught in Kobayashi Figure 24, step S122, thus representing a determination that the next variant of the machine learning algorithm fails to meet the one or more accuracy criteria based on the performance score of the new variant (and hence being less accurate than the new variant) (Kobayashi [0266]-[0269]).).  
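For illustration only, the control flow that the above mapping attributes to Kobayashi (the update of the achieved prediction performance P at step S122, the branch at steps S129-S132, and the selection between random sub-sampling and cross-validation based on whether the sample size exceeds 2/3 of the size of data set D) can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not taken from Kobayashi or from Applicant's specification: the function names, the NextAction labels, and the example values are hypothetical, and the branch conditions simply follow the passages quoted above.

```python
from enum import Enum


class NextAction(Enum):
    """Hypothetical labels for the branches quoted from steps S129-S132."""
    ADVANCE_STEP = "S130: update j to j+1, return to S123"
    STORE_RESULT = "S132: store P, the model, and the hyperparameter vector"
    RESELECT_ALGORITHM = "return to S114"


def choose_validation_method(sample_size: int, dataset_size: int) -> str:
    """Pick random sub-sampling when the sample size exceeds 2/3 of |D|.

    Integer arithmetic (3*s > 2*|D|) avoids floating-point rounding at the
    boundary; otherwise cross-validation is used.
    """
    if 3 * sample_size > 2 * dataset_size:
        return "random_subsampling"
    return "cross_validation"


def update_achieved_performance(achieved_p: float, current_p: float):
    """Return (new P, updated?). P is replaced only by a strictly higher value."""
    if current_p > achieved_p:
        return current_p, True
    return achieved_p, False


def next_action(improvement_rate: float, threshold_r: float,
                elapsed: float, time_limit: float) -> NextAction:
    """Follow the quoted S129/S131 conditions in order."""
    if improvement_rate < threshold_r:      # S129 -> S130
        return NextAction.ADVANCE_STEP
    if elapsed > time_limit:                # S131 -> S132
        return NextAction.STORE_RESULT
    return NextAction.RESELECT_ALGORITHM    # S131 -> S114


if __name__ == "__main__":
    print(choose_validation_method(700, 1000))
    print(update_achieved_performance(0.80, 0.75))
    print(next_action(0.10, 0.05, 70.0, 60.0).name)
```

Under these assumptions, a learning step whose prediction performance does not exceed P leaves P unchanged, which is the scenario relied upon above for the determination that the next variant is less accurate.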
Regarding amended Claim 11, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, 
wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithm[[s]] for which a corresponding plurality of mini-ML, less costly, variants of the reference variant of machine learning algorithm[[s]] are generated, the corresponding plurality of mini-ML variants includes the new variant of the machine learning algorithm (Examiner’s note: As indicated earlier, this limitation exhibits both a 112(b) lack of antecedent basis issue as well as a 112(a) lack of written description issue, and hence for purposes of examination, this limitation will be interpreted based on the established antecedent basis of a single machine learning algorithm found in their respective parent independent Claims. Furthermore, under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is an alternate name for a variant that is computationally less costly to train, and hence this limitation broadly recites a process for generating a plurality of less computationally costly variants (“mini-ML variants”) based on one of a plurality of reference variants. Kobayashi teaches the step execution unit generating a plurality of variants during the cross-validation or random sub-sampling process involving H iterations (e.g., during learning step j=1, where these generated plurality of variants represent a plurality of reference variants with corresponding performance values, Kobayashi [0226]-[0227], thus corresponding to “… wherein the reference variant of the machine learning algorithm is one of a plurality of reference variants of machine learning algorithm[[s]] …”), with one of these plurality of variants with the highest prediction performance being selected as the reference variant for comparison in the next learning step j=2 (Kobayashi [0267]-[0269]). 
Hence, performing the steps described in the fifth embodiment (Kobayashi [0266]-[0269], [0278]-[0281] and the steps for the step execution unit described in Kobayashi [0218]-[0227]) for subsequent learning steps j=2,3,4,… produces new/modified variants that can be less computationally costly variants, and thus represents a process for generating a “… corresponding plurality of mini-ML, less costly, variants of the reference variant of machine learning algorithms are generated, the corresponding plurality of mini-ML variants includes the new variant of the machine learning algorithm …”.), and 
the method further comprising: 
receiving a request to determine expected performances of the plurality of reference variants for a particular training data set (Examiner’s note: Kobayashi teaches the learning control unit providing the machine learning algorithm, sample size, and hyperparameter search region to the step execution unit, where the step execution unit performs the H iterations of K-fold cross-validation for a particular machine learning algorithm and sample size, with each H iteration during learning step j=1 producing a “plurality of reference variants” (Kobayashi Figure 23, elements 135c, 138c and [0254]: “The learning control unit 135c specifies the machine learning algorithm $a_i$, the sample size $s_j$, the search region ($\Phi_{i,j}^q$) determined by the search region determination unit 139, and the stopping time $\phi_{i,j}^q$ to the step execution unit 138c.”; Figure 24, steps S118, S119, and [0266]-[0267]: “(S118) The step execution unit 138c executes the j-th learning step of the virtual algorithm $a_i^q$. Namely, the hyperparameter adjustment unit 137c selects a hyperparameter vector included in the search region determined in step S117 or a hyperparameter vector near the hyperparameter vector. The step execution unit 138c applies the selected hyperparameter vector to the machine learning algorithm $a_i$ and learns a model by using training data having the sample size $s_j$. … (S119) The learning control unit 135c acquires the learned model, the prediction performance $p_{i,j}^q$ thereof, the execution time $T_{i,j}^q$ from the step execution unit 138c.”; [0256]: “The step execution unit 138c executes learning steps one by one in the same way as in the fourth embodiment. …”; and [0224]-[0227]: “…The step execution unit 138 calculates the average value of the K prediction performances p … as a prediction performance $p_h$ that corresponds to the hyperparameter vector $\theta_h$. … (S80) The step execution unit 138 compares the number of times of the repetition of the above steps S71 to S79 with a threshold H and determines whether the former is less than the latter.”).); 
for each reference variant of the plurality of reference variants of the machine learning algorithm[[s]], selecting a respective mini-ML variant (Examiner’s note: As indicated earlier, this limitation exhibits both a 112(b) lack of antecedent basis issue as well as a 112(a) lack of written description issue, and hence for purposes of examination, this limitation will be interpreted based on the established antecedent basis of a single machine learning algorithm found in their respective parent independent Claims. Furthermore, under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is an alternate name for a variant that is computationally less costly to train, and hence this limitation broadly recites a process for generating a computationally less costly variant based on a reference variant. This limitation broadly recites the limitations performed in independent Claim 1 to determine a computationally less costly variant for each reference variant of a plurality of reference variants (where these limitations are based on the fourth embodiment of Kobayashi as mapped in Claim 1, Kobayashi [0256]: “The step execution unit 138c executes learning steps one by one in the same way as in the fourth embodiment. …”), and hence this limitation is rejected under similar rationale.); 
performing cross-validation of the respective mini-ML variant thereby generating a corresponding particular performance score for said each reference variant (Examiner’s note: Under its broadest reasonable interpretation, this limitation broadly recites a process for generating a performance score based on performing cross-validation for the computationally less costly variant of the machine learning algorithm, which is similar to corresponding limitations in Claim 3 (where the new variant is the computationally less costly mini-ML variant), and hence is rejected under similar rationale.); 
based on the corresponding particular performance score, determining whether said each reference variant of the plurality of reference variants, when trained by the particular training data set, yields most accuracy (Examiner’s note: Under its broadest reasonable interpretation, this limitation broadly recites that the generated performance score representing the computationally less costly model is compared against the earlier reference variant according to a threshold. As indicated earlier, Kobayashi teaches that, during learning step j=1, the step execution unit determines a set of prediction performance scores for a machine learning algorithm $a_i$ based on H iterations of performing K-fold cross-validation (each iteration corresponding to “the plurality of reference variants when trained by the particular training data set”), and selects the iteration with the highest prediction performance as the model representing this learning step j=1, where the prediction performance is a measure of accuracy (Kobayashi [0078]-[0079]). Performing the same set of steps at different learning steps j=2,3,4,… produces computationally less costly models that are compared with the highest prediction performance model from the previous learning step, and hence corresponds to generating performance scores representing the computationally less costly model and comparing the performance score against the earlier reference variant according to a threshold, where this threshold is represented by the earlier performance score for the reference variant (Kobayashi [0256]: “The step execution unit 138c executes learning steps one by one in the same way as in the fourth embodiment. …”; [0224]-[0227]: “…The step execution unit 138 calculates the average value of the K prediction performances p … as a prediction performance $p_h$ that corresponds to the hyperparameter vector $\theta_h$. … (S80) The step execution unit 138 compares the number of times of the repetition of the above steps S71 to S79 with a threshold H and determines whether the former is less than the latter. … (S81) The step execution unit 138 outputs the highest prediction performance among the prediction performances $p_1$, $p_2$, …, $p_H$ as the prediction performance $p_{i,j}$. In addition, the step execution unit 138 outputs a model that corresponds to the prediction performance $p_{i,j}$ among the models $m_1$, $m_2$, …, $m_H$.”; and [0266]-[0269]).).  
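For illustration only, the selection described in the quoted steps S80-S81 (H candidate hyperparameter vectors, each scored as the average of K cross-validation prediction performances, with the highest-scoring candidate output for the learning step) can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not Kobayashi's implementation: k_fold_indices, best_of_h_iterations, and toy_score are hypothetical names, scalar values stand in for the hyperparameter vectors, and toy_score is a stand-in for actually training and evaluating a model on a fold.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[List[int]]:
    """Split range(n_samples) into k contiguous validation folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for fold_no in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if fold_no < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def best_of_h_iterations(
    thetas: Sequence[float],
    n_samples: int,
    k: int,
    score_fold: Callable[[float, List[int]], float],
) -> Tuple[float, float]:
    """Return (theta, p) with the highest average K-fold score.

    Each entry of `thetas` plays the role of one of the H iterations; its
    score p_h is the mean of the K per-fold prediction performances.
    """
    folds = k_fold_indices(n_samples, k)
    best_theta, best_p = thetas[0], float("-inf")
    for theta_h in thetas:
        p_h = mean(score_fold(theta_h, fold) for fold in folds)
        if p_h > best_p:
            best_theta, best_p = theta_h, p_h
    return best_theta, best_p


def toy_score(theta: float, validation_fold: List[int]) -> float:
    """Hypothetical stand-in for train-and-evaluate; peaks at theta = 0.3."""
    return 1.0 - abs(theta - 0.3)


if __name__ == "__main__":
    print(best_of_h_iterations([0.1, 0.3, 0.9], n_samples=10, k=3,
                               score_fold=toy_score))
```

In this sketch the comparison across learning steps is left to the caller, who would compare the returned score with the score retained from the previous learning step.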
Regarding amended Claim 12, 
Claim 12 recites one or more non-transitory computer-readable media storing a sequence of instructions, which, when executed by one or more hardware processors, cause operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 1, and hence is rejected under similar rationale provided by Kobayashi as indicated in Claim 1. In addition, Kobayashi teaches the machine learning device containing one or more programs implementing the information processing of the machine learning device, where the one or more programs are recorded in a computer-readable recording medium (Kobayashi [0283]-[0284]: “…The information processing according to the fifth embodiment may be realized by causing the machine learning device 100c to execute a program. … An individual program may be recorded in a computer-readable recording medium (for example, the recording medium 113).”; and [0065]-[0066]: “The machine learning device 100 includes a CPU 101, a RAM 102, an HDD 103 … The CPU 101 is a processor which includes an arithmetic circuit that executes program instructions. The CPU 101 loads at least a part of programs or data held in the HDD 103 to the RAM 102 and executes the program. The CPU 101 may include a plurality of processor cores…”).
Regarding amended Claim 13, 
Claim 13 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 2, and hence is rejected under a similar rationale provided by Kobayashi as indicated in Claim 2, in view of the rejections applied to Claim 12.
Regarding previously presented Claim 14, 
Claim 14 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 3, and hence is rejected under a similar rationale provided by Kobayashi as indicated in Claim 3, in view of the rejections applied to Claim 12.
Regarding amended Claim 15, 
Claim 15 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 4, and hence is rejected under a similar rationale provided by Kobayashi as indicated in Claim 4, in view of the rejections applied to Claim 12.
Regarding previously presented Claim 16, 
Claim 16 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 5, and hence is rejected under a similar rationale provided by Kobayashi as indicated in Claim 5, in view of the rejections applied to Claim 12.
Regarding original Claim 17, 
Claim 17 recites the one or more non-transitory computer-readable media of Claim 16, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 6, and hence is rejected under a similar rationale provided by Kobayashi as indicated in Claim 6, in view of the rejections applied to Claim 16.
Regarding previously presented Claim 20, 
Claim 20 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 9, and hence is rejected under a similar rationale provided by Kobayashi as indicated in Claim 9, in view of the rejections applied to Claim 12.
Regarding amended Claim 21, 
Claim 21 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 10, and hence is rejected under a similar rationale provided by Kobayashi as indicated in Claim 10, in view of the rejections applied to Claim 12.
Regarding amended Claim 22, 
Claim 22 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 11, and hence is rejected under a similar rationale provided by Kobayashi as indicated in Claim 11, in view of the rejections applied to Claim 12.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 7-8 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Kobayashi et al., U.S. PGPUB 2017/0061329, published 3/2/2017 [hereafter referred to as Kobayashi] in view of Nguyen et al., U.S. PGPUB 2019/0042887, filed 8/6/2018 [hereafter referred to as Nguyen].
Regarding amended Claim 7, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising: 
determining a cost metric of the reference variant by measuring usage of computing resources for training of the reference variant on the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches using a learning execution time $T_{i,j}^{q}$ (corresponding to a “cost metric”) determined from a time estimation unit to determine whether a machine learning model is taking more computational resources to learn based on the hyperparameter vector being learned during the learning step, where it is interpreted that a machine learning model that completes training within a learning execution time and with a high predictive performance is considered less computationally costly than one that takes more learning execution time to train (Kobayashi [0231]). The determination based on this cost metric is performed by calculating an improvement rate $r_{i}^{q}$, which is based on the learning execution time and a performance improvement amount, where a performance improvement amount based on execution time represents a measurement based on usage of computing resources. Hence, performing a determination of this cost metric at learning step j=1 (corresponding to the “reference variant”) represents performing a determination of this cost metric of the reference variant (Kobayashi Figure 23, elements 133c, 134, 135c; Figure 25, element S128 and [0277]).) …
However, Kobayashi does not explicitly teach
… comparing the cost metric of the new variant of the machine learning algorithm with the cost metric of the reference variant; 
based on comparing the cost metric of the new variant of the machine learning algorithm with the cost metric of the reference variant, determining that the cost metric of the new variant of the machine learning algorithm is lower than the cost metric for the reference variant; 
based on determining that the cost metric of the new variant of the machine learning algorithm is lower than the cost metric for the reference variant, qualifying the new hyper-parameter value for the mini-ML variant of the machine learning algorithm.  
Nguyen teaches
… comparing the cost metric of the new variant of the machine learning algorithm with the cost metric of the reference variant (Examiner’s note: Nguyen teaches a training management module performing comparison of metrics received from the training of multiple models, where the metrics are used to estimate the effectiveness of the trained models. These metrics are based on resource usage costs (representing “cost metrics”), and hence this process involving comparisons of metrics to estimate the effectiveness of trained models corresponds to steps for “comparing the cost metric of the new variant … with the cost metric of the reference variant” (Nguyen Figure 5, elements 510, 512, 514, 518, 520, 522, 525; Figure 1, elements 120, 160; [0089]-[0093]: “…At 510, a plurality of training sets are obtained … At step 512, initial hyper parameter sets are determined. … At 514, training systems are invoked to train models based on the training sets and hyper parameter sets. … At 518, training of a plurality of predictive models is initiated … At step 520, an estimate of the effectiveness of each trained model is determined. If a threshold estimated effectiveness is not reached by any of the trained models, a new hyper parameter set is determined (step 522). … At step 525, a hyper-parameter set of the plurality of sets of hyper-parameters is selected, based on a measure of estimated effectiveness of the trained predictive models. At 530, a production predictive model is generated by training a predictive model using the selected candidate hyper-parameter set … ”; and [0075]-[0076]: “… techniques other than, or in addition to, cross-validation can be used to estimate the effectiveness. In one example, the resource usage costs for using the trained model can be estimated and can be used as a factor to estimate the effectiveness of the trained model. … Training management module 120 can compare the metrics received from each training system 160 to determine if a model should be selected or if an additional round of model training should occur and new hyper parameters generated (212).”).); 
based on comparing the cost metric of the new variant of the machine learning algorithm with the cost metric of the reference variant, determining that the cost metric of the new variant of the machine learning algorithm is lower than the cost metric for the reference variant (Examiner’s note: As indicated earlier, Nguyen teaches a training management module performing comparison of metrics received from the training of multiple models, where the metrics are used to estimate the effectiveness of the trained model based on resource usage costs for using the trained model. Nguyen further teaches an estimate of the effectiveness of each model is determined based on a threshold estimated effectiveness, where the comparison between resource usage costs reaches a threshold level of effectiveness or when the change in effectiveness between two rounds drops below a predefined threshold (indicating that the cost metric from one of the trained models must be lower than the cost metric for another trained model) (Nguyen Figure 5, elements 510, 512, 514, 518, 520, 522, 525; Figure 1, elements 120, 160; [0089]-[0093]; [0075]-[0076]; and [0078]: “… The predictive model generated by each training system 160 or effectiveness metric of the predictive model generated by each training system 160 … can be evaluated as discussed above. Rounds of model training can be repeated using new hyper parameters until a model reaches a threshold level of effectiveness or other condition is met. In some embodiments, training rounds can be repeated until the change in effectiveness between two rounds drops below a pre-defined threshold. In any event, whether preformed in multiple rounds or a single round the most performant hyper parameter set may be selected …”).); 
based on determining that the cost metric of the new variant of the machine learning algorithm is lower than the cost metric for the reference variant, qualifying the new hyper-parameter value for the mini-ML variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is an alternate name for a variant that is computationally less costly to train. As indicated earlier, Nguyen teaches a training management module performing comparison of metrics received from the training of multiple models, where the metrics are used to estimate the effectiveness of the trained model based on resource usage costs for using the trained model. Nguyen further teaches that an estimate of the effectiveness of each model is determined based on a threshold estimated effectiveness, where the result of the comparison between resource usage costs is used to select the hyperparameter that identifies the corresponding trained model based on determining this improved effectiveness of resource usage costs, resulting in the identified trained model associated with the selected hyperparameter as being the computationally less costly model (i.e., “the mini-ML variant”) (Nguyen Figure 5, elements 510, 512, 514, 518, 520, 522, 525; Figure 1, elements 120, 160; [0089]-[0093]; [0075]-[0076]; and [0078]: “… The predictive model generated by each training system 160 or effectiveness metric of the predictive model generated by each training system 160 … can be evaluated as discussed above. Rounds of model training can be repeated using new hyper parameters until a model reaches a threshold level of effectiveness or other condition is met. In some embodiments, training rounds can be repeated until the change in effectiveness between two rounds drops below a pre-defined threshold. In any event, whether preformed in multiple rounds or a single round the most performant hyper parameter set may be selected …”).).  
Both Kobayashi and Nguyen are analogous art since they both teach machine learning systems that train machine learning algorithms with sets of hyper-parameters.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the learning control unit as taught in Kobayashi and enhance it by directly comparing the cost metrics between two trained machine learning algorithms as taught in Nguyen as a way to select a trained machine learning algorithm using an associated set of hyperparameters that exhibits a higher prediction performance. The motivation to combine is taught in Nguyen, as a way to compare trained models in a distributed environment and to generate a model exhibiting a high performance with an optimal set of hyperparameters, thus improving the predictive performance of the system using the selected model (Nguyen [0025]: “ …the models may be trained with a specific set of hyper-parameters in a distributed way, and then performance of that specific set may be determined, and then the specific set may be adjusted, models may be re-trained using the adjusted set, and so on, until a stopping criterion is met, that may be based on amounts of improvements to the iteratively trained models (e.g., using a convergence criterion), in terms of performance of each iteration of trained models. In this way, the system may determine an optimal set of hyper-parameters that may yield the best predictions.”).
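As an illustrative sketch only (it is not drawn from Kobayashi, Nguyen, or Applicant’s specification), the comparison-and-qualification logic recited in Claim 7 may be pictured as follows. The names `Variant`, `cost_metric`, and `qualify_if_cheaper` are hypothetical, and the cost metric is assumed to be a single number such as measured training time in seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one trained variant of a machine learning algorithm."""
    hyper_parameter_value: float
    cost_metric: float  # assumption: measured training cost, e.g. seconds of training time


def qualify_if_cheaper(reference: Variant, new: Variant) -> Optional[float]:
    """Compare the two cost metrics and qualify the new hyper-parameter value
    only when the new variant is cheaper to train than the reference variant.
    Returns None when the new variant is not cheaper."""
    if new.cost_metric < reference.cost_metric:
        return new.hyper_parameter_value
    return None


# Made-up numbers: the reference variant takes 12.0 s to train, the new variant 7.5 s.
reference_variant = Variant(hyper_parameter_value=100, cost_metric=12.0)
new_variant = Variant(hyper_parameter_value=50, cost_metric=7.5)
qualified_value = qualify_if_cheaper(reference_variant, new_variant)
```

Under these assumed numbers the new hyper-parameter value is qualified because its variant has the lower cost metric; a new variant with an equal or higher cost metric would not be qualified.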
Regarding amended Claim 8, 
Kobayashi teaches
(Currently Amended) The method of Claim 1, further comprising: 
based on the cost metric of the new variant of the machine learning algorithm and comparing the performance score of the new variant of the machine learning algorithm for the training data set with the performance score of the reference variant of the machine learning algorithm for the training data set, qualifying the new hyper-parameter value for the mini-ML variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is an alternate name for a variant that is computationally less costly to train, and the phrase “qualifying the new hyper-parameter value for the mini-ML variant of the machine learning algorithm” broadly indicates identifying the new variant of the machine learning algorithm as being the “mini-ML variant” that satisfies the criterion of being computationally less costly. Hence, this limitation broadly recites identifying and storing the new variant of the machine learning algorithm (and its associated hyper-parameter value), where this new variant represents a “mini-ML variant” that satisfies the criterion of being computationally less costly. As indicated earlier, Kobayashi teaches using the improvement rate to check against a threshold R, and performing a determination (Kobayashi Figure 25, step S131) of whether to store the model, its prediction performance P (which was earlier determined as taught in Kobayashi [0267]-[0269]), and associated information including the hyper-parameter vector (step S132) or to continue with additional processing (step S114). Hence, the learning control unit performing a determination to store (qualify) the new variant of the machine learning algorithm and its associated prediction performance and hyper-parameter (e.g., at learning step j=2) represents the qualifying of the new hyperparameter value associated with a “mini-ML variant” model that represents a computationally less costly variant of the machine learning algorithm (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and [0278]-[0281]: “… (S131) The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit specified by the time limit input unit 131. If the elapsed time has exceeded the time limit, the operation proceeds to step S132. Otherwise, the operation returns to step S114. … (S132) The learning control unit 135c stores the achieved prediction performance P and the model that indicates the prediction performance in the learning result storage unit 123. … In addition, the learning control unit 135c stores the hyperparameter vector 𝛉 use to learn the model …”).); 
modifying the at least one hyper-parameter from the at least one original hyper-parameter value to another hyper-parameter value thereby generating another machine learning algorithm with the at least one hyper-parameter value having the other hyper-parameter value (Examiner’s note: Under its broadest reasonable interpretation, the limitation “modifying the at least one hyper-parameter … to another hyper-parameter value thereby generating another machine learning algorithm” is interpreted as generating another variant of the machine learning algorithm containing a modified hyper-parameter value. Kobayashi teaches a condition where additional learning steps can be initiated if the current learning step is completed within the given learning execution time $T_{i,j}^{q}$, where these additional learning steps (e.g., j=3,4,…) also execute the search region determination unit and the hyperparameter adjustment unit to determine different sets of modified hyperparameter vectors using grid or random search methods (Kobayashi [0280]: “… S131 The learning control unit 135c determines whether the time that has elapsed since the start of the machine learning has exceeded a time limit … If the elapsed time has exceeded the time limit … Otherwise, the operations returns to step S114 …”; Figure 24, steps S117, S118 and [0262]-[0266]: (Kobayashi [0265]-[0267]: “(S114) The learning control unit 135c selects a virtual algorithm … (S117) The search region determination unit 139 determines a search region that corresponds to the virtual algorithm $a_{i}^{q}$ (the machine learning algorithm $a_{i}$ and the learning time level q) and the sample size $s_{j}$. Namely, the search region determination unit 139 determines the hyperparameter vector set $\Phi_{i,j}^{q}$ in accordance with the above method. … (S118) The step execution unit 138c executes the j-th learning step of the virtual algorithm $a_{i}^{q}$. Namely, the hyperparameter adjustment unit 137c selects a hyperparameter vector included in the search region determined in step S117 or a hyperparameter vector near the hyperparameter vector. The step execution unit 138c applies the selected hyperparameter vector to the machine learning algorithm $a_{i}$ and learns a model by using training data having the sample size $s_{j}$. … The step execution unit 138c repeats the above processing for a plurality of hyperparameter vectors. The step execution unit 138c determines a model, the prediction performance $p_{i,j}^{q}$ and the execution time $T_{i,j}^{q}$ from the results of the learning not stopped. …”; and [0249]-[0250]).); 
comparing a performance score of the other machine learning algorithm for the training data set with the performance score of the reference variant of the machine learning algorithm for the training data set (Examiner’s note: As indicated earlier, Kobayashi teaches performing successive learning steps j=1,2,3,4,…, where each learning step performed by a learning control unit and step execution unit produces representative machine learning models based on the same machine learning algorithm using a sample size $s_{j}$ from a training data set (Kobayashi Figure 23 and [0256]; Figure 24 and [0261], [0266]-[0269], where the step execution unit 138c refers to Figure 19, [0216]-[0225]), and where the generated representative machine learning model produced by the step execution unit in learning step j=3 represents “the other machine learning algorithm”. Kobayashi teaches that at each learning step j, the learning control unit acquires the learned model from the output of the step execution unit, and performs a comparison of the current learned model’s prediction performance with the achieved performance P, where P represents the achieved prediction performance up to now (which is interpreted as being the highest prediction performance from earlier learning steps j=1 and j=2), and as such, this comparison represents a comparison with a “reference variant of machine learning algorithm” (Kobayashi [0267]-[0268]: “(S119) The learning control unit 135c acquires the learned model, the prediction performance $p_{i,j}^{q}$ thereof, the execution time $T_{i,j}^{q}$ from the step execution unit 138c. … (S120) The learning control unit 135c compares the prediction performance $p_{i,j}^{q}$ acquired in step S119 with the achieved prediction performance P (the maximum prediction performance achieved up until now) and determines whether the former is larger than the latter. …”).); 
based on a cost metric of the other algorithm and comparing the performance score of the other machine learning algorithm with the performance score of the reference variant of the machine learning algorithm, qualifying the other hyper-parameter value for the mini-ML variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is merely an alternate name for the new variant that is computationally less costly to train, and the phrase “qualifying the other hyper-parameter value for the mini-ML variant of the machine learning algorithm” broadly indicates identifying the other variant of the machine learning algorithm as being the “mini-ML variant” that satisfies the criterion of being computationally less costly. Hence, this limitation broadly recites identifying and storing the other variant of the machine learning algorithm (and its associated hyper-parameter value), where this other variant represents a “mini-ML variant” that satisfies the criterion of being computationally less costly. This claim limitation is functionally equivalent to the first recited limitation found in this claim, except that it references a different learning step (e.g., j=3, instead of j=2) in which to create the “other variant” of the machine learning algorithm, and hence this limitation is also rejected under a similar rationale and the claim mappings identified in the first claim limitation (Kobayashi Figure 23, element 135c, Figure 25, steps S129→S130→S131→S132 or S114, and [0278]-[0281]).) … 
However, Kobayashi does not explicitly teach
… based on the cost metric of the other algorithm and the cost metric of the new variant of the machine learning algorithm, determining whether to select the new hyper-parameter value or the other hyper-parameter value for the mini-ML variant of the machine learning algorithm.  
Nguyen teaches
… based on the cost metric of the other algorithm and the cost metric of the new variant of the machine learning algorithm, determining whether to select the new hyper-parameter value or the other hyper-parameter value for the mini-ML variant of the machine learning algorithm (Examiner’s note: Under its broadest reasonable interpretation in light of Applicant’s specification paragraph [0039], the term “mini-ML variant” is a merely an alternate name for the new variant that is computationally less costly-to-train. As indicated earlier, Nguyen teaches a training management module performing comparison of metrics received from the training of multiple models, where the metrics are used to estimate the effectiveness of trained models. These metrics are based on resource usage costs (representing “cost metrics”), and are used for selecting the hyperparameter set between one model and another based on the improved effectiveness, such that this process of applying resource usage metrics to determine whether to select a hyperparameter set between trained models (one of which represents a less costly trained model based on resource usage costs) correspond to steps for “determining whether to select the new hyper-parameter value or the other hyper-parameter value for the mini-ML variant of the machine learning algorithm” (Nguyen Figure 5, elements 510, 512, 514, 518, 520, 522, 525; Figure 1, elements 120, 160; [0089]-[0093]: “…At 510, a plurality of training sets are obtained … At step 512, initial hyper parameter sets are determined. … At 514, training systems are invoked to train models based on the training sets and hyper parameter sets. … At 518, training of a plurality of predictive models is initiated … At step 520, an estimate of the effectiveness of each trained model is determined. If a threshold estimated effectiveness is not reached by any of the trained models, a new hyper parameter set is determined (step 522). 
… At step 525, a hyper-parameter set of the plurality of sets of hyper-parameters is selected, based on a measure of estimated effectiveness of the trained predictive models. At 530, a production predictive model is generated by training a predictive model using the selected candidate hyper-parameter set … ”; and [0075]-[0076]: “… techniques other than, or in addition to, cross-validation can be used to estimate the effectiveness. In one example, the resource usage costs for using the trained model can be estimated and can be used as a factor to estimate the effectiveness of the trained model. … Training management module 120 can compare the metrics received from each training system 160 to determine if a model should be selected or if an additional round of model training should occur and new hyper parameters generated (212).”).).  
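For illustration only, and not as part of the cited disclosure, the selection flow quoted above from Nguyen (train candidate models, estimate effectiveness using a resource usage cost as one factor, generate new hyper-parameters when no model meets a threshold, then select a hyper-parameter set) can be sketched as follows. Every name here (`TrainedModel`, `effectiveness`, `select_hyper_params`, `cost_weight`, `max_rounds`) and the linear accuracy-minus-cost scoring rule are assumptions made for this sketch; Nguyen does not specify this formula or interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

HyperParams = Dict[str, float]


@dataclass
class TrainedModel:
    hyper_params: HyperParams
    accuracy: float        # hypothetical validation score in [0, 1]
    resource_cost: float   # hypothetical resource usage cost (arbitrary units)


def effectiveness(model: TrainedModel, cost_weight: float = 0.1) -> float:
    """Assumed scoring rule: reward accuracy, penalize resource usage cost."""
    return model.accuracy - cost_weight * model.resource_cost


def select_hyper_params(
    candidates: List[HyperParams],
    train: Callable[[HyperParams], TrainedModel],
    propose: Callable[[HyperParams], List[HyperParams]],
    threshold: float,
    max_rounds: int = 5,
) -> HyperParams:
    """Return the hyper-parameter set of the most effective model seen."""
    if not candidates:
        raise ValueError("at least one candidate hyper-parameter set is required")
    best = None
    for _ in range(max_rounds):
        # Train one model per candidate set (compare steps 514 and 518).
        for model in (train(hp) for hp in candidates):
            # Estimate and compare effectiveness (compare step 520).
            if best is None or effectiveness(model) > effectiveness(best):
                best = model
        if effectiveness(best) >= threshold:
            break
        # Threshold not reached: determine new sets (compare step 522).
        candidates = propose(best.hyper_params)
        if not candidates:
            break
    # Select the set of the most effective model (compare step 525).
    return best.hyper_params
```

In this sketch a cheaper model can be preferred over a more accurate one when the cost penalty outweighs the accuracy difference, which is the sense in which a cost metric drives the choice between two hyper-parameter values.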
Both Kobayashi and Nguyen are analogous art since they both teach machine learning systems that train machine learning algorithms with sets of hyper-parameters.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the learning control unit as taught in Kobayashi and enhance it by directly comparing the cost metrics between two trained machine learning algorithms as taught in Nguyen as a way to select a trained machine learning algorithm using an associated set of hyperparameters that exhibits a higher prediction performance. The motivation to combine is taught in Nguyen, as provided in the prior art claim mapping of Claim 7 recited above.
Regarding amended Claim 18, 
Claim 18 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 7, and hence is rejected under the same rationale and motivation provided by Kobayashi and Nguyen as indicated in Claim 7, in view of the rejections applied to Claim 12.
Regarding amended Claim 19, 
Claim 19 recites the one or more non-transitory computer-readable media of Claim 12, where the one or more non-transitory computer-readable media further comprise instructions causing operations comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 8, and hence is rejected under the same rationale and motivation provided by Kobayashi and Nguyen as indicated in Claim 8, in view of the rejections applied to Claim 12.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332. The examiner can normally be reached Monday-Friday 8:00am - 4:30pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WILLIAM WAI YIN KWAN/Examiner, Art Unit 2121
/DANIEL T PELLETT/Primary Examiner, Art Unit 2121