Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendment filed 2021-07-12 has been entered. Applicant’s amendments to the claims overcome each and every 112(b) rejection previously set forth in the Non-Final Office Action mailed 2021-06-17.
Allowable Subject Matter
The following is an examiner’s statement of reasons for allowance: 
Regarding independent claims 1 and 23, Hsu et al. (“A Practical Guide to Support Vector Classification”; hereinafter Hsu), Section 3.2 Para 4, discloses: “We recommend a ‘grid-search’ on C and g using cross-validation. Various pairs of (C, g) values are tried and the one with the best cross-validation accuracy is picked. We found that trying exponentially growing sequences of C and g is a practical method to identify good parameters (for example, C = 2^-5, 2^-3, …, 2^15, g = 2^-15, 2^-13, …, 2^3).” Here, Hsu discloses for each hyperparameter tuple of a plurality of hyperparameter tuples, as here a grid-search comprises iterating through (C, g) tuples of two hyperparameters, and each of these tuples contains a distinct plurality of values. Hsu also discloses a current value range for both C and g.
Hsu, Section 3.2 Para 7, discloses: “Since doing a complete grid-search may still be time-consuming, we recommend using a coarse grid first. After identifying a “better” region on the grid, a finer grid search on that region can be conducted. To illustrate this, we do an experiment on the problem “german” from the Statlog collection (Michie et al., 1994). After scaling this set, we first use a coarse grid (Figure 2) and find that the best (C, g) is (2^3, 2^-5) with the cross-validation rate 77.5%. Next we conduct a finer grid search on the neighborhood of (2^3, 2^-5) (Figure 3) and obtain a better cross-validation rate 77.6% at (2^3.25, 2^-5.25). After the best (C, g) is found, the whole training set is trained again to generate the final classifier.” Here, Hsu discloses calculating a score (“cross-validation rate”) over epochs (“coarse grid first…finer grid search”, wherein a cross-validation rate for each must be arrived at via training). While Hsu does not explicitly teach “all values of the plurality of hyperparameter tuples that belong to a same hyperparameter have a same value unless said same value is said particular hyperparameter”, it is obvious that any grid search may be performed in this way.
For example, in iterating over every (C, g), one could perform this in a particular order such as (C1, g1), (C1, g2),…,(C1, gn), (C2, g1), (C2, g2),…,(C2, gn), etc., such that only one hyperparameter is changed over each tuple, before moving on to the next set of tuples.  Also, while Hsu does not teach “that is not a categorical hyperparameter”, this is implied by Hsu.  While a true exhaustive grid search on a full range of discrete values can easily accommodate categorical hyperparameters, Hsu is narrowing down a range to find a better “region” or “neighborhood” on a grid.  As categorical hyperparameters may have no intrinsic ordering, a “region” or “neighborhood” in a geometric space would have no meaning, and thus it appears Hsu is not using categorical hyperparameters.
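The coarse-then-fine grid search and the iteration order discussed above can be sketched as follows. This is a minimal illustration only, not Hsu's implementation: `exponent_grid`, `grid_search`, and `toy_score` are hypothetical names, `toy_score` is a made-up stand-in for a cross-validation rate, and the width of the finer neighborhood (two exponent steps either side) is an assumption.

```python
import itertools
import math


def exponent_grid(start, stop, step):
    """Return [2**start, 2**(start + step), ..., 2**stop]."""
    count = int(round((stop - start) / step)) + 1
    return [2.0 ** (start + i * step) for i in range(count)]


def grid_search(c_values, g_values, score):
    """Score every (C, g) pair and return the best pair with its score.

    itertools.product varies g fastest, giving the order
    (C1, g1), (C1, g2), ..., (C1, gn), (C2, g1), ...
    """
    best_pair, best_score = None, float("-inf")
    for c, g in itertools.product(c_values, g_values):
        s = score(c, g)
        if s > best_score:
            best_pair, best_score = (c, g), s
    return best_pair, best_score


def toy_score(c, g):
    """Hypothetical stand-in for a cross-validation rate (not real data)."""
    return -((math.log2(c) - 3.25) ** 2 + (math.log2(g) + 5.25) ** 2)


# Coarse pass over the exponent sequences quoted from Hsu.
coarse_c = exponent_grid(-5, 15, 2)   # 2^-5, 2^-3, ..., 2^15
coarse_g = exponent_grid(-15, 3, 2)   # 2^-15, 2^-13, ..., 2^3
coarse_best, coarse_score = grid_search(coarse_c, coarse_g, toy_score)

# Finer pass on the neighborhood of the coarse optimum
# (assumed here to be +/- 2 in the exponent, in steps of 0.25).
c_exp = math.log2(coarse_best[0])
g_exp = math.log2(coarse_best[1])
fine_c = exponent_grid(c_exp - 2, c_exp + 2, 0.25)
fine_g = exponent_grid(g_exp - 2, g_exp + 2, 0.25)
fine_best, fine_score = grid_search(fine_c, fine_g, toy_score)
```

In this sketch both passes narrow C and g together from the best-scoring tuple; no step consults an intersection of lines.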
A distinction is found in the limitation “narrowing the current value range of the particular hyperparameter based on an intersection point of a first line that is based on said scores and a second line that is based on said scores”. Hsu, as shown above in Section 3.2 Para 7, discloses narrowing the current value range of the particular hyperparameter (“conduct a finer grid search”) based on said scores (“obtain a better cross-validation rate”). However, this narrowing is not based on an intersection point of a first line that is based on said scores and a second line that is based on said scores. One may attempt to argue the following:

[Figure: media_image1.png]


narrowing the current value range of the particular hyperparameter (“C”) is based on 4 lines, which comprise the intersection of a first and second line.  However, this is not a reasonable argument.  The intersection itself is completely inconsequential to the range of C.   All that matters for the range of the particular hyperparameter (“C”) are, quite obviously, the min and max values of C (1 to 5).  

[Figure: media_image2.png]

Thus, it is not a reasonable interpretation that Hsu teaches narrowing the current value range of the particular hyperparameter based on an intersection point of a first line that is based on said scores and a second line that is based on said scores.  The intersections drawn above merely indicate boundaries of the narrowed two-dimensional hyperparameter space as a whole, and are actually a consequence (and not a cause) of narrowing each particular hyperparameter.
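The point that the corner intersections follow from the per-hyperparameter ranges, rather than the reverse, can be illustrated with a short sketch. The function names are hypothetical, and the range of g is an assumed value chosen only for illustration; only the range of C (1 to 5) comes from the discussion above.

```python
from itertools import product


def corners_from_ranges(c_range, g_range):
    """Derive the four corner points of the region from per-hyperparameter (min, max) ranges."""
    return list(product(c_range, g_range))


def range_of_c(points):
    """Recover the range of C from a set of (C, g) points: only the min and max of C matter."""
    c_values = [c for c, _ in points]
    return min(c_values), max(c_values)


c_range = (1, 5)   # range of C stated in the action
g_range = (2, 4)   # hypothetical range of g, chosen only for illustration

# The corners are computed from the ranges, so they are a consequence of the narrowing.
corners = corners_from_ranges(c_range, g_range)

# Changing the range of g moves every corner but leaves the range of C untouched.
other_corners = corners_from_ranges(c_range, (0, 10))
```

Here `range_of_c` returns the same pair for `corners` and `other_corners`, which is the sense in which the intersections are inconsequential to the range of C.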
Bergstra et al. (“Random Search for Hyper-Parameter Optimization”; hereinafter Bergstra) discloses a random search instead of an exhaustive grid search. Bergstra, like Hsu, does not focus on one particular hyperparameter at a time, but rather looks at the multidimensional hyperparameter space as a whole. Bergstra, Pg 283 below the bullets, discloses “In this paper, we focus on random search, that is, independent draws from a uniform density from the same configuration space as would be spanned by a regular grid, as an alternative strategy for producing a trial set {λ(1) ... λ(S)}.” Thus, Bergstra samples multidimensional space {λ(1) ... λ(S)} like Hsu samples 2-dimensional space (C, g). Neither focuses on a particular hyperparameter individually. Moreover, narrowing the range by random sampling is certainly not performed by the intersection of two lines.
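Random search as characterized above, independent uniform draws from the same configuration space a grid would span, can be sketched as follows. This is an illustrative sketch and not Bergstra's code: `random_search` and `toy_score` are hypothetical names, the score surface is made up, and sampling the exponents (rather than the raw values) is an assumption.

```python
import random


def random_search(ranges, score, trials, seed=0):
    """Draw `trials` independent uniform points from the box `ranges` and keep the best."""
    rng = random.Random(seed)
    best_point, best_score = None, float("-inf")
    for _ in range(trials):
        # Every coordinate of the tuple is drawn at once; no single hyperparameter is isolated.
        point = tuple(rng.uniform(low, high) for low, high in ranges)
        s = score(point)
        if s > best_score:
            best_point, best_score = point, s
    return best_point, best_score


def toy_score(point):
    """Hypothetical score surface (not real data), peaked at exponents (3.0, -5.0)."""
    return -((point[0] - 3.0) ** 2 + (point[1] + 5.0) ** 2)


# Same configuration space a grid would span, expressed as log2 exponent ranges.
exponent_ranges = [(-5.0, 15.0), (-15.0, 3.0)]
best_point, best_score = random_search(exponent_ranges, toy_score, trials=200)
```

Each trial is a whole tuple, so the sketch has no notion of a "particular hyperparameter" whose range is narrowed on its own.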
Another method of hyperparameter optimization is Bayesian optimization. Sarkar et al. (US 2019/0095785 A1; hereinafter Sarkar) discloses Bayesian optimization. Sarkar, Para [0034], discloses: “Regardless, in an embodiment, the data analytics service 206 utilizes compute resources 210 to run a Bayesian optimization algorithm that generates one or more suggested hyperparameter values based on outputs generated from running a machine-learning algorithm for one epoch (e.g., a first portion among many of a machine-learning process) and modifies the hyperparameter to the suggested value for a second epoch subsequent to the first epoch—in some cases, a machine-learning process runs for two epochs (i.e., the first epoch using randomly selected values and the second epoch using suggested values from a Bayesian optimization algorithm), and in some cases, the process runs for several epochs in which subsequent outputs are used by the optimization algorithm to iteratively generate more refined suggestions for the hyperparameter values”. Here, Sarkar discloses using epochs to narrow the range (“generate more refined suggestions”) of the hyperparameters. Sarkar, Para [0053], last sentence, discloses: “In an embodiment, a client receives the results, and determines whether to further adjust one or more hyperparameters to improve the quality of the training in subsequent training runs.” Here, again, Sarkar discloses treating the multidimensional hyperparameter space as a whole (“adjust one or more hyperparameters”), and not specifically adjusting a particular hyperparameter. Even if this were done on a particular hyperparameter, a closer look at Bayesian optimization reveals that it is not based on an intersection between a first line and a second line.
Brochu et al. (“A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning”; hereinafter Brochu), Abstract, discloses: “Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments—active user modelling with preferences, and hierarchical reinforcement learning— and a discussion of the pros and cons of Bayesian optimization based on our experiences.” Here, Brochu discloses that Bayesian optimization works by finding suggested points for improvement based on a probability distribution, but not on an intersection between a first line and a second line. Brochu, Figure 5, shows the suggested sampling areas in one dimension:

[Figure: media_image3.png]

And Brochu, Figure 6, shows in two dimensions:

[Figure: media_image4.png]

Here, again, in the two-dimensional case it is clear that Bayesian optimization searches for hyperparameter tuples by looking holistically at the multidimensional space, and not individually by particular hyperparameter.
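The utility-based selection described in the Brochu abstract can be sketched in a few lines, assuming the posterior mean and standard deviation at each candidate tuple are already available. An upper-confidence-bound style utility is used here as one common choice; it is not asserted to be the utility used by Sarkar. All names and numeric values below are hypothetical.

```python
def upper_confidence_bound(mean, std, kappa):
    """Utility that trades off exploitation (mean) against exploration (std)."""
    return mean + kappa * std


def suggest_next(candidates, means, stds, kappa=2.0):
    """Return the candidate tuple with the highest utility."""
    utilities = [upper_confidence_bound(m, s, kappa) for m, s in zip(means, stds)]
    best_index = max(range(len(candidates)), key=lambda i: utilities[i])
    return candidates[best_index]


# Hypothetical posterior summaries at three (C, g) exponent tuples; not real data.
candidates = [(1.0, -7.0), (3.0, -5.0), (9.0, -1.0)]
means = [0.70, 0.75, 0.60]
stds = [0.01, 0.02, 0.15]

# With kappa = 2.0 the most uncertain tuple wins (exploration).
explore = suggest_next(candidates, means, stds, kappa=2.0)
# With kappa = 0.0 the tuple with the best predicted score wins (exploitation).
exploit = suggest_next(candidates, means, stds, kappa=0.0)
```

The suggestion is a complete tuple selected from a probabilistic utility over the joint space, consistent with the holistic search noted above.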
	Maclaurin et al. (“Gradient-based Hyperparameter Optimization through Reversible Learning”; hereinafter Maclaurin) discloses another method of hyperparameter optimization, in this case a gradient-based approach.  Maclaurin, Figure 1, discloses the following:

[Figure: media_image5.png]

Here, it is clear that Maclaurin also samples from the multidimensional hyperparameter space as a whole, and does not individually optimize each particular hyperparameter, nor is this done by the intersection of a first line and a second line.
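A gradient-based update of this general kind can be sketched as follows. Note the substitution: Maclaurin obtains hyperparameter gradients by differentiating through the training procedure, whereas this sketch swaps in a central finite-difference estimate on a made-up validation loss purely to show that all hyperparameters move together in each step. Every name and value is hypothetical.

```python
def numeric_gradient(loss, point, eps=1e-6):
    """Central finite-difference estimate of the gradient of `loss` at `point`."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += eps
        down[i] -= eps
        grad.append((loss(up) - loss(down)) / (2 * eps))
    return grad


def gradient_step(loss, point, learning_rate=0.1):
    """Move every hyperparameter at once along the negative gradient."""
    grad = numeric_gradient(loss, point)
    return [p - learning_rate * g for p, g in zip(point, grad)]


def toy_validation_loss(h):
    """Hypothetical validation loss (not real data), minimized at (1.0, -2.0)."""
    return (h[0] - 1.0) ** 2 + (h[1] + 2.0) ** 2


hyperparameters = [0.0, 0.0]
for _ in range(100):
    hyperparameters = gradient_step(toy_validation_loss, hyperparameters)
```

Each step updates the full hyperparameter vector from a gradient; no range is narrowed and no line intersection is computed.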
	In summary, Examiner has determined that for the following existing methods of hyperparameter optimization:
Grid Search
Random Search
Bayesian Optimization
Gradient-Based
None of them, nor any variations thereof, anticipate any reasonable interpretation of narrowing the current value range of the particular hyperparameter based on an intersection point of a first line that is based on said scores and a second line that is based on said scores.  
For at least these reasons, independent claims 1 and 23 are allowable over the prior art of record. Claims 2-22 and 24-44 are allowable by virtue of their dependence from their respective base claims.
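For orientation only, the geometric operation named in the distinguishing limitation, finding the intersection point of two lines that are each based on scores, can be sketched generically. This action does not describe how the application constructs its lines, so the sketch below is one hypothetical reading and not the claimed method: each line is simply drawn through two assumed (hyperparameter value, score) points, and all names and values are invented.

```python
def line_through(p, q):
    """Return (slope, intercept) of the non-vertical line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def intersection(line_a, line_b):
    """Return the (x, y) intersection of two lines, or None if they are parallel."""
    (m1, b1), (m2, b2) = line_a, line_b
    if m1 == m2:
        return None
    x = (b2 - b1) / (m1 - m2)
    return x, m1 * x + b1


# Hypothetical (hyperparameter value, score) points; not taken from the application.
rising = line_through((1.0, 0.5), (2.0, 0.7))    # scores improving as the value grows
falling = line_through((4.0, 0.7), (5.0, 0.5))   # scores degrading as the value grows
crossing = intersection(rising, falling)          # approximately (3.0, 0.9)
```

None of the grid, random, Bayesian, or gradient-based approaches summarized above performs a computation of this form on a single hyperparameter's scores.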
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Allowable Subject Matter
Claims 1-44 are allowed over the prior art of record.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD A SIEGER whose telephone number is (571)272-9710.  The examiner can normally be reached on M-F 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached on (571) 272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/L.A.S./Examiner, Art Unit 2126   
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126