DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-19 are pending in the application.
Examiner’s Note: The examiner has cited particular passages including column and line numbers, paragraphs as designated numerically and/or figures as designated numerically in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claims, other passages, paragraphs and figures of any and all cited prior art references may apply as well. It is respectfully requested from the applicant, in preparing an eventual response, to fully consider the context of the passages, paragraphs and figures as taught by the prior art and/or cited by the examiner while including in such consideration the cited prior art references in their entirety as potentially teaching all or part of the claimed invention. MPEP 2141.02 VI: “PRIOR ART MUST BE CONSIDERED IN ITS ENTIRETY, INCLUDING DISCLOSURES THAT TEACH AWAY FROM THE CLAIMS.”

Response to Arguments and Amendment
Regarding the Priority Claim:
Applicant argues that, because the examiner has not presented any evidence showing why one of ordinary skill in the art would not recognize the claimed invention in the disclosure of the provisional application, the claimed subject matter is adequately supported and enabled by U.S. Provisional Patent Application No. 62/697,578.
Note: Under 35 U.S.C. 119(e), the written description and drawing(s) (if any) of the provisional application must adequately support and enable the subject matter claimed in the nonprovisional application that claims the benefit of the provisional application. In New Railhead Mfg., L.L.C. v. Vermeer Mfg. Co., 298 F.3d 1290, 1294, 63 USPQ2d 1843, 1846 (Fed. Cir. 2002), the court held that for a nonprovisional application to be afforded the benefit date of the provisional application, "the specification of the provisional must ‘contain a written description of the invention and the manner and process of making and using it, in such full, clear, concise, and exact terms,’ 35 U.S.C. 112¶1, to enable an ordinarily skilled artisan to practice the invention claimed in the nonprovisional application."
In the instant case, the provisional application does not provide any drawings, and its written description is directed to multitask experiments useful in situations where fast approximations to the true metric under consideration are available. Thus, there is nothing in the written description of the provisional application that requires rebuttal or explanation. One of ordinary skill in the art would clearly see that the provisional application is fatally defective in supporting the claims of the later-filed application. The examiner fails to find, in the written description of the provisional application, any teaching that is remotely related to a system to accelerate tuning of hyperparameters. Therefore, the claims of the instant application are not entitled to the benefit of the prior-filed application under 35 U.S.C. 119(e).
Regarding the 103 rejection:
Applicant’s representative simply dismisses the teachings of Walters and Basu because the filing date of the references is after the priority date of the instant application. However, it is clear that the instant application has not complied with one or more conditions for receiving the benefit of an earlier filing date under 35 U.S.C. 119(e). Therefore, Walters and Basu still qualify as prior art to the instant application.
Since applicant has not argued against the combined teachings of Walters and Basu with regard to the newly amended claims, the examiner concludes that applicant’s arguments against the combination of Walters and Basu are unpersuasive.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365(c) is acknowledged. Receipt is acknowledged of papers submitted under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365(c), which papers have been placed of record in the file. 
Applicant has not complied with one or more conditions for receiving the benefit of an earlier filing date under 35 U.S.C. 119(e) as follows:
The later-filed application must be an application for a patent for an invention which is also disclosed in the prior application (the parent or original nonprovisional application or provisional application). The disclosure of the invention in the parent application and in the later-filed application must be sufficient to comply with the requirements of the first paragraph of 35 U.S.C. 112.  See Transco Products, Inc. v. Performance Contracting, Inc., 38 F.3d 551, 32 USPQ2d 1077 (Fed. Cir. 1994).
The disclosure of the prior-filed application, Application No. 62/697,578, fails to provide adequate support or enablement in the manner provided by the first paragraph of 35 U.S.C. 112 for one or more claims of this application. Any claims pertaining to Figures 1-4 are not entitled to the benefit of the prior-filed application under 35 U.S.C. 119(e) (Provisional Application No. 62/697,578).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 8-10, 12-13, 15, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Basu et al. US Pub. No. 2020/0226496 (“Basu”) in view of Walters et al. US Pub. No. 2020/0302234 (“Walters”).
Regarding claim 1, Basu teaches a system to accelerate tuning of hyperparameters, the system comprising:
a memory [206];
programmable circuitry [204]; and
instructions in the memory to cause the programmable circuitry [Hyperparameter Tuning Server 110 – see Fig. 1 and 2; para. 0056] to:
access a multi-task tuning work request to tune hyperparameters of a model of a subscriber to a tuning service, the multi-task tuning work request to include:
[0064] The model selection application 214 is configured to provide one or more graphical user interfaces that allow the user of the one or more client devices 104-108 to select a machine-learning model from one or more machine-learning model(s) 232. Examples of trainable machine-learning model(s) 232 include, but are not limited to, Nearest Neighbor, Naïve Bayes, Decision Trees, Linear Regression, Support Vector Machines (SVM), and neural networks.

[0065] Furthermore, each of the trainable machine-learning model(s) 232 may be associated with one or more hyperparameters that define the corresponding machine-learning model. Examples of hyperparameters include, but are not limited to, a learning rate, a minimal loss reduction, maximal depth, minimum sum of instance weight for a child, subsample ratio, subsample ratio of columns for a tree, subsample ratio of columns for each split, and one or more regularization terms. In selecting a machine-learning model via the model selection application 214, the hyperparameter tuning server 110 may automatically select one or more hyperparameters 226 associated with the selected machine-learning model 232 to optimize. Additionally, and/or alternatively, a user may specify which of the hyperparameters 226 associated with a selected machine-learning model 232 to optimize.

[0066] The evaluation application 216 is configured to instruct the master server 112 to optimize one or more of the hyperparameters 226 for a selected machine-learning model 232. In addition, the evaluation application 216 may communicate one or more tuning parameter(s) 228 to the master server 112 that define the optimization parameters for optimizing the hyperparameters 226. For example, the tuning parameters 228 may include the predetermined sampling percentage value (e.g., the value of η), an upper limit value (e.g., the value of N), the performance metric and/or quality metric function to determine (e.g., f(x)), one or more values that define X, the kernel function k_θ, or any combination thereof.

(i) a full tuning task [80% to 100%] to tune the hyperparameters of the model, the full tuning task to include first tuning parameters [user select – see Para. 0065] governing a first tuning operation of the tuning service; and
(ii) a partial tuning task [50%] to tune the hyperparameters of the model, the partial tuning task to include second tuning parameters [user select – see Para. 0065] governing a second tuning operation of the tuning service;
[0068] The predetermined convergence percentage indicates the percentage of hyperparameter values that are to satisfy the convergence threshold for the master server 112 to affirmatively determine that the hyperparameter values have converged on corresponding values. In one embodiment, it may be sufficient for a majority of the hyperparameter values to be converging (e.g., 50%). In another embodiment, a user may desire that a supermajority of hyperparameter values have converged (e.g., 60%, 70%, 80%, etc.) on corresponding values. Accordingly, the predetermined convergence percentage may be configurable by the user (e.g., through the evaluation application 216), may be programmed as a default value into the master server 112 and/or the hyperparameter tuning server 110, or combinations thereof. In referring to Algorithm 2, above, the convergence threshold and the predetermined convergence percentage may be used by the master server 112 (or other computing device) at step twelve.
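
For illustration only, the convergence test described in paragraph [0068] can be sketched as below. This is a minimal sketch and not Basu's Algorithm 2: the function name `has_converged`, the use of the absolute change between two successive iterations as the distance measure, and all numeric values are assumptions made for this example.

```python
def has_converged(previous, current, threshold, required_fraction):
    """Decide whether enough hyperparameter values have settled.

    A value is counted as converged when it changed by no more than
    `threshold` between two successive iterations (assumed measure).
    `required_fraction` plays the role of the predetermined convergence
    percentage, e.g. 0.5 for a majority or 0.8 for a supermajority.
    """
    if not current or len(previous) != len(current):
        raise ValueError("need two non-empty value lists of equal length")
    converged = sum(
        1 for old, new in zip(previous, current) if abs(new - old) <= threshold
    )
    return converged / len(current) >= required_fraction


# Hypothetical values: three of four hyperparameters moved by <= 0.01.
previous_values = [0.10, 3.0, 0.50, 7.0]
current_values = [0.101, 3.0, 0.90, 7.001]
majority_ok = has_converged(previous_values, current_values, 0.01, 0.5)
supermajority_ok = has_converged(previous_values, current_values, 0.01, 0.8)
```

With these hypothetical values, 75% of the hyperparameters satisfy the threshold, so a 50% requirement is met while an 80% requirement is not.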

[0069] In one embodiment, the tuning parameters 228 and their corresponding values are predetermined. In another embodiment, the user may provide the hyperparameter tuning parameter server 110 with one or more tuning parameter values 228 via one or more client devices 104-108 and the evaluation application 216. Depending on the flexibility afforded to the user, the user may also select which of the tuning parameter value(s) that he or she will provide. In this manner, the hyperparameter tuning server 110 may include default values for one or more of the tuning parameter(s) 228, the user may provide one or more values for the tuning parameter(s) 228, or a combination of default values and user-provided values may be used in optimizing one or more of the hyperparameter(s) 226.

[0070] Using the tuning parameter values, the selected hyperparameter(s) 226, and a selected machine-learning model 232, the master server 112 instructs the one or more execution servers 114 to execute the kernel function and/or quality metric function over a sample of the training data selected from the database 116. As discussed above, the amount of training data to sample may be predetermined by the sampling percentage value η.

[0077] The hyperparameter tuning server 110 then receives a selection of tuning parameters to use in optimizing the selected machine-learning model (Operation 306). The hyperparameter tuning server 110 may also receive one or more of the tuning values from the user via one or more client devices 104-108. Additionally and/or alternatively, the hyperparameter tuning server 110 may be preconfigured or programmed with default values for the tuning parameters 228. In some instances, the user may be unable to define or provide the values for the tuning parameters 228. As discussed above with reference to FIG. 2, examples of tuning parameters include the predetermined sampling percentage value (e.g., the value of η), an upper limit value (e.g., the value of N), the performance metric and/or quality metric function to determine (e.g., f(x)), one or more values that define X, the kernel function k_θ, and a predetermined distance or convergence threshold.

execute the first tuning operation of the full tuning task based on the first tuning parameters [see Fig. 3 – step 332; Para. 0091]; 
execute the second tuning operation of the partial tuning task based on the second tuning parameters [see Fig. 3 – step 332; Para. 0091];
[0065] Furthermore, each of the trainable machine-learning model(s) 232 may be associated with one or more hyperparameters that define the corresponding machine-learning model. Examples of hyperparameters include, but are not limited to, a learning rate, a minimal loss reduction, maximal depth, minimum sum of instance weight for a child, subsample ratio, subsample ratio of columns for a tree, subsample ratio of columns for each split, and one or more regularization terms. In selecting a machine-learning model via the model selection application 214, the hyperparameter tuning server 110 may automatically select one or more hyperparameters 226 associated with the selected machine-learning model 232 to optimize. Additionally, and/or alternatively, a user may specify which of the hyperparameters 226 associated with a selected machine-learning model 232 to optimize.

[0066] The evaluation application 216 is configured to instruct the master server 112 to optimize one or more of the hyperparameters 226 for a selected machine-learning model 232. In addition, the evaluation application 216 may communicate one or more tuning parameter(s) 228 to the master server 112 that define the optimization parameters for optimizing the hyperparameters 226. For example, the tuning parameters 228 may include the predetermined sampling percentage value (e.g., the value of η), an upper limit value (e.g., the value of N), the performance metric and/or quality metric function to determine (e.g., f(x)), one or more values that define X, the kernel function k_θ, or any combination thereof.

[0070] Using the tuning parameter values, the selected hyperparameter(s) 226, and a selected machine-learning model 232, the master server 112 instructs the one or more execution servers 114 to execute the kernel function and/or quality metric function over a sample of the training data selected from the database 116. As discussed above, the amount of training data to sample may be predetermined by the sampling percentage value η.

[0078] The hyperparameter tuning server 110 may then receive a selection of one or more hyperparameters 226 to optimize (Operation 308). Accordingly, in one embodiment, the user selects or indicates which hyperparameter 226 to optimize. Additionally and/or alternatively, the hyperparameter tuning server 110 may be preconfigured or programmed to optimize particular hyperparameters.

[0079] The hyperparameters to optimize may be specific to the selected machine-learning model. Accordingly, depending on which machine-learning model is selected, the hyperparameter tuning server 110 may optimize different sets of hyperparameters. By way of example, and without limitation, examples of hyperparameters include a learning rate, a minimal loss reduction, maximal depth, minimum sum of instance weight for a child, subsample ratio, subsample ratio of columns for a tree, subsample ratio of columns for each split, and one or more regularization terms.
[see further para. 0091]


generate a first suggestion set, the first suggestion set to include one or more first proposed values for the hyperparameters based on the execution of the full tuning task [see step 334 of Fig. 3D; Para. 0092]; and
generate a second suggestion set, the second suggestion set to include one or more second proposed values for the hyperparameters based on the execution of the partial tuning task [see step 334 of Fig. 3D; Para. 0092].
[0074] The result of the optimization process yields the hyperparameter value(s) 230. The hyperparameter tuning server 110 may then communicate the hyperparameter value(s) 230 to one or more of the client device(s) 104-108. The user may then use the hyperparameter value(s) 230 in implementing the corresponding machine-learning model selected from the machine-learning model(s) 232.

[see further discussed paragraph related to Fig. 3A to 3D]

Basu does not teach wherein after an identified performance metric of the model using the one or more second proposed values for the hyperparameters corresponding to the partial tuning task satisfies a performance threshold, set the partial tuning task as a proxy for the full tuning task to accelerate a tuning of the hyperparameters of the model.
Walters teaches systems and methods for generating machine-learning models, and more particularly, systems that efficiently generate machine-learning models by estimating minimum data requirements for training models and tuning hyper-parameters.
Specifically, Walters teaches wherein after an identified performance metric of the model using the one or more proposed values for the hyperparameters derived from the execution of a partial tuning task [Step 712 of Fig. 7 – Tune hyper-parameters for secondary models using only the limited training data] satisfies a performance threshold [Step 714 – assess the accuracy], set the partial tuning task as a proxy for the full tuning task [Step 720 – generate a database entry associating data category, model type, number of samples, and tuned hyper-parameters; see also Fig. 9] to accelerate a tuning of the hyperparameters of the model.
[0105] In step 712, hyper-parameters of the secondary models may be tuned. For example, each one of the models from the secondary sequences may be passed through a hyper-parameter tuning process. For example, optimization system 105 may perform a grid search by running each one of the secondary models for each setting of parameters. Alternatively, or additionally, optimization system 105 may perform a random search of hyper-parameters for each one of the secondary models and/or perform a Bayesian optimization or an evolutionary optimization. Step 712 may be performed concurrently with step 710. For example, as the sequence of progressively less data is being used, models may be run through the tuning operations.
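
For illustration only, the grid search mentioned in paragraph [0105] ("running each one of the secondary models for each setting of parameters") can be sketched as below. This is a generic exhaustive grid search, not code from Walters: the names `grid_search` and `toy_score`, the hyperparameter names, and the toy objective standing in for model training and validation are all assumptions.

```python
from itertools import product


def grid_search(score, grid):
    """Score every combination of values in `grid`; return the best one.

    `grid` maps a hyperparameter name to its candidate values, and
    `score` maps a dict of chosen values to a number (higher is better).
    """
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_score(params):
    # Hypothetical stand-in for training and validating a model;
    # it peaks at learning_rate = 0.1 and max_depth = 4.
    return -((params["learning_rate"] - 0.1) ** 2) - (params["max_depth"] - 4) ** 2


best, value = grid_search(
    toy_score,
    {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 4, 8]},
)
```

The cost grows with the product of the candidate counts (nine evaluations here), which is one reason the same paragraph lists random search and Bayesian optimization as alternatives.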

[0106] In step 714 optimization system 105 may assess the accuracy of each one of the secondary models. In step 714, optimization system may use validation data to determine the accuracy of each secondary model. For example, once hyper-parameters have been tuned in step 714 for secondary models based on the CNN for the monthly transactions data category, optimization system 105 may determine the accuracy of the model by comparing inputs of validation data with the known outputs for the validation data.

[0108] In step 716, based on the secondary models and their estimated accuracy, optimization system 105 may identify a minimum viable model from each sequence of secondary models. The minimum viable model may include the model that uses the smallest training dataset, identified from the progressively smaller datasets, and still achieves the desired accuracy threshold as evaluated in step 718. For example, with three data categories and four model types, optimization system would identify at least twelve minimum viable models, one for each category data profile and each type of model generated from the category data profile.
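
For illustration only, the selection of a minimum viable model in paragraph [0108] can be sketched as below. This is a minimal sketch under stated assumptions: the function name `minimum_viable_model`, the representation of each secondary model as a (number of samples, accuracy) pair, and the sample figures are hypothetical and do not come from Walters.

```python
def minimum_viable_model(results, accuracy_threshold):
    """Pick the smallest-dataset model that still meets the threshold.

    `results` is a list of (num_samples, accuracy) pairs, one per
    secondary model trained on a progressively smaller dataset.
    Returns the qualifying pair with the fewest samples, or None when
    no model reaches `accuracy_threshold`.
    """
    viable = [entry for entry in results if entry[1] >= accuracy_threshold]
    if not viable:
        return None
    return min(viable, key=lambda entry: entry[0])


# Hypothetical sequence of secondary models for one data category.
secondary_models = [(10000, 0.99), (5000, 0.985), (2500, 0.98), (1250, 0.93)]
chosen = minimum_viable_model(secondary_models, 0.98)
```

In this hypothetical sequence the 2,500-sample model is chosen, because the 1,250-sample model falls below the 0.98 threshold.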

[0132] In step 912, optimization system may determine a minimum number of samples for the target model requested in step 902 using the user data as the training dataset. Once optimization system 105 identifies the data profile of the user data is related to a data category, such as the ones determined for step 704 (FIG. 7), optimization system 105 may request information about minimum data requirements for sample models. In some embodiments, step 912 may involve retrieving, from database 180, a minimum number of samples for the data profile based on: the data profile most closely related to the user data profile, and the target model type. For example, if in step 910 optimization system associated the user data with a monthly data profile, in step 912 optimization system may review the minimum required number of samples for viable models identified in process 800. Because the data profiles are similar, it is possible to estimate that the sample models would behave similarly to the user data, allowing the determination of minimum data requirements for the user requested model based on the sample models. For example, if the user requested a CNN with 98% accuracy and the user data indicates the data profile similar to yearly transactions, optimization system 105 may identify the minimum viable model for a CNN based on sample models created with yearly data profiles that achieve a 98% accuracy. Because the data profiles are similar and the user is requesting a CNN, the minimum data requirements for viable models and the tuned hyper-parameters discovered for the sample CNN could be applied, at least as a starting point, to the new dataset. This estimation of minimum number of samples required for the target model may save resources and improve efficiency in the generation of machine-learning models.
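
For illustration only, the retrieval described in paragraph [0132] (looking up stored minimum data requirements and tuned hyper-parameters by data profile and model type, then using them as a starting point) can be sketched as below. The in-memory dictionary standing in for database 180, the `Entry` record, the function name `lookup_starting_point`, and all stored values are assumptions made for this example.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    min_samples: int       # samples used by the stored sample model
    accuracy: float        # accuracy that sample model achieved
    hyperparameters: dict  # tuned hyper-parameters saved with it


def lookup_starting_point(store, data_profile, model_type, target_accuracy):
    """Return the stored entry needing the fewest samples that still
    meets `target_accuracy`, or None if no stored entry qualifies."""
    candidates = [
        entry
        for entry in store.get((data_profile, model_type), [])
        if entry.accuracy >= target_accuracy
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda entry: entry.min_samples)


# Hypothetical stored results keyed by (data profile, model type).
store = {
    ("yearly", "CNN"): [
        Entry(4000, 0.98, {"learning_rate": 0.01}),
        Entry(9000, 0.99, {"learning_rate": 0.005}),
    ],
}
start = lookup_starting_point(store, "yearly", "CNN", 0.98)
```

The returned hyper-parameters would serve only as an initial setting for the new dataset, consistent with the "at least as a starting point" wording of the paragraph.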

In summary, Walters teaches a system and method for tuning hyper-parameters with minimum iterations and dataset size (partial tuning). Specifically, Walters teaches that, after an identified performance metric of the model using the one or more proposed values for the hyperparameters corresponding to the partial tuning task satisfies a performance threshold, the partial tuning task is saved (as a proxy) for a subsequent full tuning task, thus accelerating a tuning of the hyperparameters of the model.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to have modified the system of Basu so that, after an identified performance metric of the model using the one or more proposed values for the hyperparameters corresponding to the partial tuning task satisfies a performance threshold, the partial tuning task is set as a proxy for the full tuning task to accelerate a tuning of the hyperparameters of the model, as taught by Walters. Such a feature would enable the system of Basu to use the one or more second proposed values for the hyperparameters corresponding to the partial tuning task, when a performance threshold is satisfied, to serve as a proxy for a subsequent full tuning task. This would help to reduce processing time and computing resources, as suggested by Walters in Para. 0053.
Regarding claim 2, Basu in view of Walters teaches the second tuning operation of the partial tuning task is an abbreviated tuning operation relative to the first tuning operation of the full tuning task, the abbreviated tuning operation to use one or more of less time for execution or less computing resources for execution [Para. 0013, 0053, 0095 of Walters].
Regarding claim 3, Walters does not expressly teach collecting observation data, the observation data to include a real-world performance of the one or more second proposed values for the hyperparameters of the second suggestion set. However, the examiner takes official notice that such a feature is well known in the art of neural network training. One of ordinary skill in the art would have been motivated to provide such a feature in order to determine the accuracy of the tuned hyperparameters based on real-world performance.
Regarding claim 4, Basu teaches an application programming interface that is in operable communication with the tuning service, the application programming interface to: configure the multi-task tuning work request [Para. 0042 - The client devices 104-108 may include one or more applications (also referred to as “apps”) such as, but not limited to, a web browser, messaging application, electronic mail (email) application, a programmatic application, and the like. In some embodiments, if the programmatic application is included in one or more of the client devices 104-108, then this application is configured to locally provide the user interface and at least some of the functionalities with the application configured to communicate with the hyperparameter tuning server 110], the configuration of the multi-task tuning work request to include: defining the first set tuning parameters for the full tuning task, and defining the second set tuning parameters for the partial tuning task [Para. 0069 - the tuning parameters 228 and their corresponding values are predetermined. In another embodiment, the user may provide the hyperparameter tuning parameter server 110 with one or more tuning parameter values 228 via one or more client devices 104-108 and the evaluation application 216. Depending on the flexibility afforded to the user, the user may also select which of the tuning parameter value(s) that he or she will provide].
Regarding claim 8, Basu in view of Walters teaches simultaneously assess ones of the one or more second proposed values for the hyperparameters of the second suggestion set to accelerate the tuning of the hyperparameters of the model [Para. 0045 - hyper-parameter optimizer 130 may be in communication with computer clusters 160 and generate parallel processing requests to adjust and search hyper-parameters to allow multiple hyper-parameter configurations to be evaluated concurrently; Para. 0076 of Walters].
Regarding claim 9, Basu in view of Walters teaches constructs a surrogate model that is an approximation of the model of the subscriber, the assessing of the ones of the one or more second proposed values for the hyperparameters of the second suggestion is performed via the surrogate model [Para. 0111 - optimization system 105 may generate a database entry that associates the data category, model type, number of samples, and tuned hyper-parameters; Para. 0132 - Once optimization system 105 identifies the data profile of the user data is related to a data category, such as the ones determined for step 704 (FIG. 7), optimization system 105 may request information about minimum data requirements for sample models. In some embodiments, step 912 may involve retrieving, from database 180, a minimum number of samples for the data profile based on: the data profile most closely related to the user data profile, and the target model type – of Walters].
Regarding claim 10, Basu in view of Walters teaches the partial tuning task is set as the proxy for the full tuning task, the programmable circuitry to search only a parameter space of the partial tuning task for new proposed values for the hyperparameters of the model [Para. 0113 -  optimization system 105 may generate a secondary model for a CNN using a limited training dataset that only utilizes 50% of the total available data; Para. 0132 - if the user requested a CNN with 98% accuracy and the user data indicates the data profile similar to yearly transactions, optimization system 105 may identify the minimum viable model for a CNN based on sample models created with yearly data profiles that achieve a 98% accuracy. Because the data profiles are similar and the user is requesting a CNN, the minimum data requirements for viable models and the tuned hyper-parameters discovered for the sample CNN could be applied, at least as a starting point, to the new dataset. This estimation of minimum number of samples required for the target model may save resources and improve efficiency in the generation of machine-learning models – of Walters].
Regarding claims 12-13, 15, and 17, they are directed to the method steps for implementing the system as set forth in claims 1-4 and 8-10. Therefore, they are rejected on the same basis as set forth hereinabove.

Allowable Subject Matter
Claims 5-7, 11, 14, 16, 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  Claims 5-7, 11, 14, 16, 18-19 are considered allowable since, when reading the claims in light of the specification, none of the references of record alone or in combination disclose or suggest the combination of subject matter specified in the dependent claim(s).  

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VINCENT HUY TRAN whose telephone number is (571)272-7210. The examiner can normally be reached M-F 7:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thomas C Lee, can be reached at 571-272-3667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

VINCENT H TRAN
Primary Examiner
Art Unit 2115



/VINCENT H TRAN/            Primary Examiner, Art Unit 2115