DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Claims 1, 8, 11, 13, 14, and 18 were amended. Claims 1-20 are pending and are examined herein.
Claims 1-20 are rejected under 35 USC 112(a) as failing to comply with the written description requirement.
Claims 1-20 are rejected under 35 USC 112(b).
Applicant’s amendment overcomes the previous grounds of rejection of claims 1-20 under 35 USC 103; however, upon further consideration, new grounds of rejection under 35 USC 103 necessitated by amendment are presented herein.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 08/04/2022 has been entered.
 
Response to Arguments
	Applicant’s arguments filed 08/04/2022 regarding the rejection under 35 USC 103 have been fully considered, but are moot in view of the new grounds of rejection necessitated by amendment. In particular, newly-cited Da is relied upon to teach the threshold. Note rejection under 35 USC 112(b) and interpretation.

	Applicant’s arguments filed 08/04/2022 regarding the interpretation of claims 5-6, 8, 15-16 and 18 have been fully considered, but are not persuasive. The claims recite operations which are contingent on a condition being met. Nevertheless, for the purposes of compact prosecution, it has further been indicated how claims 8 and 18 would be rejected if the precedent condition were required to be met (i.e., if the claim were amended to require that a programmatic error always occur). See rejection.

Claim Interpretation – Contingent Limitations
	The interpretation of contingent limitations may be found at MPEP 2111.04, section II. In particular, “The broadest reasonable interpretation of a system (or apparatus or product) claim having structure that performs a function, which only needs to occur if a condition precedent is met, requires structure for performing the function should the condition occur.”

	Claims 5 and 15 recite the contingent limitation “when hang occurs…the model training is terminated and results of the programmatic errors are returned to the user”. This is a contingent limitation whose antecedent condition is not necessarily met. 

	Claims 6 and 16 further recite the contingent limitation “the model training is terminated when a run time reaches a predetermined maximum time”. The claim does not guarantee that the maximum time is reached.

	Claims 8 and 18 further recite the contingent limitation “the operations comprise deploying, if no programmatic errors occur in the launched model training, full hyperparameter model optimization with multiple containers of models evaluating the hyperparameter space”.
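	For illustration only, the interpretation above can be sketched in code. The following is a minimal Python sketch that is not drawn from the application or from any cited reference; the names run_training, step, max_seconds, and clock are hypothetical. It shows a routine that contains the structure for terminating training when a time limit is reached or when an error is raised, even though neither condition necessarily occurs in a given run.

```python
import time
from typing import Callable


def run_training(
    step: Callable[[int], None],
    num_steps: int,
    max_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Run `step` up to `num_steps` times (hypothetical training loop).

    The two early returns below are contingent: they execute only if the
    time limit is reached or `step` raises, which may never happen.
    """
    start = clock()
    for i in range(num_steps):
        # Contingent branch 1: the run time has reached the maximum time.
        if clock() - start >= max_seconds:
            return "terminated: maximum time reached"
        try:
            step(i)
        except Exception as exc:
            # Contingent branch 2: a programmatic error occurred.
            return f"terminated: programmatic error ({exc})"
    # Neither condition was met; the loop ran to completion.
    return "completed"
```

	In this sketch the terminating branches are present in the routine whether or not they are ever taken, which is the sense in which a system claim requires structure for performing the function should the condition occur.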

Claim Rejections - 35 USC § 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Independent claims 1 and 11 recite “generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique and a threshold of the feature matrices”. Applicant identifies as-filed [0178] as providing support for this claim limitation. As-filed [0178] reads (emphasis added): 
[0178] In some embodiments, autoencoders may generate one or more feature matrices based on the identified keywords or characteristics of the hyperparameters after using NLP techniques. Quick hyperparameter instance 2107 may cluster one or more vectors or other components of the feature matrices associated with the retrieved hyperparameters and corresponding vectors or other components of the one or more feature matrices from the autoencoders. The autoencoders may map the clusters to determine expected namings of hyperparameters. The autoencoders may also determine similar namings for a given name. For example, quick hyperparameter instance 2107 may apply one or more thresholds to one or more vectors or other components of the feature matrices associated with the retrieved hyperparameters, corresponding vectors or other components of the one or more feature matrices from the autoencoders, or distances therebetween in order to classify the retrieved hyperparameters into one or more clusters. Additionally or alternatively, quick hyperparameter instance 2107 may apply hierarchical clustering, centroid-based clustering, distribution-based clustering, density-based clustering, or the like to the one or more vectors or other components of the feature matrices associated with the retrieved hyperparameters, the corresponding vectors or other components of the one or more feature matrices from the autoencoders, or the distances therebetween. In any of the embodiments described above, quick hyperparameter instance 2107 may perform fuzzy clustering such that each retrieved hyperparameter has an associated score (such as 3 out of 5, 22.5 out of 100, a letter grade such as ‘A’ or ‘C,’ or the like) indicating a degree of belongingness in each cluster. The measures of matching may then be based on the clusters (e.g., distances between a cluster including hyperparameters in hyperparameter space 106 and clusters including the retrieved hyperparameters or the like).

	That is, the specification appears to disclose clustering or classifying vectors or other components of the (already generated) feature matrices using one or more thresholds. However, this portion of the specification does not disclose the generation of the feature matrices being based on a threshold of the feature matrices. The remainder of the specification does not appear to provide support for the claim limitation. Dependent claims 2-10 and 12-20 do not resolve the issue and are rejected with the same rationale.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

	Claims 1 and 11 recite “generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique and a threshold of the feature matrices; cluster one or more vectors of the feature matrices to generate at least one first cluster”. As per MPEP 2173.03, “A claim, although clear on its face, may also be indefinite when a conflict or inconsistency between the claimed subject matter and the specification disclosure renders the scope of the claim uncertain as inconsistency with the specification disclosure or prior art teachings may make an otherwise definite claim take on an unreasonable degree of uncertainty.” In this case, the specification does not provide support for using a threshold of the feature matrices to generate feature matrices as indicated above with respect to the rejection under 35 USC 112(a). However, the specification would appear to support using a threshold in the subsequent clustering step. For the purposes of applying the prior art and to practice compact prosecution, the threshold is being interpreted as being applied in the clustering step rather than the generating step. Dependent claims 2-10 and 12-20 do not resolve the issue and are rejected with the same rationale.
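	For illustration only, the distinction drawn above between the generating step and the clustering step can be sketched as follows. This is a minimal Python sketch under stated assumptions; it is not the method of the specification or of any cited reference. The function names, the underscore-based tokenization, and the simple "leader" clustering rule are hypothetical choices made for the example. The feature matrix is generated without any threshold, and the threshold is applied only when vectors are grouped into clusters.

```python
import math


def feature_matrix(names):
    """Generating step (no threshold): one token-count row per name."""
    token_lists = [name.lower().split("_") for name in names]
    vocab = sorted({tok for toks in token_lists for tok in toks})
    rows = [[float(toks.count(tok)) for tok in vocab] for toks in token_lists]
    return vocab, rows


def threshold_cluster(vectors, threshold):
    """Clustering step: the threshold is applied to distances here.

    Each vector joins the first cluster whose leader lies within
    `threshold`; otherwise it becomes the leader of a new cluster.
    """
    leaders = []
    labels = []
    for vec in vectors:
        for idx, leader in enumerate(leaders):
            if math.dist(vec, leader) <= threshold:
                labels.append(idx)
                break
        else:
            leaders.append(vec)
            labels.append(len(leaders) - 1)
    return labels


names = ["learning_rate", "learn_rate", "batch_size"]
vocab, rows = feature_matrix(names)
labels = threshold_cluster(rows, threshold=1.5)  # [0, 0, 1]
```

	Changing the threshold changes only the cluster assignments; the rows produced by the generating step are the same for every threshold value.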

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over “Koch” (US 2018/0240041 A1, previously cited) in view of “Hammond” (US 2017/0213155 A1), further in view of “Yokoyama” (US 2019/0385083 A1), further in view of “Huang” (US 2019/0236487 A1), and further in view of “Da” (US 2019/0026648 A1).

	Regarding claim 1, Koch teaches
	A training model generator system, comprising: one or more memory units storing instructions; and one or more processors configured to execute the stored instructions (A system comprising processor/memory is described at [0005].)
	to perform operations for tuning hyperparameters, …, comprising: (Abstract and Fig 5-6 and accompanying description in the written specification show/describe a method for identifying hyperparameters.)
	receiving a request to complete a hyperparameter optimization task; (Steps 502-520 show the system receiving information for the performance of a hyperparameter optimization task. In particular, the request includes a dataset (506-508), a model type for which to identify hyperparameters (512), and potentially an indicator of hyperparameter values to identify for the model type (516). The hyperparameter optimization task occurs at step 528, and [0146] describes receiving a request for a hyperparameter optimization.)
	initiating a model generation task based on the requested hyperparameter optimization task; supplying first computing resources to a hyperparameter determination virtual computing instance configured to (Step 528 and [0146] show/describe receiving a request for a hyperparameter optimization by the selection manager device. Fig. 6A-C appears to be the method corresponding to executing the hyperparameter selection task. The hyperparameter determination is understood to be a first step of a model generation task because it is used in subsequent steps 530-534 to generate a final model. [0071] indicates that this may be performed using a session, which may be interpreted as a hyperparameter determination virtual computing instance. See also [0036] for the sessions.) 
	execute a deployment script to identify first hyperparameters to be evaluated by the model generation task, the deployment script comprising a range of values to be tested; and ([0151] indicates that the tuning is performed by the selection manager device 104 described above. Step 602 instantiates the iteration manager. [0152] indicates that this includes determining a configuration list that includes a set of hyperparameter configurations to evaluate. Since this is computer-implemented, there is code which causes these operations to occur. The code which causes these operations is understood to correspond to the deployment script. [0244] indicates that the computer may operate according to computer-executable instructions.)
	supplying second computing resources to a quick hyperparameter virtual computing instance configured to: receive the identified first hyperparameters from the hyperparameter determination virtual computing instance; and (Steps 530 and 532 receive the hyperparameters and select hyperparameters from these results as described at [0147-0148]. It is understood that since Koch teaches a computer-implemented method, any operation requires a provision of computing resources for its performance. [0071] indicates that this may be performed using a session, which may be interpreted as a hyperparameter determination virtual computing instance. See also [0036] for the sessions.)
	…launching a model training using the determined values and the first hyperparameters; (Step 534 shows training the model with the selected hyperparameter configuration and values (which may be selected in the manner taught by the combination described in more detail below).)
	Koch does not appear to explicitly teach
	apply a natural language processing technique to identify features of the first hyperparameters;
	generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique and a threshold of the feature matrices;
	cluster one or more vectors of the feature matrices to generate at least one first cluster;
	retrieve second hyperparameters from a hyperparameter space based on a distance between the at least one first cluster and at least one second cluster including one of the second hyperparameters, the retrieved second hyperparameters having structural similarities to the first hyperparameters; and
	determine values within the range of values that return a fastest model run time with the second hyperparameters;
	launching a model training using the determined values and the first hyperparameters; 
	in response to a programmatic error, associated with the model training, terminating the model training and notifying a user.  
	However, Hammond teaches
	determine values within the range of values that return a fastest model run time with the second hyperparameters; ([0075], see especially very end of paragraph, describes various criteria that the system may use to identify topologies (e.g., a neural network topology as described at [0065]) of an optimal model. A configuration of a neural network is understood to comprise hyperparameters (e.g., number and configuration of nodes). At [0075], it is indicated that performance time is one of the criteria. That is, [0075] teaches identifying hyperparameters based on the model run time. The location in memory where this data is stored is the hyperparameter space. For the system to identify hyperparameters based on these model run times, the run times are necessarily accessed. In the context of figures 5A-5B, the AI objects are returned at steps 114-120. In particular, the search is performed at step 114 (described at [0135]) and the architect module may identify topologies at the first part of step 120 (described at [0138]). [0076] indicates that a best trained AI model may be determined by means of optimal results based on factors such as performance time. That is, an optimal (i.e., minimum) time may be determined among the models. While Hammond teaches a range of embodiments (e.g., determining an optimum based on accuracy), the teaching of Hammond includes the claimed limitation.)
	launching a model training using the determined values and the first hyperparameters; ([0076] indicates that a neural network may then be trained based on the determined information. In the context of figures 5A-5B, this is the last part of step 120 and step 122 (described at [0138-0139]).)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Koch to use a database-based hyperparameter selection/optimization approach based on model run time because this allows for the reuse, reconfigure ability, and recomposition of the trained AI data objects from the AI database into a new trained AI model as described by Hammond at [0005]. Furthermore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to base the selection on model run time because this would allow the training of models which perform efficiently, which would be important in environments in which fast processing is essential (e.g., self-driving cars) or environments in which processing resources are limited (e.g., mobile or embedded applications).
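	For illustration only, selecting values within a range based on run time can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not taken from Koch, Hammond, or the application; the function name select_fastest, the mapping of candidate values to recorded run times, and the example numbers are hypothetical.

```python
def select_fastest(run_times, allowed_range):
    """Return the candidate value within `allowed_range` with the lowest run time.

    `run_times` maps a candidate hyperparameter value to a previously
    recorded model run time in seconds (hypothetical stored data).
    """
    low, high = allowed_range
    in_range = {v: t for v, t in run_times.items() if low <= v <= high}
    if not in_range:
        raise ValueError("no candidate value falls within the range")
    # The minimum recorded run time identifies the fastest candidate.
    return min(in_range, key=in_range.get)


# Hypothetical recorded run times for four learning-rate candidates.
recorded = {0.001: 12.5, 0.01: 8.0, 0.1: 9.3, 1.0: 2.0}
best = select_fastest(recorded, allowed_range=(0.001, 0.1))  # 0.01
```

	The candidate 1.0 has the lowest recorded run time overall but is excluded because it falls outside the range, so the selection is limited to values within the range of values.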
	The combination of Koch and Hammond does not appear to explicitly teach
	apply a natural language processing technique to identify features of the first hyperparameters;
	generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique and a threshold of the feature matrices;
	cluster one or more vectors of the feature matrices to generate at least one first cluster;
	retrieve second hyperparameters from a hyperparameter space based on a distance between the at least one first cluster and at least one second cluster including one of the second hyperparameters, the retrieved second hyperparameters having structural similarities to the first hyperparameters; and
	…in response to a programmatic error, associated with the model training, terminating the model training and notifying a user.  
	However, Yokoyama—directed to analogous art—teaches
	generate feature matrices based on the features of the first hyperparameters… ([0084] indicates that the characteristics may be represented as a multidimensional vector (i.e., feature vector). Note that a vector is a type of matrix (e.g., a 2-dimensional column vector is a 2x1 matrix).)
	…cluster one or more vectors of the feature matrices to generate at least one first cluster; (Abstract describes determining parameters (note that a hyperparameter is a type of parameter) based on blocks (i.e., clusters) of parameters. [0084] indicates that the blocks (i.e., clusters) of parameters are determined by computing a distance between characteristics of the parameters.)
	retrieve second hyperparameters from a hyperparameter space based on a distance between the at least one first cluster and at least one second cluster including one of the second hyperparameters, the retrieved second hyperparameters having structural similarities to the first hyperparameters; and ([0092] describes determining a next block (i.e., set of parameters) to investigate by determining a block that is closest to the previous block centroid. [0091] indicates that distance represents a similarity, so nearby blocks may be reasonably interpreted as having structural similarities.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Koch and Hammond to perform clustering of the parameters as taught by Yokoyama because this allows for parallelization of the optimization (see Yokoyama at [0034]), which results in decreased learning time (see Yokoyama at [0003]).
	The combination of Koch, Hammond and Yokoyama does not appear to explicitly teach 
	apply a natural language processing technique to identify features of the first hyperparameters;
	generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique …
	and a threshold of the feature matrices;…cluster one or more vectors of the feature matrices to generate at least one first cluster;
	…in response to a programmatic error, associated with the model training, terminating the model training and notifying a user.  
	However, Huang—directed to analogous art—teaches 
	apply a natural language processing technique to identify features of the first hyperparameters; generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique…; ([0062] describes parsing (which is a natural language processing technique) to identify values/features to be associated with the hyperparameters. In the combination described above, the values parsed as taught by Huang would then be the features analyzed and in particular used in the clustering taught by Yokoyama which includes representing the parameters as vectors/matrices as described above.)
	in response to a programmatic error, associated with the model training, terminating the model training and notifying a user. ([0018] indicates that the system may monitor job status of tuning/training jobs including identifying failures (i.e., programmatic errors). [0021] indicates that notifications may be provided when the status of a job changes. See also [0051-0052]. [0053] indicates that in response to an error, when insufficient progress has been made, the job may not be automatically retried (i.e., the training is terminated since it is not continued via retrial).)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination described above to use a natural language processing technique to identify features of the hyperparameters as taught by Huang because this allows users to provide values as input as taught by Huang at [0062]. Moreover, it would have been obvious to monitor for errors/failures because this allows for a user to be emailed as described at [0052] and allows for the system to terminate training instances which did not show progress upon failure as described at [0053].
	The combination of Koch, Hammond, Yokoyama and Huang does not appear to explicitly teach
	and a threshold of the feature matrices…cluster one or more vectors of the feature matrices to generate at least one first cluster
	However, Da—directed to analogous art—teaches
	and a threshold of the feature matrices…cluster one or more vectors of the feature matrices to generate at least one first cluster ([0004, 0039] describe clustering points in a parameter space using a user-defined threshold lambda. In the combination with Yokoyama, Yokoyama already teaches clustering vector/matrix representations of parameters as described above. Da is relied upon to teach performing the clustering using a threshold.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Koch, Hammond, Yokoyama and Huang to use a threshold when performing the clustering. Yokoyama already teaches performing clustering of vector/matrix representations of parameters but does not go into detail regarding the clustering, and the use of a threshold as taught by Da ensures that clusters are uncorrelated, as described by Da at [0039].

	Regarding claim 2, the rejection of claim 1 is incorporated herein. Furthermore, Koch teaches
	wherein the request … indicates model characteristics comprising at least one of a model type, a data schema, a data statistic, a training dataset type, a model task, a training dataset identifier, or a hyperparameter space. (Koch teaches model type (step 512 and [0075]), data schema (step 525 and [0144]), data statistic (step 522 includes fit statistics, see [0102]), training data set type (step 506 and [0072], the data set itself "indicates" data set type), model task (step 512 and [0075]; a request for a type of model is a task of training that type of model), training dataset identifier (step 506 and [0072]), hyperparameter space (step 516 and [0089]).)
	Koch does not appear to explicitly teach 
	wherein the request includes an API call and
	However, Hammond—directed to analogous art—teaches
	wherein the request includes an API call and ([0105-0106] describes interacting with the model server via an API interface.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Koch to use an API interface because this interface is “simple” (i.e., easy to use) as described by Hammond at [0105].

	Regarding claim 3, the rejection of claim 1 is incorporated herein. Furthermore, Koch teaches
	wherein supplying the first and second computing resources to the hyperparameter determination and quick hyperparameter virtual computing instances comprises generating the instances, respectively.  ([0040] describes the work being distributed across multiple sessions. Sessions are "created" (see [0158]).)

	Regarding claim 4, the rejection of claim 1 is incorporated herein. Furthermore, Koch teaches
	wherein retrieving the second plurality of hyperparameters from the hyperparameter space further comprises direct submission to the system by at least one of the user or script profiling.  (Provided by user at step 516 and [0089].)

	Regarding claim 5, the rejection of claim 1 is incorporated herein. Furthermore, Koch teaches
	wherein if hang occurs in the launched model training, the model training is terminated (The system taught by Koch and described above is capable of terminating training as described at [0180]. See contingent limitation discussion.)
	Koch does not appear to explicitly teach 
	and results of the programmatic errors are returned to the user.  
	However, Huang—directed to analogous art—teaches
	wherein if hang occurs in the launched model training, the model training is terminated and results of the programmatic errors are returned to the user. ([0018] indicates that the system may monitor job status of tuning/training jobs including identifying failures (i.e., programmatic errors). When the job has failed, but before a retry is attempted, progress has stalled, so hang has occurred. [0021] indicates that notifications may be provided when the status of a job changes. See also [0051-0052]. [0053] indicates that in response to an error, when insufficient progress has been made, the job may not be automatically retried (i.e., the training is terminated since it is not continued via retrial).)
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 1.

	Regarding claim 6, the rejection of claim 1 is incorporated herein. Furthermore, Koch teaches
	wherein the model training is terminated when a run time reaches a predetermined maximum time.  (Model termination based on time is described/shown at step 650 and [0180].)

	Regarding claim 7, the rejection of claim 1 is incorporated herein. The combination of Koch and Hammond does not appear to explicitly teach, but Yokoyama teaches
	wherein at least one hyperparameter in the hyperparameter space is associated with a score indicating a degree of belongingness to a respective cluster (Abstract describes determining parameters (note that a hyperparameter is a type of parameter) based on blocks (i.e., clusters) of parameters. [0084] indicates that the blocks (i.e., clusters) of parameters are determined by computing a distance between characteristics of the parameters. Moreover, [0084] indicates that the characteristics may be represented as a multidimensional vector (i.e., feature vector). A distance to a cluster is a number (i.e. a score) which indicates a degree of belongingness to the cluster (i.e., smaller distances represent a larger degree of belongingness and larger distances represent a smaller degree of belongingness).)
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 1.
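	For illustration only, the reading of a distance as a score indicating a degree of belongingness can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not taken from Yokoyama or the application; the function names, the use of Euclidean distance, and the example centroids are hypothetical.

```python
import math


def belongingness_scores(vector, centroids):
    """Return the distance from `vector` to each cluster centroid.

    A smaller distance indicates a larger degree of belongingness.
    """
    return {name: math.dist(vector, c) for name, c in centroids.items()}


def closest_cluster(vector, centroids):
    """Return the name of the cluster whose centroid is nearest."""
    scores = belongingness_scores(vector, centroids)
    return min(scores, key=scores.get)


# Hypothetical two-dimensional feature vector and cluster centroids.
centroids = {"a": (0.0, 0.0), "b": (4.0, 4.0)}
scores = belongingness_scores((1.0, 0.0), centroids)  # {"a": 1.0, "b": 5.0}
nearest = closest_cluster((1.0, 0.0), centroids)      # "a"
```

	In this sketch every hyperparameter vector receives one number per cluster, so each hyperparameter is associated with a score for its respective cluster as well as for the other clusters.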

	Regarding claim 8, the rejection of claim 1 is incorporated herein. Furthermore, Koch teaches
	wherein the operations further comprise deploying, if no programmatic errors occur in the launched model training, full hyperparameter model optimization with multiple containers of models evaluating the hyperparameter space. (This is a contingent limitation as discussed above. Fig. 6B shows the system for deploying multiple sessions for evaluating multiple hyperparameters. Each session is understood to be a "container". These are assigned at step 628 and executed at 632. See [0169-0171]. This means that the system of Koch has the structure required for performing the function “deploying…full hyperparameter model optimization with multiple containers of models evaluating the hyperparameter space” should the condition “no programmatic errors occur in the launched model training” be met. This is what is required in view of the interpretation of contingent limitations identified above.)
	Furthermore, Huang, directed to analogous art, teaches at [0051-0053] that jobs may be monitored for errors and that sometimes jobs proceed from start to finish without errors. In the combination with Koch, it would have been obvious to proceed with deploying the full hyperparameter model optimization taught by Koch, as described above, because doing so would avoid wasting the compute already used to perform the tasks leading up to that deployment when there is no reason to stop the method (as programmatic errors may not occur).

	Regarding claim 9, the rejection of claim 8 is incorporated herein. Furthermore, Koch teaches
	wherein the operations further comprise providing a trained model to a model optimizer based on performance metrics.  (Fig. 6C shows the final steps. Described at [0188-0195]. In particular, the hyperparameters with the best results are obtained at step 660 and the best model is trained at step 618. This is stored, which is understood to be providing it to a model optimizer (i.e., a system for obtaining a best model). This is based on the objective function value as described at [0188].)

	Regarding claim 10, the rejection of claim 1 is incorporated herein. Furthermore, Koch teaches
	wherein the hyperparameters and associated model run times are stored in the hyperparameter space and are associated with respective hyperparameters. (Step 672 creates the results table as at step 508, which includes storing run times along with associated hyperparameters as described at [0073].)

	Regarding claim 11, Koch teaches
	A training model generator system, comprising: one or more memory units storing instructions; and one or more processors configured to execute the stored instructions to perform operations comprising: (A system comprising processor/memory is described at [0005].)
	receiving a request input at an interface to complete a hyperparameter optimization task; (Steps 502-520 show the system receiving information for the performance of a hyperparameter optimization task. In particular, the request includes a dataset (506-508), a model type for which to identify hyperparameters (512), and potentially an indicator of hyperparameter values to identify for the model type (516). The hyperparameter optimization task occurs at step 528; step 528 and [0146] show/describe receiving a request for a hyperparameter optimization. The input interface is shown at Figure 2, element 202 and is described at [0044].)
	initiating a model generation task based on the requested hyperparameter optimization task; configuring a distributor to route messages between a hyperparameter determination virtual computing instance and a quick hyperparameter virtual computing instance; supplying first computing resources to the hyperparameter determination virtual computing instance configured to (Step 528 and [0146] show/describe receiving a request for a hyperparameter optimization by the selection manager device. Fig. 6A-C appears to be the method corresponding to executing the hyperparameter selection task. The hyperparameter determination is understood to be a first step of a model generation task because it is used in subsequent steps 530-534 to generate a final model. [0071] indicates that this may be performed using a session, which may be interpreted as a hyperparameter determination virtual computing instance. See also [0036] for the sessions. The sessions are coordinated by session manager device 400 (i.e., a “distributor”), shown in Figure 4A and described at [0038-0040].)
	execute a deployment script to identify first hyperparameters to be evaluated by the model generation task, the deployment script comprising a range of values to be tested; and ([0151] indicates that the tuning is performed by the selection manager device 104 described above. Step 602 instantiates the iteration manager. [0152] indicates that this includes determining a configuration list that includes a set (i.e., range) of hyperparameter configurations to evaluate. Since this is computer-implemented, there is code which causes these operations to occur. The code which causes these operations is understood to correspond to the deployment script.)
	…supplying second computing resources to a quick hyperparameter determination virtual computing instance configured to: receive the identified first hyperparameters from the hyperparameter determination virtual computing instance; and (Figs. 6A-C appear to show the method corresponding to executing the hyperparameter selection task. The hyperparameter determination is understood to be a first step of a model generation task because it is used in subsequent steps 530-534 to generate a final model. Steps 608-612 and [0158-0159] show/describe running a plurality of sessions to determine a performance for each hyperparameter configuration value. That is, the hyperparameter space is investigated based on retrieving the hyperparameter configuration values. Steps 530 and 532 receive the hyperparameters and select hyperparameters from these results as described at [0147-0148]. It is understood that since Koch teaches a computer-implemented method, any operation requires a provision of computing resources for its performance. [0071] indicates that this may be performed using a session, which may be interpreted as a hyperparameter determination virtual computing instance. See also [0036] for the sessions.)
…launching a model training using the determined values and the first hyperparameters; (Step 534 shows training the model with the selected hyperparameter configuration and values (which may be selected in the manner taught by the combination described in more detail below).)
	Koch does not appear to explicitly teach
	apply a natural language processing technique to identify features of the first hyperparameters;
	generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique…
	and a threshold of the feature matrices;…cluster one or more vectors of the feature matrices to generate at least one first cluster;
	retrieve second hyperparameters from a hyperparameter space based on a distance between the at least one first cluster and at least one second cluster including one of the second hyperparameters, the retrieved second hyperparameters having structural similarities to the first hyperparameters; and
	determine values within the range of values that return a fastest model run time with the second hyperparameters;
	launching a model training using the determined values and the first hyperparameters; 
	in response to a programmatic error, associated with the model training, terminating the model training and notifying a user.  
	However, Hammond teaches
	determine values within the range of values that return a fastest model run time with the second hyperparameters; ([0075], see especially the very end of the paragraph, describes various criteria that the system may use to identify topologies (e.g., a neural network topology as described at [0065]) of an optimal model. A configuration of a neural network is understood to comprise hyperparameters (e.g., number and configuration of nodes). At [0075], it is indicated that performance time is one of the criteria. That is, [0075] teaches identifying hyperparameters based on the model run time. The location in memory where this data is stored is the hyperparameter space. For the system to identify hyperparameters based on these model run times, the run times are necessarily accessed. In the context of figures 5A-5B, the AI objects are returned at steps 114-120. In particular, the search is performed at step 114 (described at [0135]) and the architect module may identify topologies at the first part of step 120 (described at [0138]). [0076] indicates that a best trained AI model may be determined by means of optimal results based on factors such as performance time. That is, an optimal (i.e., minimum) time may be determined among the models. While Hammond teaches a range of embodiments (e.g., determining an optimum based on accuracy), the teaching of Hammond includes the claimed limitation.)
	launching a model training using the determined values and the first hyperparameters; ([0076] indicates that a neural network may then be trained based on the determined information. In the context of figures 5A-5B, this is the last part of step 120 and step 122 (described at [0138-0139]).)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Koch to use a database-based hyperparameter selection/optimization approach based on model run time because this allows for the reuse, reconfigure ability, and recomposition of the trained AI data objects from the AI database into a new trained AI model as described by Hammond at [0005]. Furthermore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to base the selection on model run time because this would allow the training of models which perform efficiently, which would be important in environments in which fast processing is essential (e.g., self-driving cars) or environments in which processing resources are limited (e.g., mobile or embedded applications).
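	For illustration only, the limitation mapped to Hammond above (determining the values within a range that return the fastest model run time) can be pictured with the short sketch below. The sketch is not taken from Hammond, Koch, or the application; the function name `fastest_value`, the record layout, and the sample run times are all hypothetical.

```python
from typing import Dict, Tuple


def fastest_value(run_times: Dict[float, float],
                  value_range: Tuple[float, float]) -> float:
    """Return the candidate value inside value_range with the smallest run time.

    run_times maps a candidate hyperparameter value to a previously recorded
    model run time in seconds (hypothetical layout, assumed for this sketch).
    """
    low, high = value_range
    # Keep only the candidates that fall within the range to be tested.
    in_range = {v: t for v, t in run_times.items() if low <= v <= high}
    if not in_range:
        raise ValueError("no recorded run time falls inside the requested range")
    # The minimum recorded run time identifies the "fastest" value.
    return min(in_range, key=in_range.get)


# Hypothetical stored records: candidate value -> run time in seconds.
records = {0.001: 42.0, 0.01: 17.5, 0.1: 9.8, 1.0: 3.1}
best = fastest_value(records, (0.001, 0.1))
```

	In this sketch the value 1.0 has the shortest recorded run time overall but lies outside the tested range, so it is not selected.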
	The combination of Koch and Hammond does not appear to explicitly teach
	apply a natural language processing technique to identify features of the first hyperparameters;
	generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique…
	and a threshold of the feature matrices;…cluster one or more vectors of the feature matrices to generate at least one first cluster;
	retrieve second hyperparameters from a hyperparameter space based on a distance between the at least one first cluster and at least one second cluster including one of the second hyperparameters, the retrieved second hyperparameters having structural similarities to the first hyperparameters; and
	…in response to a programmatic error, associated with the model training, terminating the model training and notifying a user.  
	However, Yokoyama—directed to analogous art—teaches
	generate feature matrices based on the features of the first hyperparameters… ([0084] indicates that the characteristics may be represented as a multidimensional vector (i.e., feature vector). Note that a vector is a type of matrix (e.g., a 2-dimensional column vector is a 2x1 matrix).)
	…cluster one or more vectors of the feature matrices to generate at least one first cluster; (Abstract describes determining parameters (note that a hyperparameter is a type of parameter) based on blocks (i.e., clusters) of parameters. [0084] indicates that the blocks (i.e., clusters) of parameters are determined by computing a distance between characteristics of the parameters.)
	retrieve second hyperparameters from a hyperparameter space based on a distance between the at least one first cluster and at least one second cluster including one of the second hyperparameters, the retrieved second hyperparameters having structural similarities to the first hyperparameters; and ([0092] describes determining a next block (i.e., set of parameters) to investigate by determining a block that is closest to the previous block centroid. [0091] indicates that distance represents a similarity, so nearby blocks may be reasonably interpreted as having structural similarities.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Koch and Hammond to perform clustering of the parameters as taught by Yokoyama because this allows for parallelization of the optimization (see Yokoyama at [0034]), which results in decreased learning time (see Yokoyama at [0003]).
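	For illustration only, retrieval based on a distance between clusters, as mapped to Yokoyama above, can be pictured as choosing the stored block whose centroid lies closest to the centroid of the first block. The sketch below is not Yokoyama's implementation; the Euclidean metric, the names `centroid` and `nearest_cluster`, and the sample vectors are assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_cluster(first: List[Vector],
                    stored: Dict[str, List[Vector]]) -> str:
    """Return the key of the stored cluster whose centroid is closest to first's."""
    target = centroid(first)
    # Euclidean distance between centroids stands in for "similarity" here.
    return min(stored,
               key=lambda name: math.dist(target, centroid(stored[name])))


# Hypothetical feature vectors for the first cluster and two stored clusters.
first_cluster = [(0.0, 0.0), (2.0, 0.0)]            # centroid (1, 0)
stored_clusters = {
    "a": [(1.0, 1.0), (1.0, 3.0)],                  # centroid (1, 2)
    "b": [(4.0, 0.0), (6.0, 0.0)],                  # centroid (5, 0)
}
closest = nearest_cluster(first_cluster, stored_clusters)
```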
	The combination of Koch, Hammond and Yokoyama does not appear to explicitly teach 
	apply a natural language processing technique to identify features of the first hyperparameters;
	generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique;
	…in response to a programmatic error, associated with the model training, terminating the model training and notifying a user.  
	However, Huang—directed to analogous art—teaches 
	apply a natural language processing technique to identify features of the first hyperparameters; generate feature matrices based on the features of the first hyperparameters identified by the natural language processing technique; ([0062] describes parsing (which is a natural language processing technique) to identify values/features to be associated with the hyperparameters. In the combination described above, the values parsed as taught by Huang would then be the features analyzed and in particular used in the clustering taught by Yokoyama.)
	in response to a programmatic error, associated with the model training, terminating the model training and notifying a user.  ([0018] indicates that the system may monitor job status of tuning/training jobs including identifying failures (i.e., programmatic errors). [0021] indicates that notifications may be provided when the status of a job changes. See also [0051-0052]. [0053] indicates that in response to an error, when insufficient progress has been made, the job may not be automatically retried (i.e., the training is terminated since it is not continued via retrial).)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination described above to use a natural language processing technique to identify features of the hyperparameters as taught by Huang because this allows users to provide values as input as taught by Huang at [0062]. Moreover, it would have been obvious to monitor for errors/failures because this allows for a user to be emailed as described at [0052] and allows for the system to terminate training instances which did not show progress upon failure as described at [0053].
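	For illustration only, the behavior of terminating the model training and notifying a user in response to a programmatic error can be pictured with the sketch below. It is not Huang's job monitor; the function `run_training`, the representation of training as a sequence of callables, and the notification callback are hypothetical.

```python
from typing import Callable, Iterable, List


def run_training(steps: Iterable[Callable[[], None]],
                 notify: Callable[[str], None]) -> bool:
    """Run training steps in order; on the first error, stop and notify.

    Returns True if every step completed, False if training was terminated.
    """
    for index, step in enumerate(steps):
        try:
            step()
        except Exception as exc:
            # Programmatic error: report it and do not run any later step.
            notify(f"training terminated at step {index}: {exc}")
            return False
    return True


def ok_step() -> None:
    """Hypothetical training step that completes normally."""


def failing_step() -> None:
    """Hypothetical training step that raises a programmatic error."""
    raise RuntimeError("shape mismatch")


messages: List[str] = []
completed = run_training([ok_step, failing_step, ok_step], messages.append)
```

	Here the third step never runs, and the single notification records where the training stopped.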
	The combination of Koch, Hammond, Yokoyama and Huang does not appear to explicitly teach
	and a threshold of the feature matrices…cluster one or more vectors of the feature matrices to generate at least one first cluster
	However, Da—directed to analogous art—teaches
	and a threshold of the feature matrices…cluster one or more vectors of the feature matrices to generate at least one first cluster ([0004, 0039] describes clustering points in a parameter space using a user-defined threshold lambda. In the combination with Yokoyama, Yokoyama already teaches clustering vector/matrix representations of parameters as described above. Da is relied upon to teach performing the clustering using a threshold.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Koch, Hammond, Yokoyama and Huang to use a threshold when performing the clustering. Yokoyama already teaches performing clustering of vector/matrix representations of parameters but does not go into detail regarding the clustering, and the use of a threshold as taught by Da ensures that clusters are uncorrelated as described by Da at [0039].
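	For illustration only, clustering vectors using a user-defined threshold can be pictured with the sketch below. It uses a simple greedy "leader" scheme chosen for brevity, which is not the algorithm of Da or Yokoyama; the function `threshold_cluster`, the Euclidean metric, and the sample points are assumptions.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def threshold_cluster(vectors: List[Vector],
                      threshold: float) -> List[List[Vector]]:
    """Greedy leader clustering.

    A vector joins the first existing cluster whose leader (first member)
    lies within `threshold` of it; otherwise it starts a new cluster.
    """
    clusters: List[List[Vector]] = []
    for vector in vectors:
        for cluster in clusters:
            if math.dist(vector, cluster[0]) <= threshold:
                cluster.append(vector)
                break
        else:
            # No leader was close enough, so this vector leads a new cluster.
            clusters.append([vector])
    return clusters


# Hypothetical feature vectors and a hypothetical user-defined threshold.
points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.1), (0.1, 0.3)]
clusters = threshold_cluster(points, threshold=1.0)
```

	A smaller threshold yields more, tighter clusters; a larger one merges nearby groups.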
	
	Claims 12-16 are substantially similar to claims 2-6, respectively, and are rejected with the same rationale in view of the rejection of claim 11.

	Regarding claim 17, the rejection of claim 11 is incorporated herein. The combination of Koch, Hammond, and Yokoyama does not appear to explicitly teach 
	wherein the operations further comprise terminating the model training in response to a user input.  
	However, Huang—directed to analogous art—teaches
	wherein the operations further comprise terminating the model training in response to a user input.  ([0072] indicates that the operator may kill/terminate a running job (i.e., training as described at [0065]).)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination described above to terminate model training in response to a user input because this allows a user to effect control over the running jobs as described at [0072].

	Claims 18-20 are substantially similar to claims 8-10, respectively, and are rejected with the same rationale in view of the rejection of claim 11.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Markus A Vasquez whose telephone number is (303)297-4432. The examiner can normally be reached Monday to Friday 9AM to 4PM PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MARKUS A. VASQUEZ/Examiner, Art Unit 2121