DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
This action is responsive to the Applicant’s Amendment filed on 06/06/2022. Claims 1-20 are pending in the application. Claims 1, 5, 8, 12, 15, and 19 are amended.
Applicant’s amendments to the claims have overcome each and every objection previously set forth in the Non-Final Office Action mailed 04/01/2022.
The 112(b) rejection of claims 1-4, 6, 8-11, 13, 15-18, and 20 previously set forth in the Non-Final Office Action mailed 04/01/2022 is hereby withdrawn.
	
Response to Arguments
Applicant’s arguments with respect to the rejections previously made and the amended claims filed on 06/06/2022 have been fully considered but are not persuasive. In view of the claim amendments, the rejections have been updated accordingly.
With regard to independent claim 1, Applicant argued that, “The Office is interpreting the latter - sizes of compute nodes - as the ‘token count’ recited in claim 1. Yet, as amended herein, the recited token counts ‘comprise a number of virtual machine cores.’ (Emphasis added). Idicula says nothing about numbers of VM cores, and therefore does not describe the received or selected token counts of claim 1.”
In response to the arguments, it is submitted that the cited limitations are properly addressed by Idicula, based at least on the following teachings:
Idicula discloses numbers of VM cores in para [0216], which states, “VMM 1030 instantiates and runs one or more virtual machine instances… VMM 1030 may provide full hardware and CPU virtualization.” See also Fig. 10. The one or more virtual machine instances providing CPU virtualization correspond to the VM cores of claim 1. Idicula teaches in the Abstract, “Embodiments further utilize trained data capacity machine learning (DC-ML) models to predict a minimum number of compute nodes needed to run a workload.” A compute node provides resources such as processors and memory, and a core is the part of a processor that performs the computations or executes the workload. Therefore, the token count corresponds to the number of compute nodes needed to run a workload.
Accordingly, Idicula teaches the token count as the optimal number of compute nodes to run a given workload and therefore describes the received or selected token counts of claim 1. In para [0027], Idicula teaches, “Specifically, embodiments identify, for a given in-memory workload, one or both of a minimum number of compute nodes and an optimal number of compute nodes to execute the workload.” Idicula also teaches in para [0173], “Matrix partitioning may achieve horizontal scaling such as with symmetric multiprocessing (SMP) such as with a multicore central processing unit (CPU) and or multiple coprocessors.” Therefore, a compute node with a multicore central processing unit would be counted likewise. In addition, Idicula teaches in para [0017], “FIG. 2 depicts a flowchart for automatically predicting an optimal number of compute nodes for a cluster to run a given workload.” See also the Abstract and paras. [0022], [0027]-[0034], and [0219] for determining the number of compute nodes needed to run a workload. Thus, for at least the reasons set forth above, it is submitted that the amended limitation of “wherein the token counts comprise a number of virtual machine cores” is properly addressed.
Also, in response to the Applicant’s argument that, “Again, Idicula uses ratios of different types of high-level operations, rates of index updates, rates of storage expansion or re-organization, rates of buffer cache misses and flushes, ratios of database commit operations to total numbers of database operations, and sizes of compute node clusters on which the workloads are run to compute job training data,” Idicula teaches in para [0123] that those characteristics are only examples of information that may be provided in historical records on which the workloads are run to compute job training data. Idicula teaches that, “ML service 350 populates the initial training corpus with features of the workloads that are derivable from the workload data from such data sources, along with any available information about performance metrics for the workloads running on any size of compute node cluster.” Thus, the historical data is not limited to the characteristics listed by the Applicant, and historical data for token counts for a plurality of prior jobs is depicted in the graph of Fig. 1, where the token counts correspond to the number of nodes in the cluster needed to execute the workload.
With regard to independent claims 8 and 15, the emphasized limitations that the Applicant argues for claims 8 and 15 are similar to the emphasized limitations of claim 1, which have been addressed above. See the response regarding claim 1 above for explanation.
Furthermore, it is also submitted that all limitations in the pending claims, including those not specifically argued, are properly addressed. The reasons are set forth in the rejections. See the claim analysis below for details.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Idicula (US 20200125568 A1, hereinafter Idicula).

Regarding Claim 1, Idicula discloses a method of optimizing job runtimes ([Abstract]: Embodiments utilize trained query performance machine learning (QP-ML) models to predict an optimal compute node cluster size for a given in-memory workload; [0077]: According to an embodiment, the optimal cluster size is based on the runtime), the method comprising: 
receiving training data comprising historical run data ([0122]: ML service 350 collects an initial training corpus from available database performance-related data sources such as records of historical database operations and/or established database benchmarks such as TPC-C, TPC-H, etc.), the historical run data comprising 
job characteristics ([0134]: ML service 350 records features of the workloads), 
runtime results ([0134]: ML service 350 also gathers performance metrics from the workload experiments, including task runtimes and the network wait times), and 
token counts for each of a plurality of prior jobs ([0016]: FIG. 1 depicts graphs of query runtimes, for four different types of queries, plotted against numbers of compute nodes for clusters running the respective queries [Historical run data for token counts for a plurality of prior jobs is depicted in the graph of Fig. 1, where the token counts correspond to the number of nodes in the cluster needed to execute the workload]; [0027]: Specifically, embodiments identify, for a given in-memory workload, one or both of a minimum number of compute nodes and an optimal number of compute nodes to execute the workload. See also the Abstract and paras. [0022], [0027]-[0034], [0123], [0140], [0219]), and
the job characteristics comprising an intermediate representation ([0134]: ML service 350 records features of the workloads… including generated query plans, and the intermediate result sizes for the queries in the workload) and 
job graph data, wherein the token counts comprise a number of virtual machine cores ([0016]: FIG. 1 depicts graphs of query runtimes, for four different types of queries, plotted against numbers of compute nodes for clusters running the respective queries; Fig. 10; [0216]-[0217]: VMM 1030 instantiates and runs one or more virtual machine instances… VMM 1030 may provide full hardware and CPU virtualization);
based at least on the training data, training a token estimator, the token estimator comprising a machine learning (ML) model ([0033]: Thus, embodiments use trained query performance machine learning models to predict the optimal number of compute nodes for a given workload [A token corresponds to the number of compute nodes for a given workload]; [0088]: For example, once ML service 350 predicts an optimal compute node cardinality for a particular workload, ML service 350 estimates a query response time performance metric and/or throughput performance metric (e.g., queries per second) for the workload given the predicted optimal number of nodes [ML service 350 corresponds to the token estimator]); 
receiving job characteristics for a user-submitted job (Fig. 2; [0034]: Specifically, at step 202 of flowchart 200, workload information for a particular database workload is received, where the workload information includes at least (a) a portion of a dataset for the particular database workload and (b) one or more queries being run in the particular database workload. For example, a machine learning service, such as ML service 350 depicted in FIG. 3, receives a request, from a user, to automatically provision a particular workload); 
based at least on the received job characteristics, generating, with the token estimator, token prediction data for the user-submitted job ([0051]: As indicated in step 204 of flowchart 200, ML service 350 predicts runtimes of queries in workload 360, based on multiple compute node cardinalities, using one or more trained QP-ML models. The accuracy of a query/workload runtime prediction is affected by the granularity at which the prediction is generated. [The multiple compute node cardinalities correspond to the received job characteristics. ML service 350 corresponds to the token estimator. The predicted runtimes of queries in workload 360 correspond to the token prediction data]); 
selecting a token count for the user-submitted job, based at least on the token prediction data (Fig. 2; [0076]-[0078]: Returning to the discussion of flowchart 200 of FIG. 2, at step 206, based on the plurality of predicted query runtimes identified for each query, of the one or more queries, an optimal number of compute nodes for the particular database workload is determined. [The optimal number of compute nodes corresponds to the selected token count]); 
identifying the selected token count to an execution environment ([0116]: At step 208 of flowchart 200 (FIG. 2), output that specifies the optimal number of compute nodes for the particular database workload is generated within a memory… e.g., within memory of server device 314); and 
executing, with the execution environment, the user-submitted job in accordance with the selected token count ([0118]: ML service automatically provisions workload 360 using the generated output… As a further example, ML service 350 automatically provisions the optimal number of compute nodes, in the generated output, and loads the workload onto the provisioned cluster of compute nodes, e.g., in response to a request from the user to automatically provision the workload using the identified optimal compute node).
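
For illustration only, and not as part of the grounds of rejection, the cited flow of steps 204 and 206 (predicting runtimes at multiple compute node cardinalities and then determining a node count from those predictions) can be sketched in Python as follows. The function names, the toy runtime model, and the selection rule (smallest total predicted runtime, with ties going to fewer nodes) are hypothetical assumptions made for this sketch; they are not taken from Idicula or from the claims.

```python
from typing import Callable, Dict, Iterable, List


def predict_workload_runtimes(
    queries: List[str],
    node_counts: Iterable[int],
    predict_query_runtime: Callable[[str, int], float],
) -> Dict[int, float]:
    """Sum the per-query predicted runtimes for each candidate node count."""
    return {
        n: sum(predict_query_runtime(q, n) for q in queries)
        for n in node_counts
    }


def select_node_count(total_runtimes: Dict[int, float]) -> int:
    """Assumed rule: smallest total predicted runtime; ties go to fewer nodes."""
    return min(total_runtimes, key=lambda n: (total_runtimes[n], n))


def toy_model(query: str, nodes: int) -> float:
    # Hypothetical stand-in for a trained model: parallel work shrinks with
    # the node count while a coordination cost grows with it.
    work = float(len(query))
    return work / nodes + 1.0 * nodes


queries = ["SELECT a FROM t", "SELECT b FROM u WHERE c > 1"]
totals = predict_workload_runtimes(queries, range(1, 9), toy_model)
selected = select_node_count(totals)
print(selected, totals[selected])
```

In this sketch the candidate node counts play the role of the selectable token counts, and the dictionary of total runtimes plays the role of the token prediction data from which a count is selected.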

Regarding Claim 2, Idicula discloses the method of claim 1, further comprising: outputting execution results for the user-submitted job, wherein a runtime for the user-submitted job is based at least on the selected token count ([0116]: At step 208 of flowchart 200 (FIG. 2), output that specifies the optimal number of compute nodes for the particular database workload is generated within a memory, and at step 708 of flowchart 700 (FIG. 7), output that specifies the minimum number of compute nodes for the particular database workload is generated within a memory… ML service 350 generates output that specifies one or both of the optimal compute node cardinality and minimum compute node cardinality for workload 360, e.g., within memory of server device 314).

Regarding Claim 3, Idicula discloses the method of claim 1, wherein selecting the token count comprises: receiving the selected token count through a user input; or setting the selected token count based at least on a recommended token count in the token prediction data ([0032]: According to an embodiment, the basis for determining an optimal number of compute nodes for a given workload is identified by a user; [0202]: An input device 914, including alphanumeric and other keys, is coupled to bus 902 for communicating information and command selections to processor 904. Another type of user input device is cursor control 916, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 904 and for controlling cursor movement on display 912).

Regarding Claim 4, Idicula discloses the method of claim 1, wherein generating token prediction data comprises: generating monotonically non-increasing curve data for the user-submitted job, the curve data indicating a predicted runtime for each of a plurality of selectable token counts; or generating, for the user-submitted job, a point prediction runtime value for an identified token count (Fig. 6A; [0082]: To illustrate in the context of chart 600, ML service 350 determines the relative predicted speeds of each compute node cardinality 1-8 based on a ratio between (a) a maximum cardinality-specific total workload runtime of the calculated cardinality-specific workload runtimes, and (b) the cardinality-specific total workload runtime for the respective compute node cardinality).
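
For illustration only, the ratio quoted from para [0082] (the maximum cardinality-specific total workload runtime divided by the total workload runtime for each compute node cardinality) can be computed as in the following Python sketch. The runtime values are made-up example numbers, the function names are hypothetical, and the monotonicity check is an assumed reading of the "monotonically non-increasing" claim language rather than anything described in Idicula.

```python
from typing import Dict


def relative_speeds(total_runtimes: Dict[int, float]) -> Dict[int, float]:
    """Ratio of the slowest total runtime to each cardinality's total runtime."""
    if any(t <= 0 for t in total_runtimes.values()):
        raise ValueError("runtimes must be positive")
    slowest = max(total_runtimes.values())
    return {n: slowest / t for n, t in total_runtimes.items()}


def is_monotonically_non_increasing(total_runtimes: Dict[int, float]) -> bool:
    """True if the runtime never rises as the node count increases."""
    ordered = [total_runtimes[n] for n in sorted(total_runtimes)]
    return all(later <= earlier for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical total workload runtimes (seconds) keyed by node cardinality.
runtimes = {1: 80.0, 2: 40.0, 4: 20.0, 8: 16.0}
speeds = relative_speeds(runtimes)
print(speeds)
print(is_monotonically_non_increasing(runtimes))
```

A cardinality with a ratio of 1.0 is the slowest configuration, and larger ratios indicate proportionally faster predicted execution.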

Regarding Claim 5, Idicula discloses the method of claim 1, wherein the ML model comprises at least one ML model selected from a list comprising: a multi-layer fully connected neural network (NN), or a graph neural network (GNN) ([0154]: Examples of machine learning algorithms include decision trees, support vector machines (SVM), Bayesian networks, stochastic algorithms such as genetic algorithms (GA), and connectionist topologies such as artificial neural networks (ANN). See also [0151]: Machine Learning Models).

Regarding Claim 6, Idicula discloses the method of claim 1, further comprising:
generating simulated run data based at least on the historical run data and constant token-seconds values ([0122]: Specifically, ML service 350 collects an initial training corpus from available database performance-related data sources such as records of historical database operations and/or established database benchmarks such as TPC-C, TPC-H, etc.); and 
augmenting the training data with the simulated run data ([0130]: Thus, according to an embodiment, the training data generation framework formulates, and causes to be run, experiments over known workloads to generate additional training data that is not present in an initial training corpus).

Regarding Claim 7, Idicula discloses the method of claim 1, wherein the user-submitted job comprises an ad hoc job ([0034]: FIG. 2 depicts a flowchart 200 for automatically predicting an optimal number of compute nodes for a cluster to run a given workload. Specifically, at step 202 of flowchart 200, workload information for a particular database workload is received, where the workload information includes at least (a) a portion of a dataset for the particular database workload and (b) one or more queries being run in the particular database workload. For example, a machine learning service, such as ML service 350 depicted in FIG. 3, receives a request, from a user, to automatically provision a particular workload).

Regarding Claim 8, Idicula discloses a system for optimizing job runtimes ([0024]: FIG. 9 depicts a computer system that may be used in an embodiment), the system comprising: 
a processor (Fig. 9, processor 904); and 
a computer-readable medium storing instructions (Fig. 9, storage device 910) that are operative upon execution by the processor to: 
receive training data comprising historical run data ([0122]: ML service 350 collects an initial training corpus from available database performance-related data sources such as records of historical database operations and/or established database benchmarks such as TPC-C, TPC-H, etc.), the historical run data comprising 
job characteristics ([0134]: ML service 350 records features of the workloads), 
runtime results ([0134]: ML service 350 also gathers performance metrics from the workload experiments, including task runtimes and the network wait times), and 
token counts for a plurality of prior jobs ([0016]: FIG. 1 depicts graphs of query runtimes, for four different types of queries, plotted against numbers of compute nodes for clusters running the respective queries [Historical run data for token counts for a plurality of prior jobs is depicted in the graph of Fig. 1, where the token counts correspond to the number of nodes in the cluster needed to execute the workload]; [0027]: Specifically, embodiments identify, for a given in-memory workload, one or both of a minimum number of compute nodes and an optimal number of compute nodes to execute the workload. See also the Abstract and paras. [0022], [0027]-[0034], [0123], [0140], [0219]), and
the job characteristics comprising an intermediate representation ([0134]: ML service 350 records features of the workloads… including generated query plans, and the intermediate result sizes for the queries in the workload) and 
job graph data, wherein the token counts comprise a number of virtual machine cores ([0016]: FIG. 1 depicts graphs of query runtimes, for four different types of queries, plotted against numbers of compute nodes for clusters running the respective queries; Fig. 10; [0216]-[0217]: VMM 1030 instantiates and runs one or more virtual machine instances… VMM 1030 may provide full hardware and CPU virtualization);
based at least on the training data, training a token estimator, the token estimator comprising a machine learning (ML) model ([0033]: Thus, embodiments use trained query performance machine learning models to predict the optimal number of compute nodes for a given workload [A token corresponds to the number of compute nodes for a given workload]; [0088]: For example, once ML service 350 predicts an optimal compute node cardinality for a particular workload, ML service 350 estimates a query response time performance metric and/or throughput performance metric (e.g., queries per second) for the workload given the predicted optimal number of nodes [ML service 350 corresponds to the token estimator]); 
receive job characteristics for a user-submitted job (Fig. 2; [0034]: Specifically, at step 202 of flowchart 200, workload information for a particular database workload is received, where the workload information includes at least (a) a portion of a dataset for the particular database workload and (b) one or more queries being run in the particular database workload. For example, a machine learning service, such as ML service 350 depicted in FIG. 3, receives a request, from a user, to automatically provision a particular workload); 
based at least on the received job characteristics, generating, with the token estimator, token prediction data for the user-submitted job ([0051]: As indicated in step 204 of flowchart 200, ML service 350 predicts runtimes of queries in workload 360, based on multiple compute node cardinalities, using one or more trained QP-ML models. The accuracy of a query/workload runtime prediction is affected by the granularity at which the prediction is generated. [The multiple compute node cardinalities correspond to the received job characteristics. ML service 350 corresponds to the token estimator. The predicted runtimes of queries in workload 360 correspond to the token prediction data]); 
select a token count for the user-submitted job, based at least on the token prediction data (Fig. 2; [0076]-[0078]: Returning to the discussion of flowchart 200 of FIG. 2, at step 206, based on the plurality of predicted query runtimes identified for each query, of the one or more queries, an optimal number of compute nodes for the particular database workload is determined. [The optimal number of compute nodes corresponds to the selected token count]); 
identify the selected token count to an execution environment ([0116]: At step 208 of flowchart 200 (FIG. 2), output that specifies the optimal number of compute nodes for the particular database workload is generated within a memory… e.g., within memory of server device 314); and 
execute, with the execution environment, the user-submitted job in accordance with the selected token count ([0118]: ML service automatically provisions workload 360 using the generated output… As a further example, ML service 350 automatically provisions the optimal number of compute nodes, in the generated output, and loads the workload onto the provisioned cluster of compute nodes, e.g., in response to a request from the user to automatically provision the workload using the identified optimal compute node).

Regarding Claim 9, Idicula discloses the system of claim 8, wherein the instructions are further operative to: output execution results for the user-submitted job, wherein a runtime for the user-submitted job is based at least on the selected token count ([0116]: At step 208 of flowchart 200 (FIG. 2), output that specifies the optimal number of compute nodes for the particular database workload is generated within a memory, and at step 708 of flowchart 700 (FIG. 7), output that specifies the minimum number of compute nodes for the particular database workload is generated within a memory… ML service 350 generates output that specifies one or both of the optimal compute node cardinality and minimum compute node cardinality for workload 360, e.g., within memory of server device 314).

Regarding Claim 10, Idicula discloses the system of claim 8, wherein selecting the token count comprises: receiving the selected token count through a user input; or setting the selected token count based at least on a recommended token count in the token prediction data ([0032]: According to an embodiment, the basis for determining an optimal number of compute nodes for a given workload is identified by a user; [0202]: An input device 914, including alphanumeric and other keys, is coupled to bus 902 for communicating information and command selections to processor 904. Another type of user input device is cursor control 916, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 904 and for controlling cursor movement on display 912).

Regarding Claim 11, Idicula discloses the system of claim 8, wherein generating token prediction data comprises: generating monotonically non-increasing curve data for the user-submitted job, the curve data indicating a predicted runtime for each of a plurality of selectable token counts; or generating, for the user-submitted job, a point prediction runtime value for an identified token count (Fig. 6A; [0082]: To illustrate in the context of chart 600, ML service 350 determines the relative predicted speeds of each compute node cardinality 1-8 based on a ratio between (a) a maximum cardinality-specific total workload runtime of the calculated cardinality-specific workload runtimes, and (b) the cardinality-specific total workload runtime for the respective compute node cardinality).

Regarding Claim 12, Idicula discloses the system of claim 8, wherein the ML model comprises at least one ML model selected from a list comprising: a multi-layer fully connected neural network (NN), or a graph neural network (GNN) ([0154]: Examples of machine learning algorithms include decision trees, support vector machines (SVM), Bayesian networks, stochastic algorithms such as genetic algorithms (GA), and connectionist topologies such as artificial neural networks (ANN). See also [0151]: Machine Learning Models).

Regarding Claim 13, Idicula discloses the system of claim 8, wherein the instructions are further operative to: 
generate simulated run data based at least on the historical run data and constant token-seconds values ([0122]: Specifically, ML service 350 collects an initial training corpus from available database performance-related data sources such as records of historical database operations and/or established database benchmarks such as TPC-C, TPC-H, etc.); and 
augment the training data with the simulated run data ([0130]: Thus, according to an embodiment, the training data generation framework formulates, and causes to be run, experiments over known workloads to generate additional training data that is not present in an initial training corpus).

Regarding Claim 14, Idicula discloses the system of claim 8, wherein the user-submitted job comprises an ad hoc job ([0034]: FIG. 2 depicts a flowchart 200 for automatically predicting an optimal number of compute nodes for a cluster to run a given workload. Specifically, at step 202 of flowchart 200, workload information for a particular database workload is received, where the workload information includes at least (a) a portion of a dataset for the particular database workload and (b) one or more queries being run in the particular database workload. For example, a machine learning service, such as ML service 350 depicted in FIG. 3, receives a request, from a user, to automatically provision a particular workload).

Regarding Claim 15, Idicula discloses one or more computer storage devices having computer-executable instructions stored thereon (Fig. 9; [0201]: A storage device 910, such as a magnetic disk, optical disk, or solid-state drive is provided and coupled to bus 902 for storing information and instructions), which, on execution by a computer, cause the computer to perform operations comprising: 
receiving training data comprising historical run data ([0122]: ML service 350 collects an initial training corpus from available database performance-related data sources such as records of historical database operations and/or established database benchmarks such as TPC-C, TPC-H, etc.), the historical run data comprising 
job characteristics ([0134]: ML service 350 records features of the workloads), 
runtime results ([0134]: ML service 350 also gathers performance metrics from the workload experiments, including task runtimes and the network wait times), and 
token counts for a plurality of prior jobs ([0016]: FIG. 1 depicts graphs of query runtimes, for four different types of queries, plotted against numbers of compute nodes for clusters running the respective queries [Historical run data for token counts for a plurality of prior jobs is depicted in the graph of Fig. 1, where the token counts correspond to the number of nodes in the cluster needed to execute the workload]; [0027]: Specifically, embodiments identify, for a given in-memory workload, one or both of a minimum number of compute nodes and an optimal number of compute nodes to execute the workload. See also the Abstract and paras. [0022], [0027]-[0034], [0123], [0140], [0219]), and
the job characteristics comprising an intermediate representation ([0134]: ML service 350 records features of the workloads… including generated query plans, and the intermediate result sizes for the queries in the workload) and 
job graph data, wherein the token counts comprise a number of virtual machine cores and associated memory ([0016]: FIG. 1 depicts graphs of query runtimes, for four different types of queries, plotted against numbers of compute nodes for clusters running the respective queries; Fig. 10; [0216]-[0217]: VMM 1030 instantiates and runs one or more virtual machine instances… VMM 1030 may provide full hardware and CPU virtualization);
based at least on the training data, training a token estimator, the token estimator comprising a machine learning (ML) model ([0033]: Thus, embodiments use trained query performance machine learning models to predict the optimal number of compute nodes for a given workload [A token corresponds to the number of compute nodes for a given workload]; [0088]: For example, once ML service 350 predicts an optimal compute node cardinality for a particular workload, ML service 350 estimates a query response time performance metric and/or throughput performance metric (e.g., queries per second) for the workload given the predicted optimal number of nodes [ML service 350 corresponds to the token estimator]); 
receiving job characteristics for a user-submitted job (Fig. 2; [0034]: Specifically, at step 202 of flowchart 200, workload information for a particular database workload is received, where the workload information includes at least (a) a portion of a dataset for the particular database workload and (b) one or more queries being run in the particular database workload. For example, a machine learning service, such as ML service 350 depicted in FIG. 3, receives a request, from a user, to automatically provision a particular workload); 
based at least on the received job characteristics, generating, with the token estimator, token prediction data for the user-submitted job ([0051]: As indicated in step 204 of flowchart 200, ML service 350 predicts runtimes of queries in workload 360, based on multiple compute node cardinalities, using one or more trained QP-ML models. The accuracy of a query/workload runtime prediction is affected by the granularity at which the prediction is generated. [The multiple compute node cardinalities correspond to the received job characteristics. ML service 350 corresponds to the token estimator. The predicted runtimes of queries in workload 360 correspond to the token prediction data]); 
selecting a token count for the user-submitted job, based at least on the token prediction data (Fig. 2; [0076]-[0078]: Returning to the discussion of flowchart 200 of FIG. 2, at step 206, based on the plurality of predicted query runtimes identified for each query, of the one or more queries, an optimal number of compute nodes for the particular database workload is determined. [The optimal number of compute nodes corresponds to the selected token count]); 
identifying the selected token count to an execution environment ([0116]: At step 208 of flowchart 200 (FIG. 2), output that specifies the optimal number of compute nodes for the particular database workload is generated within a memory… e.g., within memory of server device 314); and 
executing, with the execution environment, the user-submitted job in accordance with the selected token count ([0118]: ML service automatically provisions workload 360 using the generated output… As a further example, ML service 350 automatically provisions the optimal number of compute nodes, in the generated output, and loads the workload onto the provisioned cluster of compute nodes, e.g., in response to a request from the user to automatically provision the workload using the identified optimal compute node).

Regarding Claim 16, Idicula discloses the one or more computer storage devices of claim 15, wherein the operations further comprise: outputting execution results for the user-submitted job, wherein a runtime for the user-submitted job is based at least on the selected token count ([0118]: As a further example, ML service 350 automatically provisions the optimal number of compute nodes, in the generated output, and loads the workload onto the provisioned cluster of compute nodes; [0119]: FIG. 8 depicts a graph 800 with results of utilizing techniques described herein to predict task runtime).

Regarding Claim 17, Idicula discloses the one or more computer storage devices of claim 15, wherein selecting the token count comprises: receiving the selected token count through a user input; or setting the selected token count based at least on a recommended token count in the token prediction data ([0032]: According to an embodiment, the basis for determining an optimal number of compute nodes for a given workload is identified by a user; [0202]: An input device 914, including alphanumeric and other keys, is coupled to bus 902 for communicating information and command selections to processor 904. Another type of user input device is cursor control 916, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 904 and for controlling cursor movement on display 912).

Regarding Claim 18, Idicula discloses the one or more computer storage devices of claim 15, wherein generating token prediction data comprises: generating monotonically non-increasing curve data for the user-submitted job, the curve data indicating a predicted runtime for each of a plurality of selectable token counts; or generating, for the user-submitted job, a point prediction runtime value for an identified token count (Fig. 6A; [0082]: To illustrate in the context of chart 600, ML service 350 determines the relative predicted speeds of each compute node cardinality 1-8 based on a ratio between (a) a maximum cardinality-specific total workload runtime of the calculated cardinality-specific workload runtimes, and (b) the cardinality-specific total workload runtime for the respective compute node cardinality).
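For illustration only, and not as a characterization of Idicula's actual implementation, the ratio quoted from [0082] and the "monotonically non-increasing" property recited in the claim can be sketched as follows. The function names and the sample runtimes are hypothetical and are chosen solely to make the arithmetic concrete.

```python
# Hypothetical sketch: names and sample values are assumptions, not taken
# from Idicula or from the application under examination.

def relative_speeds(total_runtimes):
    """Relative speed per compute-node cardinality, as quoted from [0082]:
    (maximum cardinality-specific total runtime) / (runtime for this cardinality).
    """
    if not total_runtimes:
        raise ValueError("at least one cardinality is required")
    if any(r <= 0 for r in total_runtimes.values()):
        raise ValueError("runtimes must be positive")
    slowest = max(total_runtimes.values())
    return {n: slowest / r for n, r in total_runtimes.items()}


def is_monotonically_non_increasing(values):
    """True if each value is less than or equal to the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# Assumed predicted runtimes (seconds) keyed by number of compute nodes.
runtimes = {1: 800.0, 2: 400.0, 4: 200.0, 8: 200.0}
speeds = relative_speeds(runtimes)

# Runtimes listed in order of increasing cardinality form the curve data.
curve = [runtimes[n] for n in sorted(runtimes)]
curve_ok = is_monotonically_non_increasing(curve)
```

Under these assumed values, the slowest configuration has a relative speed of 1.0, and adding nodes beyond four yields no further reduction in predicted runtime, so the curve flattens rather than rises.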

Regarding Claim 19, Idicula discloses the one or more computer storage devices of claim 15, wherein the ML model comprises at least one ML model selected from a list comprising: a multi-layer fully connected neural network (NN), or a graph neural network (GNN) ([0154]: Examples of machine learning algorithms include decision trees, support vector machines (SVM), Bayesian networks, stochastic algorithms such as genetic algorithms (GA), and connectionist topologies such as artificial neural networks (ANN). See also [0151]: Machine Learning Models).

Regarding Claim 20, Idicula discloses the one or more computer storage devices of claim 15, wherein the operations further comprise: 
generating simulated run data based at least on the historical run data and constant token-seconds values ([0122]: Specifically, ML service 350 collects an initial training corpus from available database performance-related data sources such as records of historical database operations and/or established database benchmarks such as TPC-C, TPC-H, etc); and 
augmenting the training data with the simulated run data ([0130]: Thus, according to an embodiment, the training data generation framework formulates, and causes to be run, experiments over known workloads to generate additional training data that is not present in an initial training corpus).




Examiner Note
Examiner has cited particular columns, paragraphs, and line numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claims, other passages and figures may apply as well. Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of each passage as taught by the prior art or disclosed by the Examiner.
In the case of amending the claimed invention, Applicant is respectfully requested to indicate the portion(s) of the specification which dictate(s) the structure relied on for proper interpretation and also to verify and ascertain the metes and bounds of the claimed invention. This will assist in expediting compact prosecution. MPEP 714.02 recites: "Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714." Amendments not pointing to specific support in the disclosure may be deemed as not complying with the provisions of 37 CFR 1.121(b), (c), (d), and (h) and therefore held not fully responsive. Generic statements such as "Applicants believe no new matter has been introduced" may be deemed insufficient.

Conclusion

THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIRLEY D. HICKS whose telephone number is (571)272-3304.  The examiner can normally be reached on Mon - Fri 7:30 - 4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at (571) 272-4034.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/S D H/ Examiner, Art Unit 2168
/IRETE F EHICHIOYA/ Supervisory Patent Examiner, Art Unit 2168