Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Detailed Action
Remarks
	This communication has been issued in response to Applicant’s amended claim language filed 29 March 2022.  Claims 1-3, 5-21 & 23-27 are now pending in this application.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 25 February 2022 was filed after the mailing date of the disclosure.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-21 & 23-27 are rejected under 35 U.S.C. 103 as being unpatentable over Einkauf et al (USPG Pub No. 20160323377A1; Einkauf hereinafter) in view of Richards et al (US Patent No. 8,510,273B2; Richards hereinafter).

As for Claim 1, Einkauf teaches, A method for auto-scaling a query engine, the method comprising:
“monitoring, by one or more processors, query traffic at the query engine” (see pp. [0022-0024], [0048]; e.g., the reference of Einkauf provides methods for implementing automatic scaling of computing resources within a cluster-based distributed computing system.  Einkauf provides for the monitoring of metrics, applied automatically in response to one or more requests from one or more clients where interaction with a service is performed.  According to at least paragraphs [0024] & [0028], a MapReduce or Hadoop framework is utilized as an engine {i.e. “query engine”} for the execution of one or more auto-scaling policies); and
“instructing, by one or more processors, an auto-scaling of a cluster of worker nodes to change a number of worker nodes of the worker nodes available in the cluster based on the comparison, over a defined period of time, of the query traffic relative to a defined upscaling threshold and a defined downscaling threshold, wherein the auto-scaling adds or removes a number of worker nodes to a cluster of worker nodes that processes queries of the query engine” (see pp. [0030]; e.g., paragraph [0030] teaches of receiving metrics {i.e. “work pending for each job”, “number of pending jobs per container”, “available capacity and remaining capacity”} from one or more of at least “Hadoop framework”, “Hadoop Yarn”, or “HDFS”, which are utilized by at least one or more of an “auto-scaling rules engine”, along with other trigger types {i.e. time, day, date, cost}, to determine which nodes are available for addition or removal when increasing or decreasing the capacity of nodes in a cluster based on at least behaviors and workloads {i.e. considered equivalent to Applicant’s “query traffic”}.  Auto-scaling techniques may determine which nodes are eligible for removal when reducing capacity in a cluster based on their types, roles, behavior and/or the workloads they are configured to accommodate.  “Instance groups”, to which the one or more nodes belong {i.e. task nodes, “various nodes in a MapReduce cluster”}, are considered equivalent to Applicant’s one or more of a “service class”, as each node grouping is designated for particular functions {i.e. “core nodes” are nodes designed to have storage and execute jobs; task nodes are designed for managing jobs} and, as described within paragraph [0032], can have capacity increased within a MapReduce cluster with one or more available instances from various resource pools according to applicable management policies and/or service agreements.  
Paragraph [0104] expounds further on the instantiation of at least an auto-scaling policy including a rule that causes a cluster “to be scaled up if HDFS utilization exceeds 90% for more than 2 hours”, reading on Applicant’s claimed limitation, as a threshold is “breached” in regards to a corresponding predetermined threshold amount.  The cited paragraph [0104] also expounds further on the instantiation of at least an auto-scaling policy including a rule that causes a cluster “to be scaled down if the cluster is idle for more than one hour”, therefore, changing the number of nodes within one or more clusters/“instance groups” over a period of time in regards to at least a received workload).
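For illustration only, the two sustained-threshold rules quoted above from paragraph [0104] of Einkauf can be sketched as follows.  This is a minimal sketch and not Einkauf's implementation: the function names `sustained` and `decide`, the sample format, and the treatment of “idle” as zero running jobs are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical sample format: (timestamp in seconds, metric value).
Sample = Tuple[float, float]


def sustained(samples: List[Sample], condition: Callable[[float], bool],
              hold_seconds: float, now: float) -> bool:
    """Return True if `condition` has held on every sample for more than
    `hold_seconds` up to `now`. `samples` must be sorted by timestamp."""
    run_start = None
    for ts, value in reversed(samples):
        if ts > now:
            continue  # ignore samples later than the evaluation time
        if not condition(value):
            break  # the unbroken run of matching samples ends here
        run_start = ts
    return run_start is not None and (now - run_start) > hold_seconds


def decide(hdfs_utilization: List[Sample], running_jobs: List[Sample],
           now: float) -> str:
    """Apply the two quoted rules; 'idle' is assumed to mean zero running jobs."""
    # "scaled up if HDFS utilization exceeds 90% for more than 2 hours"
    if sustained(hdfs_utilization, lambda u: u > 0.90, 2 * 3600, now):
        return "scale_up"
    # "scaled down if the cluster is idle for more than one hour"
    if sustained(running_jobs, lambda n: n == 0, 1 * 3600, now):
        return "scale_down"
    return "no_change"
```

The sketch shows that each rule pairs a threshold with a period of time, so a momentary spike or lull alone does not trigger a change in the number of nodes.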
The reference of Einkauf does not appear to explicitly recite the amended limitations of, “classifying, by one or more processors, queries of the query traffic to a service class of a plurality of service classes based on a level of complexity, wherein an overall resource usage for each service class of the plurality of service classes is equal”; and “comparing, by one or more processors, query traffic, over a defined period of time, for each service class of the plurality of service classes with a concurrency threshold of a maximum number of queries of the service class allowed to be concurrently processed by each service class”.
The reference of Richards recites the amended limitations of, 
“classifying, by one or more processors, queries of the query traffic to a service class of a plurality of service classes based on a level of complexity, wherein an overall resource usage for each service class of the plurality of service classes is equal” (see col. 5, lines 26-53; col. 7, lines 9-67 through col. 8, lines 1-31, 57-67; col. 9, lines 1-13; e.g., the reference of Richards serves as an enhancement to the teachings of the Einkauf reference by providing mechanisms for tracking the number of queries received for processing for a workload to facilitate arrival rate qualifications to throughput service level goals (SLGs).  Column 5, lines 26-53 teaches of a system that is a goal-orientated workload management system capable of supporting complex workloads and self-adjusting to various types of workloads {i.e. workload queries}, with system operations having a plurality of phases, such as “...1) assigning a set of incoming request characteristics to workload groups, assigning the workload groups to priority classes, and assigning goals (called Service Level Goals or SLGs) to the workload groups”, where the assigning of workloads to “priority classes” is considered equivalent to Applicant’s classifying queries to “service classes”.  Richards, at least within column 7, lines 9-67 through column 8, lines 1-31, 57-67 and column 9, lines 1-13, teaches of providing an “internal monitoring and regulating component (regulator)” that dynamically monitors workload characteristics using workload rules or other heuristics based on past and current performance of the system, where detection of through-put based SLG misses undergoes further qualification based on actual arrival rates of workloads (WDs).  Missed SLGs can be indications of “system overload” or “system under-demand” {i.e. Applicant’s “overall resource usage”}, and, if the arrival rate of a WD is “less than or equal to” the “Throughput-SLG of the WD”, then the cause of the missed Throughput-SLG is under-demand, for example.  As discussed within column 13, lines 41-55, SLGs that have been “missed”, as well as additional system performance conditions, such as, “...1. Any sustained high or low usage of a resource, such as high CPU usage, high I/O usage, a higher than average arrival rate, or a high concurrency rate; 2. Any unusual resource depletion, such as running out of AWTs, problems with flow control, and unusually high memory usage...”, can be considered for the triggering of one or more actions); and 
“comparing, by one or more processors, query traffic, over a defined period of time, for each service class of the plurality of service classes with a concurrency threshold of a maximum number of queries of the service class allowed to be concurrently processed by each service class” (see col. 9, lines 14-46; e.g., the system of Richards teaches of utilizing at least a “comparator” which determines if a received request should be queued or released for execution by determining the “workload group assignment” {i.e. considered equivalent to the “priority class” of Richards and “service classes” amended by Applicant} for the request {i.e. query} and comparing the workload group’s performance against the workload rules.  The comparator compares the “concurrency level of requests” {i.e. considered equivalent to Applicant’s “concurrency threshold”} being executed under the workload group to which the request is assigned, reading on Applicant’s claimed limitation.  As stated within column 7, lines 19-33, the number of queries received for processing for each workload is tracked to facilitate arrival rate qualifications to throughput service level goals, and a “number of queries counter” associated with a particular workload is incremented each time a query assigned to the particular workload is received.  Concurrency levels {i.e. number of concurrent executing queries} are monitored and must adhere to a defined threshold and in view of request execution characteristics against a set of exception conditions).
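For illustration only, the queue-or-release decision attributed above to the comparator of Richards can be sketched as follows.  This is a minimal sketch under stated assumptions and not the code of Richards: the class name `WorkloadGroup`, its fields, and the method names `submit` and `finish` are hypothetical, and the only workload rule modeled is a per-group concurrency limit.

```python
from collections import deque


class WorkloadGroup:
    """Hypothetical workload group with a single concurrency limit."""

    def __init__(self, name: str, concurrency_limit: int):
        self.name = name
        self.concurrency_limit = concurrency_limit
        self.running = 0          # requests currently executing
        self.delayed = deque()    # requests waiting to be released

    def submit(self, request: str) -> str:
        """Release the request if the group is under its limit, else queue it."""
        if self.running < self.concurrency_limit:
            self.running += 1
            return "released"
        self.delayed.append(request)
        return "queued"

    def finish(self):
        """Mark one running request complete and release a delayed one, if any."""
        if self.running > 0:
            self.running -= 1
        if self.delayed and self.running < self.concurrency_limit:
            self.running += 1
            return self.delayed.popleft()
        return None
```

In this sketch the number of concurrently executing requests in a group is compared against the group's limit each time a request arrives, and a delayed request is released once capacity returns.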
The combined references of Einkauf and Richards are considered analogous art for being within the same field of endeavor, which is the automatic scaling of computing resource instances in a cluster-based distributed computing system having mechanisms for tracking the number of queries received for processing.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the classification of received queries and comparing of query activity with concurrency levels, as taught by Richards, with the method of Einkauf, in order to detect over/under performance of workload management techniques when comparing query performance metrics against Service Level Goals (SLGs) for workloads, and determining reasons for missed Service Level Goals conformance. (Richards; col. 1, lines 23-67 through col. 2, lines 1-13) 

As for Claim 2, Einkauf recites the limitation of, “wherein the defined upscaling threshold and the defined downscaling threshold are each a respective defined threshold ratio of a number of queries in the query traffic compared to the concurrency threshold for each service class” (see pp. [0104]; e.g., paragraph [0104] expounds further on the instantiation of at least an auto-scaling policy including a rule that causes a cluster “to be scaled up if HDFS utilization exceeds 90% for more than 2 hours” or including a rule that causes a cluster “to be scaled down if the cluster is idle for more than one hour”, reading on Applicant’s claimed limitation, as a threshold is “breached” in regards to a corresponding predetermined threshold amount.  The metric of “90%”, in regards to HDFS utilization, is a representation of a ratio where two metrics are compared.  Previous text of paragraphs [0086-0088] teaches of auto-scaling policies including a set of “cluster-level limits”, considered equivalent to Applicant’s one or more of a “concurrency threshold”, as they can be configured to provide at least, “an optional maximum instance count that constrains how many instances can be added by an auto-scaling operation. For example, in order to constrain the operation so that no more than twenty-five instances are included in the affected cluster or instance group following an auto-scaling operation to add capacity, the policy may set this limit to a value of "25"”, thus, adjusting one or more of a “cluster-level limit” while “following an auto-scaling operation” in regards to an affected cluster or instance group).

As for Claim 3, Einkauf teaches, further comprising:
“evaluating, by one or more processors, a number of worker nodes to change based on an aggregation of the comparisons of query traffic across all service classes at a given time” (see pp. [0030]; e.g., paragraph [0030] teaches of receiving metrics {i.e. “work pending for each job”, “number of pending jobs per container”, “available capacity and remaining capacity”} from one or more of at least “Hadoop framework”, “Hadoop Yarn”, or “HDFS”, which is utilized by at least an “auto-scaling rules engine”, along with other trigger types {i.e. time, day, date, cost}, to determine which nodes are available for removal when reducing capacity in a cluster based on at least behaviors and workloads {i.e. considered equivalent to Applicant’s “query traffic”}). 

As for Claim 5, Einkauf teaches, wherein evaluating a number of worker nodes to change further comprises:
“for each service class for which the defined downscaling threshold is breached, determining, by one or more processors, the number of worker nodes to change based on a current proportion of worker nodes assigned to the service class and a required decrease in capacity based on the comparison” (see pp. [0030]; e.g., paragraph [0030] teaches of receiving metrics {i.e. “work pending for each job”, “number of pending jobs per container”, “available capacity and remaining capacity”} from one or more of at least “Hadoop framework”, “Hadoop Yarn”, or “HDFS”, which is utilized by at least an “auto-scaling rules engine”, along with other trigger types {i.e. time, day, date, cost}, to determine which nodes are available for addition/removal when increasing or decreasing capacity in a cluster based on at least behaviors and workloads {i.e. considered equivalent to Applicant’s “query traffic”}.  “Instance groups”, to which the one or more nodes belong {i.e. task nodes, “various nodes in a MapReduce cluster”}, are considered equivalent to Applicant’s one or more of a “service class”, and, as described within paragraph [0032], can have capacity increased within a MapReduce cluster with one or more available instances from various resource pools according to applicable management policies and/or service agreements.  Paragraph [0104] expounds further on the instantiation of at least an auto-scaling policy including a rule that causes a cluster “to be scaled down if the cluster is idle for more than one hour”, reading on Applicant’s claimed limitation, as a threshold is “breached” in regards to a corresponding predetermined threshold amount).

As for Claim 6, Einkauf teaches, further comprising:
“adjusting, by one or more processors, the concurrency threshold for one or more service classes based on a new number of worker nodes in the cluster after auto-scaling” (see pp. [0086-0088]; e.g., as discussed within rationale provided above, the reference of Einkauf teaches of auto-scaling policies including a set of “cluster-level limits”, considered equivalent to Applicant’s “concurrency threshold”, as they can be configured to provide at least, “an optional maximum instance count that constrains how many instances can be added by an auto-scaling operation. For example, in order to constrain the operation so that no more than twenty-five instances are included in the affected cluster or instance group following an auto-scaling operation to add capacity, the policy may set this limit to a value of "25"”, thus, adjusting one or more of a “cluster-level limit” while “following an auto-scaling operation” in regards to an affected cluster or instance group).

As for Claim 7, Einkauf teaches, further comprising:
“determining, by one or more processors, the query traffic based on queued queries waiting in queues for each service class” (see pp. [0125]; e.g., the cited paragraphs [0120-0121], [0125-0126] teach of waiting for prioritized tasks to be performed or tasks to be fully completed by one or more compute nodes that are currently performing tasks in order to instantiate one or more of an auto-scaling policy).
The reference of Einkauf does not appear to explicitly recite the amended language of, “wherein the defined upscaling threshold is breached when a ratio of a number of the queued queries waiting in the queues over the defined period of time for the service class compared to the concurrency threshold for the service class is greater than a defined ratio for a defined period of time”.
The reference of Richards recites the amended language of, “wherein the defined upscaling threshold is breached when a ratio of a number of the queued queries waiting in the queues over the defined period of time for the service class compared to the concurrency threshold for the service class is greater than a defined ratio for a defined period of time” (see col. 7, lines 9-67; col. 8, lines 1-31, 57-67; col. 9, lines 1-13, 26-38; e.g., Richards, at least within column 7, lines 9-67 through column 8, lines 1-31, 57-67 and column 9, lines 1-13 teach of providing an “internal monitoring and regulating component (regulator)” that dynamically monitors workload characteristics using workload rules or other heuristics based on past and current performance of the system, where detection of through-put based SLG misses undergo further qualification based on actual arrival rates of workloads (WDs).  Missed SLGs can be indications of “system overload” or “system under-demand” {i.e. Applicant’s “overall resource usage”}, and, if the arrival rate of a WD is “greater than” the “Throughput-SLG”, then the detection of an uninteresting and inconsequential situation is avoided, and the cause of the missed SLG is “system overload”, for example. A comparator component of the system extracts one or more requests from a queue of delayed requests once monitored workgroup performance is determined to reach an acceptable level).
The combined references of Einkauf and Richards are considered analogous art for being within the same field of endeavor, which is the automatic scaling of computing resource instances in a cluster-based distributed computing system having mechanisms for tracking the number of queries received for processing.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the classification of received queries and comparing of query activity with concurrency levels, as taught by Richards, with the method of Einkauf, in order to detect over/under performance of workload management techniques when comparing query performance metrics against Service Level Goals (SLGs) for workloads, and determining reasons for missed Service Level Goals conformance. (Richards; col. 1, lines 23-67 through col. 2, lines 1-13)

As for Claim 8, Einkauf teaches, further comprising:
“determining, by one or more processors, the query traffic based on active queries for each service class” (see pp. [0148]; e.g., the reference of Einkauf teaches of instance clients being configured to direct network traffic to one or more of a “compute instance”, where compute instances may attach or map to one or more data volumes provided by a “block-based storage service” for performing various operations.  Compute instances handle computational workloads for compute intensive applications); and
“wherein the defined downscaling threshold is breached when a ratio of a number of the active queries for a service class compared to the concurrency threshold for the service class is less than a defined ratio for the defined period of time” (see pp. [0121]; e.g., the cited paragraph [0121] provides an example embodiment where an instance group {i.e. “service class”} includes nodes associated with a policy specifying that when a CPU is not being used, the node should be removed {i.e. considered equivalent to Applicant’s “a defined period of time” where no active queries are received}.  The cited paragraph teaches of taking into consideration “the number of times a given block must be replicated across the cluster”, thus, determining the number of nodes needed for data block replication, and utilizing that number {i.e. “target replication factor”} when determining the shrinking of a cluster through node removal.  As stated within rationale provided above, paragraph [0104] expounds further on the instantiation of at least an auto-scaling policy including a rule that causes a cluster “to be scaled up if HDFS utilization exceeds 90% for more than 2 hours” or including a rule that causes a cluster “to be scaled down if the cluster is idle for more than one hour”, reading on Applicant’s claimed limitation, as a threshold is “breached” in regards to a corresponding predetermined threshold amount.  The metric of “90%”, in regards to HDFS utilization, is a representation of a ratio where two metrics are compared.  Previous text of paragraphs [0086-0088] teaches of auto-scaling policies including a set of “cluster-level limits”, considered equivalent to Applicant’s one or more of a “concurrency threshold”, as they can be configured to provide at least, “an optional maximum instance count that constrains how many instances can be added by an auto-scaling operation”).
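For illustration only, the ratio tests recited in Claims 7 and 8 can be sketched as follows.  This is a minimal sketch of the claim language as read by the examiner and is not drawn from either cited reference: the function names are hypothetical, and the reading that the ratio must hold for every sample taken within the defined period of time is an assumption.

```python
from typing import Sequence


def upscale_breached(queued_counts: Sequence[int],
                     concurrency_threshold: int,
                     defined_ratio: float) -> bool:
    """Claim 7 reading: queued / concurrency threshold > defined ratio
    for every sample in the defined period (assumed interpretation)."""
    if concurrency_threshold <= 0:
        raise ValueError("concurrency_threshold must be positive")
    return bool(queued_counts) and all(
        q / concurrency_threshold > defined_ratio for q in queued_counts)


def downscale_breached(active_counts: Sequence[int],
                       concurrency_threshold: int,
                       defined_ratio: float) -> bool:
    """Claim 8 reading: active / concurrency threshold < defined ratio
    for every sample in the defined period (assumed interpretation)."""
    if concurrency_threshold <= 0:
        raise ValueError("concurrency_threshold must be positive")
    return bool(active_counts) and all(
        a / concurrency_threshold < defined_ratio for a in active_counts)
```

Under this reading, both tests compare a per-service-class query count against that class's concurrency threshold, which is the comparison the rejection maps to the percentage-based rules and cluster-level limits of Einkauf.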

As for Claim 9, Einkauf teaches, further comprising:
“providing, by one or more processors, multiple node groups each comprising a subset of the available worker nodes in the cluster, wherein the node groups are configured for expected durations of queries” (see pp. [0064-0065]; e.g., the primary reference teaches that a MapReduce cluster may be configured to automatically scale up/down when triggered by at least a metric crossing a specified threshold for a specific time period, such as an auto-scaling action to reduce capacity of nodes {i.e. worker/slave/data nodes} based on “the number of mappers in the cluster is less than 2 for at least 60 minutes”.  The MapReduce cluster may also auto-scale up/down to increase/decrease capacity based on “an estimated time to complete all in-progress and pending jobs on the cluster”, considered equivalent to Applicant’s claimed limitation, where Applicant’s disclosure of “draining” involves the scaling down of nodes based on completion of estimated completion times); and
“mapping, by one or more processors, a service class of queries to a node group in order to assign queries of a service class to worker nodes of the mapped node group” (see pp. [0041], [0043-0044]; e.g., the primary reference teaches of utilizing at least a “MapReduce program”, including at least a “mapper process” that performs filtering and sorting, and allows a parallel application to be mapped to a set of computing nodes for processing.  A “master node” may control task distribution to “worker nodes”.  Paragraphs [0043-0044] teach that worker nodes perform “similar” tasks {i.e. partitioned input data} concurrently under the direction of the master node(s), and based on boundaries between individual records/individual lines of data, related items or families of items, thus, displaying affinity between the worker nodes and “instance groups” of nodes based on families of related data, for example).

As for Claim 10, Einkauf teaches, further comprising:
“auto-scaling, by one or more processors, to remove a number of worker nodes by selecting a number of worker nodes for draining before removal according to node groups with worker nodes being selected from a node group configured for a lowest possible expected duration of queries” (see pp. [0064-0065]; e.g., the primary reference teaches that a MapReduce cluster may be configured to automatically scale up/down when triggered by at least a metric crossing a specified threshold for a specific time period, such as an auto-scaling action to reduce capacity of nodes {i.e. worker/slave/data nodes} based on “the number of mappers in the cluster is less than 2 for at least 60 minutes”.  The MapReduce cluster may also auto-scale up/down to increase/decrease capacity based on “an estimated time to complete all in-progress and pending jobs on the cluster”, considered equivalent to Applicant’s claimed limitation, where Applicant’s disclosure of “draining” involves the scaling down of nodes based on completion of estimated completion times).

As for Claim 11, Einkauf teaches, “wherein the node groups are dynamic and adjusted according to auto-scaling of the worker nodes” (see pp. [0069-0080]; e.g., the reference of Einkauf teaches of automatic cluster scaling governed by one or more auto-scaling policies, where an auto-scaling policy may contain one or more rules for the adding/removal of nodes from clusters of specific instance groups based on one or more of a plurality of triggering conditions {i.e. “...[0077] the amount or percentage of capacity (e.g., the number or percentage of resource instances) to add to or remove from the cluster (or specific instance groups thereof)”}.  As stated previously within this communication, paragraph [0041] provides teachings of at least a “mapper process” which performs filtering and sorting, and a “reducer process”, helping to control the distribution of tasks by other computing nodes {i.e. “worker nodes”}). 

As for Claim 12, Einkauf teaches, A method for auto-scaling a query engine, the method comprising:
“auto-scaling, by one or more processors, worker nodes by adding or removing a number of worker nodes to worker nodes available in a cluster based on query traffic at a query engine” (see pp. [0030]; e.g., paragraph [0030] teaches of receiving metrics {i.e. “work pending for each job”, “number of pending jobs per container”, “available capacity and remaining capacity”} from one or more of at least “Hadoop framework”, “Hadoop Yarn”, or “HDFS”, which is utilized by at least an “auto-scaling rules engine”, along with other trigger types {i.e. time, day, date, cost}, to determine which nodes are available for addition/removal when increasing or decreasing capacity in a cluster based on at least behaviors and workloads {i.e. considered equivalent to Applicant’s “query traffic”}.  “Instance groups”, to which the one or more nodes belong {i.e. task nodes, “various nodes in a MapReduce cluster”}, are considered equivalent to Applicant’s one or more of a “service class”, and, as described within paragraph [0032], can have capacity increased within a MapReduce cluster with one or more available instances from various resource pools according to applicable management policies and/or service agreements.  Paragraph [0104] expounds further on the instantiation of at least an auto-scaling policy including a rule that causes a cluster “to be scaled up if HDFS utilization exceeds 90% for more than 2 hours”, reading on Applicant’s claimed limitation, as a threshold is “breached” in regards to a corresponding predetermined threshold amount.  The cited paragraph [0104] also expounds further on the instantiation of at least an auto-scaling policy including a rule that causes a cluster “to be scaled down if the cluster is idle for more than one hour”, therefore, changing the number of nodes within one or more clusters/“instance groups” over a period of time in regards to at least a received workload);
“providing, by one or more processors, multiple node groups, each comprising a subset of the available worker nodes in the cluster, wherein the node groups are configured for an expected duration of queries” (see pp. [0064-0065]; e.g., the primary reference teaches that a MapReduce cluster may be configured to automatically scale up/down when triggered by at least a metric crossing a specified threshold for a specific time period, such as an auto-scaling action to reduce capacity of nodes {i.e. worker/slave/data nodes} based on “the number of mappers in the cluster is less than 2 for at least 60 minutes”.  The MapReduce cluster may also auto-scale up/down to increase/decrease capacity based on “an estimated time to complete all in-progress and pending jobs on the cluster”, considered equivalent to Applicant’s claimed limitation, where Applicant’s disclosure of “draining” involves the scaling down of nodes based on completion of estimated completion times); and
“mapping, by one or more processors, each service class of queries to a node group according to an affinity between service classes and node groups” (see pp. [0041], [0043-0044]; e.g., the primary reference teaches of utilizing at least a “MapReduce program”, including at least a “mapper process” that performs filtering and sorting, and allows a parallel application to be mapped to a set of computing nodes for processing.  A “master node” may control task distribution to “worker nodes”.  Paragraphs [0043-0044] teach that worker nodes perform “similar” tasks {i.e. partitioned input data} concurrently under the direction of the master node(s), and based on boundaries between individual records/individual lines of data, related items or families of items, thus, displaying affinity between the worker nodes and “instance groups” of nodes based on families of related data, for example),
wherein auto-scaling by removing a number of worker nodes further comprises:
“selecting, by one or more processors, a number of worker nodes for draining before removal according to node groups with worker nodes being selected from a node group configured for a lowest possible expected duration of queries” (see pp. [0028]; e.g., the reference of Einkauf teaches of instantiating auto-scaling rules in regards to the one or more nodes of a cluster, where an example is provided for allowing the scaling-up of a cluster based on “utilization greater than 90% for more than 2 hours”, and scaling-down a cluster if it is “idle for more than 1 hour”, thus, taking “duration” in consideration of job performance of one or more nodes of a cluster using auto-scaling rules/policies.  Another example accounts for a user to “define an auto-scaling policy specifying that a cluster should automatically scale up or down on a certain day of the week (or date) and/or at a certain time of day, when a particular threshold for a default or custom metric is exceeded for a given period of time, when the estimated time to complete all pending jobs exceeds a specified service level agreement, or according to other auto-scaling rules”, continuing to read on Applicant’s claimed limitation.  Paragraph [0121] further reads on this teaching, as it is taught that nodes can be “decommissioned” after “waiting until the rebalancing of the data from each decommissioned node has been completed before each node is terminated”, thus, reading on Applicant’s “draining”).
The reference of Einkauf does not recite the limitation of, “classifying, by one or more processors, queries of the query traffic for a service class of a plurality of service classes based on respective levels of query complexity, where an overall resource usage for each service class of the plurality of service classes is equal”.
The reference of Richards teaches, “classifying, by one or more processors, queries of the query traffic for a service class of a plurality of service classes based on respective levels of query complexity, where an overall resource usage for each service class of the plurality of service classes is equal” (see col. 5, lines 26-53; col. 7, lines 9-67 through col. 8, lines 1-31, 57-67; col. 9, lines 1-13; e.g., the reference of Richards serves as an enhancement to the teachings of the Einkauf reference by providing mechanisms for tracking the number of queries received for processing for a workload to facilitate arrival rate qualifications to throughput service level goals (SLGs).  Column 5, lines 26-53 teaches of a system that is a goal-orientated workload management system capable of supporting complex workloads and self-adjusting to various types of workloads {i.e. workload queries}, with system operations having a plurality of phases, such as “...1) assigning a set of incoming request characteristics to workload groups, assigning the workload groups to priority classes, and assigning goals (called Service Level Goals or SLGs) to the workload groups”, where the assigning of workloads to “priority classes” is considered equivalent to Applicant’s classifying queries to “service classes”.  Richards, at least within column 7, lines 9-67 through column 8, lines 1-31, 57-67 and column 9, lines 1-13, teaches of providing an “internal monitoring and regulating component (regulator)” that dynamically monitors workload characteristics using workload rules or other heuristics based on past and current performance of the system, where detection of through-put based SLG misses undergoes further qualification based on actual arrival rates of workloads (WDs).  Missed SLGs can be indications of “system overload” or “system under-demand” {i.e. Applicant’s “overall resource usage”}, and, if the arrival rate of a WD is “less than or equal to” the “Throughput-SLG of the WD”, then the cause of the missed Throughput-SLG is under-demand, for example.  As discussed within column 13, lines 41-55, SLGs that have been “missed”, as well as additional system performance conditions, such as, “...1. Any sustained high or low usage of a resource, such as high CPU usage, high I/O usage, a higher than average arrival rate, or a high concurrency rate; 2. Any unusual resource depletion, such as running out of AWTs, problems with flow control, and unusually high memory usage...”, can be considered for the triggering of one or more actions).
The combined references of Einkauf and Richards are considered analogous art for being within the same field of endeavor, which is the automatic scaling of computing resource instances in a cluster-based distributed computing system having mechanisms for tracking the number of queries received for processing.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the classification of received queries and comparing of query activity with concurrency levels, as taught by Richards, with the method of Einkauf, in order to detect over/under performance of workload management techniques when comparing query performance metrics against Service Level Goals (SLGs) for workloads, and determining reasons for missed Service Level Goals conformance. (Richards; col. 1, lines 23-67 through col. 2, lines 1-13)
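For illustration only, the arrival-rate qualification of a missed throughput SLG discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Richards; the function name, parameter names, and return labels are hypothetical.

```python
def qualify_throughput_miss(arrival_rate: float,
                            throughput_slg: float,
                            actual_throughput: float) -> str:
    """Hypothetical helper: label a workload's throughput relative to its SLG."""
    # No miss to qualify when the throughput goal was reached.
    if actual_throughput >= throughput_slg:
        return "met"
    # Missed SLG with arrivals at or below the goal: not enough work arrived.
    if arrival_rate <= throughput_slg:
        return "under-demand"
    # Missed SLG with arrivals above the goal: the system could not keep up.
    return "system overload"
```

Under these assumptions, a workload that received fewer queries than its goal is labelled under-demand rather than being treated as a performance problem.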

As for Claim 13, Einkauf teaches, wherein selecting a number of worker nodes for draining further comprises: “ordering, by one or more processors, the node groups with node groups for lower expected durations of queries being drained first” (see pp. [0028]; e.g., the reference of Einkauf teaches of instantiating auto-scaling rules in regards to the one or more nodes of a cluster, where an example is provided for allowing the scaling-up of a cluster based on “utilization greater than 90% for more than 2 hours”, and scaling-down a cluster if it is “idle for more than 1 hour”, thus, taking “duration” in consideration of job performance of one or more nodes of a cluster using auto-scaling rules/policies.  Another example accounts for a user to “define an auto-scaling policy specifying that a cluster should automatically scale up or down on a certain day of the week (or date) and/or at a certain time of day, when a particular threshold for a default or custom metric is exceeded for a given period of time, when the estimated time to complete all pending jobs exceeds a specified service level agreement, or according to other auto-scaling rules”, continuing to read on Applicant’s claimed limitation.  Paragraph [0121] further reads on this teaching, as it is taught that nodes can be “decommissioned” after “waiting until the rebalancing of the data from each decommissioned node has been completed before each node is terminated”, thus, reading on Applicant’s “draining”, as jobs are completed before removal).
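For illustration only, the claimed ordering of node groups for draining (lower expected query durations drained first) can be sketched as follows. This is a minimal sketch of the claim language as quoted above, not an implementation taken from Einkauf or from Applicant's disclosure; the `NodeGroup` type, its fields, and `select_for_draining` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NodeGroup:
    name: str                         # hypothetical group label
    expected_query_duration_s: float  # expected duration of queries on the group
    worker_count: int                 # worker nodes currently in the group


def select_for_draining(groups: List[NodeGroup],
                        nodes_to_remove: int) -> List[Tuple[str, int]]:
    """Return (group name, node count) pairs, shortest expected duration first."""
    plan = []
    remaining = nodes_to_remove
    # Groups serving shorter queries finish their work sooner, so drain them first.
    for group in sorted(groups, key=lambda g: g.expected_query_duration_s):
        if remaining <= 0:
            break
        take = min(group.worker_count, remaining)
        plan.append((group.name, take))
        remaining -= take
    return plan
```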
   
Claim 14 amounts to a method dependent upon Independent Claim 12 comprising instructions that, when executed by one or more processors, perform the method of Claim 11, which is dependent upon Independent Claim 1.  Accordingly, Claim 14 is rejected for substantially the same reasons as presented above for Claim 11, and based on the references’ disclosure of the necessary supporting hardware and software (Einkauf; see Fig. 11; see pp. [0138-0143]; e.g., method for implementation integrating hardware and software components).

As for Claim 15, Einkauf teaches, further comprising:
comparing, by one or more processors, query traffic in the form of active queries for each service class with a concurrency threshold of a maximum number of queries of the service class to be concurrently processed (see pp. [0023]; e.g., the Einkauf reference teaches of programmatically scaling a cluster {i.e. “instance group having one or more nodes” considered equivalent to Applicant’s “service class”} up or down based on a workload {i.e. considered equivalent to Applicant’s “query traffic”}, allowing the system to determine when and if to scale up based on actual demand.  According to paragraph [0029], some systems may include a default auto-scaling rule specifying that if HDFS utilization exceeds a “default maximum utilization threshold” for more than a default number of hours, the system will automatically add HDFS capacity to the cluster, helping customers “ensure that they always have the right amount of capacity in their clusters”. An auto-scaling policy can be defined to specify that capacity should be maintained at a particular utilization level or capacity should be increased as much as possible {i.e. considered equivalent to Applicant’s “a maximum number of queries of the service class allowed to be concurrently processed”} while keeping the cost per hour below a pre-determined maximum cost, allowing instance cost to be evaluated, and capacity to be added/removed to a cluster after each evaluation. As discussed within paragraph [0029], a new cluster may be brought up to replace a failed cluster, and the new cluster may be automatically scaled up over time to accommodate a growing workload. The text of paragraphs [0086-0088] teaches of auto-scaling policies including a set of “cluster-level limits”, considered equivalent to Applicant’s one or more of a “concurrency threshold”, as they can be configured to provide at least, “an optional maximum instance count that constrains how many instances can be added by an auto-scaling operation”); and 
auto-scaling, by one or more processors, by removing a number of worker nodes from the worker nodes available in the cluster based on the comparison breaching a defined downscaling threshold with the breaching maintained for a defined period of time (see pp. [0030]; e.g., paragraph [0030] teaches of receiving metrics {i.e. “work pending for each job”, “number of pending jobs per container”, “available capacity and remaining capacity”} from one or more of at least “Hadoop framework”, “Hadoop Yarn”, or “HDFS”, which is utilized by at least an “auto-scaling rules engine”, along with other trigger types {i.e. time, day, date, cost}, to determine which nodes are available for addition/removal when increasing or decreasing capacity in a cluster based on at least behaviors and workloads {i.e. considered equivalent to Applicant’s “query traffic”}.  “Instance groups”, to which the one or more nodes belong {i.e. task nodes, “various nodes in a MapReduce cluster”}, are considered equivalent to Applicant’s one or more of a “service class”, and, as described within paragraph [0032], can have capacity increased within a MapReduce cluster with one or more available instances from various resource pools according to applicable management policies and/or service agreements.  Paragraph [0104] expounds further on the instantiation of at least an auto-scaling policy including a rule that causes a cluster “to be scaled down if the cluster is idle for more than one hour”, reading on Applicant’s claimed limitation, as a threshold is “breached” in regards to a corresponding predetermined threshold amount). 

As for Claim 16, Einkauf teaches, wherein the defined downscaling threshold is breached when a ratio of a number of active queries for a service class compared to the concurrency threshold for the service class is less than a defined ratio for a defined period of time (see pp. [0121]; e.g., the cited paragraph [0121] provides an example embodiment where an instance group {i.e. “service class”} includes nodes associated with a policy specifying that when a CPU is not being used, the node should be removed {i.e. considered equivalent to Applicant’s “a defined period of time” where no active queries are received}.  The cited paragraph teaches of taking into consideration “the number of times a given block must be replicated across the cluster”, thus, determining the number of nodes needed for data block replication, and utilizing that number {i.e. “target replication factor”} when determining the shrinking of a cluster through node removal).
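For illustration only, the downscaling condition recited in Claim 16 (a ratio of active queries to the concurrency threshold remaining below a defined ratio for a defined period of time) can be sketched as follows. This is a minimal sketch of one reading of the claim language and is not taken from Einkauf; the function name, the sample format, and the choice to measure the period from the start of the most recent uninterrupted run of low-ratio samples are assumptions.

```python
from typing import List, Tuple


def downscale_breached(samples: List[Tuple[float, int]],
                       concurrency_threshold: int,
                       defined_ratio: float,
                       defined_period_s: float) -> bool:
    """samples: (timestamp in seconds, active query count), oldest first."""
    if not samples or concurrency_threshold <= 0:
        return False
    breach_start = None
    for timestamp, active_queries in samples:
        if active_queries / concurrency_threshold < defined_ratio:
            # Remember when the current run of low-ratio samples began.
            if breach_start is None:
                breach_start = timestamp
        else:
            # Any sample at or above the ratio interrupts the breach.
            breach_start = None
    latest_timestamp = samples[-1][0]
    return (breach_start is not None
            and latest_timestamp - breach_start >= defined_period_s)
```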

Claims 17-20 amount to system claims comprising instructions that, when executed by one or more processors, perform the method of Claims 1-3 & 6, respectively.  Accordingly, Claims 17-20 are rejected for substantially the same reasons as presented above for Claims 1-3 & 6, and based on the references’ disclosure of the necessary supporting hardware and software (Einkauf; see Fig. 11; see pp. [0138-0143]; e.g., method for implementation integrating hardware and software components).


Claims 21, 23 & 24 amount to system claims comprising instructions that, when executed by one or more processors, perform the method of Claims 12-15, respectively.  Accordingly, Claims 21, 23 & 24 are rejected for substantially the same reasons as presented above for Claims 12-15, and based on the references’ disclosure of the necessary supporting hardware and software (Einkauf; see Fig. 11; see pp. [0138-0143]; e.g., method for implementation integrating hardware and software components).

Claim 25 amounts to a computer program product claim comprising instructions that, when executed by one or more processors, perform the method of Claim 1.  Accordingly, Claim 25 is rejected for substantially the same reasons as presented above for Claim 1, and based on the references’ disclosure of the necessary supporting hardware and software (Einkauf; see Fig. 11; see pp. [0138-0143]; e.g., method for implementation integrating hardware and software components).

As for Claim 26, the Einkauf reference teaches of programmatically scaling a cluster {i.e. “instance group having one or more nodes” considered equivalent to Applicant’s one or more of a “service class”} up or down based on one or more of an assigned workload {i.e. considered equivalent to Applicant’s “query traffic”}, allowing the system to determine when and if to scale up based on actual demand rather than utilizing a blind estimate.
Einkauf does not appear to explicitly recite the limitations of, “program instructions to determine that a ratio of queued queries compared to the concurrency threshold for the service class is greater than a threshold ratio for a queued time greater than a defined response time”; and “program instructions to evaluate a number of nodes for addition to the service class”.
Richards teaches, wherein comparing query traffic for each service class with a concurrency threshold of a maximum number of queries of the service class allowed to be concurrently processed, further comprises: 
“program instructions to determine that a ratio of queued queries compared to the concurrency threshold for the service class is greater than a threshold ratio for a queued time greater than a defined response time” (see col. 7, lines 9-67; col. 8, lines 1-31, 57-67; col. 9, lines 1-13, 26-38; e.g., Richards, at least within column 7, lines 9-67 through column 8, lines 1-31, 57-67 and column 9, lines 1-13 teach of providing an “internal monitoring and regulating component (regulator)” that dynamically monitors workload characteristics using workload rules or other heuristics based on past and current performance of the system, where detection of through-put based SLG misses undergo further qualification based on actual arrival rates of workloads (WDs).  Missed SLGs can be indications of “system overload” or “system under-demand” {i.e. Applicant’s “overall resource usage”}, and, if the arrival rate of a WD is “greater than” the “Throughput-SLG”, then the detection of an uninteresting and inconsequential situation is avoided, and the cause of the missed SLG is “system overload”, for example. A comparator component of the system extracts one or more requests from a queue of delayed requests once monitored workgroup performance is determined to reach an acceptable level); and 
“program instructions to evaluate a number of nodes for addition to the service class” (see col. 9, lines 36-67; col. 10, lines 1-6; e.g., the reference of Richards teaches of providing an “exception monitoring process” that receives throughput from one or more of a plurality of “AMP Worker Task (AWTs)” that await the retrieval of one or more requests dispatched to “priority class buckets”.  AWTs support the parallel performance architecture, can be pre-allocated and assigned to each “Access Module Processor (AMP)” to work on a queue system, and are of a finite number with a limited number available to perform new work on the system, as orchestrated by internal workflow management).
The combined references of Einkauf and Richards are considered analogous art for being within the same field of endeavor, which is the automatic scaling of computing resource instances in a cluster-based distributed computing system having mechanisms for tracking the number of queries received for processing.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the classification of received queries and comparing of query activity with concurrency levels, as taught by Richards, with the method of Einkauf, in order to detect over/under performance of workload management techniques when comparing query performance metrics against Service Level Goals (SLGs) for workloads, and determining reasons for missed Service Level Goals conformance. (Richards; col. 1, lines 23-67 through col. 2, lines 1-13)
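For illustration only, the upscaling determination recited in Claim 26 can be sketched as follows. This is a minimal sketch of one possible reading of the quoted limitation, in which both the queued-query ratio and the queued time must exceed their limits; it is not code from Richards or Einkauf, and the function and parameter names are hypothetical.

```python
def upscale_needed(queued_queries: int,
                   concurrency_threshold: int,
                   threshold_ratio: float,
                   longest_queued_time_s: float,
                   defined_response_time_s: float) -> bool:
    """Hypothetical check that a service class should be evaluated for added nodes."""
    if concurrency_threshold <= 0:
        raise ValueError("concurrency_threshold must be positive")
    # Ratio of queued queries to the maximum number allowed to run concurrently.
    queued_ratio = queued_queries / concurrency_threshold
    # Both conditions must hold under the reading assumed here.
    return (queued_ratio > threshold_ratio
            and longest_queued_time_s > defined_response_time_s)
```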

As for Claim 27, the Einkauf reference teaches of programmatically scaling a cluster {i.e. “instance group having one or more nodes” considered equivalent to Applicant’s one or more of a “service class”} up or down based on one or more of an assigned workload {i.e. considered equivalent to Applicant’s “query traffic”}, allowing the system to determine when and if to scale up based on actual demand rather than utilizing a blind estimate.
Einkauf does not appear to explicitly recite the limitations of, “setting, by one or more processors, for each service class, the concurrency threshold of the maximum number of queries allowed to be concurrently processed by each service class”; “ wherein the setting includes setting a first concurrency threshold for at least a first service class, a second concurrency threshold for a second service class, and a third concurrency threshold for a third service class, wherein the first concurrency threshold, the second concurrency threshold, and the third concurrency threshold set the maximum number of queries for each service class with the overall resource usage that is equal for the first service claims, the second service class, and the third service class”.
Richards teaches, wherein the concurrency threshold of the maximum number of queries allowed to be concurrently processed by each service class, further comprises: 
“setting, by one or more processors, for each service class, the concurrency threshold of the maximum number of queries allowed to be concurrently processed by each service class” (see col. 7, lines 19-33; e.g., As stated within the cited portion of Richards, “...A number of queries counter associated with a particular workload is incremented each time a query assigned to the particular workload is received thereby tracking the arrival rate of workload queries. The number of queries counter may be periodically reset, e.g., upon an event detection or the expiration of an event interval. Concurrency levels, i.e., the number of concurrent executing queries from each respective workload group, are monitored, and if current workload group concurrency levels are above an administrator-defined threshold, a request in that workload group waits in a queue prior to execution until the concurrency level subsides below the defined threshold”, thus, providing for the “setting...the concurrency threshold...” as “current workload group concurrency levels” adhere to an “administrator-defined threshold”); 
“ wherein the setting includes setting a first concurrency threshold for at least a first service class, a second concurrency threshold for a second service class, and a third concurrency threshold for a third service class, wherein the first concurrency threshold, the second concurrency threshold, and the third concurrency threshold set the maximum number of queries for each service class with the overall resource usage that is equal for the first service claims, the second service class, and the third service class” (see col. 16, lines 29-39; e.g., As stated within rationale provided within this communication above, system operations have a plurality of phases, such as “...1) assigning a set of incoming request characteristics to workload groups, assigning the workload groups to priority classes, and assigning goals (called Service Level Goals or SLGs) to the workload groups”, where the assigning of workloads to “priority classes” is considered equivalent to Applicant’s classifying queries to “service classes”.  Richards, at least within column 7, lines 9-67 through column 8, lines 1-31, 57-67 and column 9, lines 1-13 teach of providing an “internal monitoring and regulating component (regulator)” that dynamically monitors workload characteristics using workload rules or other heuristics based on past and current performance of the system, where detection of through-put based SLG misses undergo further qualification based on actual arrival rates of workloads (WDs).  Missed SLGs can be indications of “system overload” or “system under-demand” {i.e. Applicant’s “overall resource usage”}, and, if the arrival rate of a WD is “less than or equal to” the “Throughput-SLG of the WD”, then the cause of the missed Throughput-SLG is under-demand.  
According to at least column 16, lines 29-39, concurrency levels can be increased or decreased to a defined normal level based on at least exceeding anticipated arrival rates of workloads compared to the arrival rates defined by one or more of an SLG). 
The combined references of Einkauf and Richards are considered analogous art for being within the same field of endeavor, which is the automatic scaling of computing resource instances in a cluster-based distributed computing system having mechanisms for tracking the number of queries received for processing.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the classification of received queries and comparing of query activity with concurrency levels, as taught by Richards, with the method of Einkauf, in order to detect over/under performance of workload management techniques when comparing query performance metrics against Service Level Goals (SLGs) for workloads, and determining reasons for missed Service Level Goals conformance. (Richards; col. 1, lines 23-67 through col. 2, lines 1-13)
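For illustration only, the concurrency throttle described in the portion of Richards quoted above for Claim 27 (a per-workload query counter, with requests waiting in a queue while the workload group is at its administrator-defined concurrency threshold) can be sketched as follows. This is a minimal sketch and not code from Richards; the class and method names are hypothetical, and admitting a query only while the running count is strictly below the threshold is an assumed boundary condition.

```python
from collections import deque
from typing import Optional


class WorkloadGroupThrottle:
    """Hypothetical per-workload-group concurrency throttle."""

    def __init__(self, concurrency_threshold: int) -> None:
        self.concurrency_threshold = concurrency_threshold  # administrator-defined
        self.running = 0            # queries currently executing
        self.delay_queue = deque()  # queries waiting for a free slot
        self.query_count = 0        # arrivals since the last reset

    def submit(self, query_id: str) -> bool:
        """Count the arrival; return True if the query starts immediately."""
        self.query_count += 1
        if self.running < self.concurrency_threshold:
            self.running += 1
            return True
        self.delay_queue.append(query_id)
        return False

    def finish(self) -> Optional[str]:
        """Mark one query complete; return the released queued query, if any."""
        if self.running == 0:
            return None
        self.running -= 1
        if self.delay_queue:
            self.running += 1
            return self.delay_queue.popleft()
        return None

    def reset_counter(self) -> int:
        """Return and clear the arrival count, e.g. at the end of an interval."""
        count, self.query_count = self.query_count, 0
        return count
```

A separate threshold per service class would, under these assumptions, correspond to one throttle instance per class.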


Response to Arguments
Applicant's arguments and amendments, with respect to Einkauf and Chan’s alleged failure to teach the subject matter of at least Claims 1, 12, 17 and 21, have been fully considered, and are persuasive in part, as the Einkauf reference has been maintained, and the Chan reference has been withdrawn from consideration due to Applicant’s persuasive argument.  Updated rationale has been provided within this communication above.  
Upon further consideration and in direct response to Applicant’s numerous claim amendments, a new ground(s) of rejection for Claims 1-3, 5-21 & 23-27 is made in view of Richards et al (US Patent No. 8,510,273B2).

Conclusion
The prior art made of record, but not relied upon, is considered pertinent to Applicant’s disclosure.
**Vutharkar et al (US Patent No. 10,924,398B2) teaches time-series data monitoring with a sharded server.
**Zhou et al (USPG Pub No. 20210011916A1) teaches a data query method, apparatus and device.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAHEEM HOFFLER whose telephone number is (571)270-1036. The examiner can normally be reached Monday-Friday, 10:00am-2:00pm and 6:00pm-10:00pm, with flex.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at (571)272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TAMARA T KYLE/Supervisory Patent Examiner, Art Unit 2156
/RAHEEM HOFFLER/
Examiner
Art Unit 2156

								7/9/2022