DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
This office correspondence is in response to the application number 17/400208 filed on August 12, 2021.  Claims 1 – 20 are pending.
Priority
This application is a continuation claiming the benefit of prior filed application No. 16/587906 (now U.S. Patent 11,153,375) filed on September 30, 2019 and which was co-pending with the instant application.  The instant application is entitled to the priority date of September 30, 2019.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 08/12/2021 was filed concurrently with the mailing date of the application on 08/12/2021.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Double Patenting
The non-statutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A non-statutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on non-statutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA  as explained in MPEP § 2159.  See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA /25, or PTO/AIA /26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1 - 20 are rejected on the ground of non-provisional non-statutory anticipatory-type double patenting as being unpatentable over claims 1 – 16 of U.S. Patent 11,153,375.  Although some of the conflicting claims are not identical, they are not patentably distinct from each other because both sets of claims are directed to the same invention.  This is a non-provisional non-statutory anticipatory-type double patenting rejection since the claims directed to the same invention have been patented.
In regard to claim 1:
Application 17/400208
U.S. Patent 11,153,375
1. A method of managing a cloud scaling system, the method comprising:
11. A non-transitory computer-readable storage medium storing non transitory computer-executable program instructions, wherein when executed by a processing device, the computer-executable program instructions cause the processing device to perform operations comprising:
determining, for a cloud computing system, a compute scaling adjustment that indicates an adjustment to a number of compute instances of the cloud computing system by applying a machine learning model to a compute capacity of the cloud computing system and usage metrics
determining, for a cloud computing system having a number of compute instances, a compute scaling adjustment by applying a machine learning model to
wherein: 
the compute capacity indicates a number of allocated compute instances, 
the usage metrics indicate pending task requests in a queue of the cloud computing system, and
(a) a compute capacity indicating a number of allocated compute instances of the cloud computing system and 
(b) usage metrics indicating pending any processing requests in a queue of the cloud computing system, wherein the compute scaling adjustment indicates an adjustment to the number of compute instances wherein the machine learning model uses reinforcement learning;
the machine learning model is trained using reinforcement learning and a reward function that is a function of a number of requests in the queue and the number of allocated compute instances and comprises one or more of an overage of the number of compute instances relative to a maximum number of compute instances, a number of pending processing requests in the queue, or a weighted sum of the number of compute instances relative to a current load, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue; and
computing a reward value by evaluating a reward function that is a function of a number of requests in the queue and a number of allocated compute instances and comprises one or more of; 
(a) an overage of the number of compute instances relative to a maximum number of compute instances, 
(b) a number of pending processing requests in the queue, or 
(c) a weighted sum of the number of compute instances relative to a current load, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue; and 
adjusting an internal parameter of the machine learning model based on the reward value; and responsive to determining that the reward value is above a threshold for the compute scaling adjustment,
providing the compute scaling adjustment to the cloud computing system, wherein the cloud computing system adjusts the number of allocated compute instances.
providing the compute scaling adjustment to the cloud computing system, wherein the cloud computing system allocates or deallocates one or more compute instances.

It is clear that all of the elements of the instant application 17/400208 (herein ‘208) claim 1 are to be found in U.S. Patent 11,153,375 (herein ‘375) claim 11 (as the instant application ‘208 claim 1 fully encompasses Patent ‘375 claim 11).  The difference between ‘208 claim 1 and ‘375 claim 11 lies in the fact that the ‘375 claim includes many more elements and is thus much more specific.  Thus the invention of claim 11 of the ‘375 patent is in effect a “species” of the “generic” invention of ‘208 claim 1.  It has been held that the generic invention is “anticipated” by the “species”.  See In re Goodman, 29 USPQ2d 2010 (Fed. Cir. 1993).  Since the ‘208 claim 1 is anticipated by claim 11 of ‘375, it is not patentably distinct from ‘375 claim 11.
In regard to claim 2, see claim 2 of ‘375.
In regard to claim 3, see claim 13 of ‘375.
In regard to claim 4, see claim 4 of ‘375.
In regard to claim 5, see claim 1 of ‘375.
In regard to claim 6, see claim 1 of ‘375.
In regard to claim 7:
Application 17/400208
U.S. Patent 11,153,375
7. A cloud computing system comprising: 
a processor; and 
non-transitory computer-readable storage medium storing computer-executable program instructions, wherein when executed by the processor, the computer-executable program instructions cause the processor to perform operations comprising
1. A cloud scaling system comprising:
 one or more processing devices; and 
a non-transitory computer-readable medium communicatively coupled to the one or more processing devices, wherein the one or more processing devices are configured to execute instructions and thereby perform operations comprising:
determining, for a cloud computing system, a compute scaling adjustment that indicates an adjustment to a number of compute instances of the cloud computing system by applying a machine learning model to a compute capacity of the cloud computing system and usage metrics, wherein:
determining, for the cloud computing system, a compute scaling adjustment by applying a machine learning model to (a) the compute capacity of the cloud computing system and 
(b) the usage metrics, wherein: the compute scaling adjustment indicates an adjustment to a number of compute instances of the cloud computing system; and
the compute capacity indicates a number of allocated compute instances and, 
the usage metrics indicate pending task requests in a queue of the cloud computing system, and
accessing, from a cloud computing system, 
(a) a compute capacity indicating a number of allocated compute instances of the cloud computing system and 
(b) usage metrics indicating pending task requests in a queue of the cloud computing system;
the machine learning model is trained using reinforcement learning and a reward function that is a function of a number of requests in the queue and the number of allocated compute instances and comprises one or more of an overage of the number of compute instances relative to a maximum number of compute instances, a number of pending processing requests in the queue, or a weighted sum of the number of compute instances relative to a current load, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue; and
the machine learning model is trained using reinforcement learning and a reward function that is a function of a number of requests in the queue and the number of allocated compute instances and comprises one or more of 
(a) an overage of the number of compute instances relative to a maximum number of compute instances, 
(b) a number of pending processing requests in the queue, or 
(c) a weighted sum of the number of compute instances relative to a current load, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue; 
computing a reward value by evaluating the reward function;
providing the compute scaling adjustment to the cloud computing system, wherein the cloud computing system adjusts the number of allocated compute instances.
providing the reward value to the machine learning model, wherein the machine learning model adjusts one or more internal parameters to maximize a cumulative reward; and 
responsive to determining that the cumulative reward is above a threshold, providing the compute scaling adjustment to the cloud computing system, wherein the cloud computing system adjusts the number of allocated compute instances.

It is clear that all of the elements of the instant application 17/400208 (herein ‘208) claim 7 are to be found in U.S. Patent 11,153,375 (herein ‘375) claim 1 (as the instant application ‘208 claim 7 fully encompasses Patent ‘375 claim 1).  The difference between ‘208 claim 7 and ‘375 claim 1 lies in the fact that the ‘375 claim includes many more elements and is thus much more specific.  Thus the invention of claim 1 of the ‘375 patent is in effect a “species” of the “generic” invention of ‘208 claim 7.  It has been held that the generic invention is “anticipated” by the “species”.  See In re Goodman, 29 USPQ2d 2010 (Fed. Cir. 1993).  Since the ‘208 claim 7 is anticipated by claim 1 of ‘375, it is not patentably distinct from ‘375 claim 1.
In regard to claim 8, see claim 5 of ‘375.
In regard to claim 9, see claim 3 of ‘375.
In regard to claim 10, see claim 13 of ‘375.
In regard to claim 11, see claim 1 of ‘375.
In regard to claim 12, see claims 1, 9, and 10 of ‘375.
In regard to claim 13:
Application 17/400208
U.S. Patent 11,153,375
13. A method of facilitating learning of a machine learning model, the method comprising:
5. A method of facilitating learning of a machine learning model, the method comprising:
accessing historical data comprising, for a point in time: 
usage metrics indicating pending processing requests in a queue and a current utilization of available compute instances;
accessing historical data comprising, for a point in time: (a) a compute capacity; and (b) usage metrics indicating pending processing requests in a queue and a current utilization of available compute instances;
determining a compute scaling adjustment by applying a machine learning model to the compute capacity and to the usage metrics, the compute scaling adjustment indicating an adjustment to a number of compute instances;
determining a compute scaling adjustment for a cloud computing model by applying a machine learning model to (a) the compute capacity indicating a number of allocated compute instances and (b) the usage metrics, the compute scaling adjustment indicating an adjustment to a number of compute instances;
computing a reward value by: 
calculating a negative penalty based on the compute scaling adjustment; 
normalizing and negating a number of tasks in the queue, 
weighing the number of compute instances by a current load, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue, and 
calculating the reward value from the negative penalty, the normalized and negated number of tasks, and the weighted number of compute instances; and
modifying the number of compute instances of the cloud computing model according to the compute scaling adjustment; computing a reward value as a function of (a) an overage of the modified number of compute instances relative to a maximum number of compute instances, (b) a number of pending processing requests in the queue, or (c) a weighted sum of the modified number of compute instances relative to a load, wherein the load is a proportion of a current number of compute instances that is used by tasks in the queue;
providing the reward value to the machine learning model, wherein the machine learning model adjusts one or more internal parameters to maximize a cumulative reward.
providing the reward value to the machine learning model, wherein the machine learning model adjusts one or more internal parameters to maximize a cumulative reward; and

responsive to determining that the cumulative reward is above a threshold, providing the machine learning model to a cloud compute scaling system, wherein the cloud compute scaling system causes the cloud computing system to use the machine learning model to determine an additional compute scaling adjustment and applies the additional compute scaling adjustment to adjust the number of one or more compute instances on a cloud computing system.

It is clear that all of the elements of the instant application 17/400208 (herein ‘208) claim 13 are to be found in U.S. Patent 11,153,375 (herein ‘375) claim 5 (as the instant application ‘208 claim 13 fully encompasses Patent ‘375 claim 5).  The difference between ‘208 claim 13 and ‘375 claim 5 lies in the fact that the ‘375 claim includes many more elements and is thus much more specific.  Thus the invention of claim 5 of the ‘375 patent is in effect a “species” of the “generic” invention of ‘208 claim 13.  It has been held that the generic invention is “anticipated” by the “species”.  See In re Goodman, 29 USPQ2d 2010 (Fed. Cir. 1993).  Since the ‘208 claim 13 is anticipated by claim 5 of ‘375, it is not patentably distinct from ‘375 claim 5.
In regard to claim 14, see claim 14 of ‘375.
In regard to claim 15, see claim 6 of ‘375.
In regard to claim 16, see claim 8 of ‘375.
In regard to claim 17, see claim 7 of ‘375.
In regard to claim 18, see claim 12 of ‘375.
In regard to claim 19, see claim 15 of ‘375.
In regard to claim 20, see claim 16 of ‘375.
35 USC § 101 Analysis
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title. 

The claimed invention is directed to statutory subject matter.  The claims are directed to non-abstract improvements in computer-related technology.  A claim is non-statutory when it is directed to a judicial exception (e.g., a mathematical concept, a mental process, or a certain method of organizing human activity) without significantly more.  The claimed invention is not directed to a judicial exception.  Instead, the claimed invention is directed to a technological improvement for adjusting a compute capacity of a cloud computing system by applying a machine learning model to the compute capacity indicating a number of allocated compute instances of the cloud computing system and usage metrics indicating pending task requests in a queue of the cloud computing system, to determine a compute scaling adjustment which is an adjustment to a number of compute instances of the cloud computing system, wherein the machine learning model is trained using reinforcement learning and a reward function that is a function of a number of requests in the queue and a number of allocated compute instances.  Further, the reward function is evaluated to compute a reward value, which is provided to the machine learning model; the model then adjusts one or more internal parameters to produce a cumulative or second reward.  If that reward is above a threshold, the resulting compute scaling adjustment is provided to the cloud computing system to adjust the number of allocated compute instances.  The ordered combination of the elements and limitations binds the claimed invention to a specific and useful improvement for scaling the number of compute instances in real time.
Allowable Subject Matter
Claims 6 and 12 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claims 13 – 20 are allowable over the prior art but have outstanding double patenting rejections as was described in the previous sections.  The applicant is required to file a terminal disclaimer that references U.S. Patent 11,153,375 to overcome the double patenting rejections.
The following limitations have been found to be allowable over the prior art as recited in an ordered combination:
calculating a negative penalty based on the compute scaling adjustment; 
normalizing and negating a number of tasks in the queue, 
weighing the number of compute instances by a current load, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue, and 
calculating the reward value from the negative penalty, the normalized and negated number of tasks, and the weighted number of compute instances;  
These limitations as an ordered combination identify distinctive elements of the claimed invention.  When considered as a whole, these limitations in combination with the other limitations of the independent claims overcome the prior art of record.
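For illustration only, the four recited reward-computation steps can be sketched as follows.  This is a minimal sketch under stated assumptions and is not the applicant's implementation: the function name reward_value, the parameter names, the penalty magnitude, the queue normalization constant, and the choice to sum the three terms are all hypothetical and do not appear in the claims.

```python
def reward_value(adjustment, num_instances, max_instances,
                 queued_tasks, busy_instances,
                 max_queue=100, penalty=-1.0):
    """Hypothetical reward following the four recited steps.

    All constants and the additive combination are assumptions made
    for illustration; the claims do not specify them.
    """
    adjusted = num_instances + adjustment

    # Step 1: negative penalty based on the compute scaling adjustment
    # (assumed here to apply when the adjusted count leaves [1, max]).
    overage_penalty = penalty if (adjusted > max_instances or adjusted < 1) else 0.0

    # Step 2: normalize the number of tasks in the queue to [0, 1], then negate.
    queue_term = -min(queued_tasks, max_queue) / max_queue

    # Step 3: weigh the number of instances by the current load, where the
    # load is the proportion of current instances used by queued tasks.
    load = busy_instances / num_instances if num_instances else 0.0
    instance_term = load * (num_instances / max_instances)

    # Step 4: combine the three terms (a simple sum is assumed).
    return overage_penalty + queue_term + instance_term


if __name__ == "__main__":
    # Four instances, two busy, fifty queued tasks, scale up by one.
    print(reward_value(adjustment=1, num_instances=4, max_instances=8,
                       queued_tasks=50, busy_instances=2))
```

Under these assumptions, a longer queue or an adjustment past the maximum lowers the reward, while well-utilized instances raise it.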

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1 – 5, and 7 – 11 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (U.S. 10,402,733 B1; herein referred to as Li) in view of Tesauro et al. (U.S. 2007/0203871 A1; herein referred to as Tesauro) in further view of Saillet et al. (U.S. 2020/0379803 A1; herein referred to as Saillet).
In regard to claim 1, Li teaches a method of managing a cloud scaling system, the method comprising (Col 1: Lines 40-49 “ . . . a method comprises the following steps. Workload data associated with past execution of an application by a computing system is obtained. Two or more prediction models are trained using the obtained past workload data. A weight is assigned to each of the two or more trained prediction models. The two or more weighted prediction models are combined to form an ensemble prediction model configured to predict, in real-time, workload associated with future execution of the application by the computing system . . . “): 
determining, for a cloud computing system(see Col 3: Lines 28 – 40: “ . . . As shown in system 100, input (workload data) 102 is provided to a plurality of independent machine learning models 104-1, 104-2, 104-3, . . . , 104-i. In this example, the models comprise a linear regression model (104-1), a decision tree model (104-2), a hidden Markov model (104-3), and one or more additional machine learning models 104-i. Other statistical classifiers can be used. The outputs of the independent models are respectively weighted (w.sub.1, w.sub.2, w.sub.3. . . w.sub.i), and then linearly combined in linear combiner 106. The output 110 of the ensemble model (denoted as 108 comprising 104-1, 104-2, 104-3. . . 104-i, and 106) is a weighted linear combination of the outputs of the individual models . . .”), a compute scaling adjustment (see Fig. 2, Col 2: Line 5: “. . .  FIG. 2 illustrates a real-time workload prediction system . . .”) that indicates an adjustment to a number of compute instances  (see Col 6: Lines 41-48: “. . . the output of model 210 is the number of expected instances. Module 212 generates decisions on resource adjustment based on the output from model 210, the current instance number, and information on resource consumption. If module 212 generates a scaling up decision, for example, launching a new instance running the web service on a certain host, the compute node 214 would operate to start a new instance . . .”) of the cloud computing system(e.g. prediction model) (Col 4: Lines 41 – 45 “. . . an adaptive ensemble prediction model based on machine learning algorithms is provided to predict the workload of the tenant's applications in real-time . . .”) by applying a machine learning model(Col 2: Lines 64 – 68: “. . . an adaptive ensemble prediction model based on learning algorithms. Examples of machine learning algorithms include, but are not limited to, linear regression, decision tree, and hidden Markov models . . .”) to a compute capacity of the cloud computing system (see Col 5: Lines 20 – 24 “ . . . agents 206 located inside the cloud platform (IaaS 202) and running instances collect resource usage and status data from the VMs and the platform itself, and then send the collected data out to data engine 216 with a pre-defined time interval . . .”) and usage metrics (Col 5: Lines 48 – 51 “ . . . the training set of historical data including resource usage, statistics, as well as the corresponding instance number, [a, b, c, d, e, N].sub.training, is fed into each learning algorithm to build the prediction model . . .”), wherein: 
the compute capacity indicates a number of allocated compute instances (see Col 3: Lines 65-67; Col 4: Lines 1-7 “ . . . Given 5 learning models (104), their outputs are 10, 11, 12, 11, and 10, respectively, where the number 10 means the first learning model considers 10 instances are needed for the web services running with performance guarantees. Assume that the weight group learned during building this ensemble model 108 is (0.1, 0.15, 0.25, 0.2, 0.3), then the final output is: 10×0.1+11×0.15+12×0.25+11×0.2+10×0.3=10.85≈11. That is, the ensemble model considers that 11 instances are needed based on current workload . . . “), 
the usage metrics indicate pending task requests in a queue of the cloud computing system (see Col 3: Lines 41-52 “ . . . The input 102 includes information on current resource usage by instances and statistics on a workload running in these instances, while the output 110 is the number of instances (N). The resource usage normally refers to CPU (central processing unit) utilization (a), memory consumed (b), and network throughput (c). The workload statistics, given a web service running in the instances, refers to the number of user requests processed per unit time (which can be obtained from the application logs), and other measurements, for example, the periodicity (d) and burstiness (e) of the workload, calculated with the current user request number and historical numbers within the specific time window. . . .”), 
and providing the compute scaling adjustment to the cloud computing system, wherein the cloud computing system adjusts the number of allocated compute instances (see Col 7: Lines 17 – 25: “ . . . Analytics engine 308 manages the resource usage data of each instance running user applications, and uses the data to predict workload with a prediction model 310. The prediction model is constructed and adapted via the adaptive ensemble learning techniques described herein. The final scaling decisions are sent out to dynamically adjust the instances. With the run-time resource usage data, the resource configuration is dynamically adjusted with the changing workload in a proactive way . . .”), or a weighted sum of the number of compute instances relative to a current load  (see Col 3: Lines 59 – 64: “. . . The output from each learning model is the predicted number of instances. With a group of weights (w.sub.1, w.sub.2, w.sub.3, . . . , w.sub.i), the linear combiner 106 calculates the final output 110. For example, if we use equal weights for all learning models, the output 110 is the average of instance numbers . . .”).
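For illustration only, the weighted linear combination that Li describes (Col 3: Line 59 through Col 4: Line 7) can be reproduced with the figures quoted above.  This is a minimal sketch; the function name ensemble_instances is hypothetical and is not taken from Li, and rounding to the nearest integer is an assumption consistent with Li's "10.85≈11" example.

```python
def ensemble_instances(predictions, weights):
    """Weighted linear combination of per-model instance predictions.

    Returns the raw combined value and its nearest-integer rounding.
    The function name and the rounding rule are illustrative assumptions.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per model prediction")
    combined = sum(p * w for p, w in zip(predictions, weights))
    return combined, round(combined)


if __name__ == "__main__":
    # Values quoted from Li: five model outputs and the learned weight group.
    outputs = [10, 11, 12, 11, 10]
    weights = [0.1, 0.15, 0.25, 0.2, 0.3]
    combined, instances = ensemble_instances(outputs, weights)
    # combined is approximately 10.85, which rounds to 11 instances.
    print(combined, instances)
```

With equal weights, the same function returns the plain average of the model outputs, as Li notes.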
Li fails to explicitly teach and the machine learning model is trained using reinforcement learning and a reward function that is a function of a number of requests in the queue and the number of allocated compute instances and comprises one or more of an overage of the number of compute instances relative to a maximum number of compute instances, a number of pending processing requests in the queue, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue;  However Tesauro teaches and the machine learning model is trained using reinforcement learning and a reward function(see ¶ [0024]: “. . . the method 400 applies a reward-based learning algorithm (e.g., a Reinforcement Learning algorithm) to the training data. In one embodiment, the reward-based learning algorithm incrementally learns a value function, Q(s, a), denoting the cumulative discounted or undiscounted long-range expected value when action a is taken in state s. The value function Q(s, a) induces a new policy by application of a value-maximization principle that stipulates selecting, among all admissible actions that could be taken in state s, the action with the greatest expected value. The value function Q(s, a) may be learned by a value function learning algorithm such as Temporal Distance Learning, Q-Learning or Sarsa . . .”) that is a function of a number of requests in the queue(e.g. number of page requests) (see ¶ [0035]: “ . . .  In accordance with application of this initial policy, the autonomic manager 502 reports observations (i.e., state/action /reward tuples) to the system log data module 506, which logs the observations as training data for the reward-based learning module 508. 
In one embodiment, the application environment state, s, at time t comprises the average demand (e.g., number of page requests per second) at time t, the mean response time at time t, the mean queue length at time t and the previous resource level assigned at time t-1 . . .”) and the number of allocated compute instances(e.g. allocated resources) (see ¶ [0032]: “ . . . the application environment 500 comprises at least an autonomic manager element 502, an initial value function module 504, a system log data module 506, a reward-based learning (e.g., Reinforcement Learning) module 508 and a trained value function module 510. Interactions of the application environment 500 with its SLA 514, its client demand, its currently allocated resources . . .”; see ¶ [0034]: “ . . . pertaining to open-loop traffic, the initial value function is based on a parallel M/M/1 queuing methodology, which estimates, in the current application state, how a hypothetical change in the number of assigned servers would change anticipated mean response time (and thereby change the anticipated utility as defined by the SLA 514) . . .”).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the applicant’s invention to incorporate a method and system for implementing reward-based learning of improved systems management policies that is applied to distributed computer systems to improve the resource allocation for application requests, as taught by Tesauro, into a method and system for training prediction models using workload data that determines a number of expected instances required to execute the workloads, as taught by Li.  Such incorporation injects best practices for using reward-based machine learning algorithms when managing a cloud-based computer system, and provides a model that scales instances based on projected demands of the system.
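For illustration only, the value function Q(s, a) and the value-maximization principle quoted from Tesauro ¶ [0024] can be sketched with a standard tabular Q-learning update.  This is a textbook formulation offered as a minimal sketch, not the implementation of Tesauro or of the applicant; the function names, the state labels, the action set of instance-count changes, and the learning rate and discount values are all hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (alpha and gamma are assumed values)."""
    # Greatest expected value over all admissible actions in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move Q(s, a) toward the observed reward plus discounted future value.
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Value-maximization: select the action with the greatest Q(s, a)."""
    return max(actions, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q = defaultdict(float)      # Q-table, all entries start at zero
    actions = (-1, 0, 1)        # hypothetical changes to the instance count
    q_update(q, "s0", 1, reward=-0.25, next_state="s1", actions=actions)
    print(greedy_action(q, "s0", actions))
```

After one negative reward for scaling up in state "s0", the greedy choice in that state no longer favors scaling up.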
The combination of Li and Tesauro fails to explicitly teach and comprises one or more of an overage of the number of compute instances relative to a maximum number of compute instances, a number of pending processing requests in the queue, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue;  However Saillet teaches and comprises one or more of an overage of the number of compute instances relative to a maximum number of compute instances (see  ¶ [0002]: “ . . . A workload manager is a system that schedules the execution of one or multiple computing workflows so that the utilization of the resources required by these workflows is maximized, but does not exceed a maximum limit . . . “ see  ¶ [0006]: “ . . determining, by the machine logic, a time of execution of each workflow in the queue is further based on a maximum usage for each computing resource. Doing so ensures that a maximum usage level of a computing resource is not exceeded and that the particular computing resource is not overcommitted . . .”.), a number of pending processing requests in the queue (see  ¶ [0002] “ . . . a workload manager monitors the availability of critical resources of the system such as CPU usage, available random-access memory (RAM), input/output (I/O) utilization, etc., and maintains a queue of requests. When a workflow is added to the queue, the system starts immediately or delays the execution of the workflow based on how much resources are available. For instance, a workload manager may be set up to block any new incoming workflow if more than five workflows are already running, or if the CPU or RAM usage is more than 80%. At regular intervals, the system will check if the conditions allowing the execution of a new workflow are met or not i.e., after one of the running workflows has completed, and will start the next workflow in the queue as soon as the actual system resource usage allows it. 
In this example, the workflows in the queue have to wait until they are at the top of the queue and the conditions allowing a new workflow to be executed are met. . .”), wherein the current load (e.g. amount of resources used) is a proportion of a current number of compute instances (e.g. workflows) that is used by tasks in the queue (see ¶¶ [0009-0012] “ . . . The system includes a resource analyzer to determine an available amount of each of multiple computing resources over time at a computing device. The system also includes a workflow analyzer to determine an expected usage of each computing resource to execute each workflow in a queue. The system also includes a scheduler to determine a time of execution of each workflow in the queue based on the available amount of each of the multiple computing resources over time and the expected usage of each computing resource to execute each workflow in the queue. Such a system is more efficient and more error resistant during execution of data processing workflows. For example, as described above, different workflows may utilize computing resources differently and the current system, as opposed to classical systems, accounts for the unique computing resource consumption of each job and may thus allow certain workflows even though overall computing resource usage is above a threshold level, so long as the certain workflows would not result in an overcommitment of the resources over time.   In one optional example of the system, the system also includes a graph generator. The graph generator 1) generates a time-based graph of the available amount of each of the multiple computing resources over time and 2) superimposes on the time-based graph, expected usage of each computing resource to execute each workflow in the queue. 
This is done in order to determine if computing resource usage exceeds a maximum value over the execution of the workflow and to determine which workflow should be executed at which point in time to maximize the computing resource utilization without exceeding its maximum usage. Such a graph generator allows a user to visualize, and a system to determine, resource usage over time, which classical approaches may not account for as they may make determinations based on one point in time.  The present specification also describes a computer-implemented method. According to the computer-implemented method, machine logic schedules a number of workflows in a scheduling system for execution by a data processing system. The machine logic also optimizes a time of execution of each workflow. This is done by, 1) simulating an availability of multiple computing resources over a period of time, 2) simulating the expected usage of each computing resource to execute each workflow from a start point of execution, 3) creating a superimposed time-based graph of computing resource usage to decide which workflow to execute at which point of time, and 4) rescheduling the workflows based on the superimposed time-based graph to form a queue of workflows wherein the multiple computing resource usage is maximized. Rescheduling the workflows based on simulated resource availability and expected usage allows for customized workflow execution based on system parameters as opposed to canned, out of the box thresholds. Moreover, creating a superimposed time-based graph allows for a visualization of the resource usage over time to verify, and allow manipulation of, workflow execution.  In one optional example of the method, optimizing a time of execution of each workflow is repeated multiple times during execution of workflows in the queue . . .”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the applicant’s invention to incorporate a method and system in which machine logic determines an available amount of each of multiple computing resources over a period of time at a computing device, and also determines an expected usage of each computing resource to execute each workflow in a queue, as taught by Saillet, into a method and system for training prediction models using workload data that determines a number of expected instances required to execute the workloads, in which the models are calculated using reward-based learning of improved systems management policies that is applied to distributed computer systems to improve the resource allocation for application requests, as taught by the combination of Li and Tesauro.  Such incorporation enables the generated models to determine the levels of resources necessary to support varying numbers of instances and workloads being executed in the cloud.
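For illustration only, the threshold-based admission behavior quoted from Saillet ¶ [0002] (blocking a new workflow when more than five workflows are running or when CPU or RAM usage is more than 80%) can be sketched as follows. The class name, method names, and the way usage figures are passed in are hypothetical and are not taken from Saillet; only the two threshold values come from the quoted passage.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class WorkloadManager:
    """Hypothetical sketch of a classical, threshold-based workload manager."""
    max_running: int = 5       # "more than five workflows" blocks new work
    max_usage: float = 0.80    # "CPU or RAM usage is more than 80%" blocks new work
    running: int = 0
    queue: Deque[str] = field(default_factory=deque)

    def blocked(self, cpu: float, ram: float) -> bool:
        # Literal reading of the quoted rule: strictly more than the limits.
        return (self.running > self.max_running
                or cpu > self.max_usage
                or ram > self.max_usage)

    def submit(self, workflow: str) -> None:
        # New requests always enter the queue first.
        self.queue.append(workflow)

    def poll(self, cpu: float, ram: float) -> Optional[str]:
        # Called at regular intervals: start the workflow at the top of the
        # queue only when the admission conditions are met.
        if self.queue and not self.blocked(cpu, ram):
            self.running += 1
            return self.queue.popleft()
        return None

    def complete(self) -> None:
        # A running workflow has finished and frees a slot.
        self.running = max(0, self.running - 1)
```

The sketch reflects only the single-point-in-time check that Saillet describes as the classical approach; it does not model the time-based graph of ¶¶ [0009-0012].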
In regard to claim 2, the combination of Li, Tesauro, and Saillet teaches further comprising receiving a processing request from a client computing device, wherein determining the compute scaling adjustment further comprises applying the machine learning model to the processing request along with the compute capacity of the cloud computing system and the usage metrics (see Li Col 3: Lines 41-64 as described for the rejection of claim 1 and is incorporated herein).
In regard to claim 3, the combination of Li, Tesauro, and Saillet teaches further comprising removing the processing request from the queue of the cloud computing system and executing the processing request (see Li: Col 4: Lines 17-23 “ . . . a tenant consumes the virtual resource in the cloud platform and runs his applications and services. The appropriate configuration of virtual resources is fundamental to meet the service level requirements in terms of performance, availability, capacity, etc., and to satisfy the experience of the user of the application (and any service level agreement). . .”).
In regard to claim 4, the combination of Li, Tesauro, and Saillet teaches wherein adjusting the number of compute instances comprises allocating one or more hardware devices to the cloud scaling system or removing the one or more hardware devices from the cloud computing system (see Li Col 7: Lines 17 -25: “. . . Analytics engine 308 manages the resource usage data of each instance running user applications, and uses the data to predict workload with a prediction model 310. The prediction model is constructed and adapted via the adaptive ensemble learning techniques described herein. The final scaling decisions are sent out to dynamically adjust the instances. With the run-time resource usage data, the resource configuration is dynamically adjusted with the changing workload in a proactive way . . .”; Col 7: Lines 56 – 67; Col 8: Lines 1-2: “ . . . As an example of a processing platform on which a workload prediction system (e.g., 300 of FIG. 3) can be implemented is processing platform 500 shown in FIG. 5. The processing platform 500 in this embodiment comprises a plurality of processing devices, denoted 502-1, 502-2, 502-3, . . . 502-N, which communicate with one another over a network 504. It is to be appreciated that the methodologies described herein may be executed in one such processing device 502, or executed in a distributed manner across two or more such processing devices 502. It is to be further appreciated that a server, a client device, a computing device or any other processing platform element may be viewed as an example of what is more generally referred to herein as a "processing device." . . .”).
In regard to claim 5, the combination of Li, Tesauro, and Saillet teaches further comprising: computing a reward value by evaluating the reward function (see Tesauro - ¶ [0024]: “ . . . one applies to each observed state/action/reward tuple the following learning algorithm: .DELTA.Q(z.sup.t)=.alpha.(t)[r.sup.t+.gamma.Q(z.sup.t+1)-Q(z.sup.1)] (EQN. 1) 
where Z.sup.t is the initial embedded (state, action) pair at time t, r.sup.t is the immediate reward at time t for taking the action a.sup.t in the initial state s.sup.t, z.sup.t+1 is the next embedded (state, action) pair at time t+1, .gamma. is a constant representing a "discount parameter" (having a value between zero and one that expresses the present value of an expected future reward) and .alpha.(t) is a "learning rate" parameter that decays to zero asymptotically to ensure convergence . . .”); and providing the reward value to the machine learning model (see Tesauro  - ¶ [0036] “ . . . The system log data module 506 provides training data (logged observations) to the reward-based learning module 508, which applies an reward-based learning algorithm to the training data in order to learn a new value function Q(s, n) that estimates the long-term value of the allocation of a specified resource (e.g., n servers) to the application environment operating in its current state s . . .”), wherein the machine learning model adjusts one or more internal parameters to maximize a cumulative reward (e.g. long-range value function Q(s, n)) (see Tesauro  -¶ [0036] “ . . . the new value function Q(s, n) is represented by a standard multi-layer perceptron function approximator comprising one input unit per state variable in the state description at time t, one input unit to represent the resource level (e.g., number of servers) assigned at time t, a single hidden layer comprising twelve sigmoidal hidden units and a single linear output unit estimating the long-range value function Q(s, n).  . . .”).
The motivation to combine the references is described for the rejection of claim 1 and is incorporated herein.  Additionally, Tesauro provides reward algorithms that choose a best learning model that can be used to choose the instances to be allocated.
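For illustration only, the update rule quoted as EQN. 1 from Tesauro ¶ [0024] can be read as ΔQ(z_t) = α(t)[r_t + γ Q(z_{t+1}) - Q(z_t)], on the assumption that the last term of the quoted text refers to the pair at time t. The sketch below applies that update to a lookup table keyed by (state, action) pairs. The table is a simplification chosen for brevity; Tesauro ¶ [0036] instead describes a multi-layer perceptron function approximator. The function and variable names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Tuple

Pair = Tuple[Hashable, Hashable]  # embedded (state, action) pair z


def td_update(q: DefaultDict[Pair, float], z_t: Pair, reward: float,
              z_next: Pair, alpha: float, gamma: float) -> float:
    """Apply one update of the form in EQN. 1 and return the change in Q(z_t)."""
    # delta = alpha * [ r_t + gamma * Q(z_{t+1}) - Q(z_t) ]
    delta = alpha * (reward + gamma * q[z_next] - q[z_t])
    q[z_t] += delta
    return delta


# Usage: the state is a label and the action is a number of assigned servers.
q_table: DefaultDict[Pair, float] = defaultdict(float)
q_table[("busy", 3)] = 10.0
change = td_update(q_table, ("idle", 2), reward=1.0,
                   z_next=("busy", 3), alpha=0.5, gamma=0.9)
```

In the usage lines the change equals 0.5 × (1.0 + 0.9 × 10.0 - 0.0) = 5.0, so the stored value for the pair ("idle", 2) moves from 0.0 to 5.0.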
In regard to claim 7, Li teaches a cloud computing system comprising (see Fig. 2, Col 2: Line 5: “. . .  FIG. 2 illustrates a real-time workload prediction system . . .”): 
a processor (see Fig. 5 Col 7: Lines 56 – 62: “. . . As an example of a processing platform on which a workload prediction system (e.g., 300 of FIG. 3) can be implemented is processing platform 500 shown in FIG. 5. The processing platform 500 in this embodiment comprises a plurality of processing devices, denoted 502-1, 502-2, 502-3, . . . 502-N, which communicate with one another over a network 504 . . .”); and non-transitory computer-readable storage medium storing computer-executable program instructions (Col 8: Lines 9 – 10: “. . . The processing device 502-1 in the processing platform 500 comprises a processor 510 coupled to a memory 512 . . .”), wherein when executed by the processor, the computer-executable program instructions cause the processor to perform operations comprising (see Col 8: Lines 16 – 30: “. . . Components of systems as disclosed herein can be implemented at least in part in the form of one or more software programs stored in memory and executed by a processor of a processing device such as processor 510. Memory 512 (or other storage device) having such program code embodied therein is an example of what is more generally referred to herein as a processor-readable storage medium. Articles of manufacture comprising such processor-readable storage media are considered embodiments of the invention. A given such article of manufacture may comprise, for example, a storage device such as a storage disk, a storage array or an integrated circuit containing memory. The term "article of manufacture" as used herein should be understood to exclude transitory, propagating signals . . .”): 
determining, for a cloud computing system, a compute scaling adjustment (e.g. prediction model) (Col 4: Lines 41 – 45 “. . . an adaptive ensemble prediction model based on machine learning algorithms is provided to predict the workload of the tenant's applications in real-time . . .”) that indicates an adjustment to a number of compute instances of the cloud computing system by applying a machine learning model (Col 2: Lines 64 – 68: “. . . an adaptive ensemble prediction model based on learning algorithms. Examples of machine learning algorithms include, but are not limited to, linear regression, decision tree, and hidden Markov models . . .”) to a compute capacity of the cloud computing system (see Col 5: Lines 20 – 24 “ . . . agents 206 located inside the cloud platform (IaaS 202) and running instances collect resource usage and status data from the VMs and the platform itself, and then send the collected data out to data engine 216 with a pre-defined time interval . . .”) and usage metrics (Col 5: Lines 48 – 51 “ . . . the training set of historical data including resource usage, statistics, as well as the corresponding instance number, [a, b, c, d, e, N].sub.training, is fed into each learning algorithm to build the prediction model. . .”), wherein: the compute capacity (see Col 3: Lines 41 – 46: “ . . . The input 102 includes information on current resource usage by instances and statistics on a workload running in these instances, while the output 110 is the number of instances (N). The resource usage normally refers to CPU (central processing unit) utilization (a), memory consumed (b), and network throughput (c) . . .”) indicates a number of allocated compute instances (see Col 6: Lines 41-48: “. . . the output of model 210 is the number of expected instances. Module 212 generates decisions on resource adjustment based on the output from model 210, the current instance number, and information on resource consumption. 
If module 212 generates a scaling up decision, for example, launching a new instance running the web service on a certain host, the compute node 214 would operate to start a new instance . . .”) and, 
the usage metrics indicate pending task requests in a queue (e.g. current user request number) of the cloud computing system(see Col 3: Lines 46 – 55 “ . . . The workload statistics, given a web service running in the instances, refers to the number of user requests processed per unit time (which can be obtained from the application logs), and other measurements, for example, the periodicity (d) and burstiness (e) of the workload, calculated with the current user request number and historical numbers within the specific time window. The periodicity can be measured by auto-correlation, and the burstiness can be measured using entropy. Briefly, the input 102 is a series of numbers [a, b, c, d, e] . . .”), or a weighted sum of the number of compute instances relative to a current load (see Col 3: Lines 59 – 64: “. . . The output from each learning model is the predicted number of instances. With a group of weights (w.sub.1, w.sub.2, w.sub.3, . . . , w.sub.i), the linear combiner 106 calculates the final output 110. For example, if we use equal weights for all learning models, the output 110 is the average of instance numbers . . .”), 
and providing the compute scaling adjustment to the cloud computing system, wherein the cloud computing system adjusts the number of allocated compute instances  (see Col 7: Lines 17 – 25: “ . . . Analytics engine 308 manages the resource usage data of each instance running user applications, and uses the data to predict workload with a prediction model 310. The prediction model is constructed and adapted via the adaptive ensemble learning techniques described herein. The final scaling decisions are sent out to dynamically adjust the instances. With the run-time resource usage data, the resource configuration is dynamically adjusted with the changing workload in a proactive way . . .”).
Li fails to explicitly teach and the machine learning model is trained using reinforcement learning and a reward function that is a function of a number of requests in the queue and the number of allocated compute instances and comprises one or more of an overage of the number of compute instances relative to a maximum number of compute instances, a number of pending processing requests in the queue, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue;  However Tesauro teaches and the machine learning model is trained using reinforcement learning and a reward function (see ¶ [0024]: “. . . the method 400 applies a reward-based learning algorithm (e.g., a Reinforcement Learning algorithm) to the training data. In one embodiment, the reward-based learning algorithm incrementally learns a value function, Q(s, a), denoting the cumulative discounted or undiscounted long-range expected value when action a is taken in state s. The value function Q(s, a) induces a new policy by application of a value-maximization principle that stipulates selecting, among all admissible actions that could be taken in state s, the action with the greatest expected value. The value function Q(s, a) may be learned by a value function learning algorithm such as Temporal Distance Learning, Q-Learning or Sarsa . . .”) that is a function of a number of requests in the queue (e.g. number of page requests) (see ¶ [0035]: “ . . .  In accordance with application of this initial policy, the autonomic manager 502 reports observations (i.e., state/action /reward tuples) to the system log data module 506, which logs the observations as training data for the reward-based learning module 508. 
In one embodiment, the application environment state, s, at time t comprises the average demand (e.g., number of page requests per second) at time t, the mean response time at time t, the mean queue length at time t and the previous resource level assigned at time t-1 . . .”)  and the number of allocated compute instances (e.g. allocated resources) (see ¶ [0032]: “ . . . the application environment 500 comprises at least an autonomic manager element 502, an initial value function module 504, a system log data module 506, a reward-based learning (e.g., Reinforcement Learning) module 508 and a trained value function module 510. Interactions of the application environment 500 with its SLA 514, its client demand, its currently allocated resources . . .”; see ¶ [0034]: “ . . . pertaining to open-loop traffic, the initial value function is based on a parallel M/M/1 queuing methodology, which estimates, in the current application state, how a hypothetical change in the number of assigned servers would change anticipated mean response time (and thereby change the anticipated utility as defined by the SLA 514) . . .”).
The motivation to combine Tesauro with Li is described for the rejection of claim 1 and is incorporated herein.
The combination of Li and Tesauro fails to explicitly teach and comprises one or more of an overage of the number of compute instances relative to a maximum number of compute instances, a number of pending processing requests in the queue, wherein the current load is a proportion of a current number of compute instances that is used by tasks in the queue;  However Saillet teaches and comprises one or more of an overage of the number of compute instances relative to a maximum number of compute instances (see ¶ [0002], ¶ [0006] as described for the rejection of claim 1 and is incorporated herein), a number of pending processing requests in the queue (see  ¶ [0002] as described for the rejection of claim 1 and is incorporated herein), wherein the current load (e.g. amount of resources used) is a proportion of a current number of compute instances (e.g. workflows) that is used by tasks in the queue (see ¶¶ [0009-0012] as described for the rejection of claim 1 and is incorporated herein).
The motivation to combine Saillet with the combination of Li and Tesauro is described for the rejection of claim 1 and is incorporated herein. 
In regard to claim 8, the combination of Li, Tesauro, and Saillet teaches wherein when executed by the processor, the computer-executable program instructions further cause the processor to perform operations (see Li Col 8: Lines 9 – 10 as described for the rejection of claim 7 and is incorporated herein) comprising: receiving a processing request from a client computing device (see Li Col 3: Lines 41 – 52: “. . . The input 102 includes information on current resource usage by instances and statistics on a workload running in these instances, while the output 110 is the number of instances (N). The resource usage normally refers to CPU (central processing unit) utilization (a), memory consumed (b), and network throughput (c). The workload statistics, given a web service running in the instances, refers to the number of user requests processed per unit time (which can be obtained from the application logs), and other measurements, for example, the periodicity (d) and burstiness (e) of the workload, calculated with the current user request number and historical numbers within the specific time window . . . “); and applying the machine learning model to the processing request along with the compute capacity of the cloud computing system and the usage metrics (see Li Col 3: Lines 41-64: “ . . . The input 102 includes information on current resource usage by instances and statistics on a workload running in these instances, while the output 110 is the number of instances (N). The resource usage normally refers to CPU (central processing unit) utilization (a), memory consumed (b), and network throughput (c). 
The workload statistics, given a web service running in the instances, refers to the number of user requests processed per unit time (which can be obtained from the application logs), and other measurements, for example, the periodicity (d) and burstiness (e) of the workload, calculated with the current user request number and historical numbers within the specific time window. The periodicity can be measured by auto-correlation, and the burstiness can be measured using entropy. Briefly, the input 102 is a series of numbers [a, b, c, d, e]. Specifically, classification methods such as the decision tree might perform better with discrete data; in such case, the discretization is needed before feeding inputs to the classification model. The output from each learning model is the predicted number of instances. With a group of weights (w.sub.1, w.sub.2, w.sub.3, . . . , w.sub.i), the linear combiner 106 calculates the final output 110. For example, if we use equal weights for all learning models, the output 110 is the average of instance numbers . . .”).
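For illustration only, the linear combiner 106 quoted from Li Col 3: Lines 59 – 64 can be sketched as a weighted sum of the per-model instance predictions, with equal weights reducing to the average as the quoted passage states. The function name and the default of equal weights summing to one are assumptions for the sketch; Li's quoted text does not state how the weights are normalized.

```python
from typing import Optional, Sequence


def combine_predictions(predictions: Sequence[float],
                        weights: Optional[Sequence[float]] = None) -> float:
    """Weighted combination of predicted instance counts from several models."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    if weights is None:
        # Equal weights: the output is the average of the instance numbers.
        weights = [1.0 / len(predictions)] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("one weight is required per prediction")
    return sum(w * p for w, p in zip(weights, predictions))


# Usage: three learning models predict 2, 4, and 6 instances.
equal = combine_predictions([2, 4, 6])
weighted = combine_predictions([2, 4, 6], weights=[0.5, 0.25, 0.25])
```

With equal weights the three predictions combine to 4 instances; with weights 0.5, 0.25, and 0.25 they combine to 3.5, which a scaling decision would still need to round to a whole number of instances.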
In regard to claim 9, the combination of Li, Tesauro, and Saillet teaches wherein when executed by the processor, the computer-executable program instructions further cause the processor to perform operations (see Li Col 8: Lines 9 – 10 as described for the rejection of claim 7 and is incorporated herein) comprising removing the processing request from the queue of the cloud computing system and executing the processing request (see Li: Col 4: Lines 17-23 “ . . . a tenant consumes the virtual resource in the cloud platform and runs his applications and services. The appropriate configuration of virtual resources is fundamental to meet the service level requirements in terms of performance, availability, capacity, etc., and to satisfy the experience of the user of the application (and any service level agreement). . .” ).
In regard to claim 10, the combination of Li, Tesauro, and Saillet teaches wherein adjusting the number of compute instances comprises allocating one or more hardware devices to the cloud scaling system or removing the one or more hardware devices from the cloud computing system (see Li Col 7: Lines 17 -25: “. . . Analytics engine 308 manages the resource usage data of each instance running user applications, and uses the data to predict workload with a prediction model 310. The prediction model is constructed and adapted via the adaptive ensemble learning techniques described herein. The final scaling decisions are sent out to dynamically adjust the instances. With the run-time resource usage data, the resource configuration is dynamically adjusted with the changing workload in a proactive way . . .”; Col 7: Lines 56 – 67; Col 8: Lines 1-2: “ . . . As an example of a processing platform on which a workload prediction system (e.g., 300 of FIG. 3) can be implemented is processing platform 500 shown in FIG. 5. The processing platform 500 in this embodiment comprises a plurality of processing devices, denoted 502-1, 502-2, 502-3, . . . 502-N, which communicate with one another over a network 504. It is to be appreciated that the methodologies described herein may be executed in one such processing device 502, or executed in a distributed manner across two or more such processing devices 502. It is to be further appreciated that a server, a client device, a computing device or any other processing platform element may be viewed as an example of what is more generally referred to herein as a "processing device." . . .”).
In regard to claim 11, the combination of Li, Tesauro, and Saillet teaches wherein when executed by the processor, the computer-executable program instructions further cause the processor to perform operations (see Li Col 8: Lines 9 – 10 as described for the rejection of claim 7 and is incorporated herein) comprising: computing a reward value by evaluating the reward function (see Tesauro - ¶ [0024]: “ . . . one applies to each observed state/action/reward tuple the following learning algorithm: .DELTA.Q(z.sup.t)=.alpha.(t)[r.sup.t+.gamma.Q(z.sup.t+1)-Q(z.sup.1)] (EQN. 1) 
where Z.sup.t is the initial embedded (state, action) pair at time t, r.sup.t is the immediate reward at time t for taking the action a.sup.t in the initial state s.sup.t, z.sup.t+1 is the next embedded (state, action) pair at time t+1, .gamma. is a constant representing a "discount parameter" (having a value between zero and one that expresses the present value of an expected future reward) and .alpha.(t) is a "learning rate" parameter that decays to zero asymptotically to ensure convergence . . .”); and 
providing the reward value to the machine learning model (see Tesauro  - ¶ [0036] “ . . . The system log data module 506 provides training data (logged observations) to the reward-based learning module 508, which applies an reward-based learning algorithm to the training data in order to learn a new value function Q(s, n) that estimates the long-term value of the allocation of a specified resource (e.g., n servers) to the application environment operating in its current state s . . .”), wherein the machine learning model adjusts one or more internal parameters to maximize a cumulative reward (e.g. long-range value function Q(s, n)) (see Tesauro  -¶ [0036] “ . . . the new value function Q(s, n) is represented by a standard multi-layer perceptron function approximator comprising one input unit per state variable in the state description at time t, one input unit to represent the resource level (e.g., number of servers) assigned at time t, a single hidden layer comprising twelve sigmoidal hidden units and a single linear output unit estimating the long-range value function Q(s, n).  . . .”).
The motivation to combine the references is described for the rejection of claim 1 and is incorporated herein.  Additionally, Tesauro provides reward algorithms that choose a best learning model that can be used to choose the instances to be allocated.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.  It is listed on the PTO-892 accompanying this action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES N FIORILLO whose telephone number is (571)272-9909.  The examiner can normally be reached Mon - Fri, 7:30 - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John A. Follansbee, can be reached on 571-272-3964.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JAMES N FIORILLO/Examiner, Art Unit 2444