DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to Applicant’s Amendment and Remarks filed on 30 June 2022. 
Claims 1-21 are pending in this application.


Claim Objections
Claims 6, 14 and 20 are objected to because of the following informalities:
In claims 6, 14 and 20 (line numbers refer to claim 6), lines 8-9 recite “deploy…on the at least one worker host”. However, lines 6-7 earlier recite “selected at least one worker host”. Thus, it is unclear whether the second recitation of “at least one worker host” refers to the same worker host as the first recitation of “selected at least one worker host” or to a different one. The claims should be amended to recite “deploy…on the selected at least one worker host”.
Appropriate correction is required.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.  
Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Step 1, Statutory Category: Yes, claim 1 is a computer-implemented method that recites a series of steps and therefore falls in the statutory category of a process.
Step 2A- Prong 1: Judicial Exception Recited: Yes, the claim recites: “in response to the load information for the first role meeting the first condition, increasing, a number of the first application containers by the first scaling factor, and increasing a number of second application containers by the second scaling factor” and “selecting, based on loads of the plurality of worker hosts, at least one worker host of the plurality of worker hosts on which to deploy the new first application container and the new second application container”. As drafted, the claim as a whole recites a method including steps that could be performed in the human mind, but for the recitation of generic computing components. The human mind can readily judge, evaluate, or plan an adjustment to a resource allocation (i.e., scaling in or out by removing or adding resources/instances) based on workload information, and can determine or select a proper resource (i.e., a host) for deployment based on the load information of the hosts. For example, a person can readily evaluate, determine, judge, or plan whether to add or remove resources for processing the application based on received workload information, and can select a less loaded physical host on which to deploy the new resources (i.e., by judging whether each physical host is heavily or lightly loaded). Therefore, but for the recitation of generic computing components, these steps fall within the Mental Processes grouping, i.e., concepts that can be performed in the human mind (including an observation, evaluation, judgment, opinion).
Therefore, yes, the claims do recite judicial exceptions.
Step 2A- Prong 2: Integrated into a practical Application: No, this judicial exception is not integrated into a practical application. In particular, the claim recites the additional limitations “receiving information regarding a role of a plurality of roles, wherein a second role of the plurality of roles is dependent upon a first role of the plurality of roles”, “receiving load information for the first role”, “collect load information of the new first application container” and “collect load information of the new second application container”, which are insignificant pre-solution data gathering (see MPEP § 2106.05(g)). In addition, the claim recites “a plurality of application containers”, “a first virtual cluster”, “a distributed processing environment on which a stateful application is or will be deployed”, “the stateful application performed by each respective application container of the plurality of application containers,” “wherein the distributed processing environment comprises a plurality of worker hosts containing multiple application containers for respective virtual clusters including the first virtual cluster, and wherein the plurality of application containers in the first virtual cluster are distributed across multiple worker hosts of the plurality of worker hosts”, “a controller host of the distributed processing environment”, “container agents in first application containers”, “a new first application container”, “new second application container” and “a respective container agent”. These elements have been evaluated under MPEP 2106.05(b) (applying the judicial exception with, or by use of, a particular machine) and are an attempt to generally link the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
Further, the limitation “maintaining, a set of role-based autoscaling policies defining conditions under which the plurality of roles are to be scaled, wherein a role-based autoscaling policy of the set of role-based autoscaling policies for the first role specifies a first condition that triggers scaling out of the first role by a first scaling factor and scaling out of the second role by a second scaling factor in tandem” is insignificant extra-solution activity (i.e., mere data storing; see MPEP § 2106.05(g)). The combination of these additional elements is no more than mere instructions to apply the exception using a generic computer component. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to the abstract idea.
Step 2B: Claim provides an Inventive Concept: No. As discussed with respect to Step 2A Prong Two, the additional elements “a plurality of application containers”, “a first virtual cluster”, “a distributed processing environment on which a stateful application is or will be deployed”, “the stateful application performed by each respective application container of the plurality of application containers,” “wherein the distributed processing environment comprises a plurality of worker hosts containing multiple application containers for respective virtual clusters including the first virtual cluster, and wherein the plurality of application containers in the first virtual cluster are distributed across multiple worker hosts of the plurality of worker hosts”, “a controller host of the distributed processing environment”, “container agents in first application containers”, “a new first application container”, “new second application container” and “a respective container agent” have been evaluated under MPEP 2106.05(b) (applying the judicial exception with, or by use of, a particular machine) and are an attempt to generally link the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
In addition, the limitations “receiving information regarding a role of a plurality of roles, wherein a second role of the plurality of roles is dependent upon a first role of the plurality of roles”, “receiving load information for the first role”, “collect load information of the new first application container” and “collect load information of the new second application container” are insignificant pre-solution data gathering (see MPEP § 2106.05(g)), and the limitation “maintaining, a set of role-based autoscaling policies defining conditions under which the plurality of roles are to be scaled, wherein a role-based autoscaling policy of the set of role-based autoscaling policies for the first role specifies a first condition that triggers scaling out of the first role by a first scaling factor and scaling out of the second role by a second scaling factor in tandem” is insignificant extra-solution activity (i.e., mere data storing; see MPEP § 2106.05(g)); both are additionally well understood, routine, conventional activity (see MPEP § 2106.05(d)). Courts have identified “receiving and transmitting data, storing and retrieving information”, et cetera, as well understood, routine, conventional activity, and the remaining elements are an attempt to generally link the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)). The same analysis applies here in Step 2B, i.e., mere instructions to apply an exception on a generic computer cannot integrate a judicial exception into a practical application at Step 2A, and likewise cannot supply an inventive concept at Step 2B. These additional elements, individually and in combination, do not amount to significantly more than the exception itself and do not provide an inventive concept in Step 2B.

Under the 2019 PEG, a conclusion that an additional element is insignificant extra-solution activity in Step 2A should be re-evaluated in Step 2B. Here, the receiving and collecting steps were considered in Step 2A to be insignificant pre-solution data gathering, and the maintaining step was considered to be insignificant extra-solution activity; thus, these steps are re-evaluated in Step 2B to determine whether they are more than what is well understood, routine, conventional activity in the field.
The receiving and collecting steps serve the purpose of “collecting” data, and this characterization is supported by the court decision in Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016) (collection, analysis and display of data); see MPEP § 2106.05(g). Additionally, the maintaining step serves the purpose of “storing” data, and this characterization is supported by the court decisions on storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93. Accordingly, a conclusion that the receiving, collecting and maintaining steps are well understood, routine, conventional activity is supported under Berkheimer option 2.

For these reasons, there is no inventive concept in the claim, and thus the claim is ineligible. 

Independent claims 9 and 17 are rejected for the same reason as claim 1 above. Claim 17 further recites “a processor” and “a non-transitory computer-readable medium”. These additional elements merely generally link the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).

With respect to dependent claim 2, the claim elaborates that the role-based autoscaling policy also specifies a second condition that triggers scaling in of the first role by a third scaling factor and scaling in of the second role by a fourth scaling factor in tandem, the method further comprising: in response to the load information for the first role meeting the second condition, decreasing the number of the first application containers in the first virtual cluster that perform the first role by the third scaling factor, and decreasing the number of the second application containers in the first virtual cluster that perform the second role by the fourth scaling factor (“decreasing” (i.e., adjusting) the number of the first/second application containers based on the “load information” is treated as part of the abstract idea and is analogous to Mental Processes, in that the concept can be performed in the human mind. In addition, the claim as a whole is a Mental Process that can be performed in the human mind (including an observation, evaluation, judgment, opinion)).

With respect to dependent claim 3, the claim elaborates fetching, by the controller host, the set of role-based autoscaling policies from a database, wherein the role-based autoscaling policy defines an evaluation period at which the load information is to be evaluated; and based on the evaluation period of the role-based autoscaling policy, retrieving, by the controller host, the load information that has been previously gathered from each first application container of the first virtual cluster and stored in storage (“fetching” and “retrieving” are treated as well understood, routine, conventional activity such that these additional elements do not integrate the abstract idea into a practical application; supporting evidence can be found in Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016) (collection, analysis and display of data), see MPEP § 2106.05(g), and in the decisions on storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93).


With respect to dependent claim 4, the claim elaborates that different ones of the virtual clusters are to execute respective different stateful applications (“virtual clusters are to execute respective different stateful applications” is treated as generally linking the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h))).


With respect to dependent claim 5, the claim elaborates that the load information comprises a measure of central processing unit (CPU) usage by the first application containers, a measure of memory usage by the first application containers, a measure of disk usage by the first application containers, or a measure of network usage by the first application containers (“a measure of central processing unit (CPU) usage”, “a measure of memory usage”, “a measure of disk usage” and “a measure of network usage” are treated as generally linking the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h))).


With respect to dependent claim 6, the claim elaborates sending, by the controller host, a request to scale out to a worker agent in the selected at least one worker host, the request to scale out to cause the worker agent to deploy the new first application container and the new second application container on the at least one worker host (“sending” the request is treated as insignificant extra-solution activity (see MPEP § 2106.05(g)) and as well understood, routine, conventional activity such that this additional element does not integrate the abstract idea into a practical application; supporting evidence can be found in Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016); see MPEP § 2106.05(g)).

With respect to dependent claim 7, the claim elaborates that the at least one worker host previously did not include any of the first application containers of the first virtual cluster (“worker host” and “first application containers of the first virtual cluster” are treated as generally linking the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h))).


With respect to dependent claim 8, the claim elaborates that the load information meeting the first condition comprises the load information meeting any one or more of a set of conditions that include the first condition (“load information” meeting “any one or more of a set of conditions” is treated as generally linking the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)). In addition, determining/judging whether the load information is “meeting” the conditions is treated as part of the abstract idea and is analogous to Mental Processes, in that the concept can be performed in the human mind. In addition, the claim as a whole is a Mental Process that can be performed in the human mind (including an observation, evaluation, judgment, opinion)).

Dependent claims 10-16 recite the same features as claims 2-8, respectively, and are therefore rejected under the same rationale as applied above.

Dependent claims 18-21 recite the same features as claims 2-3, 6 and 5, respectively, and are therefore rejected under the same rationale as applied above.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-13, 15-19 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over EINKAUF et al. (US Pub. 2016/0323377 A1) in view of Bauman et al. (US Patent 10,333,901 B1) and further in view of Tang et al. (US Patent 10,409,642 B1), SHEN et al. (US Pub. 2018/0316751 A1) and McDowell (US Patent 10,637,738 B1).
EINKAUF and Tang were cited in the previous Office Action.
SHEN was cited in the IDS filed on 06/27/2022.

As per claim 1, EINKAUF teaches the invention substantially as claimed including A computer-implemented method comprising: 
for a plurality of application containers in a first virtual cluster of a distributed processing environment on which a stateful application is or will be deployed (EINKAUF, Fig. 1, 100 provider network (as distributed processing environment), 120 MapReduce cluster (as first virtual cluster), 125A-I instances (as plurality of application containers such as the containers described in [0024], [0026], [0030], etc.); Fig. 2, 220, begin executing a distributed application on the cluster of nodes; Fig. 6, 610, a service receives a request to create a cluster of virtualized computing resource instances on which to execute a given application (as stateful application) (or computation thereof) on behalf of a service customer or subscriber; [0002] line 3, large-scale computing resources; [0025] lines 4-13, the systems may avoid removing a node during an operation to reduce capacity if the node stores important state information (as stateful application, since the state is stored) (e.g., if it stores data and it cannot be gracefully decommissioned)…In other words, unlike in existing auto-scaling solutions, the systems described herein may apply intelligence in scaling operations due to the unique behaviors of at least some of the nodes; also see [0024] lines 1-3, Existing auto-scaling solutions are typically designed for stateless workloads in systems with homogeneous nodes), receiving information regarding a role of a plurality of roles of the stateful application performed by each respective application container of the plurality of application containers (EINKAUF, Fig. 6, 630, the service receiving input defining an expression to be evaluated as part of an auto-scaling policy, and the expression may include one or more default metrics that are emitted by the service provider system, by the cluster, or by the given application, and/or one or more custom metrics (as receiving information (i.e., metrics)) that are emitted by the application or that are created through aggregation; [0037] lines 4-9, during execution of the application, gathering and/or aggregating metrics that are relevant to trigger condition(s), as in 230. Examples of such metrics (some of which may be application-specific, workload-specific, and/or specific to a particular instance group); also see [0036] As illustrated at 210, in this example, the method may include a service provider or service receiving input from a client associating one or more auto-scaling policies with a cluster of nodes; Abstract, Different policies may be applied to different subsets of cluster resources (e.g., different instance groups containing nodes of different types or having different roles) (i.e., user input received at step 210 defines at least a role of a cluster of nodes and an associated auto-scaling policy to be applied to the cluster having the defined role); Claim 15, lines 1-6, wherein each one of the two or more instance groups comprises computing resource instances of a respective different type or computing resource instances having a respective different role in the execution of the distributed application on the cluster (i.e., application specific metrics relate to the roles of the application executed on the cluster)); 
maintaining, by a controller host of the distributed processing environment, a set of role-based autoscaling policies defining conditions under which the plurality of roles are to be scaled (EINKAUF, Fig. 1, 170 resource management data base; Fig. 13, 1300 provider network, 1310 virtualization service (as controller host); [0090] lines 10-11, auto-scaling policy document may be stored in the control plane; [0165] lines 2-4, such components may be implemented within the control plane of virtualization services 1310 (as maintained by the virtualization services (controller host)); [0036] lines 5-11, one or more auto-scaling policies (as set of role-based autoscaling policies) with a cluster of nodes. As illustrated in this example, each of the policies may be dependent on one or more trigger conditions and may specify a particular auto-scaling action to be taken if/when trigger conditions are met (e.g., increasing or decreasing the number of nodes in the cluster or within an instance group within the cluster); [0037] lines 4-9, during execution of the application, gathering and/or aggregating metrics that are relevant to trigger condition(s), as in 230. Examples of such metrics (some of which may be application-specific, workload-specific, and/or specific to a particular instance group) (as plurality of roles to be scaled); also see [0061] lines 2-4, define an auto-scaling policy that is dependent on expressions based on a variety of trigger types (metrics) from a variety of trigger sources; lines 14-20, the trigger data may include performance or behavior metrics, storage metrics (e.g., consumption of storage, remaining capacity), cron-like expressions (e.g., time information, clock/calendar types of triggering information), metrics indicating the state or number of pending or currently executing jobs, pricing information, cost information, or other metrics), wherein a role-based autoscaling policy of the set of role-based autoscaling policies specifies [conditions] that trigger scaling out of the first role by a first scaling factor and scaling out of the second role by a second scaling factor (EINKAUF, [Abstract] Different policies may be applied to different subsets of cluster resources (e.g., different instance groups containing nodes of different types or having different roles) (i.e., a first autoscaling policy is applied to a first group of nodes having a first role, while a second autoscaling policy is applied to a second group of nodes having a second role); [0091] lines 3-6, combine auto-scaling policies (e.g., the user may include multiple auto-scaling rules within a single policy or may associate multiple auto-scaling policies (each defining one or more auto-scaling rules) with the same cluster or instance group thereof) (the combined policies with the same cluster as a role-based autoscaling policy); [0038] lines 9-13, when an auto-scaling trigger condition is detected based on the obtained and/or aggregated metrics, shown as the positive exit from 240, the method may include initiating the taking of the corresponding auto-scaling action; [0064] lines 2-3, automatically scale up or down when triggered by one or more of the following; [0066]-[0068] (as including conditions); [0066] lines 1-7, a cluster metric…For example, an auto-scaling action (e.g., an action to add capacity) may be triggered if the storage-to-virtualized-computing-service throughput is greater than or equal to 100 for at least 120 minutes (as second role); [0068] lines 1-3, The day (or date) and/or time - For example, an auto-scaling action (e.g., an action to add or reduce capacity) may be triggered every Saturday at 17:00 (as first role). [0069] lines 8-10, an auto-scaling policy may contain one or more rules, and each rule may contain some or all of the following elements; [0077] lines 1-5, the amount or percentage of capacity (e.g., the number or percentage of resource instances) to add to…For example the policy may specify the change in resource capacity as one of the following: [0078] lines 1-2, “5” (e.g., 5 resource instances should be added or removed) (as scaling out of the first role by a first scaling factor); [0103] lines 4-7, For example, this expression may be included in an auto-scaling policy specifying that an auto-scaling action (e.g., adding 20 nodes to the cluster (as scaling out of the second role by a second scaling factor)) should be performed every Saturday night at midnight (i.e., triggering a combined auto-scaling policy auto-scales a first group of nodes having a first role by a first factor (e.g., adding a number or percentage of resource instances), while auto-scaling a second group of nodes having a second role by a second factor (e.g., reducing the number or percentage of resource instances))); and 
in response to the load information meeting the conditions, increasing, by the controller host, a number of the first application containers in the first virtual cluster that perform the first role by the first scaling factor, and increasing a number of second application containers in the first virtual cluster that perform the second role by the second scaling factor, wherein the increased number of the first application containers includes a new first application container, and the increased number of the second application containers includes a new second application container; and deploy the new first application container and the new second application container (EINKAUF, Fig. 9, 940, 950, yes, to 960 and 970 the resource manager for the cluster initiates the auto-scaling action, in accordance with the corresponding auto-scaling policy; [0165] lines 1-4, although no monitoring components or auto-scaling rules engines are shown in FIG. 13, such components may be implemented within the control plane of virtualization services 1310 (as controller host); [0035] lines 3-11, resource management database 170 may store resource usage data, which may include the past task execution history for a client 110, resource utilization history, billing history, and overall resource usage trends for a given set of resource instances…the resource manager 150 may use past resource usage data and trends for a given set of resource instances…determining how and/or when to perform various auto-scaling actions; [0066] lines 1-7, a cluster metric…For example, an auto-scaling action (e.g., an action to add capacity) may be triggered if the storage-to-virtualized-computing-service throughput is greater than or equal to 100 for at least 120 minutes (as load information meeting the conditions); [0069] lines 1-2, automatic cluster scaling may be governed by one or more auto-scaling policies; lines 8-10, an auto-scaling policy may contain one or more rules, and each rule may contain some or all of the following elements; [0077] lines 1-5, the amount or percentage of capacity (e.g., the number or percentage of resource instances) to add to…For example the policy may specify the change in resource capacity as one of the following: [0078] lines 1-2, “5” (e.g., 5 resource instances should be added or removed (as increasing/deploying a number of new first application containers that perform the first role (throughput loading) by the first scaling factor, i.e., 5)); [0103] lines 4-7, For example, this expression may be included in an auto-scaling policy specifying that an auto-scaling action (e.g., adding 20 nodes (as increasing/deploying a new second application container) to the cluster) should be performed every Saturday night at midnight).
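For illustration only, the trigger-and-action behavior cited from EINKAUF above ([0066] and [0077]-[0078]: a metric at or above 100 for at least 120 minutes triggers adding 5 instances) can be sketched as follows. This is a minimal sketch under stated assumptions, not EINKAUF's implementation: the names ScalingRule, sustained and apply_rule, and the fixed sampling interval, are hypothetical and do not appear in the reference.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScalingRule:
    """Hypothetical rule: metric >= threshold for hold_minutes adds instances."""
    threshold: float     # metric value that must be met or exceeded
    hold_minutes: int    # how long the condition must persist
    add_instances: int   # capacity change when the rule fires


def sustained(samples: List[float], rule: ScalingRule,
              minutes_per_sample: int = 1) -> bool:
    """True if the most recent samples cover hold_minutes and all meet the threshold."""
    needed = -(-rule.hold_minutes // minutes_per_sample)  # ceiling division
    if len(samples) < needed:
        return False  # not enough history to cover the hold period
    return all(s >= rule.threshold for s in samples[-needed:])


def apply_rule(current_instances: int, samples: List[float],
               rule: ScalingRule, minutes_per_sample: int = 1) -> int:
    """Return the instance count after evaluating the rule once."""
    if sustained(samples, rule, minutes_per_sample):
        return current_instances + rule.add_instances
    return current_instances


# Assumed example values taken from the cited passages: 100, 120 minutes, 5 instances.
rule = ScalingRule(threshold=100.0, hold_minutes=120, add_instances=5)
```

With one sample per minute, apply_rule(10, [100.0] * 120, rule) yields 15, whereas a single sample below 100 within the last 120 minutes leaves the count unchanged.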

EINKAUF fails to specifically teach wherein the distributed processing environment comprises a plurality of worker hosts containing multiple application containers for respective virtual clusters including the first virtual cluster, and wherein the plurality of application containers in the first virtual cluster are distributed across multiple worker hosts of the plurality of worker hosts.

However, Bauman teaches wherein the distributed processing environment comprises a plurality of worker hosts containing multiple application containers for respective virtual clusters including the first virtual cluster, and wherein the plurality of application containers in the first virtual cluster are distributed across multiple worker hosts of the plurality of worker hosts (Bauman, Fig. 4, 400 Compute service provider (as distributed processing environment), 403 and 405 as virtual clusters, 402A-C server computers (as worker hosts), instances 406A-406C (as a plurality of worker hosts containing multiple application containers), 405 as first virtual cluster that spans two worker hosts (i.e., 402B-402C server computers) having a plurality of application containers (i.e., instances 406B-C); Col 18, lines 22-25, For example, instances (or VMIs) 406A within server computer 402A may be located within private cloud 403, and instances 406B-406C within server computers 402B-402C may be located within private cloud 405; Col 18, lines 63-66, scale-up rules for use in determining when new instances should be instantiated and scale-down rules for use in determining when existing instances should be terminated).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of EINKAUF with Bauman because Bauman’s teaching of providing a cloud/cluster across multiple server providers for creating an isolated region would have provided EINKAUF’s system with the advantage and capability to improve the overall system security and performance for the servers/hosts.

EINKAUF and Bauman fail to specifically teach wherein a second role of the plurality of roles is dependent upon a first role of the plurality of roles, and wherein a role-based autoscaling policy for the first role specifies a first condition that triggers scaling out of the first role by a first scaling factor and scaling out of the second role by a second scaling factor in tandem; and receiving load information for the first role, in response to the load information for the first role meeting the first condition, increasing a number of the first application containers that perform the first role by the first scaling factor, and increasing a number of second application containers that perform the second role by the second scaling factor.

However, Tang teaches wherein a second role of the plurality of roles is dependent upon a first role of the plurality of roles (Tang, Col 5, lines 31-46, Often when one resource type of an application stack, such as the application stack 102, needs to be scaled (e.g., up, down, in, out, etc.) due to some change in resource usage or demand, other resource types of the application stack may also need to be scaled as well (i.e., in tandem). Take, for example, an application stack comprising a group of virtual machine instances of a virtual computing system service and a group of database tables of a database service. In this example, the virtual machine instances operate as Web servers and cause data to be stored to the group of database tables. In this example, as the application stack receives more network traffic volume and data to be stored in the database tables, additional virtual machine instances may need to be instantiated to handle the network traffic volume (as first role), and the database tables may need to be increased in size in order to store the additional data (as second role that is dependent upon the first role, since the database tables need to be increased when the virtual machine instances are increased to handle more network traffic volume));
wherein a role-based autoscaling policy for the first role specifies a first condition that triggers scaling out of the first role by a first scaling factor and scaling out of the second role by a second scaling factor in tandem; and receiving load information for the first role and, in response to the load information for the first role meeting the first condition, increasing a number of the first application containers that perform the first role by the first scaling factor, and increasing a number of second application containers that perform the second role by the second scaling factor (Tang, Fig. 4, load metric; Fig. 7, 702 receive request to generate set of policies for application stack, 704 obtain historical metrics data for application stack, 706, 708, 710, 712 determine scaling type and amount; Col 5, lines 31-35, such as the application stack 102, needs to be scaled (e.g., up, down, in, out, etc.) due to some change in resource usage or demand, other resource types of the application stack may also need to be scaled as well (i.e., in tandem); lines 43-46, additional virtual machine instances may need to be instantiated to handle the network traffic volume (as first role), and the database tables may need to be increased in size in order to store the additional data (as second role); Col 19, lines 24-26, obtains historical data about the metrics and the resources of the application stack; Col 25, lines 4-46, a customer has set a target tracking value for CPU utilization at 50% and a current measurement of CPU utilization is 75%. In this example, 75% is 50% over the 50% target value. 
Thus, the scaling service using this method of proportional scaling may compute a new capacity for the scalable resource (e.g., virtual machine group) to add 50% more to the resource dimension (e.g., number of virtual machines in the virtual machine group); Col 26, lines 1-5, for resources that typically need to be scaled in tandem (e.g., a virtual machine group and a relational database), the scaling service can recommend policies that cause such resources to be scaled together; lines 28-34, a single scaling policy can have all the information needed to scale multiple resources. In one example, if CPU utilization exceeds 50% (as a role-based autoscaling policy that, for the first role, specifies a first condition (i.e., exceeds 50%)), the scaling policy directs the scaling service to increase compute (e.g., processor) capacity by 25% (as first scaling factor), increase relational database table read IOPS by 10%, and increase the size of the database table by 10% (as second scaling factor)).
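
For illustration only, the tandem scaling behavior mapped above may be sketched as follows. This is a minimal sketch and not code from Tang or from the application: the names TandemPolicy and scale_out are hypothetical, the 50%, 25% and 10% values are taken from the Tang example quoted above, and applying the percentages to instance counts (rather than to compute capacity and table size) is an assumption made to mirror the claim language.

```python
from dataclasses import dataclass


@dataclass
class TandemPolicy:
    """Hypothetical single policy that scales two roles together."""
    cpu_threshold_pct: int   # first condition: CPU utilization above this value
    first_factor_pct: int    # percentage increase applied to the first role
    second_factor_pct: int   # percentage increase applied to the second role


def _grow(count: int, percent: int) -> int:
    # Integer ceiling of count * percent / 100, so at least one unit is
    # added whenever the percentage yields a non-zero fraction.
    return count + -(-count * percent // 100)


def scale_out(policy: TandemPolicy, cpu_pct: int,
              first_count: int, second_count: int) -> tuple[int, int]:
    """Return new (first, second) counts; unchanged if the condition is not met."""
    if cpu_pct <= policy.cpu_threshold_pct:
        return first_count, second_count
    return (_grow(first_count, policy.first_factor_pct),
            _grow(second_count, policy.second_factor_pct))


policy = TandemPolicy(cpu_threshold_pct=50, first_factor_pct=25, second_factor_pct=10)
new_first, new_second = scale_out(policy, cpu_pct=75, first_count=8, second_count=10)
```

Under these assumed inputs, a single breach of the first condition changes both counts in the same evaluation, which is the sense in which the two roles are scaled "in tandem".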

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of EINKAUF and Bauman with Tang because Tang’s teaching of scaling the different resources for performing different roles (network traffic and data storing) in tandem would have provided EINKAUF and Bauman’s system with the advantage and capability to improve the efficiency of the computing services by synchronizing resources of different types to be scaled in tandem (see Tang, Col 3, lines 5-9, improve the efficiency of computing services).

EINKAUF, Bauman and Tang fail to specifically teach that the received load information is from container agents in the first application containers; that the new first application container includes a respective container agent to collect load information of the new first application container; and that the new second application container includes a respective container agent to collect load information of the new second application container.

However, Shen teaches that the received load information is from container agents in the first application containers; that the new first application container includes a respective container agent to collect load information of the new first application container; and that the new second application container includes a respective container agent to collect load information of the new second application container (Shen, Fig. 1, 69 group of resource instances, 67-1 to 67-P, 68-1 to 68-P AA (as respective container agent within application containers); [0049] lines 1-11, The autoscaling component 62 communicates with at least two different types of resources. For example, the autoscaling component 62 communicates with a resource allocator 66 that scales out or scales in a group 69 of data and/or computing resources by directly increasing or decreasing individual resource instances 67-1, 67-2, . . . , and 67-P (collectively resource instances 67). In some examples, each of the resource instances 67-1, 67-2, . . . , and 67-P includes an agent application (AA) 68-1, 68-2, . . . , and 68-P that generates and/or aggregates log and metric data having a common schema; [0059] lines 3-4, Agent applications 165 may be used to collect and send metrics and log data; [0083] lines 1-15, The agent applications 562 monitor predetermined log and metric parameters of the resource instance…the metric data for a virtual machine may include an operating load on the virtual machine (such as an average percentage of the full processor capacity during a predetermined period), a minimum percentage and a maximum percentage. 
In some examples, the agent applications 562 aggregate the log and/or metric data over one or more predetermined periods; also see [0038] lines 6-9, an autoscale component scales out by deploying one or more VMs to the tenant to increase capacity by a predetermined amount such as 10% or 20% [Examiner’s note: each VM (as application container) has a respective agent to collect and send the load information; therefore, when new VMs are deployed, the new VMs (as including the first and second application containers) also include the respective agent, since the load information needs to be collected]).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of EINKAUF, Bauman and Tang with Shen because Shen’s teaching of providing a respective agent application for collecting the load information for the resource instances would have provided EINKAUF, Bauman and Tang’s system with the advantage and capability to easily determine the resource utilization of each resource instance, allowing the system to predict future load and determine the resources needed for the different applications, which improves system performance and efficiency.

EINKAUF, Bauman, Tang and Shen fail to specifically teach that deploying the new first application container and the new second application container is based on selecting, based on loads of the plurality of worker hosts, at least one worker host of the plurality of worker hosts on which to deploy.

However, McDowell teaches that deploying the new first application container and the new second application container is based on selecting, based on loads of the plurality of worker hosts, at least one worker host of the plurality of worker hosts on which to deploy (McDowell, Col 8, lines 10-19, The determination of the physical host for the virtual computer system instance may be based on a variety of factors, including a particular geographic area based at least in part on an Internet Protocol (IP) address associated with the customer, load on one or more physical hosts, network traffic associated with the one or more physical hosts, request response latency of the one or more physical hosts or any other information suitable for selecting a physical hosts to instantiate one or more computer instances).
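
For illustration only, load-based host selection of the kind mapped above may be sketched as follows. This is a minimal sketch under stated assumptions, not McDowell's implementation: the function name select_worker_host and the host names are hypothetical, load is assumed to be a single number per host, and "least loaded" is assumed as the selection rule (McDowell lists load as one of several possible factors).

```python
def select_worker_host(host_loads: dict[str, float]) -> str:
    """Return the name of the host with the lowest reported load.

    host_loads maps a (hypothetical) host name to a load value such as
    CPU utilization between 0.0 and 1.0. Ties resolve to the host that
    appears first in the mapping.
    """
    if not host_loads:
        raise ValueError("no worker hosts available")
    return min(host_loads, key=host_loads.get)


# Assumed example loads for three worker hosts.
loads = {"host-a": 0.80, "host-b": 0.30, "host-c": 0.55}
target = select_worker_host(loads)
```

Here the new containers would be placed on the host reporting the lowest load; other factors named in McDowell (geographic area, network traffic, response latency) are deliberately left out of this sketch.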

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of EINKAUF, Bauman, Tang and Shen with McDowell because McDowell’s teaching of selecting a physical host for deploying the computer instances based on the load information would have provided EINKAUF, Bauman, Tang and Shen’s system with the advantage and capability to efficiently utilize the resources among the different physical hosts, which improves system resource utilization and efficiency.

As per claim 2, EINKAUF, Bauman, Tang, Shen and McDowell teach the invention according to claim 1 above. EINKAUF further teaches wherein the role-based autoscaling policy also specifies a second condition that triggers scaling in of the first role by a third scaling factor and scaling in of the second role by a fourth scaling factor (EINKAUF, [0064] lines 2-3, automatically scale up or down when triggered by one or more of the following; [0065] lines 1-6, a metric captured by a monitoring service crossing a specified threshold for a specified time period—For example, an auto-scaling action (e.g., an action to reduce capacity) may be triggered if the number of mappers in the cluster is less than 2 for at least 60 minutes (as second condition); [0122] lines 4-9, auto-scaling policies may place a value on each node (as include third scaling factor and fourth scaling factor) (e.g., relative to its eligibility or suitability for removal) and the policies may rely on the value of the when making decisions about which instances to remove (e.g., avoiding data loss on nodes that carry data). [0077] lines 4-5, the policy may specify the change in resource capacity as one of the following; [0079] lines 1-2, “20%” (e.g., the change should represent 20% of the current resource instances) (as scaling in of the first role by a third scaling factor); [0103] lines 4-8, this expression may be included in an auto-scaling policy specifying that an auto-scaling action (e.g., adding 20 nodes to the cluster) should be performed every Saturday night at midnight. In this example, a complementary auto-scaling policy may specify that the cluster should be reduced at 04:00 every Monday morning (as scaling in of the second role by fourth scaling factor; see scaling factor indicated in [0078] lines 1-2, “5” (e.g., 5 resource instances should be added or removed))); 
the method further comprises in response to the load information for the first role meeting the second condition, decreasing the number of the first application containers in the first virtual cluster that perform the first role by the third scaling factor and decreasing the number of the second application containers in the first virtual cluster that perform the second role by the fourth scaling factor (EINKAUF, Fig. 9, 940, 950, yes, to 960 and 970 the resource manager for the cluster initiates the auto-scaling action; [0077] lines 4-5, the policy may specify the change in resource capacity as one of the following; [0079] lines 1-2, “20%” (e.g., the change should represent 20% of the current resource instances) (as scaling in of the first role by a third scaling factor); [0103] lines 4-8, this expression may be included in an auto-scaling policy specifying that an auto-scaling action (e.g., adding 20 nodes to the cluster) should be performed every Saturday night at midnight. In this example, a complementary auto-scaling policy may specify that the cluster should be reduced at 04:00 every Monday morning (as scaling in of the second role by fourth scaling factor; see scaling factor indicated in [0078] lines 1-2, “5” (e.g., 5 resource instances should be added or removed))).
	In addition, Tang teaches that the role-based autoscaling policy also specifies a second condition that triggers scaling in of the first role by a third scaling factor and scaling in of the second role by a fourth scaling factor in tandem (Tang, Col 5, lines 31-46, Often when one resource type of an application stack, such as the application stack 102, needs to be scaled (e.g., up, down, in, out, etc.) due to some change in resource usage or demand, other resource types of the application stack may also need to be scaled as well (i.e., in tandem). Take, for example, an application stack comprising a group of virtual machine instances of a virtual computing system service and a group of database tables of a database service. In this example, the virtual machine instances operate as Web servers and cause data to be stored to the group of database tables. Col 31, lines 2-5, identification of correlations between different resource types. Such correlations may identify potential resource types that should be scaled in tandem; Col 26, lines 38-43, if read throughput of the relational database exceeds 30% of a target throughput, a third scaling policy may direct the scaling service to reduce the relational database table read IOPS by 40%, and decrease compute capacity by 10%).

As per claim 3, EINKAUF, Bauman, Tang, Shen and McDowell teach the invention according to claim 1 above. EINKAUF further teaches fetching, by the controller host, the set of role-based autoscaling policies from a database (EINKAUF, Fig. 1, 170 resource management database; [0034] lines 18-21, information representing the user-defined policies (and/or any default auto-scaling policies supported by the service) and associations between the policies and MapReduce cluster 120 (or specific instance groups thereof) may be stored in resource management database 170; [0115] lines 23-27, the monitoring service may fetch the policy and the metrics on which it depends, and make them available to the auto-scaling rules engine…and initiate any actions that are called for by the policy; [0165] lines 1-4, although no monitoring components or auto-scaling rules engines are shown in FIG. 13, such components may be implemented within the control plane of virtualization services (as controller host)), wherein the role-based autoscaling policy defines an evaluation period at which the load information is to be evaluated (EINKAUF, [0066] lines 1-7, a cluster metric (e.g., one that is published by the cluster but is not available in the monitoring service) crossing a specified threshold for a specified time period (as evaluation period)—For example, an auto-scaling action (e.g., an action to add capacity) may be triggered if the storage-to-virtualized-computing-service throughput is greater than or equal to 100 for at least 120 minutes; also see [0071] lines 1-2, “numberOfMappers <2 for at least 60 minutes”; [0072] lines 1-3, OR(“numberOfMappers <2 for at least 60 minutes”,“numberOfMappers <5 for at least 120 minutes”); and 
based on the evaluation period of the role-based autoscaling policy, retrieving, by the controller host, the load information that has been previously gathered from each first application container of the first virtual cluster and stored in storage (EINKAUF, [0035] lines 3-16, store resource usage data, which may include the past task execution history for a client 110, resource utilization history (as previously gathered), billing history, and overall resource usage trends for a given set of resource instances that may be usable for the client's tasks. In some cases, the resource manager 150 may use past resource usage data (as retrieving) and trends for a given set of resource instances to develop projections of future resource usage and may use these projections in developing execution plans or in determining how and/or when to perform various auto-scaling actions); also see [0128] lines 5-18, a monitoring service to monitor the behavior of one or more clusters of computing resource instances. The method may include the monitoring service receiving metrics from a cluster of computing resource instances on which a distributed application is executing, as in 920. For example, the monitoring service may receive metrics from one or more computing resource instances within the cluster (some of which may belong to different instance groups). monitoring service aggregating at least some of the received metrics and making them available to an auto-scaling rules engine (e.g., by passing them to the auto-scaling rules engine or by storing them in a memory that is accessible to the auto-scaling rules engine).
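
For illustration only, evaluating a condition of the form quoted from EINKAUF above ("numberOfMappers <2 for at least 60 minutes") against previously stored load samples may be sketched as follows. This is a minimal sketch under stated assumptions, not EINKAUF's rules engine: the function name condition_held is hypothetical, timestamps are assumed to be in minutes, and the condition is assumed to hold only when stored history reaches back to the start of the evaluation period.

```python
def condition_held(samples: list[tuple[int, float]], threshold: float,
                   period_minutes: int, now: int) -> bool:
    """Return True if every stored sample in the evaluation period is below threshold.

    samples is a list of (timestamp_in_minutes, metric_value) pairs that were
    gathered earlier and kept in storage.
    """
    start = now - period_minutes
    in_period = [value for ts, value in samples if start <= ts <= now]
    # Require history at or before the period start, so a short run of
    # recent samples cannot satisfy an "at least N minutes" condition.
    has_history = any(ts <= start for ts, _ in samples)
    return has_history and bool(in_period) and all(v < threshold for v in in_period)


# Assumed stored samples: (minute, numberOfMappers).
stored = [(0, 1), (30, 1), (60, 1)]
should_scale_in = condition_held(stored, threshold=2, period_minutes=60, now=60)
```

With these assumed samples the metric stays below 2 across the full 60-minute period, so the scale-in condition would be treated as met.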

As per claim 4, EINKAUF, Bauman, Tang, Shen and McDowell teach the invention according to claim 1 above. EINKAUF teaches wherein the virtual clusters are to execute stateful applications (EINKAUF, Fig. 1, 120 MapReduce cluster (as first virtual cluster); Fig. 2, 220, begin executing a distributed application on the cluster of nodes; Fig. 6, 610, a service receives a request to create a cluster of virtualized computing resource instances on which to execute a given application (as stateful application) (or computation thereof) on behalf of a service customer or subscriber; [0025] lines 4-13, the systems may avoid removing a node during an operation to reduce capacity if the node stores important state information (as stateful application, since the state is stored) (e.g., if it stores data and it cannot be gracefully decommissioned)…In other words, unlike in existing auto-scaling solutions, the systems described herein may apply intelligence in scaling operations due to the unique behaviors of at least some of the nodes; also see Claim 15, lines 1-6, wherein each one of the two or more instance groups comprises computing resource instances of a respective different type or computing resource instances having a respective different role in the execution of the distributed application on the cluster). In addition, Bauman teaches different ones of the virtual clusters are to execute respective different stateful applications (Bauman, Fig. 4, 400, 403 and 405 as virtual clusters, 402A-C server computers (as worker hosts), instance 406A-406C, 405 as first virtual cluster; Col 18, lines 22-25, For example, instances (or VMIs) 406A within server computer 402A may be located within private cloud 403, and instances 406B-406C within server computers 402B-402C may be located within private cloud 405; Col 18, lines 63-66, scale-up rules for use in determining when new instances should be instantiated and scale-down rules for use in determining when existing instances should be terminated; Col 20, lines 41-44, private cloud may perform the policy based data aggregation functionalities described herein (e.g., as described in reference to aggregator service 140 in FIGS. 1-3B (as each private cloud (i.e., virtual cluster) performing a respective function/application))).

As per claim 5, EINKAUF, Bauman, Tang, Shen and McDowell teach the invention according to claim 1 above. EINKAUF further teaches wherein the load information comprises a measure of central processing unit (CPU) usage by the first application containers, a measure of memory usage by the first application containers, a measure of disk usage by the first application containers, or a measure of network usage by the first application containers (EINKAUF, [0028] lines 16-18, monitoring process observed that there was no CPU utilization for a certain period of time; [0128] lines 5-18, a monitoring service to monitor the behavior of one or more clusters of computing resource instances. The method may include the monitoring service receiving metrics from a cluster of computing resource instances on which a distributed application is executing, as in 920. For example, the monitoring service may receive metrics from one or more computing resource instances within the cluster (some of which may belong to different instance groups); also see [0149] lines 12-24, compute intensive applications (e.g., high-traffic web applications…memory intensive workloads…and storage optimized workloads (e.g., data warehousing and cluster file systems). Size of compute instances, such as a particular number of virtual CPU cores, memory, cache, storage).

As per claim 7, EINKAUF, Bauman, Tang, Shen and McDowell teach the invention according to claim 1 above. McDowell further teaches wherein the at least one worker host previously did not include any of the first application containers of the first virtual cluster (McDowell, Col 7, lines 62-66, The user interface may allow the user to search for and select a particular product and initiate instantiation of that product. When instantiating a product, server computer system 200 can, using the product's template, automatically create and configure a virtual computer system instance 220 with the selected product for use by the user; Col 8, lines 10-19, The determination of the physical host for the virtual computer system instance may be based on a variety of factors, including a particular geographic area based at least in part on an Internet Protocol (IP) address associated with the customer, load on one or more physical hosts, network traffic associated with the one or more physical hosts, request response latency of the one or more physical hosts or any other information suitable for selecting a physical hosts to instantiate one or more computer instances [Examiner noted: previously the worker host did not include any of the first application containers for the product, therefore, selecting the worker hosts for deploying the first application containers (i.e., one or more computer instances)]).

As per claim 8, EINKAUF, Bauman, Tang, Shen and McDowell teach the invention according to claim 1 above. Tang further teaches wherein the load information meeting the first condition comprises the load information meeting any one or more of a set of conditions that include the first condition (Tang, Col 26, lines 6-27,  Tandem scaling may involve multiple policies (e.g., one for each resource to be scaled) that, as a result of a breach of a threshold, multiple scaling policies are invoked (as one or more of a set of conditions including first condition). Note that the policies need not be invoked by a breach of the same threshold, for example, if the customer specifies to maintain CPU utilization of a virtual machine group at 50% and the scaling service determines that, when the virtual machine group is at 50% CPU utilization, a relational database associated with the virtual machine group experiences a consumption rate of 45%. In this example, the scaling service recommends two scaling policies: one scaling policy that scales the virtual machine group if CPU utilization exceeds 50% and another scaling policy that scales the relational database group if consumption exceeds 45%. Given the determined relationship between the two usage rates, both scaling policies are likely to be invoked at approximately the same time. Having different scaling policies that trigger on different usage thresholds for different resources allows flexibility, for example, in cases where for some reason one resource has a very high load but does not impact the other resource; in this manner, the resources would scale independently).

As per claims 9-13 and 15-16, they are non-transitory machine-readable medium claims corresponding to claims 1-5 and 7-8, respectively, above. Therefore, they are rejected for the same reasons as claims 1-5 and 7-8, respectively, above.

As per claim 17, it is a system claim of claim 1 above. Therefore, it is rejected for the same reason as claim 1 above. In addition, EINKAUF further teaches a processor; and a non-transitory computer-readable medium storing instructions executable on the processor to (EINKAUF, Fig. 17, 1710a, processor; 1720 system memory, 1725 code; also see claim 19, A non-transitory computer-accessible storage medium storing program instructions that when executed on one or more computers cause the one or more computers to implement a distributed computing service).

As per claims 18-19 and 21, they are system claims corresponding to claims 2-3 and 5, respectively, above. Therefore, they are rejected for the same reasons as claims 2-3 and 5, respectively, above.


Claims 6, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over EINKAUF, Bauman, Tang, Shen and McDowell, as applied to claims 1, 9 and 17 respectively above, and further in view of Tang et al. (US Patent No. 9,692,707 B2 (hereafter Tang’707’)).

As per claim 6, EINKAUF, Bauman, Tang, Shen and McDowell teach the invention according to claim 1 above. EINKAUF teaches scaling out by deploying the new first application container and the new second application container (EINKAUF, [0077] lines 1-5, the amount or percentage of capacity (e.g., the number or percentage of resource instances) to add to…For example the policy may specify the change in resource capacity as one of the following: [0078] lines 1-2, “5” (e.g., 5 resource instances should be added or removed) (as increasing/deploying a number of new first application containers that perform the first role (throughput loading) by the first scaling factor, i.e., 5); [0103] lines 4-7, For example, this expression may be included in an auto-scaling policy specifying that an auto-scaling action (e.g., adding 20 nodes (as increasing/deploying new second application container) to the cluster) should be performed every Saturday night at midnight).
In addition, McDowell teaches deploy the new first application container and the new second application container on the [selected] at least one worker host (McDowell, Col 8, lines 10-19, The determination of the physical host for the virtual computer system instance may be based on a variety of factors, including a particular geographic area based at least in part on an Internet Protocol (IP) address associated with the customer, load on one or more physical hosts, network traffic associated with the one or more physical hosts, request response latency of the one or more physical hosts or any other information suitable for selecting a physical hosts to instantiate one or more computer instances).

EINKAUF, Bauman, Tang, Shen and McDowell fail to specifically teach that the deploying comprises sending, by the controller host, a request to scale out to a worker agent in the selected at least one worker host, the request to scale out to cause the worker agent to deploy.

However, Tang’707’ teaches that the deploying comprises sending, by the controller host, a request to scale out to a worker agent in the selected at least one worker host, the request to scale out to cause the worker agent to deploy (Tang ‘707’, Fig. 2, 204 resource engine (as controller host); Claim 7, Resource Engine performs the following steps…sending a request event to an agent running on said Compute Resource, Network Resource, or Storage Resource via the ICM (Infrastructure Communication Manager); executing, by said agent, one of the specific deployment operations, including at least two deployment operations of creating a new virtual machine instance).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of EINKAUF, Bauman, Tang, Shen and McDowell with Tang’707’ because Tang’707’’s teaching of using a resource engine (as controller host) to send the request and allow the agent within the compute resource (as worker host) to deploy the virtual machine instances would have provided EINKAUF, Bauman, Tang, Shen and McDowell’s system with the advantage and capability to easily manage virtual machine instance deployment, which improves system performance and efficiency.

As per claims 14 and 20, they are non-transitory machine-readable medium and system claims, respectively, corresponding to claim 6 above. Therefore, they are rejected for the same reason as claim 6 above.


Response to Arguments
The Amendment filed on 06/30/2022 has been entered. Applicant’s amendment has overcome the previous rejections under 35 U.S.C. § 112(b). The rejection under 35 U.S.C. § 112(b) has been withdrawn.

Applicant’s arguments filed on 06/30/2022 with respect to claims 1-21 have been fully considered, but they are not persuasive and are deemed moot in view of the new grounds of rejection necessitated by Applicant’s amendment.


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZUJIA XU whose telephone number is (571)272-0954. The examiner can normally be reached M-F 9:00-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai An can be reached on (571) 272-3756. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Z.X./Examiner, Art Unit 2195
/MICHAEL W AYERS/Primary Examiner, Art Unit 2195