DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the communication received on 05/10/2021. The applicant has submitted 20 claims for examination; all claims are currently pending.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6-13, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Ou US 2021/0263780, and further in view of Benjamin US 2021/0311655.
Regarding claim 1, Ou teaches a method comprising: maintaining, by a service management system, a plurality of services that are distributed across a plurality of clusters, wherein each service of the plurality of services serves a functionality in a data storage system; 
[" Embodiments described herein are generally directed to a role-based autoscaling approach for providing fine-grained control of scaling of nodes of a stateful application in a large scale virtual data processing (LSVDP) environment. In the following description, numerous specific details are set forth in order to provide a thorough understanding of example embodiments. It will be apparent, however, to one skilled in the art that embodiments described herein may be practiced without some of these specific details.", ¶9]

["There are two general types of applications: stateless and stateful. A stateless application (e.g., a web server) does not store data generated from one session for use in a subsequent session. As such, there is no dependency on the local container storage for a stateless workload. In contrast, stateful applications (e.g., artificial intelligence (AI) applications and applications relating to storing and processing big data, including data science, analytics, machine learning (ML), and deep learning (DL)) are services that rely on backing storage, and maintaining state is expected as part of running the service. Apache Hadoop and Apache Spark are non-limiting examples of software frameworks for storing data and running applications on clusters of hosts that are intended to provide massive storage of data and enormous processing power to support concurrent tasks or jobs by distributing data and calculations across different hosts so multiple tasks can be accomplished simultaneously.", ¶10]

receiving a request to scale a service of the plurality of services (a user interface is used to request deployment/scaling, and the autoscaling policy is defined by the user, ¶26);
[" In the context of the present example, the controller host 110 includes a user interface 111, a management module 112, a management database 113, a policy engine 114, a load monitor 115, and a load database 116. The user interface 111 may provide an interface to the user to facilitate creating of a virtual cluster (e.g., virtual clusters 123a-n) on a worker host (e.g., worker host 12a-n) and facilitate configuration of one or more role-based autoscaling policies for each role of an application that will be deployed within a virtual cluster (e.g., virtual cluster 123). Alternatively or additionally, the role-based autoscaling policies may be provided in the form of object notation files.", ¶26]

["In the present example, the application containers 121a-m and 121n-x may cooperate with each other to implement the function/service of the application associated with their respective virtual clusters For example, application container 121", ¶30]
accessing dependency data representing dependencies among the plurality of services (scaling of services is performed based on dependencies among the roles performed by the services, and scaling is performed responsive to such dependencies, ¶¶11, 22, 23);
["For stateful applications that are deployed in distributed computing environments, such as an LSVDP environment (e.g., Hadoop or Spark), each host of the cluster may cooperatively work with the others to implement the function of the application. Each host of the cluster may include multiple nodes (e.g., application containers), each having one role (which may include multiple related services), operating within a virtual cluster. As a result of different tasks being performed by the different roles, the bottleneck for each role may be different. For example, one role might be central processing unit (CPU) intensive, and another role might be memory or Input/Output (I/O) intensive. Additionally, there may be dependencies among the various roles. For example, for each two nodes performing a first role (e.g., data analysis), it may be desirable to have one node performing a second role (e.g., reporting). This creates difficulties for existing autoscaling approaches, which typically perform scaling in or scaling out of nodes independently. Some vendors have attempted to address these issues with application-specific autoscaling approaches bound to particular applications; however, such application-specific autoscaling approaches require in-depth knowledge of the application logic; and furthermore due to their tight coupling with the application logic cannot be used for other applications.", ¶11]
["In one embodiment, several related services may be represented by one container, and one role is defined for each type of container. For purposes of illustration, consider a simplified Hadoop cluster that is used mainly for a map/reduce job. In this example, three types of roles may be defined, including a “controller” role, a “worker” role, and a “manager” role. In this example, the controller role may include a HDFS namenode and a Yet Another Resource Negotiator (YARN) resource manager to manage the distributed resources; the worker role may include an HDFS datanode, and a YARN node manager to store the data and execute the map/reduce tasks; and the manager role may include an Ambari or Cloudera Manager to manage the whole virtual cluster. In this non-limiting example, if the Hadoop cluster also needs to handle database workload, then it may also include a “database” role that includes an Hbase HRegionServer etc.", ¶22]

["The previous example is intended to illustrate that a “role” can be thought of as a tag for a set of grouped services. Even for the same application, the customer/user can name different roles as they want to the same service group. As those skilled in the art will appreciate, the services that may be grouped together as one role is application dependent. In embodiments described herein, the autoscaling policy operates at a higher-level in relation to containers and the roles that they represent. As such, the autoscaling policy need not know which or how many services are included in that role and is therefore decoupled from such application dependencies.", ¶23]

determining, based on the dependency data, a set of services of the plurality of services to scale based on a scaling of the service, the set of services including the service (scaling of the services performing a particular role and of any identified dependent services, ¶34).
[" In the context of the present example, the policy engine 114 is responsible for retrieving the virtual cluster configuration/deployment information and the corresponding role-based autoscaling policies from the management database 113. Based on the role-based autoscaling policies, the policy engine 114 may also collect load information from the load monitor 115 for each role of the virtual cluster. In one embodiment, the role-based autoscaling policies provide information indicative of how frequently the load information for the corresponding role is to be evaluated. The policy engine 114 may also be responsible for evaluating the load information against the role-based autoscaling policies to determine whether the load associated with a particular roles meets the corresponding set of conditions defined in a role-based autoscaling policy for the particular role; and if so, informing the management module 112 to scale up/scale down the containers that belong to the particular role as well as containers that belong to any dependent roles.", ¶34]
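For illustration only, the determination described in the ¶34 quotation (the triggered role plus any dependent roles) can be sketched as follows. This is a minimal sketch under stated assumptions: the mapping `dependent_roles`, the role names, and the function `roles_to_scale` are hypothetical and do not appear in Ou, and only direct dependents are collected.

```python
# Hypothetical sketch; names and data are illustrative, not taken from Ou.
def roles_to_scale(triggered_role, dependent_roles):
    """Return the triggered role followed by its directly dependent roles."""
    result = [triggered_role]
    for role in dependent_roles.get(triggered_role, []):
        if role not in result:
            result.append(role)
    return result

# Assumed example loosely following the ¶11 illustration (analysis -> reporting).
dependent_roles = {"data_analysis": ["reporting"]}
selected = roles_to_scale("data_analysis", dependent_roles)
```

Here `selected` holds the role whose policy was triggered together with its dependent role, which corresponds to the "set of services including the service" in the claim mapping.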

Ou teaches scaling up and down based on dependencies between services but does not teach determining, by the service management system, a scaling sequence in which the set of services are to be scaled based on the dependency data; and scaling the set of services based on the scaling sequence. Benjamin, in the same field of endeavor as the invention, teaches a system for managing deployed services in a network, which includes scaling of services. Benjamin teaches determining, by the service management system, a scaling sequence in which the set of services are to be scaled based on the dependency data and scaling the set of services based on the scaling sequence (hierarchical dependencies of services are determined, and services are scaled in a bottom-up direction to prevent overload, ¶¶40, 42).
["In some embodiments, the performance controller 270 performs a scaling operation to one of the software entities SE-1, SE-2, . . . , SE-N that is located at bottom of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N. The performance of a software entity that is dependent from another software entity can be affected by the performance of that other software entity. For example, the performance of a software entity that is located at top of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N can be affected by the performance of a software entity that is located at bottom of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N. In an embodiment, the performance controller 270 identifies a first software entity and a second software entity of the software entities SE-1, SE-2, . . . , SE-N as unhealthy software entities, where the first software entity is dependent upon the second software entity (e.g., the output of the second software entity being an input of the first software entity). In this example, the performance controller 270 only performs the scaling operation to the second software entity, not to the first software entity. In some embodiments, the performance controller 270 determines whether one of the software entities SE-1, SE-2, . . . , SE-N executing in the cloud computing environment is in a scaling grace period and exempts the software entity that is in the scaling grace period from the scaling operation. Using a scaling grace period for a specific software entity enables fine-grained control (e.g., per service control) over scaling policies.", ¶40]

[" In the embodiment depicted in FIG. 3, the software entities SE-1, SE-2, . . . , SE-6 have a specific dependency hierarchy or topology. In particular, the software entity SE-1 is dependent from the software entity SE-4 (e.g., the output of the software entity SE-4 being an input of the software entity SE-1) and the software entity SE-4 is dependent from the software entity SE-5 (e.g., the output of the software entity SE-5 being an input of the software entity SE-4). In addition, in the embodiment depicted in FIG. 3, no software entity is dependent from the software entity SE-2 and the software entity SE-3 is dependent from the software entity SE-6 (e.g., the output of the software entity SE-6 being an input of the software entity SE-3). However, the dependency hierarchy of the software entities SE-1, SE-2, . . . , SE-6 included in the application 312 is not limited to the example illustrated in FIG. 3. The performance of a software entity that is dependent from another software entity can be affected by the performance of that other software entity. For example, the performance of the software entity SE-1 can be affected by the performance of the software entity SE-4 and the performance of the software entity SE-5, the performance of the software entity SE-4 can be affected by the performance of the software entity SE-5, and the performance of the software entity SE-3 can be affected by the performance of the software entity SE-6. In some embodiments, to control a performance metric (e.g., the response time) of the application 312, the performance controller 370 adjusts the operation of a software entity at the bottom of dependency hierarchy. For example, instead of adjusting the operation of the software entity SE-1, the performance controller 370 adjusts the operation of the software entity SE-5 (e.g., scales up or down the software entity SE-5). 
In another example, instead of adjusting the operation of the software entity SE-3, the performance controller 370 adjusts the operation of the software entity SE-6 (e.g., scales up or down the software entity SE-6).", ¶42]

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the instant application to modify Ou to scale up services in a hierarchical (bottom-up) sequence, as taught by Benjamin. The reason for this modification would be to scale services in a way that prevents overload of downstream services.
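As an illustration of a bottom-up scaling sequence derived from dependency data, the following minimal sketch orders the entities of the FIG. 3 hierarchy quoted from Benjamin ¶42 so that each entity follows everything it depends on. It is a sketch of the general technique (a topological ordering), not the implementation of either reference; the dictionary `depends_on` and the function `scaling_sequence` are hypothetical names.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency data: each key depends on the entities in its value set.
# The edges mirror the FIG. 3 hierarchy quoted from Benjamin, paragraph 42.
depends_on = {
    "SE-1": {"SE-4"},
    "SE-4": {"SE-5"},
    "SE-3": {"SE-6"},
    "SE-2": set(),
}

def scaling_sequence(depends_on):
    """Order entities so each one follows everything it depends on (bottom-up)."""
    return list(TopologicalSorter(depends_on).static_order())

sequence = scaling_sequence(depends_on)
```

In the resulting sequence, SE-5 precedes SE-4, which precedes SE-1, and SE-6 precedes SE-3. The relative order of unrelated entities is not fixed by the dependency data.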
Regarding claim 10, Ou teaches a non-transitory computer-readable storage medium storing executable computer instructions that, when executed by one or more processors, cause the one or more processors to perform operations, the operations comprising: maintaining, by a service management system, a plurality of services that are distributed across a plurality of clusters, wherein each service of the plurality of services serves a functionality in a data storage system; 
[" Embodiments described herein are generally directed to a role-based autoscaling approach for providing fine-grained control of scaling of nodes of a stateful application in a large scale virtual data processing (LSVDP) environment. In the following description, numerous specific details are set forth in order to provide a thorough understanding of example embodiments. It will be apparent, however, to one skilled in the art that embodiments described herein may be practiced without some of these specific details.", ¶9]

["There are two general types of applications: stateless and stateful. A stateless application (e.g., a web server) does not store data generated from one session for use in a subsequent session. As such, there is no dependency on the local container storage for a stateless workload. In contrast, stateful applications (e.g., artificial intelligence (AI) applications and applications relating to storing and processing big data, including data science, analytics, machine learning (ML), and deep learning (DL)) are services that rely on backing storage, and maintaining state is expected as part of running the service. Apache Hadoop and Apache Spark are non-limiting examples of software frameworks for storing data and running applications on clusters of hosts that are intended to provide massive storage of data and enormous processing power to support concurrent tasks or jobs by distributing data and calculations across different hosts so multiple tasks can be accomplished simultaneously.", ¶10]

receiving a request to scale up a first service of the plurality of services (a user interface is used to request deployment/scaling, and the autoscaling policy is defined by the user, ¶26);
[" In the context of the present example, the controller host 110 includes a user interface 111, a management module 112, a management database 113, a policy engine 114, a load monitor 115, and a load database 116. The user interface 111 may provide an interface to the user to facilitate creating of a virtual cluster (e.g., virtual clusters 123a-n) on a worker host (e.g., worker host 12a-n) and facilitate configuration of one or more role-based autoscaling policies for each role of an application that will be deployed within a virtual cluster (e.g., virtual cluster 123). Alternatively or additionally, the role-based autoscaling policies may be provided in the form of object notation files.", ¶26]

["In the present example, the application containers 121a-m and 121n-x may cooperate with each other to implement the function/service of the application associated with their respective virtual clusters For example, application container 121", ¶30]

accessing dependency data representing dependencies among the plurality of services (scaling of services is performed based on dependencies among the roles performed by the services, and scaling is performed responsive to such dependencies, ¶¶11, 22, 23);
["For stateful applications that are deployed in distributed computing environments, such as an LSVDP environment (e.g., Hadoop or Spark), each host of the cluster may cooperatively work with the others to implement the function of the application. Each host of the cluster may include multiple nodes (e.g., application containers), each having one role (which may include multiple related services), operating within a virtual cluster. As a result of different tasks being performed by the different roles, the bottleneck for each role may be different. For example, one role might be central processing unit (CPU) intensive, and another role might be memory or Input/Output (I/O) intensive. Additionally, there may be dependencies among the various roles. For example, for each two nodes performing a first role (e.g., data analysis), it may be desirable to have one node performing a second role (e.g., reporting). This creates difficulties for existing autoscaling approaches, which typically perform scaling in or scaling out of nodes independently. Some vendors have attempted to address these issues with application-specific autoscaling approaches bound to particular applications; however, such application-specific autoscaling approaches require in-depth knowledge of the application logic; and furthermore due to their tight coupling with the application logic cannot be used for other applications.", ¶11]
["In one embodiment, several related services may be represented by one container, and one role is defined for each type of container. For purposes of illustration, consider a simplified Hadoop cluster that is used mainly for a map/reduce job. In this example, three types of roles may be defined, including a “controller” role, a “worker” role, and a “manager” role. In this example, the controller role may include a HDFS namenode and a Yet Another Resource Negotiator (YARN) resource manager to manage the distributed resources; the worker role may include an HDFS datanode, and a YARN node manager to store the data and execute the map/reduce tasks; and the manager role may include an Ambari or Cloudera Manager to manage the whole virtual cluster. In this non-limiting example, if the Hadoop cluster also needs to handle database workload, then it may also include a “database” role that includes an Hbase HRegionServer etc.", ¶22]

["The previous example is intended to illustrate that a “role” can be thought of as a tag for a set of grouped services. Even for the same application, the customer/user can name different roles as they want to the same service group. As those skilled in the art will appreciate, the services that may be grouped together as one role is application dependent. In embodiments described herein, the autoscaling policy operates at a higher-level in relation to containers and the roles that they represent. As such, the autoscaling policy need not know which or how many services are included in that role and is therefore decoupled from such application dependencies.", ¶23] 

determining, based on the dependency data, a set of services of the plurality of services to scale up based on a scaling of the service, the set of services including a second service and the first service (scaling of the services performing a particular role and of any identified dependent services, ¶34).
[" In the context of the present example, the policy engine 114 is responsible for retrieving the virtual cluster configuration/deployment information and the corresponding role-based autoscaling policies from the management database 113. Based on the role-based autoscaling policies, the policy engine 114 may also collect load information from the load monitor 115 for each role of the virtual cluster. In one embodiment, the role-based autoscaling policies provide information indicative of how frequently the load information for the corresponding role is to be evaluated. The policy engine 114 may also be responsible for evaluating the load information against the role-based autoscaling policies to determine whether the load associated with a particular roles meets the corresponding set of conditions defined in a role-based autoscaling policy for the particular role; and if so, informing the management module 112 to scale up/scale down the containers that belong to the particular role as well as containers that belong to any dependent roles.", ¶34]
 
Ou does not teach the second service being a bottommost service in the set of services, wherein other services in the set of services depend on the second service; determining, by the service management system, a scaling sequence in which the set of services are to be scaled based on the dependency data; and scaling up the set of services according to the scaling sequence, wherein the scaling sequence indicates to scale the second service before scaling another service of the set of services.
Benjamin, in the same field of endeavor as the invention, teaches a system for managing deployed services in a network, which includes scaling of services. Benjamin teaches the second service being a bottommost service in the set of services, wherein other services in the set of services depend on the second service (a software entity located at the bottom of the dependency hierarchy, ¶40);
["In some embodiments, the performance controller 270 performs a scaling operation to one of the software entities SE-1, SE-2, . . . , SE-N that is located at bottom of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N. The performance of a software entity that is dependent from another software entity can be affected by the performance of that other software entity. For example, the performance of a software entity that is located at top of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N can be affected by the performance of a software entity that is located at bottom of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N. ", ¶40]
determining, by the service management system, a scaling sequence in which the set of services are to be scaled based on the dependency data (scaling according to a bottom-up sequence, ¶45);
["The performance control process depicted in FIG. 5 uses a bottom-up graph traversal approach in which downstream services are protected from being overloaded due to upstream scaling because the performance control process is aware of the dependencies between the software entities. In addition, the performance control process depicted in FIG. 5 can improve or even optimize SLO enforcement (e.g. performance, availability, etc.) as well as efficiency (e.g., resource usage). ", ¶45]
and scaling up the set of services according to the scaling sequence, wherein the scaling sequence indicates to scale the second service before scaling another service of the set of services (hierarchical dependencies of services are determined, and services are scaled in a bottom-up direction to prevent overload, ¶¶40, 42).
["In some embodiments, the performance controller 270 performs a scaling operation to one of the software entities SE-1, SE-2, . . . , SE-N that is located at bottom of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N. The performance of a software entity that is dependent from another software entity can be affected by the performance of that other software entity. For example, the performance of a software entity that is located at top of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N can be affected by the performance of a software entity that is located at bottom of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N. In an embodiment, the performance controller 270 identifies a first software entity and a second software entity of the software entities SE-1, SE-2, . . . , SE-N as unhealthy software entities, where the first software entity is dependent upon the second software entity (e.g., the output of the second software entity being an input of the first software entity). In this example, the performance controller 270 only performs the scaling operation to the second software entity, not to the first software entity. In some embodiments, the performance controller 270 determines whether one of the software entities SE-1, SE-2, . . . , SE-N executing in the cloud computing environment is in a scaling grace period and exempts the software entity that is in the scaling grace period from the scaling operation. Using a scaling grace period for a specific software entity enables fine-grained control (e.g., per service control) over scaling policies.", ¶40]

[" In the embodiment depicted in FIG. 3, the software entities SE-1, SE-2, . . . , SE-6 have a specific dependency hierarchy or topology. In particular, the software entity SE-1 is dependent from the software entity SE-4 (e.g., the output of the software entity SE-4 being an input of the software entity SE-1) and the software entity SE-4 is dependent from the software entity SE-5 (e.g., the output of the software entity SE-5 being an input of the software entity SE-4). In addition, in the embodiment depicted in FIG. 3, no software entity is dependent from the software entity SE-2 and the software entity SE-3 is dependent from the software entity SE-6 (e.g., the output of the software entity SE-6 being an input of the software entity SE-3). However, the dependency hierarchy of the software entities SE-1, SE-2, . . . , SE-6 included in the application 312 is not limited to the example illustrated in FIG. 3. The performance of a software entity that is dependent from another software entity can be affected by the performance of that other software entity. For example, the performance of the software entity SE-1 can be affected by the performance of the software entity SE-4 and the performance of the software entity SE-5, the performance of the software entity SE-4 can be affected by the performance of the software entity SE-5, and the performance of the software entity SE-3 can be affected by the performance of the software entity SE-6. In some embodiments, to control a performance metric (e.g., the response time) of the application 312, the performance controller 370 adjusts the operation of a software entity at the bottom of dependency hierarchy. For example, instead of adjusting the operation of the software entity SE-1, the performance controller 370 adjusts the operation of the software entity SE-5 (e.g., scales up or down the software entity SE-5). 
In another example, instead of adjusting the operation of the software entity SE-3, the performance controller 370 adjusts the operation of the software entity SE-6 (e.g., scales up or down the software entity SE-6).", ¶42]

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the instant application to modify Ou to scale up services in a hierarchical (bottom-up) sequence, as taught by Benjamin. The reason for this modification would be to scale services in a way that prevents overload of downstream services.
Regarding claims 2 and 11, Ou teaches wherein each service of the set of services is associated with a deployment size that indicates an amount of resources consumed by the service.
[“At block 340, a scale up or scale down request may be issued. For example, responsive to a notification from the policy engine that a scale out or scale in policy for a particular role has been triggered, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the particular role by a step size indicated in the scaling policy. Similarly, when the triggered scale out or scale in policy identifies a dependent role to also be scaled out or scaled in, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the dependent role by a step size indicated in the scaling policy.”, ¶52]

Regarding claims 3 and 12, Ou teaches determining an allocation ratio based on the deployment size associated with each service of the set of services, wherein scaling the set of services is further based on the allocation ratio (the step size by which the number of nodes is increased is based on the scaling policy, ¶52).
 ["At block 340, a scale up or scale down request may be issued. For example, responsive to a notification from the policy engine that a scale out or scale in policy for a particular role has been triggered, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the particular role by a step size indicated in the scaling policy. Similarly, when the triggered scale out or scale in policy identifies a dependent role to also be scaled out or scaled in, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the dependent role by a step size indicated in the scaling policy.", ¶52]
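The step-size behavior described in the ¶52 quotation (increasing or decreasing the node count of a role and of its dependent role by the step size in the scaling policy) can be sketched as follows. This is an illustrative sketch only: the function `apply_step`, the role names, and the node counts are hypothetical, and the floor at zero nodes is an assumption of the sketch rather than something stated in Ou.

```python
# Hypothetical sketch of step-size scaling as described in the paragraph 52 quotation.
def apply_step(node_counts, steps, scale_out=True):
    """Return new node counts after adding (or removing) each role's step size."""
    sign = 1 if scale_out else -1
    updated = dict(node_counts)
    for role, step in steps.items():
        # Never drop below zero nodes; this floor is an assumption of the sketch.
        updated[role] = max(0, updated.get(role, 0) + sign * step)
    return updated

# Assumed policy: two analysis nodes per reporting node (compare the paragraph 11 example).
counts = apply_step({"data_analysis": 4, "reporting": 2},
                    {"data_analysis": 2, "reporting": 1})
```

With these assumed values, a scale-out keeps the two-to-one relationship between the role and its dependent role.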

Regarding claims 4 and 13, Ou teaches wherein scaling the set of services further comprises: determining a scaling factor that indicates a percentage of scaling for one iterative step; and executing an iterative process by iteratively scaling the set of services based on the scaling factor until a target deployment size is reached.
[" In one embodiment, the first scaling factor (or step), which indicates the number of nodes to add or remove, can be defined independently for scaling up and scaling down within the role-based autoscaling policy for the role at issue. According to one embodiment, as expansion does not typically have any adverse side effects, the expansion of the number of nodes performing a role can be performed more aggressively than contraction of the number of nodes performing the role. To reduce resource waste, the virtual cluster of the stateful application can be created with minimal cluster size and then automatically expanded based on the autoscaling policy. As described in further detail below with reference to FIGS. 5A-B, to adapt to aggressive expansion, the autoscaling policy may use a condition set evaluation approach (e.g., an “if_any” statement) that triggers responsive to any of multiple specified conditions being satisfied.", ¶43]
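The iterative process recited in claims 4 and 13 (scaling by a scaling factor at each step until a target deployment size is reached) can be sketched as follows. The reading of the scaling factor as a fraction of the current size follows the claim language; the function `iterative_scale`, its parameters, and the example values are hypothetical and are not drawn from Ou.

```python
import math

def iterative_scale(current, target, factor):
    """Return the deployment sizes reached while growing by `factor` per step up to `target`."""
    sizes = []
    while current < target:
        # Grow by at least one unit so the loop always makes progress.
        step = max(1, math.ceil(current * factor))
        current = min(target, current + step)
        sizes.append(current)
    return sizes

# Assumed values: start at 10 units, grow 25% per step, stop at 20 units.
steps = iterative_scale(current=10, target=20, factor=0.25)
```

Each iteration grows the deployment by the assumed percentage and the final step is capped at the target size.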

Regarding claim 6, Ou teaches wherein scaling the set of services includes scaling up or scaling down the set of services.
[" At block 340, a scale up or scale down request may be issued. For example, responsive to a notification from the policy engine that a scale out or scale in policy for a particular role has been triggered, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the particular role by a step size indicated in the scaling policy. Similarly, when the triggered scale out or scale in policy identifies a dependent role to also be scaled out or scaled in, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the dependent role by a step size indicated in the scaling policy.", ¶52]

Regarding claim 7, Benjamin teaches wherein the dependency data includes tiered hierarchical levels, and wherein scaling up the set of services occurs in an order from lower tiered levels to higher tiered levels.
[" The performance controller 270 depicted in FIG. 2 is an embodiment of the performance controller 170 depicted in FIG. 1. In some embodiments, the performance controller 270 is configured to control the software entities SE-1, SE-2, . . . , SE-N of the application 212 such that an SLO of the software entities SE-1, SE-2, . . . , SE-N of the application 212 satisfies a predetermined threshold (e.g., to be equal to, above, or below the predetermined threshold). For example, the performance controller 270 controls the overall response time of the software entities SE-1, SE-2, . . . , SE-N of the application 212 to be below a predetermined response time threshold. In an embodiment, the performance controller 270 is configured to determine dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N, determine operational status of each of the software entities SE-1, SE-2, . . . , SE-N executing in the hybrid cloud system 100, and in response to the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N and the operational status of each of the software entities SE-1, SE-2, . . . , SE-N, perform a scaling operation to the software entities SE-1, SE-2, . . . , SE-N such that an SLO of the hybrid cloud system 100 satisfies a predetermined threshold. For example, the performance controller 270 is configured to perform a scaling operation (e.g., a scale up operation to increase software processing capacity) to the software entities SE-1, SE-2, . . . , SE-N such that application response time of the hybrid cloud system 100 is below a predetermined threshold. By scaling one or more the software entities SE-1, SE-2, . . . , SE-N based on the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N and the operational status of each of the software entities SE-1, SE-2, . . . , SE-N, the SLO of the hybrid cloud system 100 can be maintained at a specific level (e.g., to be equal to, above, or below a predetermined threshold).", ¶39]
 
Ou/Benjamin teaches scaling up in a bottom-up manner as discussed above. Benjamin teaches scaling down services but does not specifically teach that scaling down the set of services occurs in an order from higher level tiers to lower level tiers. Benjamin teaches that scaling up using a bottom-up sequence is done for protection of downstream services.
It would have been obvious to a person of ordinary skill in the art at the time of the effective filing of the instant application to modify Ou/Benjamin such that scaling down the set of services occurs in an order from higher level tiers to lower level tiers. The reason for this modification would be the same as for scaling up services in a bottom-up sequence, which is to protect downstream services. One of ordinary skill with knowledge of the teaching in Benjamin would recognize that, just as lower-level services may be overloaded if scaling up is not done in a bottom-up hierarchical order, the same overload can occur if, when scaling down, lower-level services are removed before the higher-level services.
	 
Regarding claim 8, Benjamin teaches wherein scaling the set of services further comprises: identifying a second cluster to allocate the set of services to; 
[" The performance controller 270 depicted in FIG. 2 is an embodiment of the performance controller 170 depicted in FIG. 1. In some embodiments, the performance controller 270 is configured to control the software entities SE-1, SE-2, . . . , SE-N of the application 212 such that an SLO of the software entities SE-1, SE-2, . . . , SE-N of the application 212 satisfies a predetermined threshold (e.g., to be equal to, above, or below the predetermined threshold). For example, the performance controller 270 controls the overall response time of the software entities SE-1, SE-2, . . . , SE-N of the application 212 to be below a predetermined response time threshold. In an embodiment, the performance controller 270 is configured to determine dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N, determine operational status of each of the software entities SE-1, SE-2, . . . , SE-N executing in the hybrid cloud system 100, and in response to the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N and the operational status of each of the software entities SE-1, SE-2, . . . , SE-N, perform a scaling operation to the software entities SE-1, SE-2, . . . , SE-N such that an SLO of the hybrid cloud system 100 satisfies a predetermined threshold. For example, the performance controller 270 is configured to perform a scaling operation (e.g., a scale up operation to increase software processing capacity) to the software entities SE-1, SE-2, . . . , SE-N such that application response time of the hybrid cloud system 100 is below a predetermined threshold. By scaling one or more the software entities SE-1, SE-2, . . . , SE-N based on the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N and the operational status of each of the software entities SE-1, SE-2, . . . , SE-N, the SLO of the hybrid cloud system 100 can be maintained at a specific level (e.g., to be equal to, above, or below a predetermined threshold).", ¶39]

scaling up the set of services on the second cluster based on the dependency information in a bottom to top order;
["The performance control process depicted in FIG. 5 uses a bottom-up graph traversal approach in which downstream services are protected from being overloaded due to upstream scaling because the performance control process is aware of the dependencies between the software entities. In addition, the performance control process depicted in FIG. 5 can improve or even optimize SLO enforcement (e.g. performance, availability, etc.) as well as efficiency (e.g., resource usage). ", ¶45]

Ou/Benjamin teaches scaling up in a bottom-up manner as discussed above. Benjamin teaches scaling down services but does not specifically teach scaling down the set of services on the first cluster based on the dependency information in a top to bottom order. Benjamin teaches that scaling up using a bottom-up sequence is done for protection of downstream services.
It would have been obvious to a person of ordinary skill in the art at the time of the effective filing of the instant application to modify Ou/Benjamin by further scaling down the set of services on the first cluster based on the dependency information in a top to bottom order. The reason for this modification would be the same as for scaling up services in a bottom-up sequence, which is to protect downstream services. One of ordinary skill with knowledge of the teaching in Benjamin would recognize that, just as lower-level services may be overloaded if scaling up is not done in a bottom-up hierarchical order, the same overload can occur if, when scaling down, lower-level services are removed before the higher-level services.
Regarding claim 9, Ou teaches wherein the service depends on more than one other service in the set of services (there are multiple roles that specify multiple dependencies, ¶12).
[" Embodiments described herein seek to improve resource utilization for stateful applications running in an LSVDP environment in an application agnostic manner using role-based autoscaling policies as well as information regarding dependencies among various roles. In one embodiment the autoscaling approach does not require knowledge regarding the specific application logic, and can rely simply on information regarding the utilization of or load on various resources (e.g., CPU, memory, network, disk I/O, and the like) that have been allocated to the respective nodes. In this manner, the autoscaling approach proposed by embodiments described herein is more flexible, decoupled from the underlying application logic and can therefore be generalized for use in connection a broad variety of applications.", ¶12]

Regarding claim 15, Benjamin teaches wherein the dependency data includes dependencies arranged in an order with a bottommost level and a topmost level, and wherein scaling up the set of services occurs in an order from the bottommost level to the topmost level (scaling according to a bottom-up sequence, ¶45).
["The performance control process depicted in FIG. 5 uses a bottom-up graph traversal approach in which downstream services are protected from being overloaded due to upstream scaling because the performance control process is aware of the dependencies between the software entities. In addition, the performance control process depicted in FIG. 5 can improve or even optimize SLO enforcement (e.g. performance, availability, etc.) as well as efficiency (e.g., resource usage). ", ¶45]


Claims 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Ou US 2021/0263780 and further in view of Benjamin US 2021/0311655 and Boss US 2013/0311988.
Regarding claim 16, Ou teaches a system comprising: memory with instructions encoded thereon; and one or more processors that, when executing the instructions, perform operations comprising: maintaining, by a service management system, a plurality of services that are distributed across a plurality of clusters, wherein each service of the plurality of services serves a functionality in a data storage system;
[" Embodiments described herein are generally directed to a role-based autoscaling approach for providing fine-grained control of scaling of nodes of a stateful application in a large scale virtual data processing (LSVDP) environment. In the following description, numerous specific details are set forth in order to provide a thorough understanding of example embodiments. It will be apparent, however, to one skilled in the art that embodiments described herein may be practiced without some of these specific details.", ¶9]

["There are two general types of applications: stateless and stateful. A stateless application (e.g., a web server) does not store data generated from one session for use in a subsequent session. As such, there is no dependency on the local container storage for a stateless workload. In contrast, stateful applications (e.g., artificial intelligence (AI) applications and applications relating to storing and processing big data, including data science, analytics, machine learning (ML), and deep learning (DL)) are services that rely on backing storage, and maintaining state is expected as part of running the service. Apache Hadoop and Apache Spark are non-limiting examples of software frameworks for storing data and running applications on clusters of hosts that are intended to provide massive storage of data and enormous processing power to support concurrent tasks or jobs by distributing data and calculations across different hosts so multiple tasks can be accomplished simultaneously.", ¶10]

receiving a request to decommission a cluster of the plurality of clusters, wherein a service of the plurality of services is run by the cluster (a user interface is used to request deployment and scaling, and the autoscaling policy is defined by the user, ¶26);
[" In the context of the present example, the controller host 110 includes a user interface 111, a management module 112, a management database 113, a policy engine 114, a load monitor 115, and a load database 116. The user interface 111 may provide an interface to the user to facilitate creating of a virtual cluster (e.g., virtual clusters 123a-n) on a worker host (e.g., worker host 12a-n) and facilitate configuration of one or more role-based autoscaling policies for each role of an application that will be deployed within a virtual cluster (e.g., virtual cluster 123). Alternatively or additionally, the role-based autoscaling policies may be provided in the form of object notation files.", ¶26]

["In the present example, the application containers 121a-m and 121n-x may cooperate with each other to implement the function/service of the application associated with their respective virtual clusters For example, application container 121", ¶30]
 
accessing dependency data representing dependencies among the plurality of services (scaling of services is performed based on dependencies among the roles performed by the services, and scaling is performed responsive to such dependencies, ¶s 11, 22, 23);
["For stateful applications that are deployed in distributed computing environments, such as an LSVDP environment (e.g., Hadoop or Spark), each host of the cluster may cooperatively work with the others to implement the function of the application. Each host of the cluster may include multiple nodes (e.g., application containers), each having one role (which may include multiple related services), operating within a virtual cluster. As a result of different tasks being performed by the different roles, the bottleneck for each role may be different. For example, one role might be central processing unit (CPU) intensive, and another role might be memory or Input/Output (I/O) intensive. Additionally, there may be dependencies among the various roles. For example, for each two nodes performing a first role (e.g., data analysis), it may be desirable to have one node performing a second role (e.g., reporting). This creates difficulties for existing autoscaling approaches, which typically perform scaling in or scaling out of nodes independently. Some vendors have attempted to address these issues with application-specific autoscaling approaches bound to particular applications; however, such application-specific autoscaling approaches require in-depth knowledge of the application logic; and furthermore due to their tight coupling with the application logic cannot be used for other applications.", ¶11]
["In one embodiment, several related services may be represented by one container, and one role is defined for each type of container. For purposes of illustration, consider a simplified Hadoop cluster that is used mainly for a map/reduce job. In this example, three types of roles may be defined, including a “controller” role, a “worker” role, and a “manager” role. In this example, the controller role may include a HDFS namenode and a Yet Another Resource Negotiator (YARN) resource manager to manage the distributed resources; the worker role may include an HDFS datanode, and a YARN node manager to store the data and execute the map/reduce tasks; and the manager role may include an Ambari or Cloudera Manager to manage the whole virtual cluster. In this non-limiting example, if the Hadoop cluster also needs to handle database workload, then it may also include a “database” role that includes an Hbase HRegionServer etc.", ¶22]

["The previous example is intended to illustrate that a “role” can be thought of as a tag for a set of grouped services. Even for the same application, the customer/user can name different roles as they want to the same service group. As those skilled in the art will appreciate, the services that may be grouped together as one role is application dependent. In embodiments described herein, the autoscaling policy operates at a higher-level in relation to containers and the roles that they represent. As such, the autoscaling policy need not know which or how many services are included in that role and is therefore decoupled from such application dependencies.", ¶23]
determining, based on the dependency data, a set of services of the plurality of services to scale based on a scaling of the service, the set of services including the service (scaling of services performing a particular role and of any identified dependent service, ¶34).
[" In the context of the present example, the policy engine 114 is responsible for retrieving the virtual cluster configuration/deployment information and the corresponding role-based autoscaling policies from the management database 113. Based on the role-based autoscaling policies, the policy engine 114 may also collect load information from the load monitor 115 for each role of the virtual cluster. In one embodiment, the role-based autoscaling policies provide information indicative of how frequently the load information for the corresponding role is to be evaluated. The policy engine 114 may also be responsible for evaluating the load information against the role-based autoscaling policies to determine whether the load associated with a particular roles meets the corresponding set of conditions defined in a role-based autoscaling policy for the particular role; and if so, informing the management module 112 to scale up/scale down the containers that belong to the particular role as well as containers that belong to any dependent roles.", ¶34]
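The role-based policy evaluation described in ¶34 of Ou, together with the "if_any" condition set mentioned in ¶43, can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not Ou's code: the `RolePolicy` fields, the metric names, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RolePolicy:
    """Hypothetical role-based autoscaling policy (field names are assumed)."""
    role: str
    thresholds: dict[str, float]  # metric name -> scale-out threshold
    step: int = 1                 # nodes to add for this role
    dependent_roles: dict[str, int] = field(default_factory=dict)  # role -> step


def if_any_triggered(policy: RolePolicy, load: dict[str, float]) -> bool:
    """True when any one of the policy's conditions is satisfied."""
    return any(load.get(metric, 0.0) > limit
               for metric, limit in policy.thresholds.items())


def scale_out_requests(policy: RolePolicy, load: dict[str, float]) -> dict[str, int]:
    """Return role -> node increment for the role and its dependent roles."""
    if not if_any_triggered(policy, load):
        return {}
    requests = {policy.role: policy.step}
    # Dependent roles are scaled together with the triggering role.
    requests.update(policy.dependent_roles)
    return requests


# Example loosely following Ou's ¶11: two analysis nodes per reporting node.
analysis = RolePolicy("analysis", {"cpu": 80.0, "memory": 90.0},
                      step=2, dependent_roles={"reporting": 1})
```

With this example policy, a load of 85% CPU and 40% memory satisfies one condition, so the sketch requests two more analysis nodes and one more reporting node.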

Ou teaches scaling up and down based on dependencies between services but does not teach determining, by the service management system, scaling sequences in which the set of services are to be scaled based on the dependency data; scaling up the set of services on the second cluster based on a first scaling sequence of the scaling sequences; and scaling down the set of services on the first cluster based on a second scaling sequence of the scaling sequences. Benjamin, in the same field of endeavor as the invention, teaches a system for management of deployed services in a network, which includes scaling of services. Benjamin teaches determining, by the service management system, scaling sequences in which the set of services are to be scaled based on the dependency data (hierarchical dependencies of services are determined and services are scaled in a bottom-up direction to prevent overload, ¶s 40, 42);
scaling up the set of services on the second cluster based on a first scaling sequence of the scaling sequences; and scaling down the set of services on the first cluster based on a second scaling sequence of the scaling sequences (hierarchical dependencies of services are determined and services are scaled in a bottom-up direction to prevent overload, ¶s 40, 42).
["In some embodiments, the performance controller 270 performs a scaling operation to one of the software entities SE-1, SE-2, . . . , SE-N that is located at bottom of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N. The performance of a software entity that is dependent from another software entity can be affected by the performance of that other software entity. For example, the performance of a software entity that is located at top of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N can be affected by the performance of a software entity that is located at bottom of the dependency hierarchy between the software entities SE-1, SE-2, . . . , SE-N. In an embodiment, the performance controller 270 identifies a first software entity and a second software entity of the software entities SE-1, SE-2, . . . , SE-N as unhealthy software entities, where the first software entity is dependent upon the second software entity (e.g., the output of the second software entity being an input of the first software entity). In this example, the performance controller 270 only performs the scaling operation to the second software entity, not to the first software entity. In some embodiments, the performance controller 270 determines whether one of the software entities SE-1, SE-2, . . . , SE-N executing in the cloud computing environment is in a scaling grace period and exempts the software entity that is in the scaling grace period from the scaling operation. Using a scaling grace period for a specific software entity enables fine-grained control (e.g., per service control) over scaling policies.", ¶40]

[" In the embodiment depicted in FIG. 3, the software entities SE-1, SE-2, . . . , SE-6 have a specific dependency hierarchy or topology. In particular, the software entity SE-1 is dependent from the software entity SE-4 (e.g., the output of the software entity SE-4 being an input of the software entity SE-1) and the software entity SE-4 is dependent from the software entity SE-5 (e.g., the output of the software entity SE-5 being an input of the software entity SE-4). In addition, in the embodiment depicted in FIG. 3, no software entity is dependent from the software entity SE-2 and the software entity SE-3 is dependent from the software entity SE-6 (e.g., the output of the software entity SE-6 being an input of the software entity SE-3). However, the dependency hierarchy of the software entities SE-1, SE-2, . . . , SE-6 included in the application 312 is not limited to the example illustrated in FIG. 3. The performance of a software entity that is dependent from another software entity can be affected by the performance of that other software entity. For example, the performance of the software entity SE-1 can be affected by the performance of the software entity SE-4 and the performance of the software entity SE-5, the performance of the software entity SE-4 can be affected by the performance of the software entity SE-5, and the performance of the software entity SE-3 can be affected by the performance of the software entity SE-6. In some embodiments, to control a performance metric (e.g., the response time) of the application 312, the performance controller 370 adjusts the operation of a software entity at the bottom of dependency hierarchy. For example, instead of adjusting the operation of the software entity SE-1, the performance controller 370 adjusts the operation of the software entity SE-5 (e.g., scales up or down the software entity SE-5). 
In another example, instead of adjusting the operation of the software entity SE-3, the performance controller 370 adjusts the operation of the software entity SE-6 (e.g., scales up or down the software entity SE-6).", ¶42]
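The dependency hierarchy of FIG. 3 as described in ¶42 of Benjamin can be encoded to show how a bottom-up sequence falls out of the dependency data. The sketch below is illustrative only: the function names and the tier-numbering scheme are assumptions, and the reversed (top-to-bottom) order reflects the scale-down order recited in the claims rather than an explicit teaching of Benjamin.

```python
def tier_levels(depends_on: dict[str, list[str]]) -> dict[str, int]:
    """Assign each entity a tier: 0 for entities with no dependencies,
    otherwise one more than the highest tier among its dependencies."""
    levels: dict[str, int] = {}

    def level(entity: str, visiting: tuple[str, ...] = ()) -> int:
        if entity in levels:
            return levels[entity]
        if entity in visiting:
            raise ValueError(f"dependency cycle at {entity}")
        deps = depends_on.get(entity, [])
        levels[entity] = 0 if not deps else 1 + max(
            level(dep, visiting + (entity,)) for dep in deps)
        return levels[entity]

    for entity in depends_on:
        level(entity)
    return levels


def scale_up_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Bottom-up order: lowest tier first, ties broken by name."""
    levels = tier_levels(depends_on)
    return sorted(levels, key=lambda e: (levels[e], e))


def scale_down_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Top-down order, taken here as the reverse of the scale-up order."""
    return list(reversed(scale_up_order(depends_on)))


# Hierarchy from Benjamin ¶42: SE-1 depends on SE-4, SE-4 on SE-5, SE-3 on SE-6.
FIG3 = {"SE-1": ["SE-4"], "SE-2": [], "SE-3": ["SE-6"],
        "SE-4": ["SE-5"], "SE-5": [], "SE-6": []}
```

For this hierarchy, SE-5 and SE-6 sit in the bottom tier and SE-1 in the top tier, so a bottom-up sequence reaches SE-5 before SE-4 and SE-4 before SE-1.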

It would have been obvious to a person of ordinary skill in the art at the time of the effective filing of the instant application to modify Ou with scaling up services in a sequence of hierarchical order (bottom-up). The reason for this modification would be to scale services in a way that prevents overload of downstream services.
Ou/Benjamin does not teach identifying a second cluster from the plurality of clusters based on the second cluster having enough capacity to process requests from the set of services. Boss, in the same field of endeavor as the invention, teaches a system for virtual machine deployment in a cloud network. Boss teaches identifying a second cluster from the plurality of clusters based on the second cluster having enough capacity to process requests from the set of services.
[“ As indicated above, embodiments of the present invention relate to the migration of virtual machines (VMs) between networked computing environments (e.g., cloud computing environments) based on resource utilization. Specifically, embodiments of the present invention provide an approach to select an optimal set (one or more) of VMs as candidates for pre-staged migration. In a typical embodiment, when a first cloud environment nears physical resource capacity, an optimal set of VMs will be identified for migration to a second cloud environment that has sufficient capacity to accommodate workload(s) from the first cloud environment. To make this process more efficient, data associated with the set of virtual machines may be "pre-stage" replicated from the first cloud environment to the second cloud environment (e.g., in advance of the migration of the identified set of VMs).”, ¶18]
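The capacity-based selection of a second environment described in ¶18 of Boss can be sketched as a simple filter. This is an assumed, simplified model for illustration: the function name, the single scalar capacity figure, and the tie-breaking rule (prefer the most free capacity) are not taken from Boss.

```python
from typing import Optional


def select_target_cluster(free_capacity: dict[str, float],
                          required: float,
                          source: str) -> Optional[str]:
    """Pick a cluster other than `source` whose free capacity covers `required`.

    Returns None when no cluster has sufficient capacity. Among qualifying
    clusters, the one with the most free capacity is chosen (an assumption).
    """
    candidates = {name: cap for name, cap in free_capacity.items()
                  if name != source and cap >= required}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

For example, with free capacities of 2, 10, and 6 units on clusters A, B, and C, a workload of 5 units leaving cluster A would be placed on cluster B under these assumptions.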

It would have been obvious to a person of ordinary skill in the art at the time of the effective filing of the instant application to modify Ou/Benjamin with determination of sufficient capacity to host workloads as taught by Boss. The reason for this modification would be to ensure that sufficient capacity exists to host the scaled services prior to scaling up the services.
Regarding claim 17, Ou teaches wherein each service of the set of services is associated with a deployment size that indicates an amount of resources consumed by the service.
[“At block 340, a scale up or scale down request may be issued. For example, responsive to a notification from the policy engine that a scale out or scale in policy for a particular role has been triggered, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the particular role by a step size indicated in the scaling policy. Similarly, when the triggered scale out or scale in policy identifies a dependent role to also be scaled out or scaled in, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the dependent role by a step size indicated in the scaling policy.”, ¶52]

Regarding claim 18, Ou teaches determining an allocation ratio based on the deployment size associated with each service of the set of services, wherein scaling the set of services is further based on the allocation ratio (step size to increase based on policy, ¶52).
 ["At block 340, a scale up or scale down request may be issued. For example, responsive to a notification from the policy engine that a scale out or scale in policy for a particular role has been triggered, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the particular role by a step size indicated in the scaling policy. Similarly, when the triggered scale out or scale in policy identifies a dependent role to also be scaled out or scaled in, the management module may request the worker agent on the appropriate worker host to increase/decrease the number of nodes for the dependent role by a step size indicated in the scaling policy.", ¶52]

Regarding claim 19, Ou teaches wherein scaling the set of services further comprises: determining a scaling factor that indicates a percentage of scaling for one iterative step; and executing an iterative process by iteratively scaling the set of services based on the scaling factor until a target deployment size is reached.
[" In one embodiment, the first scaling factor (or step), which indicates the number of nodes to add or remove, can be defined independently for scaling up and scaling down within the role-based autoscaling policy for the role at issue. According to one embodiment, as expansion does not typically have any adverse side effects, the expansion of the number of nodes performing a role can be performed more aggressively than contraction of the number of nodes performing the role. To reduce resource waste, the virtual cluster of the stateful application can be created with minimal cluster size and then automatically expanded based on the autoscaling policy. As described in further detail below with reference to FIGS. 5A-B, to adapt to aggressive expansion, the autoscaling policy may use a condition set evaluation approach (e.g., an “if_any” statement) that triggers responsive to any of multiple specified conditions being satisfied.", ¶43]

	

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Ou/Benjamin as applied to claim 4 above, and further in view of Guniguntala US 2020/0404051.
Regarding claims 5 and 14, Ou/Benjamin do not teach determining a scaling ratio based on workload associated with the set of services, wherein the scaling ratio is a ratio of scaling based on workload associated with the service relative to workload associated with other services in the set of services, and wherein each iteration of the iterative process is further based on the scaling ratio. Guniguntala, in the same field of endeavor as the invention, teaches a system for service deployment and scaling. Guniguntala teaches determining a scaling ratio based on workload associated with the set of services, wherein the scaling ratio is a ratio of scaling based on workload associated with the service relative to workload associated with other services in the set of services, and wherein each iteration of the iterative process is further based on the scaling ratio (Guniguntala teaches an iterative process of determining a placement ratio for deployment of services, ¶s 70, 74, 81).

["For example, broker 110 can attempt to optimize a current configuration of computing environments, e.g. involving their placement ratios at timed intervals, e.g. once per hour, or upon the occurrence of a predetermined event for example. If the one or more criterion at block 1110 is not satisfied, broker 110 can return to iteratively perform the loop of blocks 1107-1110 until the criterion is satisfied. When the one or more criterion of block 1110 is satisfied, broker 110 can proceed to evaluation block 1111 and adjust block 1112 to perform an optimization process.", ¶70]
["At block 1113, broker 110 can determine whether an exit condition has been satisfied. An exit condition can be, for example, a time period for deployment of an application component group has expired or that a total budget threshold has been exceeded. If an exit condition has not been satisfied, broker 110 can simply return to block 1107 and can iteratively perform the loop of blocks 1107-1113 and can iteratively optimize selected computing environments and a placement ratio until an exit condition is achieved at block 1113. When an exit condition is satisfied at block 1113, broker 110 can proceed to block 1114. At block 1114, broker 110 can return to block 1101 to receive new computing environment data and/or application data (which broker 110 can be performing at all times during deployment of system 100).", ¶74]
[" Configuration manager 111 can determine a Placement Ratio for initial deployment using a process as set forth in the flowchart of FIG. 5A. For determining an initial placement ratio, configuration manager 111 can use data of computing environments database 2121, applications database 2122, and global area B of monitoring data history database 2123. Subsequent to startup e.g. at iteratively performed block 1111 data of local area A can also be used for determining an iteratively and dynamically updated placement ratio.", ¶81]
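The workload-relative scaling ratio recited in claims 5 and 14 can be sketched as a proportional split. This illustrates the claim language only and is not Guniguntala's placement-ratio process: the function names, the use of a single workload number per service, and the rounding rule are all assumptions.

```python
def scaling_ratios(workloads: dict[str, float]) -> dict[str, float]:
    """Ratio for each service: its workload relative to the total workload
    of all services in the set (assumed interpretation of the claim)."""
    total = sum(workloads.values())
    if total <= 0:
        raise ValueError("total workload must be positive")
    return {service: load / total for service, load in workloads.items()}


def per_step_increment(workloads: dict[str, float], step_size: int) -> dict[str, int]:
    """Split one iteration's step across services according to the ratios."""
    ratios = scaling_ratios(workloads)
    # Rounding to whole units is an assumption; other schemes are possible.
    return {service: round(step_size * ratio)
            for service, ratio in ratios.items()}
```

Under these assumptions, two services with workloads of 30 and 10 receive ratios of 0.75 and 0.25, so a step of 8 units is split as 6 and 2 in that iteration.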

It would have been obvious to a person of ordinary skill in the art at the time of the effective filing of the instant application to modify Ou/Benjamin with determination of a placement (i.e., deployment) ratio and scaling of services consistent with the ratio. The reason for this modification would be to deploy services in the optimal ratio of services.



Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Ou/Benjamin/Boss as applied to claim 19 above, and further in view of Guniguntala US 2020/0404051.
Regarding claim 20, Ou/Benjamin/Boss do not teach determining a scaling ratio based on workload associated with the set of services, wherein the scaling ratio is a ratio of scaling based on workload associated with the service relative to workload associated with other services in the set of services, and wherein each iteration of the iterative process is further based on the scaling ratio. Guniguntala, in the same field of endeavor as the invention, teaches a system for service deployment and scaling. Guniguntala teaches determining a scaling ratio based on workload associated with the set of services, wherein the scaling ratio is a ratio of scaling based on workload associated with the service relative to workload associated with other services in the set of services, and wherein each iteration of the iterative process is further based on the scaling ratio (Guniguntala teaches an iterative process of determining a placement ratio for deployment of services, ¶s 70, 74, 81).

["For example, broker 110 can attempt to optimize a current configuration of computing environments, e.g. involving their placement ratios at timed intervals, e.g. once per hour, or upon the occurrence of a predetermined event for example. If the one or more criterion at block 1110 is not satisfied, broker 110 can return to iteratively perform the loop of blocks 1107-1110 until the criterion is satisfied. When the one or more criterion of block 1110 is satisfied, broker 110 can proceed to evaluation block 1111 and adjust block 1112 to perform an optimization process.", ¶70]
["At block 1113, broker 110 can determine whether an exit condition has been satisfied. An exit condition can be, for example, a time period for deployment of an application component group has expired or that a total budget threshold has been exceeded. If an exit condition has not been satisfied, broker 110 can simply return to block 1107 and can iteratively perform the loop of blocks 1107-1113 and can iteratively optimize selected computing environments and a placement ratio until an exit condition is achieved at block 1113. When an exit condition is satisfied at block 1113, broker 110 can proceed to block 1114. At block 1114, broker 110 can return to block 1101 to receive new computing environment data and/or application data (which broker 110 can be performing at all times during deployment of system 100).", ¶74]
[" Configuration manager 111 can determine a Placement Ratio for initial deployment using a process as set forth in the flowchart of FIG. 5A. For determining an initial placement ratio, configuration manager 111 can use data of computing environments database 2121, applications database 2122, and global area B of monitoring data history database 2123. Subsequent to startup e.g. at iteratively performed block 1111 data of local area A can also be used for determining an iteratively and dynamically updated placement ratio.", ¶81]

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the instant application to modify Ou/Benjamin/Boss to determine a placement (i.e., deployment) ratio and to scale the services consistent with that ratio, as taught by Guniguntala. The reason for this modification would be to deploy the services in an optimal ratio relative to one another.
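
For illustration only, the limitation at issue (a scaling ratio derived from each service's workload relative to the workload of the other services, applied on each iteration of an iterative process) can be sketched as follows. This sketch is not taken from the claims, from Guniguntala, or from any other cited reference. The function names, the service names, the one-replica-per-iteration step, and the iteration budget used as an exit condition are all hypothetical assumptions chosen to make the example runnable.

```python
from typing import Dict


def scaling_ratios(workloads: Dict[str, float]) -> Dict[str, float]:
    """Return each service's share of the total workload (shares sum to 1).

    Hypothetical helper: the ratio for a service is its workload relative
    to the combined workload of all services in the set.
    """
    total = sum(workloads.values())
    if total <= 0:
        raise ValueError("total workload must be positive")
    return {name: load / total for name, load in workloads.items()}


def scale_iteratively(
    workloads: Dict[str, float],
    replicas: Dict[str, int],
    target_total: int,
    max_iterations: int = 10,
) -> Dict[str, int]:
    """Move replica counts one step per iteration toward ratio-based targets.

    Assumption: the loop exits when every service matches its target or
    when the iteration budget is spent, whichever comes first.
    """
    ratios = scaling_ratios(workloads)
    # Each service gets at least one replica, otherwise its workload share.
    targets = {
        name: max(1, round(ratio * target_total))
        for name, ratio in ratios.items()
    }
    current = dict(replicas)
    for _ in range(max_iterations):
        if current == targets:
            break  # exit condition satisfied
        for name, target in targets.items():
            if current[name] < target:
                current[name] += 1
            elif current[name] > target:
                current[name] -= 1
    return current


if __name__ == "__main__":
    # Hypothetical services and workload figures.
    workloads = {"metadata": 10.0, "index": 30.0, "storage": 60.0}
    start = {"metadata": 1, "index": 1, "storage": 1}
    print(scale_iteratively(workloads, start, target_total=10))
```

In this sketch the ratio is recomputed from workload only once, before the loop. A variant closer to a dynamically updated ratio would call scaling_ratios inside the loop with fresh workload measurements; that choice is likewise an assumption and is not drawn from the record.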




Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TOM Y. CHANG whose telephone number is (571)270-5938. The examiner can normally be reached on Monday - Thursday from 9am to 5pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, William Trost, can be reached on (571)272-7872. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/TOM Y CHANG/
Primary Examiner, Art Unit 2442