DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
	Applicant’s arguments regarding the 35 USC § 103 rejections have been fully considered but are not persuasive. Applicant argues that the claims are allowable because “while Donahue does discuss estimating a number of requests generally, it remains silent on determining a number of pending operations associated with a service, where the determination is based on information stored in a data structure maintained by a host agent. Instead, and at best, Donahue describes a scaling framework that operates by making scaling decisions based on application-level data, that includes the number of pending requests directed towards a computing application. However, mere general discussion of using pending request data to make scaling decision is not the same, nor suggestive of, ‘determine[ing] whether [a] service has any pending operations requested by the plurality of other services in the cluster or cloud environment, wherein the determination is based at least on information stored in a data structure maintained by the host agent.’” (Applicant’s Remarks, Pg. 15). Examiner respectfully disagrees. Donahue teaches persistent queues and a database storing information about pending requests directed toward a service. This information is used to make scaling decisions ([0013], an auto-scaling component comprises a data collector 124 and a scaling module 122 that together manage the auto-scaling of instances executing a cloud-based computing application (e.g., a content server or some other web application). For the purposes of this description, a cloud-based computing application executing on the instances 110, 120, and 130, for which the scaling decisions are being made, is referred to as the subject computing application. The data collectors 124 and the scaling modules 122, which are shown in FIG. 1 as executing on instances 110, 120, and 130, interact through a set of persistent queues 142, 144, 146, and 148, and through a common database 150; and [0016], the pending requests estimator returns the estimated value of the number of current pending requests. In some embodiments, the number of pending requests for the subject computing application may be estimated by examining the length of the relevant input queue or by examining a count of database records that match a certain query condition or some other computation. These database records that match a certain query condition may be termed, for the purposes of this description, request records). Applicant’s remarks regarding newly amended claim language have been fully addressed in the rejections below.
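For illustration only, the two estimation options cited from Donahue [0016] (input-queue length, or a count of database records matching a query condition) can be sketched as below. This is a minimal sketch under stated assumptions and is not code from Donahue or from the application; the function name `estimate_pending_requests`, its parameters, and the `status` field are hypothetical.

```python
from collections import deque


def estimate_pending_requests(input_queue=None, request_records=None,
                              is_pending=lambda r: r.get("status") == "pending"):
    """Estimate the number of pending requests for a subject application.

    Hypothetical helper: uses the input-queue length when a queue is given;
    otherwise counts the records that match the query condition.
    """
    if input_queue is not None:
        return len(input_queue)
    if request_records is not None:
        return sum(1 for record in request_records if is_pending(record))
    return 0


# Example inputs (invented for illustration).
queue = deque(["req-1", "req-2", "req-3"])
records = [{"id": 1, "status": "pending"}, {"id": 2, "status": "done"}]
print(estimate_pending_requests(input_queue=queue))
print(estimate_pending_requests(request_records=records))
```

Either source yields a count that a scaling decision could consume; the sketch does not model how the queue or database is maintained.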

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
	The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6 and 8-38 are rejected under 35 U.S.C. 103 as being unpatentable over Laribi et al. (United States Patent Application Publication 2013/0297802) in view of Donahue et al. (United States Patent Application Publication 2014/0040885).
As per claim 1, Laribi teaches, a system comprising: 
	a computing node including a processor ([0078], The client 102, …may be deployed as and/or executed on any type and form of computing device…FIGS. 1E and 1F depict block diagrams of a computing device 100 useful for practicing an embodiment of the client 102; and Fig. 1E, Block 101), a hypervisor ([0148], a computing device 100 executes one or more types of hypervisors; and [0152], multiple hypervisors manage one or more of the guest operating systems executed on one of the computing devices 100) and a host agent  ([0242], cloud service providers 716) in communication with the hypervisor ([0242], a computing device, such as an appliance 200 or an appliance cluster 600 may communicate with one or more cloud service providers 716 providing one or more virtual or physical machines 718 in a hypervisor farm 720. The hypervisor farm 720 may provide one or more virtual or physical machines to provide a plurality of instances of applications 714 providing services to a web tier 712 as part of a hosted application 710; and [0246], Cloud service providers 716 may provide one or more physical or virtual machines 718 or other virtual or physical computing devices including virtual servers or routers to provide a hypervisor farm 720 for execution of one or more instances of application 714), the host agent configured to:
	start a service ([0246], one or more physical or virtual machines 718 or other virtual or physical computing devices including virtual servers or routers …for execution of one or more instances of application 714)  at the computing node ([0246], Cloud service providers 716 may provide one or more physical or virtual machines 718 or other virtual or physical computing devices including virtual servers or routers to provide a hypervisor farm 720 for execution of one or more instances of application 714);
	respond to a request ([0250], responsive to the monitored metric exceeding a threshold of a policy, the AAPE 708 may issue a request to one or more of the cloud service providers 716 to provision or deprovision one or more machines…the cloud service providers 716 may accordingly provision/start or deprovision/shut down the machines in accordance with the request) associated with an operation at the service ([0247], an appliance 200 or appliance cluster 600 may monitor backend services at step 730. Monitoring back end services may comprise monitoring application health, server or virtual server load, latency, bandwidth usage, packet loss, round trip time, CPU utilization, memory utilization, storage utilization, number of clients, time taken to respond to a request, or any other type and form of metric. For example, in one embodiment, an appliance 200 may monitor CPU usage by a machine or server), the request issued by one of the plurality of other services ([0224], appliance 200 may comprise an adaptive application provisioning engine (AAPE) 708; and [0250], responsive to the monitored metric exceeding a threshold of a policy, the AAPE 708 may issue a request to one or more of the cloud service providers 716 to provision or deprovision one or more machines) in a cluster or cloud environment ([0241], a cloud computing environment provided by one or more servers, including virtual machines or physical machines), the cluster or cloud environment including the computing node ([0242], a computing device…may communicate with one or more cloud service providers 716 providing one or more virtual or physical machines 718 in a hypervisor farm 720. The hypervisor farm 720 may provide one or more virtual or physical machines to provide a plurality of instances of applications 714); 
	determine whether a performance of the service has met a criteria to stop the service ([0250], responsive to the monitored metric exceeding a threshold of a policy, the AAPE 708 may issue a request to one or more of the cloud service providers 716 to provision or deprovision one or more machines. At step 734, the cloud service providers 716 may accordingly provision/start or deprovision/shut down the machines in accordance with the request).

	Laribi fails to specifically teach respond to a request by … updating information about pending operations associated with the service based on the request; if the performance of the service has met the criteria: access the information about pending operations associated with the service to determine whether the service has any pending operations that have been requested by the plurality of other services in the cluster or cloud environment, wherein the determination is based at least on information stored in a data structure maintained by the host agent; and if it is determined that the service has no pending operations that have been requested by the plurality of other services in the cluster or cloud environment, stop the service.
	However, Donahue teaches, respond to a request ([0037], a trigger indicative of a request to evaluate the need to perform a scaling action is detected) by … updating information about pending operations associated with the service based on the request ([0013], cloud-based computing application executing on the instances 110, 120, and 130, for which the scaling decisions are being made, is referred to as the subject computing application; [0031], The application-level data reflects demand for a subject computing application executing on the virtual instance of a machine provided by a virtualization service; [0033], As explained above, application-level data comprises …an estimate of the number of pending requests directed to the subject computing application….The data collector 310 may also include a pending requests estimator 314 configured to estimate the number of pending requests directed to the subject computing application; and [0037], the scaling module 320 processes the application-level data collected by the data collector 310 and determines/selects a scaling action);
	if the performance of the service ([0013], cloud-based computing application executing on the instances 110, 120, and 130, for which the scaling decisions are being made, is referred to as the subject computing application) has met the criteria ([0032], The scaling module 320 may be configured to select a scaling action based on the application-level data provided by the data collector 310, and issue a request to the virtualization service to perform the scaling action with respect to the virtual instance of a machine. The scaling action may be terminating the virtual instance of the machine that hosts the auto-scaling component; and [0033], application-level data comprises latency of a recently processed request directed to the subject computing application and an estimate of the number of pending requests directed to the subject computing application): 
	access the information about pending operations associated with the service to determine whether the service has any pending operations ([0013], cloud-based computing application executing on the instances 110, 120, and 130, for which the scaling decisions are being made, is referred to as the subject computing application; and [0033], The data collector 310 may also include a pending requests estimator 314 configured to estimate the number of pending requests directed to the subject computing application) that have been requested by the plurality of other services in the cluster or cloud environment (Fig. 1, Blocks 110, 120, 142, and 144, [0004], FIG. 1 is a diagrammatic representation of virtual instances interacting with persistent queues;  [0013],The data collectors 124 and the scaling modules 122, which are shown in FIG. 1 as executing on instances 110, 120, and 130, interact through a set of persistent queues 142, 144, 146, and 148, and through a common database 150; [0015], The persistent queues 142, 144, 146, or 148 comprise one or more input queues and one or more output queues; [0016], the number of pending requests for the subject computing application may be estimated by examining the length of the relevant input queue or by examining a count of database records that match a certain query condition or some other computation. These database records that match a certain query condition may be termed, for the purposes of this description, request records; and [0033], application-level data comprises latency of a recently processed request directed to the subject computing application and an estimate of the number of pending requests directed to the subject computing application. The number of pending requests directed to the subject computing application may be based on a length of an input queue provided by the virtualization service. 
The number of pending requests directed to the subject computing application may also be determined based on a number of request records in a database provided by the virtualization service), wherein the determination is based at least on information stored in a data structure ([0015], the scaling modules 122 use the database 150 for synchronization and coordination. As the load with respect to the instances 110, 120, and 130 changes, it may be advantageous to run more or fewer instances. The approach described herein uses a distributed architecture, where the scaling module 122 executing in each virtual instance makes its own decisions about whether to terminate the instance that hosts it or to add one or more instances; and [0016], the number of pending requests for the subject computing application may be estimated by examining the length of the relevant input queue or by examining a count of database records that match a certain query condition or some other computation. These database records that match a certain query condition may be termed, for the purposes of this description, request records) maintained by the host agent ([0013], the persistent queues 142, 144, 146, and 148 and the database 150 are managed by the provider of cloud-based computing services (also referred to as the cloud provider) and are expected to have high reliability and availability); and 
	if it is determined that the service has no pending operations that have been requested by the plurality of other services in the cluster or cloud environment, stop the service ([0018], Based on the request latency information and the estimated value of the number of current pending requests, the scaling module 122 makes a scaling decision, which is either to make no change to the number of instances executing the subject computing application, [or] to reduce the number of instances executing the subject computing application by killing the current instance…After a decision has been made, the action is performed).

	Laribi and Donahue are analogous because they are each related to virtual machine management based on performance metrics. Laribi teaches a method of provisioning and deprovisioning virtual machines based on collected metrics (Abstract, systems and methods for adaptive application provisioning for cloud services. An appliance deployed in a network as a gateway may be able to transparently monitor application activity in a cloud computing environment provided by one or more servers, including servers executed by virtual machines, … the appliance may monitor one or more network metrics, including bandwidth usage, latency, congestion, or other issues; and/or may monitor application health or server or virtual machine statistics, including memory and processor usage, bandwidth usage, latency, or other metrics. Responsive to one or more metrics exceeding a threshold, the appliance may automatically provision or start, or deprovision or shut down, one or more virtual or physical machines from a cloud service provider). Donahue teaches scaling virtual machine instances based on monitored performance metrics. ([0011], In the case of a service that is invoked by clients over the Internet, a load balancer that fields and distributes incoming requests may be configured to monitor traffic and use this information to add or remove virtual instances hosting the service as the load changes; and [0018], The scaling module in each instance wakes up periodically and performs a scaling action. The scaling operation may be described as involving two parts: (1) making a "scaling decision," and (2) performing the "scaling actions," in one embodiment, the code for performing scaling operations, the executable code for the scaling module is stored in the database 150 and is loaded into the instance when the instance starts. The data collector 122 passes the request latency information and the "approximate pending" to the scaling module 122. 
Based on the request latency information and the estimated value of the number of current pending requests, the scaling module 122 makes a scaling decision, which is either to make no change to the number of instances executing the subject computing application, to reduce the number of instances executing the subject computing application by killing the current instance, or to increase the number of instances executing the subject computing application instances by cloning the current instance). It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Laribi with the deprovisioning and pending-task calculation mechanisms taught by Donahue in order to provision and deprovision virtual machines based on performance metrics. Therefore, it would have been obvious to combine the teachings of Laribi and Donahue. 
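
For illustration only, the sequence of limitations addressed above for claim 1 (update pending-operation information on each request, test a stop criteria, and stop the service only when no operations remain pending) can be sketched as follows. This is a minimal sketch under stated assumptions; it is not taken from Laribi, Donahue, or the application. The class `HostAgent`, its methods, the `"start"`/`"complete"` action names, and the memory-utilization threshold are all hypothetical.

```python
class HostAgent:
    """Hypothetical host agent that tracks pending operations for one service."""

    def __init__(self, memory_threshold):
        self.memory_threshold = memory_threshold
        self.pending = set()   # identifiers of operations still pending
        self.running = False

    def start_service(self):
        self.running = True

    def handle_request(self, operation_id, action):
        # Update the pending-operation information based on the request.
        if action == "start":
            self.pending.add(operation_id)
        elif action == "complete":
            self.pending.discard(operation_id)

    def maybe_stop(self, memory_utilization):
        # Criteria not met: leave the service running.
        if memory_utilization <= self.memory_threshold:
            return False
        # Criteria met, but operations are still pending: do not stop.
        if self.pending:
            return False
        self.running = False
        return True


agent = HostAgent(memory_threshold=0.8)
agent.start_service()
agent.handle_request("op-1", "start")
print(agent.maybe_stop(0.9))   # criteria met, one operation pending
agent.handle_request("op-1", "complete")
print(agent.maybe_stop(0.9))   # criteria met, nothing pending
```

The sketch keeps the pending-operation record inside the agent object, which is one possible reading of a data structure maintained by the host agent; other arrangements are equally consistent with the claim language.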

As per claim 2, Laribi teaches, wherein the criteria to stop the service comprises at least one of: 
	a memory utilization of the service has exceeded a memory threshold ([0241], the appliance may monitor one or more network metrics, including bandwidth usage, latency, congestion, or other issues; and/or may monitor application health or server or machine statistics, including memory and processor usage, bandwidth usage, latency, or other metrics. Responsive to one or more metrics exceeding a threshold, the appliance may automatically provision or start, or deprovision or shut down, one or more virtual machines and/or physical machines from a cloud service provider, and may provide configuration information to the provisioned machines as needed); and 
	a service time of the service has exceeded a service threshold.

As per claim 3, Laribi teaches, wherein the hypervisor is configured to after starting the service ([0161], the management component 404a identifies a computing device 100b on which to execute a requested virtual machine 406d and instructs the hypervisor 401b on the identified computing device 100b to execute the identified virtual machine), spawn a process to determine whether the performance of the service has met the criteria to stop the service ([0163], a diagram of an embodiment of a virtual appliance 450 operating on a hypervisor 401 of a server 106 is depicted. As with the appliance 200 of FIGS. 2A and 2B, the virtual appliance 450 may provide functionality for availability, performance, offload and security. For availability, the virtual appliance may perform load balancing between layers 4 and 7 of the network and may also perform intelligent service health monitoring; and [0244], The AAPE may monitor status and health of a cloud application or cloud service 710 and may automatically generate requests to provision or de-provision additional virtual machines or start or shutdown additional physical machines 718, as well as providing configuration details to newly provisioned machines).

As per claim 4, Laribi teaches, wherein the spawned process is configured to:
	determine that the performance of the service has met the criteria to stop the service if a service time associated with the service has exceeded a threshold ([0241], the appliance may monitor one or more network metrics, …latency… Responsive to one or more metrics exceeding a threshold, the appliance may automatically provision or start, or deprovision or shut down, one or more virtual machines).

As per claim 5, Donahue teaches, wherein the spawned process is configured to determine the service time by executing a self-diagnosis service to determine the service time based on a response time for the self-diagnosis service ([0015], The approach described herein uses a distributed architecture, where the scaling module 122 executing in each virtual instance makes its own decisions about whether to terminate the instance that hosts it or to add one or more instances; and [0017], An instance periodically invokes the request latency detector of the data collector 124 and records the latency value (indicative of the latency of a recently processed request) returned by the request latency detector. The instance may be configured to maintain the history of request latency for the subject computing application between the times when the scaling module 122 wakes up and performs its task of making a scaling decision).

As per claim 6, Donahue teaches, wherein the host agent is further configured to: 	when determining that the service has no pending operations which have been requested by the other services in the cluster or cloud environment and before stopping the service, record in a log information about a reason to stop the service ([0013], an auto-scaling component comprises a data collector 124 and a scaling module 122 that together manage the auto-scaling of instances executing a cloud-based computing application (e.g., a content server or some other web application); and [0020], before attempting to alter the number of instances executing the subject computing application, the scaling module 122 locks a row associated with the subject computing application (or, e.g., with the data collector 124) in the database 150 and verifies that no other scaling module 122 executing on another instance has performed a modification within a certain time period that may be termed "courtesy interval." …if the scaling module 122 determines, based on the examination of information stored in the database 150, that another scaling module 122 has performed a scaling action with respect to instances executing the subject computing application during the courtesy interval, the current scaling action is discarded and the scaling module 122 releases the lock on the database row. If, on the other hand, no other scaling module 122 has performed a scaling action within the courtesy interval, the scaling module 122 attempts to perform the scaling action; and [0022], If the scaling action is to terminate an instance, the scaling module 122 writes to the database that the current instance is being terminated, releases the database lock, and then attempts to terminate the current instance. The specific operations to terminate an instance may be specific to the particular virtualization service that is providing the cloud. 
A so-called "grim reaper" service may be provided, that periodically scans the database and terminates any instances that are marked as terminated but are still executing).
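
For illustration only, the courtesy-interval check and the write-before-terminate step cited from Donahue [0020] and [0022] can be sketched as below. This is a minimal sketch under stated assumptions, not Donahue's implementation: the database row is replaced by an in-memory object, row locking is omitted, and the names `ScalingLog` and `try_terminate` are hypothetical.

```python
import time


class ScalingLog:
    """Hypothetical in-memory stand-in for the shared database row."""

    def __init__(self):
        self.last_action_time = None
        self.entries = []


def try_terminate(log, instance_id, courtesy_interval, now=None):
    """Return True if termination is recorded, False if the action is discarded."""
    now = time.time() if now is None else now
    # Discard the action if another scaling action occurred within the interval.
    if (log.last_action_time is not None
            and now - log.last_action_time < courtesy_interval):
        return False
    # Record that the instance is being terminated before terminating it.
    log.entries.append((now, instance_id, "terminating"))
    log.last_action_time = now
    return True


log = ScalingLog()
print(try_terminate(log, "instance-110", courtesy_interval=60, now=100.0))
print(try_terminate(log, "instance-120", courtesy_interval=60, now=130.0))
```

A real deployment would hold the database lock around the check and the write so that two scaling modules cannot both pass the interval test.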
	
As per claim 8, Laribi teaches, wherein the hypervisor is further configured to: 
	responsive to the request including a request to terminate the operation at the service ([0009], automatically responsive to the determination a request to the cloud service provider to one of provision or deprovision one or more instances of the application via the cloud service provider; and [0244], appliance 200 may comprise an adaptive application provisioning engine (AAPE) 708), update the data structure ([0103], the policy engine 236 may comprise any logic, rules, functions or operations to determine and provide access, control and management of objects, data or content being cached by the appliance 200 in addition to access, control and management of security, network traffic, network access, compression or any other function or operation performed by the appliance 200) by decreasing the number of instances associated with an entry for the operation to be terminated or removing the entry for the operation ([0007], The device is configured to determine that each metric of the one or more metrics exceeds each metric's threshold identified by the policy and to transmit automatically responsive to the determination a request to the cloud service provider to one of provision or deprovision one or more instances of the application via the cloud service provider in accordance with the policy; [0241], Responsive to one or more metrics exceeding a threshold, the appliance may automatically provision or start, or deprovision or shut down, one or more virtual machines and/or physical machines from a cloud service provider, and may provide configuration information to the provisioned machines as needed; and [0253], An autoscale action may specify whether to scale up or down, a number of instances to scale up or down and which cloud service provider to provision/deprovisioning for scaling up or down); and terminate the operation ([0241], Responsive to one or more metrics exceeding a threshold, the appliance may automatically … deprovision or shut down, one or more virtual machines and/or physical machines from a cloud service provider).
		
As per claim 9, Donahue teaches, wherein the hypervisor is configured to determine whether the service has any pending operations by: 
	determining that the service has no pending operations if a number of entries for the pending operations is zero ([0033], The number of pending requests directed to the subject computing application may be based on a length of an input queue provided by the virtualization service. The number of pending requests directed to the subject computing application may also be determined based on a number of request records in a database provided by the virtualization service); 
	otherwise, determining that the service has at least a pending critical operation ([0033], The number of pending requests directed to the subject computing application may be based on a length of an input queue provided by the virtualization service. The number of pending requests directed to the subject computing application may also be determined based on a number of request records in a database provided by the virtualization service).

As per claim 10, Laribi teaches, wherein the table is a hash table ([0103], Policy engine 236, in some embodiments, also has access to memory to support data structures such as lookup tables or hash tables to enable user-selected caching policy decisions. In other embodiments, the policy engine 236 may comprise any logic, rules, functions or operations to determine and provide access, control and management of objects, data or content being cached by the appliance 200 in addition to access, control and management of security, network traffic, network access, compression or any other function or operation performed by the appliance 200).
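
For illustration only, the table behavior recited across claims 8 through 10 (decreasing an instance count or removing an entry on termination, and treating zero entries as no pending operations) can be sketched with a hash table as below. This is a minimal sketch under stated assumptions and is not drawn from either reference; the function names and the use of a Python `dict` as the hash table are hypothetical choices.

```python
def start_operation(table, operation):
    # Add an entry, or increment the instance count of an existing entry.
    table[operation] = table.get(operation, 0) + 1


def terminate_operation(table, operation):
    # Decrement the instance count; remove the entry when it reaches zero.
    if operation not in table:
        return
    table[operation] -= 1
    if table[operation] == 0:
        del table[operation]


def has_pending_operations(table):
    # Zero entries means the service has no pending operations.
    return len(table) > 0


pending = {}  # hash table: operation name -> number of pending instances
start_operation(pending, "snapshot")
start_operation(pending, "snapshot")
terminate_operation(pending, "snapshot")
print(pending, has_pending_operations(pending))
terminate_operation(pending, "snapshot")
print(pending, has_pending_operations(pending))
```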

As per claim 11, this is the “method claim” corresponding to claim 1 and is rejected for the same reasons. The same motivation used in the rejection of claim 1 is applicable to the instant claim.
As per claim 12, this claim is similar to claim 2 and is rejected for the same reasons.

As per claim 13, Laribi teaches, wherein determining whether the performance of the service has met the criteria comprises: 
	by the host, after starting the service, spawning a process configured to: 
	determine that the performance of the service has met the criteria to stop the service if a memory utilization of the service has exceeded a threshold ([0163], As with the appliance 200 of FIGS. 2A and 2B, the virtual appliance 450 may provide functionality for availability, performance, offload and security. For availability, the virtual appliance may perform load balancing between layers 4 and 7 of the network and may also perform intelligent service health monitoring; and [0241], the appliance may monitor one or more network metrics, including bandwidth usage, latency, congestion, or other issues; and/or may monitor application health or server or machine statistics, including memory and processor usage, bandwidth usage, latency, or other metrics. Responsive to one or more metrics exceeding a threshold, the appliance may automatically provision or start, or deprovision or shut down, one or more virtual machines and/or physical machines from a cloud service provider, and may provide configuration information to the provisioned machines as needed).

As per claim 14, this claim is similar to claim 4 and is rejected for the same reasons.
As per claim 15, this claim is similar to claim 5 and is rejected for the same reasons. The same motivation used in the rejection of claim 5 is applicable to the instant claim.
As per claim 16, this claim is similar to claim 6 and is rejected for the same reasons.

As per claim 17, Donahue teaches, further comprising, by the hypervisor: 
	responsive to the request being a request to start an operation at the service, updating at the data structure to include the operation or to increment a number of instances associated with an entry for the operation ([0023], scaling module 122 generates a request id, writes the request together with the "create" request id in the database 150, attempts to execute the "create" request, and writes information about the resulting new instance into the database 150), and starting the operation  ([0023], A so-called "birth helper" service may be provided, that periodically scans the database 150 to detect any "create" requests that might not have been completed and attempts to complete them by resubmitting them and updating the instance information when they eventually complete), the data structure including information about pending operations associated with the service ([0016], the number of pending requests for the subject computing application may be estimated by examining…examining a count of database records that match a certain query condition …These database records that match a certain query condition may be termed, for the purposes of this description, request records).

As per claim 18, this claim is similar to claim 8 and is rejected for the same reasons.
As per claim 19, this claim is similar to claim 9 and is rejected for the same reasons.
As per claim 20, this claim is similar to claim 10 and is rejected for the same reasons.

As per claim 21, Laribi teaches the invention substantially as claimed including a system comprising: 
	a computing node including a processor, a hypervisor ([0147], a computing device 100 includes a hypervisor layer, a virtualization layer, and a hardware layer. The hypervisor layer includes a hypervisor 401 (also referred to as a virtualization manager) that allocates and manages access to a number of physical resources in the hardware layer (e.g., the processor(s) 421, and disk(s) 428) by at least one virtual machine executing in the virtualization layer), and a host agent in communication with the hypervisor ([0147], A virtual machine 406 may include a control operating system 405 in communication with the hypervisor 401 and used to execute applications for managing and configuring other virtual machines on the computing device 100), wherein the host agent configured to: 
	start a service at the computing node ([0153], embodiment, a guest operating system 410 communicates with the control operating system 405 via the hypervisor 401 in order to request access to a disk or a network; [0155], the control operating system 405 includes a tools stack 404. In another embodiment, a tools stack 404 provides functionality for interacting with the hypervisor 401, communicating with other control operating systems 405 (for example, on a second computing device 100b), or managing virtual machines 406b, 406c on the computing device 100; and [0161], a management operating system 405a, which may be referred to as a control operating system 405a, includes the management component…the management component 404a identifies a computing device 100b on which to execute a requested virtual machine 406d and instructs the hypervisor 401b on the identified computing device 100b to execute the identified virtual machine; such a management component may be referred to as a pool management component); 
	respond to a request associated with an operation at the service ([0147], A virtual machine 406 may include a control operating system 405 in communication with the hypervisor 401 and used to execute applications for managing and configuring other virtual machines on the computing device 100; and [0157], the guest operating system 410, in conjunction with the virtual machine on which it executes, forms a paravirtualized virtual machine…In still another embodiment, the paravirtualized machine includes the network back-end driver and the block back-end driver included in a control operating system 405, as described above; Examiner Note: The guest virtual machine can be a control virtual machine that manages other virtual machines), the request issued by one of the plurality of other services in a cluster or cloud environment ([0152], the control operating system 405 may execute an administrative application, such as an application including a user interface providing administrators with access to functionality for managing the execution of a virtual machine, including functionality for executing a virtual machine, terminating an execution of a virtual machine, or identifying a type of physical resource for allocation to the virtual machine. In another embodiment, the hypervisor 401 executes the control operating system 405 within a virtual machine 406 created by the hypervisor 401. In still another embodiment, the control operating system 405 executes in a virtual machine 406 that is authorized to directly access physical resources on the computing device 100. In some embodiments, a control operating system 405a on a computing device 100a may exchange data with a control operating system 405b on a computing device 100b, via communications between a hypervisor 401a and a hypervisor 401b. In this way, one or more computing devices 100 may exchange data with one or more of the other computing devices 100 regarding processors and other physical resources available in a pool of resources), the cluster or cloud environment including the computing node ([0010], the device is configured to establish one or more instances of the plurality of instances of the application as a virtual or physical machine executing on one or more servers of the cloud service provider. In some embodiments, the policy specifies to scale up or scale down the number of instances of the application responsive to each metric of the one or more metrics exceeding each metric's threshold), by updating information about pending operations associated with the service based on the request ([0107], packet engine 240 may comprise a buffer for queuing one or more network packets during processing, such as for receipt of a network packet or transmission of a network packet… packet engine 240 is in communication with one or more network stacks 267 to send and receive network packets via network ports 266; [0189], in one embodiment the packet engine(s) 548A-N can comprise any portion of the appliance 200 described herein, such as any portion of the appliance described in FIGS. 2A and 2B. The packet engine(s) 548A-N can, in some embodiments, comprise any of the following elements: the packet engine 240, a network stack 267…and any other software or hardware element able to receive data packets from one of either the memory bus 556 or the one of more cores 505A-N. In some embodiments, the packet engine(s) 548A-N can comprise one or more vServers 275A-N, or any portion thereof; and [0190], the packet engine 548 may choose not to be associated with a particular entity such that the packet engine 548 can process and otherwise operate on any data packets not generated by that entity or destined for that entity);
	spawn a process associated with the service ([0163], As with the appliance 200 of FIGS. 2A and 2B, the virtual appliance 450 may provide functionality for availability, performance, offload and security. For availability, the virtual appliance may perform load balancing between layers 4 and 7 of the network and may also perform intelligent service health monitoring; and [0164], the virtual appliance may be provided in the form of an installation package to install on a computing device), the spawned process configured to: 	
	determine a memory utilization of the service ([0241], the appliance may monitor one or more network metrics, including bandwidth usage, latency, congestion, or other issues; and/or may monitor application health or server or machine statistics, including memory and processor usage, bandwidth usage, latency, or other metrics); and
	determine whether the memory utilization of the service has exceeded a threshold to stop the service ([0163], As with the appliance 200 of FIGS. 2A and 2B, the virtual appliance 450 may provide functionality for availability, performance, offload and security. For availability, the virtual appliance may perform load balancing between layers 4 and 7 of the network and may also perform intelligent service health monitoring; and [0241], Responsive to one or more metrics exceeding a threshold, the appliance may automatically provision or start, or deprovision or shut down, one or more virtual machines and/or physical machines from a cloud service provider, and may provide configuration information to the provisioned machines as needed).


	Laribi fails to specifically teach: if the memory utilization of the service has exceeded the threshold, access the information about pending operations associated with the service to determine whether the service has any pending operations requested by the plurality of other services in the cluster or cloud environment; and if it is determined that the service has no pending operations requested by the plurality of other services in the cluster or cloud environment, stop the service.
	However, Donahue teaches, if the memory utilization of the service has exceeded the threshold:
	access the information about pending operations associated with the service to determine whether the service has any pending operations ([0018], The scaling module in each instance wakes up periodically and performs a scaling action. The scaling operation may be described as involving two parts: (1) making a "scaling decision," and (2) performing the "scaling actions,"…The data collector 122 passes the request latency information and the "approximate pending" to the scaling module 122. Based on the request latency information and the estimated value of the number of current pending requests, the scaling module 122 makes a scaling decision, which is either to make no change to the number of instances executing the subject computing application, [or] to reduce the number of instances executing the subject computing application by killing the current instance; and [0033], The data collector 310 may include a latency module 312 configured to determine latency of a recently processed request directed to the subject computing application. The data collector 310 may also include a pending requests estimator 314 configured to estimate the number of pending requests directed to the subject computing application) requested by the plurality of other services in the cluster or cloud environment (Fig. 1, Blocks 110, 120, 130, 142, 144, 146, and 148; [0013], cloud-based computing application executing on the instances 110, 120, and 130, for which the scaling decisions are being made, is referred to as the subject computing application. The data collectors 124 and the scaling modules 122, which are shown in FIG. 1 as executing on instances 110, 120, and 130, interact through a set of persistent queues 142, 144, 146, and 148, and through a common database 150; [0023], scaling module 122 generates a request id, writes the request together with the "create" request id in the database 150, attempts to execute the "create" request, and writes information about the resulting new instance into the database 150. A so-called "birth helper" service may be provided, that periodically scans the database 150 to detect any "create" requests that might not have been completed and attempts to complete them by resubmitting them and updating the instance information when they eventually complete. The "birth helper" service may be implemented as part of the subject computing application or, alternatively, it can be provided by the cloud provider or by any party; and [0035], The scaling module 230 may be configured to lock a database row associated with the subject computing application, and, based on whether a scaling action with respect to the subject computing application has been performed during a predetermined time interval, proceed with the scaling action or cancel the scaling action); and 
	if it is determined that the service has no pending operations requested by the plurality of other services in the cluster or cloud environment, stop the service ([0018], Based on the request latency information and the estimated value of the number of current pending requests, the scaling module 122 makes a scaling decision, which is either to make no change to the number of instances executing the subject computing application, [or] to reduce the number of instances executing the subject computing application by killing the current instance…After a decision has been made, the action is performed).
	The same motivation used in the rejection of claim 1 is applicable to the instant claim.
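
For illustration only, the conditional stop logic recited in claim 21 can be sketched as follows. This sketch is not taken from Laribi, Donahue, or the application; the class name HostAgent, the megabyte units, and the dictionary used for the pending-operation information are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HostAgent:
    """Hypothetical host agent; all names here are illustrative assumptions."""
    memory_threshold_mb: float
    # Assumed data structure: operation name -> number of pending instances.
    pending: dict = field(default_factory=dict)
    running: bool = False

    def start_service(self) -> None:
        self.running = True

    def should_stop(self, memory_mb: float) -> bool:
        # Step 1: the memory threshold must be exceeded before anything else.
        if memory_mb <= self.memory_threshold_mb:
            return False
        # Step 2: only then consult the pending-operation information.
        return not any(count > 0 for count in self.pending.values())

    def monitor(self, memory_mb: float) -> bool:
        """Stop the service when both conditions hold; return the running state."""
        if self.running and self.should_stop(memory_mb):
            self.running = False
        return self.running
```

Under these assumptions, a service over the threshold keeps running while any operation requested by another service is still pending, and is stopped only once the pending count reaches zero.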

As per claim 22, this claim is similar to claim 17 and is rejected for the same reasons.
As per claim 23, this claim is similar to claim 9 and is rejected for the same reasons.

As per claim 24, Laribi teaches, wherein the spawned process is configured to determine that the performance of the service has met the criteria to stop the service if a memory utilization of the service has exceeded a threshold ([0241], the appliance may monitor one or more network metrics, including bandwidth usage, …and/or may monitor application health or server or machine statistics, including memory …usage, bandwidth usage, latency, or other metrics…Responsive to one or more metrics exceeding a threshold, the appliance may automatically provision or start, or deprovision or shut down, one or more virtual machines and/or physical machines from a cloud service provider, and may provide configuration information to the provisioned machines as needed).

As per claim 25, Laribi teaches wherein the hypervisor is further configured to:
	responsive to the request including a request to begin the operation at the service, update the information about pending operations associated with the service by updating a table to include the operation or to increment a number of instances associated with an entry for the operation and start the operation ([0250], responsive to the monitored metric exceeding a threshold of a policy, the AAPE 708 may issue a request to one or more of the cloud service providers 716 to provision…one or more machines…the cloud service providers 716 may accordingly provision/start…the machines in accordance with the request).

As per claim 26, Donahue teaches, wherein the hypervisor includes a data structure containing the information about the pending operations associated with the service and requested by the plurality of other services in the cluster or cloud environment, wherein the host agent is configured to: 
	responsive to the request being a request to terminate the operation at the service (Abstract, The scaling module is to select a scaling action based on the application-level data and issue a request to perform the scaling action with respect to the virtual instance of a machine), update the data structure by decreasing the number of instances associated with an entry for the operation or removing the entry for the operation, and stop the operation ([0022], If the scaling action is to terminate an instance, the scaling module 122 writes to the database that the current instance is being terminated, releases the database lock, and then attempts to terminate the current instance….A so-called "grim reaper" service may be provided, that periodically scans the database and terminates any instances that are marked as terminated but are still executing. The "grim reaper" service may be implemented as part of the subject computing application or, alternatively, it can be provided by the cloud provider or by any party).
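
For illustration only, the table updates recited in claims 25 and 26 (add or increment an entry when an operation begins; decrement or remove the entry when it terminates) can be sketched as follows. The class PendingOperations and its method names are hypothetical and do not appear in Laribi, Donahue, or the application.

```python
from collections import Counter


class PendingOperations:
    """Hypothetical table of pending operations keyed by operation name."""

    def __init__(self) -> None:
        self._table = Counter()

    def begin(self, operation: str) -> None:
        # Add the entry, or increment its instance count if it already exists.
        self._table[operation] += 1

    def terminate(self, operation: str) -> None:
        # Remove the entry when the last instance ends; otherwise decrement.
        if self._table[operation] <= 1:
            self._table.pop(operation, None)
        else:
            self._table[operation] -= 1

    def has_pending(self) -> bool:
        return bool(self._table)

    def snapshot(self) -> dict:
        return dict(self._table)
```

Removing an entry when its count reaches zero, rather than leaving a zero-valued row, is a design choice in this sketch: it lets an empty table stand for "no pending operations" without scanning the counts.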

As per claim 27, this is the “non-transitory computer readable medium” corresponding to claim 1 and is rejected for the same reasons. The same motivation used in the rejection of claim 1 is applicable to the instant claim.
As per claim 28, this claim is similar to claim 2 and is rejected for the same reasons.

As per claim 29, Laribi teaches, wherein instructions for determining whether the performance of the service has met the criteria further comprise instructions for: 
	after starting the service, spawning a process configured to:
		if a memory utilization of the service has exceeded a threshold, determine that the performance of the service has met the criteria to stop the service ([0241], the appliance may monitor one or more network metrics, including bandwidth usage, …and/or may monitor application health or server or machine statistics, including memory …usage, bandwidth usage, latency, or other metrics…Responsive to one or more metrics exceeding a threshold, the appliance may automatically provision or start, or deprovision or shut down, one or more virtual machines and/or physical machines from a cloud service provider, and may provide configuration information to the provisioned machines as needed).

As per claim 30, Donahue teaches, wherein instructions for determining whether the performance of the service has met the criteria to stop the service further comprise instructions for: 
	after starting the service, spawning a process ([0017], An instance periodically invokes the request latency detector of the data collector 124 and records the latency value (indicative of the latency of a recently processed request) returned by the request latency detector. The instance may be configured to maintain the history of request latency for the subject computing application between the times when the scaling module 122 wakes up and performs its task of making a scaling decision.) configured to: 
	determine that the performance of the service has met the criteria if a service time associated with the service has exceeded a threshold ([0018], The scaling module in each instance wakes up periodically and performs a scaling action. The scaling operation may be described as involving two parts: (1) making a "scaling decision," and (2) performing the "scaling actions," …The data collector 122 passes the request latency information and the "approximate pending" to the scaling module 122. Based on the request latency information and the estimated value of the number of current pending requests, the scaling module 122 makes a scaling decision, which is either to make no change to the number of instances executing the subject computing application, to reduce the number of instances executing the subject computing application by killing the current instance, or to increase the number of instances executing the subject computing application instances by cloning the current instance. After a decision has been made, the action is performed).
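
For illustration only, the three-way scaling decision described in Donahue [0018] (no change, kill the current instance, or clone the current instance, based on request latency and the estimated number of pending requests) can be sketched as follows. The quoted passage does not state how the two inputs are combined, so the averaging step and both threshold values below are assumptions, and the names ScalingDecision and decide are hypothetical.

```python
from enum import Enum
from statistics import mean


class ScalingDecision(Enum):
    NO_CHANGE = "no change"
    KILL_CURRENT = "kill current instance"
    CLONE_CURRENT = "clone current instance"


def decide(latencies_s, approx_pending, high_latency_s=2.0, low_latency_s=0.5):
    """Pick a scaling decision from recorded latencies and a pending estimate.

    The thresholds and the use of a simple average are assumptions for
    this sketch; they are not values taken from Donahue.
    """
    if not latencies_s:
        return ScalingDecision.NO_CHANGE
    average = mean(latencies_s)
    if average > high_latency_s:
        # Service time has exceeded the (assumed) upper threshold.
        return ScalingDecision.CLONE_CURRENT
    if average < low_latency_s and approx_pending == 0:
        # Lightly loaded and nothing pending: the instance can be removed.
        return ScalingDecision.KILL_CURRENT
    return ScalingDecision.NO_CHANGE
```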

As per claim 31, Donahue teaches, wherein starting the service at the computing node comprises starting a first instance of the service at the computing node, the host agent further configured to: 
	update the information about pending operations associated with the service ([0022], If the scaling action is to terminate an instance, the scaling module 122 writes to the database that the current instance is being terminated) responsive to a second request from another of the plurality of services in the cluster or cloud environment, to begin or terminate an operation at a second instance of the service at a second computing node in the cluster or cloud environment ([0018], The scaling module in each instance wakes up periodically and performs a scaling action; and [0022], A so-called "grim reaper" service may be provided, that periodically scans the database and terminates any instances that are marked as terminated but are still executing. The "grim reaper" service may be implemented as part of the subject computing application or, alternatively, it can be provided by the cloud provider or by any party).

As per claim 32, Laribi teaches, wherein the service ([0246], one or more physical or virtual machines 718 or other virtual or physical computing devices including virtual servers or routers …for execution of one or more instances of application 714) and the plurality of other services in the cluster or cloud environment are each configured to provide services to user virtual machines in the cluster or cloud environment ([0091], a first computing device 100a executes an application on behalf of a user of a client computing device 100b. In other embodiments, a computing device 100a executes a virtual machine, which provides an execution session within which applications execute on behalf of a user or a client computing devices 100b. In one of these embodiments, the execution session is a hosted desktop session. In another of these embodiments, the computing device 100 executes a terminal services session. The terminal services session may provide a hosted desktop environment. In still another of these embodiments, the execution session provides access to a computing environment, which may comprise one or more of: an application, a plurality of applications, a desktop application, and a desktop session in which one or more applications may execute).
As per claim 33, this claim is similar to claim 31 and is rejected for the same reasons.
As per claim 34, this claim is similar to claim 32 and is rejected for the same reasons.
As per claim 35, this claim is similar to claim 31 and is rejected for the same reasons.
As per claim 36, this claim is similar to claim 32 and is rejected for the same reasons.
As per claim 37, this claim is similar to claim 31 and is rejected for the same reasons.
As per claim 38, this claim is similar to claim 32 and is rejected for the same reasons.

As per claim 39, Donahue teaches, wherein the data structure is a table ([0013], data collectors 124 and the scaling modules 122, which are shown in FIG. 1 as executing on instances 110, 120, and 130, interact through a set of persistent queues 142, 144, 146, and 148, and through a common database 150; and [0015], database 150 may be utilized to perform coordination with the scaling modules 122 executing in the other virtual instances. As mentioned above, the persistent queues 142, 144, 146, and 148, as well as the database 150, may be managed by the provider of cloud-based computing services).
As per claim 40, this claim is similar to claim 39 and is rejected for the same reasons.
As per claim 41, this claim is similar to claim 39 and is rejected for the same reasons.
As per claim 42, this claim is similar to claim 39 and is rejected for the same reasons.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MELISSA A HEADLY, whose telephone number is (571)272-1972. The examiner can normally be reached Monday-Friday, 9:00 am-5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock, can be reached at 571-272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LEWIS A BULLOCK JR/ Supervisory Patent Examiner, Art Unit 2199
MELISSA A. HEADLY
Examiner
Art Unit 2199