Notice of Pre-AIA  or AIA  Status
The present application is being examined under the pre-AIA  first to invent provisions. 
NON-FINAL ACTION
This is a Non-Final Office Action responsive to the Response filed by the Patent Owner on 2/26/2021.  The instant application is a reissue divisional application, assigned Serial Number: 16/544,359 (hereinafter the ’359 application), that claims priority to reissue application 15/232,266 (hereinafter the ’266 application).  The ’266 application is a reissue of US Application No. 13/436,271 (hereinafter the ‘271 Application), filed August 12, 2014, which has been granted as US Patent Number 8,806,018 (hereinafter the ‘018 Patent) granted August 12, 2014.
For reissue applications filed before September 16, 2012, all references to 35 U.S.C. 251 and 37 CFR 1.172, 1.175, and 3.73 are to the law and rules in effect on September 15, 2012.  Where specifically designated, these are “pre-AIA ” provisions.  
For reissue applications filed on or after September 16, 2012, all references to 35 U.S.C. 251 and 37 CFR 1.172, 1.175, and 3.73 are to the current provisions.  

Obligations
Applicant is reminded of the continuing obligation under 37 CFR 1.178(b), to timely apprise the Office of any prior or concurrent proceeding in which Patent No. 8,806,018 is or was involved. These proceedings would include interferences, reissues, reexaminations, and litigation. 
Applicant is further reminded of the continuing obligation under 37 CFR 1.56, to 
These obligations rest with each individual associated with the filing and prosecution of this application for reissue. See also MPEP §§ 1404, 1442.01 and 1442.04.
Applicant is notified that any subsequent amendment to the specification and/or claims must comply with 37 CFR 1.173(b). 

Prosecution History
During initial examination, the claims were allowed after the Patent Owner (then applicant) added the content below to the independent claims.  The Examiner noted that the amended claims overcame the prior art of record:  ISCI (U.S. Publication No. 2011/0302578), Breitgand (U.S. Publication No. 2011/0264805) and Zhang (US Patent No. 8,250,198, also previously published as U.S. Publication No. 2011/0040876).  The limitations below were added to the independent claims by way of the 2/4/2014 amendment agreed to by the Patent Owner, and a notice of allowance issued on the amended claims on 4/4/2014.
Added limitations:
1. 
…
determining a minimum number, k.sub.reqd, of the plurality of computing resources to be in the lower-setup-cost state and overriding ones of the state-change delay timers as needed to keep the minimum number of the plurality of computing resources in the lower-setup-cost state;
wherein:
said determining includes determining k.sub.reqd as a function of a total number of requests currently distributed among the plurality of computing resources and further as a function of a packing factor and a number, k, of the plurality of computing resources currently in the lower-setup-cost state.

11.
…
a load balancer comprising at least one processor…
and a robustness controller designed and configured to determine a minimum number, k.sub.reqd, of said plurality of computing resources to be in said lower-setup-cost state and override ones of said state-change delay timers as needed to keep the minimum number of said plurality of computing resources in said lower-setup-cost state;
wherein:
said robustness controller is designed and configured to determine k.sub.reqd as a function of a total number of requests currently distributed among said plurality of computing resources and further as a function of a packing factor and a number, k, of said plurality of computing resources currently in said lower-setup-cost state.

12.
…
wherein each of the plurality of computing resources has a state-change delay timer and said machine-executable instructions further includes machine-executable instructions for determining a minimum number, k.sub.reqd, of the plurality of computing resources to be in the lower-setup-cost state and overriding ones of the state-change delay timers as needed to keep the minimum number of the plurality of computing resources in the lower-setup-cost state;
and wherein:
said machine-executable instructions further includes machine-executable instructions for determining k.sub.reqd as a function of a total number of requests currently distributed among the plurality of computing resources and further as a function of a packing factor and a number, k, of the plurality of computing resources currently in the lower-setup-cost state.


From the Notice of Allowance:
Examiner finds ISCI (US Pub. 2011/0302578), Breitgand (US Pub. 2011/0264805) and Zhang (US Pat. 8,250,198 also previously published as US Pub. 2011/0040876) to be indicative of the art at time of filing. None of the three calculate transitioning the states based upon both a packing factor and a number of computing resources currently in the lower-setup-cost state, which is a requirement of all the independents. Without proof that the prior art considered these factors in the decision-making process, Examiner could not reject the claims. The remaining dependent claims further limit the invention. All claims are novel and non-obvious.

Reissue Applications
The amendment filed 2/26/2021 presents amendments to claims 10, 21, 22, and 24 that do not comply with 37 CFR 1.173, which sets forth the manner of making amendments in reissue applications. Specifically, the (1) amended claims are not 
 Any changes relative to the patent being reissued which are made to the specification, including the claims, upon filing, or by an amendment paper in the reissue application, must include the following markings: (1) The matter to be omitted by reissue must be enclosed in brackets; and (2) The matter to be added by reissue must be underlined…
 
	In this instance, all claims are new so they should be fully underlined (without “squiggly lines” for new additions) in every response.  A supplemental paper correctly amending the reissue application is required.  
Further see MPEP 1453.II.B:  
For each new claim added to the reissue by the amendment being submitted (the current amendment), the entire text of the added claim must be presented completely underlined; 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

Claims 10-14, 18-22, and 24-25 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Hanson et al., Publication Number:  2009/0254660, hereinafter Hanson, Liu et al., Publication Number:  2009/0222562, hereinafter Liu, Nandagopal et al., Publication Number:  2010/0217866, hereinafter Nandagopal, and Begun et al., Publication Number:  2003/0055969, hereinafter Begun.

With regard to claim 10, which teaches a data center comprising:  a plurality (N0) of request-processing servers for processing incoming job requests received by the data center, wherein: each of the plurality of request-processing servers has at least a first operational state and a second operational state; each of the request-processing servers cannot be in both the first and second operational states at the same time; when in the first operational state, a request-processing server is available for processing an incoming job request; and when in the second operational state, a request-processor server is not available for processing an incoming job request; Hanson teaches a datacenter with the ability to control the state of each of a plurality of servers, where, based upon workload and power constraints, the system transitions servers between a “powered on” state and a “reduced power” (or idle) state when it is deemed that fewer servers are necessary, and between the “reduced power” (or idle) state and the “powered on” state when it is determined that workload exceeds current capacity (see paragraphs 2, 18, 20, 21, and 28).  A “powered on” state has a quick response time and a lower setup cost, while a “reduced power” state has an increased response time and a higher setup cost.  
With regard to claim 10, which teaches a request distribution system in 
With regard to load distribution, Liu is further shown to utilize a load dispatching component 108 that assigns user connection requests to servers turned on, where the dispatching component is capable of both load balancing and load skewing (see paragraph 33).  Here active servers are indexed between 1 and K(t), where K(t) is the number of active servers (see paragraph 51).  Load balancing component 502 provides a dispatching mechanism that attempts to equalize numbers of connections on servers 
Nandagopal teaches a load balancing system wherein servers are evaluated for their aggregate load as well as the maximum acceptable load that each server can handle before there is a degradation in response time (see paragraphs 25-28), but further teaches an option to send the incoming request to the least-server-id, where each server is assigned a numeric id, and the least server id corresponds to the lowest numeric id among all servers present (see paragraph 29).  It would have been obvious to one of ordinary skill in the art, having the teachings of Liu, Hanson, and Nandagopal before them at the time the invention was made, to include the load balancing scheme used by Nandagopal, in which servers are loaded according to the lowest ID’d server, in the method and systems of Hanson and Liu.  One would have been motivated to make such a combination because this provides assurance that the lowest ID’d servers will be filled first prior to utilizing other servers, thereby allowing 
Hanson, Liu, and Nandagopal teach switching between a lower and higher setup cost state but do not specifically teach the switch being a function of a time-out of a delay.  Begun teaches a similar system for distributing tasks over servers in a load balancing system (see paragraphs 43 and 45), switching between a “reduced power state” and a “higher power state” based upon workload (see paragraph 440), while considering the effects of reboot time in restarting servers from a reduced power state (see paragraph 31), similar to that of Hanson, Liu, and Nandagopal, but further teaches switching between a lower and higher setup cost state based on a timer activated on idle (see paragraph 44).  It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson, Liu, Nandagopal, and Begun before them at the time of the invention, to include the idle-based timer to move unused or underused servers in a higher power / lower-startup-cost state into a ‘sleep’ / reduced-power / high-startup-cost state.  One would have been motivated to make such a combination because this enables underuse to be recognized and dealt with, which is both the goal and the eventual effect of Hanson/Liu/Nandagopal.  
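For illustration only, the idle-activated timer relied upon above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Begun or from the instant claims: the class name Server, its method names, the state labels, and the idle_timeout parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

POWERED_ON = "powered_on"        # lower setup cost, available for requests
REDUCED_POWER = "reduced_power"  # higher setup cost, not available


@dataclass
class Server:
    """Hypothetical server model used only for this illustration."""
    server_id: int
    state: str = POWERED_ON
    active_requests: int = 0
    idle_since: Optional[float] = 0.0  # time the server last became idle

    def start_request(self) -> None:
        self.active_requests += 1
        self.idle_since = None  # a busy server has no running idle timer

    def finish_request(self, now: float) -> None:
        self.active_requests -= 1
        if self.active_requests == 0:
            self.idle_since = now  # idle timer starts when the last request ends

    def tick(self, now: float, idle_timeout: float) -> None:
        """Move to the reduced-power state once the idle timer has expired."""
        if (self.state == POWERED_ON
                and self.idle_since is not None
                and now - self.idle_since >= idle_timeout):
            self.state = REDUCED_POWER
```

In this sketch a server that is processing a request is never moved to the reduced-power state; only a server that has remained idle for at least idle_timeout is transitioned.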

With regard to claim 11, which teaches wherein the first operational state has a lower set-up cost, Hanson teaches a datacenter with servers in either a “powered on” state or a “reduced power” (or idle) or sleep state, where a “powered on” state has a quick response time and a lower setup cost while a “reduced power” state has an increased response time and a higher setup cost (see paragraphs 2, 18, 20, 21, and 28).  

With regard to claim 12, which teaches wherein over the time each of the request-processing servers can transition between the first operational state and the second operational state, and vice versa, Hanson teaches the ability to control the state of each of a plurality of servers, where, based upon workload and power constraints, the system transitions servers between a “powered on” state and a “reduced power” (or idle) state when it is deemed that fewer servers are necessary, and between the “reduced power” (or idle) state and the “powered on” state when it is determined that workload exceeds current capacity (see paragraphs 2, 18, 20, 21, and 28).  

With regard to claim 13, which teaches wherein:  the first operational state comprises a powered-up state; and the second operational state comprises a sleep state, Hanson teaches a datacenter with servers in either a “powered on” state or a “reduced power” (or idle) or sleep state, where a “powered on” state has a quick response time and a lower setup cost while a “reduced power” state has an increased response time and a higher setup cost (see paragraphs 2, 18, 20, 21, and 28).  

With regard to claim 14, which teaches wherein request-processing servers in the second operational state comprise request-processing servers that need to be acquired to process incoming job requests, Hanson teaches the process of adding / acquiring servers to the active state in order to process incoming job requests (see paragraphs 20 and 21).  



With regard to claim 19, which teaches wherein the load balancer is programmed to:  determine a required number of request-processing servers that should be in first operational state based on total number of job requests currently being processed by the plurality of request-processing servers; and increase the number of request-processing servers in the first operational state when the required number is greater than the number of request-processing servers in the first operational state, Liu teaches a method and system for determining a minimum number of servers required to satisfy 

With regard to claim 20, which teaches wherein the load balancer controls the number of request-processing servers in the first operational state and the number of request-processing servers in the second operational state based on a total number of job requests currently being processed by the plurality of request-processing servers, Liu teaches a method and system for determining a minimum number of servers required to satisfy a level of service enabling the startup and/or shut down of servers (to a sleep mode) (see paragraphs 29-30), similar to that of Hanson and Begun, but further teaches a specific equation, K(t) = max{[L_tot(t) / L_max], [N_tot(t) / N_max]}, for calculating the minimum number of servers by taking the maximum of the total login rate (L_tot(t)) divided by the maximum login rate (L_max) and the total number of connections (N_tot(t)) divided by the maximum per server (N_max) (see paragraph 30), 

With regard to claim 21, which teaches a method of reducing power consumption by a data center that processes incoming job requests with a plurality of request-processing servers, while still meeting a specified response time service level for the data center, wherein each of the request-processing servers can be in either a first operational state or a second operational state at a given time, wherein when a request-processing server is in the first operational state it is available for processing a job request and when a request-processing server is in the second operational state it is unavailable processing a job request and wherein there are three subsets of request-processing servers at a given time, comprising:  a first subset of request-processing servers that are in the first operational state and processing at least one prior job request and no more than p prior job requests, where p is a predetermined, maximum number of job requests that one of the request processing servers can concurrently process while still meeting the specified response time service level;  a second subset of request-processing servers that are in the first operational state and not processing any prior job requests; and a third subset of request processing servers that are in the second operational state; Hanson teaches a method for reducing power consumption amongst a group of servers, namely to “give blanket priority to performance by consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines” (see paragraph 30).  
Hanson teaches multiple server states including a “powered on” processing state and multiple non-processing “reduced power” or idle states, where, based upon workload and power constraints, the system transitions servers between a “powered on” processing state and a “reduced power” / “powered off” / idle state when it is deemed that fewer servers are necessary, and between the “reduced power” / “powered off” / idle state and the “powered on” state when it is determined that workload exceeds current capacity (see paragraphs 2, 18, 20, 21, and 28).  A “powered on” state has a quick response time and a lower setup cost, while a “reduced power” state has an increased response time and a higher setup cost.  
With regard to claim 21, which teaches the method comprising:  receiving, by a request distribution system of the data center, a new incoming job request; distributing, by the request distribution system, the new incoming job request to a request processing server in the first subset of request-processing servers that is processing fewer than p job requests at the time of distribution and to a request processing server in the second subset if all of the request-processing servers in the first subset are processing p requests at the time of distribution, such that the job requests being processed by the data center are concentrated in the first subset of request-processing servers and such that the request processing servers in the second subset are free from processing the job requests; and processing, by the request-processing server in the first subset to which the new incoming job request is distributed, the new incoming job request, Hanson teaches distributing a stream of content between servers (see paragraph 24).  Hanson further teaches that “if the processing capacity exceeds the current workload, one or more servers may be vacated with the workload consolidated on remaining servers” (see paragraph 20).  Hanson teaches “The evaluation process (step 21) by the performance manager (110) can be based, in part, on information provided by the power manager (120) with regard to recommendations for releasing and/or obtaining servers under current operating conditions and workloads. For example, when the workload changes in such a way that one or more of the running physical servers may be vacated without compromising the performance objectives of the performance manager (110), the power manager (120) may provide a set of recommendations for which servers node in the current subset of running servers are most desirable to be released (vacated). The performance manager (110) can then choose, from among those recommendations, one or more servers to release control of to the, while meeting its performance goals. The power manager (120) may then power-off the vacated server(s).” (see paragraph 21)  Hanson teaches to “give blanket priority to performance by consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines” (see paragraph 30).
Liu teaches “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and/or shut down servers in the cluster 104 to maintain the determined number of servers. Pursuant to an illustration, the determined number of servers can be a minimum number of servers required to satisfy a level of quality of service for the Internet service.” (see paragraph 29)  It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson before him at the time the invention was made, to include the load balancing / load skewing specifics of Liu in the method and system of Hanson.  One would have been motivated to make such a combination because Liu puts forth a specific means / equation for accomplishing the priorities outlined in Hanson. 
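For illustration only, the distribution step recited in claim 21, combined with a lowest-id-first selection of the kind attributed to Nandagopal, can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce any algorithm from Hanson, Liu, or Nandagopal: the function name pick_server and the dictionary representation of server loads are hypothetical, and p is the per-server cap on concurrent job requests recited in the claim.

```python
from typing import Dict, Optional


def pick_server(loads: Dict[int, int], p: int) -> Optional[int]:
    """Return the id of the powered-on server that should receive a new request.

    loads maps server id -> number of requests currently being processed,
    for servers in the first operational state only (hypothetical layout).
    """
    # First subset: busy servers (at least one request) with room under the cap p.
    busy_with_room = [sid for sid, n in loads.items() if 0 < n < p]
    if busy_with_room:
        return min(busy_with_room)  # lowest id first
    # Second subset: powered-on servers that are not processing any request.
    idle = [sid for sid, n in loads.items() if n == 0]
    if idle:
        return min(idle)
    return None  # every powered-on server is already at the cap p
```

For example, with loads {1: 2, 2: 1, 3: 0} and p = 2, the sketch selects server 2, so that server 3 in the second subset remains free of job requests.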

Hanson, Liu, and Nandagopal teach switching between a lower and higher setup cost state but do not specifically teach the switch being a function of a time-out of a delay.  Begun teaches a similar system for distributing tasks over servers in a load balancing system (see paragraphs 43 and 45), switching between a “reduced power state” and a “higher power state” based upon workload (see paragraph 440), while considering the effects of reboot time in restarting servers from a reduced power state (see paragraph 31), similar to that of Hanson, Liu, and Nandagopal, but further teaches switching between a lower and higher setup cost state based on a timer activated on idle (see paragraph 44).  It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson, Liu, Nandagopal, and Begun before them at the time of 

With regard to claim 22, which teaches further comprising:  determining, by the request-distribution system, a required number of request-processing servers that should be in first operational state based on total number of job requests currently being processed by the plurality of request-processing servers; and transitioning one or more request-processing servers in the second operational state to the first operational state such that after the transition, the number of request-processing servers in the first operational state is not less than the required number, Hanson teaches “if the performance is degraded because the current workload exceeds the processing capacity of the set of servers under control of the performance manager (110), one or more additional servers may be needed to achieve an optimal placement or improved placement” (see paragraph 20).  Hanson teaches “The evaluation process (step 21) by the performance manager (110) can be based, in part, on information provided by the power manager (120) with regard to recommendations for releasing and/or obtaining servers under current operating conditions and workloads. For example, when the workload changes in such a way that one or more of the running physical servers may be vacated without compromising the performance objectives of the performance manager (110), the power manager (120) may provide a set of recommendations for which servers node in the current subset of running servers are most desirable to be released (vacated). The performance manager (110) can then choose, from among those recommendations, one or more servers to release control of to the, while meeting its performance goals. The power manager (120) may then power-off the vacated server(s).  
Similarly, when the workload changes in such a way the workload exceeds the total processing capacity of the current subset of running servers, and the performance objects of the performance manager (110) are not being met, the power manager (120) can provide a set recommendations for one or more additional servers to be given control to the performance manager (110) and added to the current subset of running servers, to thereby handle the increased workload.” (see paragraph 21)  
Liu teaches a method and system for determining a minimum number of servers required to satisfy a level of service enabling the startup and/or shut down of servers (to a sleep mode) (see paragraphs 29-30), similar to that of Hanson, but further teaches a specific equation, K(t) = max{[L_tot(t) / L_max], [N_tot(t) / N_max]}, for calculating the minimum number of servers by taking the maximum of the total login rate (L_tot(t)) divided by the maximum login rate (L_max) and the total number of connections (N_tot(t)) divided by the maximum per server (N_max) (see paragraph 30), where this is an equivalent to the equation provided in the application while dividing out / factoring out the current number in low-cost state / powered on.  The calculation of minimum servers required and distribution of Liu further takes into account past values of the number of active/on servers K(t) at a time (see paragraphs 51 and 59).  Liu further teaches “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and/or shut down servers in the cluster 104 to maintain the determined number of servers. Pursuant to an illustration, the determined number of servers can be a minimum number of servers required to satisfy a level of quality of service for the Internet service.” (see paragraph 29)
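For illustration only, the minimum-server calculation quoted from Liu and the transition step recited in claim 22 can be sketched as follows. This is a minimal sketch under stated assumptions: the bracketed terms of Liu's equation are assumed here to denote rounding up to the next whole server, and the function and parameter names are hypothetical rather than taken from Liu or from the instant claims.

```python
import math


def min_active_servers(l_tot: float, l_max: float,
                       n_tot: float, n_max: float) -> int:
    """K(t) = max{[L_tot/L_max], [N_tot/N_max]}, brackets assumed to be ceilings."""
    return max(math.ceil(l_tot / l_max), math.ceil(n_tot / n_max))


def servers_to_power_on(required: int, currently_on: int) -> int:
    """Number of servers to move from the second to the first operational state
    so that the number powered on is not less than the required number."""
    return max(0, required - currently_on)
```

For example, with a total login rate of 50 against a maximum of 20, and 1700 connections against a maximum of 400 per server, the sketch yields max(3, 5) = 5 required servers; if 3 servers are currently powered on, 2 more would be transitioned.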

With regard to claim 24, which teaches a reduced-energy-consuming data center that meets a specified response time service level for processing incoming job requests, the data center comprising: a plurality of request-processing servers, wherein each of the request-processing servers can be in either a first operational state or a second operational state at a given time, wherein when a request-processing server is in the first operational state it is available for processing a job request and when a request-processing server is in the second operational state it is unavailable processing a job request, and wherein there are three subsets of request processing servers at a given time, comprising:  a first subset of request-processing servers that are in the first operational state and processing at least one prior job request and no more than p prior job requests, where p is a predetermined, maximum number of job requests that one of the request processing servers can concurrently process while still meeting the specified response time service level; a second subset of request-processing servers that are in the first operational state and not processing any prior job requests; and a third subset of request processing servers that are in the second operational state; Hanson teaches a method for reducing power consumption amongst a group of servers, namely to “give blanket priority to performance by consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines” (see paragraph 30).  
Hanson teaches multiple server states including a “powered on” processing state and multiple non-processing “reduced power” or idle states, where, based upon workload and power constraints, the system transitions servers between a “powered on” processing state and a “reduced power” / “powered off” / idle state when it is deemed that fewer servers are necessary, and between the “reduced power” / “powered off” / idle state and the “powered on” state when it is determined that workload exceeds current capacity (see paragraphs 2, 18, 20, 21, and 28).  A “powered on” state has a quick response time and a lower setup cost, while a “reduced power” state has an increased response time and a higher setup cost.  

With regard to claim 24, further teaching a request distribution system that is in communication with the plurality of request-processing servers, wherein the request-distribution system is configured to:  receive a new incoming job request for processing; and distribute the new incoming job request to one of the plurality of request-processing servers in the first subset of request-processing servers that is processing fewer than p job requests at the time of distribution and to a request processing server in the second subset if all of the request-processing servers in the first subset are processing p requests at the time of distribution, such that the job requests being processed by the data center are concentrated in the first subset of request-processing servers and such that the request-processing servers in the second subset are free from processing the job requests, and wherein the request-processing server to which the new incoming job request is distributed processes the new incoming job request, Hanson teaches distributing a stream of content between servers (see paragraph 24).  Hanson further teaches that “if the processing capacity exceeds the current workload, one or more servers may be vacated with the workload consolidated on remaining servers” (see paragraph 20).  Hanson teaches “The evaluation process (step 21) by the performance manager (110) can be based, in part, on information provided by the power manager (120) with regard to recommendations for releasing and/or obtaining servers under current operating conditions and workloads. For example, when the workload changes in such a way that one or more of the running physical servers may be vacated without compromising the performance objectives of the performance manager (110), the power manager (120) may provide a set of recommendations for which servers node in the current subset of running servers are most desirable to be released (vacated). The performance manager (110) can then choose, from among those recommendations, one or more servers to release control of to the, while meeting its performance goals. The power manager (120) may then power-off the vacated server(s).” (see paragraph 21)  Hanson teaches to “give blanket priority to performance by consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines” (see paragraph 30).
Liu teaches “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and/or shut down servers in the cluster 104 to maintain the determined number of servers. Pursuant to an illustration, the determined number of servers can be a minimum number of servers required to satisfy a level of quality of service for the Internet service.” (see paragraph 29)  It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson before him at the time the invention was made, to include the load balancing / load skewing specifics of Liu in the method and system of Hanson.  One would have been motivated to make such a combination because Liu puts forth a specific means / equation for accomplishing the priorities outlined in Hanson. 

Hanson, Liu, and Nandagopal teach switching between a lower and higher setup cost state but do not specifically teach the switch being a function of a time-out of a delay.  Begun teaches a similar system for distributing tasks over servers in a load balancing system (see paragraphs 43 and 45), switching between a “reduced power state” and a “higher power state” based upon workload (see paragraph 440), while considering the effects of reboot time in restarting servers from a reduced power state (see paragraph 31), similar to that of Hanson, Liu, and Nandagopal, but further teaches switching between a lower and higher setup cost state based on a timer activated on idle (see paragraph 44).  It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson, Liu, Nandagopal, and Begun before them at the time of 

With regard to claim 25, which teaches wherein the request distribution system is further configured to:  determine a required number of request-processing server that should be in first operational state based on total number of job requests currently being processed by the plurality of request processing servers; and transition one or more request-processing servers in the second operational state to the first operational state such that after the transition, the number of request-processing servers in the first operational state is not less than the required number, Hanson teaches “if the performance is degraded because the current workload exceeds the processing capacity of the set of servers under control of the performance manager (110), one or more additional servers may be needed to achieve an optimal placement or improved placement” (see paragraph 20).  Hanson teaches “The evaluation process (step 21) by the performance manager (110) can be based, in part, on information provided by the power manager (120) with regard to recommendations for releasing and/or obtaining servers under current operating conditions and workloads. For example, when the workload changes in such a way that one or more of the running physical servers may be vacated without compromising the performance objectives of the performance manager (110), the power manager (120) may provide a set of recommendations for which servers node in the current subset of running servers are most desirable to be released (vacated). The performance manager (110) can then choose, from among those recommendations, one or more servers to release control of to the, while meeting its performance goals. The power manager (120) may then power-off the vacated server(s).  
Similarly, when the workload changes in such a way the workload exceeds the total processing capacity of the current subset of running servers, and the performance objects of the performance manager (110) are not being met, the power manager (120) can provide a set recommendations for one or more additional servers to be given control to the performance manager (110) and added to the current subset of running servers, to thereby handle the increased workload.” (see paragraph 21)  
Liu teaches a method and system for determining a minimum number of servers required to satisfy a level of service enabling the startup and/or shut down of servers (to a sleep mode) (see paragraphs 29-30), similar to that of Hanson, but further teaches a specific equation, K(t) = max{[L_tot(t) / L_max], [N_tot(t) / N_max]}, for calculating the minimum number of servers, taking the maximum of the total login rate (L_tot(t)) divided by the max login rate (L_max) and the number of connections (N_tot(t)) divided by the max per server (N_max) (see paragraph 30), where this is equivalent to the equation provided in the application while dividing out / factoring out the current number in low-cost state / powered on.  The calculation of minimum servers required and distribution of Liu further takes into account past values of the number of active/on servers K(t) at a time (see paragraphs 51 and 59).  Liu further teaches “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and/or shut down servers in the cluster 104 to maintain the determined number of servers. Pursuant to an illustration, the determined number of servers can be a minimum number of servers required to satisfy a level of quality of service for the Internet service.” (see paragraph 29)
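By way of illustration only, the calculation attributed to Liu above can be sketched as follows.  This is a minimal sketch and not Liu's implementation: it assumes the square brackets in the equation denote rounding up to a whole server, and the function name, parameter names, and numeric values are hypothetical.

```python
import math


def min_active_servers(total_login_rate, max_login_rate,
                       total_connections, max_connections):
    """Return K(t) = max(ceil(L_tot / L_max), ceil(N_tot / N_max)).

    Assumption: the brackets in the quoted equation are ceilings, so each
    ratio is rounded up to a whole number of servers.
    """
    if max_login_rate <= 0 or max_connections <= 0:
        raise ValueError("per-server limits must be positive")
    servers_for_logins = math.ceil(total_login_rate / max_login_rate)
    servers_for_connections = math.ceil(total_connections / max_connections)
    # The binding constraint (whichever needs more servers) sets the minimum.
    return max(servers_for_logins, servers_for_connections)
```

Under these assumptions, a hypothetical load of 950 logins per second against a limit of 100, with 42,000 connections against a limit of 5,000, is bound by the login rate rather than by the connection count.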



Claim 17 is rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Hanson et al., Publication Number:  2009/0254660, hereinafter Hanson, Liu et al., Publication Number:  2009/0222562, hereinafter Liu, Nandagopal et al., Publication Number:  2010/0217866, hereinafter Nandagopal, Begun et al., Publication Number:  2003/0055969, hereinafter Begun, and Govett, Patent Number:  5,761,507.

With regard to claim 17, which teaches wherein the non-zero, finite delay time period is between 60 and 260 seconds, Begun teaches switching between a lower and higher setup cost state based on a timer activated on idle (see paragraph 44), but does not give any specific guidance on a specific delay time period.  Govett teaches a similar system in which servers are transitioned from an active state to a stopped state based upon a length of time of the server being idle (see column 12, lines 1-10 and 50-64 and claims 13-14), similar to that of Hanson, Liu, Nandagopal, and Begun, but further specifically enables a user to set the time period after the server goes idle to delay until stopping the server (see column 6, lines 10-36).  Here a user may freely choose a delay in the specified range of 60-260 seconds or numerous system factors.
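For illustration only, an idle timer with a user-settable delay of the kind discussed above can be sketched as follows.  This is a minimal sketch under stated assumptions and is not taken from Begun or Govett: the `Server` record, the `update_state` function, the "on"/"off" state labels, and the 120 second delay used in the note below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    name: str
    active_jobs: int = 0
    idle_since: Optional[float] = None  # time at which the server went idle
    state: str = "on"                   # hypothetical labels: "on" or "off"


def update_state(server, now, delay_seconds):
    """Stop the server only after it has been idle for delay_seconds."""
    if server.state != "on":
        return server.state
    if server.active_jobs > 0:
        # Any work resets the idle timer.
        server.idle_since = None
        return server.state
    if server.idle_since is None:
        # The server has just gone idle: start the timer.
        server.idle_since = now
    elif now - server.idle_since >= delay_seconds:
        # The idle period was contiguous and reached the delay: stop.
        server.state = "off"
    return server.state
```

With a hypothetical delay of 120 seconds, a server that goes idle at time 0 remains on at time 119 and is stopped at time 120; a job arriving in between restarts the timer.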


Response to Arguments
Applicant’s arguments filed 2/26/2021 have been fully considered but they are not persuasive.  It is noted that the Patent Owner’s response mirrors arguments submitted and answered after final; similar responses are provided below with further elaboration where deemed helpful.

The Examiner believes the claims are met by an obvious combination of known job distribution methods coupled with known methods for moving underused servers to a reduced power state. The claims at issue are met by the prior art of record where Hanson and Liu describe the state of the art (at around the level the Patent Owner describes in his background section): Hanson [18-21] teaches consolidating jobs so as 
What is presented in this combination, in a basic sense, is what is covered by the state of the art (Hanson and Liu) plus a known technique for job placement (Nandagopal) and a known technique for moving servers to a reduced power state (Begun).  Job placement by placing jobs in the highest indexed server with room clearly fits into the load skewing techniques identified in the state of the art references.  Cautiously waiting a period prior to shutting down servers, so as to not unnecessarily shut down servers needed in the near future, provides a means of mitigating a concern of the other references.  This use of known techniques combined with the state of the art 

Patent Owner argues that the cited references prefer balancing loads across the servers processing jobs, even when the number of servers processing jobs is consolidated into a few servers. 
In response, the Examiner respectfully submits that Liu specifically addresses the use of and benefits in "Load Skewing" (see paragraphs 62 and 33). This concept is further addressed in Hanson (paragraphs 19-21), where jobs are consolidated into a limited number of servers. Begun (paragraphs 43-44) further shows that if available servers remain idle (do not receive requests for a predetermined time) or processing capacity exceeds a predetermined workload, jobs are redistributed/consolidated in the remaining servers and the server is moved to a reduced power state.  Even Nandagopal recognizes the benefits of prioritizing job distribution to the least-server-ID and last-server-selected, each of which continually chooses the same server, rather than spreading the workload amongst all servers. 
This argument is counter to Patent Owner’s own submitted evidence, which utilizes a combination of load balancing and load skewing techniques with the claimed invention (see the remarks below responsive to Mr. Wu’s post).

Patent Owner argues that the cited references teach against turning the servers off because of delays in turning the servers back on and the stress on the servers' hardware from repeatedly turning the servers on/off.


Secondary Considerations Considered
The evidence of secondary considerations presented by the Patent Owner has been considered against the underlying evidence supporting the findings of obviousness listed above.  After weighing all of the available evidence, the Examiner finds that the secondary considerations do not overcome the evidence supporting the conclusions of obviousness. 
Specifically, while the secondary considerations and associated affidavits show the success and acclaim AutoScale has had, it is not clear that AutoScale is merely what is broadly claimed in the claims at issue.  A clear one to one nexus has not been established between claim language and the end product.  Rather the Examiner believes the claims are arrived at through an obvious combination of known job distribution methods coupled with known methods for moving underused servers to a reduced power state. The claims at issue are met by the prior art of record where 

The Affidavits themselves break down the invention in terms of these two ‘techniques’ utilized by the Patent to improve efficiency (followed by the third described technique of determining how many servers need be in an active state based on current number of jobs rather than request rate; from new dependent claims 18, 20, and 22).  See element 2.4 of the Wierman declaration (the Harchol-Balter declaration recites the like in Section 3) where: 
A first technique is to pack jobs in a few of the servers, while at the same time not over-packing jobs to those few servers so that they can still meet the desired service levels (e.g., response times) for the data center. This is, in essence, load “skewing” and is the opposite of the then-common technique of load balancing. See ‘018 Patent at col. 7:16-52.
and
The second technique of the Auto Scale invention is called the delayed-off. The idea is that a server is only transitioned then from the lower-setup-cost state to the higher-setup-cost state after the server has remained idle for some contiguous finite period of time, determined by a “timer.” See ‘018 Patent at col. 6:11-67.

With regard to the “first technique”, “Load Skewing” was a known technique for dispatching jobs, as evidenced by paragraphs 6 and 33 of Liu:
[0006] In accordance with another aspect, a load skewing component is provided that attempts to dispatch new connection requests to busy servers first. The load skewing component amasses a majority of user connections to a small set of busy servers and maintains the set of busy servers at or close to a target load. If there are no busy servers available to handle additional load, the load skewing component can dispatch connections to the tail servers until capacity becomes available on the busy servers. 

[0033] The server management component 102 includes a load dispatching component 108 that allocates incoming user connection requests to a server in the cluster 104. The load dispatching component 108 assigns user connection requests to servers provisioned or turned on by the provisioning component 106. The load dispatching component 108 can employ a plurality of dispatching algorithms such as, but not limited to, load balancing, load skewing and the like. In load balancing, the load dispatching component 108 attempts to make numbers of connections on servers in the cluster 104 the same. In load skewing, the load dispatching component 108 assigns new user connection requests to busy servers in the cluster 104 first until a threshold is met.
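For illustration only, the two dispatching approaches described in the quoted paragraphs of Liu can be contrasted as follows.  This is a minimal sketch and not Liu's implementation: the function names and the `target_load` parameter are hypothetical, and the fallback used when every server is at or above the target is an assumption made for the sketch.

```python
def dispatch_balanced(connections):
    """Load balancing: pick the server with the fewest connections."""
    return min(range(len(connections)), key=lambda i: connections[i])


def dispatch_skewed(connections, target_load):
    """Load skewing: pick the busiest server still below the target load."""
    below_target = [i for i, c in enumerate(connections) if c < target_load]
    if not below_target:
        # Assumption: with no headroom anywhere, fall back to least loaded.
        return dispatch_balanced(connections)
    return max(below_target, key=lambda i: connections[i])
```

With hypothetical connection counts of [8, 3, 0] and a target load of 10, balancing selects the empty third server while skewing selects the first, busiest server; once that server reaches the target, skewing moves on to the next busiest server.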


With regard to the “second technique”, Begun specifically teaches that “if a higher power server remains idle (e.g., does not receive or send a data request for a predetermined time)… dispatcher 32 selects a higher power server to be powered down to a reduced power state…”

With regard to the “third technique”, where decisions on changing a server state are based on “current number of jobs” rather than “request rate”, as Patent Owner described was the current practice, Begun specifically implements its method of power management (making decisions to power down or power up servers) based upon “current workload” of the servers (see paragraph 50).
[0050] The method of power management of the present invention implements a resource manager coupled to a group of servers. The resource manager analyzes the balance of tasks of the group of servers utilizing a set of performance metrics. If the processing capacity of the group of higher power servers exceeds current workload, at least a server in the group is selected to be powered down to a reduced power state. The tasks on the selected server are rebalanced over the remaining higher power servers. However, if the power manager determines that the workload exceeds the processing capacity of the group of servers, at least a server is powered up to a higher power state, and the tasks are rebalanced over the group of servers.
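For illustration only, the workload-based decision described in the quoted paragraph of Begun can be sketched as follows.  This is a minimal sketch and not Begun's implementation: the function name, the returned labels, and the uniform `capacity_per_server` are hypothetical, and the check that the remaining servers can still carry the workload before powering one down is an assumption added for the sketch.

```python
def power_decision(workload, active_servers, capacity_per_server):
    """Return "power_up", "power_down", or "hold" for the current workload."""
    capacity = active_servers * capacity_per_server
    if workload > capacity:
        # Workload exceeds processing capacity: bring a server up.
        return "power_up"
    remaining_capacity = (active_servers - 1) * capacity_per_server
    if active_servers > 1 and workload <= remaining_capacity:
        # Assumption: power down only if the remaining servers can absorb
        # the rebalanced tasks.
        return "power_down"
    return "hold"
```

With three hypothetical servers of capacity 100 each, a workload of 150 permits a power-down, a workload of 250 leaves the group unchanged, and a workload of 350 calls for a power-up.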


Diving deeper into the secondary considerations reveals how much of a nexus is truly lacking between the claim language and the secondary considerations.  For example, Harchol-Balter declaration 8.3 directs the Office to https://engineering.fb.com/2014/08/08/production-engineering/making-facebook-s-software-infrastructure-more-energy-efficient-with-autoscale/ , a blog post by Mr. Wu 
Overall architecture
In each frontend cluster, Facebook uses custom load balancers to distribute workload to a pool of web servers. Following the implementation of Autoscale, the load balancer now uses an active, or “virtual,” pool of servers, which is essentially a subset of the physical server pool. Autoscale is designed to dynamically adjust the active pool size such that each active server will get at least medium-level CPU utilization regardless of the overall workload level. The servers that aren’t in the active pool don’t receive traffic.


Figure 1: Overall structure of Autoscale
We formulate this as a feedback loop control problem, as shown in Figure 1. The control loop starts with collecting utilization information (CPU, request queue, etc.) from all active servers. Based on this data, the Autoscale controller makes a decision on the optimal active pool size and passes the decision to our load balancers. The load balancers then distribute the workload evenly among the active servers. It repeats this process for the next control cycle.

This very limited 3-page summary of the integration reveals seemingly important features that make Autoscale work that aren’t revealed in the claims, specifically the idea of using load balancing in combination with load skewing to make optimal use of available active servers.  Patent Owner argued against this same feature in the presented references. 

Additionally, it appears that a great deal of the benefit achieved through integration of Autoscale is a product of its Decision Logic:

Decision logic
A key part of the feedback loop is the decision logic. We want to make an optimal decision that will adapt to the varying workload, including workload surges or drops due to unexpected events. On one hand, we want to maximize the energy-saving opportunity. On the other, we don’t want to over-concentrate the traffic in a way that could affect site performance.
For this to work, we employ the classic control theory and PI controller to get the optimal control effect of fast reaction time, small overshoots, etc. To apply the control theory, we need to first model the relationship of key factors such as CPU utilization and request-per-second (RPS). To do this, we conduct experiments to understand how they correlate and then estimate the model based on experimental data. For example, Figure 2 shows the experimental results of the relationship between CPU and RPS for one type of web server at Facebook. In the figure, the blue dots are the raw data points while the red dashed line is the estimated model (piece-wise linear). With the models obtained, the controller is then designed using the classic stability analysis to pick the best control parameters.


Figure 2: Experimental results of the relationship between CPU and RPS for one type of web server; the red dashed line is the estimated piece-wise linear model

This section notes that, for Autoscale to work, “we employ the classic control theory and PI controller to get the optimal control effect of fast reaction time, small overshoots, etc. To apply the control theory, we need to first model the relationship of key factors such as CPU utilization and request-per-second (RPS).”  In addition to the use of two seemingly rather important unclaimed design logic features, “control theory” and “PI controller”, the underlying control is said to be based upon factors such as “request-per-second”, which sounds much closer to the “request rate” that Patent Owner is currently arguing that the references teach, and less like the “current number of request” that Patent Owner argues the claims are limited to.  
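For illustration only, a generic proportional-integral (PI) controller of the kind named in the quoted blog post can be sketched as follows.  This is a textbook sketch and not the Autoscale decision logic, which the post does not disclose in detail: the class and function names, the gains, the utilization setpoint, and the pool size limits are all hypothetical.

```python
class PIController:
    """Discrete PI controller driven by the error against a setpoint."""

    def __init__(self, kp, ki, setpoint):
        self.kp = kp              # proportional gain (hypothetical value)
        self.ki = ki              # integral gain (hypothetical value)
        self.setpoint = setpoint  # target CPU utilization, e.g. 0.5
        self.integral = 0.0

    def update(self, measured, dt=1.0):
        # Positive error means utilization is above target: add servers.
        error = measured - self.setpoint
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


def next_pool_size(current_size, adjustment, min_size, max_size):
    """Apply the controller output and clamp to the allowed pool size."""
    return max(min_size, min(max_size, current_size + round(adjustment)))
```

With hypothetical gains of 10 and 2 and a target utilization of 0.5, a measured utilization of 0.7 yields an adjustment of about 2.4, so a pool of 20 active servers would grow to 22.  The input to such a controller is a measured utilization or rate, not a count of jobs currently in process.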



In summary, Patent Owner has failed to establish the required nexus between the secondary considerations and the claimed subject matter.  Therefore, PO’s submitted evidence of non-obviousness (i.e., commercial success) is deemed insufficient to overcome the strong showing of obviousness in the applied obviousness rejections of record.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DENNIS G BONSHOCK whose telephone number is (571)272-4047.  The examiner can normally be reached on M-F 7:15 - 4:45.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexander Kosowski can be reached on 571-272-3744.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DENNIS G BONSHOCK/           Primary Examiner, Art Unit 3992                                                                                                                                                                                             
Conferees:

/ADAM L BASEHOAR/           Primary Examiner, Art Unit 3992                                                                                                                                                                                             
/ALEXANDER J KOSOWSKI/           Supervisory Patent Examiner, Art Unit 3992