United States Patent and Trademark Office    
        
            
                                
            
        
    

Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov











BEFORE THE PATENT TRIAL AND APPEAL BOARD


Application Number: 16/544,359
Filing Date: 19 Aug 2019
Appellant(s): Carnegie Mellon University



__________________
Mark G. Knedeisen (Reg. No. 42,747)
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 4/6/2022.

(1) Grounds of Rejection to be Reviewed on Appeal

Every ground of rejection set forth in the Office action dated 4/7/2021 from which the appeal is taken is being maintained by the Examiner.

The following ground(s) of rejection are applicable to the appealed claims:

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

Claims 10-14, 18-22, and 24-25 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Hanson et al., Publication Number:  2009/0254660, hereinafter Hanson, Liu et al., Publication Number:  2009/0222562, hereinafter Liu, Nandagopal et al., Publication Number:  20100217866, hereinafter Nandagopal, and Begun et al., Publication Number:  2003/0055969, hereinafter Begun.

With regard to claim 10, which teaches a data center comprising:  a plurality (N0) of request-processing servers for processing incoming job requests received by the data center, wherein: each of the plurality of request-processing servers has at least a first operational state and a second operational state; each of the request-processing servers cannot be in both the first and second operational states at the same time; when in the first operational state, a request-processing server is available for processing an incoming job request; and when in the second operational state, a request-processor server is not available for processing an incoming job request; Hanson teaches a datacenter with the ability to control the state of each of a plurality of servers, where based upon workload and power constraints the system transitions servers from a “powered on” state to a “reduced power” (or idle) state when it is deemed that fewer servers are necessary, and from the “reduced power” (or idle) state to the “powered on” state when it is determined that workload exceeds current capacity (see paragraphs 2, 18, 20, 21, and 28). A “powered on” state has a quick response time and a lower setup cost, while a “reduced power” state has an increased response time and a higher setup cost.
With regard to claim 10, which teaches a request distribution system in communication with the plurality of request-processing servers, wherein the load balancer comprises one or more front-end servers, and wherein the load balancer receives the incoming job requests to the data center and distributes the received incoming job requests to the plurality of request-processing servers for processing, wherein, in routing a new incoming job request:  all (n) of the request-processing servers in the first operational state are indexed, n <= N, and of the n request-processing servers, the load balancer routes the new incoming job request to a first indexed request-processing server that is currently processing fewer than p requests, where p is a predetermined packing factor, Hanson teaches distributing a stream of content between servers (see paragraph 24). Hanson further teaches a performance manager 110 that places or otherwise redistributes a current workload on the subset of powered-on nodes under its control, while taking into account current operating conditions and workloads (see paragraphs 19-21). Here servers / nodes are indexed as N = {n(1), . . . , n(N)}, while jobs / applications are indexed as M = {m(1), . . . , m(M)} (see paragraph 36), where jobs are allocated to servers while considering response time goals (service level requirement) and maximum optimization / workload intensity (maximum requests servable) (see paragraphs 37-48).
With regard to load distribution, Liu is further shown to utilize a load dispatching component 108 that assigns user connection requests to servers that are turned on, where the dispatching component is capable of both load balancing and load skewing (see paragraph 33). Here active servers are indexed between 1 and K(t), where K(t) is the number of active servers (see paragraph 51). Load balancing component 502 provides a dispatching mechanism that attempts to equalize the number of connections on servers in the cluster. The load balancing component applies a round-robin mechanism for distribution (see paragraphs 52-53). The load skewing aspects of Liu distribute new requests to servers with loads less than a target but also closer to the target (see paragraphs 56-61), while further saving energy by enabling the turning off of unused servers, which occurs at an interval (see paragraph 59). Liu further provides a hybrid with aspects of load balancing and aspects of load skewing (see paragraph 62). It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson before him at the time the invention was made, to include the load balancing / load skewing specifics of Liu in the method and system of Hanson. One would have been motivated to make such a combination because Liu puts forth a specific means / equation for accomplishing the priorities outlined in Hanson.
Nandagopal teaches a load balancing system wherein servers are evaluated for their aggregate load as well as the maximum acceptable load that each server can handle before there is a degradation in response time (see paragraphs 25-28), but further teaches an option to send the incoming request to the least-server-id, where each server is assigned a numeric id, and the least server id corresponds to the lowest numbered id among all servers present (see paragraph 29). It would have been obvious to one of ordinary skill in the art, having the teachings of Liu, Hanson, and Nandagopal before them at the time the invention was made, to include the load balancing scheme used by Nandagopal, where servers are loaded according to the lowest ID’d server, in the method and systems of Hanson and Liu. One would have been motivated to make such a combination because this provides assurance that the lowest ID’d servers will be filled first prior to utilizing other servers, thereby allowing others to fulfill their jobs, time out, and move to a reduced power state.
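For illustration only, the routing behavior discussed above (a new request is sent to the lowest-indexed available server that is currently processing fewer than p requests) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Hanson, Liu, Nandagopal, or the application; the names `route_request`, `loads`, and `p` are hypothetical.

```python
from typing import Optional, Sequence


def route_request(loads: Sequence[int], p: int) -> Optional[int]:
    """Return the index of the first server with fewer than p requests.

    loads[i] is the number of requests server i is currently processing;
    only servers in the available (powered-on) state are listed.
    Returns None when every listed server is already at the packing factor.
    """
    for index, load in enumerate(loads):
        if load < p:
            return index
    return None


# Hypothetical example: three available servers, packing factor p = 3.
loads = [3, 2, 0]
chosen = route_request(loads, p=3)  # server 0 is full, so server 1 is chosen
```

Because lower-indexed servers are always filled first, higher-indexed servers tend to drain and become idle, which is the property relied upon in the motivation statement above.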
Hanson, Liu, and Nandagopal teach switching between a lower and higher setup cost state but do not specifically teach the switch being a function of a time-out of a delay. Begun teaches a similar system for distributing tasks over servers in a load balancing system (see paragraphs 43 and 45), switching between a “reduced power state” and a “higher power state” based upon workload (see paragraph 440), while considering the effects of reboot time in restarting servers from a reduced power state (see paragraph 31), similar to that of Hanson, Liu, and Nandagopal, but further teaches switching between a lower and higher setup cost state based on a timer activated on idle (see paragraph 44). It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson, Liu, Nandagopal, and Begun before them at the time of the invention, to include the idle based timer to move unused or underused servers in a higher power / lower-startup-cost state into a ‘sleep’ / reduced-power / high-startup-cost state. One would have been motivated to make such a combination because this enables underuse to be recognized and dealt with, which is the goal, and eventual effect, of Hanson, Liu, and Nandagopal.
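For illustration only, an idle-activated timer of the general kind attributed to Begun above can be sketched as follows. This is a minimal sketch under stated assumptions, not Begun's implementation; the `Server` class, its fields, and `update_power_state` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    active_requests: int = 0
    powered_on: bool = True
    idle_since: Optional[float] = None  # time at which the server became idle


def update_power_state(server: Server, now: float, timeout: float) -> None:
    """Start, cancel, or expire the idle timer for one server."""
    if not server.powered_on:
        return
    if server.active_requests > 0:
        server.idle_since = None   # busy: cancel any running timer
    elif server.idle_since is None:
        server.idle_since = now    # just became idle: start the timer
    elif now - server.idle_since >= timeout:
        server.powered_on = False  # timer expired: move to reduced power
        server.idle_since = None
```

The timeout delays the transition so that a server needed again shortly after becoming idle is not put into the higher-setup-cost state unnecessarily.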

The Examiner believes the claims are met by an obvious combination of known job distribution methods coupled with known methods for moving underused servers to a reduced power state. The claims at issue are met by the prior art of record where Hanson and Liu describe the state of the art (at around the level the Patent Owner describes in his background section): Hanson [18-21] teaches consolidating jobs so as to free up servers to move to a reduced power state. Liu [62] [33] teaches load skewing, with the load skewing aspects of Liu distributing new requests to servers with loads less than a target but also closer to the target (see paragraphs 56-61), while further saving energy by enabling the turning off of unused servers, which occurs at an interval (see paragraph 59). The Hanson and Liu references are supplemented by two other references that teach server job distribution techniques, including ID number based provisioning (as in Nandagopal) and timer based moving of servers to a reduced power state (as in Begun). Here Nandagopal [29] teaches using Least-Server-ID to place jobs into servers if multiple servers are both at less than max capacity and meet the needs of the job. Begun [43-44] teaches setting a timer when no jobs are at the server, to move to a reduced power state after expiration of said timer.
What is presented in this combination, in a basic sense, is what is covered by the state of the art (Hanson and Liu) plus a known technique for job placement (Nandagopal) and a known technique for moving servers to a reduced power state (Begun). Job placement by placing jobs in the highest indexed server with room clearly fits into the load skewing techniques identified in the state of the art references. Cautiously waiting a period prior to shutting down servers, so as not to unnecessarily shut down servers needed in the near future, provides a means of mitigating a concern of the other references. This use of known techniques combined with the state of the art references would clearly have been obvious to one of ordinary skill in the art at the time of the invention.


With regard to claim 11, which teaches wherein the first operational state has a lower set-up cost, Hanson teaches a datacenter with servers in either a “powered on” state or a “reduced power” (or idle) or sleep state. A “powered on” state has a quick response time and a lower setup cost, while a “reduced power” state has an increased response time and a higher setup cost (see paragraphs 2, 18, 20, 21, and 28).

With regard to claim 12, which teaches wherein over the time each of the request-processing servers can transition between the first operational state and the second operational state, and vice versa, Hanson teaches the ability to control the state of each of a plurality of servers, where based upon workload and power constraints the system transitions servers from a “powered on” state to a “reduced power” (or idle) state when it is deemed that fewer servers are necessary, and from the “reduced power” (or idle) state to the “powered on” state when it is determined that workload exceeds current capacity (see paragraphs 2, 18, 20, 21, and 28).

With regard to claim 13, which teaches wherein:  the first operational state comprises a powered-up state; and the second operational state comprises a sleep state, Hanson teaches a datacenter with servers in either a “powered on” state or a “reduced power” (or idle) or sleep state. A “powered on” state has a quick response time and a lower setup cost, while a “reduced power” state has an increased response time and a higher setup cost (see paragraphs 2, 18, 20, 21, and 28).

With regard to claim 14, which teaches wherein request-processing servers in the second operational state comprise request-processing servers that need to be acquired to process incoming job requests, Hanson teaches the process of adding / acquiring servers to the active state in order to process incoming job requests (see paragraphs 20 and 21).

With regard to claim 18, which teaches wherein the load balancer is programmed to control the number of request-processing servers in the first operational state and the number of request processing servers in the second operational state based on a total number of job requests currently being processed by the plurality of request-processing servers, Liu teaches a method and system for determining a minimum number of servers required to satisfy a level of service enabling the startup and/or shut down of servers (to a sleep mode) (see paragraphs 29-30), utilizing a specific equation, K(t) = max{[L_tot(t) / L_max], [N_tot(t) / N_max]}, for calculating the minimum number of servers by taking the maximum of the total login rate (L_tot(t)) divided by the max login rate (L_max) and the total number of connections (N_tot(t)) divided by the max per server (N_max) (see paragraph 30). This is equivalent to the equation provided in the application, while dividing out / factoring out the current number of servers in the low-cost / powered-on state. The calculation of minimum servers required and the distribution of Liu further take into account past values of the number of active/on servers K(t) at a time (see paragraphs 51 and 59).
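For illustration only, the server-count equation attributed to Liu above can be worked through numerically as follows. This is a minimal sketch, not Liu's code. It assumes that the square brackets in the equation denote rounding up to a whole number of servers; the function name `required_servers` and the input figures are hypothetical.

```python
import math


def required_servers(l_tot: float, l_max: float, n_tot: float, n_max: float) -> int:
    """K(t) = max([L_tot / L_max], [N_tot / N_max]).

    Assumption: the square brackets denote rounding up (ceiling).
    """
    return max(math.ceil(l_tot / l_max), math.ceil(n_tot / n_max))


# Hypothetical figures: 250 logins/s against 100 logins/s per server, and
# 9,000 connections against 2,000 connections per server.
k = required_servers(250, 100, 9000, 2000)  # max(3, 5) = 5 servers
```

In this example the connection count, not the login rate, is the binding constraint, so five servers are required.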

With regard to claim 19, which teaches wherein the load balancer is programmed to:  determine a required number of request-processing servers that should be in first operational state based on total number of job requests currently being processed by the plurality of request-processing servers; and increase the number of request-processing servers in the first operational state when the required number is greater than the number of request-processing servers in the first operational state, Liu teaches a method and system for determining a minimum number of servers required to satisfy a level of service enabling the startup and/or shut down of servers (to a sleep mode) (see paragraphs 29-30), and further teaches a specific equation, K(t) = max{[L_tot(t) / L_max], [N_tot(t) / N_max]}, for calculating the minimum number of servers by taking the maximum of the total login rate (L_tot(t)) divided by the max login rate (L_max) and the total number of connections (N_tot(t)) divided by the max per server (N_max) (see paragraph 30). This is equivalent to the equation provided in the application, while dividing out / factoring out the current number of servers in the low-cost / powered-on state. The calculation of minimum servers required and the distribution of Liu further take into account past values of the number of active/on servers K(t) at a time (see paragraphs 51 and 59). This equation is further shown to not directly take into account “changes in job request size”, at least to the extent the claimed invention does.

With regard to claim 20, which teaches wherein the load balancer controls the number of request-processing servers in the first operational state and the number of request-processing servers in the second operational state based on a total number of job requests currently being processed by the plurality of request-processing servers, Liu teaches a method and system for determining a minimum number of servers required to satisfy a level of service enabling the startup and/or shut down of servers (to a sleep mode) (see paragraphs 29-30), similar to that of Hanson and Begun, but further teaches a specific equation, K(t) = max{[L_tot(t) / L_max], [N_tot(t) / N_max]}, for calculating the minimum number of servers by taking the maximum of the total login rate (L_tot(t)) divided by the max login rate (L_max) and the total number of connections (N_tot(t)) divided by the max per server (N_max) (see paragraph 30). This is equivalent to the equation provided in the application, while dividing out / factoring out the current number of servers in the low-cost / powered-on state. The calculation of minimum servers required and the distribution of Liu further take into account past values of the number of active/on servers K(t) at a time (see paragraphs 51 and 59). This equation is further shown to not directly take into account “changes in job request size”, at least to the extent the claimed invention does.

With regard to claim 21, which teaches a method of reducing power consumption by a data center that processes incoming job requests with a plurality of request-processing servers, while still meeting a specified response time service level for the data center, wherein each of the request-processing servers can be in either a first operational state or a second operational state at a given time, wherein when a request-processing server is in the first operational state it is available for processing a job request and when a request-processing server is in the second operational state it is unavailable processing a job request and wherein there are three subsets of request-processing servers at a given time, comprising:  a first subset of request-processing servers that are in the first operational state and processing at least one prior job request and no more than p prior job requests, where p is a predetermined, maximum number of job requests that one of the request processing servers can concurrently process while still meeting the specified response time service level;  a second subset of request-processing servers that are in the first operational state and not processing any prior job requests; and a third subset of request processing servers that are in the second operational state; Hanson teaches a method for reducing power consumption amongst a group of servers by utilizing an approach to “give blanket priority to performance by consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines” (see paragraph 30).
Hanson teaches multiple server states including a “powered on” processing state and multiple non-processing “reduced power” or idle states, where based upon workload and power constraints the system transitions servers from a “powered on” processing state to a “reduced power” / “powered off” / idle state when it is deemed that fewer servers are necessary, and from the “reduced power” / “powered off” / idle state to the “powered on” state when it is determined that workload exceeds current capacity (see paragraphs 2, 18, 20, 21, and 28). A “powered on” state has a quick response time and a lower setup cost, while a “reduced power” state has an increased response time and a higher setup cost.
With regard to claim 21, which teaches the method comprising:  receiving, by a request distribution system of the data center, a new incoming job request; distributing, by the request distribution system, the new incoming job request to a request processing server in the first subset of request-processing servers that is processing fewer than p job requests at the time of distribution and to a request processing server in the second subset if all of the request-processing servers in the first subset are processing p requests at the time of distribution, such that the job requests being processed by the data center are concentrated in the first subset of request-processing servers and such that the request processing servers in the second subset are free from processing the job requests; and processing, by the request-processing server in the first subset to which the new incoming job request is distributed, the new incoming job request, Hanson teaches distributing a stream of content between servers (see paragraph 24). Hanson further teaches a performance manager 110 that places or otherwise redistributes a current workload on the subset of powered-on nodes under its control, while taking into account current operating conditions and workloads (see paragraphs 19-21). Here servers / nodes are indexed as N = {n(1), . . . , n(N)}, while jobs / applications are indexed as M = {m(1), . . . , m(M)} (see paragraph 36), where jobs are allocated to servers while considering response time goals (service level requirement) and maximum optimization / workload intensity (maximum requests servable) (see paragraphs 37-48). Hanson teaches “if the processing capacity exceeds the current workload, one or more servers may be vacated with the workload consolidated on remaining servers” (see paragraph 20).
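For illustration only, the three server subsets and the distribution step recited in claim 21 can be sketched as follows. This is a minimal sketch under stated assumptions, not code from the application or from any cited reference; the names `partition`, `dispatch`, and `loads` are hypothetical, and `None` is used here merely as a marker for a server in the second (unavailable) operational state.

```python
from typing import List, Optional, Tuple


def partition(loads: List[Optional[int]]) -> Tuple[List[int], List[int], List[int]]:
    """Split server indices into (busy, idle, unavailable).

    loads[i] is the request count of server i, or None when the server is
    in the second (unavailable) operational state.
    """
    busy = [i for i, n in enumerate(loads) if n is not None and n > 0]
    idle = [i for i, n in enumerate(loads) if n == 0]
    unavailable = [i for i, n in enumerate(loads) if n is None]
    return busy, idle, unavailable


def dispatch(loads: List[Optional[int]], p: int) -> Optional[int]:
    """Prefer a busy server with room; fall back to an idle one."""
    busy, idle, _ = partition(loads)
    for i in busy:
        if loads[i] < p:
            return i
    return idle[0] if idle else None
```

Under this sketch, requests are concentrated in the first subset, and a server in the second subset receives work only when every busy server is already processing p requests.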
Hanson teaches “The evaluation process (step 21) by the performance manager (110) can be based, in part, on information provided by the power manager (120) with regard to recommendations for releasing and/or obtaining servers under current operating conditions and workloads. For example, when the workload changes in such a way that one or more of the running physical servers may be vacated without compromising the performance objectives of the performance manager (110), the power manager (120) may provide a set of recommendations for which servers node in the current subset of running servers are most desirable to be released (vacated). The performance manager (110) can then choose, from among those recommendations, one or more servers to release control of to the, while meeting its performance goals. The power manager (120) may then power-off the vacated server(s).” (see paragraph 21)  Hanson teaches to “give blanket priority to performance by consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines” (see paragraph 30).
With regard to load distribution, Liu is further shown to utilize a load dispatching component 108 that assigns user connection requests to servers that are turned on, where the dispatching component is capable of both load balancing and load skewing (see paragraph 33). Here active servers are indexed between 1 and K(t), where K(t) is the number of active servers (see paragraph 51). Load balancing component 502 provides a dispatching mechanism that attempts to equalize the number of connections on servers in the cluster. The load balancing component applies a round-robin mechanism for distribution (see paragraphs 52-53). The load skewing aspects of Liu distribute new requests to servers with loads less than a target but also closer to the target (see paragraphs 56-61), while further saving energy by enabling the turning off of unused servers, which occurs at an interval (see paragraph 59). Liu further provides a hybrid with aspects of load balancing and aspects of load skewing (see paragraph 62). Liu teaches “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and/or shut down servers in the cluster 104 to maintain the determined number of servers. Pursuant to an illustration, the determined number of servers can be a minimum number of servers required to satisfy a level of quality of service for the Internet service.” (see paragraph 29) It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson before him at the time the invention was made, to include the load balancing / load skewing specifics of Liu in the method and system of Hanson. One would have been motivated to make such a combination because Liu puts forth a specific means / equation for accomplishing the priorities outlined in Hanson.
Nandagopal teaches a load balancing system wherein servers are evaluated for their aggregate load as well as the maximum acceptable load that each server can handle before there is a degradation in response time (see paragraphs 25-28), but further teaches an option to send the incoming request to the least-server-id, where each server is assigned a numeric id, and the least server id corresponds to the lowest numbered id among all servers present (see paragraph 29). It would have been obvious to one of ordinary skill in the art, having the teachings of Liu, Hanson, and Nandagopal before them at the time the invention was made, to include the load balancing scheme used by Nandagopal, where servers are loaded according to the lowest ID’d server, in the method and systems of Hanson and Liu. One would have been motivated to make such a combination because this provides assurance that the lowest ID’d servers will be filled first prior to utilizing other servers, thereby allowing others to fulfill their jobs, time out, and move to a reduced power state.
Hanson, Liu, and Nandagopal teach switching between a lower and higher setup cost state but do not specifically teach the switch being a function of a time-out of a delay. Begun teaches a similar system for distributing tasks over servers in a load balancing system (see paragraphs 43 and 45), switching between a “reduced power state” and a “higher power state” based upon workload (see paragraph 440), while considering the effects of reboot time in restarting servers from a reduced power state (see paragraph 31), similar to that of Hanson, Liu, and Nandagopal, but further teaches switching between a lower and higher setup cost state based on a timer activated on idle (see paragraph 44). It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson, Liu, Nandagopal, and Begun before them at the time of the invention, to include the idle based timer to move unused or underused servers in a higher power / lower-startup-cost state into a ‘sleep’ / reduced-power / high-startup-cost state. One would have been motivated to make such a combination because this enables underuse to be recognized and dealt with, which is the goal, and eventual effect, of Hanson, Liu, and Nandagopal.

The Examiner believes the claims are met by an obvious combination of known job distribution methods coupled with known methods for moving underused servers to a reduced power state. The claims at issue are met by the prior art of record where Hanson and Liu describe the state of the art (at around the level the Patent Owner describes in his background section): Hanson [18-21] teaches consolidating jobs so as to free up servers to move to a reduced power state. Liu [62] [33] teaches load skewing, with the load skewing aspects of Liu distributing new requests to servers with loads less than a target but also closer to the target (see paragraphs 56-61), while further saving energy by enabling the turning off of unused servers, which occurs at an interval (see paragraph 59). The Hanson and Liu references are supplemented by two other references that teach server job distribution techniques, including ID number based provisioning (as in Nandagopal) and timer based moving of servers to a reduced power state (as in Begun). Here Nandagopal [29] teaches using Least-Server-ID to place jobs into servers if multiple servers are both at less than max capacity and meet the needs of the job. Begun [43-44] teaches setting a timer when no jobs are at the server, to move to a reduced power state after expiration of said timer.
What is presented in this combination, in a basic sense, is what is covered by the state of the art (Hanson and Liu) plus a known technique for job placement (Nandagopal) and a known technique for moving servers to a reduced power state (Begun). Job placement by placing jobs in the highest indexed server with room clearly fits into the load skewing techniques identified in the state of the art references. Cautiously waiting a period prior to shutting down servers, so as not to unnecessarily shut down servers needed in the near future, provides a means of mitigating a concern of the other references. This use of known techniques combined with the state of the art references would clearly have been obvious to one of ordinary skill in the art at the time of the invention.

With regard to claim 22, which teaches further comprising:  determining, by the request-distribution system, a required number of request-processing servers that should be in first operational state based on total number of job requests currently being processed by the plurality of request-processing servers; and transitioning one or more request-processing servers in the second operational state to the first operational state such that after the transition, the number of request-processing servers in the first operational state is not less than the required number, Hanson teaches “if the performance is degraded because the current workload exceeds the processing capacity of the set of servers under control of the performance manager (110), one or more additional servers may be needed to achieve tan optimal placement or improved placement” (see paragraph 20).  Hanson teaches “The evaluation process (step 21) by the performance manager (110) can be based, in part, on information provided by the power manager (120) with regard to recommendations for releasing and/or obtaining servers under current operating conditions and workloads. For example, when the workload changes in such a way that one or more of the running physical servers may be vacated without compromising the performance objectives of the performance manager (110), the power manager (120) may provide a set of recommendations for which servers node in the current subset of running servers are most desirable to be released (vacated). The performance manager (110) can then choose, from among those recommendations, one or more servers to release control of to the, while meeting its performance goals. The power manager (120) may then power-off the vacated server(s).  
Similarly, when the workload changes in such a way the workload exceeds the total processing capacity of the current subset of running servers, and the performance objects of the performance manager (110) are not being met, the power manager (120) can provide a set recommendations for one or more additional servers to be given control to the performance manager (110) and added to the current subset of running servers, to thereby handle the increased workload.” (see paragraph 21)  
Liu teaches a method and system for determining a minimum number of servers required to satisfy a level of service enabling the startup and/or shut down of servers (to a sleep mode) (see paragraphs 29-30), similar to that of Hanson, but further teaches a specific equation, K(t) = max{[L_tot(t) / L_max], [N_tot(t) / N_max]}, for calculating the minimum number of servers by taking the maximum of the total login rate (L_tot(t)) divided by the max login rate (L_max) and the total number of connections (N_tot(t)) divided by the max per server (N_max) (see paragraph 30). This is equivalent to the equation provided in the application, while dividing out / factoring out the current number of servers in the low-cost / powered-on state. The calculation of minimum servers required and the distribution of Liu further take into account past values of the number of active/on servers K(t) at a time (see paragraphs 51 and 59). Liu further teaches “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and/or shut down servers in the cluster 104 to maintain the determined number of servers. Pursuant to an illustration, the determined number of servers can be a minimum number of servers required to satisfy a level of quality of service for the Internet service.” (see paragraph 29)

With regard to claim 24, which teaches reduced-energy-consuming data center that meets a specified response time service level for processing incoming job requests, the data center comprising: a plurality of request-processing servers, wherein each of the request-processing servers can be in either a first operational state or a second operational state at a given time, wherein when a request-processing server is in the first operational state it is available for processing a job request and when a request-processing server is in the second operational state it is unavailable processing a job request, and wherein there are three subsets of request processing servers at a given time, comprising:  a first subset of request-processing servers that are in the first operational state and processing at least one prior job request and no more than p prior job requests, where p is a predetermined, maximum number of job requests that one of the request processing servers can concurrently process while still meeting the specified response time service level; a second subset of request-processing servers that are in the first operational state and not processing any prior job requests; and a third subset of request processing servers that are in the second operational state; Hanson teaches method for reducing power consumption amongst a group of servers by utilizing a “give blanket priority to performance by consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines” (see paragraph 30).  
Hanson teaches multiple server states including a “powered on” processing state and multiple non-processing “reduced power” or idle states, where based upon workload and power constraints the system transitions servers from a “powered on” processing state to a “reduced power” / “powered off” / idle state when it is deemed that fewer servers are necessary, and from a “reduced power” / “powered off” / idle state to a “powered on” state when it is determined that workload exceeds current capacity (see paragraphs 2, 18, 20, 21, and 28).  A “powered on” state has a quick response time and a lower setup cost, while a “reduced power” state has an increased response time and a higher setup cost.  

With regard to claim 24, further teaching a request distribution system that is in communication with the plurality of request-processing servers, wherein the request-distribution system is configured to:  receive a new incoming job request for processing; and distribute the new incoming job request to one of the plurality of request-processing servers in the first subset of request-processing servers that is processing fewer than p job requests at the time of distribution and to a request processing server in the second subset if all of the request-processing servers in the first subset are processing p requests at the time of distribution, such that the job requests being processed by the data center are concentrated in the first subset of request-processing servers and such that the request-processing servers in the second subset are free from processing the job requests, and wherein the request-processing server to which the new incoming job request is distributed processes the new incoming job request, Hanson teaches a distributing a stream on content between servers (see paragraph 24).  Hanson further teaches a performance manager 110 that places or otherwise redistributes a current workload on the subset of powered-on nodes under its control, while taking into account current operating conditions and workloads (see paragraphs 19-21).  Here servers / nodes are indexed as N = {n(1), . . . ,n(N)}, while jobs / applications are indexed as M = { m(1), . . . m(M) } (see paragraph 36), where jobs are allocated to servers while considering response time goals (service level requirement) and  maximum optimization / workload intensity (maximum requests servable) (see paragraphs 37-48).   Hanson teaches “if the processing capacity exceeds the current workload, one or more servers may be vacated with the workload consolidated on remaining servers” (see paragraph 20).  
Hanson teaches “The evaluation process (step 21) by the performance manager (110) can be based, in part, on information provided by the power manager (120) with regard to recommendations for releasing and/or obtaining servers under current operating conditions and workloads. For example, when the workload changes in such a way that one or more of the running physical servers may be vacated without compromising the performance objectives of the performance manager (110), the power manager (120) may provide a set of recommendations for which servers node in the current subset of running servers are most desirable to be released (vacated). The performance manager (110) can then choose, from among those recommendations, one or more servers to release control of to the, while meeting its performance goals. The power manager (120) may then power-off the vacated server(s).” (see paragraph 21)  Hanson teaches to “give blanket priority to performance by consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines” (see paragraph 30).
With regard to load distribution, Liu is further shown to utilize a load dispatching component 108 that assigns user connection requests to servers that are turned on, where the dispatching component is capable of both load balancing and load skewing (see paragraph 33).  Here active servers are indexed between 1 and K(t), where K(t) is the number of active servers (see paragraph 51).  Load balancing component 502 provides dispatching mechanisms that attempt to equalize the numbers of connections on servers in the cluster.  The load balancing component applies a round-robin mechanism for distribution (see paragraphs 52-53).  The load skewing aspects of Liu distribute new requests to servers with loads less than a target but also closer to the target (see paragraphs 56-61), while further saving energy by enabling the turning off of unused servers, which occurs at an interval (see paragraph 59).  Liu further provides a hybrid with aspects of load balancing and aspects of load skewing (see paragraph 62).  Liu teaches “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and/or shut down servers in the cluster 104 to maintain the determined number of servers. Pursuant to an illustration, the determined number of servers can be a minimum number of servers required to satisfy a level of quality of service for the Internet service.” (see paragraph 29)  It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson before him at the time the invention was made, to include the load balancing / load skewing specifics of Liu in the method and system of Hanson.  One would have been motivated to make such a combination because Liu puts forth a specific means / equation for accomplishing the priorities outlined in Hanson. 
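
For illustration only, the load skewing selection described above (send a new request to the server whose load is below the target but closest to it) can be sketched as follows. This is a minimal sketch under stated assumptions, not Liu's implementation: the function name, the list-of-counts representation, and the tie-break on the lowest index are all hypothetical.

```python
def pick_skewed(connections, target):
    """Pick the index of the server to receive a new connection.

    connections: list of current connection counts, one per active server.
    target: per-server connection target (hypothetical parameter name).

    Returns the index of the server whose count is below the target and
    closest to it, or None when every server is at or above the target.
    Ties are broken toward the lowest index (an assumption).
    """
    candidates = [(count, index)
                  for index, count in enumerate(connections)
                  if count < target]
    if not candidates:
        return None
    # Highest count below the target wins; -index prefers lower indices on ties.
    _, best_index = max(candidates, key=lambda c: (c[0], -c[1]))
    return best_index
```

Selecting this way tends to fill a small number of servers up to the target before touching lightly loaded ones, which is what leaves the lightly loaded servers free to drain.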
Nandagopal teaches a load balancing system wherein servers are evaluated for their aggregate load as well as the maximum acceptable load that each server can handle before there is a degradation in response time (see paragraphs 25-28), but further teaches an option to send the incoming request to the least-server-id, where each server is assigned a numeric id, and the least server id corresponds to the lowest numbered id among all servers present (see paragraph 29).  It would have been obvious to one of ordinary skill in the art, having the teachings of Liu, Hanson, and Nandagopal before them at the time the invention was made, to include the load balancing scheme as used by Nandagopal, where servers are loaded according to the lowest ID’d server, in the method and systems of Hanson and Liu.  One would have been motivated to make such a combination because this provides assurance that the lowest ID’d servers will be filled first prior to utilizing other servers, thereby allowing the others to fulfill their jobs, time out, and move to a reduced power state.
Hanson, Liu, and Nandagopal teach switching between a lower and higher setup cost state but do not specifically teach the switch being a function of a time-out of a delay.  Begun teaches a similar system for distributing tasks over servers in a load balancing system (see paragraphs 43 and 45), switching between a "reduced power state" and a "higher power state" based upon workload (see paragraph 440), while considering the effects of reboot time in restarting servers from a reduced power state (see paragraph 31), similar to that of Hanson, Liu, and Nandagopal, but further teaches switching between a lower and higher setup cost state based on a timer activated on idle (see paragraph 44).  It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson, Liu, Nandagopal, and Begun before them at the time of the invention, to include the idle-based timer to move unused or underused servers from a higher power / lower-startup-cost state into a ‘sleep’ / reduced-power / high-startup-cost state.  One would have been motivated to make such a combination because this enables underuse to be recognized and dealt with, which is the goal and eventual effect of Hanson/Liu/Nandagopal.  

The Examiner believes the claims are met by an obvious combination of known job distribution methods coupled with known methods for moving underused servers to a reduced power state. The claims at issue are met by the prior art of record, where Hanson and Liu describe the state of the art (at around the level the Patent Owner describes in his background section): Hanson [18-21] teaches consolidating jobs so as to free up servers to move to a reduced power state. Liu [62] [33] teaches load skewing, with the load skewing aspects of Liu distributing new requests to servers with loads less than a target but also closer to the target (see paragraphs 56-61), while further saving energy by enabling the turning off of unused servers, which occurs at an interval (see paragraph 59). The Hanson and Liu references are supplemented by two other references that teach server job distribution techniques, including ID-number-based provisioning (as in Nandagopal) and timer-based moving of servers to a reduced power state (as in Begun). Here Nandagopal [29] teaches using Least-Server-ID to place jobs into servers if multiple servers are both less than max capacity and meet the needs of the job. Begun [43-44] teaches setting a timer when no jobs are at the server, to move to a reduced power state after expiration of said timer. 
What is presented in this combination, in a basic sense, is what is covered by the state of the art (Hanson and Liu) plus a known technique for job placement (Nandagopal) and a known technique for moving servers to a reduced power state (Begun).  Placing jobs in the highest indexed server with room clearly fits into the load skewing techniques identified in the state of the art references.  Cautiously waiting a period prior to shutting down servers, so as not to unnecessarily shut down servers needed in the near future, provides a means of mitigating a concern of the other references.  This use of known techniques combined with the state of the art references would clearly have been obvious to one of ordinary skill in the art at the time of the invention. 
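
For illustration only, the two supplemental techniques discussed above can be sketched together: placement on the lowest numbered powered-on server that still has room (following the Least-Server-ID description attributed to Nandagopal) and an idle timer that must expire before a server is moved to a reduced power state (following the description attributed to Begun). This is a minimal sketch and not the method of any cited reference or of the claims: every name below is hypothetical, time is passed in explicitly rather than read from a clock, and busy and idle powered-on servers are scanned in a single pass as a simplification.

```python
from dataclasses import dataclass
from typing import List, Optional

ON, OFF = "on", "off"


@dataclass
class Server:
    state: str = ON
    jobs: int = 0
    # Time at which the server last became idle; None while it has jobs.
    idle_since: Optional[float] = 0.0


def dispatch(servers: List[Server], p: int, now: float) -> Optional[int]:
    """Place a job on the lowest indexed powered-on server with fewer than p jobs."""
    for index, server in enumerate(servers):
        if server.state == ON and server.jobs < p:
            server.jobs += 1
            server.idle_since = None  # receiving a job cancels the idle timer
            return index
    return None  # no powered-on server has room


def complete(servers: List[Server], index: int, now: float) -> None:
    """Record that one job finished; start the idle timer if the server is empty."""
    server = servers[index]
    if server.jobs <= 0:
        raise ValueError("server has no job to complete")
    server.jobs -= 1
    if server.jobs == 0:
        server.idle_since = now


def expire_idle(servers: List[Server], delay: float, now: float) -> List[int]:
    """Power down servers that have been idle for at least `delay`."""
    powered_down = []
    for index, server in enumerate(servers):
        idle = server.state == ON and server.jobs == 0 and server.idle_since is not None
        if idle and now - server.idle_since >= delay:
            server.state = OFF
            powered_down.append(index)
    return powered_down
```

Because new jobs always go to the lowest indexed server with room, higher indexed servers tend to drain first, and only those that stay empty for the full delay are powered down.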


With regard to claim 25, which teaches wherein the request distribution system is further configured to:  determine a required number of request-processing server that should be in first operational state based on total number of job requests currently being processed by the plurality of request processing servers; and transition one or more request-processing servers in the second operational state to the first operational state such that after the transition, the number of request-processing servers in the first operational state is not less than the required number, Hanson teaches “if the performance is degraded because the current workload exceeds the processing capacity of the set of servers under control of the performance manager (110), one or more additional servers may be needed to achieve tan optimal placement or improved placement” (see paragraph 20).  Hanson teaches “The evaluation process (step 21) by the performance manager (110) can be based, in part, on information provided by the power manager (120) with regard to recommendations for releasing and/or obtaining servers under current operating conditions and workloads. For example, when the workload changes in such a way that one or more of the running physical servers may be vacated without compromising the performance objectives of the performance manager (110), the power manager (120) may provide a set of recommendations for which servers node in the current subset of running servers are most desirable to be released (vacated). The performance manager (110) can then choose, from among those recommendations, one or more servers to release control of to the, while meeting its performance goals. The power manager (120) may then power-off the vacated server(s).  
Similarly, when the workload changes in such a way the workload exceeds the total processing capacity of the current subset of running servers, and the performance objects of the performance manager (110) are not being met, the power manager (120) can provide a set recommendations for one or more additional servers to be given control to the performance manager (110) and added to the current subset of running servers, to thereby handle the increased workload.” (see paragraph 21)  
Liu teaches a method and system for determining a minimum number of servers required to satisfy a level of service, enabling the startup and/or shutdown of servers (to a sleep mode) (see paragraphs 29-30), similar to that of Hanson, but further teaches a specific equation, K(t) = max{[L_tot(t) / L_max], [N_tot(t) / N_max]}, for calculating the minimum number of servers as the greater of the total login rate (L_tot(t)) divided by the max login rate (L_max) and the total number of connections (N_tot(t)) divided by the max per server (N_max) (see paragraph 30), where this is equivalent to the equation provided in the application while dividing out / factoring out the current number in low-cost state / powered on.  The calculation of minimum servers required and distribution of Liu further takes into account past values of the number of active/on servers K(t) at a time (see paragraphs 51 and 59).  Liu further teaches “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and/or shut down servers in the cluster 104 to maintain the determined number of servers. Pursuant to an illustration, the determined number of servers can be a minimum number of servers required to satisfy a level of quality of service for the Internet service.” (see paragraph 29)



Claim 17 is rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Hanson et al., Publication Number:  2009/0254660, hereinafter Hanson, Liu et al., Publication Number:  2009/0222562, hereinafter Liu, Nandagopal et al., Publication Number:  20100217866, hereinafter Nandagopal, Begun et al., Publication Number:  2003/0055969, hereinafter Begun, and Govett, Patent Number:  5,761,507.

With regard to claim 17, which teaches wherein the non-zero, finite delay time period is between 60 and 260 seconds, Begun teaches switching between a lower and higher setup cost state based on a timer activated on idle (see paragraph 44), but doesn’t give any specific guidance on a specific delay time period.  Govett teaches a similar system in which servers are transitioned from an active state to a stopped state based upon a length of time of the server being idle (see column 12, lines 1-10 and 50-64 and claims 13-14), similar to that of Hanson, Liu, Nandagopal, and Begun, but further specifically enables a user to set the time period to delay, after the server goes idle, before stopping the server (see column 6, lines 10-36).  Here a user may freely choose a delay in the specified range of 60-260 seconds, or numerous system factors could cause the value to fit in that range (balancing of computer overhead contributions, convenience of users, average request execution time, etc.).  It would have been obvious to one of ordinary skill in the art, having the teachings of Hanson, Liu, Begun, Nandagopal, and Govett before them at the time of the invention, to include the specific user-configurable idle-based timer to move unused or underused servers from a higher power / lower-startup-cost state into a ‘sleep’ / reduced-power / high-startup-cost state and to set said time at a value between 60 and 260 seconds.  One would have been motivated to make such a combination because this enables a maximization of system resources given system characteristics, where the reasoning for choosing the delay time in Govett is similar to that described in column 6 of the subject patent.    

(2) Response to Argument

Before getting to the actual arguments, the Examiner disagrees with how the Patent Owner represents the references in their “Summary of Relied Upon References”:

On page 23, Patent Owner presents that “Liu also does not disclose turning off servers in Liu’s Load-Skewing embodiments…”.  This is clearly not the case.  Liu specifically teaches “The load skewing component 504 facilitates reducing server initiated disconnections when turning off servers.”  Liu further outlines an embodiment where the provisioning of servers occurs at a preset interval (see paragraphs 58 and 59).

On page 23, Patent Owner further presents that “Liu teaches a person having ordinary skill in the art away from the inventions of the pending claims of the Subject Application because it teaches a policy where a threshold number of “tail” servers are maintained on, even though they process a low quantity of jobs…. teaches away from packing the jobs”.  In fact, Liu specifically teaches “load skewing component 504 gives priority to a subset of the set of busy server 604 when dispatching new connection requests… The subset given priority includes busy servers that have a number of connections less than but closest to the target threshold.”  It is a clearly stated goal of Liu for “the load skewing component 504 (attempts) to max out a smallest number of servers as possible instead of utilizing a larger number of servers wherein each server carries a smaller connection load.”  (see paragraphs 60-62)

	On pages 24 and 25, Patent Owner presents a lot of what Nandagopal doesn’t teach, namely powering down servers, but Nandagopal was never relied upon for this limitation.  What Nandagopal is relied upon for is the packing technique of indexing a job to a least-server-id.  When several servers meet the requirements to place a job, as is present in the other references as well, Nandagopal presents packing based upon the server with the least-server-id.  

	On page 25, Patent Owner states “Begun does not necessarily turn off the server that was idle for the predetermined time interval (because it could select any server to turn off)”, when in fact Begun specifically teaches “If a higher power server remains idle (e.g., does not receive or send data requests for a predetermined time) or available processing capacity exceeds a predetermined workload, as determined by ISS 54, dispatcher selects a higher power server to be powered down to a reduced power state” (see paragraph 44), while specifically noting powering off the “unneeded servers” (see paragraph 30).





A.1.  Argument that “The Office Did Not Apply the Graham Framework”

	Patent Owner argues that “The Examiner did not identify what skills or knowledge that a PHOSITA would have developed through the PHOSITA’s education and experience that would be relevant to the Subject Application and that would, allegedly, make the many things the Examiner identified as being obvious to PHOSITA in fact obvious. See Section VII, supra.”
	In response, it is noted that the Examiner considers himself to be a "person(s) of scientific competence in the fields in which they (he) work(s)" capable of making decisions "informed by their scientific knowledge”, given his technical education in the field of Computer Engineering and work as an Examiner at the USPTO.  

The Federal Circuit has stated that examiners and administrative patent judges on the Board are "persons of scientific competence in the fields in which they work" and that their findings are "informed by their scientific knowledge, as to the meaning of prior art references to persons of ordinary skill in the art." In re Berg, 320 F.3d 1310, 1315, 65 USPQ2d 2003, 2007 (Fed. Cir. 2003). In addition, examiners "are assumed to have some expertise in interpreting the references and to be familiar from their work with the level of skill in the art." PowerOasis, Inc. v. T-Mobile USA, Inc., 522 F.3d 1299, 86 USPQ2d 1385 (Fed. Cir. 2008) (quoting Am. Hoist & Derrick Co. v. Sowa & Sons, 725 F.2d 1350, 1360, 220 USPQ 763, 770 (Fed. Cir. 1984)). See MPEP § 2141 for a discussion of the level of ordinary skill. 

Examiner further considers himself able to recognize what his peers (others of similar technical background) would consider obvious.  Examiner further utilized the MPEP’s guidance in evaluating what a PHOSITA would deem obvious.

MPEP 2141:

The person of ordinary skill in the art is a hypothetical person who is presumed to have known the relevant art at the time of the invention. Factors that may be considered in determining the level of ordinary skill in the art may include: (1) "type of problems encountered in the art;" (2) "prior art solutions to those problems;" (3) "rapidity with which innovations are made;" (4) "sophistication of the technology; and" (5) "educational level of active workers in the field." In re GPAC, 57 F.3d 1573, 1579, 35 USPQ2d 1116, 1121 (Fed. Cir. 1995). "In a given case, every factor may not be present, and one or more factors may predominate." Id. See also Custom Accessories, Inc. v. Jeffrey-Allan Indust., Inc., 807 F.2d 955, 962, 1 USPQ2d 1196, 1201 (Fed. Cir. 1986); Environmental Designs, Ltd. v. Union Oil Co., 713 F.2d 693, 696, 218 USPQ 865, 868 (Fed. Cir. 1983). 

"A person of ordinary skill in the art is also a person of ordinary creativity, not an automaton.  "KSR, 550 U.S. at 421, 82 USPQ2d at 1397. "[I]n many cases a person of ordinary skill will be able to fit the teachings of multiple patents together like pieces of a puzzle."  Id. at 420, 82 USPQ2d at 1397. Office personnel may also take into account "the inferences and creative steps that a person of ordinary skill in the art would employ."  Id. at 418, 82 USPQ2d at 1396.
In addition to the factors above, Office personnel may rely on their own technical expertise to describe the knowledge and skills of a person of ordinary skill in the art. The Federal Circuit has stated that examiners and administrative patent judges on the Board are "persons of scientific competence in the fields in which they work" and that their findings are "informed by their scientific knowledge, as to the meaning of prior art references to persons of ordinary skill in the art." In re Berg, 320 F.3d 1310, 1315, 65 USPQ2d 2003, 2007 (Fed. Cir. 2003). In addition, examiners "are assumed to have some expertise in interpreting the references and to be familiar from their work with the level of skill in the art." PowerOasis, Inc. v. T-Mobile USA, Inc., 522 F.3d 1299, 86 USPQ2d 1385 (Fed. Cir. 2008) (quoting Am. Hoist & Derrick Co. v. Sowa & Sons, 725 F.2d 1350, 1360, 220 USPQ 763, 770 (Fed. Cir. 1984)). See MPEP § 2141 for a discussion of the level of ordinary skill.



A.2.  Argument that “The Office Improperly Assessed Whether Individual Claim Limitations Would Have Been Obvious, Instead of Assessing Whether the Claims “As a Whole” Would Have Been Obvious”

Patent Owner argues that “In particular, the Examiner did not assess whether it would have been obvious for the data center to both (i) route a new incoming job request to a lowest indexed server currently processing fewer than p requests (where p is a maximum number of requests that the servers can concurrently process while still meeting a specified response time), and (ii) delay transitioning of a server from the first operational state (e.g., lower-setup-cost state) to the second operational state (e.g., higher-setup-cost state) after that server processes the last request assigned to it. Focusing the § 103 inquiry on particular, discrete claim limitations, as the Examiner did here, improperly disregards the “as a whole” statutory mandate in favor of a “part-by-part” analysis that leads inevitably to hindsight reconstruction. See Ruiz v. A.B. Chance Co., 357 F.3d 1270, 1275 (Fed. Cir. 2004) (finding that the “as a whole” requirement “prevents evaluation of the invention part by part.... This form of hindsight reasoning, using the invention as a roadmap to find its prior art components, would discount the value of combining various existing features or principles in a new way to achieve a new result—often the very definition of invention”).”
	In response, the Examiner notes that what is presented in this combination, in a basic sense, is what is covered by the state of the art (Hanson and Liu) plus a known technique for job placement (Nandagopal) and a known technique for moving servers to a reduced power state (Begun).  Placing jobs in the highest indexed server with room clearly fits into the load skewing techniques identified in the state of the art references.  Cautiously waiting a period prior to shutting down servers, so as not to unnecessarily shut down servers needed in the near future, provides a means of mitigating a concern of the other references.  This use of known techniques combined with the state of the art references would clearly have been obvious to one of ordinary skill in the art at the time of the invention. 
Simply using a known technique to improve similar devices (methods, or products) is an acceptable rationale to support a conclusion of obviousness. One of ordinary skill in the art would have been capable of applying this known method of enhancement to a "base" device (method, or product) in the prior art and the results would have been predictable to one of ordinary skill in the art. The Supreme Court in KSR noted that if the actual application of the technique would have been beyond the skill of one of ordinary skill in the art, then using the technique would not have been obvious. KSR, 550 U.S. at 417, 82 USPQ2d at 1396. 

In response to Patent Owner’s argument that the Examiner’s conclusion of obviousness is based upon improper hindsight reasoning, it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning.  But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the Applicant's disclosure, such a reconstruction is proper.  See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971).
A person of ordinary skill in the art would have been motivated to combine the references in the manner presented in the rejection of Claims 10, 21, and 24 for the reasons explicitly stated in that rejection, the motivation to do so being found at least in each of the applied references.  Applicant’s arguments do not specifically address the strong showing of obviousness pulled directly from the applied references and described in those rejections.  Specifically, each reference is directed at techniques for balancing power consumption and performance in server management, and the rejection substitutes a known technique for job placement (Nandagopal) and a known technique for moving servers to a reduced power state (Begun) into references that discuss, but do not lay out, specific methods of job placement and server state changing.

With respect to Patent Owner’s arguments noting teaching away, Examiner respectfully submits that in both the subject patent specification and the references there is a balance between servers on line and servers placed in a reduced power state, where each points to the advantages and disadvantages of each state.  These are not admissions of teaching away but rather an appreciation of the qualities of each state. It is well known, and it is taught both in the subject patent specification and in the references, that there is a downside to repeatedly turning servers on/off, yet the patent and references do so in order to save valuable power and in turn cost. Specifically, Hanson teaches a power manager that intelligently decides when to vacate servers and move them to a reduced power state (see paragraphs 18-22). Liu teaches only maintaining the minimum number of servers in an on state so as to satisfy a level of quality of service (see paragraph 29). Begun further teaches moving idle servers to a reduced power state based upon need as decided by a power manager (see paragraph 44).

Patent Owner argues that “Hanson teaches not to pack incoming jobs in as few as servers as possible and then gradually turning off the other servers.” 
In response, the Examiner respectfully submits that Hanson teaches explicitly “consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines.” (see paragraph 30)

Patent Owner argues that “Hanson does not teach (i) packing the jobs in as few as servers as possible by routing new incoming jobs to the lowest indexed server, and then (ii) transitioning to the higher-setup-cost state the idle servers in the lower-setup-costs state (e.g., “on”) after their respective timers expire.”
In response, the Examiner notes again that Hanson teaches explicitly “consolidating workload onto the minimum number of machines sufficient to serve it, and turning off the unused machines.” (see paragraph 30)  While neither the specifics of which server a job is routed to nor the transition upon timers expiring is specified in Hanson, these techniques were known in the art as options to accomplish the goals of Hanson, as previously outlined. 

Patent Owner argues that “Hanson also teaches away from the systems/methods of the pending claims of the Subject Application because Hanson admonishes both (i) turning off servers when response time is “not critical,” and (ii) not powering off servers when “response time is critical.””
In response, the Examiner notes again that what Hanson is describing in these quotes, provided without context, is the tradeoffs involved in a system that balances response time against power savings.  The totality of Hanson describes a system in which enough servers are kept on line to meet Service Level Agreements, and extra servers are placed in a reduced power state (see paragraphs 28-30).

Patent Owner argues that “Liu teaches a PHOSITA away from the inventions of the pending claims of the Subject Application because it teaches a policy where a threshold number of “tail” servers are maintained on even though they process a low quantity of jobs. Thus, Liu teaches a PHOSITA to keep a threshold number of servers (the “tail” servers) on and, thereby, teaches away from packing the jobs to the lowest indexed servers.”
In response, the Examiner respectfully submits that Liu’s maintaining of servers even though they process a low quantity of jobs is in line with, and is itself motivation to combine with, a system that doesn’t immediately shut down servers not currently processing jobs (or processing a low level of jobs), but rather sets a timer that must expire before shutting down these under-utilized servers. 

Patent Owner argues that “Nandagopal teaches away from the pending claims because it essentially teaches a version of the “Always On” approach, which consumes too much power. That is, Nandagopal keeps the servers on all the time and, thereby, teaches away from turning off the servers after being idled due to packing the jobs into other servers, such as recited in the pending claims of the Subject Application.”
In response, the Examiner respectfully submits that Nandagopal was not relied upon for any of its server power management features, but rather only for its noted job placement technique.  The other references were not as specific about known job placement techniques, whereas Nandagopal provides several options (Random, Least-Server-ID, Last-Server-Selected Round Robin, etc.) to place jobs when multiple servers satisfy load conditions. 

Patent Owner argues that “Begun does not necessarily turn of the server that was idle for the predetermined time interval…… does the opposite of claim 10’s job “packing”.”
In response, the Examiner respectfully submits that Begun is relied upon in the rejection for a pause prior to moving servers to a reduced power state.  Begun’s ability to relocate jobs is not cited nor relied upon.  Begun specifically teaches “If a higher power server remains idle (e.g., does not receive or send data requests for a predetermined time) or available processing capacity exceeds a predetermined workload, as determined by ISS 54, dispatcher selects a higher power server to be powered down to a reduced power state”.  (see paragraph 44)


A.3.  Argument that The Examiner Relied Impermissibly on Hindsight Reconstruction

Patent Owner argues that “the Examiner relied on hindsight reconstruction, which is evident by the fact that two of the purported “motivations” for a PHOSITA to realize the claimed inventions are inconsistent with each other.”  Specifically, Patent Owner contends that Liu’s “even distribution” of the incoming job requests is the opposite of the packing of new jobs to the “lowest indexed server”.
In response, the Examiner respectfully submits that Patent Owner here focuses on limited embodiments of Liu when citing Liu as “evenly distribut[ing] new connection request among the predetermined number of available servers…”, ignoring the fact that in the same paragraph 62, Liu notes “load skewing”, states that “Dispatching to busy servers first produces energy savings since it requires more energy to start up new servers than to maintain a fully loaded server”, and seeks to “max out the smallest number of servers as possible”.  Patent Owner implies by its argument and citations that Liu has no interest in “packing new jobs”, when it is clear from reading the Patent Owner’s supplied paragraph in full that Liu is promoting “load skewing”, “dispatching to busy servers first”, and “attempt[s] to max out the smallest number of servers as possible” (see again paragraph 62).
To say that what is done in Liu is the opposite of what is done in Nandagopal, where servers are loaded according to the lowest server ID, is simply not consistent with the teachings of the references.  Nandagopal merely provides an ordering of distribution if all else is equal.  
In summary, Liu specifically addresses the use of and benefits in "Load Skewing" (see paragraphs 62 and 33). This concept is further addressed in Hanson (paragraphs 19-21), where jobs are consolidated into a limited number of servers. Begun (paragraphs 43-44) further shows that if available servers remain idle (i.e., do not receive requests for a predetermined time) or processing capacity exceeds a predetermined workload, jobs are redistributed/consolidated in the remaining servers and the server is moved to a reduced power state.  Even Nandagopal recognizes the benefits of prioritizing job distribution to the least-server-ID and last-server-selected, each of which continually chooses the same server, rather than spreading the workload amongst all servers. 
The argument presented is counter to Patent Owner’s own submitted evidence, which utilizes a combination of load balancing and load skewing techniques with the claimed invention (see Patent Owner’s argument that the claimed invention utilizes “custom” load balancing on page 57 of the Appeal Brief and the remarks below responsive to Mr. Wu’s post).


B. Arguments directed at The Secondary Considerations of Nonobviousness

The evidence of secondary considerations presented by the Patent Owner has been considered against the underlying evidence supporting the findings of obviousness listed above.  After weighing all of the available evidence, the Examiner finds that the secondary considerations do not overcome the evidence supporting the conclusions of obviousness. 
Specifically, while the secondary considerations and associated affidavits show the success and acclaim AutoScale has had, it is not clear that AutoScale is merely what is broadly claimed in the claims at issue.  A clear one-to-one nexus has not been established between the claim language and the end product.  Rather, the Examiner believes the claims are arrived at through an obvious combination of known job distribution methods coupled with known methods for moving underused servers to a reduced power state. The claims at issue are met by the prior art of record, where Hanson and Liu describe the state of the art: Hanson [18-21] teaches consolidating jobs so as to free up servers to move to a reduced power state. Liu [62] [33] teaches load skewing, and [29-30] teaches determining when it is worth it to turn servers off. The Hanson and Liu references are supplemented by two other references that teach server job distribution techniques, including ID number based provisioning (as in Nandagopal) and timer based moving of servers to a reduced power state (as in Begun).  Here Nandagopal [29] teaches using Least-Server-ID to place jobs into servers if multiple servers are both less than max capacity and meet the needs of the job. Begun [43-44] teaches setting a timer when no jobs are at the server, to move to a reduced power state after expiration of said timer. The Patent Owner presents multiple arguments directed at this combination, most of which point to individual claim elements lacking from individual references. These discrepancies are resolved via the obvious combination as addressed above.

The Affidavits themselves break down the invention in terms of these two ‘techniques’ utilized by the Patent to improve efficiency (followed by the third described technique of determining how many servers need be in an active state based on current number of jobs rather than request rate; from new dependent claims 18, 20, and 22).  See element 2.4 of the Wierman declaration (the Harchol-Balter declaration recites the like in Section 3) where: 
A first technique is to pack jobs in a few of the servers, while at the same time not over-packing jobs to those few servers so that they can still meet the desired service levels (e.g., response times) for the data center. This is, in essence, load “skewing” and is the opposite of the then-common technique of load balancing. See ‘018 Patent at col. 7:16-52.
and
The second technique of the Auto Scale invention is called the delayed-off. The idea is that a server is only transitioned then from the lower-setup-cost state to the higher-setup-cost state after the server has remained idle for some contiguous finite period of time, determined by a “timer.” See ‘018 Patent at col. 6:11-67.

With regard to the “first technique”, “load skewing” was a known technique for dispatching jobs, as evidenced by paragraphs 6 and 33 of Liu:
[0006] In accordance with another aspect, a load skewing component is provided that attempts to dispatch new connection requests to busy servers first. The load skewing component amasses a majority of user connections to a small set of busy servers and maintains the set of busy servers at or close to a target load. If there are no busy servers available to handle additional load, the load skewing component can dispatch connections to the tail servers until capacity becomes available on the busy servers. 

[0033] The server management component 102 includes a load dispatching component 108 that allocates incoming user connection requests to a server in the cluster 104. The load dispatching component 108 assigns user connection requests to servers provisioned or turned on by the provisioning component 106. The load dispatching component 108 can employ a plurality of dispatching algorithms such as, but not limited to, load balancing, load skewing and the like. In load balancing, the load dispatching component 108 attempts to make numbers of connections on servers in the cluster 104 the same. In load skewing, the load dispatching component 108 assigns new user connection requests to busy servers in the cluster 104 first until a threshold is met.
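For illustration only, the distinction that paragraph [0033] of Liu draws between load balancing and load skewing can be sketched as follows. This is a minimal sketch and not Liu's implementation: the function names, the list-of-connection-counts representation, and the `target` threshold parameter are all assumptions introduced here, and the fallback to "tail" servers described in paragraph [0006] is reduced to returning `None`.

```python
from typing import List, Optional


def dispatch_balanced(loads: List[int]) -> int:
    """Load balancing: send the new connection to the least-loaded server,
    which tends to equalize connection counts across the cluster."""
    return min(range(len(loads)), key=lambda i: loads[i])


def dispatch_skewed(loads: List[int], target: int) -> Optional[int]:
    """Load skewing: send the new connection to the busiest server that is
    still below the target load (hypothetical threshold)."""
    candidates = [i for i, n in enumerate(loads) if n < target]
    if not candidates:
        # No busy server has room; a fuller model would fall back to a tail server.
        return None
    return max(candidates, key=lambda i: loads[i])


loads = [7, 3, 0, 0]
balanced_choice = dispatch_balanced(loads)       # an idle server is chosen
skewed_choice = dispatch_skewed(loads, target=8)  # the busiest server with room is chosen
```

Under these assumptions the same cluster state yields opposite choices: balancing selects an idle server, while skewing keeps filling the busiest server that has not yet reached the threshold.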


With regard to the “second technique”, Begun specifically teaches that “if a higher power server remains idle (e.g., does not receive or send a data request for a predetermined time)… dispatcher 32 selects a higher power server to be powered down to a reduced power state…”

With regard to the “third technique”, where decisions on changing a server state are based on “current number of jobs” rather than “request rate”, which Patent Owner described as the then-current practice, Begun specifically implements its method of power management (making decisions to power down or power up servers) based upon the “current workload” of the servers (see paragraph 50).
[0050] The method of power management of the present invention implements a resource manager coupled to a group of servers. The resource manager analyzes the balance of tasks of the group of servers utilizing a set of performance metrics. If the processing capacity of the group of higher power servers exceeds current workload, at least a server in the group is selected to be powered down to a reduced power state. The tasks on the selected server are rebalanced over the remaining higher power servers. However, if the power manager determines that the workload exceeds the processing capacity of the group of servers, at least a server is powered up to a higher power state, and the tasks are rebalanced over the group of servers.
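For illustration only, the decision described in paragraph [0050] of Begun can be sketched as a single step that compares processing capacity with current workload. This is a minimal sketch under stated assumptions: the function name, the uniform `per_server_capacity`, and the extra check that the remaining servers can still carry the workload before powering one down are hypothetical and are not taken from Begun.

```python
def rebalance_power(active: int, total: int,
                    per_server_capacity: int, workload: int) -> int:
    """Return the number of higher-power servers after one decision step.

    Assumes every server has the same capacity (hypothetical simplification).
    """
    capacity = active * per_server_capacity
    if workload > capacity and active < total:
        # Workload exceeds capacity: power one more server up.
        return active + 1
    if active > 1 and (active - 1) * per_server_capacity >= workload:
        # Capacity exceeds workload even with one fewer server: power one down.
        return active - 1
    return active
```

In this sketch the decision depends only on the current workload relative to capacity, not on an arrival or request rate.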


Diving deeper into the secondary considerations reveals how little nexus exists between the claim language and the secondary considerations.  For example, section 8.3 of the Harchol-Balter declaration directs the Office to https://engineering.fb.com/2014/08/08/production-engineering/making-facebook-s-software-infrastructure-more-energy-efficient-with-autoscale/ , a blog post by Mr. Wu about how Facebook deployed the AutoScale invention at its data centers.  In this post, Mr. Wu describes the overall architecture of Facebook’s servers integrated with Autoscale:  
Overall architecture
In each frontend cluster, Facebook uses custom load balancers to distribute workload to a pool of web servers. Following the implementation of Autoscale, the load balancer now uses an active, or “virtual,” pool of servers, which is essentially a subset of the physical server pool. Autoscale is designed to dynamically adjust the active pool size such that each active server will get at least medium-level CPU utilization regardless of the overall workload level. The servers that aren’t in the active pool don’t receive traffic.


Figure 1: Overall structure of Autoscale
We formulate this as a feedback loop control problem, as shown in Figure 1. The control loop starts with collecting utilization information (CPU, request queue, etc.) from all active servers. Based on this data, the Autoscale controller makes a decision on the optimal active pool size and passes the decision to our load balancers. The load balancers then distribute the workload evenly among the active servers. It repeats this process for the next control cycle.

This very limited three-page summary of the integration reveals seemingly important features that make Autoscale work but that are not recited in the claims, specifically the idea of using load balancing in combination with load skewing to make optimal use of available active servers.  Patent Owner argued against this same feature in the presented references. 

Additionally, it appears that a great deal of the benefit achieved through integration of Autoscale is a product of its Decision Logic:

Decision logic
A key part of the feedback loop is the decision logic. We want to make an optimal decision that will adapt to the varying workload, including workload surges or drops due to unexpected events. On one hand, we want to maximize the energy-saving opportunity. On the other, we don’t want to over-concentrate the traffic in a way that could affect site performance.
For this to work, we employ the classic control theory and PI controller to get the optimal control effect of fast reaction time, small overshoots, etc. To apply the control theory, we need to first model the relationship of key factors such as CPU utilization and request-per-second (RPS). To do this, we conduct experiments to understand how they correlate and then estimate the model based on experimental data. For example, Figure 2 shows the experimental results of the relationship between CPU and RPS for one type of web server at Facebook. In the figure, the blue dots are the raw data points while the red dashed line is the estimated model (piece-wise linear). With the models obtained, the controller is then designed using the classic stability analysis to pick the best control parameters.


Figure 2: Experimental results of the relationship between CPU and RPS for one type of web server; the red dashed line is the estimated piece-wise linear model

This section notes that Autoscale is to “employ the classic control theory and PI controller to get the optimal control effect of fast reaction time, small overshoots, etc. To apply the control theory, we need to first model the relationship of key factors such as CPU utilization and request-per-second (RPS).”  In addition to the use of two seemingly rather important unclaimed design logic features, “control theory” and a “PI controller”, the underlying control is said to be based upon factors such as “request-per-second”, which sounds much closer to the “request rate” that Patent Owner currently argues the references teach, and less like the “current number of request” that Patent Owner argues the claims are limited to.  
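For illustration only, a generic proportional-integral (PI) controller of the kind named in the blog post can be sketched as follows. The blog post does not disclose its controller, so everything here is an assumption: the class and function names, the gains `kp` and `ki`, the use of CPU utilization as the controlled variable, and the rounding and clamping of the pool size are hypothetical.

```python
class PIController:
    """Textbook discrete PI controller: output = kp * error + ki * sum(errors)."""

    def __init__(self, kp: float, ki: float) -> None:
        self.kp = kp
        self.ki = ki
        self.integral = 0.0

    def update(self, error: float) -> float:
        self.integral += error
        return self.kp * error + self.ki * self.integral


def next_pool_size(pool: int, target_util: float, measured_util: float,
                   ctrl: PIController, max_pool: int) -> int:
    """Grow the active pool when utilization is above target, shrink it when below."""
    adjustment = ctrl.update(measured_util - target_util)
    return max(1, min(max_pool, round(pool + adjustment)))
```

The sketch shows only that such a controller adjusts an active pool size from measured utilization through tuned gains, which is logic of a different kind from the routing and timer limitations recited in the claims.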

The Examiner notes that "[t]o be of probative value, any secondary evidence must be related to the claimed invention (nexus required)." Id. The MPEP continues on to say "[t]he term 'nexus' designates a factually and legally sufficient connection between the objective evidence of non-obviousness and the claimed invention so that the evidence is of probative value in the determination of non-obviousness." Id., citing Demaco Corp. v. F. Von Langsdorff Licensing Ltd., 851 F.2d 1387 (Fed. Cir. 1988).  In the present case, the Patent Owner has not presented any objective secondary evidence directed to or having a nexus with the specific features of the claims that stand rejected for being obvious.

In summary, Patent Owner has failed to establish the required nexus between the secondary considerations and the claimed subject matter.  Therefore, Patent Owner’s submitted evidence of non-obviousness (i.e., commercial success) is deemed insufficient to overcome the strong showing of obviousness in the applied obviousness rejections of record.


ANSWERS TO SPECIFIC ARGUMENTS
In attempting to develop a nexus between the Facebook blog post and the Patent claims, Patent Owner presents that “The Facebook blog post links the success of its AutoScale implementation to these two aspects of AutoScale that are recited in the pending claims”.
Patent Owner here tries to define the bounds of what it needs to establish a nexus with, noting “the claims recite two important features -  the packing factor and the delayed turn off”; however, the contended features were actually:
(1) “the request distribution system routes the new incoming job request to a lowest indexed request-processing server currently processing fewer than p requests” (hereinafter “the lowest indexed server” limitation)
(2) “wherein each of the plurality of request-processing servers has an associated timer that is of a non-zero, finite delay time period wherein the timer starts after processing a last requested assigned to the request processing server, such that each request processing server is configured to transition from the first operational state to the second operational state after expiration of the timer’s non-zero, finite delay time period”.  (hereinafter “the delay timer” limitation)
What the Patent Owner sets up as these two ‘important features’ is not what is fully claimed by the above noted limitations (1) and (2); most notably, the routing of ‘the new incoming job request to a lowest indexed request-processing server’ is reduced to merely a “packing factor”.  This is inconsistent with the manner in which, and the weight with which, this limitation was previously argued.  What the Patent Owner argues here are features known within the prior art, which the Patent Owner tries to set up as a low bar to reach in establishing a nexus between the evidence and the claim language. 
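For illustration only, the specificity of limitations (1) and (2) relative to a bare “packing factor” can be sketched as follows. This is a minimal sketch of one reading of the quoted claim language, not the Appellant's implementation: the `Server` class, the function names, the explicit `now` timestamps, and the boolean `on` flag standing in for the first and second operational states are all assumptions.

```python
from typing import List, Optional


class Server:
    def __init__(self, index: int) -> None:
        self.index = index
        self.active_jobs = 0
        self.on = True                              # True = first operational state
        self.idle_since: Optional[float] = None     # set when the delay timer starts


def route(servers: List[Server], p: int) -> Optional[Server]:
    """Limitation (1): pick the lowest indexed server processing fewer than p requests."""
    for s in sorted(servers, key=lambda s: s.index):
        if s.on and s.active_jobs < p:
            s.active_jobs += 1
            s.idle_since = None                     # a new job cancels any running timer
            return s
    return None


def complete(server: Server, now: float) -> None:
    """Limitation (2): the timer starts after the last assigned request finishes."""
    server.active_jobs -= 1
    if server.active_jobs == 0:
        server.idle_since = now


def expire_timers(servers: List[Server], delay: float, now: float) -> None:
    """Limitation (2): transition only after the non-zero, finite delay has elapsed."""
    for s in servers:
        if s.on and s.idle_since is not None and now - s.idle_since >= delay:
            s.on = False                            # second operational state
            s.idle_since = None
```

In this sketch a server that has never processed a request has no running timer and therefore stays on, and the routing rule depends on server index order rather than merely on a per-server job cap.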

MPEP 716.01(b) notes:
"Where the offered secondary consideration actually results from something other than what is both claimed and novel in the claim, there is no nexus to the merits of the claimed invention." In re Kao, 639 F.3d 1057, 1068 (Fed. Cir. 2011); see also Tokai Corp. v. Easton Enters., Inc., 632 F.3d 1358, 1369 (Fed. Cir. 2011) ("If commercial success is due to an element in the prior art, no nexus exists."); Ormco Corp. v. Align Tech., Inc., 463 F.3d 1299, 1312 (2006) ("[I]f the feature that creates the commercial success was known in the prior art, the success is not pertinent."). In evaluating whether the requisite nexus exists, the identified objective indicia must be directed to what was not known in the prior art, including patents and publications, which may well be the novel combination or arrangement of known individual elements. See KSR Int’l Co. v. Teleflex Inc., 550 U.S. 398, 418–19 (2007); Veritas Techs. LLC v. Veeam Software Corp., 835 F.3d 1406, 1414-15 (Fed. Cir. 2016).  

Patent Owner argues: “First, the Facebook blog posts states that its implementation of AutoScale “will concentrate workload to a server until it has at least a medium-level workload.” App. 47. The Facebook blog post further elaborates that, “[i]f the overall workload is low..., the load balancer will use only a subset of the servers.  Other servers can be left running idle or be used for batch-processing workloads.” Id. Relevant to claim 10, the “subset of the servers” that is used by the load balancer is the “request-processing servers in the first operational state” and the “medium-level workload” threshold, which limits the number of jobs sent to a server, and corresponds to the “predetermined packing factor” of claim 10. App. 64-65, § 8.4; App. 9-10, § 3.11.”
In response, the Examiner respectfully submits that clearly the quoted sections from the blog post are in no way as limiting as the argued claim limitation of “the request distribution system routes the new incoming job request to a lowest indexed request-processing server currently processing fewer than p requests”.  Where is there any reference in the reference document to the ‘lowest indexed request-processing server’ feature?

Patent Owner argues: “Second, the fact that in the Facebook implementation the “other servers can be left running idle or be used for batch-processing workloads” also shows that Facebook transitions its servers from the first operational state to the second operational state after a non-zero, finite time period. App. 64-65, § 8.4; App. 9-10, § 3.11.”
In response, the Examiner respectfully submits that clearly the quoted sections from the blog post are in no way as limiting as the argued claim limitation of “wherein each of the plurality of request-processing servers has an associated timer that is of a non-zero, finite delay time period wherein the timer starts after processing a last requested assigned to the request processing server, such that each request processing server is configured to transition from the first operational state to the second operational state after expiration of the timer’s non-zero, finite delay time period”.  

Patent Owner argues the AutoScale invention fulfills a long-felt need, noting about 35% less power than the “Always On” approach.
In response, the Examiner again notes the lack of a nexus between the current claims and what AutoScale achieved.  Were the original claims 1-9 not AutoScale, given that they did not require (1) “the request distribution system routes the new incoming job request to a lowest indexed request-processing server currently processing fewer than p requests” (hereinafter “the lowest indexed server” limitation) and (2) “wherein each of the plurality of request-processing servers has an associated timer that is of a non-zero, finite delay time period wherein the timer starts after processing a last requested assigned to the request processing server, such that each request processing server is configured to transition from the first operational state to the second operational state after expiration of the timer’s non-zero, finite delay time period”  (hereinafter “the delay timer” limitation), as the present claims do?  Furthermore, the prior art of record teaches placing servers in a reduced power state and further powering down servers, counter to the “Always On” approach argued.  This technique appears to have been well known in the art at the time, as evidenced by multiple pieces of prior art noting the like.


Patent Owner argues ‘Unexpected Results’ and ‘Industry Praise’ using reasoning similar to that applied and answered above.  Again, there is no evidence that the AutoScale product was what is being described by these added claims.  

B. Arguments that ‘Examiner Improperly Considered the Secondary Indicia of Nonobviousness’

Patent Owner argues “the Examiner disregarded the secondary indicia because the Examiner concluded that the claims were obvious without considering the secondary indicia.”
In response, the Examiner respectfully submits that during prosecution the Examiner fully considered each presented Secondary Consideration.  Patent Owner’s statement that “the Examiner’s state reason for why secondary considerations were not persuasive was because “the Examiner believes that claims are arrived at through an obvious combination of known job distribution methods coupled with known methods for moving underused servers to reduced power state” ID. at 32”  is a misrepresentation of the record.  The Examiner did start with that quoted statement on page 32 of the 4/7/2021 Non-final Office action but then went on to address individual arguments regarding the Secondary Considerations over pages 32-39, going into much greater detail than did the Patent Owner in its description of how the secondary consideration references taught “the lowest indexed server” limitation and “the delay timer” limitation.

a.  Arguments that “The Examiner Misconstrued the Evidence About Facebook’s Implementation of Autoscale”

Patent Owner argues that “Both Prof. Wierman and Prof. Harchol-Balter, who are each experts in the field (App. 2-3, §§ 1.1-1.5; App. 13-41 (Prof, Wierman); App. 51-52, §§ 1.2-1.3 (Prof. Harchol-Balter)), testified that Facebook’s implementation “is exactly what is taught—and claimed—in the Subject Application” because Facebook uses a pool of servers, which corresponds to the “lower-indexed servers” in the claims, which grows as load increases because “some of the higher- numbered servers are allocated jobs before their timers expire.” App. 10, § 3.12; App. 65, § 8.4.”
Examiner can’t see how a “pool of servers” constitutes routing “the new incoming job request to a lowest indexed request-processing server”.
Patent Owner elaborates on the testimony noting that Prof. Harchol-Balter (inventor) explained that “Facebook blog post states that “instead of a purely round-robin approach, the load balancer [in the Facebook AutoScale implementation] will concentrate workload to a server until it has at least a medium-level workload. If the overall workload is low (like at around midnight), the load balancer will use only a subset of servers. Other servers can be left running idle or be used for batch-processing workloads.” App. 47. Prof. Harchol-Balter explained that this “description corresponds to the first two AutoScale techniques of the Subject Application” (App. 64-65, §8.4), which are the indexed job-packing and delayed-turn-off aspects of AutoScale.”
Again the Examiner can’t see how to “concentrate workload to a server” is as limiting as routing “the new incoming job request to a lowest indexed request-processing server”, nor can he see how “Other servers left running idle” limits to “an associated timer that is of a non-zero, finite delay time period wherein the timer starts after processing a last requested … configured to transition from the first operational state to the second operational state after expiration of the timer’s non-zero, finite delay time period”.
Patent Owner elaborates “That Facebook’s description corresponds to the independent claims is clear from the language in the blog post that Facebook “will concentrate workload to a server until it has at least a medium-level workload,” i.e., the packing factor uses “only a subset of servers,” i.e., the lower-indexed servers”.
Examiner submits that concentrating workload to an individual server is classic load packing / load skewing which was well known in the art (see paragraph 33 of Liu and paragraph 2 of Hanson).  This in no way implies distribution to the lowest indexed server first.  If this were the case the Examiner wouldn’t need Nandagopal in the combination negating this very argument.  Clearly server management systems can pack certain servers without starting with the lowest-indexed, such as by packing into the one with most jobs, the last distributed to, etc. 

Patent Owner argues “The Examiner misconstrued the Facebook blog post by assuming that it uses standard load balancing. Office Action at 36 (“using load balancing” is what “Patent Owner argued against...”).  However, the Examiner missed that Facebook’s version of load balancing is “custom,” i.e., not standard, with the customization coming from “[f]ollowing the implementation of Autoscale ....” App. 47.”
In response, the Examiner respectfully submits that Patent Owner here presents the Facebook blog post, which supposedly uses or is the claimed invention, as teaching a hybrid form of ‘load balancing’, whereas Patent Owner previously argued against references in the rejection for arguably using hybrid load balancing (see arguments against Liu).


Patent Owner tries to lessen the importance of “control theory” and the “PI controller” in the success of Facebook’s implementation; however, it is noted in the evidence that:
 “A key part of the feedback loop is the decision logic. We want to make an optimal decision that will adapt to the varying workload, including workload surges or drops due to unexpected events. On one hand, we want to maximize the energy-saving opportunity. On the other, we don’t want to over-concentrate the traffic in a way that could affect site performance.”  
“For this to work, we employ the classic control theory and PI controller to get the optimal control effect of fast reaction time, small overshoots, etc. To apply the control theory, we need to first model the relationship of key factors such as CPU utilization and request-per-second (RPS). To do this, we conduct experiments to understand how they correlate and then estimate the model based on experimental data. For example, Figure 2 shows the experimental results of the relationship between CPU and RPS for one type of web server at Facebook. In the figure, the blue dots are the raw data points while the red dashed line is the estimated model (piece-wise linear). With the models obtained, the controller is then designed using the classic stability analysis to pick the best control parameters.”
This modeling, experimentation, and analysis to pick the best control parameters to balance the system is not brought out in the claims. 


b. Arguments that “The Examiner Did Not Address the Unexpected Results and the Industry Acclaim and Praise, Which Are Additional Secondary Considerations Demonstrating the Nonobviousness of the Pending Claims”

Each of the Unexpected Results, the Industry Acclaim, and Praise is unsuccessful in demonstrating non-obviousness because a clear connection has not been shown between the resultant AutoScale and the claimed invention, as thoroughly addressed above. 


C. Argument that “The Office Action Made Errors in Rationale and Underlying Fact Findings that Warrant Reversal of the Rejections”

1. Argument that “The Examiner’s Rationale that a PHOSITA Would Be Motivated to Route Jobs to the Lowest-Indexed Server in View of Nandagopal is Flawed”

Patent Owner argues that “Nandagopal, however, does not teach routing jobs to the lowest-indexed server to fill up the lowest-indexed server so that other servers will finish their jobs and time out. As explained above (see Section VIII.A.3), Nandagopal teaches balancing the load across the servers so that the Quality of Service for each of the services is substantially the same. Nandagopal,  [0003]; see also id. at  [0021] and Fig. 3.”… “none of Hanson, Liu or Nandagopal teaches filling some servers (e.g., the lower-indexed server) so that other servers (e.g., the higher-indexed servers) can finish their jobs, time out, and move to a lower power state.”
In response, the Examiner respectfully submits that Nandagopal recognizes the benefits of prioritizing job distribution to the least-server-ID and last-server-selected, each of which continually chooses the same server, rather than spreading the workload amongst all servers. This is further supported by Liu, who specifically addresses the use of and benefits in "Load Skewing" (see paragraphs 62 and 33). This concept is further addressed in Hanson (paragraphs 19-21), where jobs are consolidated into a limited number of servers. Begun (paragraphs 43-44) further shows that if available servers remain idle (i.e., do not receive requests for a predetermined time) or processing capacity exceeds a predetermined workload, jobs are redistributed/consolidated in the remaining servers and the server is moved to a reduced power state.  
The argument presented is counter to Patent Owner’s own submitted evidence, which utilizes a combination of load balancing and load skewing techniques with the claimed invention (see arguments presented in paragraph 3 of page 57 of the Appeal Brief and the remarks responsive to Mr. Wu’s post).


2. Argument that “The Examiner’s Rationale that a PHOSITA Would Be Motivated to Use Begun’s Timer is Based on an Erroneous Determination of the Goal of Hanson, Liu, and Nandagopal”

Patent Owner argues that “Office Action at 10; see also id. at 19, 26. The Office Action is factually wrong that the goal of Hanson, Liu, and Nandagopal is to “recognize and deal” with underused servers and that Begun’s idle-based timer is compatible with Hanson, Liu, and Nandagopal.”
In response, Examiner respectfully submits:
Hanson teaches:  “the power manager (120) can control the power states of servers in the pool based on current or anticipated workload” (see paragraph 28) (further see paragraphs 18-21).
Liu teaches:  “The server management component 102 includes a provisioning component 106 that determines a number of servers within the cluster 104 that should be active. The provisioning component 106 can start up and / or shut down servers in the cluster 104.” (see paragraph 29) (also see paragraphs 30, 33, 59, and 60-62)
Nandagopal teaches a distribution method for distribution of jobs while balancing QoS and utilization. (see paragraphs 3 and 29)





Conclusion
For the above reasons, it is believed that the rejections should be sustained.
Respectfully submitted,
/DENNIS G BONSHOCK/Primary Examiner, Art Unit 3992                                                                                                                                                                                                        
Conferees:
/ROBERT L NASSER/Primary Examiner, Art Unit 3992                  

/ALEXANDER J KOSOWSKI/Supervisory Patent Examiner, Art Unit 3992  


Requirement to pay appeal forwarding fee.  In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.