DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-3, 12-14, and 20 are amended in response to the last office action. Claims 1-20 are presented for examination. Olson et al, Jain et al, Lemaire et al, and Ebsen were previously cited.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 4, 6-10, 12, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ebsen [US 2012/0278530 A1] in view of Olson et al [US 2015/0263978 A1].
	As to claims 1, 12, and 20, Ebsen teaches a system, comprising:
a request throttling manager including a plurality of throttling paths [e.g., path between SCHEDULER 420 and HOST READ queue, path between SCHEDULER 420 and HOST WRITE queue in memory unit queue set 440, 441, 442 in fig. 4A], wherein:
each throttling path of the plurality of throttling paths corresponds to a priority class of data requests [e.g., “Priorities are assigned to the memory access requests.  The memory access requests are segregated to memory unit queues of at least one set of memory unit queues, the set of memory unit queues associated with a memory unit” in paragraph 0001; “A priority scheme may also be present between different types of host originating requests and/or different types of memory controller originating requests.  For example, a host write may only be serviced after three host reads have been serviced corresponding to a host write:host read ratio of 1:3” in paragraph 0041] and comprises:
a request queue configured to queue data requests of that priority class [e.g., “In this example, the memory access requests received from the host include host write requests (W) and host read requests (R).  The scheduler 420 also receives memory access requests that originate in the controller from a memory access request module 430.  The host-originated requests and the controller-originated requests are assigned to a memory unit queue set 440, 441, 442” in paragraph 0040; “Turning now to the flow diagram of FIG. 4B, in some configurations that include distributed prioritization, the system may include multiple request control units that compete for priority for the type of requests respectively associated with the control units, e.g., host read, host write, garbage collection” in paragraph 0042];
a token bucket allocated tokens for processing data requests of that priority class [e.g., “Each queue may also have a token value associated with it.  Referring to the previous example, in some cases, garbage collection requests are serviced after 7 host-originated requests are serviced.  In this example, each queue receives a number of tokens based on the priority of the queue for a particular request type” in paragraph 0044]; and
a request gate configured to selectively pass, based on a quantity of tokens in the token bucket, data requests to at least one storage manager for processing [e.g., “The scheduler determines the next request to be serviced based on the number of tokens associated with each queue in the queue set.  In this example, the scheduler determines which queue in a queue set has the highest number of tokens and the request at the head of that queue is the first to be serviced.  After the request has been serviced, the number of tokens for that queue is decremented” in paragraph 0044];
the request throttling manager is configured to:
receive a first data request [e.g., “The memory controller receives memory access requests from a host terminal, the memory access requests from the host terminal including one or both of host read requests and host write request” in paragraph 0001; one of a path between SCHEDULER 420 and HOST READ queue and a path between SCHEDULER 420 and HOST WRITE queue in memory unit queue set 440, 441, 442 in fig. 4A];
determine, based on a first priority class of the first data request, a first throttling path from the plurality of throttling paths [e.g., “Priorities are assigned to the memory access requests.  The memory access requests are segregated to memory unit queues of at least one set of memory unit queues, the set of memory unit queues associated with a memory unit.  Each memory access request is sent to the memory unit according to a priority and an assigned memory unit queue of the memory access request” in paragraph 0001];
queue the first data request in a first request queue of the first throttling path [e.g., “Priorities are assigned to the memory access requests.  The memory access requests are segregated to memory unit queues of at least one set of memory unit queues, the set of memory unit queues associated with a memory unit.  Each memory access request is sent to the memory unit according to a priority and an assigned memory unit queue of the memory access request” in paragraph 0001]; and 
pass, responsive to a first token bucket of the first throttling path including a sufficient first quantity of first tokens to process the first data request, the first data request from the first request queue through a first request gate of the first throttling path to the at least one storage manager [e.g., “The scheduler determines the next request to be serviced based on the number of tokens associated with each queue in the queue set.  In this example, the scheduler determines which queue in a queue set has the highest number of tokens and the request at the head of that queue is the first to be serviced.  After the request has been serviced, the number of tokens for that queue is decremented.  In this case, the host-originated request queue started at 7 tokens and after the first request was serviced, the number of tokens for that queue is decremented to 6.  The scheduler continues to service requests based on the number of tokens associated with each queue in the queue set” in paragraph 0044]; and
the at least one storage manager is configured to access one or more storage nodes of a plurality of storage nodes of a distributed storage system in response to the first data request [e.g., MEMORY UNIT 1, 2, N in figs. 1A, 4A; “The process may also check for conflicts with outstanding memory access requests on that channel and/or memory unit before servicing the request.  Once the process determines that the memory unit is available, the memory access request is serviced 340 by sending the request to the memory device via the channel” in paragraph 0039].
Ebsen teaches that the data requests are received from a host for accessing data, but does not explicitly teach the data requests being file data requests. However, Olson et al teach the data requests being file data requests [e.g., “Applications running on the compute instances 140 issue read and/or write requests 122 (also referred to herein as client read/write requests) for storage objects (such as files or file systems) that are implemented using block storage devices 120” in paragraph 0038].  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ebsen to implement Olson et al’s teaching above, including the data requests being file data requests, in order to increase the applicability of the system of Ebsen.
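For illustration only, the token-based scheduling quoted from paragraphs 0044 and 0046 of Ebsen can be sketched as follows. This is a minimal sketch and not code from any cited reference: the names `TokenQueue` and `service_next`, the rule of skipping empty queues, and the replenishment to each queue's starting token count are assumptions made for the example.

```python
from collections import deque


class TokenQueue:
    """Hypothetical per-priority queue holding requests and a token count."""

    def __init__(self, name, tokens):
        self.name = name
        self.initial_tokens = tokens  # assumed replenishment level
        self.tokens = tokens
        self.requests = deque()


def service_next(queues):
    """Service the head request of the non-empty queue with the most tokens.

    Returns (queue name, request), or None when nothing can be serviced.
    """
    # Replenish only when every queue's tokens have reached zero.
    if all(q.tokens == 0 for q in queues):
        for q in queues:
            q.tokens = q.initial_tokens
    # Assumption: queues that are empty or out of tokens are skipped.
    candidates = [q for q in queues if q.requests and q.tokens > 0]
    if not candidates:
        return None
    best = max(candidates, key=lambda q: q.tokens)
    best.tokens -= 1  # decrement after servicing the request
    return best.name, best.requests.popleft()


host = TokenQueue("host", 7)
gc = TokenQueue("garbage_collection", 1)
host.requests.extend(["R1", "W1", "R2"])
gc.requests.append("GC1")
first = service_next([host, gc])
```

With the starting values above, the host queue holds the most tokens, so its head request is serviced first and its count drops from 7 to 6, mirroring the example in the quoted passage.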
As to claim 3, the combination of Ebsen and Olson et al teaches the request throttling manager is further configured to: receive a second file data request [e.g., R1, W1, R2, … in host-originated requests 810 in fig. 8 of Ebsen]; determine, based on a second priority class of the second file data request, a second throttling path [e.g., “Priorities are assigned to the memory access requests.  The memory access requests are segregated to memory unit queues of at least one set of memory unit queues, the set of memory unit queues associated with a memory unit.  Each memory access request is sent to the memory unit according to a priority and an assigned memory unit queue of the memory access request” in paragraph 0001, other of a path between SCHEDULER 420 and HOST READ queue and a path between SCHEDULER 420 and HOST WRITE queue in memory unit queue set 440, 441, 442 in fig. 4A of Ebsen]; queue the second file data request in a second request queue of the second throttling path [e.g., “Priorities are assigned to the memory access requests.  The memory access requests are segregated to memory unit queues of at least one set of memory unit queues, the set of memory unit queues associated with a memory unit.  Each memory access request is sent to the memory unit according to a priority and an assigned memory unit queue of the memory access request” in paragraph 0001, fig. 8 of Ebsen]; and pass, responsive to a second token bucket of the second throttling path including a sufficient second quantity of second tokens to process the second file data request, the second file data request from the second request queue through a second request gate of the second throttling path to the storage manager [e.g., “The scheduler determines the next request to be serviced based on the number of tokens associated with each queue in the queue set.  
In this example, the scheduler determines which queue in a queue set has the highest number of tokens and the request at the head of that queue is the first to be serviced.  After the request has been serviced, the number of tokens for that queue is decremented.  In this case, the host-originated request queue started at 7 tokens and after the first request was serviced, the number of tokens for that queue is decremented to 6.  The scheduler continues to service requests based on the number of tokens associated with each queue in the queue set” in paragraph 0044 of Ebsen]; and the at least one storage manager is configured to access one or more storage nodes of the storage nodes of the distributed storage system in response to the second file data request [e.g., MEMORY UNIT 1, 2, N in figs. 1A, 4A, “The process may also check for conflicts with outstanding memory access requests on that channel and/or memory unit before servicing the request.  Once the process determines that the memory unit is available, the memory access request is serviced 340 by sending the request to the memory device via the channel” in paragraph 0039 of Ebsen].  
As to claim 4, the combination teaches wherein: the first token bucket and the second token bucket are respectively configured with a first token bucket capacity and a second token bucket capacity; the first token bucket capacity defines a maximum number of tokens that may be stored in the first token bucket; and the second token bucket capacity defines a maximum number of tokens that may be stored in the second token bucket [e.g., “In the event that all tokens in a queue set are decremented to zero, the scheduler may replenish the tokens.  In some cases, the number of tokens that are replenished for each queue in the queue set depends on the last saved priority scheme” in paragraph 0046 of Ebsen; “The initial population may be determined, e.g., based on expectations of the workload, service level agreements, a provisioning budget specified by the client that owns or manages the corresponding data object, or some combination of such factors in various embodiments.  For some types of buckets the initial population may be set to zero in some embodiments.  In some implementations the initial population of a bucket may be set to a maximum population for which the bucket is configured” in paragraph 0044 of Olson et al].
 As to claim 6, the combination teaches wherein the first request queue and the first token bucket operate independently of the second request queue and the second token bucket [e.g., “After the request has been serviced, the number of tokens for that queue is decremented.  In this case, the host-originated request queue started at 7 tokens and after the first request was serviced, the number of tokens for that queue is decremented to 6.  The scheduler continues to service requests based on the number of tokens associated with each queue in the queue set” in paragraph 0044, fig. 8 of Ebsen; Bucket 202, work requests 270C, 270D, 270E in fig. 2, “In some implementations, the admission control for different categories of work requests may be handled independently--e.g., different token buckets may be set up for reads than for writes” in paragraph 0024 of Olson et al].
As to claims 7 and 17, the combination teaches a backend throughput manager, wherein: the first token bucket is configured with a first token bucket capacity; the first token bucket capacity defines a maximum number of tokens that may be stored in the first token bucket; and the backend throughput manager is configured to vary the first token bucket capacity based on a throughput of the distributed storage system [e.g., “In the event that all tokens in a queue set are decremented to zero, the scheduler may replenish the tokens.  In some cases, the number of tokens that are replenished for each queue in the queue set depends on the last saved priority scheme.  In other cases, the scheduler and/or request control units determine a new priority scheme based on the current system load, for example.  In this case, the tokens are then replenished according to the new priority scheme” in paragraph 0046 of Ebsen; “Depending on the kinds of applications for which V1 and V2 are configured, variations in the I/O workloads directed at V1 and V2 may still occur over time, which may lead to higher I/O response times (or higher I/O rejection rates) than desired. … For example, if P1 and P2 are both 1000 IOPS, so that their combined PIOPS is 2000, during a given second the rate of I/O requests for V1 may be 1200 (above its provisioned level) and the rate of I/O requests directed to V2 may be 500 (below its provisioned level). ...  In the above example, in an embodiment in which token buckets are being used for admission control, the client-side components may temporarily increase the refill rate for V1's bucket (e.g., to 1250 tokens per second, so that 1200 IOPS can be handled relatively easily) and decrease the refill rate of V2's bucket (e.g., to 750 tokens per second) if the storage server for V1 is capable of handling 1250 IOPs” in paragraph 0025 of Olson et al].
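For illustration only, the refill-rate adjustment described in the quoted passage of paragraph 0025 of Olson et al can be sketched as below. This is a minimal sketch under stated assumptions: the function `rebalance_refill_rates`, its parameters, and the specific acceptance checks are hypothetical and are not taken from the reference.

```python
def rebalance_refill_rates(borrower_rate, lender_rate, amount,
                           borrower_server_capacity, lender_expected_demand):
    """Shift `amount` tokens per second of refill rate from lender to borrower.

    The shift is applied only if (assumed checks) the borrower's storage
    server can handle the raised rate and the lender's lowered rate still
    covers its expected demand. Otherwise the rates are returned unchanged.
    """
    new_borrower = borrower_rate + amount
    new_lender = lender_rate - amount
    if new_borrower > borrower_server_capacity:
        return borrower_rate, lender_rate
    if new_lender < lender_expected_demand:
        return borrower_rate, lender_rate
    return new_borrower, new_lender


# Figures from the quoted example: both volumes provisioned at 1000 IOPS,
# V2 expected to see 500 IOPS, and V1's server able to handle 1250 IOPS.
v1_rate, v2_rate = rebalance_refill_rates(1000, 1000, 250, 1250, 500)
```

Using the figures in the quoted example, V1's refill rate becomes 1250 tokens per second and V2's becomes 750 tokens per second, so the combined rate of 2000 is unchanged.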
As to claims 8 and 18, the combination teaches wherein the throughput is defined by a throughput parameter that is independent of sizes of object requests received by the distributed storage system [e.g., “In the above example, in an embodiment in which token buckets are being used for admission control, the client-side components may temporarily increase the refill rate for V1's bucket (e.g., to 1250 tokens per second, so that 1200 IOPS can be handled relatively easily) and decrease the refill rate of V2's bucket (e.g., to 750 tokens per second) if the storage server for V1 is capable of handling 1250 IOPs” in paragraph 0025 of Olson et al].
As to claims 9 and 19, the combination teaches wherein the backend throughput manager is further configured to determine the throughput parameter by: determining a storage node throughput of each storage node of a set of storage nodes in the distributed storage system, the set of storage nodes including two or more storage nodes of the plurality of storage nodes of the distributed storage system [e.g., “In some implementations, the block storage service may determine a maximum IOPS level to be supported, based on the volume size indicated by the client.  According to at least some embodiments, the block storage service may support a provisioned workload model” in paragraph 0022, “For example, consider a client C1 with a compute instance CI1, to which block storage volumes V1 and V2 are to be attached.  If the client requests (e.g., at the time of volume creation) a provisioned IOPS level (PIOPS) of P1 for volume V1, and a PIOPS of P2 for volume V2, the storage service may identify back-end storage servers with physical storage devices (and network devices) capable of supporting the desired I/O rates, as well as CPUs capable of handling the request processing for the desired I/O rates” in paragraph 0024 of Olson et al]; determining a utilization rate of each storage node of the set of storage nodes, wherein the utilization rate is based on an idle time of each storage node of the set of storage nodes in the distributed storage system [e.g., “For example, for V1, a token bucket with a refill rate of P1 tokens per second may be established, from which one token is consumed every time an I/O request is accepted.  
Similarly, a token bucket with a refill rate of P2 tokens per second may be established for V2, from which one token is consumed every time an I/O request is accepted, as well as CPUs capable of handling the request processing for the desired I/O rates” in paragraph 0024, “In some embodiments, a token consumption policy may also specify a decay-during-idle parameter indicating whether (and at what rate) tokens are to be deleted from the bucket if the corresponding work target is not targeted for work requests for some time, or a transfer-upon-idle parameter indicating whether tokens should be transferred from one bucket to another (e.g., from a bucket of a lightly-used volume to a bucket of a more heavily-used volume) if they are not used during some time interval.  In one embodiment, a staleness policy may be used to consume tokens that have not been consumed for a specified time interval--e.g., each token may be associated with a validity lifetime after which the token may no longer be useful for admission control purposes” in paragraph 0050 of Olson et al]; for each storage node of the set of storage nodes in the distributed storage system, generating an estimated preferred throughput per storage node based on the utilization rate [e.g., “Admission control mechanisms at the back-end servers may typically enforce the PIOPS limits for the volumes in some implementations” in paragraph 0024, “In at least some embodiments, it may be possible to analyze the read and write request patterns at client-side components of the storage service (e.g., at the instance hosts where the applications run) and predict the variations in I/O request rates with a high degree of accuracy.  
In such embodiments, the client-side components may coordinate with the back-end storage servers to modify the admission control parameters that are used to accept work requests for the volumes at least temporarily as described below, so that request rates above the provisioned IOPS levels may be supported for some periods of time for one or more volumes if sufficient resources are available” in paragraph 0025, “A client-side component of the service may generate an estimate of a rate of work requests expected to be directed during some time period to at least a portion of a first block-level storage device implemented at a first storage server.  If the expected rate of work requests to the first device exceeds the provisioned workload of the first device, the client-side component may attempt to identify a second block-level storage device (e.g., at a different storage server or at the same storage server) at which the workload expected during the time period is lower than the provisioned workload.  If such a second device can be found, in at least some embodiments the client-side component may ascertain (e.g., by communicating with the first storage server) whether the first storage server has enough capacity to accept the extra workload of the first device” in paragraph 0026; “In the embodiment shown in FIG. 
1, if a given client-side component 150 estimates that, for some block device 120, the anticipated request rates may require an I/O rate higher than the provisioned level during a time interval, the client-side component may attempt to find some other block device from which throughput capacity can be "borrowed" to accommodate the anticipated higher request rates” in paragraph 0042 of Olson et al]; calculating a backend throughput based on the storage node throughput of each storage node of the set of storage nodes [e.g., “In the above example, in an embodiment in which token buckets are being used for admission control, the client-side components may temporarily increase the refill rate for V1's bucket (e.g., to 1250 tokens per second, so that 1200 IOPS can be handled relatively easily) and decrease the refill rate of V2's bucket (e.g., to 750 tokens per second) if the storage server for V1 is capable of handling 1250 IOPs.  Alternatively, instead of adjusting refill rates, some number of tokens may simply be ‘borrowed’ or transferred from V2's bucket and deposited in V1's bucket.  
In this way, as long as sufficient resources are available, various types of temporary compensatory admission control parameter adjustments may be made to enhance the overall responsiveness of the storage service” in paragraph 0025 of Olson et al]; adjusting the backend throughput based on one or more of a specific storage method and a specific retrieval method used by the distributed storage system; and adjusting the backend throughput based on a desired load factor of the distributed storage system [e.g., “Control-plane communication channels may be intended primarily for administrative or configuration-related operations, including, for example, recovery-related operations, dynamic reconfigurations in response to changing workloads, and so on” in paragraph 0021, “In at least some embodiments, it may be possible to analyze the read and write request patterns at client-side components of the storage service (e.g., at the instance hosts where the applications run) and predict the variations in I/O request rates with a high degree of accuracy.  In such embodiments, the client-side components may coordinate with the back-end storage servers to modify the admission control parameters that are used to accept work requests for the volumes at least temporarily as described below, so that request rates above the provisioned IOPS levels may be supported for some periods of time for one or more volumes if sufficient resources are available” in paragraph 0025 of Olson et al].
As to claim 10, the combination teaches the first request queue is configured as one of a first-in first-out (FIFO) queue and a priority queue [e.g., “The scheduler 145 may route memory access requests to a set of queues 150, e.g. a set of first in first out (FIFO) queues, associated with a memory unit.  In some embodiments, the controller 120 has a queue for the host-originated access requests 151 and a separate queue for the controller-originated access requests 152” in paragraph 0031 of Ebsen].
Claims 2, 11, and 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over Ebsen and Olson et al as applied to claims 1 and 12 above, and further in view of Jain et al [US 2017/0324813 A1].
	As to claims 2, 11, and 13, though the combination of Ebsen and Olson et al teaches wherein the first file data request is associated with a first class identifying a first priority of the first file data request [e.g., “In such scenarios, relative weights or priorities may be associated with the requests of different group members, and the weight information may be propagated among the group members so that request scheduling decisions can be made with the relative importance of different components in view” in paragraph 0036, “Based on the role and/or on the request rates of various types of storage operations, a relative weight may be assigned to each component, which may for example be used to prioritize requests from one component over those of another (e.g., by the lower-priority component introducing delays between its back-end requests).  In the depicted example, distinct weights are attached to reads and writes issued by each component” in paragraph 0087 of Olson et al], the combination does not explicitly teach, but Jain et al teach, wherein the first file data request includes a first class identifying a first priority of the first file data request [e.g., “In some configurations, the storage performance parameter(s) of the storage SLA 200 can be included in a packet representing the incoming storage request 212.  In other configurations, an identifier of an application associated with the storage request 212 can be included in a packet of the incoming storage request 212, and the identifier can be mapped to the storage SLA 200” in paragraph 0048; “In this manner, high priority storage requests 212 can be placed in high priority inbound queues 410, and low priority storage requests 212 can be placed in low priority queues 410 so that a total SLA penalty can be minimized by processing storage requests 212 pursuant to storage performance parameters specified in their corresponding storage SLAs 200” in paragraph 0050].  
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination to implement Jain et al’s teaching above, including the first file data request including a first class identifying a first priority of the first file data request, in order to increase simplicity and/or feasibility in identifying the priority of the data request of the combination.
As to claim 14, the combination of Ebsen, Olson et al, and Jain et al teaches receiving, by the access node, a second file data request [e.g., R1, W1, R2, … in host-originated requests 810 in fig. 8 of Ebsen; “Applications running on the compute instances 140 issue read and/or write requests 122 (also referred to herein as client read/write requests) for storage objects (such as files or file systems) that are implemented using block storage devices 120” in paragraph 0038 of Olson et al]; determining, based on a second priority class of the second file data request, a second throttling path from the plurality of throttling paths [e.g., “Priorities are assigned to the memory access requests.  The memory access requests are segregated to memory unit queues of at least one set of memory unit queues, the set of memory unit queues associated with a memory unit.  Each memory access request is sent to the memory unit according to a priority and an assigned memory unit queue of the memory access request” in paragraph 0001, other of a path between SCHEDULER 420 and HOST READ queue and a path between SCHEDULER 420 and HOST WRITE queue in memory unit queue set 440, 441, 442 in fig. 4A of Ebsen]; queuing the second file data request in a second request queue of the second throttling path [e.g., “Priorities are assigned to the memory access requests.  The memory access requests are segregated to memory unit queues of at least one set of memory unit queues, the set of memory unit queues associated with a memory unit.  Each memory access request is sent to the memory unit according to a priority and an assigned memory unit queue of the memory access request” in paragraph 0001, fig. 
8 of Ebsen]; and passing, responsive to a second token bucket of the second throttling path including a sufficient second quantity of second tokens to process the second file data request as a second object data request, the second object data request from the second request queue through a second request gate of the second throttling path to a storage manager in the distributed storage system [e.g., “The scheduler determines the next request to be serviced based on the number of tokens associated with each queue in the queue set.  In this example, the scheduler determines which queue in a queue set has the highest number of tokens and the request at the head of that queue is the first to be serviced.  After the request has been serviced, the number of tokens for that queue is decremented.  In this case, the host-originated request queue started at 7 tokens and after the first request was serviced, the number of tokens for that queue is decremented to 6.  The scheduler continues to service requests based on the number of tokens associated with each queue in the queue set” in paragraph 0044, MEMORY UNIT 1, 2, N in figs. 1A, 4A, “The process may also check for conflicts with outstanding memory access requests on that channel and/or memory unit before servicing the request.  Once the process determines that the memory unit is available, the memory access request is serviced 340 by sending the request to the memory device via the channel” in paragraph 0039 of Ebsen].  
As to claim 15, the combination teaches wherein: the first token bucket and the second token bucket are respectively configured with a first token bucket capacity and a second token bucket capacity; the first token bucket capacity defines a maximum number of tokens that may be stored in the first token bucket; and the second token bucket capacity defines a maximum number of tokens that may be stored in the second token bucket [e.g., “In the event that all tokens in a queue set are decremented to zero, the scheduler may replenish the tokens.  In some cases, the number of tokens that are replenished for each queue in the queue set depends on the last saved priority scheme” in paragraph 0046 of Ebsen; “The initial population may be determined, e.g., based on expectations of the workload, service level agreements, a provisioning budget specified by the client that owns or manages the corresponding data object, or some combination of such factors in various embodiments.  For some types of buckets the initial population may be set to zero in some embodiments.  In some implementations the initial population of a bucket may be set to a maximum population for which the bucket is configured” in paragraph 0044 of Olson et al].
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Ebsen and Olson et al as applied to claim 4, and further in view of Lemaire et al [US 2011/0199899 A1].
	As to claim 5, the combination of Ebsen and Olson et al teaches a token manager configured to: monitor a token level of the first token bucket; receive a replenishment set of tokens [e.g., “In the event that all tokens in a queue set are decremented to zero, the scheduler may replenish the tokens.  In some cases, the number of tokens that are replenished for each queue in the queue set depends on the last saved priority scheme” in paragraph 0046 of Ebsen]. The combination does not further teach, but Lemaire et al teach, a token manager configured to: determine, based on the token level, that at least a portion of the tokens of the replenishment set of tokens exceeds the first token bucket capacity; and distribute the at least the portion of the tokens of the replenishment set of tokens to the second token bucket [e.g., “When a per-processor token bucket 620 or 623 is replenished, any tokens that can not be added to the token bucket due to that processor's per-processor token bucket's depth limit are distributed among the remaining per-processor token buckets (to the extents that the remaining per-processor token buckets are not already full), rather than being discarded.  Multiple passes may per performed to distribute the tokens among the per-processor token buckets 620-623” in paragraph 0102].  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination to implement Lemaire et al’s teaching above, including distributing to the second token bucket the portion of the tokens of the replenishment set of tokens exceeding the first token bucket capacity, in order to increase simplicity in replenishing tokens and/or to reduce waste of tokens as taught [e.g., ‘lender’ and ‘borrower’ of tokens in paragraph 0025 of Olson et al; ‘rather than discarded’ in paragraph 0102 of Lemaire et al].
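For illustration only, the overflow handling quoted from paragraph 0102 of Lemaire et al can be sketched as below. This is a minimal sketch, not code from the reference: the function `replenish_with_overflow` and its list-based bucket representation are hypothetical, and the sketch makes a single in-order pass over the remaining buckets, whereas the quoted passage notes that multiple passes may be performed.

```python
def replenish_with_overflow(levels, capacities, index, amount):
    """Add `amount` tokens to bucket `index`, spilling any excess to others.

    `levels` and `capacities` are parallel lists, one entry per bucket.
    Returns (new levels, tokens left over once every bucket is full).
    """
    levels = list(levels)  # work on a copy; the caller's list is unchanged
    added = min(capacities[index] - levels[index], amount)
    levels[index] += added
    excess = amount - added  # tokens beyond the first bucket's capacity
    for j in range(len(levels)):
        if excess == 0:
            break
        if j == index:
            continue
        # Give each remaining bucket only as much as it has room for.
        give = min(capacities[j] - levels[j], excess)
        levels[j] += give
        excess -= give
    return levels, excess


# First bucket holds 8 of 10 tokens, second holds 3 of 10; replenish 5.
new_levels, leftover = replenish_with_overflow([8, 3], [10, 10], 0, 5)
```

In this example the first bucket accepts 2 tokens to reach its capacity of 10, and the remaining 3 tokens are placed in the second bucket instead of being discarded.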
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Ebsen, Olson et al, and Jain et al as applied to claim 15 above, and further in view of Lemaire et al [US 2011/0199899 A1].
	As to claim 16, the combination of Ebsen, Olson et al, and Jain et al teaches monitoring a token level of the first token bucket; and receiving a replenishment set of tokens [e.g., “In the event that all tokens in a queue set are decremented to zero, the scheduler may replenish the tokens.  In some cases, the number of tokens that are replenished for each queue in the queue set depends on the last saved priority scheme” in paragraph 0046 of Ebsen]. The combination does not further teach, but Lemaire et al teach, determining, based on the token level, that at least a portion of the tokens of the replenishment set of tokens exceeds the first token bucket capacity; and distributing the at least the portion of the tokens of the replenishment set of tokens to the second token bucket [e.g., “When a per-processor token bucket 620 or 623 is replenished, any tokens that can not be added to the token bucket due to that processor's per-processor token bucket's depth limit are distributed among the remaining per-processor token buckets (to the extents that the remaining per-processor token buckets are not already full), rather than being discarded.  Multiple passes may per performed to distribute the tokens among the per-processor token buckets 620-623” in paragraph 0102].  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination to implement Lemaire et al’s teaching above, including distributing to the second token bucket the portion of the tokens of the replenishment set of tokens exceeding the first token bucket capacity, in order to increase simplicity in replenishing tokens and/or to reduce waste of tokens as taught [e.g., ‘lender’ and ‘borrower’ of tokens in paragraph 0025 of Olson et al; ‘rather than discarded’ in paragraph 0102 of Lemaire et al].
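For illustration only, the overflow handling relied upon in paragraph 0102 of Lemaire et al (tokens that do not fit in a replenished bucket are passed to another bucket rather than discarded) can be sketched as follows. This is a minimal sketch and not code from any cited reference: the names TokenBucket and replenish_with_spillover, the integer token counts, and the single in-order pass over the remaining buckets are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TokenBucket:
    """Hypothetical bucket with a fixed capacity (depth limit) and a current level."""
    capacity: int
    level: int = 0

    def add(self, tokens: int) -> int:
        """Add tokens up to capacity; return the count that did not fit."""
        accepted = min(tokens, self.capacity - self.level)
        self.level += accepted
        return tokens - accepted


def replenish_with_spillover(first: TokenBucket,
                             others: List[TokenBucket],
                             tokens: int) -> int:
    """Replenish `first`; hand any excess to `others` in order (assumed policy).

    Returns the tokens left over once every bucket is full.
    """
    overflow = first.add(tokens)
    for bucket in others:
        if overflow == 0:
            break
        overflow = bucket.add(overflow)
    return overflow


# Usage: the first bucket has room for 2 of the 5 replenishment tokens,
# so the remaining 3 are distributed to the second bucket.
first = TokenBucket(capacity=10, level=8)
second = TokenBucket(capacity=10, level=3)
leftover = replenish_with_spillover(first, [second], 5)
print(first.level, second.level, leftover)
```

The sketch checks the token level against the bucket capacity before distributing, which corresponds to the determining step recited in claims 5 and 16; a different distribution order or multiple passes would be equally consistent with the quoted passage.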
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ILWOO PARK whose telephone number is (571) 272-4155.  The examiner can normally be reached on M-F, 9 AM-5 PM EST. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Dr. Henry Tsai, can be reached on (571) 272-4176.  The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/ILWOO PARK/
Primary Examiner, Art Unit 2184
4/28/2022