DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
The present application having Application No. 16/930,799 filed on 07/16/2020 presents claims 1-20 for examination.

Examiner Notes
Examiner cites particular columns and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

Drawings
The applicant’s drawings submitted are acceptable for examination purposes.

Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in the Philippines on December 22, 2019. Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 07/16/2020, 10/29/2020 and 04/01/2022 have been acknowledged and the cited references have been considered by the examiner.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “an inputter/outputter configured to…” in claim 10.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 18 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 18 recites the limitation "the cooling metric" in line 1.  There is insufficient antecedent basis for this limitation in the claim.  It appears that claim 18 would be a proper dependent of claim 17.

Claim 20 recites the limitation "the scaling operation" in line 1.  There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-5, 10-14 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Sun et al. (US 2019/0197655 A1) (hereinafter Sun) in view of Shah et al. (US 2012/0221810 A1) (hereinafter Shah).

As per claim 1, Sun discloses A method for scaling resources of a graphics processing unit (GPU) in a cloud computing system (e.g. Sun: [0013-0014] disclose GPU service platform that dynamically scales GPU resources to handle the service requests of the client system.  Also see [0018] [0081].), the method comprising: receiving requests for services from a client device (e.g. Sun: [Abstract] [0005-0006] discloses receiving service requests from a client system for GPU processing services provided by the GPU service platform.  [0027] discloses receiving service requests from the client systems.  [0048] discloses incoming service requests for GPU services from the client system.  Also see [0058] [0078] [0081] [Figs. 1, 3 and related description].); queuing the received requests in a message bus based on a preset prioritization scheme (e.g. Sun: [0032] discloses a task queue service module that is configured to enqueue GPU processing tasks in a task queue.  [0048-0049] discloses the server frontend module passes incoming service request for GPU services from the client system to the task queue module, the task queue module enqueues tasks of the service requests on the task queue.  [0028] discloses incoming client service request specifies the service attributes including a priority level for executing the GPU processing tasks. [0055-0056] discloses server frontend passes service requests to the task queue service and the task queue service inserts tasks associated with the service request into a task queue.  Each queue task has associated task priority.  Also see [0037][0058] [0078]; [0068-0069] [0076].); and scaling the resources of the GPU for the requests queued in the message bus according to a preset prioritization loop (e.g. Sun: [0013] discloses the GPU service platform dynamically scales GPU resources that can be allocated to handle the service request of the client system.  
[0018] discloses GPU scaling technique that can apply GPU resources in smaller increment as needed.  [0081] discloses taking some scaling action, e.g., increase the amount of GPU resources when some tasks in the task queue are pending too long before starting.).
Sun discloses queued tasks/requests with associated priority but does not expressly disclose queueing the requests based on a preset prioritization scheme according to a preset prioritization loop.
However, Shah discloses queueing the requests based on a preset prioritization scheme according to a preset prioritization loop (e.g. Shah: [0016-0018] discloses a request priority rule module, operable with the request priority queue module, for setting an order of placement of the requests in a queue.  The request priority rule module includes a rule that sets a queue order to enable processing of a request of each different priority type at a predetermined interval in the queue.  For example, a customizable rule may be defined such that, for a request management system that handles three types of priority requests (priority-1 to priority-3), for every 10 requests that are handled, if the last five requests were priority-1 requests, then the sixth request in the queue should be priority-2 or priority-3.  Further, the rule may be defined such that, if the last nine requests were priority-1 and priority-2 requests, the tenth request in the queue should be a priority-3 request.  In this manner, the customizable rule will enable handling all priority-1 to priority-3 requests in any desired priority order loop, including a processing order of a priority-1 request, a priority-2 request, a priority-1 request, a priority-3 request, a priority-1 request, and a priority-2 request.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system/method of using a request priority rule module operable with the priority queue module to enable the queueing and ordering of requests in a desired predetermined priority order/loop as taught by Shah into Sun because it would prevent starvation of any single type priority request and improve the overall request processing time and throughput (see Shah: [0017-0018] [0031]).

As per claim 2, the combination of Sun and Shah discloses The method of claim 1 [See rejection to claim 1 above], wherein the cloud computing system is implemented with container-based services using a dedicated GPU per service (e.g. Sun: [0045] discloses that each application container comprises a separate application and executes as an isolated process.  One or more containers can be instantiated to execute one or more applications or functions of the GPU server node.).

As per claim 3, the combination of Sun and Shah discloses The method of claim 1 [See rejection to claim 1 above].  Shah further discloses wherein the preset prioritization loop is repeated in an order of: a high priority request, a medium priority request, a high priority request, a low priority request, a high priority request, and a medium priority request (e.g. Shah: [0016-0018] discloses a request priority rule module, operable with the request priority queue module, for setting an order of placement of the requests in a queue.  The request priority rule module includes a rule that sets a queue order to enable processing of a request of each different priority type at a predetermined interval in the queue.  For example, a customizable rule may be defined such that, for a request management system that handles three types of priority requests (priority-1 to priority-3), for every 10 requests that are handled, if the last five requests were priority-1 requests, then the sixth request in the queue should be priority-2 or priority-3.  Further, the rule may be defined such that, if the last nine requests were priority-1 and priority-2 requests, the tenth request in the queue should be a priority-3 request.  In this manner, the customizable rule will enable handling all priority-1 [high], priority-2 [medium] and priority-3 [low] requests in any desired priority order loop, including a processing order of a priority-1 request, a priority-2 request, a priority-1 request, a priority-3 request, a priority-1 request, and a priority-2 request.).
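For illustration only, the repeating order recited in claim 3 (high, medium, high, low, high, medium) can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Shah, Sun, or the application: the names PRIORITY_LOOP and drain_in_loop_order are hypothetical, one queue per priority level is assumed, and a slot whose queue is empty is assumed to be skipped.

```python
from collections import deque
from itertools import cycle

# Hypothetical loop order mirroring the order recited in claim 3.
PRIORITY_LOOP = ("high", "medium", "high", "low", "high", "medium")


def drain_in_loop_order(queues):
    """Return all pending requests in the order the loop would serve them.

    `queues` maps each priority level to a deque of pending requests.
    A slot whose queue is empty is skipped (an assumption of this sketch).
    """
    served = []
    slots = cycle(PRIORITY_LOOP)
    # Keep walking the loop until every queue has been emptied.
    while any(queues.values()):
        level = next(slots)
        if queues[level]:
            served.append(queues[level].popleft())
    return served


if __name__ == "__main__":
    pending = {
        "high": deque(["h1", "h2", "h3", "h4"]),
        "medium": deque(["m1", "m2"]),
        "low": deque(["l1"]),
    }
    print(drain_in_loop_order(pending))
```

Because the low priority slot appears once in every pass of the loop, a low priority request is reached even while high priority requests remain pending, which corresponds to the starvation-avoidance rationale cited for the combination.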

As per claim 4, the combination of Sun and Shah discloses The method of claim 1 [See rejection to claim 1 above], Shah further discloses wherein the message bus comprises at least one queue for storing a high priority request, a medium priority request, and a low priority request (e.g. Shah: [Figs. 2 and 3] discloses at least one queue for storing priority-1 to priority-n.  [0015] [0017] discloses requests having priority-1, priority-2 and priority-3 to be placed/stored in a queue.  Also see [0033]).

As per claim 5, the combination of Sun and Shah discloses The method of claim 1 [See rejection to claim 1 above], wherein the queuing of the received requests comprises: identifying each of the received requests as one of a high priority request, a medium priority request, or a low priority request; and queuing each request in the message bus (e.g. Sun: [0028] discloses that each service request received from the client system may specify one or more service attributes including a priority level for executing the GPU processing tasks.  [0064] discloses that client requests include priority level information.  Shah: [Figs. 2 and 3] discloses at least one queue for storing priority-1 to priority-n.  [0015] [0017] discloses requests having priority-1 [high], priority-2 [medium] and priority-3 [low] to be placed/stored in a queue.  [0033] discloses determining the requirements for each request and prioritizing the requests to be placed in the queues based on priorities via the request priority queue module.  At block 206, the request priority rule module sets an order of placement of the requests in the queues.).

As per claims 10-14, these are system/apparatus claims having similar limitations as cited in method claims 1-5, respectively.  Thus, claims 10-14 are also rejected under the same rationale as cited in the rejection of claims 1-5, respectively.

As per claim 18, the combination of Sun and Shah discloses The apparatus of claim 10 [See rejection to claim 10 above], wherein a parameter of the cooling metric comprises at least one of: a capacity of the GPU, a speed of the GPU, a system operating cost, or a cost for cooling the GPU (e.g. Sun: [0066] discloses maintaining registration information to determine available GPU resources and available GPU server nodes.).

Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Sun in view of Shah and further in view of Greenfield et al. (US 11,112,120 B1) (hereinafter Greenfield).

As per claim 6, the combination of Sun and Shah discloses The method of claim 1 [See rejection to claim 1 above], wherein the scaling of the resources comprises: checking a request for performing a scaling operation among the requests queued in the message bus according to the preset prioritization loop (e.g. Sun: [0013] discloses that when the GPU processing tasks of a given service request cannot be handled using the resources of a single GPU server node, the GPU service platform can dynamically scale the amount of GPU resources that can be allocated to handle the service request of the client system.  [0018] discloses that the GPU resources are applied in smaller increments as needed for the queued tasks or service request.  [0081] discloses that if some tasks in the task queue are pending too long before starting, some scaling action can be performed to increase the amount of GPU resources.  Shah: [0016-0018] as discussed above, Shah discloses a request priority rule module, operable with the request priority queue module, for setting an order of placement of the requests in a queue.  The request priority rule module includes a rule that sets a queue order to enable processing of a request of each different priority type at a predetermined interval in the queue.  For example, a customizable rule may be defined such that, for a request management system that handles three types of priority requests (priority-1 to priority-3), for every 10 requests that are handled, if the last five requests were priority-1 requests, then the sixth request in the queue should be priority-2 or priority-3.  Further, the rule may be defined or customized to handle all priority-1 [high], priority-2 [medium] and priority-3 [low] requests in any desired priority order loop, including a processing order of a priority-1 request, a priority-2 request, a priority-1 request, a priority-3 request, a priority-1 request, and a priority-2 request.).
The combination of Sun and Shah does not expressly disclose scaling out or in a resource for the checked request based on a scaling metric.
However, Greenfield discloses scaling out or in a resource for the checked request based on a scaling metric (e.g. Greenfield: [Col. 2, lines 62-67] discloses demand-based scaling where scaling policies are used to control the scaling process based on one or more monitored metrics associated with an auto scaling group.  [Col. 5, lines 30-40] discloses that scaling is typically initiated to either launch or terminate a quantity of compute instances; an auto scaling group increases or decreases a quantity of associated compute instances based on a monitored metric.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system/method of using an autoscaling service that increases or decreases computing instances based on a monitored metric as taught by Greenfield into the combination of Sun and Shah because it would improve auto scaling processes related to any type of virtual computing resources and enable more efficient management of scaling requests according to a user's preference (see Greenfield: [Col. 2, lines 40-65]).
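For illustration only, scaling out or in based on a monitored metric, as discussed for claim 6, can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Greenfield or from the application: the function name, the two thresholds, the one-instance step size, and the instance bounds are all hypothetical.

```python
def scale_decision(current_instances, metric_value, scale_out_above, scale_in_below,
                   min_instances=1, max_instances=10):
    """Return the instance count after one scaling step.

    Scale out by one instance when the monitored metric exceeds
    `scale_out_above`, scale in by one when it falls below
    `scale_in_below`, and otherwise leave the count unchanged.
    The result is clamped to [min_instances, max_instances].
    """
    if metric_value > scale_out_above:
        return min(current_instances + 1, max_instances)
    if metric_value < scale_in_below:
        return max(current_instances - 1, min_instances)
    return current_instances


if __name__ == "__main__":
    # Hypothetical utilization metric of 0.9 against a 0.8 scale-out threshold.
    print(scale_decision(2, 0.9, scale_out_above=0.8, scale_in_below=0.3))
```

The gap between the two thresholds is a design choice of the sketch: it keeps the instance count stable when the metric hovers near a single cut-off value.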

As per claim 15, this is a system/apparatus claim having similar limitations as cited in method claim 6.  Thus, claim 15 is also rejected under the same rationale as cited in the rejection of claim 6.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Sun in view of Shah and further in view of Friedrich et al. (US 2021/0099517 A1) (hereinafter Friedrich).
As per claim 20, the combination of Sun and Shah discloses The apparatus of claim 10 [See rejection to claim 10 above], but does not expressly disclose wherein the scaling operation is measured based on a threshold being set based on a number of requests of a particular type which are allowed to wait in the message bus.
However, Friedrich discloses wherein the scaling operation is measured based on a threshold being set based on a number of requests of a particular type which are allowed to wait in the message bus (e.g. Friedrich: [0004-0006] discloses a compute scaling application that accesses a compute capacity indicating a number of allocated compute instances and usage metrics indicating pending task requests in a queue.  The scaling process determines an adjustment to a number of compute instances based on the usage metrics that indicate the number of pending tasks in the queue.  [0017] discloses adjusting computing resources if a pending number of computing tasks increases above a threshold level.  [0019] discloses that a cloud scaling system utilizes usage metrics (e.g., a number of pending processing requests in a queue) to determine the compute scaling adjustment.  [0020] discloses that reinforcement learning enables adjustments to the number of compute instances based on parameters such as queue size or pending jobs.  [0037] discloses that the scaling application is executed in response to certain criteria, for example, when a number of pending jobs in a queue exceeds a certain size.  [0044] discloses that a scaling adjustment may be performed to reduce the queue size to a desired size.  For example, a desirable queue size might be 50 tasks in the queue, whereas an undesirable queue size may be 200 tasks in the queue.  Also see [0025] [0028] [0030] [0035] [0051].).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system/method of adjusting the number of compute instances when the number of pending task requests in a queue exceeds a threshold as taught by Friedrich into the combination of Sun and Shah because the machine learning approach of Friedrich would enable efficient use of computing resources by automatically adjusting computing resources based on different types of computing tasks (see Friedrich: [0017-0018]).
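For illustration only, a threshold on the number of requests of a particular type that are allowed to wait in a queue, as recited in claim 20, can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Friedrich or from the application: the name types_over_threshold and the per-type limits in WAIT_THRESHOLDS are hypothetical, and a scaling operation is assumed to be warranted for any type whose waiting count exceeds its limit.

```python
from collections import Counter

# Hypothetical limits on how many requests of each type may wait in the queue.
WAIT_THRESHOLDS = {"high": 2, "medium": 5, "low": 10}


def types_over_threshold(pending_types, thresholds=WAIT_THRESHOLDS):
    """Return, sorted by name, the request types whose waiting count exceeds its limit.

    `pending_types` is an iterable holding the type of each request
    currently waiting in the queue.
    """
    counts = Counter(pending_types)
    return sorted(t for t, limit in thresholds.items() if counts[t] > limit)


if __name__ == "__main__":
    waiting = ["high"] * 3 + ["low"] * 4
    # Three high priority requests are waiting against a limit of two.
    print(types_over_threshold(waiting))
```

A non-empty result would be the trigger for scaling in this sketch; how many resources to add in response is left outside its scope.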

Allowable Subject Matter
Claims 7-9, 16, 17 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Hiren Patel whose telephone number is (571) 270-3366.  The examiner can normally be reached on Monday to Friday 9:30 AM to 6:00 PM.
If attempts to reach the above noted Examiner by telephone are unsuccessful, the Examiner’s supervisor, Emerson Puente, can be reached at the following telephone number: (571) 272-3652. 
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

November 16, 2022

/HIREN P PATEL/Primary Examiner, Art Unit 2196