Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Claims 1-20 are currently pending and have been examined.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.
The following claim language is not clearly understood and is indefinite:
As per claim 1, lines 7-8 recite “perform workloads assigned to virtual graphics processing unit (vGPU)-enabled graphics processing units (GPUs) based on a plurality of vGPU placement models”. However, it is unclear when the workloads are or were assigned to the vGPU-enabled GPUs. Further, it is unclear whether the “workloads assigned” in line 7 are related to the assigning of “at least one workload to execute on the GPUs”. Further, it is not clearly defined as what 
As per claims 8 and 15, they are rejected for having similar issues as claim 1 above. 
As per claims 2-7, 9-14 and 16-20, they are rejected as being dependent on rejected claims 1, 8 and 15.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 7-10 and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Sivaraman et al. “Task Assignment in a Virtualized GPU Enabled Cloud” in view of Steiner et al. (U.S. Pub. No. 20190026624 A1).

As per claim 1, Sivaraman teaches the invention substantially as claimed including a system comprising: 
perform [simulate] workloads assigned to virtual graphics processing unit (vGPU)-enabled graphics processing units (GPUs) based on a plurality of vGPU placement models, wherein the workloads are performed to generate workload data and GPU data for the plurality of vGPU placement models (pg. 897, right col., lines 5-11, To investigate and identify fast, efficient solutions for VM-placement in a vGPU enabled cloud (vGPU cloud), we built a simple simulator that allows us to compare different techniques and algorithms [models]. We also defined a cost function that we use to compare the different solutions. The simulator is built so we can add as many placement algorithms [models] as needed to it … simulator puts in one of … three categories of jobs based on our experience running a vGPU cloud … these categories have different characteristics [workload, GPU data]); 
assign at least one workload to execute on the vGPU-enabled GPUs (pg. 895, left col. lines 36-37 assigning a task with GPU requirements to a vGPU enabled cloud; pg. 896 right col., lines 3-10 The problem of VM-placement is to determine at each scheduling instance a set of tasks/jobs that can be placed on the available GPUs, given the constraints imposed by the vGPU solution, so as to optimize the following costs: Utilization of the GPUs. Number of jobs/tasks completed. Time spent by a job 
Sivaraman does not expressly teach: at least one computing device comprising at least one processor and at least one data store; machine readable instructions stored in the at least one data store, wherein the instructions, when executed by the at least one processor, cause the at least one computing device to at least: train a plurality of vGPU placement neural networks to maximize a composite efficiency metric based on the workload data and the GPU data for the plurality of vGPU placement models; generate a combined neural network selector based on the plurality of vGPU placement neural networks; and utilize the combined neural network selector to assign at least one workload to execute on the vGPU-enabled GPUs, wherein the combined neural network selector selects the at least one workload based on a plurality of workload parameters.
However, Steiner teaches:
at least one computing device comprising at least one processor and at least one data store; machine readable instructions stored in the at least one data store, wherein the instructions (par. 0079 Computers suitable for the execution of a computer program can be based on general or special purpose microprocessors or both, or any other kind of central processing unit. Generally, a central processing unit will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data), when executed by the at least one processor, cause the at least one computing device to at least: 
train a plurality of vGPU placement neural networks to maximize a composite efficiency metric based on the workload data and the GPU data for the plurality of vGPU placement models (par. 0008 Further, the grouper and placer neural network are trained jointly using reinforcement learning to optimize for speed of computation, i.e., by maximizing a reward derived from execution times. Therefore, the placements determined by the two neural networks can result in a decreased processing time for the machine learning model operations, i.e., allow the machine learning model to be trained quicker, machine learning model inferences to be generated quicker, or both, while using the same number of hardware devices to perform the operations); 
generate a combined neural network selector based on the plurality of vGPU placement neural networks, and utilize the combined neural network selector to assign at least one workload to execute on the vGPU-enabled GPUs, wherein the combined neural network selector selects the at least one workload based on a plurality of workload parameters (par. 0033 Once the grouper neural network 104 and the placer neural network 106 have been trained, the system 100 can use the trained grouper neural network 104 and the trained placer neural network 106 to determine a placement that assigns each of the operations specified by the input data 110 to a respective device from the plurality of hardware devices). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Sivaraman with Steiner since both contain teachings directed toward placement of workloads on GPUs. One of ordinary skill in the art would have been motivated to incorporate the method of training 

As per claim 2, Sivaraman teaches wherein the workload data and GPU data for a respective vGPU placement model are generated by placing workloads in a variety of datacenter configurations comprising a plurality of different arrival rates and a plurality of different GPU counts (pg. 898, lines 49-53, We ran the simulator using a workload … with … VM placement policies … For each choice of placement policy we varied the following two parameters [varied configurations]: • The number of run queues from four to eight to twelve • The arrival rate from 72 arrivals per hour to 144 arrivals per hour to 216 arrivals per hour to 288 arrivals per hour; pg. 899, left col. lines 1-4 So, in all, we ran the simulator with five policies times three queues counts times four arrival rates making for a total of sixty runs. For all sixty runs, we used the best-effort scheduling policy on the Nvidia GPU).
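For illustration only, the run matrix described in the cited passage (five policies, three run-queue counts, four arrival rates) can be enumerated as in the following minimal Python sketch. The numeric values and policy names are those quoted from Sivaraman; the constant and function names are hypothetical and do not come from the reference or from the application.

```python
from itertools import product

# Values quoted in the cited Sivaraman passages; identifiers are illustrative only.
POLICIES = ["FCFS", "LongestFirst", "LongestWaitFirst", "Random", "ShortestFirst"]
RUN_QUEUE_COUNTS = [4, 8, 12]
ARRIVALS_PER_HOUR = [72, 144, 216, 288]


def enumerate_runs():
    """Return every (policy, run_queue_count, arrival_rate) combination."""
    return list(product(POLICIES, RUN_QUEUE_COUNTS, ARRIVALS_PER_HOUR))


if __name__ == "__main__":
    runs = enumerate_runs()
    # Five policies times three queue counts times four arrival rates.
    print(len(runs))
```

The count of combinations matches the "total of sixty runs" stated in the quoted passage.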

As per claim 3, Sivaraman teaches wherein the plurality of vGPU placement models comprise at least one of a first-come-first-serve (FCFS) vGPU placement model, a longest-first vGPU placement model, a longest-wait-first vGPU placement model, a random vGPU placement model, a shortest-first vGPU placement model, and a bin-packing heuristic vGPU placement model (pg. 898, right col. lines 33-48 We ran the simulator using a workload … with the following VM placement policies: FCFS: On each simulator clock tick it tries to place first VM on the wait queue, then the second, and so on. LongestFirst: On each simulator clock tick it tries to place the VM with the longest run time first, then the one with the second longest run time, and so on. LongestWaitFirst: On each simulator clock tick it tries to place the VM with the longest wait time first, then the one with the second longest wait time, and so on. Random: On each simulator clock tick it randomly selects a VM and tries to place it, then it selects another one at random and so on. ShortestFirst: On each simulator clock tick it tries to place the VM with the shortest run time first, then the one with the second shortest run time, and so on).
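For illustration only, the five quoted policies differ in the order in which waiting VMs are considered on each clock tick. The following minimal Python sketch shows that ordering step under stated assumptions: the `PendingVM` record, its fields, and the `placement_order` function are hypothetical names introduced here, and the sketch is not the simulator of Sivaraman.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingVM:
    """Hypothetical record for a VM on the wait queue."""
    name: str
    arrival: int      # position in arrival order (0 = first to arrive)
    run_time: float   # expected run time
    wait_time: float  # time already spent waiting


def placement_order(queue, policy, rng=None):
    """Return the order in which placement is attempted on one clock tick."""
    if policy == "FCFS":
        return sorted(queue, key=lambda vm: vm.arrival)
    if policy == "LongestFirst":
        return sorted(queue, key=lambda vm: vm.run_time, reverse=True)
    if policy == "LongestWaitFirst":
        return sorted(queue, key=lambda vm: vm.wait_time, reverse=True)
    if policy == "ShortestFirst":
        return sorted(queue, key=lambda vm: vm.run_time)
    if policy == "Random":
        rng = rng or random.Random()
        shuffled = list(queue)
        rng.shuffle(shuffled)
        return shuffled
    raise ValueError(f"unknown policy: {policy}")
```

Whether a VM considered in this order is actually placed would depend on the GPU constraints discussed in the reference, which this sketch does not model.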

As per claim 7, Sivaraman and Steiner teach the limitations of claim 1. Steiner further teaches: wherein at least one of the vGPU placement neural networks comprises at least two sets of layers, a respective set comprising four layers, a respective layer comprising at least twenty-four nodes (par. 0003 Neural networks are machine learning models that employ one or more layers of nonlinear units to predict an output for a received input. Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, i.e., the next hidden layer or the output layer. Each layer of the network generates an output from a received input in accordance with 

As per claim 8, it is a method having similar limitations as claim 1. Thus, claim 8 is rejected for the same rationale as applied to claim 1.

As per claim 9, it is a method having similar limitations as claim 2. Thus, claim 9 is rejected for the same rationale as applied to claim 2.

As per claim 10, it is a method having similar limitations as claim 3. Thus, claim 10 is rejected for the same rationale as applied to claim 3.

As per claim 14, it is a method having similar limitations as claim 7. Thus, claim 14 is rejected for the same rationale as applied to claim 7.

As per claim 15, it is a non-transitory computer-readable medium having similar limitations as claim 1. Thus, claim 15 is rejected for the same rationale as applied to claim 1.

As per claim 16, it is a non-transitory computer-readable medium having similar limitations as claim 2. Thus, claim 16 is rejected for the same rationale as applied to claim 2.

As per claim 17, it is a non-transitory computer-readable medium having similar limitations as claim 3. Thus, claim 17 is rejected for the same rationale as applied to claim 3.

Claims 4, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Sivaraman in view of Steiner as applied to claim 1, and further in view of Trippi et al. “Trading equity index futures with a neural network”.

As per claim 4, Sivaraman and Steiner teach the limitations of claim 1. Sivaraman and Steiner do not expressly teach: wherein the combined neural network selector combines the plurality of vGPU placement neural networks based on: a logical-OR combination operation, or a scaled-add combination operation.
However, Trippi teaches wherein the combined neural network selector combines the plurality of vGPU placement neural networks based on: a logical-OR combination operation, or a scaled-add combination operation (pg. 7, lines 15-16 the outputs of individual networks are combined through the use of logical (Boolean) operators).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Sivaraman and Steiner to incorporate the method of combining neural networks as set forth by Trippi because it would provide for efficiently combining outputs of placement neural networks through use of logical Boolean combination operations, with predictable results.
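For illustration only, the two combination operations recited in claim 4 can be sketched in Python as follows. This is one possible reading, assuming each placement network emits a numeric score per candidate workload; the function names, the 0.5 threshold, and the scale factors are hypothetical and are not taken from the claims, from Trippi, or from the other references.

```python
def logical_or_combine(outputs, threshold=0.5):
    """Select the workload if ANY network's score meets the (assumed) threshold."""
    return any(score >= threshold for score in outputs)


def scaled_add_combine(outputs, scales):
    """Weighted sum of network scores; the scale factors are assumed inputs."""
    if len(outputs) != len(scales):
        raise ValueError("one scale factor is required per network output")
    return sum(scale * score for scale, score in zip(scales, outputs))


if __name__ == "__main__":
    scores = [0.1, 0.7]  # hypothetical outputs of two placement networks
    print(logical_or_combine(scores))
    print(scaled_add_combine(scores, scales=[0.5, 0.5]))
```

The logical-OR form corresponds to the Boolean combination of individual network outputs described in the cited Trippi passage; the scaled-add form is shown only to contrast the alternative recited in the claim.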

As per claim 11, it is a method having similar limitations as claim 4. Thus, claim 11 is rejected for the same rationale as applied to claim 4.

As per claim 18, it is a non-transitory computer-readable medium having similar limitations as claim 4. Thus, claim 18 is rejected for the same rationale as applied to claim 4.

Claims 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Sivaraman in view of Steiner as applied to claim 1, and further in view of Chatterjee et al. “Optimal MAC State Switching for cdma2000 Networks”.

As per claim 5, Sivaraman and Steiner teach the limitations of claim 1. Sivaraman and Steiner do not expressly disclose: wherein the composite efficiency metric is based on GPU utilization and time a workload waits prior to selection for placement.
However, Chatterjee teaches wherein the composite efficiency metric is based on GPU utilization and time a workload waits prior to selection for placement (page 400, Abstract, Our method uses a composite performance metric which has the capability of proportionally combining basic parameters …[resource] utilization, waiting time).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Sivaraman and Steiner to incorporate a composite efficiency metric as set forth by Chatterjee because it would provide for optimizing workload placement on vGPUs at least based on a composite performance metric that at least includes parameters of GPU utilization, and workload 

As per claim 12, it is a method having similar limitations as claim 5. Thus, claim 12 is rejected for the same rationale as applied to claim 5.

As per claim 19, it is a non-transitory computer-readable medium having similar limitations as claim 5. Thus, claim 19 is rejected for the same rationale as applied to claim 5.

Claims 6, 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Sivaraman in view of Steiner as applied to claim 1, and further in view of ETSI “Environmental Engineering (EE); Energy Efficiency measurement methodology and metrics for servers”.

As per claim 6, Sivaraman and Steiner teach the limitations of claim 1. Sivaraman and Steiner do not expressly disclose: wherein one of the workload parameters comprises a measure of the composite efficiency metric calculated for a workload scaled by a geometric mean of the composite efficiency metric calculated for a respective one of a plurality of workloads currently in an arrival queue awaiting selection for placement.
However, ETSI teaches wherein one of the workload parameters comprises a measure of the composite efficiency metric calculated for a workload scaled by a geometric mean of the composite efficiency metric calculated for a respective one of a plurality of workloads currently in an arrival queue awaiting selection for placement (pg. 18, section 5.1.2.1, The geometric mean function is used to combine the interval data to produce a worklet efficiency score, the worklet efficiency scores to create workload (CPU, memory, storage) efficiency scores and the workload efficiency scores to create a single efficiency metric. Using the geometric mean prevents any single performance, power, worklet or workload efficiency score from unduly influencing the single metric).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Sivaraman and Steiner to incorporate the method of calculating a composite performance metric using a geometric mean as set forth by ETSI because it would provide for preventing any single workload composite efficiency metric of a plurality of workloads from unduly influencing the composite efficiency metric.
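For illustration only, scaling one workload's metric by the geometric mean of the metrics of the queued workloads can be sketched in Python as follows. The sketch assumes strictly positive metric values and reads "scaled by" as division; both are assumptions, and the function names are hypothetical rather than taken from the claims or from ETSI.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean requires non-empty, positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def scaled_metric(metric, queue_metrics):
    """Metric of one workload divided by the geometric mean over the queue."""
    return metric / geometric_mean(queue_metrics)


if __name__ == "__main__":
    queue = [2.0, 8.0]  # hypothetical composite metrics of queued workloads
    print(geometric_mean(queue))
    print(scaled_metric(8.0, queue))
```

Because the geometric mean multiplies rather than adds, one unusually large value shifts it less than it would shift an arithmetic mean, which is the property the cited ETSI passage relies on.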

As per claim 13, it is a method having similar limitations as claim 6. Thus, claim 13 is rejected for the same rationale as applied to claim 6.

As per claim 20, it is a non-transitory computer-readable medium having similar limitations as claim 6. Thus, claim 20 is rejected for the same rationale as applied to claim 6.

Conclusion

Any inquiry concerning this communication or earlier communications from the 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai An, can be reached at (571) 272-3756. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WH/
Examiner, Art Unit 2195

/MENG AI T AN/
Supervisory Patent Examiner, Art Unit 2195