Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-20 are presented for examination.
Claim Rejections - 35 USC § 112 
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which applicant regards as the invention.

In claim 1, the limitation “determine, based on a current vGPU scheduling policy for the virtual machine, a scheduling policy matching order for a migration of the virtual machine” is unclear. In particular, it is not clear what the term “matching order” means.
Paragraph 28 of the specification mentions “Table 5 shows an example of how a source vGPU scheduling policy can be used to identify a vGPU scheduling policy matching order for migration. For equal share, the vGPU scheduling policy matching order can list equal share, best effort, and fixed share in order of decreasing priority”.
One way to clarify the concept (and be consistent with the specification) would be to recite “determine a scheduling policy that best matches the vGPU scheduling policy from an ordered list”.
Claims 8 and 15 have the same problem and are rejected for the same reasons.
The remaining claims, not specifically mentioned, are rejected for being dependent upon one of the claims above.
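For illustration only, the ordered-list reading suggested above can be sketched in Python. This sketch is not taken from the specification or from any cited reference. The names MATCHING_ORDER and best_available_policy are hypothetical, and only the equal share ordering is drawn from the quoted paragraph 28; orders for other source policies are not quoted in this action and are left out.

```python
# Hypothetical lookup: source policy -> policies in decreasing priority.
# Only the "equal share" entry comes from the quoted paragraph 28.
MATCHING_ORDER = {
    "equal share": ["equal share", "best effort", "fixed share"],
}


def best_available_policy(source_policy, available_policies):
    """Return the highest-priority policy in the matching order that is
    also available, or None when nothing in the order is available."""
    for policy in MATCHING_ORDER.get(source_policy, []):
        if policy in available_policies:
            return policy
    return None
```

Under this reading, a virtual machine currently using equal share that can only reach GPUs offering best effort or fixed share would be matched to best effort, the earlier entry in the list.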
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 2, 8, 9, 15 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (Chen-2017.pdf: "GaaS Workload Characterization under NUMA Architecture for Virtualized GPU," 2017 IEEE International Symposium on Performance Analysis of Systems and Software, April 2017 (hereinafter Chen)) in view of Suresh (US 2019/0019267 A1) and further in view of Tunuguntla (US 2018/0060996 A1).

As per claim 1, Chen teaches A system comprising: 
at least one computing device comprising at least one processor and at least one data store; 
machine readable instructions accessible to the at least one computing device, wherein the instructions, when executed by the at least one processor, cause the at least one computing device to at least: 
identify at least one graphics processing unit (GPU) that is compatible with a current virtual GPU (vGPU) profile for a virtual machine, wherein the at least one GPU comprises a corresponding at least one vGPU scheduling policy; (Chen (page 68) Each physical GPU can be configured to support different profiles of virtual GPU. Each vGPU profile has a fixed amount of framebuffer and resolution, as shown in Table II.. and (page 67) During VM initialization, the vGPU manager communicates with the physical GPU to determine its vGPU configuration, including the number of GPU channels and amount of framebuffer required.. The GPU hardware scheduler adopts a round-robin method to schedule the requests sent from each VM’s GPU channels.)
Chen does not teach determine, based on a current vGPU scheduling policy for the virtual machine, a scheduling policy matching order for a migration of the virtual machine.
However, Suresh teaches determine, based on a current vGPU scheduling policy for the virtual machine, a scheduling policy matching order for a migration of the virtual machine; (Suresh Fig 6C and [0060] For example, an end user may request the creation of a virtual machine having a specified amount of memory, processing power, and network capabilities.  The management server 410 may respond to the user's request and may allocate the resources to create the virtual machine without the user knowing whether the virtual machine was created using resources from zone 401 or zone 402. [0090] At step 610, virtual GPU manager 522 may classify each of the one or more one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 based on the processing performance information received at step 609.  In some instances, the classification may concern identifying each of the one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 as being either high-load processing or light-load processing [0091] At step 611, virtual GPU manager 522 may rank each of the one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 based on the respective processing performance information.  In some instances, the rankings assigned by virtual GPU manager 522 may be done in relation to the classifications (e.g., high-load processing and light-load processing) identified at step 610 but, in other instances, the rankings assigned by the virtual GPU manager 522 may be done in across the classifications (e.g., regardless of the classification identified at step 610).  The rankings may be assigned by virtual GPU manager 522 based on a data value associated with one or more of the processing performance variables.  
For instance, a processing unit from the one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 with a higher processing capacity may be ranked above a processing unit from the one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 with a lower processing capacity).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Suresh with the system of Chen in order to determine the policy matching order. One having ordinary skill in the art would have been motivated to incorporate Suresh into the system of Chen for the purpose of optimizing the rendering of graphics in a virtual GPU system (Suresh paragraph 01).

Chen and Suresh do not teach select a destination GPU from the at least one GPU based on a vGPU scheduling policy of the destination GPU being identified as a best available one of the at least one vGPU scheduling policy according to the scheduling policy matching order and migrate the virtual machine to the destination GPU.
However, Tunuguntla teaches select a destination GPU from the at least one GPU based on a vGPU scheduling policy of the destination GPU being identified as a best available one of the at least one vGPU scheduling policy according to the scheduling policy matching order; (Tunuguntla Fig 9 Block 902, 904 and 906 and paragraphs 86 and 87)

migrate the virtual machine to the destination GPU. (Tunuguntla Fig 9 Block 908 and [0087] Next, at block 908, the recommended migration is automatically executed by the cluster management server 106, if the dynamic mode is enabled for the graphics resource management module.)

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Tunuguntla with the system of Chen and Suresh in order to select a migration destination. One having ordinary skill in the art would have been motivated to incorporate Tunuguntla into the system of Chen and Suresh for the purpose of placing virtual computing instances in a distributed computer system that utilizes virtual graphic processing unit (vGPU) requirements of the virtual computing instances to place the virtual computing instances on a plurality of hosts of the distributed computer system (Tunuguntla paragraph 05).
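As a reading aid only, the selecting limitation of claim 1 can be sketched in Python as choosing, from the compatible GPUs, the one whose scheduling policy appears earliest in a given matching order. The sketch is an assumption about how the limitation might operate and is not drawn from Tunuguntla or from the application. The names Gpu and select_destination are hypothetical, and the migration step itself is not modeled.

```python
from typing import NamedTuple, Optional, Sequence


class Gpu(NamedTuple):
    """Hypothetical record for a compatible GPU and its vGPU scheduling policy."""
    name: str
    policy: str


def select_destination(gpus: Sequence[Gpu],
                       matching_order: Sequence[str]) -> Optional[Gpu]:
    """Pick the GPU whose policy ranks best in matching_order.

    GPUs whose policy is absent from the order are skipped. Ties keep the
    first GPU listed. Returns None when no GPU qualifies.
    """
    rank = {policy: position for position, policy in enumerate(matching_order)}
    eligible = [gpu for gpu in gpus if gpu.policy in rank]
    if not eligible:
        return None
    return min(eligible, key=lambda gpu: rank[gpu.policy])
```

With the order equal share, best effort, fixed share, a candidate set containing one fixed share GPU and one best effort GPU would yield the best effort GPU as the destination.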

As per claim 2, Suresh teaches wherein the vGPU scheduling policy matching order comprises a ranking of a plurality of vGPU scheduling policies, wherein the ranking is specific to the vGPU scheduling policy. (Suresh Fig 6C and [0060] For example, an end user may request the creation of a virtual machine having a specified amount of memory, processing power, and network capabilities.  The management server 410 may respond to the user's request and may allocate the resources to create the virtual machine without the user knowing whether the virtual machine was created using resources from zone 401 or zone 402. [0090] At step 610, virtual GPU manager 522 may classify each of the one or more one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 based on the processing performance information received at step 609.  In some instances, the classification may concern identifying each of the one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 as being either high-load processing or light-load processing [0091] At step 611, virtual GPU manager 522 may rank each of the one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 based on the respective processing performance information.  In some instances, the rankings assigned by virtual GPU manager 522 may be done in relation to the classifications (e.g., high-load processing and light-load processing) identified at step 610 but, in other instances, the rankings assigned by the virtual GPU manager 522 may be done in across the classifications (e.g., regardless of the classification identified at step 610).  The rankings may be assigned by virtual GPU manager 522 based on a data value associated with one or more of the processing performance variables.  
For instance, a processing unit from the one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 with a higher processing capacity may be ranked above a processing unit from the one or more enumerated discreet GPUs 514A-514N and/or integrated CPU/GPU(s) 512 with a lower processing capacity).

As to claims 8 and 15, they are rejected based on the same reason as claim 1.
As to claims 9 and 16, they are rejected based on the same reason as claim 2.

Claims 4, 7, 11, 14 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (Chen-2017.pdf: "GaaS Workload Characterization under NUMA Architecture for Virtualized GPU," 2017 IEEE International Symposium on Performance Analysis of Systems and Software, April 2017 (hereinafter Chen)) in view of Suresh (US 2019/0019267 A1), further in view of Tunuguntla (US 2018/0060996 A1) and Baggerman (US 2019/0139185 A1).

As per claim 4, Chen, Suresh and Tunuguntla teach configure the destination GPU to a particular vGPU profile based on at least one of: the current vGPU profile (already covered in claim 1).
However, they do not teach a GPU memory requirement of the virtual machine.
However, Baggerman teaches a GPU memory requirement of the virtual machine. (Baggerman Fig 3B Block 344 (Migrate workload of VM to physical GPU associated with reassigned VGPU profile) [0010] The vGPU profile mechanism includes a workload classification module and a profile reassignment module. The workload classification module retrieves information describing the workload of a virtual machine, compares the information to that of predefined workload profiles, and classifies the workload based on the comparison. Using the classified workload of the virtual machine and one or more profile reassignment rules, the profile reassignment module of the vGPU profile mechanism may reassign a virtual machine to a different vGPU profile [0037] A vGPU profile mechanism 114 retrieves information describing a workload of each virtual machine 110, 112 supported by one or more nodes 105 included in the system 102.  A workload may describe an amount of one or more computing resources used by a virtual machine 110, 112 and a duration of use of the one or more computing resources by the virtual machine 110, 112 to perform one or more tasks comprising the workload.  For example, a given workload may describe an amount of memory, CPU, or storage required by a virtual machine 110, 112 to run an email application while another workload may describe a different amount of memory, CPU, or storage required by the virtual machine 110, 112 to run a game application.  In some embodiments, the vGPU profile mechanism 114 may store the information describing the workload of each of the virtual machines 110, 112 (e.g., as observed workload data 126 in a database 124) for subsequent retrieval.)

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Baggerman with the system of Chen, Suresh and Tunuguntla in order to meet the GPU memory requirement of the virtual machine. One having ordinary skill in the art would have been motivated to incorporate Baggerman into the system of Chen, Suresh and Tunuguntla for the purpose of dynamically allocating GPU resources in a networked virtualization system (Baggerman paragraph 02).

As per claim 7, Chen, Suresh and Tunuguntla do not teach determine that a number of virtual machines running on the destination GPU is less than a maximum number of virtual machines for the destination GPU.
However, Baggerman teaches determine that a number of virtual machines running on the destination GPU is less than a maximum number of virtual machines for the destination GPU. (Baggerman [0032] For example, a node may support a maximum number of virtual machines based on the node's available resources (e.g., memory, CPU, GPU, and scheduling limitations specific to the node), as well as the amount of the node's available resources that are required by each virtual machine to process its workload. and [0061] A general trend that is illustrated in both tables 200, 202 that is central to the concept of vGPU profiles 108 and the virtualization of GPU resources on a physical GPU 107 is the tradeoff between the amount of GPU resources that may be allocated per user and the maximum number of users (i.e., the number of virtual machines 110, 112) that may be supported by each graphics board 104.)

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Baggerman with the system of Chen, Suresh and Tunuguntla in order to limit the number of virtual machines running on a GPU. One having ordinary skill in the art would have been motivated to incorporate Baggerman into the system of Chen, Suresh and Tunuguntla for the purpose of dynamically allocating GPU resources in a networked virtualization system (Baggerman paragraph 02).
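The comparison recited in claim 7 can be stated compactly. The following Python sketch is illustrative only; the function name has_vm_capacity and its parameters are hypothetical and are not taken from Baggerman or from the application.

```python
def has_vm_capacity(running_vms: int, max_vms: int) -> bool:
    """Return True when the number of VMs running on the destination GPU
    is strictly less than the maximum number of VMs for that GPU."""
    if running_vms < 0 or max_vms < 0:
        raise ValueError("VM counts must be non-negative")
    return running_vms < max_vms
```

Note that the comparison is strict: a GPU already running its maximum number of virtual machines does not satisfy the limitation.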

As to claims 11 and 18, they are rejected based on the same reason as claim 4.
As to claim 14, it is rejected based on the same reason as claim 7.

Claims 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (Chen-2017.pdf: "GaaS Workload Characterization under NUMA Architecture for Virtualized GPU," 2017 IEEE International Symposium on Performance Analysis of Systems and Software, April 2017 (hereinafter Chen)) in view of Suresh (US 2019/0019267 A1), further in view of Tunuguntla (US 2018/0060996 A1), Liu (US 2018/0349204 A1), Kruglick (US 2013/0290956 A1) and Appu (US 2018/0293701 A1).

As per claim 5, Chen, Suresh and Tunuguntla do not teach the vGPU scheduling policy corresponds to: best-effort scheduling.
However, Liu teaches the vGPU scheduling policy corresponds to: best-effort scheduling (Liu [0058] After it is determined that the physical GPU has a sufficient remaining time slice and a sufficient frame buffer space for meeting a computing resource requirement of a virtual GPU of the type C to be created, the GPU resource manager can return a query result that the creation is feasible to libvirt, which can send the query result to the control system to control the issuing of a request for creating a virtual machine.).

This is consistent with paragraph 17 of the specification ([0017] In the best effort policy, each virtual machine 118 or workload assigned to vGPUs 151 of a GPUs 115 can use GPU cycles until its time slice is over or until the job queue is empty).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Liu with the system of Chen, Suresh and Tunuguntla in order to implement best-effort scheduling. One having ordinary skill in the art would have been motivated to incorporate Liu into the system of Chen, Suresh and Tunuguntla for the purpose of dynamically adjusting virtual GPUs (Liu paragraph 04).

Chen, Suresh and Tunuguntla do not teach equal-share scheduling.
However, Kruglick teaches equal-share scheduling (Kruglick [0027] In fair-sharing type scheduling techniques, different VMs are generally provided with substantially equal processor 101 time).

This is consistent with paragraph 18 of the specification ([0018] For equal share, the amount of cycles given to each vGPU 151 is determined by the current number of virtual machines 118 of vGPUs, regardless of whether these virtual machines 118 are running CUDA or GPU-utilizing applications or not).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Kruglick with the system of Chen, Suresh and Tunuguntla in order to implement equal-share scheduling. One having ordinary skill in the art would have been motivated to incorporate Kruglick into the system of Chen, Suresh and Tunuguntla for the purpose of adapting data center technologies to run as efficiently as possible (Kruglick paragraph 03).

Chen, Suresh and Tunuguntla do not teach fixed-share scheduling.
However, Appu teaches fixed-share scheduling (Appu [Abstract] For example, one embodiment of an apparatus comprises a graphics processing unit (GPU) comprising a plurality of graphics processing resources; slice configuration hardware logic to logically subdivide the graphics processing resources into a plurality of slices; and slice allocation hardware logic to allocate a designated number of slices to each virtual machine (VM) of a plurality of VMs running in a virtualized execution environment, the slice allocation hardware logic to allocate different numbers of slices to different VMs based on graphics processing requirements and/or priorities of each of the VMs). 

This is consistent with what is disclosed in paragraph 19 of the specification ([0019] For fixed share, the amount of cycles given to each vGPU 151 is determined by the total number of supported virtual machines 118 under the given scheduling policy, regardless if other virtual machines 118 are powered on or not).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Appu with the system of Chen, Suresh and Tunuguntla in order to implement fixed-share scheduling. One having ordinary skill in the art would have been motivated to incorporate Appu into the system of Chen, Suresh and Tunuguntla for the purpose of dynamic provisioning, quality of service (QoS) and prioritization in a graphics processor (Appu paragraph 01).

As to claims 12 and 19, they are rejected based on the same reason as claim 5.
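To make the distinction between the equal share and fixed share passages quoted from paragraphs 18 and 19 concrete, the following Python sketch computes a per-vGPU share of GPU cycles. The quoted passages state only which count determines the share; the even 1/N split used here is an assumption made for illustration, and the function names are hypothetical. Best effort is not modeled because, per the quoted paragraph 17, it has no fixed share.

```python
from fractions import Fraction


def equal_share_fraction(current_vms: int) -> Fraction:
    """Share per vGPU when cycles depend on the current number of VMs
    (assumed even split; see paragraph 18 as quoted above)."""
    if current_vms < 1:
        raise ValueError("at least one virtual machine is required")
    return Fraction(1, current_vms)


def fixed_share_fraction(max_supported_vms: int) -> Fraction:
    """Share per vGPU when cycles depend on the total number of supported
    VMs, whether or not they are powered on (assumed even split; see
    paragraph 19 as quoted above)."""
    if max_supported_vms < 1:
        raise ValueError("at least one supported virtual machine is required")
    return Fraction(1, max_supported_vms)
```

Under these assumptions, a GPU that supports eight virtual machines but currently hosts two would give each vGPU one half of the cycles under equal share and one eighth under fixed share.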

Claims 6, 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (Chen-2017.pdf: "GaaS Workload Characterization under NUMA Architecture for Virtualized GPU," 2017 IEEE International Symposium on Performance Analysis of Systems and Software, April 2017 (hereinafter Chen)) in view of Suresh (US 2019/0019267 A1), further in view of Tunuguntla (US 2018/0060996 A1) and Kelly (US 2020/0394063 A1).

As per claim 6, Chen, Suresh and Tunuguntla do not teach identify at least one host that matches threshold migration requirements, wherein the at least one host comprises the at least one GPU.
However, Kelly teaches identify at least one host that matches threshold migration requirements, wherein the at least one host comprises the at least one GPU. (Kelly [0033] migrating a workload associated with the virtual desktop instance 230-1, 230-2, 230-3, 230-N among the computer cluster when a threshold composite score is reached.  In an example, the threshold score may be 20 out of 100.)

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Kelly with the system of Chen, Suresh and Tunuguntla in order to find a host that matches threshold requirements. One having ordinary skill in the art would have been motivated to incorporate Kelly into the system of Chen, Suresh and Tunuguntla for the purpose of migrating virtual machines (VMs) between cluster nodes (Kelly paragraph 01).

As to claims 13 and 20, they are rejected based on the same reason as claim 6.

Allowable Subject Matter
Claims 3, 10 and 17 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

US 20210303327 A1 – discloses graphics processing unit (GPU)-remoting latency aware migration.  In some aspects, a host executes a GPU-remoting client that includes a GPU workload.  GPU-remoting latencies are identified for hosts of a cluster.  A destination host is identified based on having a lower GPU-remoting latency than the host currently executing the GPU-remoting client.  The GPU-remoting client is migrated from its current host to the destination host.

US 20210026672 A1 – discloses avoiding power-on failures during virtualization of graphics processing units.  A computing environment can be directed to, in response to a virtual machine being powered on, identify a profile for a virtual graphics processing unit (vGPU) designated for the virtual machine, the profile specifying an amount of memory required by the vGPU, identify that the virtual machine is unable to be assigned to any of a plurality of physical graphics processing units (GPUs) based on the amount of memory required by the vGPU, free at least the amount of memory required by the vGPU by performing a migration of at least one existing virtual machine from a first one of the physical GPUs to a second one of the physical GPUs, and assign the virtual machine to an available one of the physical GPUs and a corresponding host.

US 20140181806 A1 – discloses dynamically allocating graphics processing units among virtual machines.  Example embodiments provide a dynamic GPU allocation system ("DGAS"), which enables the efficient allocation of physical GPU resources to one or more virtual machines.  In one embodiment, the DGAS comprises a GPU allocation list for use in allocating the physical GPU resources comprising one or more virtual machine entries each containing a designation of a virtual machine, an indication of a GPU benefit factor associated with the designated virtual machine, and an indication of processing bandwidth requirements associated with the designated virtual machine.  The entries are ranked based at least upon the GPU benefit factor associated with each designated virtual machine.  Available GPU resources are allocated to some subset of these ranked virtual machines as physical GPU capacity is matched with the requirements of the subset.

US 20110083131 A1 – discloses automatically allocating resources to a virtual machine.  Expected workload profile data and application utilization data corresponding to a software application associated with a virtual machine (VM) are collected by an application profiling agent.  Resource utilization data corresponding to the utilization of resources associated with the execution of the software application is collected by a system resource monitor.  The expected workload profile data, the application utilization data, and the resource utilization data are then processed to determine a virtual machine workload class, which is then used to determine a corresponding VM policy.  Data associated with the VM policy is then processed to generate VM resource allocation instructions, which are in turn processed to provision the VM.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MEHRAN KAMRAN whose telephone number is (571)272-3401.  The examiner can normally be reached on 9-5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emerson Puente can be reached on (571)272-3652.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MEHRAN KAMRAN/           Primary Examiner, Art Unit 2196