Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on March 15, 2022 has been entered.

EXAMINER’S NOTE
Features from Banerjee et al. (US 2020/0379815) that are relied on for the following rejections are supported by provisional application 62/855,591, filed on May 31, 2019.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 9-15 and 19-26 are rejected under 35 U.S.C. 103 as being unpatentable over Banerjee et al. (US 2020/0379815, Banerjee hereinafter) in view of Yash Ukidave et al. (“Runtime Support for Adaptive Spatial Partitioning and Inter-Kernel Communication on GPUs”, Yash hereinafter, 2014).

As to claim 1, Banerjee teaches an apparatus (e.g., FIG. 1), comprising:
a plurality of hardware queues configured to store command buffers (e.g., para 27, 30 and 34, “the term ‘command queue’ refers to ordered lists of one or more command buffers for submission to a graphics hardware.”, “vertex commands”, “the user space driver 102 can translate the API calls into commands encoded within command buffers before being transferred to kernel driver 103”, “user kernel driver 103 splits commands committed for a graphics processor” and “process queues 700A-K may be hardware queues” in para 81) prior to execution in respective pipelines of a plurality of pipelines (e.g., para 89, “graphics processor 900 includes vertex pipe 905”, “to process data (e.g., graphics data) in parallel using multiple execution pipelines or instances”, “the multiple execution pipelines correspond to a plurality of execution units of a processing circuit hardware resource allocation system”).
Banerjee further teaches a hardware scheduler (e.g., “103”, FIG. 1) configured to schedule command buffers (e.g., “command buffers”) stored in at least two of the plurality of hardware queues (e.g., “command buffers”) for execution on respective pipelines in response to an indication that the command buffers are both from a first application (e.g., “application 101”, FIG. 1, para 30, “The command buffers are then sent to the kernel driver 103 to prepare the commands for execution on the graphics processor resource 112. As an example, the kernel driver 103 may perform memory allocation and scheduling of the commands to be sent to the graphics processor resource 112” and “command queues across executing applications before they reach the graphics hardware” in para 48).
However, Banerjee does not teach the hardware scheduler configured to gang schedule command buffers, or concurrent execution on respective pipelines in response to an indication that the command buffers are both from a first application and are to be grouped.
Yash teaches a hardware scheduler configured to gang schedule command buffers stored in at least two of the plurality of hardware queues for concurrent execution on respective pipelines (e.g., “a scheduler that allows concurrent execution of different kernels using multiple command queues”) in response to an indication that the command buffers are both from a first application (e.g., “applications implemented using multiple command queues”) and are to be grouped (e.g., see FIGs. 1-4, “I. INTRODUCTION”, “III. SPATIAL PARTITIONING OF SOUTHERN ISLAND GPUS USING MULTIPLE COMMAND QUEUES ARCHITECTURE”, “Fig. 1. High-level model of a device capable of executing multiple command queues concurrently. The lists allow for flexible mapping of NDRange workgroups to compute units on the Southern Islands device”, pages 168-169, “a scheduler that allows concurrent execution of different kernels using multiple command queues”, “We use multiple OpenCL command queues and sub-device to submit workloads to the same GPU”, “concurrent execution of NDRanges”, for “the implementation of pipe-based communication channel between different sub-devices. Pipes allow for communication between different NDRanges executing simultaneously on the same GPU.”, “The pipe object is implemented using a buffer which is modeled as a queue.”, “for applications implemented using multiple command queues” in pages 171 and 173).
According to applicant’s specification in para 11, “the phrase ‘gang scheduling’ refers to concurrently scheduling command buffers generated by a single application or process from multiple queues for concurrent execution on corresponding virtual pipelines.” Under that definition, Yash teaches a hardware scheduler configured to gang schedule command buffers stored in at least two of the plurality of hardware queues for concurrent execution on respective pipelines. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash to have a hardware scheduler configured to gang schedule command buffers stored in at least two of the plurality of hardware queues for concurrent execution on respective pipelines in response to an indication that the command buffers are both from a first application and are to be grouped in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).
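For illustration only, the gang scheduling behavior quoted from para 11 can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Banerjee, Yash, or applicant; the names CommandBuffer, HardwareQueue, gang_schedule, app_id and group_id are hypothetical, and the pipelines are represented only by integer indices.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CommandBuffer:
    app_id: str    # hypothetical identifier of the submitting application
    group_id: int  # hypothetical tag marking buffers that are to be grouped
    name: str


@dataclass
class HardwareQueue:
    pipeline: int  # index of the pipeline this queue feeds
    buffers: deque = field(default_factory=deque)


def gang_schedule(queues, app_id, group_id):
    """Pop the head buffer of every queue whose head matches (app_id, group_id).

    Returns (pipeline, buffer) pairs that would be launched together, or an
    empty list when fewer than two queues hold a matching head buffer.
    """
    matching = [
        q for q in queues
        if q.buffers
        and q.buffers[0].app_id == app_id
        and q.buffers[0].group_id == group_id
    ]
    if len(matching) < 2:
        return []
    return [(q.pipeline, q.buffers.popleft()) for q in matching]


# Usage: two queues hold buffers from the same application and group,
# a third queue holds a buffer from a different application.
queues = [HardwareQueue(0), HardwareQueue(1), HardwareQueue(2)]
queues[0].buffers.append(CommandBuffer("app_a", 7, "draw"))
queues[1].buffers.append(CommandBuffer("app_a", 7, "compute"))
queues[2].buffers.append(CommandBuffer("app_b", 1, "copy"))
batch = gang_schedule(queues, "app_a", 7)
```

In this sketch only the two buffers tagged with the same application and group are selected for the same scheduling step; the buffer from the other application stays in its queue.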

As to claim 2, Banerjee teaches a kernel mode driver (e.g., “kernel driver 103”, FIG. 1) to allocate queues of the plurality of hardware queues to applications (e.g., para 30, “the kernel driver 103 to prepare the commands for execution on the graphics processor resource 112. As an example, the kernel driver 103 may perform memory allocation and scheduling of the commands to be sent to the graphics processor resource 112”).

As to claim 3, Banerjee further teaches wherein: the kernel mode driver is to indicate that the command buffers stored in the at least two queues are from the same application (e.g., see FIG. 1, “application 101”, para 20, “various command queues submitted by a single application” and “each command queue for an executing application” in para 51). However, Banerjee does not teach a group identifier received from the first application. Yash teaches the kernel mode driver is to indicate that the command buffers stored in the at least two queues (e.g., “Multiple Command Queue mapping”) are from the same application via a group identifier (e.g., “NDRange 0”, FIG. 3) received from the first application (e.g., see FIG. 3, right column of page 169, “A. Workgroup Scheduling Mechanism for Multiple Command Queue mapping on different Sub-Devices”, “uses lists to track the workgroups of the NDRanges mapped to that sub-device.”, “1. Available CU List”, “2. Pending Workgroup(WG) List”, “3. Usable CU List”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).


As to claim 4, Banerjee teaches wherein: the kernel mode driver is to modify allocation of the plurality of hardware queues in response to a modification request (e.g., para 82, “reallocate the graphics hardware resources based on the modified priorities” and “The adjusted priority may be used by hardware resource arbitration circuit 510 in future allocation cycles to reallocate graphics hardware resources 505” in para 65).

As to claim 5, Banerjee teaches wherein: modifying the allocation of the plurality of hardware queues comprises reallocating a first queue from the first application to a second application (e.g., para 65, “The adjusted priority may be used by hardware resource arbitration circuit 510 in future allocation cycles to reallocate graphics hardware resources 505”, “priority may be adjusted such that allocation of graphics hardware resources 505 to executes at cluster 230A trends towards a specified ratio over a period of time (e.g., 1 millisecond or 1 second)” for “normalization may comprise allocating hardware resources equitably across executing applications, regardless of the number of command queues submitted by a given application” in para 75-76, see FIG. 6).

As to claim 6, Banerjee teaches wherein: the kernel mode driver is to reconfigure priorities (e.g., “priority may be adjusted”) of at least one of the plurality of hardware queues in response to a reconfiguration request (e.g., para 65, “The adjusted priority may be used by hardware resource arbitration circuit 510 in future allocation cycles to reallocate graphics hardware resources 505”, “priority may be adjusted such that allocation of graphics hardware resources 505 to executes at cluster 230A trends towards a specified ratio over a period of time (e.g., 1 millisecond or 1 second)” for “normalization may comprise allocating hardware resources equitably across executing applications, regardless of the number of command queues submitted by a given application” in para 75-76, see FIG. 6).

As to claim 9, Banerjee does not teach wherein: in response to receiving an interrupt that comprises an address indicating a location of a routine to be executed by a second plurality of pipelines, the hardware scheduler is configured to use the address to access a data structure that identifies the routine. However, Yash teaches wherein: in response to receiving an interrupt (e.g., “enqueued kernels”, “to be stopped”) that comprises an address indicating a location (e.g., the extracted ID as an offset into the buffer) of a routine (e.g., “read and write APIs”) to be executed by a second plurality of pipelines (e.g., another one of “Pipes”), the hardware scheduler is configured to use the address to access a data structure (e.g., “pipe object”, “data/tile ID”) that identifies the routine (e.g., see Fig. 4, page 171, “D. Continuous Processing Using OpenCL Pipes”, “Fig. 4. Simulated implementation of OpenCL pipes. The pipe read and write APIs require an element size in order to index global memory”, “the enqueued kernels utilize the same memory space, they will still need to be stopped and restarted often to support synchronization and exchange of data”, “Pipes allow for communication between different NDRanges executing simultaneously on the same GPU.”, “The data/tile ID of the processed data is written to the pipe by the producer kernel. The consumer kernel extracts the data/tile ID from the pipe and accesses data from the intermediate buffer using the extracted ID as an offset into the buffer. Both of the kernels can execute on separate command queues mapped to different sub-devices on the GPU and maintain a real-time communication channel”).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).
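For illustration only, the pipe-based exchange quoted from Yash (a producer writes a data/tile ID to a pipe, and a consumer uses the extracted ID as an offset into an intermediate buffer) can be sketched as follows. This is a simplified sketch under stated assumptions, not code from Yash: TILE_SIZE, producer and consumer are hypothetical names, the pipe is modeled as a plain queue, and the two stages run one after the other rather than concurrently on separate command queues.

```python
from collections import deque

TILE_SIZE = 4  # hypothetical number of elements per tile


def producer(source, intermediate, pipe):
    """Process each tile, store it in the intermediate buffer, publish its ID."""
    for tile_id in range(len(source) // TILE_SIZE):
        start = tile_id * TILE_SIZE
        tile = source[start:start + TILE_SIZE]
        intermediate[start:start + TILE_SIZE] = [x * 2 for x in tile]
        pipe.append(tile_id)  # only the tile ID travels through the pipe


def consumer(intermediate, pipe):
    """Read tile IDs from the pipe and use each as an offset into the buffer."""
    sums = {}
    while pipe:
        tile_id = pipe.popleft()
        start = tile_id * TILE_SIZE
        sums[tile_id] = sum(intermediate[start:start + TILE_SIZE])
    return sums


source = list(range(8))
intermediate = [0] * len(source)
pipe = deque()
producer(source, intermediate, pipe)
result = consumer(intermediate, pipe)
```

The sketch shows only the indexing relationship between the ID carried by the pipe and the location of the data in the intermediate buffer.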

As to claim 10, Banerjee does not teach wherein: [U.S. App. No.: 16/721,434] in response to the hardware scheduler accessing the data structure that identifies the routine, the routine is to generate a plurality of hardware queue allocation requests for a second group that differs from a first group of a command that generated the interrupt. However, Yash teaches in response to the hardware scheduler accessing the data structure (e.g., “Pipe object”) that identifies the routine, the routine is to generate a plurality of hardware queue allocation requests (e.g., “executing multiple command queues concurrently”) for a second group (e.g., another one of “different NDRanges”) that differs from a first group (e.g., one of “different NDRanges”) of a command that generated the interrupt (e.g., see page 169, “III. SPATIAL PARTITIONING OF SOUTHERN ISLAND GPUS USING MULTIPLE COMMAND QUEUES ARCHITECTURE”, “Fig. 1. High-level model of a device capable of executing multiple command queues concurrently. The lists allow for flexible mapping of NDRange workgroups to compute units on the Southern Islands device.” and “D. Continuous Processing Using OpenCL Pipes”, “the enqueued kernels utilize the same memory space, they will still need to be stopped and restarted often to support synchronization and exchange of data”, “Pipes allow for communication between different NDRanges executing simultaneously on the same GPU”, “Pipe object stores data in the form of packets”, “Memory transactions on the pipe object are carried out using OpenCL built-in functions such as read pipe and write pipe”, “built-in functions of the pipe operations as user functions in the kernel”, “Fig. 4. Simulated implementation of OpenCL pipes. The pipe read and write APIs require an element size in order to index global memory” in page 171).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).


As to claim 11, see rejection of claim 1 above. Banerjee further teaches a method, comprising: allocating, at a kernel mode driver, a first plurality of queues to a first application (see FIGs. 1-4, para 51, “command queues of the application”).

As to claim 12, Banerjee teaches receiving, at the kernel mode driver, registration requests (e.g., “commands”) from the first application to allocate the first plurality of queues to the first application (e.g., see FIGs. 1-4, para 54, “each concurrently running application and command queue again has the chance to submit its commands to the graphics hardware for execution”).

As to claim 13, see rejection of claims 4 and 5 above.

As to claim 14, Banerjee does not teach identifying, at the kernel mode driver, a group identifier from the registration requests from the first application; and adding the first plurality of queues to a group of queues allocated to the first application, wherein the group of queues is associated with the group identifier. However, Yash teaches identifying, at the kernel mode driver (e.g., “kernel dispatch”), a group identifier (e.g., “uses lists to track”, “1. Available CU List”) from the registration requests from the first application; and adding the first plurality of queues to a group of queues allocated (e.g., “Multiple Command Queue mapping”) to the first application, wherein the group of queues is associated with the group identifier (e.g., see FIG. 3, right column of page 169, “A. Workgroup Scheduling Mechanism for Multiple Command Queue mapping on different Sub-Devices”, “uses lists to track the workgroups of the NDRanges mapped to that sub-device.”, “1. Available CU List”, “2. Pending Workgroup(WG) List”, “3. Usable CU List” for “kernel dispatch (occupancy)” in page 170). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).


As to claim 15, Banerjee does not teach deallocating the first plurality of queues from the first application in response to at least one of detecting expiration of a time quantum and detecting that the first plurality of queues is empty. However, Yash teaches deallocating the first plurality of queues (e.g., “to reassign resources to the active NDRanges”) from the first application in response to at least one of detecting expiration of a time quantum and detecting that the first plurality of queues is empty (e.g., see left column of page 170, “If the pending list is empty, the Load Balancer is invoked to reassign resources to the active NDRanges”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).
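For illustration only, the deallocation condition recited in claim 15 (expiration of a time quantum, or the queues being empty) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of either reference or of applicant; QueueAllocation, should_deallocate, reclaim and the numeric time values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QueueAllocation:
    owner: str           # hypothetical application identifier
    allocated_at: float  # time at which the queue was handed to the owner
    pending: list = field(default_factory=list)  # command buffers still queued


def should_deallocate(alloc, now, quantum):
    """True when the queue is empty or its time quantum has expired."""
    return not alloc.pending or (now - alloc.allocated_at) >= quantum


def reclaim(allocations, now, quantum):
    """Split allocations into those kept and those freed for reassignment."""
    kept, freed = [], []
    for alloc in allocations:
        (freed if should_deallocate(alloc, now, quantum) else kept).append(alloc)
    return kept, freed


allocs = [
    QueueAllocation("app_a", 0.0, ["cb0"]),  # quantum expired, still has work
    QueueAllocation("app_a", 0.0, []),       # empty queue
    QueueAllocation("app_b", 5.0, ["cb1"]),  # within quantum, has work
]
kept, freed = reclaim(allocs, now=6.0, quantum=4.0)
```

Either condition alone is sufficient to free a queue in this sketch, which mirrors the "at least one of" wording of the claim.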

  
As to claim 19, see rejection of claim 1 above. Banerjee further teaches allocating, at a kernel mode driver, a first queue to a first application, wherein the first queue is configured to store command buffers prior to execution in a first pipeline; and allocating, at the kernel mode driver, a second queue to the first application, wherein the second queue is configured to store command buffers prior to execution in a second pipeline (e.g., para 48-49, “command queues across executing applications”, “two applications submitted many more command queues” and “hardware resource allocation system 300, a balancing logic 213 module may be used to track the current utilization of graphics hardware resources by each active command queue.” in para 53). However, Banerjee does not teach gang scheduling a first group of command buffers from the first application to the first queue and the second queue for concurrent execution on the first pipeline and the second pipeline. Yash teaches gang scheduling (e.g., “the scheduler” for “concurrent execution of NDRanges”) a first group of command buffers (e.g., see FIG. 1, “pending WG list”) from the first application to the first queue and the second queue for concurrent execution on the first pipeline and the second pipeline (e.g., see page 169, “Fig. 1. High-level model of a device capable of executing multiple command queues concurrently.”, “the scheduler for fixed and adaptive spatial partitioning of the GPU, which enables concurrent execution of NDRanges”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).


As to claim 21, Banerjee does not teach wherein gang scheduling the first group of command buffers comprises: in response to a delay indication from the first application, introducing a predetermined delay between execution of a first command buffer scheduled to the first queue and a second command buffer scheduled to the second queue. However, Yash teaches in response to a delay indication from the first application, introducing a predetermined delay (e.g., “a delay due”) between execution of a first command buffer scheduled to the first queue and a second command buffer scheduled to the second queue (e.g., see page 171, “In latency-based scheduling, the scheduler iterates over the compute units in a round-robin fashion and assigns one pending workgroup to each usable compute unit.” and “The first NDRange scheduled for execution occupies the entire GPU and the subsequent NDRanges”, “a delay due to re-assignment of compute units from the first NDRange in the case of full-adaptive partitioning” in page 174). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).


As to claim 22, Banerjee does not teach hierarchically gang scheduling a second group of command buffers from the first application to a third queue for concurrent execution with the first group of command buffers, wherein the second group of command buffers are to be executed on a third pipeline associated with the third queue. However, Yash teaches hierarchically gang scheduling (e.g., see page 170, “Fig. 2. Flowchart describing the mechanism of handling adaptive partitioning of a GPU in the OpenCL runtime”) a second group of command buffers (e.g., another one of “multiple NDRanges”) from the first application to a third queue (e.g., another one of “command queues”) for concurrent execution with the first group of command buffers (e.g., “NDRanges executing simultaneously”), wherein the second group of command buffers are to be executed on a third pipeline (see another one of “Pipes”, “using pipe objects”) associated with the third queue (e.g., see page 169-170, “OpenCL runtime-level support for concurrent execution using adaptive partitioning. We expose the options for adaptive and fixed partitioning to”, “OpenCL 1.1 allows for the division of compute resources of the GPU to form a logical collection of compute units known as sub-devices. We use multiple OpenCL command queues and sub-device to submit workloads to the same GPU”, “B. Adaptive Spatial Partitioning of GPU using multiple Command Queues”, “multiple NDRanges which can execute concurrently. We extend the fixed spatial partitioning mechanism to support adaptive partitioning of resources (i.e., compute units) across different subdevices. The OpenCL sub-device API (clCreateSubdevices)” for “Pipes allow for communication between different NDRanges executing simultaneously on the same GPU.”, “Execution is pipelined, with each stage providing data to the next for processing. Each stage of compute runs in a separate command queue mapped to a different compute unit. The data communication between stages is achieved using pipe objects” in page 171-172). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).


As to claim 23, Banerjee further teaches the first queue is allocated to the first application in response to a first registration request from the first application (see rejection of claim 12 above). However, Banerjee does not teach the first registration request comprising a first group identifier; and the second queue is allocated to the first application in response to a second registration request from the first application comprising the first group identifier. Yash teaches a first group identifier (e.g., “1. Available CU List”); and the second queue is allocated to the first application (e.g., “executing multiple command queues concurrently”) in response to a second registration request from the first application comprising the first group identifier (e.g., see page 169, “lists to track availability of compute units from a sub-device and also to track the workgroups of the NDRanges mapped to that sub-device.”, “1. Available CU List:”, “2. Pending Workgroup(WG) List:”, “3. Usable CU List:”, “Fig. 1. High-level model of a device capable of executing multiple command queues concurrently. The lists allow for flexible mapping of NDRange workgroups to compute units on the Southern Islands device.”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).


As to claim 24, Banerjee does not teach wherein hierarchically gang scheduling the second group of command buffers comprises: generating an interrupt in response to execution of a first command in a first command buffer in the first queue, wherein the interrupt comprises an address; accessing a data structure using the address, wherein the data structure identifies a routine that generates a third registration request that comprises a second group identifier; and allocating the third queue to the first application in response to the third registration request. However, Yash teaches generating an interrupt (e.g., “enqueued kernels”, “to be stopped”) in response to execution of a first command in a first command buffer in the first queue (e.g., “Pipes allow for communication between different NDRanges executing simultaneously”), wherein the interrupt comprises an address; accessing a data structure using the address (e.g., “using the extracted ID as an offset into the buffer”), wherein the data structure identifies a routine (e.g., “The pipe read and write APIs”) that generates a third registration request that comprises a second group identifier (e.g., “the data/tile ID”); and allocating the third queue to the first application in response to the third registration request (e.g., see page 171, “workloads, even though the enqueued kernels utilize the same memory space, they will still need to be stopped and restarted often to support synchronization and exchange of data”, “the implementation of pipe-based communication channel between different sub-devices. Pipes allow for communication between different NDRanges executing simultaneously on the same GPU”, “A pipe is a typed memory object”, “The pipe object is implemented using a buffer which is modeled as a queue”, “Fig. 4. Simulated implementation of OpenCL pipes. The pipe read and write APIs require an element size in order to index global memory”, “kernel extracts the data/tile ID from the pipe and accesses data from the intermediate buffer using the extracted ID as an offset into the buffer. Both of the kernels can execute on separate command queues mapped to different sub-devices on the GPU”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).


As to claim 25, Banerjee does not teach wherein hierarchically gang scheduling the second group of command buffers comprises: gang scheduling a third group of command buffers from the first application to a fourth queue for concurrent execution with the first group of command buffers, wherein the third group of command buffers are to be executed on a fourth pipeline associated with the fourth queue. However, Yash teaches wherein hierarchically gang scheduling the second group of command buffers comprises: gang scheduling a third group of command buffers from the first application to a fourth queue for concurrent execution with the first group of command buffers (e.g., “Pipes allow for communication between different NDRanges executing simultaneously on the same GPU”), wherein the third group of command buffers are to be executed on a fourth pipeline (e.g., another one of “pipes”) associated with the fourth queue (e.g., see page 171, “workloads, even though the enqueued kernels utilize the same memory space, they will still need to be stopped and restarted often to support synchronization and exchange of data”, “the implementation of pipe-based communication channel between different sub-devices. Pipes allow for communication between different NDRanges executing simultaneously on the same GPU”, “A pipe is a typed memory object”, “The pipe object is implemented using a buffer which is modeled as a queue”, “Fig. 4. Simulated implementation of OpenCL pipes. The pipe read and write APIs require an element size in order to index global memory”, “kernel extracts the data/tile ID from the pipe and accesses data from the intermediate buffer using the extracted ID as an offset into the buffer. Both of the kernels can execute on separate command queues mapped to different sub-devices on the GPU”).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).
As to claim 26, Banerjee does not teach deallocating the first queue from the first application in response to at least one of expiration of a time quantum and the first queue becoming empty, wherein the first queue is allocated to the first application for the time quantum. However, Yash teaches deallocating the first queue from the first application (e.g., “to reassign resources to the active NDRanges”) in response to at least one of expiration of a time quantum and the first queue becoming empty (e.g., see left column of page 170, “If the pending list is empty, the Load Balancer is invoked to reassign resources to the active NDRanges”), wherein the first queue is allocated to the first application for the time quantum (e.g., see page 174, “Fig. 8. Timeline showing re-assignment of compute units for each NDRange of the applications”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Banerjee by adopting the teachings of Yash in order “to support concurrent execution and provide flexible mapping of compute kernels to the GPU” (Yash, Abstract).

Response to Arguments
Applicant's arguments with respect to claims 1-6, 9-15 and 19-26 have been considered but are moot in view of the new ground(s) of rejection.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

LV et al. (US 2018/0374188) discloses “a hybrid scheme combining both per-engine scheduling and gang scheduling, through constructing an inter-engine dependency graph, when the command buffers are audited. The GPU scheduler 1426 can then choose per-engine scheduling and gang scheduling policies dynamically, according to the dependency graph”.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDOU K SEYE whose telephone number is (571)270-1062. The examiner can normally be reached M-F 9-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hyung SOUGH, can be reached at 571-272-6799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ABDOU K SEYE/ Examiner, Art Unit 2194
/S. Sough/SPE, AU 2192/2194