Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Claims 1-20 are currently pending and have been examined.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/30/2020 has been considered. The submission is in compliance with the provisions of 37 CFR 1.97. Form PTO-1449 is signed and attached hereto.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.
The following claim language is not clearly understood and is indefinite:
As per claim 1, lines 2-4 recite “dispatching a parent work group of a program to be executed; executing a spawn work group instruction to enable a child work group of the parent work group to be executed”. However, it is unclear who or what performs the dispatching and the executing, and to whom or what the parent work group is dispatched. Lines 5-7 recite “dispatching the child work group for execution when a sufficient amount of resources are determined to be available to execute the child work group; and executing the child work group”. However, it is likewise unclear who or what performs the dispatching and the executing, and to whom or what the child work group is dispatched.
As per claims 10 and 19, they are rejected for having similar issues as claim 1 above.
As per claims 2-9, 11-18 and 20, they are rejected as being dependent on rejected claims 1, 10 and 19.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 10-16 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wen et al. (U.S. Pub. No. 20090125907 A1) in view of Lee et al. (U.S. Pub. No. 20200326988 A1).

As per claim 1, Wen teaches the invention as claimed including a processing method comprising:
dispatching a parent work group of a program to be executed (par. 0006 a thread control unit (TCU) executes an individual thread; par. 0018 A thread that includes a nested SPAWN-type command is called a parent thread [parent workgroup]);
executing a spawn work group instruction to enable a child work group of the parent work group to be executed (par. 0046 Upon encountering a SPAWN command, the MTCU 104 retrieves the parallel segment positioned between the SPAWN command and a JOIN command, inclusive. The SCU 142 of the MTCU 104 initiates spawning of the parallel segment into a plurality of child threads);
dispatching the child work group for execution ... [on] resources … available to execute the child work group; and executing the child work group (par. 0061 At step 204, TCU 1 notifies the PSFU 130 that it is available to execute a thread and receives a unique assigned TID in return from the PSFU 130. At step 206, TCU i checks if the TID is less than GR-HIGH 133. If yes, at step 208, TCU i executes the instructions of the virtual thread (e.g., the parallel segment); par. 0015 announcing that the first TCU is available to execute another child thread …assigning a thread ID (TID) to each child thread of the plurality of child threads which is unique with respect to the other TIDs; and allocating a new child thread to the first TCU; par. 0016 The method includes executing a plurality of child threads by a plurality of TCUs including a first TCU of the plurality of TCUs).
Wen does not expressly disclose: dispatching the child work group for execution when a sufficient amount of resources are determined to be available to execute the child work group; and executing the child work group.
However, Lee teaches: dispatching the child work group for execution when a sufficient amount of resources are determined to be available to execute the child work group; and executing the child work group (par. 0047 The system of FIG. 3 includes services and components. For example, the admission control component (222) delays/limits workflows until there are sufficient resources to run the workflow, at which time it admits work as resources become available; page 13, claim 9, monitoring, by an admission controller of the execution platform, resource availability on the execution platform for executing the first workflow definition; and upon determining, by the admission controller, that sufficient resources are available on the execution platform for executing the first workflow definition, launching one or more workflow executors to execute the first workflow definition).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Wen by incorporating the method of waiting for sufficient resources to become available to run the workflow as set forth by Lee because it would provide for waiting to execute spawned child threads until sufficient resources are available to properly execute the child threads, with predictable results.

As per claim 2, Wen and Lee teach the limitations of claim 1. Lee teaches: determining whether or not the sufficient amount of resources are available to execute the child work group prior to dispatching the child work group for execution on a compute unit; and when the sufficient amount of resources are determined to be unavailable to execute the child work group, waiting until the sufficient amount of resources are available to dispatch the child work group for execution on the compute unit (par. 0047 The system of FIG. 3 includes services and components. For example, the admission control component (222) delays/limits workflows [waits] until there are sufficient resources to run the workflow, at which time it admits work as resources become available; page 13, claim 9, monitoring, by an admission controller of the execution platform, resource availability on the execution platform for executing the first workflow definition; and upon determining, by the admission controller, that sufficient resources are available on the execution platform for executing the first workflow definition, launching one or more workflow executors to execute the first workflow definition).
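For illustration only, and not as a characterization of the claims or of the disclosure of Wen or Lee, the resource-gated dispatch discussed with respect to claims 1 and 2 may be sketched as follows. The class name Dispatcher, the notion of countable "resource units", and the first-in-first-out ordering are all assumptions made for this sketch; none of them is taken from the cited references.

```python
from collections import deque


class Dispatcher:
    """Hypothetical dispatcher that holds child work groups until resources suffice."""

    def __init__(self, free_units):
        self.free_units = free_units  # assumed pool of interchangeable resource units
        self.pending = deque()        # spawned child work groups waiting for resources
        self.executed = []            # names of work groups that have been dispatched

    def spawn(self, name, needed_units):
        """Enable a child work group for execution; it runs only once it fits."""
        self.pending.append((name, needed_units))
        self.try_dispatch()

    def release(self, units):
        """Return resource units to the pool, then re-check waiting work groups."""
        self.free_units += units
        self.try_dispatch()

    def try_dispatch(self):
        # Determine availability before dispatching; otherwise keep waiting.
        while self.pending and self.pending[0][1] <= self.free_units:
            name, needed = self.pending.popleft()
            self.free_units -= needed
            self.executed.append(name)
```

Under these assumptions, a child work group spawned while the pool is exhausted remains in the pending queue and is dispatched only after a later call to release makes enough units available.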

As per claim 3, Lee teaches wherein determining whether or not the sufficient amount of resources are available to execute the child work group comprises determining whether or not the compute unit and work group context memory, to be accessed by work-items of the child work group, are available (page 13, claim 9, … upon determining, by the admission controller, that sufficient resources are available on the execution platform for executing the first workflow definition, launching one or more workflow executors to execute the first workflow definition; par. 0039 TCUs may each have their own dedicated resources (e.g., function unit, registers, instruction memory, buffers, etc.)).

As per claim 4, Wen teaches wherein the work group context memory comprises at least one of registers and local data store (LDS) memory (par. 0033 Each memory module 120 includes register file 122 having at least one local register, a TCU instruction cache 124, and at least one read buffer 126).

As per claim 5, Wen teaches the method of claim 1, wherein the spawn work group instruction comprises a pointer to a synchronization variable (par. 0032 with each memory module 110 in which initialization data or pointers thereto for child threads is stored), and further comprising executing a join workgroup instruction which comprises the pointer to the synchronization variable in the spawn work group instruction (par. 0027 In the current example the XMT capability is supported by using single program multiple data (SPMD) processing. Virtual threads are executed in parallel, including transitioning from parallel to serial processing, and vice versa using SPAWN and JOIN commands).

As per claim 6, Wen further teaches: determining completion of execution of the child work group by using the pointer to the synchronization variable; context switching-in the parent work group when the child work group completes execution; and executing the parent work group (par. 0005 … Each thread terminates with a Join command. Once all parallel threads have terminated, transition from parallel mode to serial mode occurs; par. 0016 … completing execution of the child thread by the first TCU; and announcing that the first TCU is available to execute another child thread; executing by a second TCU a parent child thread of the plurality of child threads).
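For illustration only, and not as a characterization of the claims or of Wen, the spawn/join relationship discussed with respect to claims 5 and 6 may be sketched as a counter shared between a spawn operation and a join operation. The class SyncVariable, the function run_parent, and the use of host threads in place of work groups are assumptions made for this sketch.

```python
import threading


class SyncVariable:
    """Hypothetical synchronization variable referenced by both spawn and join."""

    def __init__(self):
        self._outstanding = 0                 # children spawned but not yet finished
        self._cond = threading.Condition()

    def spawn(self, fn, *args):
        # Count the child before it starts so join cannot miss it.
        with self._cond:
            self._outstanding += 1
        threading.Thread(target=self._run, args=(fn, args)).start()

    def _run(self, fn, args):
        try:
            fn(*args)
        finally:
            with self._cond:
                self._outstanding -= 1
                self._cond.notify_all()

    def join(self):
        # The parent resumes only when every spawned child has completed.
        with self._cond:
            self._cond.wait_for(lambda: self._outstanding == 0)


def run_parent():
    """Parent spawns four children, joins on them, then continues executing."""
    results = []
    lock = threading.Lock()

    def child(i):
        with lock:
            results.append(i * i)

    sync = SyncVariable()
    for i in range(4):
        sync.spawn(child, i)
    sync.join()
    return sorted(results)
```

In this sketch the parent determines completion of its children solely through the shared variable, which is the role the claims assign to the pointer carried by both instructions.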

As per claim 7, Wen further teaches wherein the resources are allocated by a processor which dispatches the parent work group and the child work group for execution (par. 0046 The SCU 142 of the MTCU 104 initiates spawning of the parallel segment into a plurality of child threads; par. 0039 The scheduling of the memory resources may be centralized or decentralized).

As per claim 10, it is a processing apparatus having similar limitations as claim 1. Thus, claim 10 is rejected for the same rationale as applied to claim 1. Wen further teaches: memory; and a processor (par. 0014 at least one processor; par. 0037 a computer-readable medium, such as RAM).

As per claim 11, it is a processing apparatus having similar limitations as claim 2. Thus, claim 11 is rejected for the same rationale as applied to claim 2.

As per claim 12, it is a processing apparatus having similar limitations as claim 3. Thus, claim 12 is rejected for the same rationale as applied to claim 3.

As per claim 13, it is a processing apparatus having similar limitations as claim 4. Thus, claim 13 is rejected for the same rationale as applied to claim 4.

As per claim 14, it is a processing apparatus having similar limitations as claim 5. Thus, claim 14 is rejected for the same rationale as applied to claim 5.

As per claim 15, it is a processing apparatus having similar limitations as claim 6. Thus, claim 15 is rejected for the same rationale as applied to claim 6.

As per claim 16, it is a processing apparatus having similar limitations as claim 7. Thus, claim 16 is rejected for the same rationale as applied to claim 7.

As per claim 19, it is a non-transitory computer readable medium having similar limitations as claim 1. Thus, claim 19 is rejected for the same rationale as applied to claim 1.

As per claim 20, it is a non-transitory computer readable medium having similar limitations as claim 2. Thus, claim 20 is rejected for the same rationale as applied to claim 2.

Claims 8-9 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Wen in view of Lee as applied to claims 1, 10 and 19, and further in view of Piira et al. (U.S. Pub. No. 20130061234 A1).

As per claim 8, Wen further teaches wherein the program is a kernel and an amount of resources are allocated to execute a plurality of work groups for the kernel (par. 0081 operating system of the XMT system for allocating space).
Wen does not expressly teach: executing the parent work group; when a sufficient amount of resources is determined to be available to execute the plurality of work groups, continuing execution of the parent work group; and when the sufficient amount of resources is determined not to be available to execute the plurality of work groups, context switching-out the parent work group.
However, Piira teaches: executing the parent work group; when a sufficient amount of resources is determined to be available to execute the plurality of work groups, continuing execution of the parent work group; and when the sufficient amount of resources is determined not to be available to execute the plurality of work groups, context switching-out the parent work group (par. 0038 If the determination indicates that there are insufficient computing resources for running instances MPI(1), . . . , MPI(N) [equiv. to plurality of work groups], the instance manager 270 can take action to release computing resources from the given instance [equiv. to parent workgroup] and other instances, which are running at the time of the determination, so that, for example, the given instance can keep running. Alternatively, when no additional computing resources can be accessed, instance manager 270 can shut down the given instance … of the media player).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Wen and Lee by incorporating the method of shutting down an instance when insufficient resources are available for a particular set of instances as set forth by Piira because providing the ability to exit/shut down a parent work group would have ensured there are sufficient resources available for executing spawned threads.

As per claim 9, Wen, Lee and Piira teach the limitations of claim 1. Wen further teaches wherein an amount of memory is allocated for a threshold number of work groups for the kernel, and the method further comprises: determining whether a number of work groups currently executing for the kernel is less than the threshold number of work groups; when the number of work groups currently executing for the kernel is less than or equal to the threshold number of work groups, continuing execution of the child work group; and when the number of work groups currently executing for the kernel is greater than the threshold number of work groups (par. 0028 The number of threads to be generated from the SPAWN command are specified by an attribute of the SPAWN command; par. 0065 When the number of virtual threads spawned exceeds the number of TCUs, at least some of the TCUs will execute multiple iterations). Piira teaches the remainder of that limitation: allocating additional memory (par. 0018 include allocating the requested increase in computer resources consumption in response to determining that the computer resources consumption would be less than the first predetermined level; par. 0040 Memory manager 260 can allocate memory for all player instances MPI(1), . . . , MPI(N) from a shared pool of memory).
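For illustration only, and not as a characterization of the claims or of any cited reference, the threshold comparison recited in claim 9 may be sketched as below. The function name on_threshold_check, the unit of "slots", and the rule for how much additional memory to allocate are assumptions made for this sketch; the claim language quoted above specifies only the comparison and the resulting action.

```python
def on_threshold_check(executing, threshold, allocated_slots):
    """Return (action, allocated_slots) for a hypothetical kernel.

    executing        number of work groups currently executing for the kernel
    threshold        number of work groups that memory was allocated for
    allocated_slots  assumed count of per-work-group memory slots allocated so far
    """
    if executing <= threshold:
        # Within the pre-allocated budget: the child work group keeps running.
        return "continue", allocated_slots
    # Over budget: allocate one extra slot per excess work group (assumed policy).
    return "allocate_more", allocated_slots + (executing - threshold)
```

For example, with a threshold of 4 and 4 slots allocated, 6 executing work groups would yield the action "allocate_more" and a new total of 6 slots under the assumed policy.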

As per claim 17, it is a processing apparatus having similar limitations as claim 8. Thus, claim 17 is rejected for the same rationale as applied to claim 8.

As per claim 18, it is a processing apparatus having similar limitations as claim 9. Thus, claim 18 is rejected for the same rationale as applied to claim 9.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
U.S. Patent No. 10963299 B2 teaches Hardware Accelerated Dynamic Work Creation On A Graphics Processing Unit.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Willy W. Huaracha, whose telephone number is (571) 270-5510. The examiner can normally be reached M-F, 8:30am-5:00pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai An, can be reached on (571) 272-3756. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MENG AI T AN/
Supervisory Patent Examiner, Art Unit 2195
/WH/
Examiner, Art Unit 2195