DETAILED ACTION
This office action is in response to the amendment filed June 28, 2022.
Claims 1-20 are pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on June 28, 2022 was filed after the mailing date of the application on July 2, 2021.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.


Response to Arguments
Applicant's arguments filed June 28, 2022 have been fully considered but they are not persuasive. 
Specifically, regarding Applicant’s remarks on pages 10-12 concerning the teachings of Gummaraju, the Examiner disagrees. Cited ¶34 of Gummaraju teaches not just profiling and scheduling of kernels among different processors, but also compiling of kernels, including conversion of the kernels (¶34: “Some kernels may execute the same code on different types of processors. For some other kernels, the programmer specified source code may be enhanced during compilation as necessary to execute on the different types of processors.”). As such, this argument is not persuasive and the rejection is maintained.
Further, Applicant’s argument regarding the claimed “operation table” on page 11 is not persuasive. Applicant argues that the limitation is not taught by Gummaraju, which is essentially an argument on page 11 against a rejection that the Examiner did not make. At the same time, the argument fails to address the actual cited teachings of Lottes. Insomuch as Applicant is attempting, between pages 11 and 12 of the remarks, to argue that the cited references cannot be combined, Applicant’s remarks fail to specifically make that argument. That is, while Applicant argues on page 12 that the deficiencies of “Gummaraju” “cannot be overcome” (deficiencies which the Examiner conceded in making the 103 rejection regarding the operation table in the first place), the argument fails to specifically address the rejection of record to that end. Additionally, Applicant’s remark regarding the table on page 13, that Lottes’ system “is entirely different from the operation table that functions by enabling an engine in the GPU to instruct a component inside or outside of the GPU to complete a designated operation accordingly,” is both inconsistent with the claim language as it exists and fails to address the proffered combination itself. The remark ignores that Lottes’ data structure, like the operation table claimed, enables the distributed scheduling of the respective systems, and thus Lottes teaches or suggests at least this element as mapped in the rejection. This argument is therefore also unpersuasive and the rejection is maintained.
Regarding Applicant’s arguments about the “order constraint” on pages 13-14 of the remarks, the Examiner again respectfully disagrees. Specifically, Lottes’ system, as discussed previously, enforces dependencies in work item execution. While this may not be exactly what Applicant conceives, it plainly reads on any reasonable construction of the claim language. These arguments, therefore, are unpersuasive and the rejection is respectfully maintained.





Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 8-13, and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over “Gummaraju” (US PG Pub 2013/0160016) in view of “Lottes” (US PG Pub 2014/0259016).

Regarding Claim 1, Gummaraju teaches:
1. A method for compiling code adapted for secondary offloads in a graphics processing unit (GPU), performed by a processing unit, comprising: reconstructing a plurality of execution codes in a first kernel into a second kernel, (See, e.g., Gummaraju ¶34, describing a system for offloading compute kernels to units with a GPGPU, including compiling the kernel for the different compute unit types including 

wherein the computation codes comprise a portion of the execution codes, and a plurality of synchronization hooks, (In generating and allocating the kernels, e.g. 201, Fig. 2, ¶38 of Gummaraju describes the kernels as including operations to carry out for executing the code, as well as synchronization operations.)

Gummaraju does not teach, but Lottes teaches:

wherein the second kernel comprises an operation table and a plurality of computation codes, (See, e.g., the work completion and work tracking data structures in the scheduling kernel of the GPU, Fig. 2, ¶¶21-22 of Lottes.)
wherein the operation table comprises a plurality of entries, (See, e.g., the work completion and work tracking data structures in the scheduling kernel of the GPU, Fig. 2, ¶¶21-22 of Lottes.)
and each synchronization hook comprises information indicating one entry of the operation table, (Here, while Lottes does not teach the synchronization operations described in Gummaraju cited above, Lottes teaches dependencies within the work items of the executable code, wherein the dependencies are indicated in the data structures 201-202, Fig. 2, ¶¶21-22, and are used to determine ready-to-execute work items with dependencies satisfied in 203, Fig. 2.)

wherein an order of the portion of the execution codes and the synchronization hooks in the computation codes matches an order of the execution codes in the first kernel, thereby enabling a compute unit (CU) in the GPU to execute the computation codes, (Here, while Lottes does not teach the synchronization operations described in Gummaraju cited above, Lottes teaches dependencies within the work items of the executable code, wherein the dependencies are indicated in the data structures 201-202, Fig. 2, ¶¶21-22, and are used to determine ready-to-execute work items with dependencies satisfied in 203, Fig. 2.)

and an engine in the GPU to instruct a component inside or outside of the GPU to complete a designated operation in accordance with content of each entry in the operation table. (See 420, Fig. 4: the scheduling kernel within the GPU designates operations, e.g. 304, Fig. 3, ¶24.)

In addition, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the application to combine the teachings of Gummaraju with those of Lottes as each is directed to execution scheduling on a GPU and Lottes recognized “lack of flexible and powerful work scheduling capabilities prevents many complex algorithms from being run on the powerful computation resources of the GPU.” (¶5). 

Regarding the dependent claims 2-6, Gummaraju and Lottes further teach: 
2. The method of claim 1, wherein each execution code or each synchronization hook in the computation codes is accompanied with a synchronization flag to indicate whether an execution of each execution code or each synchronization hook needs to wait for an execution completion of a previous execution code or a previous synchronization hook. (See 303-304, Fig. 3, ¶24 of Lottes, which teaches, for each work item to be executed, checking the corresponding dependency information to determine whether the work item still needs to wait for completion of a previous work item.) In addition, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the application, to combine the teachings of Gummaraju with those of Lottes, as each is directed to execution scheduling on a GPU and Lottes recognized “lack of flexible and powerful work scheduling capabilities prevents many complex algorithms from being run on the powerful computation resources of the GPU.” (¶5).


3. The method of claim 1, comprising: determining whether each execution code in the first kernel is suitable to be executed by the CU in the GPU; (See Gummaraju ¶¶50-52, teaching identifying kernels appropriate for execution on the CPU or GPU elements and adding the kernels to the appropriate processor-specific queues based on the comparison to the processor profile.)

if so, appending a suitable execution code to the computation codes; (See Gummaraju ¶¶50-52, teaching identifying kernels appropriate for execution on the CPU or GPU elements and adding the kernels to the appropriate processor-specific queues based on the comparison to the processor profile.)

and otherwise, inserting an entry corresponding to an unsuitable execution code into the operation table, and appending a synchronization hook indicating the entry to the computation codes. (See Gummaraju ¶¶50-52, teaching identifying kernels appropriate for execution on the CPU or GPU elements and adding the kernels to the appropriate processor-specific queues based on the comparison to the processor profile.)

4. The method of claim 1, comprising: storing the second kernel in a storage device for execution by the GPU. (See, e.g., Lottes, 420, Fig. 4, executing the scheduling kernel on the GPU.)

5. The method of claim 1, wherein each entry comprises information indicating that an operation is performed by a component inside or outside of the GPU, and information indicating how to perform the operation. (See Gummaraju ¶¶50-52, teaching identifying kernels appropriate for execution on the CPU or GPU elements and adding the kernels to the appropriate processor-specific queues based on the comparison to the processor profile.)

6. The method of claim 5, wherein each entry comprises an operating command, and an operating parameter. (See, e.g., Gummaraju ¶¶47-48, describing using the processor profile entries to determine assignment of the kernels to CPU or GPU elements, wherein those entries also include the commands and parameters related to the processor or group of processors.)
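
For illustration only, the reconstruction recited in claims 1, 3, 5, and 6 above can be sketched as follows. The sketch is not taken from the application, Gummaraju, or Lottes; every name in it (ExecutionCode, OperationEntry, SyncHook, SecondKernel, reconstruct, and the cu_suitable field standing in for the suitability determination) is a hypothetical chosen for readability, and it reflects only one possible reading of the claim language.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ExecutionCode:
    """One execution code of the first kernel (hypothetical model)."""
    command: str
    parameter: str = ""
    cu_suitable: bool = True   # assumed outcome of the claim 3 determination
    component: str = "CU"      # assumed target when not CU-suitable


@dataclass
class OperationEntry:
    """One operation table entry: who performs it (claim 5) and how (claim 6)."""
    component: str
    command: str
    parameter: str


@dataclass
class SyncHook:
    """Synchronization hook indicating one entry of the operation table."""
    entry_index: int


@dataclass
class SecondKernel:
    operation_table: List[OperationEntry] = field(default_factory=list)
    computation_codes: List[Union[ExecutionCode, SyncHook]] = field(default_factory=list)


def reconstruct(first_kernel: List[ExecutionCode]) -> SecondKernel:
    """Rebuild the first kernel into a second kernel, preserving code order."""
    second = SecondKernel()
    for code in first_kernel:
        if code.cu_suitable:
            # Suitable code is appended to the computation codes unchanged.
            second.computation_codes.append(code)
        else:
            # Unsuitable code becomes a table entry plus a hook at the same position.
            second.operation_table.append(
                OperationEntry(code.component, code.command, code.parameter)
            )
            second.computation_codes.append(SyncHook(len(second.operation_table) - 1))
    return second
```

Under these assumptions, a first kernel of three execution codes in which only the second is unsuitable for the CU yields computation codes of the form code, hook, code, with the hook pointing at entry 0 of the operation table, so the order of the first kernel is preserved.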

Claims 8-13 are rejected on the same basis as claims 1-6 above respectively.
Claims 15-19 are rejected on the same basis as claims 1-3, 5, and 6 above respectively.

Claims 7, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over “Gummaraju” (US PG Pub 2013/0160016) in view of “Lottes” (US PG Pub 2014/0259016) as applied above and further in view of “Dunajski” (US PG Pub 2021/0263766).

Regarding Claim 7, Gummaraju et al. teach the limitations of claim 1 above, and Gummaraju further teaches:
7. The method of claim 1, and the component outside of GPU is a central processing unit (CPU). (See Gummaraju ¶¶50-52, teaching identifying kernels appropriate for execution on the CPU or GPU elements and adding the kernels to the appropriate processor-specific queues based on the comparison to the processor profile.)

Gummaraju does not further teach, but Dunajski teaches:
wherein the component inside of the GPU is a layer 2 cache, or a direct memory access/system direct memory access (DMA/SDMA) controller, (See, e.g., Dunajski ¶83, teaching a DMA controller within the GPGPU system, as well as an L2 cache in ¶84.)

In addition, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the application, to combine the teachings of Gummaraju et al. and those of Dunajski, as each is directed to scheduling execution within heterogeneous GPGPU and CPU systems and Dunajski recognized “Computing platforms allow a central processing unit (CPU) to offload operations to a graphics processing unit (GPU)” and provides methods for exploiting the GPU components to achieve parallel execution processing (¶2).
Claims 14 and 20 are rejected on the same basis as claim 7 above.





Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW J BROPHY whose telephone number is (571)270-1642. The examiner can normally be reached Monday-Friday, 9am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen can be reached on 571-272-3708. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





MJB
10/21/2022

/MATTHEW J BROPHY/Primary Examiner, Art Unit 2191