DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 7/20/2021 was filed after the mailing date of the application.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 13-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. Claims 13-20 describe machine-readable storage media.
Further, Applicant's specification, at paragraph [0021], fails to explicitly define the scope of computer-readable storage media to exclude transitory media (i.e., the media can be transitory or non-transitory). Thus, in giving the term its plain meaning (see MPEP 2111.01), the claimed computer-readable storage media are considered to include data signals per se. Data signals per se are not statutory because they do not fall into any of the four statutory categories of invention.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of U.S. Patent No. 11,029,870 (“Patent ‘870” hereinafter). Although the claims at issue are not identical, they are not patentably distinct from each other because all the features of current claims 1-20 are already included in claims 1-20 of Patent ‘870. See tables below.
Table I
Current Application 17/321,186 | U.S. Patent No. 11,029,870
Claims 1-20 | Claims 1-20


Table II
Current Application 17/321,186, claim 1:
1. A compute device comprising:
one or more accelerator devices; and a compute engine to:
determine a configuration of each accelerator device of the compute device, wherein the
receive, from a requester device remote from the compute device, a job to be accelerated;
divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices as a function of a job analysis of the job and the configuration of each accelerator device;
schedule the tasks to the one or more accelerator devices based on the job analysis;
execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks; and
combine task outputs from the accelerator devices that executed the tasks to obtain an output of the job.

U.S. Patent No. 11,029,870, claim 1:
one or more accelerator devices; and
a compute engine to:
determine a configuration of at least one accelerator device of the compute device, wherein and wherein to determine the configuration includes to determine for the at least one accelerator device whether the at least one accelerator device is capable of accessing a shared data set and whether the at least one accelerator device is capable of accessing a shared memory;
receive, from a requester device, a job to be accelerated;
divide the job into multiple tasks for a parallel execution of the multiple tasks among the one or more accelerator devices as a function of a job analysis of the job and the configuration of the at least one accelerator device indicative of whether the at least one accelerator device is capable of accessing the shared data set and whether the at least one accelerator device is capable of accessing the shared memory;
schedule the multiple tasks to the one or more accelerator devices based on the job analysis;
execute the multiple tasks on the one or more accelerator devices for parallel execution of the multiple tasks based on whether the at least one accelerator device is capable of accessing the shared data set and whether the at least one accelerator device is capable of accessing the shared memory; and
combine task outputs from the one or more accelerator devices that executed the multiple tasks to obtain an output of the job.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-3, 6, 9, 13, 14, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chapman et al. (U.S. Patent App. Pub. No. 2013/0179485, “Chapman” hereinafter) in view of Archer et al. (U.S. Patent App. Pub. No. 2010/0191823, “Archer” hereinafter).
As per claim 1, as shown in Fig. 3, Chapman teaches a compute device comprising: 
one or more accelerator devices (¶ [31]); and 
a compute engine to: 
determine a configuration of each accelerator device of the compute device (¶ [31], i.e., the chunk size of data is determined based on their communication rate and the computation rate in the network), wherein the configuration is indicative of parallel execution features present in each accelerator device (“The host computers splits the application data into segments to encapsulate the data into the optimal chunk and then dispatches the encapsulated data to the accelerator devices as well as instructions for parallel computation for the encapsulated data”, thus implying that the configuration is indicative of parallel execution features present in each accelerator device);
receive, from a requester device remote from the compute device, a job to be accelerated (¶ [45], application kernels 321 of accelerator 320 provide requested services);
divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices as a function of a job analysis of the job and the configuration of each accelerator device (¶ [31] recited above); 
schedule the tasks to the one or more accelerator devices based on the job analysis (¶ [56-57], i.e., tasks are scheduled for execution by the accelerator devices using a timer. See Fig. 6, steps 603 and 607);
execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks (¶ [56], “In block S604, the host writes, i.e., dispatches the operation commands to invoke an empty kernel program or an application kernel program in the accelerator devices so as to execute the computation for a predetermined number of iterations NUM_ITER in block S605”).
Chapman does not explicitly teach combine task outputs from the accelerator devices that executed the tasks to obtain an output of the job.
However, Archer teaches a similar method of requesting, assigning, and executing tasks in parallel by a plurality of accelerators (¶ [45-46] and [86]), where the task outputs from the accelerator devices are combined into a single output (¶ [56], “A `reduce` or `reduction` operation is an example of a collective operation in which data distributed among a number of accelerators is combined into a single result by executing specific arithmetic or logical functions on the data”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to make use of the method as taught by Archer in combination with the method as taught by Chapman, the advantage of which is to produce a single output from the divided tasks executed by the accelerators.
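For illustration only, the divide, schedule, execute, and combine limitations addressed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the application, Chapman, or Archer: the Accelerator class, the parallel_lanes field, and the functions divide_job, run_task, and run_job are hypothetical names, threads stand in for accelerator devices, and a sum of squares stands in for an accelerated kernel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from functools import reduce
import operator


@dataclass
class Accelerator:
    # Hypothetical stand-in for an accelerator's reported configuration.
    name: str
    parallel_lanes: int  # assumed measure of parallel execution capability


def divide_job(data, accelerators):
    """Split the job's data into one task per accelerator, sized in
    proportion to each accelerator's assumed parallel capability."""
    total_lanes = sum(a.parallel_lanes for a in accelerators)
    tasks, start = [], 0
    for i, acc in enumerate(accelerators):
        if i == len(accelerators) - 1:
            end = len(data)  # the last task takes any remainder
        else:
            end = start + len(data) * acc.parallel_lanes // total_lanes
        tasks.append((acc, data[start:end]))
        start = end
    return tasks


def run_task(task):
    """Simulate an accelerator kernel: a partial sum of squares."""
    _acc, chunk = task
    return sum(x * x for x in chunk)


def run_job(data, accelerators):
    """Divide the job, execute the tasks in parallel, and combine the
    task outputs into one result with a reduce-style operation."""
    tasks = divide_job(data, accelerators)
    with ThreadPoolExecutor(max_workers=len(accelerators)) as pool:
        partials = list(pool.map(run_task, tasks))
    return reduce(operator.add, partials, 0)
```

In this sketch the final reduce step plays the role of the combining limitation: partial results produced by separate workers are merged into a single output of the job.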
As per claim 2, the combined teachings of Chapman and Archer impliedly teach wherein the compute engine includes a micro-orchestrator logic unit (it is noted that the functionality of the claimed micro-orchestrator logic unit is not specified).
As per claim 3, as addressed in claims 1 and 2, the combined Chapman-Archer also teaches wherein to determine the configuration of each accelerator device comprises to determine, by the micro-orchestrator logic unit of the compute device, the configuration of each accelerator device.
As per claim 6, as also addressed in claim 1, the combined Chapman-Archer does teach wherein to schedule the tasks to the one or more accelerator devices comprises to schedule parallel execution of the tasks.
As per claim 9, the combined Chapman-Archer also impliedly teaches wherein the compute engine is further to: 
determine whether an authorization is required from a server to execute the tasks on the compute device (see Archer, ¶ [66], determining whether the accelerator is authorized to perform the accelerator's assigned specific function); 
transmit, in response to a determination that the authorization is required, the job analysis to an orchestrator server; receive an authorization from the orchestrator server; and receive the tasks to be accelerated (¶ [36-37] in combination with ¶ [66] above, i.e. using messaging via data communication between the host and the accelerators before data transferring to the assigned accelerators). Thus, claim 9 would have been obvious over the combined references for the reason above.
Claim 13, which is similar in scope to claim 1 as addressed above, is thus rejected under the same rationale.
Claim 14, which is similar in scope to claims 2 and 3 as addressed above, is thus rejected under the same rationale.
Claim 17, which is similar in scope to claim 6 as addressed above, is thus rejected under the same rationale.
Claim 20, which is similar in scope to claim 9 as addressed above, is thus rejected under the same rationale.

Claims 4, 5, 7, 10, 11, 15, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chapman et al. (U.S. Patent App. Pub. No. 2013/0179485) in view of Archer et al. (U.S. Patent App. Pub. No. 2010/0191823) further in view of Krishnamurthy et al. (U.S. Patent App. Pub. No. 2012/0054770, “Krishnamurthy” hereinafter).
As per claim 4, the combined Chapman-Archer does teach each of the accelerator devices is not a general purpose processor (see e.g. Background section of Chapman, i.e. GPUs, not CPUs).
The combined Chapman-Archer fails to explicitly teach wherein the compute engine is further to determine an availability of one or more of the accelerator devices.

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the method as taught by Krishnamurthy into the method as taught by the combined Chapman-Archer, the advantage of which is to identify which accelerators can be assigned the workload.
As per claim 5, although not explicitly taught by the combined Chapman-Archer, Krishnamurthy does teach wherein to determine the availability of one or more of the accelerator devices (as addressed in claim 4) comprises to determine one or more available kernels of each accelerator device (¶ [34-35], and ¶ [67-68]). Thus, claim 5 would have been obvious over the combined references for the reason above.
As per claim 7, as addressed in claims 4 and 5 above, the combined Chapman-Archer-Krishnamurthy impliedly teaches wherein to schedule the tasks to the one or more accelerator devices comprises to confirm that the one or more accelerator devices are available to simultaneously execute multiple tasks that share data (Krishnamurthy, Fig. 7, ¶ [68] addressed above in combination with the parallel execution of tasks taught by the combined Chapman-Archer addressed in claim 1). Thus, claim 7 would have been obvious over the combined references for the reason above.
As per claim 10, as addressed above, the combined Chapman-Archer-Krishnamurthy impliedly teaches wherein the compute device is a first compute device, and wherein to execute the tasks comprises to: 
determine one or more accelerator devices of a second compute device that is remote to the first compute device to concurrently execute the tasks that share data; and 
execute one or more of the tasks on the one or more accelerator devices of the first compute device as one or more other tasks of the job are concurrently executed with one or more accelerator devices of the second compute device using a shared virtual memory (see Krishnamurthy, Fig. 6, ¶ [67] and [77]). Thus, claim 10 would have been obvious over the combined references for the reason above.
As per claim 11, as addressed in claim 10, the combined Chapman-Archer-Krishnamurthy impliedly teaches wherein to determine the one or more accelerator devices of the second compute device comprises to receive information regarding one or more accelerator devices of the second compute device from the orchestrator server (Krishnamurthy, Fig. 6). Thus, claim 11 would have been obvious over the combined references for the reason above.
Claim 15, which is similar in scope to claim 4 as addressed above, is thus rejected under the same rationale.
Claim 16, which is similar in scope to claim 5 as addressed above, is thus rejected under the same rationale.
Claim 18, which is similar in scope to claim 7 as addressed above, is thus rejected under the same rationale.

Claims 8, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Chapman et al. (U.S. Patent App. Pub. No. 2013/0179485) in view of Archer et al. (U.S. Patent App. Pub. No. 2010/0191823) further in view of Teh et al. (U.S. Patent App. Pub. No. 2017/0046179, “Teh” hereinafter).
As per claim 8, the combined Chapman-Archer does not explicitly teach wherein to execute the tasks comprises to concurrently execute two or more of the tasks on two or more of the accelerator devices of the compute device with a high speed serial interface (HSSI). However, HSSI is well known in the art; one example is described in Teh, as shown in Figs. 7 and 8 and ¶ [38], for use by hardware accelerators.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the HSSI as taught by Teh into the system as taught by the combined Chapman-Archer as addressed above, the advantage of which is to obtain high-speed data transfer for the parallel processing of the accelerators.
Claim 12, which is similar in scope to claim 8 as addressed above, is thus rejected under the same rationale.
Claim 19, which is similar in scope to claim 8 as addressed above, is thus rejected under the same rationale.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Hau H. Nguyen whose telephone number is: 571-272-7787.  The examiner can normally be reached on MON-FRI from 8:30-5:30.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571) 272-7794.
The fax number for the organization where this application or proceeding is assigned is 571-273-8300.


/HAU H NGUYEN/Primary Examiner, Art Unit 2611