DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to the Request for Continued Examination and Applicant's Amendment and Arguments filed on 22 February 2022.
Claims 1-3, 5-15 and 17-28 are pending for examination. Claims 4 and 16 were cancelled. 


Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 22 February, 2022 has been entered.


Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f): 
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
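Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.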
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) are: “a compute engine” in claim 1 and “means for scheduling” in claim 25.
hardware, firmware, software, or any combination thereof.” and paragraph [0012] that discloses “the compute engine 210 may be embodied as a single device such as an integrated circuit, an embedded system, a field-programmable gate array (FPGA), a system-on-a-chip (SOC), or other integrated system or device.”

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.


Claim Rejections - 35 USC § 112(b)
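The following is a quotation of 35 U.S.C. 112(b):
(b) Conclusion. – The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.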


Claims 1-3, 5-15 and 17-28 are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
As per claims 1, 13, 25 and 26 (line# refers to claim 1):
In line 4, the claim recites the phrase “multiple accelerator devices”. However, line 3 already recites “multiple accelerator devices”. Thus, it is unclear whether the second recitation of “multiple accelerator devices” is the same as or different from the first recitation. If they are the same, “the” or “said” should be used.

As per claim 5:
In lines 2-3, the claim recites the phrase “multiple accelerator devices”. It is unclear whether this term is intended to refer to the “multiple accelerator devices” recited in claim 1, line 3. If they are the same, “the” or “said” should be used.

As per claims 8-9 and 20-21 (line# refers to claim 8):
In line 2, the claim recites the phrase “a single accelerator device”. However, claim 7, at line 4, already recites “a single accelerator device”. Thus, it is unclear whether the second recitation of “a single accelerator device” is the same as or different from the first recitation of “a single accelerator device”. If they are the same, “the” or “said” should be used.

As per claims 2-3, 6-7, 10-12, 14-15, 17-19, 22-24 and 27-28:
They are compute device, one or more non-transitory machine-readable storage media, computing device and method claims that depend on claims 1, 13, 25 and 26 respectively above. Therefore, they have the same deficiencies as claims 1, 13, 25 and 26 respectively above.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-6, 10, 13-15, 18, 22 and 25-28 are rejected under 35 U.S.C. 103 as being unpatentable over Hundley (US Pub. 2005/0278502 A1) in view of Fong et al. (US Pub. 2018/0052709 A1) and further in view of LI et al. (US Pub. 2017/0331759 A1).
Hundley and LI were cited in the previous Office Action.

As per claim 1, Hundley teaches the invention substantially as claimed including A compute device comprising (Hundley, Fig. 1; [0004] lines 1-2, Data processing hardware, such as computers and personal computers): 
a compute engine to execute an application (Hundley, Fig. 1, 100; [0032] lines 1-4, the computing device 100 executes software instructions. The software instructions can direct the computing device to operate on a block of data, such as a file or some other predetermined block of data); 
an accelerator pool including multiple accelerator devices (Hundley, Fig. 1, 125 (as accelerator pool), 130 HW accelerators); and 
circuitry to (i) receive, from the application, a request to accelerate a function (Hundley, Fig. 7, 355 Task management unit (as circuitry); [0040] line 2, a task management unit (“TMU”) 355 in the circuit card assembly; [0050] lines 4-5, the TMU 355 comprises a memory (TMU as circuitry); [0032] lines 5-10, When the computing device 100 determines that a block of data (as function) is to be operated on by one of the hardware accelerators 130, a command (as request) to perform the operation is transmitted to the circuit card assembly 125 via the I/O bus 132; [0048] lines 1-3, the computing device 320 generates a data structure of commands to be performed by accelerators 330 in the hardware domain. Lines 11-14, The table of commands, also referred to herein as a playlist, lists one or more of the accelerators 330A-330N and available acceleration command options for particular accelerators; [0049] lines 1-3, the computing device 320 transmits the playlist to the TMU 355 in the hardware domain. The processors 122 can transmit the playlist to the TMU 355); 
(ii) determine suitability associated with each accelerator device in the accelerator pool (Hundley, [0065] lines 6-14, The TMU 355 may receive an input from the computing device 320, either as part of the playlist 500, a rules based playlist, or a separate instruction, indicating the operations that may be executed out of order. Alternatively, the TMU 355 may intelligently determine, based on the types of accelerators 540 in the playlist 500 and the options 550 associated with the listed accelerators 540; [0069] lines 3-13, The TMU 355 accesses the file stored in memory 340 and determines which operations listed in Rule 1 list 560 should next be executed. Each of the comparisons with Results1 data may determine what type of file is in the Results1 data. For example, the Result_A may represent an image file (e.g. .jpg, .gif, .tif), the Result_B may represent an executable file (e.g. .exe), and the Result_C may represent a compressed file (e.g. .zip, .rar, .hqx). The accelerators A, B, C, and D may perform different types of antivirus or decompression operations that are suitable for specific file types (as suitability to perform the function); also see [0069] lines 16-17, accelerator A, which may be an image decompression and/or optimization accelerator; lines 20-21, accelerator B, which may be an antivirus accelerator; lines 28-29, accelerator C, which may be a decompression accelerator; [0004] lines 11-13, a hardware accelerator can be any hardware that is designated to perform specific algorithmic operations on data; [Examiner noted: each accelerator’s suitability is determined in order to allow each hardware accelerator to perform different types of operations]); 
(iii) schedule, in response to the request and the determined suitability associated with each accelerator device, acceleration of the function on multiple accelerator devices to produce output data (Hundley, Fig. 4, 420, 440 transmit header to selected hardware accelerator, 450 perform operation; Fig. 5A; [0020] lines 1-3, FIG. 5A is a table illustrating an exemplary playlist containing instructions for multiple accelerators to perform multiple operations on a block of data; [0041] lines 22-27, the TMU 355 may generate instructions for any of hardware accelerators 130, transmit the instructions to the hardware accelerators 130 via the interconnect 150, and allow the accelerators 130 to perform the requested operations by accessing the input data directly from memory 140; [0036] lines 3-4, The hardware accelerator 130A uses data 210A as input data and processes the data to produce output data); and 
(iv) provide, to the application and in response to completion of acceleration of the function, the output data to the application (Hundley, Fig. 4, 470 additional commands in Playlist, NO to 480 Transmit requested output data; [0032] lines 30-31, the hardware accelerator 130 may process the data, ultimately returning the results to the computing device 120 in the software domain; [0058] lines 1-6, If, in block 470, the TMU 355 determines that there are no algorithmic operations remaining to be performed on the block of data, the method continues to block 480 where output data is transmitted from the memory 340 to the computing device 320 via the interconnect 350 and I/O bus 332); [0080] lines 16-20, each command is executed by the associated hardware accelerator until…the end of processing and the data is returned to the software domain).

Hundley fails to specifically teach determine a queue depth associated with each accelerator device, and when scheduling, it is in response to the request and the determined queue depth associated with each accelerator device.

 However, Fong teaches determine a queue depth associated with each accelerator device, and when scheduling, it is in response to the request and the determined queue depth associated with each accelerator device (Fong, Fig. 4, 418-(1-n) Queues, 408-(1-n) GPUs (as accelerator device); Abstract, lines 3-4, receiving a task request for associated with a workload; [0001] lines 2-5, a hybrid computing infrastructure may be comprised…one or more accelerators, such as graphical processing units (GPUs); [0037] lines 5-6, a plurality of GPUs 408-1 through 408-n, each having a queue 418-1 through 418-n, respectively; [0038] lines 2-9, determine an appropriate GPU among GPU 408-1 through GPU 408-n for offloading a task workload. In one embodiment, the appropriate GPU is determined based on one or more considerations. One such consideration is queue length (as queue depth). For example, if CPU 406 first selects GPU 408-1 to offload thread 406a, but GPU 408-1 has a long queue 418-1, the CPU can look for a GPU with a shorter queue (as scheduling based on queue depth/length); also see [0042] lines 1-3, The GPU information may include the current queue length of each GPU).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Hundley with Fong because Fong’s teaching of determining queue length/depth and assigning 

Although Hundley and Fong teach receive, from the application, a request to accelerate a function, Hundley and Fong fail to specifically teach obtain the request to accelerate a function.

However, LI teaches obtain the request to accelerate a function (LI, [0049] lines 9-13, the resource provisioning module 266 retrieves (as obtain) a request from the service level queue 264, and instructs the management component 250 to construct and launch an n new bare metal machine with (numCPU/GPU /accelerator)).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Hundley and Fong with LI because LI’s teaching of obtaining the request from a queue for performing the acceleration would have provided Hundley and Fong’s system with the advantage and capability of easily managing the processing tasks with the task queue, thereby improving system efficiency.

As per claim 2, Hundley, Fong and LI teach the invention according to claim 1 above. Hundley further teaches determine parameters of the request to accelerate the function and wherein to schedule acceleration of the function further comprises to schedule acceleration of the function based on the determined parameters of the request (Hundley, [0065] lines 6-14, The TMU 355 may receive an input from the computing device 320, either as part of the playlist 500, a rules based playlist, or a separate instruction, indicating the operations that may be executed out of order. Alternatively, the TMU 355 may intelligently determine, based on the types of accelerators 540 in the playlist 500 and the options 550 associated with the listed accelerators 540 (as determine parameters of the request); [0069] lines 3-13, The TMU 355 accesses the file stored in memory 340 and determines which operations listed in Rule 1 list 560 should next be executed. Each of the comparisons with Results1 data may determine what type of file is in the Results1 data. For example, the Result_A may represent an image file (e.g. .jpg, .gif, .tif), the Result_B may represent an executable file (e.g. .exe), and the Result_C may represent a compressed file (e.g. .zip, .rar, .hqx). The accelerators A, B, C, and D may perform different types of antivirus or decompression operations that are suitable for specific file types).

As per claim 3, Hundley, Fong and LI teach the invention according to claim 2 above. Hundley further teaches determine one or more of a type of function to be accelerated, a size of a data set to be operated on, or a time period in which acceleration of the function is to be completed (Hundley, [0069] lines 3-13, The TMU 355 accesses the file stored in memory 340 and determines which operations listed in Rule 1 list 560 should next be executed. Each of the comparisons with Results1 data may determine what type of file is in the Results1 data. For example, the Result_A may represent an image file (e.g. .jpg, .gif, .tif), the Result_B may represent an executable file (e.g. .exe), and the Result_C may represent a compressed file (e.g. .zip, .rar, .hqx). The accelerators A, B, C, and D may perform different types of antivirus or decompression operations that are suitable for specific file types (as determine type of function to be accelerated)).

As per claim 5, Hundley, Fong and LI teach the invention according to claim 1 above. Fong further teaches wherein to schedule acceleration of the function comprises to assign the function to multiple accelerator devices based on number of functions presently assigned (Fong, [0038] lines 2-9, determine an appropriate GPU among GPU 408-1 through GPU 408-n for offloading a task workload. In one embodiment, the appropriate GPU is determined based on one or more considerations. One such consideration is queue length (as queue depth). For example, if CPU 406 first selects GPU 408-1 to offload thread 406a, but GPU 408-1 has a long queue 418-1, the CPU can look for a GPU with a shorter queue (as scheduling based on number of functions presently assigned).

As per claim 6, Hundley, Fong and LI teach the invention according to claim 1 above. Hundley further teaches determine a type of function each accelerator device is presently configured to accelerate (Hundley, Fig. 5A, [0065] lines 6-14, The TMU 355 may receive an input from the computing device 320, either as part of the playlist 500, a rules based playlist, or a separate instruction, indicating the operations that may be executed out of order. Alternatively, the TMU 355 may intelligently determine, based on the types of accelerators 540 in the playlist 500 and the options 550 associated with the listed accelerators 540); and 
wherein to schedule acceleration of the function comprises to schedule acceleration of the function based additionally on the determined type of function each accelerator device is presently configured to accelerate (Hundley, [0057] lines 8-9, determining which accelerator 330 should next operate on the data; [0069] lines 3-13, The TMU 355 accesses the file stored in memory 340 and determines which operations listed in Rule 1 list 560 should next be executed. Each of the comparisons with Results1 data may determine what type of file is in the Results1 data. For example, the Result_A may represent an image file (e.g. .jpg, .gif, .tif), the Result_B may represent an executable file (e.g. .exe), and the Result_C may represent a compressed file (e.g. .zip, .rar, .hqx). The accelerators A, B, C, and D may perform different types of antivirus or decompression operations that are suitable for specific file types; [0041] lines 22-27, the TMU 355 may generate instructions for any of hardware accelerators 130, transmit the instructions to the hardware accelerators 130 via the interconnect 150, and allow the accelerators 130 to perform the requested operations by accessing the input data directly from memory 140).

As per claim 10, Hundley, Fong and LI teach the invention according to claim 1 above. Hundley further teaches wherein each accelerator device in the accelerator pool is a field programmable gate array (FPGA) and the circuitry is further to assign acceleration of the function among multiple FPGAs in the accelerator pool (Hundley, Fig. 3, 330 HW accelerators, Task management unit (as acceleration hardware accelerators 130 may be implemented in a variety of methods, including Field Programmable Gate Arrays (FPGAs); [0040] lines 1-7, an interconnect 350 is coupled to a task management unit ("TMU") 355 in the circuit card assembly 325. The TMU 355 provides intelligent routing of commands and data among components of the circuit card assembly 325. As will be explained in further detail below, the TMU 355 allows the execution of multiple commands by multiple accelerators 330 (as assign acceleration of the function among a plurality of FPGAs).

As per claims 13-15, 18 and 22, they are one or more non-transitory machine-readable storage media claims of claims 1-3, 6 and 10 respectively above. Therefore, they are rejected for the same reason as claims 1-3, 6 and 10 respectively above.

As per claim 25, it is a computing device claim of claim 1 above. Therefore, it is rejected for the same reason as claim 1 above. In addition, Hundley further teaches circuitry for executing an application (Hundley, Fig. 1, 100, 122 CPU, 126 memory (as circuitry); [0032] lines 1-4, the computing device 100 executes software instructions. The software instructions can direct the computing device to operate on a block of data, such as a file or some other predetermined block of data)).

As per claims 26-28, they are method claims of claims 1-3 respectively above. Therefore, they are rejected for the same reason as claims 1-3 respectively above.

Claims 7-8 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hundley, Fong and LI, as applied to claims 1 and 13 respectively above, and further in view of Krishnamurthy et al. (US Pub. 2011/0131430 A1).
Krishnamurthy was cited in the previous Office Action.

As per claim 7, Hundley, Fong and LI teach the invention according to claim 1 above. Hundley further teaches wherein the function is one of multiple functions in a sequence of functions to be accelerated (Hundley, Fig. 5A; [0020] lines 1-3, FIG. 5A is a table illustrating an exemplary playlist containing instructions for multiple accelerators to perform multiple operations (as sequence of functions) on a block of data).

Hundley, Fong and LI fail to specifically teach the circuitry is further to determine whether to accelerate the multiple functions on a single accelerator device in the accelerator pool.

However, Krishnamurthy teaches the circuitry is further to determine whether to accelerate the multiple functions on a single accelerator device in the accelerator pool (Krishnamurthy, [0033] lines 1-10, Assume Virtual Queue 4, which runs tasks with constant known execution times, has six tasks queued and each task has a completion time of 10 ms. The accelerator scheduler examines the completion time of all six tasks (not just the first task) and determines that all the tasks can be run on its corresponding accelerator and still finish on time. Thus, other accelerators are not brought up for these tasks (i.e., not run at all or remain in a hibernate state), and these tasks do not need to be moved for proper servicing (as determine whether to accelerate the multiple functions on a single accelerator)).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Hundley, Fong and LI with Krishnamurthy because Krishnamurthy’s teaching of determining whether the tasks can be performed within one accelerator based on the tasks' processing time would have provided Hundley, Fong and LI’s system with the advantage and capability of distributing the tasks among the accelerators based on the processing time of the tasks, thereby improving system performance.

As per claim 8, Hundley, Fong, LI and Krishnamurthy teach the invention according to claim 7 above. Krishnamurthy further teaches determine a time estimate to reconfigure the accelerator device for each function in the sequence (Krishnamurthy, [0033] lines 1-10, Assume Virtual Queue 4, which runs tasks with constant known execution times, has six tasks queued and each task has a completion time of 10 ms. The accelerator scheduler examines the completion time of all six tasks (not just the first task) and determines that all the tasks can be run on its corresponding accelerator and still finish on time. Thus, other accelerators are not brought up for these tasks (i.e., not run at all or remain in a hibernate state), and these tasks do not need to task, such as, for instance, start time and completion time, and/or acceptable energy level, etc., INQUIRY 510. If so, then the task is enqueued on that queue).

As per claims 19-20, they are one or more non-transitory machine-readable storage media claims of claims 7-8 respectively above. Therefore, they are rejected for the same reason as claims 7-8 respectively above.


Claims 9 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Hundley, Fong, LI and Krishnamurthy, as applied to claims 7 and 19 respectively above, and further in view of Bharadwaj et al. (US Pub. 2019/0042110 A1).
Bharadwaj was cited in the previous Office Action.

As per claim 9, Hundley, Fong, LI and Krishnamurthy teach the invention according to claim 7 above. Hundley teaches transfer output data from one accelerator device to another accelerator device in the accelerator pool (Hundley, Fig. 3, step 2, data 310A to HW accelerator 330A, step 3, 312 data from output of HW accelerator 330A is transfer to HW accelerator 330B at step 4).

Hundley, Fong, LI and Krishnamurthy fail to specifically teach determine a time estimate to transfer output data.

However, Bharadwaj teaches determine a time estimate to transfer output data (Bharadwaj, [0094] lines 12-14, determining a transfer time required to transfer the data length over the port; upon receiving a next IO request, determining whether a time interval between the IO request and the next IO request is less than the transfer time).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Hundley, Fong, LI and Krishnamurthy with Bharadwaj because Bharadwaj’s teaching of determining the time for transferring the data length would have provided Hundley, Fong, LI and Krishnamurthy’s system with the advantage and capability of calculating the total amount of time needed for processing the tasks and transferring the data in order to determine the different approaches for processing the tasks, thereby improving system efficiency.

As per claim 21, it is one or more non-transitory machine-readable storage media claim of claim 9 above. Therefore, it is rejected for the same reason as claim 9 above.


Claims 11-12 and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Hundley, Fong and LI, as applied to claims 1 and 13 respectively above, and further in view of Hebert et al. (US Pub. 2017/0346902 A1).
Hebert was cited in the previous Office Action.

As per claim 11, Hundley, Fong and LI teach the invention according to claim 1 above. Hundley teaches wherein an accelerator device in the accelerator pool to which the function is scheduled (Hundley, Fig. 4, 450 perform operation; Fig. 5A; [0020] lines 1-3, FIG. 5A is a table illustrating an exemplary playlist containing instructions for multiple accelerators to perform multiple operations on a block of data; [0041] lines 22-27, the TMU 355 may generate instructions for any of hardware accelerators 130, transmit the instructions to the hardware accelerators 130 via the interconnect 150, and allow the accelerators 130 to perform the requested operations by accessing the input data directly from memory 140).

Hundley, Fong and LI fail to specifically teach when the function is scheduled to perform is to load a bit stream to accelerate the function.

However, Hebert teaches when the function is scheduled to perform is to load a bit stream to accelerate the function (Hebert, claim 3, wherein the reconfigurable logic device is a field-programmable gate array ( FPGA), and the executing the configuration script causes a bit stream to be loaded into the FPGA).



As per claim 12, Hundley, Fong, LI and Hebert teach the invention according to claim 11 above. Hundley further teaches wherein the accelerator device is to send, to the circuitry, a notification indicative of completion of the acceleration of the function (Hundley, Fig. 4, 460 Notify TMU of completion of operation; [0054] lines 1-3, the accelerator 330 that operated on the data transmits a signal to the TMU 355 indicating that the operation has been completed).

As per claims 23-24, they are one or more non-transitory machine-readable storage media claims of claims 11-12 respectively above. Therefore, they are rejected for the same reason as claims 11-12 respectively above.


Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Hundley, Fong and LI, as applied to claim 13 above, and further in view of Bird et al. (US Pub. 2014/0181833 A1).


As per claim 17, Hundley, Fong and LI teach the invention according to claim 13 above. Fong further teaches wherein to schedule acceleration of the function comprises to assign the function to one of the accelerator devices that has shorter queue depth (Fong, [0038] lines 2-9, determine an appropriate GPU among GPU 408-1 through GPU 408-n for offloading a task workload. In one embodiment, the appropriate GPU is determined based on one or more considerations. One such consideration is queue length (as queue depth). For example, if CPU 406 first selects GPU 408-1 to offload thread 406a, but GPU 408-1 has a long queue 418-1, the CPU can look for a GPU with a shorter queue (as scheduling based on number of functions presently assigned). 

Hundley, Fong and LI fail to specifically teach when assigning the function to one of the accelerator devices that has the shortest queue depth.

However, Bird teaches when assigning the function to one of the accelerator devices that has the shortest queue depth (Bird, Abstract, lines 7-8, A single processing queue is created for each processor; [0049] lines 11-14, Simple load balancing such as a round robin dispatching or scheduling of incoming threads /tasks, or adding new tasks to the shortest queue can be used to ensure the run queue lengths remain relatively even).
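For illustration only, the distinction drawn above between selecting a device with a shorter queue (as Fong is characterized) and selecting the device with the shortest queue (as Bird is characterized) can be sketched as follows. This is a minimal sketch and is not code from either reference; the function names `pick_shorter` and `pick_shortest`, the device labels, and the queue lengths are hypothetical and chosen only to make the difference concrete.

```python
from typing import Dict


def pick_shorter(queue_len: Dict[str, int], first_choice: str) -> str:
    """Hypothetical 'shorter queue' selection: keep the first choice unless
    some other device has a shorter queue, and return the first such device
    encountered (not necessarily the one with the fewest queued tasks)."""
    for device, depth in queue_len.items():
        if depth < queue_len[first_choice]:
            return device
    return first_choice


def pick_shortest(queue_len: Dict[str, int]) -> str:
    """Hypothetical 'shortest queue' selection: always return the device
    with the minimum number of queued tasks."""
    return min(queue_len, key=queue_len.get)


# Hypothetical queue depths for three accelerator devices.
queues = {"gpu-1": 5, "gpu-2": 3, "gpu-3": 1}

shorter = pick_shorter(queues, "gpu-1")   # a device with a shorter queue than gpu-1
shortest = pick_shortest(queues)          # the device with the fewest queued tasks
```

Under these assumed values, the "shorter queue" selection settles on the first device found with fewer queued tasks than the initial choice, while the "shortest queue" selection always picks the global minimum.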



Response to Arguments
Applicant’s arguments with respect to claims 1-3, 5-15 and 17-28 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZUJIA XU whose telephone number is (571)272-0954. The examiner can normally be reached M-F 9:00-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/MENG AI T AN/Supervisory Patent Examiner, Art Unit 2195

/Z.X./Examiner, Art Unit 2195