Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claims 1-20 are presented for examination.

Allowable Subject Matter
Claims 4-6 and 14-16 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4-6 and 14-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 4 and 14 recite the limitation “the accelerated code version of the second application” in the claims.
Dependent Claims 5-6 and 15-16 are also rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to cure the deficiencies of the claims from which they depend.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 7-8, 10-12, 17-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Meswani (“Modeling and predicting performance of high performance computing applications on hardware accelerators”), in view of Baldini (US 9715663 B2), further in view of Aslot (US 20170132163 A1).

Regarding Claim 1, Meswani (Modeling and predicting performance of high performance computing applications on hardware accelerators) teaches

generate a computational profile for each of the applications based at least in part upon execution metrics of each of the applications for the CPU and the accelerated processing unit (Page 92, Section 4, We developed a benchmark suite that enabled us to profile the rate at which the FPGA, the GPU, and the CPU can perform idiom operations at different data sizes (the range is covered by the induction variable expressed in bytes); Page 94, Section 5.2, In order to model the performance of identified idioms on accelerator hardware, we also need to capture the data footprint for each identified idiom. The data footprint depends on data input at runtime; hence we use PEBIL (Laurenzano et al., 2010), a binary instrumentation tool to capture those parameters); 
and apply a genetic algorithm (GA) prediction model to predict if an execution speedup is achievable on an accelerated processing unit for each of the applications based at least in part upon: the computational profile for each of the applications; the computational profile for each of the benchmarks; and available computational capacities of the CPU and the accelerated processing unit within an execution time window (Page 95, Section 6, For the purpose of modeling and the prediction of accelerators, we extended the PMaC performance-modeling framework (Snavely et al., 2002). This framework is used to provide fast and accurate predictions of large scale HPC application performance. The modeling framework is based on three components: (1) benchmarks that characterize how fast a machine can perform certain operations, called machine profiles, (2) tracing and simulation tools that gather information about application characteristics and requirements, called application signatures, and (3) the convolver, which are methods that predict performance using application signatures and machine profiles); 
and a predicted accelerated processing unit execution speed is faster than a predicted or observed CPU execution speed within the execution time window (Abstract, Hybrid-core systems speedup applications by offloading certain compute operations that can run faster on hardware accelerators).
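
For illustration only, the convolver step quoted above from Meswani (combining an application signature with a machine profile to predict performance) can be sketched as follows. This is a minimal sketch and not Meswani's implementation: the names `IdiomSignature`, `predict_time` and `predict_speedup` are hypothetical, each idiom is assumed to have a single fixed rate per device (the cited benchmark suite measures rates at different data sizes), and data-transfer cost between host and accelerator is ignored.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class IdiomSignature:
    """One application-signature entry: an idiom and its data footprint in bytes.

    Hypothetical shape, chosen only for this illustration.
    """
    idiom: str
    bytes_processed: float


# Machine profile: device name -> idiom name -> sustained rate in bytes/second.
MachineProfile = Dict[str, Dict[str, float]]


def predict_time(signature: List[IdiomSignature],
                 profile: MachineProfile,
                 device: str) -> float:
    """Predict run time on one device by summing footprint / rate over idioms."""
    rates = profile[device]
    return sum(entry.bytes_processed / rates[entry.idiom] for entry in signature)


def predict_speedup(signature: List[IdiomSignature],
                    profile: MachineProfile,
                    accelerator: str,
                    cpu: str = "cpu") -> float:
    """Return predicted CPU time divided by predicted accelerator time."""
    return predict_time(signature, profile, cpu) / predict_time(
        signature, profile, accelerator)


# Made-up numbers, for illustration only.
example_profile: MachineProfile = {
    "cpu": {"gather": 1e9, "stream": 4e9},
    "gpu": {"gather": 4e9, "stream": 8e9},
}
example_signature = [
    IdiomSignature("gather", 2e9),
    IdiomSignature("stream", 8e9),
]
```

Under these assumed numbers, a speedup greater than 1 from `predict_speedup(example_signature, example_profile, "gpu")` would correspond to the accelerator being predicted faster than the CPU.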

Meswani did not specifically teach
receive a plurality of applications; 
wherein the accelerated processing unit is configured to process an accelerated code version of a first one of the applications in response to a prediction, by the processor, that: the accelerated processor unit has a sufficient available computational capacity to execute the accelerated code version of the first application within the execution time window.


However, Baldini (US 9715663 B2) teaches
receive a plurality of applications (Claim 1, obtaining a set of existing applications and observed performance of corresponding accelerated codes associated with the existing applications executing on a hardware accelerator processor on a target hardware device);
wherein the accelerated processing unit is configured to process an accelerated code version of a first one of the applications in response to a prediction, by the processor (Claim 7, and the predictive models are employed to determine a best target device from the plurality of target devices for running a new application).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have combined Meswani’s teaching with Baldini’s in order to predict a hardware device for best program performance by obtaining a plurality of existing applications and their observed performance on a plurality of target hardware devices (Baldini [Summary]).

Meswani and Baldini did not specifically teach
that: the accelerated processor unit has a sufficient available computational capacity to execute the accelerated code version of the first application within the execution time window.

However, Aslot (US 20170132163 A1) teaches
that: the accelerated processor unit has a sufficient available computational capacity to execute the accelerated code version of the first application within the execution time window (Para [0029], the coherent accelerator adapter 132 may perform “round robin” through the contexts, switching on a specific time period (e.g., every 10 ms)).
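
For illustration only, the following sketch shows how a fixed round-robin time slice, such as the 10 ms period quoted above, bounds the accelerator time one context can receive within a given window. The sketch is not taken from Aslot; the function names, the `position` parameter and the capacity check itself are assumptions introduced here.

```python
def accelerator_time_in_window(window_ms: float,
                               num_contexts: int,
                               slice_ms: float = 10.0,
                               position: int = 0) -> float:
    """Return the accelerator time (ms) granted to one context within a window.

    Contexts are served in strict round-robin order, one slice each, starting
    with position 0 at time 0. The final slice is cut off at the window end.
    """
    if num_contexts <= 0 or slice_ms <= 0:
        raise ValueError("num_contexts and slice_ms must be positive")
    elapsed = 0.0
    granted = 0.0
    turn = 0
    while elapsed < window_ms:
        run = min(slice_ms, window_ms - elapsed)
        if turn % num_contexts == position:
            granted += run
        elapsed += run
        turn += 1
    return granted


def fits_in_window(required_ms: float,
                   window_ms: float,
                   num_contexts: int,
                   slice_ms: float = 10.0,
                   position: int = 0) -> bool:
    """Check whether a context needing required_ms of accelerator time can finish."""
    return accelerator_time_in_window(
        window_ms, num_contexts, slice_ms, position) >= required_ms
```

With four contexts sharing the accelerator and a 100 ms window, the context at position 0 is served in the first, fifth and ninth slices, so a workload needing 25 ms of accelerator time fits while one needing 40 ms does not.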

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Meswani and Baldini with Aslot’s in order to check for exceptions on a coherent accelerator, which involves receiving a system call to attach a hardware

Regarding Claim 2, Meswani, Baldini and Aslot teach
The computer-implemented system of claim 1, wherein the accelerated processing unit comprises a processing unit selected from the group consisting of a graphic processing unit (GPU) and a field programmable gate array (FPGA) (Meswani [Page 89, Introduction, Application performance may improve when a CPU offloads compute operations to hardware accelerators such as a GPU or a FPGA]).

Regarding Claim 7, Meswani, Baldini and Aslot teach
The computer-implemented system of claim 1, wherein the processor is further configured to apply the GA prediction model to predict a hybrid execution speed of hybrid processing by the CPU and the accelerated processing unit for each of the applications (Meswani [Abstract, Hybrid-core systems speedup applications by offloading certain compute operations that can run faster on hardware accelerators. However, such systems require significant programming and porting effort to gain a performance benefit from the accelerators. Therefore, prior to porting it is prudent to investigate the predicted performance benefit of accelerators for a given workload. To address this problem we present a performance-modeling framework that predicts the application performance rapidly and accurately for hybrid-core systems]).

Regarding Claim 8, Meswani, Baldini and Aslot teach
The computer-implemented system of claim 7, wherein the processor is further configured to generate an optimization path for processing by the CPU, processing by the accelerated processing unit, or hybrid processing by the CPU and the accelerated processing unit (Meswani [Abstract, Hybrid-core systems speedup applications by offloading certain compute operations that can run faster on hardware accelerators. However, such systems require significant programming and porting effort to gain a performance benefit from the accelerators. Therefore, prior to porting it is prudent to investigate the predicted performance benefit of accelerators for a given workload. To address this problem we present a performance-modeling framework that predicts the application performance rapidly and accurately for hybrid-core systems]).

Regarding Claim 10, Meswani, Baldini and Aslot teach
The computer-implemented system of claim 1, further comprising at least one additional CPU or at least one additional accelerated processing unit (Meswani [Page 90, Left Col, 3rd Paragraph, Each compute node on our hypothetical machine has an x86 host CPU, a NVIDIA Fermi GPU, and a Convey FPGA co-processor]).

Regarding Claim 11, it is a computer-implemented method claim corresponding to the computer-implemented system claim above (Claim 1) and, therefore, is rejected for the same reasons set forth in the rejection of Claim 1.

Regarding Claim 12, it is a computer-implemented method claim corresponding to the computer-implemented system claim above (Claim 2) and, therefore, is rejected for the same reasons set forth in the rejection of Claim 2.

Regarding Claim 17, it is a computer-implemented method claim corresponding to the computer-implemented system claim above (Claim 7) and, therefore, is rejected for the same reasons set forth in the rejection of Claim 7.

Regarding Claim 18, it is a computer-implemented method claim corresponding to the computer-implemented system claim above (Claim 8) and, therefore, is rejected for the same reasons set forth in the rejection of Claim 8.

Regarding Claim 20, it is a computer-implemented method claim corresponding to the computer-implemented system claim above (Claim 10) and, therefore, is rejected for the same reasons set forth in the rejection of Claim 10.


Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Meswani (“Modeling and predicting performance of high performance computing applications on hardware accelerators”), in view of Baldini (US 9715663 B2), and Aslot (US 20170132163 A1) further in view of Chen (US 20190121674 A1).

Regarding Claim 3, Meswani, Baldini and Aslot teach
The computer-implemented system of claim 2.

Meswani, Baldini and Aslot did not teach
wherein the accelerated code version of the first application comprises an application selected from the group consisting of a GPU code version of the first application and an FPGA code version of the first application.

However, Chen (US 20190121674 A1) teaches 
wherein the accelerated code version of the first application comprises an application selected from the group consisting of a GPU code version of the first application and an FPGA code version of the first application (Para [0068], When the controller 130 receives clCreateProgramWithSource, it may be determined that what is received by the controller 130 is the OpenCL Kernel source code. The controller 130 may use a compiler to generate OpenCL Kernel binary code for GPU and FPGA respectively).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Meswani, Baldini and Aslot with Chen’s in order to allocate processing resources to applications, by determining

Regarding Claim 13, it is a computer-implemented method claim corresponding to the computer-implemented system claim above (Claim 3) and, therefore, is rejected for the same reasons set forth in the rejection of Claim 3.

Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Meswani (“Modeling and predicting performance of high performance computing applications on hardware accelerators”), in view of Baldini (US 9715663 B2), and Aslot (US 20170132163 A1) further in view of Chakradhar (US 20100088490 A1).


Regarding Claim 9, Meswani, Baldini and Aslot teach
The computer-implemented system of claim 1.

Meswani, Baldini and Aslot did not teach
wherein the processor is further configured to receive the applications from a scheduler.

However, Chakradhar (US 20100088490 A1) teaches 
wherein the processor is further configured to receive the applications from a scheduler (automatically scheduling the offload of computation operations from a host processor memory to the parallel accelerator and scheduling data transfers between the host processor memory and the parallel accelerator such that the data transfers are minimized by modifying at least one of computation operations and communications in the template to generate an optimized execution plan).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Meswani, Baldini and Aslot with Chakradhar’s in order to provide an application execution processing improving system for a hybrid computing platform, by presenting an application program interface to an application representing a template and managing execution of a plan on a parallel accelerator (Chakradhar [Summary]).

Regarding Claim 19, it is a computer-implemented method claim corresponding to the computer-implemented system claim above (Claim 9) and, therefore, is rejected for the same reasons set forth in the rejection of Claim 9.

Notice of References Cited
Geleji (US 20200167137 A1) is related to obtaining a native code having a large number of counters embedded for profiling. Use cases that are serviced by the native code are identified and respective use case profiles representing performance characteristics of a corresponding use case are

Choudhury (US 20200125926 A1) is related to obtaining, as input for inferencing of one or more deep neural networks, (i) an inferencing model and (ii) one or more resource constraints; computing, based at least in part on the obtained input, a set of statistics pertaining to resource utilization for each of multiple layers in the one or more deep neural networks; determining, based at least in part on (i) the obtained input and (ii) the computed set of statistics, multiple batch sizes to be used for inferencing the multiple layers of the one or more deep neural networks.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMIR SOLTANZADEH whose telephone number is (571)272-3451. The examiner can normally be reached M-F, 9am - 5pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen, can be reached at (571) 272-3708. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/AMIR SOLTANZADEH/
Examiner, Art Unit 2191

/WEI Y ZHEN/
Supervisory Patent Examiner, Art Unit 2191