DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the RCE filed on 6/1/2021.
Claims 2-21 are pending.

Response to Amendment
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 2, 4-6, 8, 9, 11-13, 15, 16, and 18-21 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Lachner US 2010/0153934 A1 (hereinafter Lachner).

Per Claim 2:
A computing device, comprising: a central processing unit (CPU) (Lachner, [0018], a computing system having one or more CPUs), to:
generate an execution policy indicating how source code is to be compiled (Lachner, [0017], A compiler generates machine code that includes prefetching and/or scheduling optimizations for code to be executed on a first processing element (such as, e.g., a CPU) and one or more additional processing element(s) (such as, e.g., GPU) of a heterogeneous multi-processor system. [0020], FIG. 1 illustrates at least one embodiment of a compiler 120 to generate compiler-based software pre-fetch optimization instructions for code to be executed on a heterogeneous multi-processor target hardware system 140. For at least one embodiment, the compiler translates a computer program 102 written in a high-level language, such as C++, DirectX, or FORTRAN, into machine language for the appropriate processing elements of the target hardware system 140. The compiler takes the high-level code for the computer program as input and generates a so-called "fat" machine executable binary file 104 that includes machine language instructions for both a first and second processing element of the target hardware of the processing system on which the computer program is to be executed. For at least one embodiment, the resultant "fat" binary file 104 includes machine language instructions for a first processing element (e.g., a CPU) and a second processing element (e.g., a GPU). [0062], apply compiler optimization techniques to code written for a system that includes heterogeneous processor architectures to deliver optimized performance of foreign code. Foreign code portions, which are compiled for a processor architecture that is different from the CPU architecture, are compiled as foreign macro-instruction extensions to the native ...).
Examiner notes that the instant Specification, [0007], states: The compiler is configured to embed the intermediate representation of the logic iterator into the compiled first portion of machine code when the specifier of the particular target and execution policy indicates JIT compilation for sequential execution on the CPU, JIT compilation for parallel execution on the CPU, JIT compilation for execution on a GPU, or runtime selectable compilation and execution. Lachner teaches an execution instruction/policy indicating how to compile/perform the source code, which reads on the instant claimed limitation; and
a graphics processing unit (GPU) to compile at least a portion of the source code according to the execution policy (Lachner, [0041], The compiler 120 effectively offloads from the CPU foreign code portions to a GPU by treating them as foreign macro-instructions that can then be subjected to compiler-based optimization techniques).

Per Claim 4:
The rejection of claim 2 is incorporated, further Lachner teaches comprising a runtime library defining a plurality of execution policies including: just-in-time (JIT) parallel code implementations to be performed by the GPU (Lachner, [0062], This compilation results in generation of prefetch and "launch" run-time function calls that are inserted into the intermediate representation for the foreign macro-instructions. Thus, the programmer need not use any special programming language to effect synchronized concurrent programming for heterogeneous architectures). Examiner notes that a runtime library is a set of routines/functions commonly used in programs; Lachner’s run-time functions read on the runtime library; and
runtime selectable compilation on the GPU (Lachner, [0041], The compiler 120 effectively offloads from the CPU foreign code portions to a GPU by treating them as foreign macro-instructions that can then be subjected to compiler-based optimization techniques. [0064] FIG. 9 illustrates at least one embodiment of a system 900 in which the run-time support function calls executed by the CPU 200 cause the appropriate operations to be performed on the GPU 220. FIG. 9 illustrates that the system 900 includes a modified compiler 120 (to generate heterogeneous machine code 908 for an application)).

Per Claim 5:
The rejection of claim 4 is incorporated, further Lachner teaches wherein a compiler module is further configured to generate, during runtime, using the compiled source code and the runtime library, object code to be performed by the GPU (Lachner, [0064] FIG. 9 illustrates at least one embodiment of a system 900 in which the run-time support function calls executed by the CPU 200 cause the appropriate operations to be performed on the GPU 220. FIG. 9 illustrates that the system 900 includes a modified compiler 120 (to generate heterogeneous machine code 908 for an application)).

Per Claim 6:
The rejection of claim 5 is incorporated, further Lachner teaches wherein the compiler module is further configured to use the runtime library to generate the object code by selecting one policy of the plurality of execution policies (Lachner, [0064] FIG. 9 illustrates at least one embodiment of a system 900 in which the run-time support function calls executed by the CPU 200 cause the appropriate operations to be performed on the GPU 220. FIG. 9 illustrates that the system 900 includes a modified compiler 120 (to generate heterogeneous machine code 908 for an application)).

Per Claim 8:
The rejection of claim 2 is incorporated, further Lachner teaches wherein the object code execution policy specifies at least one of a sequential execution policy and a parallel execution policy (Lachner, [0061], the compiler may "schedule" the code segments concurrently by placing the "launch" calls sequentially in the CPU code 800 without any synchronization instructions between them. It is assumed that the GPU runtime scheduler (914 of FIG. 9) will schedule the GPU operations corresponding to the "launch" calls in parallel, if feasible, on the GPU side).

Per Claims 9, 11-13, and 15:
These are method versions of the computing device discussed above (claims 2, 4-6, and 8), wherein all claim limitations have also been addressed and/or covered as set forth above. Accordingly, these claims are also anticipated by Lachner.

Per Claims 16, and 18-21:
These are system versions of the computing device discussed above (claims 2, 4-6, and 8), wherein all claim limitations have also been addressed and/or covered as set forth above. Accordingly, these claims are also anticipated by Lachner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 7, 10, 14, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Lachner US 2010/0153934 A1 (hereinafter Lachner), in view of Felch, US 2013/0086564 A1 (hereinafter Felch).

Per Claim 3:
The rejection of claim 2 is incorporated. Lachner does not explicitly teach wherein the portion of source code includes a logic iterator. However, Felch teaches wherein the portion of source code includes a logic iterator (Felch, FIG. 7, 730, 735, and 770).
It would have been obvious to one having ordinary skill in the computer art before the effective filing date of the claimed invention to modify the computing device disclosed by Lachner so that the portion of source code includes a logic iterator, using the teaching of Felch. The modification would be obvious because one of ordinary skill in the art would be motivated to provide a program that may be run on the parallel computing architecture (Felch [0012]).

Per Claim 7:
The rejection of claim 5 is incorporated, Lachner does not explicitly teach wherein the object code is compiled for compute unified device architecture (CUDA).
However, Felch teaches wherein the object code is compiled for compute unified device architecture (CUDA) (Felch, [0031], The CUDA optimizing compiler receives a program and compiles it for execution on parallel hardware (CUDA-compatible Graphics Processing Units)...The CUDA optimizing compiler can initiate execution of the compiled program on parallel hardware as in CUDA-compatible Graphics Processing Units (GPUs)).
It would have been obvious to one having ordinary skill in the computer art before the effective filing date of the claimed invention to modify the computing device disclosed by Lachner so that the object code is compiled for compute unified device architecture (CUDA), using the teaching of Felch. The modification would be obvious because one of ordinary skill in the art would be motivated to provide an automated method of optimizing execution of a program in a parallel processing environment (Felch [0004]).

Per Claims 10, and 14:
These are method versions of the computing device discussed above (claims 3 and 7), wherein all claim limitations have also been addressed and/or covered as set forth above. Accordingly, these claims are also obvious.

Per Claim 17:
This is a system version of the computing device discussed above (claim 3), wherein all claim limitations have also been addressed and/or covered as set forth above. Accordingly, this claim is also obvious.

Response to Arguments
Applicant's arguments filed 6/1/2021 have been fully considered but they are not persuasive. 

Applicant argued:
Applicant respectfully submits that Lachner fails to teach each element of claim 2 and respectfully requests withdrawal of the rejection of claim 2. By way of example, Lachner at least fails to teach “a central processing unit (CPU) to generate an execution policy indicating how source code is to be compiled” and “a graphics processing unit (GPU) to compile at least a portion of the source code according to the execution policy.” Accordingly, Applicant respectfully requests withdrawal of the rejection of claim 2.
As noted in the specification, the execution policy is added at run-time and results in intermediate code that is then fully compiled to final object code in accordance with the execution policy when sent to the GPU. This can include, for example, accessing run-time libraries which may be added to the code. In contrast, the cited portions of Lachner (e.g., paragraphs 62, 85, and 89) appear to be directed toward identification of code sequences. This identification of a sequence cannot reasonably be 

Examiner response:
In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., the execution policy is added at run-time and results in intermediate code that is then fully compiled to final object code in accordance with the execution policy when sent to the GPU. This can include, for example, accessing run-time libraries which may be added to the code.) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
In fact, Lachner teaches the instant claimed limitation "generate an execution policy indicating how source code is to be compiled"; see Lachner, [0017], A compiler generates machine code that includes prefetching and/or scheduling optimizations for code to be executed on a first processing element (such as, e.g., a CPU) and one or more additional processing element(s) (such as, e.g., GPU) of a heterogeneous multi-processor system. [0020], FIG. 1 illustrates at least one embodiment of a compiler 120 to generate compiler-based software pre-fetch optimization instructions for code to be executed on a heterogeneous multi-processor target hardware system 140 … The compiler takes the high-level code for the computer program ... Here, Lachner teaches an execution policy indicating how source code is to be compiled so as to optimize the use of processing units such as a CPU or GPU.
Further, Lachner teaches a graphics processing unit (GPU) to compile at least a portion of the source code according to the execution policy; see Lachner, [0041], The compiler 120 effectively offloads from the CPU foreign code portions to a GPU by treating them as foreign macro-instructions that can then be subjected to compiler-based optimization techniques.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 2012/0166772 teaches that the graphics compiler is able to pass the fragment program (as a render graph node) to a selected GPU at run-time.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANNA CHEN DENG whose telephone number is (571)272-5989. The examiner can normally be reached on 9:30 AM – 6:30 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen, can be reached at 571-272-3708. The fax phone number for the organization where this application or proceeding is assigned is 703-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 

/ANNA C DENG/
Primary Examiner, Art Unit 2191