Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first-to-invent provisions.

Claims 1-20 are presented for examination.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
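As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 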
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) are: “one or more compilers to compile …” and “a linking module to link …” in claim 8, “an executable file generation module to generate …” in claim 9, “an identification module to assign …” in claim 10, and “the linking module is further operable to …” in claim 11.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA  as explained in MPEP § 2159.  See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 

Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-24 of U.S. Patent No. 10,261,807. Although the claims at issue are not identical, they are not patentably distinct from each other because the referenced patent and the instant application claim common subject matter. For purposes of illustration, the rejection of claim 1 is set forth as follows:
Current Application 16/268,106
1. A computer-implemented method, comprising: 
compiling at least one portion of device code and at least one portion of host code separately
and linking the compiled at least one portion of device code to the compiled at least one portion of host code 

U.S. Patent No. 10,261,807
uniquely identifying a device code portion associated with each host object fileset of a plurality of host object filesets used as input, wherein two or more of said plurality of host 
using a linker to link together said plurality of host object filesets to produce a plurality of unique linked device code portions, wherein said linker uses as input separate device code portions from said two or more of said plurality of device code portions to produce said plurality of unique linked device code portions; 
and generating said executable file, wherein said executable file comprises an executable form of said plurality of unique linked device code portions.

(Zhu2, Paragraph 0020, For CPU code in GP code 12, GP compiler 20 compiles the one or more modules with CPU code into one or more object or intermediate representation (IR) files 22; Paragraph 0021, DP device compiler 38 is configured to compile code written in a high level DP device programming language such as HLSL (High Level Shader Language) rather than code written in the GP language of GP code 12)
Therefore, it would have been obvious to a person having ordinary skill in the art that said executable file is accessed for execution using different processor types, as taught by Zhu, in order to enhance the ability of programmers to program data parallel devices by allowing programmers to program both CPUs and data parallel devices (e.g., GPUs) using a high level general purpose programming language that has data parallel (DP) extensions (Zhu2 [summary]).


Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. 

Claims 1-5, 7-12, and 14-19 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Zhu (US20120317556) in view of Zhu (US20110314458), hereinafter Zhu2.
Note: Zhu and Zhu2 were cited in the IDS.

Regarding Claim 1, Zhu (US20120317556) teaches 
A computer- implemented method, comprising: 
and linking the compiled at least one portion of device code to the compiled at least one portion of host code (Fig. 1; Paragraph 0050, compiler 101 can compile higher level code 111 into proxy code 122 for execution on a CPU and stub code 123 for execution on a co-processor; Paragraph 0062, stub code 123 includes data receive code 127 and kernel dispatch code 132) Examiner Comments: The data receive code 127 and kernel dispatch code 132 are interpreted as the claimed device code portion. The stub code 123, which contains the data receive code 127 and kernel dispatch code 132, is interpreted as the claimed linked device code portion. 

Zhu did not teach
compiling at least one portion of device code and at least one portion of host code separately.

However, Zhu2 (US20110314458) teaches
compiling at least one portion of device code and at least one portion of host code separately (Paragraph 0020, For CPU code in GP code 12, GP compiler 20 compiles the one or more modules with CPU code into one or more object or intermediate representation (IR) files 22; Paragraph 0021, DP device compiler 38 is configured to compile code written in a high level DP device programming language such as HLSL (High Level Shader Language) rather than code written in the GP language of GP code 12) Examiner Comments: Compiler 20 and Compiler 38 are separate compilers.

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu’s teaching with that of Zhu2 in order to enhance the ability of programmers to program data parallel devices by allowing programmers to program both CPUs and data parallel devices (e.g., GPUs) using a high level general purpose programming language that has data parallel (DP) extensions (Zhu2 [summary]).

Regarding Claim 2, Zhu and Zhu2 teach
The computer-implemented method of claim 1, further comprising: generating an executable file comprising an executable form of the compiled at least one portion of device code and the compiled at least one portion of host code (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]) Examiner Comments: The lower level code 121 is interpreted as the claimed executable file; 
and accessing the executable file for execution using different processor types (Zhu [Paragraph 0067, Method 400 includes in response to the execution command, an act of executing the proxy code on one of one or more central processing units to invoke a call stub (act 402); Paragraph 0075, Act 402 includes an act of invoking the stub code on one of the one or more co-processors (act 405)]); 
wherein the different processor types comprise a central processor type and a graphics processor type (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]).
 
Regarding Claim 3, Zhu and Zhu2 teach
The computer-implemented method of claim 1.

Zhu did not teach 
further comprising assigning a unique identifier to the compiled at least one portion of device code.

However, Zhu2 teaches
further comprising assigning a unique identifier to the compiled at least one portion of device code (Zhu2 [Paragraph 0026, GP compiler 20 uses a naming convention for kernel functions and types used in the DP intermediate code. The naming convention ensures that a unique name is used for each kernel function and type and that the unique name is used consistently for each instance of a function and a type]).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu’s teaching with that of Zhu2 in order to enhance the ability of programmers to program data parallel devices by allowing programmers to program both CPUs and data parallel devices (e.g., GPUs) using a high level general purpose programming language that has data parallel (DP) extensions (Zhu2 [summary]).

Regarding Claim 4, Zhu and Zhu2 teach
The computer-implemented method of claim 3.

Zhu did not teach 
wherein the assigning further comprises using the unique identifier to prevent the compiled at least one portion of device code from being linked more than once.  

However, Zhu2 teaches
wherein the assigning further comprises using the unique identifier to prevent the compiled at least one portion of device code from being linked more than once (Zhu2 [Paragraph 0026, GP compiler 20 uses a naming convention for names used for identifying binding descriptors 24. This naming convention allows binding descriptors 24 to be uniformly referenced in import tables 24D based on locally available information. The naming conventions may be based on the names of the kernel functions and types in GP code 12]).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu’s teaching with that of Zhu2 in order to enhance the ability of programmers to program data parallel devices by allowing programmers to program both CPUs and data parallel devices (e.g., GPUs) using a high level general purpose programming language that has data parallel (DP) extensions (Zhu2 [summary]).

Regarding Claim 5, Zhu and Zhu2 teach
The computer-implemented method of claim 1, wherein the compiled at least one portion of host code comprises instructions to be executed by a central processing unit (CPU) and the compiled at least one portion of device code comprises instructions to be executed by a graphics processing unit (GPU) (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]). 

Regarding Claim 7, Zhu and Zhu2 teach
The computer-implemented method of claim 1, wherein the compiled at least one portion of device code is provided to a device linker from a host object (Zhu [Paragraph 0042, From the intermediate representation 181, code generator 103 can generate a plurality of different lower level instructions (e.g., DirectX.RTM./High Level Shader Language ("HLSL") bytecode) that correctly implement the statements and expressions of received higher level code]).

Regarding Claim 8, Zhu teaches
A system for executing code, the system comprising: 
and a linking module to link the compiled at least one portion of device code to the compiled at least one portion of host code (Fig. 1; Paragraph 0050, compiler 101 can compile higher level code 111 into proxy code 122 for execution on a CPU and stub code 123 for execution on a co-processor; Paragraph 0062, stub code 123 includes data receive code 127 and kernel dispatch code 132) Examiner Comments: The data receive code 127 and kernel dispatch code 132 are interpreted as the claimed device code portion. The stub code 123, which contains the data receive code 127 and kernel dispatch code 132, is interpreted as the claimed linked device code portion. 

Zhu did not teach
one or more compilers to compile at least one portion of device code and at least one portion of host code separately.

However, Zhu2 teaches
one or more compilers to compile at least one portion of device code and at least one portion of host code separately (Paragraph 0020, For CPU code in GP code 12, GP compiler 20 compiles the one or more modules with CPU code into one or more object or intermediate representation (IR) files 22; Paragraph 0021, DP device compiler 38 is configured to compile code written in a high level DP device programming language such as HLSL (High Level Shader Language) rather than code written in the GP language of GP code 12) Examiner Comments: Compiler 20 and Compiler 38 are separate compilers.

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu’s teaching with that of Zhu2 in order to enhance the ability of programmers to program data parallel devices by allowing programmers to program both CPUs and data parallel devices (e.g., GPUs) using a high level general purpose programming language that has data parallel (DP) extensions (Zhu2 [summary]).

Regarding Claim 9, Zhu and Zhu2 teach
The system of claim 8, further comprising: an executable file generation module to generate an executable file comprising an executable form of the compiled at least one portion of device code and the compiled at least one portion of host code (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]) Examiner Comments: The lower level code 121 is interpreted as the claimed executable file; 
and at least two different processor types to execute the executable file (Zhu [Paragraph 0067, Method 400 includes in response to the execution command, an act of executing the proxy code on one of one or more central processing units to invoke a call stub (act 402); Paragraph 0075, Act 402 includes an act of invoking the stub code on one of the one or more co-processors (act 405)]); 
wherein the at least two different processor types comprise a central processor type and a graphics processor type (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]).
 
Regarding Claim 10, Zhu and Zhu2 teach
The system of claim 8.

Zhu did not teach 
further comprising an identification module to assign a unique identifier to the compiled at least one portion of device code.

However, Zhu2 teaches
further comprising an identification module to assign a unique identifier to the compiled at least one portion of device code (Zhu2 [Paragraph 0026, GP compiler 20 uses a naming convention for kernel functions and types used in the DP intermediate code. The naming convention ensures that a unique name is used for each kernel function and type and that the unique name is used consistently for each instance of a function and a type]).



Regarding Claim 11, Zhu and Zhu2 teach
The system of claim 10.

Zhu did not teach 
wherein the linking module is further operable to use the unique identifier to prevent the compiled at least one portion of device code from being linked more than once.  

However, Zhu2 teaches
wherein the linking module is further operable to use the unique identifier to prevent the compiled at least one portion of device code from being linked more than once (Zhu2 [Paragraph 0026, GP compiler 20 uses a naming convention for names used for identifying binding descriptors 24. This naming convention allows binding descriptors 24 to be uniformly referenced in import tables 24D based on locally available information. The naming conventions may be based on the names of the kernel functions and types in GP code 12]).



Regarding Claim 12, Zhu and Zhu2 teach
The system of claim 8, wherein the compiled at least one portion of host code comprises instructions to be executed by a central processing unit (CPU) and the compiled at least one portion of device code comprises instructions to be executed by a graphics processing unit (GPU) (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]).   

Regarding Claim 14, Zhu and Zhu2 teach
The system of claim 8, wherein the compiled at least one portion of device code is provided to a device linker for the linking from a host object (Zhu [Paragraph 0042, From the intermediate representation 181, code generator 103 can generate a plurality of different lower level instructions (e.g., DirectX.RTM./High Level Shader Language ("HLSL") bytecode) that correctly implement the statements and expressions of received higher level code]).

Regarding Claim 15, Zhu teaches
(Fig. 1; Paragraph 0050, compiler 101 can compile higher level code 111 into proxy code 122 for execution on a CPU and stub code 123 for execution on a co-processor; Paragraph 0062, stub code 123 includes data receive code 127 and kernel dispatch code 132) Examiner Comments: The data receive code 127 and kernel dispatch code 132 are interpreted as the claimed device code portion. The stub code 123, which contains the data receive code 127 and kernel dispatch code 132, is interpreted as the claimed linked device code portion. 

Zhu did not teach
compile at least one portion of device code and at least one portion of host code separately.

However, Zhu2 teaches
compile at least one portion of device code and at least one portion of host code separately (Paragraph 0020, For CPU code in GP code 12, GP compiler 20 compiles the one or more modules with CPU code into one or more object or intermediate representation (IR) files 22; Paragraph 0021, DP device compiler 38 is configured to compile code written in a high level DP device programming language such as HLSL (High Level Shader Language) rather than code written in the GP language of GP code 12) Examiner Comments: Compiler 20 and Compiler 38 are separate compilers.

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu’s teaching with that of Zhu2 in order to enhance the ability of programmers to program data parallel devices by allowing programmers to program both CPUs and data parallel devices (e.g., GPUs) using a high level general purpose programming language that has data parallel (DP) extensions (Zhu2 [summary]).
 
Regarding Claim 16, Zhu and Zhu2 teach
The non-transitory computer-readable storage medium of claim 15, further comprising instructions when executed by the at least one processor of the computing device to: generate an executable file comprising an executable form of the compiled at least one portion of host code and the compiled at least one portion of device code (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]) Examiner Comments: The lower level code 121 is interpreted as the claimed executable file; 
and access the executable file for execution using different processor types (Zhu [Paragraph 0067, Method 400 includes in response to the execution command, an act of executing the proxy code on one of one or more central processing units to invoke a call stub (act 402); Paragraph 0075, Act 402 includes an act of invoking the stub code on one of the one or more co-processors (act 405)]); 
wherein the different processor types comprise a central processor type and a graphics processor type (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]).

Regarding Claim 17, Zhu and Zhu2 teach
The non-transitory computer-readable storage medium of claim 15.

Zhu did not teach 
wherein accessing further comprises assigning a unique identifier to the compiled at least one portion of device code.

However, Zhu2 teaches
wherein accessing further comprises assigning a unique identifier to the compiled at least one portion of device code (Zhu2 [Paragraph 0026, GP compiler 20 uses a naming convention for kernel functions and types used in the DP intermediate code. The naming convention ensures that a unique name is used for each kernel function and type and that the unique name is used consistently for each instance of a function and a type]).



Regarding Claim 18, Zhu and Zhu2 teach
The non-transitory computer-readable storage medium of claim 17.

Zhu did not teach 
wherein the assigning further comprises using the unique identifier to prevent the compiled at least one portion of device code from being linked more than once.  

However, Zhu2 teaches
wherein the assigning further comprises using the unique identifier to prevent the compiled at least one portion of device code from being linked more than once (Zhu2 [Paragraph 0026, GP compiler 20 uses a naming convention for names used for identifying binding descriptors 24. This naming convention allows binding descriptors 24 to be uniformly referenced in import tables 24D based on locally available information. The naming conventions may be based on the names of the kernel functions and types in GP code 12]).



Regarding Claim 19, Zhu and Zhu2 teach
The non-transitory computer-readable storage medium of claim 15, wherein the compiled at least one portion of device code comprises instructions to be executed by a graphics processing unit (GPU) (Zhu [Paragraph 0066, During compilation of lower level code 121 proxy code 122 was generated for execution on a CPU and stub code was generated for execution on a co-processor (e.g., a GPU or other accelerator)]). 

Claims 6, 13 and 20 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Zhu (US20120317556) and Zhu2 (US20110314458), in view of Ueng “CUDA-Lite: Reducing GPU Programming Complexity”.
Note: Ueng was cited in IDS.

Regarding Claim 6, Zhu and Zhu2 teach
The computer-implemented method of claim 1.

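Zhu and Zhu2 did not teach 
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA).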

However, Ueng teaches
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA) (Introduction, Paragraph 1, NVIDIA introduced the Compute Unified Device Architecture (CUDA) [9], an extended ANSI C programming model. Under CUDA, Graphics Processing Units (GPUs) consist of many processor cores, each of which can directly address into a global memory. This allows for a much more flexible programming model than previous GPGPU programming models [11], and allows developers to implement a wider variety of data-parallel kernels. As a result, CUDA has rapidly gained acceptance in application domains where GPUs are used to execute compute intensive, data-parallel application kernels).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu's teaching with Ueng's because doing so allows developers to implement a wider variety of data-parallel kernels (Ueng [Introduction]).

Regarding Claim 13, Zhu and Zhu2 teach
The system of claim 8.

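Zhu and Zhu2 did not teach 
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA).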

However, Ueng teaches
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA) (Introduction, Paragraph 1, NVIDIA introduced the Compute Unified Device Architecture (CUDA) [9], an extended ANSI C programming model. Under CUDA, Graphics Processing Units (GPUs) consist of many processor cores, each of which can directly address into a global memory. This allows for a much more flexible programming model than previous GPGPU programming models [11], and allows developers to implement a wider variety of data-parallel kernels. As a result, CUDA has rapidly gained acceptance in application domains where GPUs are used to execute compute intensive, data-parallel application kernels).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu's teaching with Ueng's because doing so allows developers to implement a wider variety of data-parallel kernels (Ueng [Introduction]).

Regarding Claim 20, Zhu and Zhu2 teach
The non-transitory computer-readable storage medium of claim 15.

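Zhu and Zhu2 did not teach 
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA).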

However, Ueng “CUDA-Lite: Reducing GPU Programming Complexity” teaches
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA) (Introduction, Paragraph 1, NVIDIA introduced the Compute Unified Device Architecture (CUDA) [9], an extended ANSI C programming model. Under CUDA, Graphics Processing Units (GPUs) consist of many processor cores, each of which can directly address into a global memory. This allows for a much more flexible programming model than previous GPGPU programming models [11], and allows developers to implement a wider variety of data-parallel kernels. As a result, CUDA has rapidly gained acceptance in application domains where GPUs are used to execute compute intensive, data-parallel application kernels).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu's teaching with Ueng's because doing so allows developers to implement a wider variety of data-parallel kernels (Ueng [Introduction]).

Response to Arguments
Applicant argues that “there is no indication that the stub code 123 is linked to the proxy code 122. By way of example, FIG. 1 of Zhu illustrates linking between the host code (allegedly proxy code 
Examiner respectfully disagrees. As shown in Fig. 1 of Zhu, the Parser/semantic checker 102, which is part of compiler 101, creates intermediate representation 181 from higher level code 111. The Parser/semantic checker 102 can split kernel related code into stub routine 172 and calling context code into proxy routine 173 in accordance with code (see Zhu [0049]). From the intermediate representation 181, code generator 103 can generate a plurality of different lower level instructions (e.g., DirectX.RTM./High Level Shader Language ("HLSL") bytecode) that correctly implement the statements and expressions of received higher level code (see Zhu [0042]). The compiler 101 can compile higher level code 111 into proxy code 122 for execution on a CPU and stub code 123 for execution on a co-processor (Zhu [0050]). The stub routine 172, which is part of the intermediate representation 181, is interpreted as the claimed compiled device code portions, and the proxy routine 173, which is also part of the intermediate representation 181, is interpreted as the claimed compiled host code portion. Zhu further discloses “an act of linking the descriptor to the one or more runtime optimization objects stored alongside the stub code to provide the proxy code with access to the derived usage information for making kernel optimization decisions at runtime” (Zhu [0063]). By providing the proxy code with access to the derived usage information for making kernel optimization decisions at runtime, Zhu teaches the claimed “linking the compiled at least one portion of device code to the compiled at least one portion of host code”.

Applicant argues that “the Office Action is asserting the same stub and proxy codes (which are lower level code) also correspond to features of claim 2. Such an interpretation is improper, at least because the lower level code cannot reasonably be interpreted as both executable and linked code portions.”

	
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMIR SOLTANZADEH whose telephone number is (571)272-3451.  The examiner can normally be reached on M-F, 9am - 5pm ET.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen can be reached on (571) 272-3708.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/AMIR SOLTANZADEH/
Examiner, Art Unit 2191

/WEI Y ZHEN/
Supervisory Patent Examiner, Art Unit 2191