Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.

Claims 1-20 are presented for examination.

Allowable Subject Matter
Claims 2, 9 and 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and overcoming the double patenting rejection.


Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) are: “one or more compilers to compile …”, “a linking module to link …” in claim 8, “an executable file generation module to generate …” in claim 9, “an identification module to assign …” in claim 10, and “the linking module is further operable to …” in claim 11.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim 

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within 
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-24 of U.S. Patent No. 10,261,807. Although the claims at issue are not identical, they are not patentably distinct from each other because the referenced patent and the instant application claim common subject matter. For illustration purposes, the rejection of Claim 1 is provided as follows:
Current Application 16/268,106:
1. A computer-implemented method, comprising:

and linking the compiled at least one portion of device code to the compiled at least one portion of host code

U.S. Patent No. 10,261,807:
uniquely identifying a device code portion associated with each host object fileset of a plurality of host object filesets used as input, wherein two or more of said plurality of host object filesets each comprise a plurality of host code portions and a plurality of device code portions in compiled form, wherein said plurality of host code portions and said plurality of device code portions are configured to execute on different processor types;

using a linker to link together said plurality of host object filesets to produce a plurality of unique linked device code portions, wherein said linker uses as input separate device code portions from said two or more of said plurality of device code portions to produce said plurality of unique linked device code portions;

Claim 1 of U.S. Patent No. 10,261,807 does not disclose “compiling at least one portion of device code and at least one portion of host code separately”. However, Zhu (US20110314458) teaches “compiling at least one portion of device code and at least one portion of host code separately” (Paragraph 0020, For CPU code in GP code 12, GP compiler 20 compiles the one or more modules with CPU code into one or more object or intermediate representation (IR) files 22; Paragraph 0021, DP device compiler 38 is configured to compile code written in a high level DP device programming language such as HLSL (High Level Shader Language) rather than code written in the GP language of GP code 12).
Therefore, it would have been obvious to a person having ordinary skill in the art that said executable file is accessed for execution using different processor types, as taught by Zhu, in order to enhance the ability of programmers to program data parallel devices by allowing programmers to program both CPUs and data parallel devices (e.g., GPUs) using a high level general purpose programming language that has data parallel (DP) extensions (Zhu [Summary]).


Claim Rejections - 35 USC § 103

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

Claims 1, 3-5, 7-8, 10-12, 14-15 and 17-19 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Zhu (US20110314458), in view of Kim (US 9495720 B2).

Regarding Claim 1, Zhu teaches 
A computer-implemented method, comprising: 
compiling at least one portion of device code and at least one portion of host code separately (Paragraph 0020, For CPU code in GP code 12, GP compiler 20 compiles the one or more modules with CPU code into one or more object or intermediate representation (IR) files 22; Paragraph 0021, DP device compiler 38 is configured to compile code written in a high level DP device programming language such as HLSL (High Level Shader Language) rather than code written in the GP language of GP code 12) Examiner Comments: Compiler 20 and Compiler 38 are separate compilers.

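Zhu did not teach
and linking the compiled at least one portion of device code to the compiled at least one portion of host code.
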
However, Kim teaches
and linking the compiled at least one portion of device code to the compiled at least one portion of host code (Claim 1, the application including CPU source code and GPU source code; compiling the CPU source code into CPU machine code, and compiling the GPU source code into a GPU virtual instruction; generating an execution file by linking the CPU machine code and the GPU virtual instruction in response to the request).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu’s teaching with Kim’s in order to enhance the ability to compile and execute an application including Central Processing Unit (CPU) source code and Graphic Processing Unit (GPU) source code by providing a compiler that compiles the GPU source code into a GPU virtual instruction (Kim [Abstract]).
 
Regarding Claim 3, Zhu and Kim teach
The computer-implemented method of claim 1, further comprising assigning a unique identifier to the compiled at least one portion of device code (Zhu [Paragraph 0026, GP compiler 20 uses a naming convention for kernel functions and types used in the DP intermediate code. The naming convention ensures that a unique name is used for each kernel function and type and that the unique name is used consistently for each instance of a function and a type]).

Regarding Claim 4, Zhu and Kim teach
The computer-implemented method of claim 3, wherein the assigning further comprises using the unique identifier to prevent the compiled at least one portion of device code from being linked more than once (Zhu [Paragraph 0026, GP compiler 20 uses a naming convention for names used for identifying binding descriptors 24. This naming convention allows binding descriptors 24 to be uniformly referenced in import tables 24D based on locally available information. The naming conventions may be based on the names of the kernel functions and types in GP code 12]).
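
For illustration only, and not drawn from Zhu, Kim, or the claims of record: a minimal sketch of how a unique identifier might be used to keep a compiled device code portion from being linked more than once. The names `HostObject` and `link_device_code` are hypothetical and assume each host object carries its device code portions keyed by a unique identifier.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HostObject:
    """Hypothetical host object carrying embedded, compiled device code."""
    name: str
    # Maps a unique identifier to the compiled device code portion it names.
    device_portions: Dict[str, bytes]


def link_device_code(
    host_objects: List[HostObject],
) -> Tuple[Dict[str, bytes], List[Tuple[str, str]]]:
    """Collect device code portions, linking each unique identifier once.

    Returns the linked portions and a list of (object name, identifier)
    pairs that were skipped because the identifier was already linked.
    """
    linked: Dict[str, bytes] = {}
    skipped: List[Tuple[str, str]] = []
    for obj in host_objects:
        for uid, code in obj.device_portions.items():
            if uid in linked:
                # Identifier already seen: do not link this portion again.
                skipped.append((obj.name, uid))
                continue
            linked[uid] = code
    return linked, skipped


# Two host objects that both embed the device portion identified as "k1".
a = HostObject("a.o", {"k1": b"kernel-one"})
b = HostObject("b.o", {"k1": b"kernel-one", "k2": b"kernel-two"})
linked, skipped = link_device_code([a, b])
```

In this sketch the second occurrence of `k1` is recorded as skipped rather than linked again; an actual linker could equally report an error or compare contents, which is a design choice the sketch does not address.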

Regarding Claim 5, Zhu and Kim teach
The computer-implemented method of claim 1, wherein the compiled at least one of portion of host code comprises instructions to be executed by a central processing unit (CPU) and the compiled at least one portion of device code comprises instructions to be executed by a graphics processing unit (GPU) (Zhu [Para 0015, GP executable 32 represents a program intended for execution on one or more processors (e.g., central processing units (CPUs)). GP executable 32 includes low level instructions from an instruction set of one or more central processing units (CPUs). GP executable 32 may also include one or more DP device executables 40. A DP device executable 40 represents a data parallel program (e.g., a shader) intended for execution on one or more data parallel (DP) devices such as DP device 210 shown in FIG. 7 and described in additional detail below. DP devices are typically graphic processing units (GPUs)]).

Regarding Claim 7, Zhu and Kim teach
The computer-implemented method of claim 1, wherein the compiled at least one portion of device code is provided to a device linker from a host object (Zhu [Paragraph 0023, Linker 30 includes binder 34 that binds the DP intermediate code from the set of binding descriptors 24 into a DP device source code unit 36 by traversing the call graph rooted from an invocation 18 and formed by the set of binding descriptors 24, translating DP intermediate code into DP device source code (if necessary), and concatenating the DP device source code from the set of binding descriptors 24]).

Regarding Claim 8, Zhu teaches
A system for executing code, the system comprising: 
one or more compilers to compile at least one portion of device code and at least one portion of host code separately (Paragraph 0020, For CPU code in GP code 12, GP compiler 20 compiles the one or more modules with CPU code into one or more object or intermediate representation (IR) files 22; Paragraph 0021, DP device compiler 38 is configured to compile code written in a high level DP device programming language such as HLSL (High Level Shader Language) rather than code written in the GP language of GP code 12) Examiner Comments: Compiler 20 and Compiler 38 are separate compilers.
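Zhu did not specifically teach
and a linking module to link the compiled at least one portion of device code to the compiled at least one portion of host code.
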
However, Kim teaches
and a linking module to link the compiled at least one portion of device code to the compiled at least one portion of host code (Claim 1, the application including CPU source code and GPU source code; compiling the CPU source code into CPU machine code, and compiling the GPU source code into a GPU virtual instruction; generating an execution file by linking the CPU machine code and the GPU virtual instruction in response to the request).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu’s teaching with Kim’s in order to enhance the ability to compile and execute an application including Central Processing Unit (CPU) source code and Graphic Processing Unit (GPU) source code by providing a compiler that compiles the GPU source code into a GPU virtual instruction (Kim [Abstract]).

Regarding Claim 10, Zhu and Kim teach
The system of claim 8, further comprising an identification module to assign a unique identifier to the compiled at least one portion of device code (Zhu [Paragraph 0026, GP compiler 20 uses a naming convention for kernel functions and types used in the DP intermediate code. The naming convention ensures that a unique name is used for each kernel function and type and that the unique name is used consistently for each instance of a function and a type]).

Regarding Claim 11, Zhu and Kim teach
The system of claim 10, wherein the linking module is further operable to use the unique identifier to prevent the compiled at least one portion of device code from being linked more than once (Zhu [Paragraph 0026, GP compiler 20 uses a naming convention for names used for identifying binding descriptors 24. This naming convention allows binding descriptors 24 to be uniformly referenced in import tables 24D based on locally available information. The naming conventions may be based on the names of the kernel functions and types in GP code 12]).

Regarding Claim 12, Zhu and Kim teach
The system of claim 8, wherein the compiled at least one of portion of host code comprises instructions to be executed by a central processing unit (CPU) and the compiled at least one portion of device code comprises instructions to be executed by a graphics processing unit (GPU) (Zhu [Para 0015, GP executable 32 represents a program intended for execution on one or more processors (e.g., central processing units (CPUs)). GP executable 32 includes low level instructions from an instruction set of one or more central processing units (CPUs). GP executable 32 may also include one or more DP device executables 40. A DP device executable 40 represents a data parallel program (e.g., a shader) intended for execution on one or more data parallel (DP) devices such as DP device 210 shown in FIG. 7 and described in additional detail below. DP devices are typically graphic processing units (GPUs)]).

Regarding Claim 14, Zhu and Kim teach
The system of claim 8, wherein the compiled at least one portion of device code is provided to a device linker for the linking from a host object (Zhu [Paragraph 0023, Linker 30 includes binder 34 that binds the DP intermediate code from the set of binding descriptors 24 into a DP device source code unit 36 by traversing the call graph rooted from an invocation 18 and formed by the set of binding descriptors 24, translating DP intermediate code into DP device source code (if necessary), and concatenating the DP device source code from the set of binding descriptors 24]).

Regarding Claim 15, Zhu teaches
A non-transitory computer-readable storage medium including instructions to execute code, the instructions when executed by at least one processor of a computing device causing the computing device to: compile at least one portion of device code and at least one portion of host code separately (Paragraph 0020, For CPU code in GP code 12, GP compiler 20 compiles the one or more modules with CPU code into one or more object or intermediate representation (IR) files 22; Paragraph 0021, DP device compiler 38 is configured to compile code written in a high level DP device programming language such as HLSL (High Level Shader Language) rather than code written in the GP language of GP code 12) Examiner Comments: Compiler 20 and Compiler 38 are separate compilers.

	Zhu did not specifically teach
	and link the compiled at least one portion of device code to the compiled at least one portion of host code.

However, Kim teaches
and link the compiled at least one portion of device code to the compiled at least one portion of host code (Claim 1, the application including CPU source code and GPU source code; compiling the CPU source code into CPU machine code, and compiling the GPU source code into a GPU virtual instruction; generating an execution file by linking the CPU machine code and the GPU virtual instruction in response to the request).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu’s teaching with Kim’s in order to enhance the ability to compile and execute an application including Central Processing Unit (CPU) source code and Graphic Processing Unit (GPU) source code by providing a compiler that compiles the GPU source code into a GPU virtual instruction (Kim [Abstract]).
 

Regarding Claim 17, Zhu and Kim teach
The non-transitory computer-readable storage medium of claim 15,
(Zhu [Paragraph 0026, GP compiler 20 uses a naming convention for kernel functions and types used in the DP intermediate code. The naming convention ensures that a unique name is used for each kernel function and type and that the unique name is used consistently for each instance of a function and a type]).

Regarding Claim 18, Zhu and Kim teach
The non-transitory computer-readable storage medium of claim 17, wherein the assigning further comprises using the unique identifier to prevent the compiled at least one portion of device code from being linked more than once (Zhu [Paragraph 0026, GP compiler 20 uses a naming convention for names used for identifying binding descriptors 24. This naming convention allows binding descriptors 24 to be uniformly referenced in import tables 24D based on locally available information. The naming conventions may be based on the names of the kernel functions and types in GP code 12]).

Regarding Claim 19, Zhu and Kim teach
The non-transitory computer-readable storage medium of claim 15, wherein the compiled at least one portion of device code comprises instructions to be executed by a graphics processing unit (GPU) (Zhu [Para 0015, GP executable 32 represents a program intended for execution on one or more processors (e.g., central processing units (CPUs)). GP executable 32 includes low level instructions from an instruction set of one or more central processing units (CPUs). GP executable 32 may also include one or more DP device executables 40. A DP device executable 40 represents a data parallel program (e.g., a shader) intended for execution on one or more data parallel (DP) devices such as DP device 210 shown in FIG. 7 and described in additional detail below. DP devices are typically graphic processing units (GPUs)]).


Claims 6, 13 and 20 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Zhu (US20110314458), and Kim (US 9495720 B2) in view of Ueng “CUDA-Lite: Reducing GPU Programming Complexity”.
Note: Ueng was cited in IDS.

Regarding Claim 6, Zhu and Kim teach
The computer-implemented method of claim 1.

Zhu and Kim did not teach 
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA).  

However, Ueng teaches
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA) (Introduction, Paragraph 1, NVIDIA introduced the Compute Unified Device Architecture (CUDA) [9], an extended ANSI C programming model. Under CUDA, Graphics Processing Units (GPUs) consist of many processor cores, each of which can directly address into a global memory. This allows for a much more flexible programming model than previous GPGPU programming models [11], and allows developers to implement a wider variety of data-parallel kernels. As a result, CUDA has rapidly gained acceptance in application domains where GPUs are used to execute compute intensive, data-parallel application kernels).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu and Kim's teachings with Ueng's because doing so allows developers to implement a wider variety of data-parallel kernels (Ueng [Introduction]).

Regarding Claim 13, Zhu and Kim teach
The system of claim 8.

Zhu and Kim did not teach 
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA).  

However, Ueng teaches
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA) (Introduction, Paragraph 1, NVIDIA introduced the Compute Unified Device Architecture (CUDA) [9], an extended ANSI C programming model. Under CUDA, Graphics Processing Units (GPUs) consist of many processor cores, each of which can directly address into a global memory. This allows for a much more flexible programming model than previous GPGPU programming models [11], and allows developers to implement a wider variety of data-parallel kernels. As a result, CUDA has rapidly gained acceptance in application domains where GPUs are used to execute compute intensive, data-parallel application kernels).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu and Kim's teachings with Ueng's because doing so allows developers to implement a wider variety of data-parallel kernels (Ueng [Introduction]).

Regarding Claim 20, Zhu and Kim teach
The non-transitory computer-readable storage medium of claim 15.

Zhu and Kim did not teach 
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA).  

However, Ueng “CUDA-Lite: Reducing GPU Programming Complexity” teaches
wherein the compiled at least one portion of device code is written in a version of a Compute Unified Device Architecture programming language (CUDA) (Introduction, Paragraph 1, NVIDIA introduced the Compute Unified Device Architecture (CUDA) [9], an extended ANSI C programming model. Under CUDA, Graphics Processing Units (GPUs) consist of many processor cores, each of which can directly address into a global memory. This allows for a much more flexible programming model than previous GPGPU programming models [11], and allows developers to implement a wider variety of data-parallel kernels. As a result, CUDA has rapidly gained acceptance in application domains where GPUs are used to execute compute intensive, data-parallel application kernels).

It would have been obvious to a person having ordinary skill in the art at the time of the invention to have combined Zhu and Kim's teachings with Ueng's because doing so allows developers to implement a wider variety of data-parallel kernels (Ueng [Introduction]).

Response to Arguments
Applicant’s arguments with respect to claims 1-20 have been considered but are moot because the arguments do not apply to the previously cited sections of the references used in the previous Office action. The current Office action cites additional references to address the newly added claim limitations.

		
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMIR SOLTANZADEH whose telephone number is (571)272-3451.  The examiner can normally be reached on M-F, 9am - 5pm ET.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen can be reached on (571) 272-3708.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/AMIR SOLTANZADEH/
Examiner, Art Unit 2191

/WEI Y ZHEN/
Supervisory Patent Examiner, Art Unit 2191