DETAILED ACTION
Claims 1-2, 6-16, and 18-26 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on September 22, 2021, has been entered.

Claim Objections
Claim 6 is objected to because of the following informalities:
At the end of line 10, delete “and”.
Claims 7-9 are objected to because of the following informalities:
In line 1 of each claim, insert --sections of-- after “modifying” for consistency with claim 6, line 2.
Claim 18 is objected to because of the following informalities:
At the end of line 10, delete “and”.
Claim 23 is objected to because of the following informalities:
In line 4, should “first” be replaced with --second-- since claim 18 makes it clear that the second processor is the processor that is executing the program (and thus would use registers to execute the program)?  Please confirm and correct, or explain why the examiner is mistaken.
Claim 25 is objected to because of the following informalities:
For similar reasons as set forth for claim 23, it appears that “first” should be replaced with --second--.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 2, 6-13, 16, and 18-26 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The claims recite the following limitations for which there is a lack of antecedent basis:
In claim 2, “the second mode”, because there are two instances of “a second mode” in claim 1.  The examiner recommends replacing “a second” with --the second-- in claim 1, last paragraph.
In claim 6, lines 4-5, “the corresponding section”.  The examiner recommends replacing this with --each of the sections--.
In claims 8-9 and 12, each instance of “the first portion”, whose basis was deleted from claim 6.
In claims 8-10 and 12, each instance of “the first wavefront” and “the entire first wavefront”.  First wavefront language was deleted from claim 6.
In claims 8-10 and 12, each instance of “the second wavefront” and “the entire second wavefront”.  Second wavefront language was deleted from claim 6.
In claims 8-9 and 12, each instance of “the second portion”, whose basis was deleted from claim 6.
In claim 10, “the corresponding section”.  The examiner recommends inserting --corresponding-- after “a” in line 3.
In claim 13, “the specified threshold”, whose basis was deleted from claim 6.
In claim 16, both instances of “the memory”, whose basis was deleted from claim 14.
In claim 18, line 3, “the sections”.
In claims 21-22, each instance of “the first portion”, whose basis was deleted from claim 18.
In claims 21-23, each instance of “the first wavefront” and “the entire first wavefront”.  First wavefront language was deleted from claim 18.
In claims 21-23, each instance of “the second wavefront” and “the entire second wavefront”.  Second wavefront language was deleted from claim 18.
In claims 21-22, each instance of “the second portion”, whose basis was deleted from claim 18.
In claim 23, “the corresponding section”.  The examiner recommends inserting --corresponding-- after “a” in line 3.
In claim 25, “the specified threshold”, whose basis was deleted from claim 18.
In claim 26, “the plurality of execution units”, whose basis was deleted from claim 18.
Claims 7-13 and 19-26 are rejected due to their dependence on an indefinite claim.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Skadron et al., U.S. Patent Application Publication No. 2011/0219221 A1 (herein referred to as Skadron) in view of the examiner’s taking of Official Notice.
Referring to claim 1, Skadron has taught a computer-implemented method comprising:
a) identifying, based on at least one indication present in a program, one of a first mode and a second mode (see paragraphs [0013]-[0017], which discuss operation in response to memory divergence.  When there is no memory divergence, all threads of a warp will hit the cache and proceed as normal with executing instructions.  Thus, a first instruction will be executed in parallel on the entire wavefront before some subsequent second instruction is executed on the entire wavefront (first mode).  However, when there is memory divergence, a first portion of the warp will hit the cache and the second portion of the warp will miss the cache.  Thus, the warp is split, and the first portion becomes a run-ahead warp that proceeds as normal and executes a set of instructions as it runs ahead of the second portion (run-behind warp), which is stalled until the cache miss is resolved.  Thus, first and second instructions (and potentially others) will be executed on the first portion of the wavefront while running ahead, before the first, second, and potentially other instructions are executed on the second portion after the stall finishes (second mode).  Thus, what indicates the mode is whether a cache hit or miss occurs.  And, whether a cache hit or miss occurs is based on there being a memory-access instruction in the program, and based on the address parameter included in the program for any memory-access instruction.  As such, either existence of a memory-access instruction or the address to be accessed by a memory-access instruction is an indicator in the program on which identification of a first or second mode will be based);
b) in response to identifying the first mode, executing at a first processor an instruction for an entire wavefront before executing a next instruction for the entire wavefront (again, from paragraphs [0013]-[0017], when there is no memory divergence (first mode is identified), all threads of a warp will hit the cache and proceed as normal with executing instructions.  Thus, a first instruction will be executed in parallel on the entire wavefront before some subsequent second instruction is executed on the entire wavefront); and
c) in response to identifying a second mode, executing at the first processor a set of instructions for a portion of a given wavefront before executing the set of instructions for another portion of the given wavefront (again, from paragraphs [0013]-[0017], when there is memory divergence (second mode is identified), a first portion of the warp will hit the cache and the second portion of the warp will miss the cache.  Thus, the warp is split, and the first portion becomes a run-ahead warp that proceeds as normal and executes a set of instructions as it runs ahead of the second portion (run-behind warp), which is stalled until the cache miss is resolved.  Thus, first and second instructions (and potentially others) will be executed on the first portion of the wavefront while running ahead, before the first, second, and potentially other instructions are executed on the second portion after the stall finishes);
d) Skadron has not taught that the program is a shader program.  However, shaders are known highly-parallel programs that involve thread warps/wavefronts, that would be naturally compatible with the warp execution of Skadron.  A shader program has many purposes in the field of graphics processing, some of which include loading/storing data for shading/lighting and texture simulation, determining pixel-related parameters, and repositioning of vertices for certain effects.  As such, in order to allow Skadron to be used for adjusting shading levels in graphics processing, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Skadron such that the program is a shader program.
Referring to claim 2, Skadron, as modified, has taught the method of claim 1, wherein the at least one indication includes a command for the first processor to operate in the second mode (again, for the processor to enter the second mode, the memory read/write command must be encountered).
Claims 14-15 are respectively rejected for similar reasons as claims 1-2.
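For illustration only, the two modes mapped above can be sketched as a toy simulation.  The sketch is the examiner's simplification and is not code from Skadron: the names run_warp, program, threads, and cached are hypothetical, each thread t is assumed to access address base + t, a warp whose threads all miss is assumed to stay in lockstep, and re-convergence and further splits after the first split are not modeled.

```python
# Toy model of lockstep (first mode) versus warp-split (second mode) execution.
# Hypothetical names and simplifications throughout; not Skadron's implementation.

def run_warp(program, threads, cached):
    """Return the execution order as (instruction_index, thread_group) tuples."""
    trace = []
    for pc, instr in enumerate(program):
        if instr[0] == "load":
            base = instr[1]
            hits = tuple(t for t in threads if base + t in cached)
            misses = tuple(t for t in threads if base + t not in cached)
            if hits and misses:
                # Memory divergence (second mode): the run-ahead portion executes
                # the remaining instructions first; the run-behind portion
                # executes the same instructions after its miss is resolved.
                remaining = range(pc, len(program))
                trace += [(i, hits) for i in remaining]
                trace += [(i, misses) for i in remaining]
                return trace
        # No divergence (first mode): one instruction for the entire warp.
        trace.append((pc, tuple(threads)))
    return trace


program = [("load", 0), ("add",), ("mul",)]
threads = [0, 1, 2, 3]

lockstep = run_warp(program, threads, cached={0, 1, 2, 3})
split = run_warp(program, threads, cached={0, 1})
```

With every address cached, each instruction appears once for the whole warp before the next instruction.  With only addresses 0 and 1 cached, threads 0-1 complete the full set of instructions before threads 2-3 begin the same set.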

Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Skadron in view of the examiner’s taking of Official Notice and Gierach et al., U.S. Patent Application Publication No. 2019/0206110 A1 (herein referred to as Gierach).  Note that Gierach is valid prior art for these claims because they are unsupported by the parent application; thus, the effective filing date associated with these claims is that of the instant application, which Gierach precedes.
Claim 6 is partially rejected for similar reasons as claim 1.  Skadron, as modified, has not taught modifying, at a compiler of a processing system, sections of a shader program to be executed in one of a first mode and a second mode, the modifying of the sections based on a number of registers expected to be required to execute the corresponding section.  However, Gierach has taught a compiler that identifies register usage/pressure in each chunk of shader code and indicates such to a shader splitter of the compiler to split (modify) the chunk into smaller chunks to reduce register pressure during execution.  The smaller the register pressure, the higher the throughput and/or the more threads that can be dispatched concurrently.  See paragraph [00153].  As such, to improve throughput, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Skadron for modifying, at a compiler of a processing system, sections of a shader program to be executed in one of a first mode and a second mode, the modifying of the sections based on a number of registers expected to be required to execute the corresponding section.  Note, then, that any resulting chunk will execute in the first or second mode based on its load/store instructions hitting/missing cache.
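For illustration only, register-pressure-driven splitting of the general kind attributed to Gierach above can be sketched as follows.  This is a hypothetical simplification and is not Gierach's algorithm: the names register_pressure, split_chunk, and max_regs are assumptions, an instruction is modeled as a tuple of the register names it touches, and pressure is approximated as the count of distinct registers in a chunk rather than the number of simultaneously live registers.

```python
# Hypothetical sketch of splitting a chunk of shader code by register pressure.
# Not taken from Gierach; the pressure metric and halving strategy are assumptions.

def register_pressure(chunk):
    """Approximate pressure as the number of distinct registers a chunk names."""
    return len({reg for instr in chunk for reg in instr})


def split_chunk(chunk, max_regs):
    """Halve a chunk recursively until each piece fits within max_regs."""
    if register_pressure(chunk) <= max_regs or len(chunk) == 1:
        return [chunk]
    mid = len(chunk) // 2
    return split_chunk(chunk[:mid], max_regs) + split_chunk(chunk[mid:], max_regs)


chunk = [("r0", "r1"), ("r1", "r2"), ("r3", "r4"), ("r4", "r5")]
pieces = split_chunk(chunk, max_regs=3)
```

Here the original chunk names six registers, which exceeds the assumed budget of three, so it is split into two pieces that each name three registers.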
Referring to claim 7, Skadron, as modified, has taught the method of claim 6, wherein modifying the shader program comprises: responsive to identifying a first section of the shader program: inserting a first command into the shader program preceding the identified first section (in response to detecting a post-memory access section (by encountering a command that accesses memory), a given load/store command of the ISA is inserted prior to that section into the compiled code wherever necessary to access memory), the first command to configure the first processor to operate in the first mode (a given load/store will configure the processor to operate in the first mode when it causes no cache misses); and inserting a second command into the shader program following the identified first section (again, a given load/store command of the ISA is inserted into the compiled code wherever necessary to access memory), the second command to configure the first processor to operate in the second mode (a given load/store will configure the processor to operate in the second mode when it causes a cache miss).

Claims 16 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Skadron in view of the examiner’s taking of Official Notice, Young, U.S. Patent Application Publication No. 2015/0095879 A1 (herein referred to as Young), and Gierach.
Referring to claim 16, Skadron, as modified, has taught the system of claim 14, but has not taught a second processor coupled to the first processor and the memory, wherein the second processor is configured to: analyze a second shader program to identify register usages for different sections of the second shader program; modify the second shader program by selectively associating an indication in the second shader program with each section of the second shader program based on the identified register usage for that section; and provide the modified second shader program for storage in the memory as the first shader program.  However, Young has first taught a processor (FIG.1, 112) that compiles code in memory 114 for another processor (FIG.1, 116).  See paragraph [0014].  One of ordinary skill in the art would have recognized that by providing a second processor to perform compiling, compiling can be performed in parallel with execution on the first processor.  In other words, parallelism and throughput could be increased.  For instance, while the first processor is executing a compiled program, the second processor could be compiling the next program.  As such, it would have first been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Skadron to include a second processor coupled to the first processor and the memory, wherein the second processor is configured to: analyze a second shader program, modify the second shader program, and provide the modified second shader program for storage in the memory as the first shader program.  Further, Gierach has taught that such compiler analysis and modification would include identifying register usage/pressure in each chunk of shader code, and indicating such to a shader splitter of the compiler that splits the chunk into smaller chunks to reduce register pressure during execution.  The smaller the register pressure, the higher the throughput and/or the more threads that can be dispatched concurrently.  See paragraph [00153].  As such, to improve throughput, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Skadron such that the second processor is configured to: analyze a second shader program to identify register usages for different sections of the second shader program; and modify the second shader program by selectively associating an indication in the second shader program with each section of the second shader program based on the identified register usage for that section.
Claim 18 is rejected for reasoning similar to that set forth in the rejections of claims 6 and 16.
Referring to claim 19, Skadron, as modified, has taught the system of claim 18, wherein the first processor comprises a central processing unit and the second processor comprises a graphics processing unit (paragraph [0005] of Skadron states that this architecture may be found in graphics processors.  And, in Young, a regular CPU 112 (paragraph [0024]) compiles code for a graphics processor 116 (paragraph [0026])).
Claim 20 is rejected for similar reasons as claim 7.

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Skadron in view of the examiner’s taking of Official Notice, Young, Gierach, and Jiao, U.S. Patent No. 9,804,666.
Referring to claim 26, Skadron, as modified, has taught the system of claim 18, but has not taught wherein in the first mode the plurality of execution units of the second processor are configured to share subsets of registers.  However, Jiao has taught such a concept in column 5, lines 11-15, where all threads in a warp share a register file so as to provide a shared register value to all lanes.  One of ordinary skill in the art would recognize the efficiency of such a register file that allows storage of shared data to a single register as opposed to duplicating the data and storing the duplicates into a separate register for each thread, thereby cutting storage requirements by a factor of N, where N is the number of threads in the warp.  In addition, the value would be sent to all threads with just a single register access instead of N separate register accesses.  Thus, register reads, too, would be decreased by a factor of N.  Decreasing register accesses reduces power consumption and storage requirements, among other advantages.  Consequently, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Skadron such that in the first mode the plurality of execution units of the second processor are configured to share subsets of registers.
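For illustration only, the factor-of-N reduction described above can be worked through with a short calculation.  The function name register_cost and the warp size of 32 are assumptions chosen for the example and are not taken from Jiao.

```python
# Worked arithmetic for delivering one shared value to every thread in a warp.
# register_cost and the warp size of 32 are hypothetical, for illustration only.

def register_cost(num_threads, shared):
    """Return (registers_used, register_reads) for one value seen by all threads."""
    if shared:
        return 1, 1  # one shared register, read once and broadcast to all lanes
    return num_threads, num_threads  # one duplicate register and one read per thread


N = 32
duplicated = register_cost(N, shared=False)
shared = register_cost(N, shared=True)
storage_factor = duplicated[0] // shared[0]
read_factor = duplicated[1] // shared[1]
```

Under these assumptions, both the storage requirement and the number of register reads for the shared value decrease by a factor equal to N.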

Allowable Subject Matter
Claims 8-13 and 21-25 are objected to as being dependent upon a rejected base claim, but would be allowable over the prior art if rewritten in independent form including all of the limitations of the base claim and any intervening claims.  Note that any amendments to these claims to address 112 issues may necessitate a new ground of prior art rejection.

Response to Arguments
On page 12 of applicant’s response, applicant argues that Skadron has not taught how the instructions of different warp splits are executed or whether the threads of the different warp splits share sets of instructions.
The examiner respectfully disagrees.  Skadron is a SIMD architecture with multiple threads, meaning each thread executes the same instructions.  A collection of threads executing the same instruction makes up a warp.  Thus, all threads may attempt to execute a memory operation at the same time.  When all of them hit cache, threads may continue executing the same instructions in lockstep.  However, if any thread misses the cache, then that thread is delayed while the others are allowed to run forward.  As such, all threads in a warp do share sets of instructions (this is known).  When a warp is split, the same instructions are executed, but some in one warp split are executed ahead of those in another split (which miss the cache).

On page 12 of the response, applicant argues the examiner is incorrectly assuming that the run-ahead and run-behind portions execute the same sets of instructions.
The examiner notes that this is not an assumption.  This is how a SIMD multi-threaded system operates.  Each thread executes the same instructions.  The idea in Skadron is to not let one thread’s cache miss delay the other threads that have hit the cache just so they can stay in lockstep.  Instead, the warp is split, letting the threads hitting the cache continue forward on the set of instructions.  The run-behind warp, when the cache miss is eventually resolved, will simply play catch-up on the same set of instructions.  This is the nature of SIMD/SIMT.

On page 13 of the response, applicant argues that a load/store instruction does not indicate a mode of operation.  Instead, applicant points out that a cache hit/miss causes the warp split.
The examiner asserts that applicant is reading the claim language too narrowly.  Claim 1, for instance, merely requires that one of a first mode and second mode be identified based on an indication in the program.  As applicant correctly points out, a cache hit/miss may dictate the mode.  However, a cache hit/miss can only occur in response to a load/store in the program.  Thus, the mode selection is based on an indication of a memory access in the program (i.e., presence of a load/store).

On page 13 of the response, applicant argues that the Official Notice is improper.
The examiner respectfully disagrees.  A shader is a well-known program.  It operates on pixel data, which is something that SIMD units (such as in Skadron) are known to do.  It is obvious for any type of program to be executed on Skadron, including a shader as it would be predictably useful to carry out the same shading instructions on some group of pixels at the same time (hence, usefulness from SIMD).  Skadron’s splitting simply allows the shading to occur more quickly when some of the data might not be in cache.

On page 14 of the response, applicant argues the rejection of claim 16, stating that Gierach has not taught modifying a shader program to associate an indication with each section of the program.
Again, the examiner asserts that applicant is interpreting the claim too narrowly.  The examiner first notes that the indication in claim 16 is not necessarily the same as that in claim 14.  In Gierach, to modify the program, an indication is associated with each chunk to indicate whether the chunk is to be further split to reduce register pressure.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to David J. Huisman whose telephone number is 571-272-4168.  The examiner can normally be reached on Monday-Friday, 9:00 am-5:30 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/David J. Huisman/Primary Examiner, Art Unit 2183