DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office action is in response to RCE filed 22 February 2021.
Claims 1-4, 6-8, 10-13, 15-17, 19, and 21-25 are pending. Claims 5, 9, 14, 18, and 20 are cancelled.

Response to Arguments
Applicant’s arguments, see pages 8-12 of the remarks filed 22 February 2021, with respect to the rejections of claims 1-3, 5-8, 10-12, and 14-17, and additionally claims 4, 13, 19 and 20, have been fully considered but are not persuasive.

i.	On page 10, the applicant argues that “As discussed during the interview, the cited references do not teach or suggest predicting a subsequent value of a signal…and then prefetching based on the predicted subsequent value…Accordingly, withdrawal of the rejection is respectfully requested. For at least similar reasons, withdrawal of the rejection of claim 10 is also respectfully requested.”
	The examiner respectfully disagrees. While the examiner agreed that claims amended similarly to those presented in the interview appeared to overcome the prior rejection, the examiner did not indicate whether the cited references themselves taught or suggested the limitations of the presented claims. Upon further review of the references cited in the previous office action, the examiner found that the previously relied upon references still read on the current claims, albeit with new citations and in a new order. Therefore, while the currently amended claims overcome the references as cited in the prior rejection, the same references are used in a new rejection, and the applicant’s argument is not persuasive.

ii.	On page 10, the applicant argues that “Claims 2, 3, and 5-8 and 11, 12, and 14-17 depend on claims 1 and 10 respectively…The cited references therefore fail to disclose the 
The examiner respectfully disagrees, because claims 1 and 10 are not allowable over the arguments above. Therefore, the applicant’s argument is not persuasive.

iii.	On pages 10-11, the applicant argues that “the cited references, individually and in combination, fail to disclose or render obvious the features of independent claims 1 and 10. The cited references therefore fail to disclose the features of dependent claims 4, and 13, at least by virtue of their respective dependence.”
	The examiner respectfully disagrees, because claims 1 and 10 are not allowable over the arguments above. Therefore, the applicant’s argument is not persuasive.

iv.	On pages 11-12, the applicant argues that “As discussed above, Holt561 and Bradford, individually or in combination, do not teach or suggest at least ‘scheduling a first workgroup…executing using threads in the first workgroup, a first wait instruction…and modifying the signal…’ as in claim 1.”
	The examiner respectfully disagrees, for the same rationale as discussed above with regard to the arguments for claims 1 and 10. Since claims 1 and 10 are not allowable over the arguments above, the applicant’s argument is not persuasive.

Examiner’s Note Regarding Contingent Limitations in Claim 19
Claim 19 is a method claim that recites contingent limitations. For example, the limitations of “prefetching a first context of a first workgroup into registers of a processor core in response to the predicted value of the signal being equal to a first value associated with the first workgroup;”, “executing, concurrently with prefetching the first context, a second workgroup…”, “preempting the second workgroup in response to executing the wait instruction”, and “scheduling the first workgroup for execution on the processor core in response to preempting the second workgroup and in response to the signal having the predicted value” are limitations that are contingent upon (“in 

Claim Objections
Claims 1 and 10 are objected to because of the following informalities (line numbers correspond to claim 1): In line 5: “a first hint that indicates whether the signal” should read “a first hint that indicates whether a value of the signal”. Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 7-8, 10-12, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Arimilli et al. Pub. No.: US 2011/0173630 A1 (hereafter “Arimilli”), in view of Bradford et al. Patent No.: US 7,493,621 B2 (hereafter “Bradford”), in view of Ray et al. Pub. No.: US 2018/0293102 A1 (hereafter “Ray”).

Arimilli, Bradford, and Ray were cited in the previous PTO-892 dated 23 March 2020.

Regarding claim 1, Arimilli teaches the invention substantially as claimed, including:
A method comprising: 
scheduling a first [thread] for execution on a processor core, based on a first context, in response to a signal having a first value ([0100], Lines 7-15: The wake-and-go mechanism then performs a compare (block 720) and determines whether the value being written to the target address represents the event for which the thread is waiting (block 722) (i.e., value of the target address represents a “signal having a value”). If the kill corresponds to the event for which the thread is waiting (i.e., the value is a “first value”), then the operating system updates the array (block 724) to remove the target address from the wake-and-go array. Thereafter, operation returns to block 702 in FIG. 7A where the operating system restarts the thread (i.e., starting or restarting the thread includes queuing the thread in a run queue of a processor, which represents scheduling the thread for execution on a processor “core” (see at least [0095])). [0089], Lines 4-7: Operating system 630 or some other hardware or software then wakes thread 610 by retrieving the thread state information from thread state storage 612 and placing the thread in the run queue for the processor (i.e., thread state information associated with a first thread represents “first context” information (see at least [0126]))); 
executing, using [the thread], a first wait instruction ([0096], Lines 2-7: The processor determines whether the next instruction updates the wake-and-go array (block 708). An instruction to update the wake-and-go array may be…a call to a background sleeper thread (i.e., call to a background sleeper thread represents a “first wait instruction”). [0097], Lines 4-5: The update to the wake-and-go array may be made by the thread) including the first value of the signal ([0088], Lines 9-13: Background sleeper thread 640 then stores a target address A2 in array 622. Thread 610 may also store other information in association with background sleeper thread 640, such as a value for which thread 610 is waiting (i.e., “first value of the signal”) to be written to target address A2 (i.e., call to background sleeper thread includes the value for which the thread is waiting))… 
preempting the first [thread] in response to executing the first wait instruction ([0088], Lines 13-15: Then, thread 610 goes to sleep with thread state being stored in thread state storage 612); 
modifying the signal from the first value to a second value; scheduling a second [thread] for execution on the processor core, based on a second context, in response to preempting the first [thread] and in response to the signal having the second value ([0009], Lines 1-3: A processor may be executing a first thread, which goes to sleep. The processor may then begin executing a second thread (i.e., second thread is scheduled for execution on the processor in response to the first thread going to sleep, or in other words, in response to the first thread being preempted). [0010], Lines 6-8: A computing system may be running thousands of threads, many of which are waiting for an event at any given time (i.e., second thread waits for a second value at a second target address). [0100], Lines 7-15: The wake-and-go mechanism then performs a compare (block 720) and determines whether the value being written to the target address represents the event for which the thread is waiting (block 722). If the kill corresponds to the event for which the thread is waiting (i.e., the value is a “second value”), then the operating system updates the array (block 724) to remove the target address from the wake-and-go array. Thereafter, operation returns to block 702 in FIG. 7A where the operating system restarts the thread (i.e., the second thread is scheduled by queuing the second thread in the run queue of the processor in response to the first thread going to sleep and the value being the second value))…

	While Arimilli teaches thread preemption and context fetching, Arimilli does not explicitly disclose:
a first wait instruction including the first value of a signal and a first hint that indicates whether the signal is to be incremented, decremented, or exchanged with another value of the signal;
predicting a subsequent value of the signal based on the second value and whether the first hint indicates that the signal is to be incremented, decremented, or exchanged;

However, Bradford teaches:
a first wait instruction including the first value of a signal and a first hint that indicates whether the signal is to be incremented, decremented, or exchanged with another value of the signal (Column 8, Lines 56-61: For prefetching data processed by instructions, state information such as…base addresses (i.e., signal values) and strides (i.e., hints) used in connection with data prefetching…may be used);
predicting a subsequent value of the signal based on the second value and whether the first hint indicates that the signal is to be incremented, decremented, or exchanged (Column 8, Line 63-Column 9, Line 3: A prefetch engine 80 with a scheduler block 82 that interfaces with an increment/decrement control block 84 that updates entries 88 in a stride table 86. Each entry 88, in particular, includes a base address value and a stride value, with the base address value representing a current address to be fetched, and the stride value representing the amount to add or subtract from the base address to generate a next address to be fetched. Column 9, Lines 4-12: Data prefetcher 38 generally operates by attempting to discern access patterns among memory accesses, and predicting which data will likely be needed based upon those patterns. More specifically, once a base address and stride value are determined, the base address is fetched via a command from scheduler 82 to the cache system, and the base address is summed with the stride value by increment/decrement control block 84, with the new base address value written back into the table (i.e., strides represent hints that the address will be incremented or decremented by increment/decrement control block 84)); and
prefetching a third context into registers of the processor core based on the predicted subsequent value of the signal (Column 6, Lines 40-44: A context switch operation is utilized to initiate a prefetch of data (i.e., “state information” of the working state of a thread (Column 4, Lines 49-50), which is a “context”) likely to be used by a thread, prior to resumption of execution of that thread (i.e., a third context is prefetched prior to resumption of the thread associated with the third context). In this regard, a prefetch of data may result in the retrieval of data into any or all of the cache memories in a cache system (i.e., cache memory represents types of processor “registers” (see at least Column 6, Lines 24-39));

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have combined Bradford’s teaching of predicting a subsequent address to prefetch context data from before switching between threads based on whether a current address is to be incremented or decremented, with Arimilli’s teaching of switching between threads based on executing a wait instruction, with a reasonable expectation of success, since they are analogous thread execution systems that similarly perform context switching between threads. Such a combination results in a system that preempts a first thread based on executing a wait instruction, as in Arimilli, and prefetches predicted 

While Arimilli and Bradford discuss execution, preemption, and context switching between individual threads, the combination of Arimilli and Bradford does not explicitly disclose:
executing, using threads in the first workgroup, a first wait instruction
preempting the first workgroup in response to executing the first wait instruction
scheduling a second workgroup for execution on the processor core

However, Ray teaches:
executing, using threads in the first workgroup ([0080], Lines 5-8: The instruction unit 254 can dispatch instructions as thread groups (e.g., warps) (i.e., “a first workgroup” of threads), with each thread of the thread group assigned to a different execution unit within GPGPU core 262 (i.e., “processor” core)), a first wait instruction ([0180], Lines 1-12: As illustrated, a number of threads or, in this case, thread groups TG0 821, TG1 823, TG2 825 and TGN 827 may be up for processing. In one embodiment, each time there is a condition or event (i.e., “wait instruction”)…necessitating a thread group, such as TG0 821, TG1 823, to wait, the affected thread groups, such as TG0 821, TG1 823, are taken out of processing rotation, while their context information relating to the wait is stored 831, 833, respectively, as facilitated by partial preemption logic 705 of FIG. 7);
preempting the first workgroup in response to executing the first wait instruction ([0164], Lines 1-12: Workload invokes thread groups to computation…The thread group is preempted as part of an application pre-empt. An application preempt essentially pauses all thread groups relating to an application and swaps the context out);
scheduling a second workgroup for execution on the processor core ([0181], Lines 1-5: Partial preemption logic 705 of FIG. 7 may be used to partially preempt the process to allow for other thread groups, such as TG2 825, to be dispatched for processing 835, which TG0 821 and TG1 823 are suspended)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have combined Ray’s teaching of executing a wait instruction by, preempting, and scheduling groups of threads with the combination of Arimilli and Bradford’s teaching of executing a wait instruction by, preempting, and scheduling single threads, with a reasonable expectation of success, since they are analogous thread execution systems that similarly execute threads to perform tasks. Such a combination would result in a system that executes wait instructions by, preempts, and schedules groups of threads, as in Ray, based on values and predicted values of signals, as in Bradford. One of ordinary skill would have been motivated to make this combination to realize the improvement of efficiently managing thread groups (Ray [0005]).
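
For illustration only, the stride-based prediction relied upon from Bradford (Column 8, Line 63-Column 9, Line 12) may be sketched as follows. This is a minimal sketch under stated assumptions, not Bradford’s implementation: the names StrideEntry, predict_next, and fetch_and_advance and the example addresses are hypothetical, and the stride is assumed to be a signed integer so that a negative stride models a decrement.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    """Hypothetical stride-table entry: a base address plus a signed stride."""
    base_address: int  # current address to be fetched
    stride: int        # amount added to (or, if negative, subtracted from) the base


def predict_next(entry: StrideEntry) -> int:
    """Return the next address implied by the entry without modifying it."""
    return entry.base_address + entry.stride


def fetch_and_advance(entry: StrideEntry) -> int:
    """Return the address to fetch now, then write the summed base back."""
    current = entry.base_address
    entry.base_address = predict_next(entry)
    return current


# Assumed example values: three consecutive fetches with a stride of 64.
entry = StrideEntry(base_address=0x1000, stride=64)
fetched = [fetch_and_advance(entry) for _ in range(3)]
```

Under these assumptions, the base address corresponds to the current value of the signal and the sign of the stride corresponds to the increment or decrement hint in the mapping applied above.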

Regarding claim 2, Ray teaches:
preempting the first workgroup comprises storing the first context in a memory, and wherein scheduling the second workgroup for execution comprises writing the second context from the memory into the registers of the processor core ([0100], Lines 1-6: A set of registers 445 store context data for threads executed by the graphics processing engines 431-432, N and a context management circuit 448 manages the thread contexts. For example, the context management circuit 448 may perform save and restore operations to save and restore contexts of the various threads during context switches…For example, on a context switch, the context management circuit 448 may store current register values to a designated region in memory…It may then restore the register values when returning to the context (i.e., during a context switch between first and second thread groups, the context of the first thread group is stored in the designated memory region, and the context of the second thread group is “restored” to the set of registers)).  

Regarding claim 3, Bradford teaches:
writing the second context from the memory into the registers of the processor core comprises prefetching the second context from the memory into the registers of the processor core prior to preempting the first [thread] (Column 4, Line 64-Column 5, Line 2: Precisely when during i.e., second context is prefetched while first thread group executes, and thus, is prefetched prior to the preemption)). 

Regarding claim 7, Arimilli teaches:
storing information identifying the first workgroup in a first queue in response to preempting the first [thread] ([0088], Lines 3-15: Thread 610 runs in a processor (i.e., processor “core”) (not shown) and performs some work. Thread 610 makes a call (i.e., executes a “wait,” or sleep instruction) to background sleeper thread 640 to update wake-and-go array 622 (i.e., “queue” data structure)…Thread 610 may also store other information in association with background sleeper thread 640, such as a value for which thread 610 is waiting to be written to target address A2 (i.e., value of the target address represents a “signal having a value”). Then thread 610 goes to sleep (i.e., first thread is “preempted”) with thread state being stored in thread state storage 612. [0197], Lines: Each entry in central repository wake-and-go array 2400 may include thread identification (ID) 2402 (i.e., information identifying the thread to be preempted (Thread ID and value) are updated, or stored in the wake-and-go array queue)), wherein the first [thread] is scheduled for execution on the processor core based on the information identifying the first [thread] in the first queue ([0198], Lines 1-5: The wake-and-go engine 2350 may use the thread ID 2402 to identify the thread and the CPU ID to identify the processor. Wake-and-go engine 2350 may then place the thread in the run queue for the processor identified by CPU ID 2404 (i.e., scheduling the thread for execution on the processor uses the thread ID stored in the Wake-and-go array)). 
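
For illustration only, the wake-and-go behavior relied upon from Arimilli ([0088], [0100], [0198]) may be sketched as follows. This is a minimal sketch under stated assumptions, not Arimilli’s implementation: the class and method names (WaitEntry, WakeAndGo, sleep, on_write) are hypothetical, the array is modeled as a Python list, and the run queue is modeled as a deque of thread IDs.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WaitEntry:
    """Hypothetical wake-and-go array entry for one sleeping thread."""
    thread_id: int
    target_address: int
    awaited_value: int


class WakeAndGo:
    def __init__(self):
        self.array = []           # entries for threads that are asleep
        self.run_queue = deque()  # thread IDs scheduled to run

    def sleep(self, thread_id, target_address, awaited_value):
        # The thread records what it waits for, then goes to sleep.
        self.array.append(WaitEntry(thread_id, target_address, awaited_value))

    def on_write(self, address, value):
        # Compare each write against the waiting entries; on a match,
        # remove the entry and place the thread ID in the run queue.
        still_waiting = []
        for e in self.array:
            if e.target_address == address and e.awaited_value == value:
                self.run_queue.append(e.thread_id)
            else:
                still_waiting.append(e)
        self.array = still_waiting


wag = WakeAndGo()
wag.sleep(thread_id=610, target_address=0xA2, awaited_value=1)
wag.on_write(0xA2, 0)  # wrong value: the thread stays asleep
wag.on_write(0xA2, 1)  # awaited value: the thread is queued to run
```

In this sketch the stored thread ID is the information identifying the thread, and the same ID is what the scheduler later reads from the run queue.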

Regarding claim 8, Arimilli teaches:
prefetching the first context from a memory…prior to resuming execution of the first [thread]; and moving the information identifying the first [thread] from the first queue to a second queue in response to prefetching the first context from the memory ([0081], Lines 1-14: When a transaction appears on SMP fabric 420 with an address that matches the target address A2, array 422 i.e., second queue) for the processor…Thread 410 updates the array to remove the target address from array 422, and performs more work (i.e., thread state (context information) is retrieved from memory prior to re-executing the thread (prefetching), and the thread (i.e., a thread is identified by its thread ID) is removed from the array and placed in the run queue)), wherein the first [thread] is scheduled for execution on the processor core based on the information identifying the first [thread] in the second queue ([0198], Lines 1-7: The wake-and-go engine 2350 may use the thread ID 2402 to identify the thread and the CPU ID to identify the processor. Wake-and-go engine 2350 may then place the thread in the run queue (i.e., second queue) for the processor identified by CPU ID 2404. Wake-and-go engine 2350 may also use thread state pointer 2416 to load thread state information, which is used to wake the thread to the proper state (i.e., execution of the awakened thread is “scheduled” for execution based on the stored thread in the run queue)).

	Bradford further teaches:
prefetching the first context from a memory into the registers (Column 3, Lines 39-42: Initiating, in connection with a context switch operation, a prefetch of data (i.e., “state information” of the working state of a thread (Column 4, Lines 49-50), which is a “context”) likely to be used by a thread (i.e., third thread context) prior to resuming execution of that thread (i.e., first thread context is prefetched into cache memory (Column 3, Line 48) of processor register files (Column 2, Lines 7-8))).

Regarding claims 10-12, and 16-17, they are apparatus claims comprising similar limitations to those of method claims 1-3, and 7-8 respectively, and are therefore rejected for at least the same rationale.

Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Arimilli, in view of Bradford, in view of Ray, as applied to claims 3 and 12 above, and in further view of Holt et al. Pub. No.: US 2016/0062797 A1 (hereafter “Holt797”).

Holt797 was cited in the previous PTO-892 dated 23 March 2020.

Regarding claim 4, the combination of Arimilli, Bradford, and Ray does not explicitly disclose:
the first context is stored in a first portion of the registers prior to preempting the first workgroup, and wherein 
the second context is prefetched into a second portion of the registers prior to preempting the first workgroup, the first portion being different than the second portion. 

However, Holt797 teaches:
the first context is stored in a first portion of the registers prior to preempting the first [task], and wherein the second context is prefetched into a second portion of the registers prior to preempting the first [task], the first portion being different than the second portion ([0024], Lines 6-22: Core 110 includes two sets of physical registers, register file 165 and alternate register file 166 (i.e., first and second portions of processing core registers)…When core 110 is executing instructions for a current portion of a current task (i.e., first task) having the architectural registers mapped to register file 165 (i.e., context information of first workgroup is stored in first register file 165) and a RSWI occurs, which requires hardware task scheduler 140 to make a scheduling decision, the architectural registers’ mapping can be switched to alternate register file 166 for executing the instructions corresponding to a next portion of a next task. The two sets of physical registers allows hardware task scheduler to preload the context information for the next portion of the next task at alternate register file 166 (i.e., context information for second task is preloaded, or prefetched into second register file) while core 110 utilizes register file 165 for the current portion of the current task (i.e., preloading occurs before the first workgroup is preempted)).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have combined Holt797’s teaching of dividing a register into different portions and allowing a first task to execute out of a first portion while preloading context information into another portion prior to 

Regarding claim 13, it is an apparatus claim comprising similar limitations to those of method claim 4, and is therefore rejected for at least the same rationale.
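
For illustration only, the two-register-file arrangement relied upon from Holt797 ([0024]) may be sketched as follows. This is a minimal sketch under stated assumptions, not Holt797’s implementation: the names Core, preload, and switch are hypothetical, each register file is modeled as a dictionary, and the register names and values are invented for the example.

```python
class Core:
    """Hypothetical core with two physical register files (illustrative only)."""

    def __init__(self, current_context):
        # files[active] backs the architectural registers of the running task.
        self.files = [dict(current_context), {}]
        self.active = 0

    def preload(self, next_context):
        # Fill the inactive file while the current task keeps using the active one.
        self.files[1 - self.active] = dict(next_context)

    def switch(self):
        # Remap the architectural registers to the other, already loaded, file.
        self.active = 1 - self.active
        return self.files[self.active]


core = Core({"r0": 1, "r1": 2})       # current task runs from the first file
core.preload({"r0": 7, "r1": 9})      # next task's context is loaded in advance
before = dict(core.files[core.active])
after = core.switch()
```

In this sketch the preload leaves the active file untouched, which corresponds to the second context being placed in a portion of the registers different from the portion holding the first context.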

Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Arimilli, in view of Bradford, in view of Ray, as applied to claims 1 and 10 above, and in further view of Holt et al. Pub. No.: US 2014/0282561 A1 (hereafter “Holt561”).

Holt561 was cited in the previous PTO-892 dated 23 March 2020.

Regarding claim 6, while Bradford discloses prefetching contexts of threads, the combination of Arimilli, Bradford, and Ray does not specifically disclose:
wherein prefetching the third context comprises prefetching the third context in response to the subsequent value having a third value associated with a third [task] that executes based on the third context.

However, Holt561 teaches:
wherein prefetching the third context comprises prefetching the third context in response to the subsequent value having a third value associated with a third [task] that executes based on the third context ([0017], Lines 13-17: Process 226 loads the context information for the task selected in i.e., context information for the third task is preloaded prior to a resource transfer instruction (step 208 of Fig. 2) that initiates the context switch (step 212 of Fig. 2))).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have combined Holt561’s teaching of prefetching a third context corresponding to a third task in response to predicting that the third task should be executed next based on a predicted subsequent value of priority and a hint, with the combination of Arimilli, Bradford, and Ray’s teaching of predicting subsequent values of data to prefetch into registers based on address and stride, with a reasonable expectation of success, since they are analogous task execution systems that similarly perform context switching between tasks. Such a combination would result in a system that prefetches context information into a register of a predicted subsequent task based on a hint and predicted value, as in Holt561, where the hint and predicted value are strides and base addresses respectively, as in Bradford. One of ordinary skill would have been motivated to make this combination so that delay associated with context switching is reduced (Holt561 [0010], Lines 17-19).

Regarding claim 15, it is an apparatus claim comprising similar limitations to those of method claim 6, and is therefore rejected for at least the same rationale.

Claims 19 and 21-25 are rejected under 35 U.S.C. 103 as being unpatentable over Bradford, in view of Ray.

Regarding claim 19, Bradford teaches the invention substantially as claimed, including:
A method comprising: 
predicting a value of a signal based on a current value of the signal and whether a hint indicates the signal is to be incremented, decremented, or exchanged (Column 8, Line 63-Column 9, Line 3: A prefetch engine 80 with a scheduler block 82 that interfaces with an increment/decrement control block 84 that updates entries 88 in a stride table 86. Each entry 88, in particular, includes a base address value (i.e., current value of a “signal”) and a stride value (i.e., “hint”), with the base address value i.e., strides represent hints that the address will be incremented or decremented by increment/decrement control block 84)); 
prefetching a first context of a first [thread] into registers of a processor core in response to the predicted value of the signal being equal to a first value associated with the first [thread] (Column 6, Lines 40-44: A context switch operation is utilized to initiate a prefetch of data (i.e., “state information” of the working state of a thread (Column 4, Lines 49-50), which is a “context”) likely to be used by a thread, prior to resumption of execution of that thread. In this regard, a prefetch of data may result in the retrieval of data into any or all of the cache memories in a cache system (i.e., cache memory represents types of processor “registers” (see at least Column 6, Lines 24-39)); 
executing, concurrently with prefetching the first context, a second [thread] on the processor core based on a second context stored in the registers, wherein the second [thread] includes a wait instruction that includes the current value of the signal and the hint (Column 7, Lines 24-41: Now turning to FIG. 3, an exemplary implementation of a context switch routine 50 is illustrated…Routine 50 begins in block 52 by saving the working state of the current thread being executed, including any prefetch control data as needed to indicate what data and/or instructions should be prefetched prior to resumption of execution of the thread (i.e., executing a context switch represents execution of a “wait instruction” that stores, or “includes” as part of the context switch operation, prefetch control data comprising the base address and stride value). Column 4, Line 64-Column 5, Line 2: Precisely when during a context switch a prefetch is initiated can vary in different embodiments, e.g., before restoring a working state, while restoring a working state, after restoring a working state, or even during execution of another thread scheduled for execution prior to resumption of the thread for which the i.e., current thread being executed represents a “second thread” that executes during, or concurrently with prefetching the next thread that represents a “first thread”)); 
preempting the second [thread] in response to executing the wait instruction (Column 7, Lines 32-34: A context switch may also be triggered by a thread…releasing or suspending (i.e., a context switch suspends, or “preempts” a currently executing thread)); and
scheduling the first [thread] for execution on the processor core in response to preempting the second [thread] and in response to the signal having the predicted value (Column 7, Lines 44-49: Block 58 initiates a data and/or instruction prefetch on behalf of the next thread, using any of the variations discussed herein. Block 60 then restores the working state of the next thread, in a manner generally known in the art. Execution of the next thread is then resumed in block 62 (i.e., next thread, representing a “first thread” is scheduled and executed responsive to the suspension of the currently executing thread and the prefetching of the context information of the next thread based on the predicted address)).  

While Bradford prefetches context information of a first thread concurrently with execution of a second thread, preempts the second thread and schedules the first thread, Bradford does not explicitly disclose:
a first workgroup and a second workgroup;

However, Ray teaches:
a first workgroup and a second workgroup ([0164], Lines 1-12: Workload invokes thread groups to computation…The thread group is preempted as part of an application pre-empt. An application preempt essentially pauses all thread groups relating to an application and swaps the context out. [0181], Lines 1-5: Partial preemption logic 705 of FIG. 7 may be used to partially preempt the process to allow for other thread groups, such as TG2 825, to be dispatched for processing 835, which TG0 821 and TG1 823 are suspended (i.e., a thread group (“second workgroup”) is preempted by another thread group (first workgroup)));



Regarding claim 21, Bradford teaches:
prior to scheduling the first [thread], modifying the signal from the current value to the predicted value (Column 9, Lines 7-12: Once a base address and stride value are determined, the base address is fetched via a command from scheduler 82 to the cache system, and the base address is summed (i.e., “modified”) with the stride value by increment/decrement control block 84, with the new base address value written back into the table).

Regarding claim 22, Bradford teaches:
predicting, concurrently with executing the first [thread], a second predicted value of the signal based on the predicted value of the signal and whether a second hint indicates the signal is to be incremented, decremented, or exchanged (Column 8, Line 63-Column 9, Line 3: A prefetch engine 80 with a scheduler block 82 that interfaces with an increment/decrement control block 84 that updates entries 88 in a stride table 86. Each entry 88, in particular, includes a base address value (i.e., current value of a “signal”) and a stride value (i.e., “hint”), with the base address value representing a current address to be fetched, and the stride value representing the amount to add or subtract from the base address to generate a next address to be fetched. Column 9, Lines 4-12: Data prefetcher 38 generally operates by attempting to discern access patterns among memory accesses, and predicting which data will likely be needed based upon those patterns. More specifically, once a base address and stride value are determined, the base address is fetched via a command from scheduler 82 to the cache i.e., Data prefetcher 38 prefetches each subsequent address based on incrementing, or decrementing the current address, representing at least a first, and second or subsequent prediction)).  

Regarding claim 23, Bradford teaches:
prefetching a third context of a third [thread] into the registers of the processor core in response to the second predicted value of the signal being equal to a second value associated with the third [thread] (Column 6, Lines 40-44: A context switch operation is utilized to initiate a prefetch of data (i.e., “state information” of the working state of a third thread (Column 4, Lines 49-50), which is a “context”) likely to be used by a thread, prior to resumption of execution of that thread. In this regard, a prefetch of data may result in the retrieval of data into any or all of the cache memories in a cache system (i.e., cache memory represents types of processor “registers” (see at least Column 6, Lines 24-39))).

Regarding claim 24, Bradford teaches:
prefetching the second context of the second [thread] into the registers of the processor core in response to the second predicted value of the signal being equal to a second value associated with the second [thread] (Column 6, Lines 40-44: A context switch operation is utilized to initiate a prefetch of data (i.e., “state information” of the working state of the previously preempted second thread (Column 4, Lines 49-50), which is a “context”) likely to be used by a thread, prior to resumption of execution of that thread. In this regard, a prefetch of data may result in the retrieval of data into any or all of the cache memories in a cache system (i.e., cache memory represents types of processor “registers” (see at least Column 6, Lines 24-39))).

Regarding claim 25, Bradford teaches:
preempting the second [thread] comprises storing the second context in a memory (Column 7, Lines 24-41: Now turning to FIG. 3, an exemplary implementation of a context switch routine 50 is illustrated…Routine 50 begins in block 52 by saving the working state of the current thread being… (i.e., context data is stored in memory)), and wherein scheduling the first [thread] for execution comprises writing the first context from the memory into the registers of the processor core (Column 7, Lines 44-49: Block 58 initiates a data and/or instruction prefetch on behalf of the next thread, using any of the variations discussed herein. Block 60 then restores the working state of the next thread, in a manner generally known in the art. Execution of the next thread is then resumed in block 62 (i.e., next thread, representing a “first thread” is scheduled and executed responsive to the suspension of the currently executing thread and the prefetching of the context information of the next thread based on the predicted address)).
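
For illustration only, the ordering of blocks 52, 58, 60, and 62 in the cited context switch routine can be sketched as follows. This is a minimal sketch under stated assumptions, not Bradford's implementation: the function context_switch, the dictionary-based registers and memory, and the prefetched list are all hypothetical stand-ins.

```python
def context_switch(current, next_thread, registers, memory, prefetched):
    """Hypothetical model of the save / prefetch / restore / resume ordering."""
    # Block 52: save the working state of the current thread to memory.
    memory[current] = dict(registers)
    # Block 58: initiate a prefetch on behalf of the next thread
    # (modeled here only as a record that the prefetch was requested).
    prefetched.append(next_thread)
    # Block 60: restore the working state of the next thread into the registers.
    registers.clear()
    registers.update(memory[next_thread])
    # Block 62: execution of the next thread resumes.
    return next_thread


# Assumed example state: thread "T1" was saved earlier, "T2" is running.
memory = {"T1": {"pc": 7}}
registers = {"pc": 42}
prefetched = []
running = context_switch("T2", "T1", registers, memory, prefetched)
```

In this sketch the suspended thread's context ends up in memory and the resumed thread's context ends up in the registers, which is the correspondence relied upon above.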

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Howes et al. Pub. No.: US 2014/0157287 A1 teaches (in Figs. 3 and 4) executing a first thread (workgroup of at least one thread), saving a state and performing a context switch to a second thread, and then subsequently restoring the first context.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL W AYERS whose telephone number is (571)272-6420.  The examiner can normally be reached on M-F 8:30-5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai An, can be reached on (571)272-3756.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through 






/MICHAEL W AYERS/Examiner, Art Unit 2195