The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Detailed Action
Claims 1-20 have been examined.
Claims 1-20 have been rejected.



Claim Rejections - 35 USC § 112(b)

The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

 
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.
Claim 11, line 4, recites the limitation "the steps".  This limitation lacks antecedent basis.
Claim 20, line 6, recites the limitation "the steps".  This limitation lacks antecedent basis.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Note that in the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claims 1, 2, 4-12, 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over Fadelu (WO 2022/139795 A1, filed as PCT/US2020/066430).
As per claim 1, Fadelu ('795) discloses a computer-implemented method for checkpointing a context associated with an execution of a software application on a parallel processor (see abstract, a multi-chip system performs a machine learning application), the method comprising: 
	causing a plurality of parallel processing elements included in the parallel processor to stop executing a first plurality of instructions (paragraph 34, processing tiles can be halted and resumed); 
	causing the parallel processor to collect first state data associated with the context (paragraph 46, state data for each ASIC is saved to a shared memory); 
	generating a checkpoint based on the first state data, wherein the checkpoint is stored in a memory associated with the parallel processor (paragraphs 46-47, the saved state is a checkpoint that can be restored and is saved to a shared memory); and 
	causing the plurality of parallel processing elements to resume executing the first plurality of instructions (paragraphs 46-47, the state of the ASICs can be restored from shared memory to the individual memories of the ASICs in the hardware accelerator; the scalar core can then resume execution of the long-running process).
Fadelu ('795) teaches that a long-running process is saved and then resumed (paragraphs 46-47).  Fadelu ('795) does not expressly disclose the method wherein the parallel processor stops executing in accordance with a context before executing a next instruction included in the first plurality of instructions, and resumes execution of the first plurality of instructions at the next instruction in accordance with the context.
Prior to the effective filing date of the claimed invention it would have been obvious to a person of ordinary skill in the art to modify the execution stopping and resuming disclosed by Fadelu ('795) such that the processor resumes executing at a next instruction following the instruction where it halted.  This modification would have been obvious because, as would be clear to one of ordinary skill in the art, resuming with the next instruction ensures that no instructions are skipped, which one of ordinary skill in the art would appreciate may be important for correct program execution.
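For illustration only, the stop, collect, checkpoint, and resume sequence discussed above can be sketched as follows.  This is a minimal sketch under stated assumptions: the toy instruction set and the names Context, run, make_checkpoint, and restore are hypothetical and are not drawn from the claims or from Fadelu ('795).  It shows only that a context halted before its next instruction and later resumed at that same instruction reaches the same result as an uninterrupted run.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Context:
    # Hypothetical context: pc is the index of the NEXT instruction to execute.
    pc: int = 0
    regs: dict = field(default_factory=lambda: {"acc": 0})


def run(program, ctx, max_steps=None):
    """Execute from ctx.pc; if max_steps is given, halt before the next instruction."""
    steps = 0
    while ctx.pc < len(program) and (max_steps is None or steps < max_steps):
        op, arg = program[ctx.pc]
        if op == "add":
            ctx.regs["acc"] += arg
        elif op == "mul":
            ctx.regs["acc"] *= arg
        ctx.pc += 1  # advance only after the instruction has fully executed
        steps += 1
    return ctx


def make_checkpoint(ctx):
    """Copy the collected state so later execution cannot alter the checkpoint."""
    return copy.deepcopy(ctx)


def restore(checkpoint):
    """Rebuild a runnable context from a stored checkpoint."""
    return copy.deepcopy(checkpoint)


PROGRAM = [("add", 2), ("mul", 5), ("add", 1), ("mul", 3)]

uninterrupted = run(PROGRAM, Context())

halted = run(PROGRAM, Context(), max_steps=2)  # stop before instruction 2
ckpt = make_checkpoint(halted)                 # state data becomes the checkpoint
resumed = run(PROGRAM, restore(ckpt))          # resume at the next instruction
```

Because the program counter in the sketch is advanced only after an instruction completes, the stored value always identifies the next instruction, so no instruction is skipped or repeated on resumption.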

As per claim 2, Fadelu ('795) discloses the computer-implemented method of claim 1, wherein causing the plurality of parallel processing elements to stop executing the first plurality of instructions comprises transmitting a control call to the parallel processor to preempt the context (paragraph 2, a preemptive scheduler preempts and later resumes tasks on the system) at an instruction level (paragraph 43, a fence instruction ensures that instructions prior to the fence are carried out before the checkpoint).

As per claim 4, Fadelu ('795) discloses the computer-implemented method of claim 1, further comprising: 
	while the plurality of parallel processing elements is executing a second plurality of instructions, determining that the context is to be restarted based on the checkpoint (paragraphs 44-45, after the high priority process completes execution, the long-running process continues); 
	causing the plurality of parallel processing elements to stop executing the second plurality of instructions (paragraphs 44-45, after the high priority process completes execution, the long-running process continues); and 
	causing the plurality of parallel processing elements to restart executing the first plurality of instructions at the next instruction in accordance with the checkpoint (paragraph 46, the long-running process resumes after state is restored from shared memory to the individual memories of the ASICs).

As per claim 5, Fadelu ('795) discloses the computer-implemented method of claim 1, wherein causing the plurality of parallel processing elements to resume executing the first plurality of instructions comprises transmitting a control call to the parallel processor to re-enable the plurality of parallel processing elements (paragraph 2, the task is later resumed).

As per claim 6, Fadelu ('795) discloses the computer-implemented method of claim 1, further comprising, prior to causing the plurality of parallel processing elements to stop executing the first plurality of instructions, preventing the parallel processor from scheduling a second plurality of instructions in accordance with a second context on the plurality of parallel processing elements (paragraphs 3 and 10, a higher priority process may cause the system to preempt a long-running process; this action is based on relative priority.  It would be clear to one of ordinary skill in the art that the relative priority scheduling implies that the system will prevent the preemption (and prevent the scheduling) if the other process is of lower priority).
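For illustration only, the relative-priority reasoning applied to claim 6 can be sketched as follows.  This is a minimal sketch under stated assumptions: the function name may_schedule and the convention that a larger number denotes a higher priority are hypothetical and are not drawn from the claims or from Fadelu ('795).

```python
def may_schedule(running_priority: int, queued_priority: int) -> bool:
    """Return True only if the queued process outranks the running process.

    Assumption: a larger integer means a higher priority.
    """
    return queued_priority > running_priority


# A higher-priority queued process may preempt the running process.
higher = may_schedule(running_priority=1, queued_priority=5)
# A lower-priority queued process is prevented from being scheduled.
lower = may_schedule(running_priority=5, queued_priority=1)
```

Under these assumptions, a queued process of equal or lower priority is never scheduled onto the processing elements ahead of the running process.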

As per claim 7, Fadelu ('795) discloses the computer-implemented method of claim 1, wherein causing the plurality of parallel processing elements to stop executing the first plurality of instructions further causes the parallel processor to collect second state data associated with the context (paragraph 25, the context can include hardware states, instructions, compute operands and results, which are collected for storing).

As per claim 8, Fadelu ('795) discloses the computer-implemented method of claim 7, wherein generating the checkpoint comprises copying the first state data and the second state data to the checkpoint (paragraph 25, the saved state can include multiple state data including hardware states, instructions, compute operands and results).

As per claim 9, Fadelu ('795) discloses the computer-implemented method of claim 1, wherein the first state data is associated with the plurality of parallel processing elements (paragraph 46, a state of local memory of each ASIC is saved to shared memory) and comprises at least one of a per-thread register state or a shared memory state (paragraph 46, a state of local memory of each ASIC is saved to shared memory; the state data therefore comprises the shared memory state).

As per claim 10, Fadelu ('795) discloses the computer-implemented method of claim 1, wherein the parallel processor comprises a parallel processing unit (paragraph 30), a graphics processing unit, a tensor processing unit (paragraphs 48 and 49), or a multi-core central processing unit.

As per claims 11 and 12, these claims recite limitations found in claims 1 and 2, respectively, and are rejected on the same grounds as claims 1 and 2, respectively.

As per claim 14, this claim recites limitations found in claim 4 and is rejected on the same grounds as claim 4.

As per claim 15, Fadelu ('795) discloses the one or more non-transitory computer readable media of claim 14, wherein the second plurality of instructions is associated with the context or a different context (claim 1 of the reference, a second process of a higher priority is determined to be queued and is executed until completion).

As per claim 16, Fadelu ('795) discloses the one or more non-transitory computer readable media of claim 11, further comprising, prior to causing the plurality of parallel processing elements to stop executing the first plurality of instructions, determining that the context is to be checkpointed based on a checkpoint interval (paragraphs 32-35, the system employs a maximum execution time before a preemption checkpoint.  The preemption point is when state data is saved for a context switch, as in paragraphs 46-47).
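For illustration only, a determination based on a checkpoint interval can be sketched as follows.  This is a minimal sketch under stated assumptions: the function name checkpoint_steps and the use of a step count (rather than elapsed time) as the interval unit are hypothetical and are not drawn from the claims or from Fadelu ('795).

```python
from typing import List


def checkpoint_steps(total_steps: int, interval: int) -> List[int]:
    """Return the step counts at which a checkpoint would be due.

    Assumption: the interval is measured in completed steps.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return [s for s in range(1, total_steps + 1) if s % interval == 0]


# With 10 steps and an interval of 4, checkpoints fall due after steps 4 and 8.
due = checkpoint_steps(total_steps=10, interval=4)
```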

As per claim 17, Fadelu ('795) discloses the one or more non-transitory computer readable media of claim 11, wherein causing the plurality of parallel processing elements to stop executing the first plurality of instructions further causes the parallel processor to collect second state data associated with the context (paragraph 25, the saved state can include multiple pieces of state data including hardware states, instructions, compute operands and results).

As per claim 18, Fadelu ('795) discloses the one or more non-transitory computer readable media of claim 17, wherein generating the checkpoint comprises copying the first state data and the second state data to the checkpoint (paragraph 25, the saved state includes multiple pieces of state data including hardware states, instructions, compute operands and results).

Claims 3, 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Fadelu ('795) in view of Browning (US Patent 6,654,781).
As per claim 3, Fadelu ('795) discloses the computer-implemented method of claim 1.  Fadelu ('795) does not expressly disclose the method wherein the first plurality of instructions comprises a kernel included in the software application.
Browning ('781) teaches a system in which a kernel thread is executed (column 3 lines 36-42).
Prior to the effective filing date of the claimed invention it would have been obvious to a person of ordinary skill in the art to modify the context switching and process scheduling disclosed by Fadelu ('795) such that a process executed is a kernel thread, as taught by Browning ('781).  This modification would have been obvious because kernel threads may perform functions including control flows which enable concurrent processing by multiple processors (Browning ('781) column 1 lines 22-30).

As per claim 13, this claim recites limitations found in claim 3 and is rejected on the same grounds as claim 3.

As per claim 20, Fadelu ('795) discloses a system comprising: 
	a parallel processing memory storing a first process (paragraph 46, state data for each ASIC is saved to a shared memory); 
	a parallel processor coupled to the parallel processing memory (paragraph 30, the ASICs form an array of parallel computing engines); 
	one or more memories storing instructions; and 
	one or more primary processors coupled to the one or more memories that, when executing the instructions, perform the steps of: 
	causing a plurality of parallel processing elements included in the parallel processor to stop executing the first process (paragraph 34, processing tiles can be halted and resumed);
	causing the parallel processor to collect state data associated with the context (paragraph 46, state data for each ASIC is saved to a shared memory);
	generating a checkpoint based on the state data, wherein the checkpoint is stored in the parallel processing memory (paragraphs 46-47, the saved state is a checkpoint that can be restored and is saved to a shared memory); and
	causing the plurality of parallel processing elements to resume executing the first process (paragraphs 46-47, the state of the ASICs can be restored from shared memory to the individual memories of the ASICs in the hardware accelerator; the scalar core can then resume execution of the long-running process).

Fadelu ('795) does not expressly disclose the system wherein the parallel processor stops executing in accordance with a context before executing a next instruction included in the first plurality of instructions, and resumes execution of the first plurality of instructions at the next instruction in accordance with the checkpoint.
Prior to the effective filing date of the claimed invention it would have been obvious to a person of ordinary skill in the art to modify the execution stopping and resuming disclosed by Fadelu ('795) such that the processor resumes executing at a next instruction following the instruction where it halted.  This modification would have been obvious because, as would be clear to one of ordinary skill in the art, resuming with the next instruction ensures that no instructions are skipped, which one of ordinary skill in the art would appreciate may be important for correct program execution.

Fadelu ('795) does not expressly disclose the system wherein the first process is a kernel process.

Browning ('781) teaches a system in which a kernel thread is executed (column 3 lines 36-42).
Prior to the effective filing date of the claimed invention it would have been obvious to a person of ordinary skill in the art to modify the context switching and process scheduling disclosed by Fadelu ('795) such that a process executed is a kernel thread, as taught by Browning ('781).  This modification would have been obvious because kernel threads may perform functions including control flows which enable concurrent processing by multiple processors (Browning ('781) column 1 lines 22-30).

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Fadelu ('795) in view of Volp (US Patent Application Publication 2005/0081020).

As per claim 19, Fadelu ('795) discloses the one or more non-transitory computer readable media of claim 17.

Fadelu ('795) teaches that the in-process values that are part of a context may include hardware states (paragraph 4) and that the compute tile itself includes various registers (paragraphs 53, 77 and 81).

Fadelu ('795) does not expressly disclose the media wherein the second state data comprises at least one of a privileged register state or a flip-flop state.

Volp ('020) teaches use of a privileged execution context including states of privileged registers (paragraphs 5 and 9).
Prior to the effective filing date of the claimed invention it would have been obvious to a person of ordinary skill in the art to modify the execution stopping and resuming disclosed by Fadelu ('795) such that saved state information comprises a privileged register state, as taught by Volp ('020).  This modification would have been obvious because use of privileged and nonprivileged contexts strengthens security by preventing untrusted software from taking control (Volp ('020) paragraph 12).




Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
	Goodman teaches context switching in a SIMD processor.  The system performs instruction-level preemption using special registers to save PC, current thread validity information and predicate information.
	Rauchfuss teaches storing a thread context including stopping a thread of a GPU at an instruction level granularity in response to a request to pre-empt a thread.




Contact Information


Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSEPH SCHELL whose telephone number is (571) 272-8186.  The examiner can normally be reached on Monday through Friday, 9:00 AM to 5:00 PM (Pacific Time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  Please note that all agendas or related documents that Applicant would like reviewed should be sent at least one full business day (i.e. 24 hours not including weekends or holidays) before the interview.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matt Kim, can be reached at (571) 272-4182.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.  The fax phone number for the examiner is 571-273-8186.  The examiner may be e-mailed at joseph.schell@uspto.gov, though communications via e-mail are not permitted without a written authorization form (see MPEP 502.03).
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





JS
/JOSEPH O SCHELL/
Primary Examiner, Art Unit 2114