DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant’s submission filed on 6/27/2022 has been entered.

Claims 1, 3-6, 8, 10-13, 15 and 17-20 are presented for examination. Claims 1, 4-6, 8, 12-13, 15 and 17-19 have been amended. 
Applicant’s amendments to the claims have overcome claim objections and 112 rejections previously set forth in the Final Office Action mailed 3/28/2022.

Examiner Notes
Examiner cites particular columns, paragraphs, figures and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 3-6, 8, 10-13, 15 and 17-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA  the applicant regards as the invention.

Regarding Claim 1, the meaning of the limitations “terminate the execution of the at least one transaction in the set of transactions without storing any context information associated with the graphics workload; store an execution state of the thread in a memory queue” at lines 12-15 is not clear. First of all, no “thread” is currently introduced in the claim, and thus it is not clear what the claimed “an execution state of the thread” refers to. According to the original claim language and specification submitted on 3/15/2019 (such as “an execution thread of a graphics workload” at line 1 of the original Claim 1 submitted on 3/15/2019), the claimed “an execution state of the thread” can be interpreted as an execution state of the graphics workload. If so, then the difference between “context information associated with the graphics workload” and “an execution state of the thread” (i.e., an execution state of the graphics workload) is not clear, and it is not clear whether “without storing any context information associated with the graphics workload” at lines 13-14 would conflict with “store an execution state of the thread” at line 15. According to some of the descriptions in the specification, such as “store context data for threads executed by the graphics processing engines  … save and restore contexts of the various threads during contexts switching (e.g., where a first thread is saved and a second thread is stored so that the second thread can be executed by a graphics processing engine)” from [00115], “save and restore the context state. This pointer is optional if no state is required to be saved between jobs or when a job is preempted. The context save/restore area may be pinned system memory” from [00136], and “a pipeline select command 2513 is required only once within an execution context” from [00300], the claimed “any context information associated with the graphics workload” (emphasis added) is the same as, or at least included in, the claimed “an execution state of the thread” for this particular patent application, absent further explanation or clarification, under BRI (examiner also includes some references as evidence to show that one with ordinary skill in the art can interpret execution state as context information; see the NPL named “Context Switch Definition” attached/listed in the Conclusion section). If so, then the step/action of terminating without storing any context information would conflict with the step/action of storing the execution state. For the purpose of examination, examiner interprets the limitations mentioned above as “terminate the execution of the at least one transaction in the set of transactions without storing any context information associated with the graphics workload
Note 1: One with ordinary skill in the art may interpret “terminate the execution … without storing any context information associated with the graphics workloads; store an execution state of the thread in a memory queue” as reciting the terminate step/action and the step/action of storing the context state or execution state as two different and separate steps/actions performed in order without overlap (since they are separate and performed in order without overlap, the terminate step/action is performed without storing the context information or execution state). However, such an interpretation would conflict with the description at [00211] of the specification, which supports the amended terminate step/action (i.e., “context information need not be stored because the transaction will be executed from the starting point 1610 after the page fault is resolved”).
Note 2: If Applicant disagrees with any of the explanation or interpretation above, Applicant is requested to explain how the specification supports or explains that both the claimed terminate step/action and the claimed store step/action are performed in the manner required by their plain meanings. 
Claims 3-6 are rejected for failing to cure the deficiency from their respective parent claim by dependency.

Regarding Claim 8, Claim 8 is rejected for the same reason set forth in the rejection of Claim 1 above.
Claims 10-13 are rejected for failing to cure the deficiency from their respective parent claim by dependency.

Regarding Claim 15, Claim 15 is rejected for the same reason set forth in the rejection of Claim 1 above.
Claims 17-20 are rejected for failing to cure the deficiency from their respective parent claim by dependency.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 6, 8, 10, 13, 15 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Deming et al. (US PGPUB 20140281263 A1, hereafter Deming) in view of Gschwind et al. (US 20150347134 A1, hereafter Gschwind), Grossman (US PGPUB 20170004647 A1) and Kacevas et al. (US PGPUB 20180047131 A1, hereafter Kacevas).
Deming, Grossman and Kacevas were cited in the previous Office action.

Regarding Claim 1, Deming discloses: An apparatus comprising: 
a parallel processor comprising a plurality of graphics processing resources, wherein at least one of the processing resources of the parallel processor is to (see Figs. 1, 3-4 and [0021]):
detect a page fault in the execution of at least one transaction in a set of transactions for a graphics workload, wherein the set of transactions are to be executed atomically (see Figs. 1, 3-4, [0009] and [0096]-[0098]; “if uTLB 430 is unable to map … then the uTLB 430 generates a memory access fault. The fault detector 450 processes the memory access fault—sending a fault signal”. Also see [0020] and [0093]; “the parallel processing subsystem 112 incorporates circuitry optimized for graphics and video processing” and “threads executing within the SM 310(0) each generate a stream of virtual memory transactions from the SM 310(0)”, emphasis added. Furthermore, see [0064] and [0069]; “various units in the CPU 102 and the PPU 202 may implement atomic operations” and “the type of access that was attempted (e.g., read, write, or atomic), the virtual memory address for which an attempted access caused a page fault”. In certain embodiments, the threads executing in SM 310 of the parallel processing subsystem 112 are graphics workloads, and the set of transactions generated by a thread are transactions for atomic-type accesses), and in response to detection, to:
terminate the execution of the at least one transaction in the set of transactions (see Figs 1, 3-4, step 508 of Fig. 5 and [0105], “the fault detector 450 included in the replay unit 350(0) … stalls the SM 310(0), and adds the faulting virtual memory transaction to the replay buffer 460”. Stalling the SM that originally executes the faulting transaction and adding the faulting transaction to a buffer imply the corresponding faulting transaction is terminated);

Deming further discloses: and in response to the detection, the processor to:
create a set of instructions to resolve the page fault (see Fig. 2 and [0099]);
launch the set of instructions for execution on the CPU (see Fig. 2 and [0099]).

Note: for the purpose of compact prosecution, Deming also discloses the feature of, in response to the detection of a page fault in the execution of at least one transaction, storing an execution state of the thread in a memory queue (see Fig. 2 and [0097]-[0098]; “the fault detector 450 causes a fault buffer entry to be written to the fault buffer 216 of FIG. 2”, emphasis added. The fault buffer entry, i.e., the claimed execution state of the thread, is written to the fault buffer in the system memory 104 in response to detecting a page fault in the execution of the transaction).

Deming does not disclose: 
the terminate is performed without storing any context information associated with the graphics workload;
the creation and launch steps are performed by the at least one of the processing resources of the parallel processor, i.e., the processing resources of the parallel processor that perform the detection of the page fault, rather than by the processor; 
the set of instructions to resolve the page fault is created in a command batch format, and the launch step is performed by launching the command batch in a hardware command streamer for execution on the parallel processor.

However, Gschwind discloses: in response to detecting a memory fault in execution of at least one transaction, terminate the execution of the at least one transaction without storing any context information associated with the transaction (see [0195]; “When a memory conflict is detected, the transaction rolls back (aborts), and all intermediate results are discarded, and the architectural state is restored to that of the start of the transaction (XBEGIN)” and “restarts execution from the fallback instruction address at the beginning of the aborted transaction”, emphasis added. Also see [0059]; “The programmer uses the XBEGIN instruction to specify the start of the transactional code region and the XEND instruction to specify the end of the transactional code region”).
It would have been obvious to one with ordinary skill in the art, before the effective filing date of the claimed invention, to modify the termination of the failed transaction in response to a memory fault during its execution in Deming by including the termination of the failed transaction and the restart of that transaction at its beginning or entry point from Gschwind, since it would provide a mechanism for restarting the failed transaction in which all intermediate results of the failed transaction are discarded (see [0195] from Gschwind).
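
For illustration only, the abort-and-restart behavior relied upon above (intermediate results discarded, nothing saved at the point of the fault, execution restarted from the beginning of the transaction) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Deming, Gschwind or the specification; the names PageFault, run_atomically and resolve_fault are hypothetical.

```python
class PageFault(Exception):
    """Hypothetical signal that a transaction touched an unmapped page."""


def run_atomically(transaction, state, resolve_fault, max_attempts=3):
    """Run `transaction` on a private copy of `state`; commit only on success.

    On a fault the working copy is dropped (no context is saved), the fault
    is resolved, and the transaction restarts from its beginning.
    """
    for _ in range(max_attempts):
        working = dict(state)        # intermediate results live only here
        try:
            transaction(working)
        except PageFault as fault:
            resolve_fault(fault)     # hypothetical handler, e.g. maps the page
            continue                 # discard `working`, restart from the start
        state.clear()
        state.update(working)        # commit all results at once
        return True
    return False
```

In this sketch the only thing that survives a fault is the committed state from before the transaction began, which is why no mid-transaction context needs to be stored.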

Grossman discloses: an apparatus comprising a parallel processor:
in response to detecting a page fault in execution of a transaction at the parallel processor, the at least one of the processing resources of the parallel processor to:
create a set of instructions comprising instructions to resolve the page fault; launch the set of instructions for execution on the parallel processor (see Fig. 1 and [0015]; “In response to the GPU 112 experiencing a page fault, the GPU MMU 118 can interrupt the GPU context manager 114, to initiate handling the page fault and to inform the GPU fault handler 116 or the CPU fault handler 106 of the page fault”. Also see [0025]).
It would have been obvious to one with ordinary skill in the art, before the effective filing date of the claimed invention, to modify the PPU Fault handler 215 of CPU 102, which handles page faults occurring on GPU/PPU 202 in the combination of Deming and Gschwind, by including the GPU fault handler 116 of GPU 112, which handles page faults occurring on GPU 112, from Grossman, since it would provide a system architecture that reduces the communication latency of the page fault report mechanism when the fault handler that handles page faults occurring on the GPU is located at the GPU instead of the CPU (see Fig. 1 of Grossman).

Furthermore, Kacevas discloses: an apparatus comprising a parallel processor, the apparatus creates a set of instructions in command batch format, the apparatus launches the command batch in a hardware command streamer for execution on the parallel processor (see [0065]; “graphics processor receives batches of commands via ring interconnect 502. The incoming commands are interpreted by command streamer 503 in the pipeline front-end 534”. Also see “the commands may be issued as batch of commands in a command sequence, such that the graphics processor will process the sequence of commands in an at least partially concurrent manner” from [0097]).
It would have been obvious to one with ordinary skill in the art, before the effective filing date of the claimed invention, to modify the mechanism by which the GPU/PPU executes a set of instructions in the combination of Deming, Gschwind and Grossman by including, from Kacevas, a set of instructions formed in a command batch format that is launched in and interpreted by a command streamer of the GPU for executing the command batch or set of instructions on the GPU, since it is understood that a set of instructions can be formed in a command batch format to be executed.
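
For illustration only, the notion of forming fault-resolution instructions as a command batch and launching that batch in a command streamer can be sketched as follows. This is a simplified software model under stated assumptions, not the hardware described in Kacevas; the class CommandStreamer, the function build_fault_batch and the command names are all hypothetical.

```python
from collections import deque


class CommandStreamer:
    """Hypothetical model of a front-end that accepts and interprets batches."""

    def __init__(self):
        self._ring = deque()             # pending batches, oldest first

    def launch(self, batch):
        """Accept one batch of commands for later interpretation."""
        self._ring.append(tuple(batch))

    def drain(self, execute):
        """Interpret every pending batch in order and collect the results."""
        results = []
        while self._ring:
            for command in self._ring.popleft():
                results.append(execute(command))
        return results


def build_fault_batch(virtual_address):
    """Hypothetical batch of commands that would resolve one page fault."""
    return [
        ("map_page", virtual_address),
        ("invalidate_tlb", virtual_address),
        ("replay", virtual_address),
    ]
```

A caller would build the batch for the faulting address, pass it to launch, and let drain hand each command to whatever executes it.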

Thereby, the combination of Deming, Gschwind, Grossman and Kacevas discloses: wherein at least one of the processing resources of the parallel processor is to: in response to the detection, to:
terminate the execution of the at least one transaction in the set of transactions without storing any context information associated with the graphics workload (see Figs 1, 3-4, step 508 of Fig. 5 and [0105] from Deming and [0059], [0195] from Gschwind, “the fault detector 450 included in the replay unit 350(0) … stalls the SM 310(0), and adds the faulting virtual memory transaction to the replay buffer 460”, “the transaction rolls back (aborts), and all intermediate results are discarded, and the architectural state is restored to that of the start of the transaction (XBEGIN)” and “restarts execution from the fallback instruction address at the beginning of the aborted transaction”. The failed transaction is terminated or aborted without storing the context information or intermediate results and is later resumed at the beginning point/location of the failed transaction);
create a command batch comprising instructions to resolve the page faults (see Fig. 2, [0099] from Deming; Fig. 1, [0015], [0025] from Grossman and [0065] from Kacevas. In the combined system, the PPU Fault Handler 215 from Deming is located at the graphics processor that detects the page fault occurring at the graphics processor, and such PPU Fault Handler 215 generates/creates a set of instructions in command batch format to resolve the detected page fault); and
launch the command batch in a hardware command streamer for execution on the processor (see Fig. 2, [0099] from Deming; Fig. 1, [0015], [0025] from Grossman and [0065] from Kacevas. In the combined system, the graphics processor includes a hardware command streamer to launch and then interpret received instructions in the command batch format, so that such command batch or set of instructions can be executed on the graphics processor).

Regarding Claim 3, the rejection of Claim 1 is incorporated and further the combination of Deming, Gschwind, Grossman and Kacevas discloses: the at least one of the processing resources to: generate a page fault signal for the at least one transaction in the set of transactions, the page fault signal comprising a thread identifier, a transaction identifier, a processor identifier, and a virtual function unit (see [0097]-[0098] from Deming; “the fault detector 450 causes a fault buffer entry to be written to the fault buffer 216 of FIG. 2”, emphasis added. Also see [0069] from Deming for details of the fault buffer entry, i.e., the claimed page fault signal for the transaction. Based on [0069], the fault buffer entry at least comprises: “an indication of a unit or thread that caused a page fault”, i.e., the claimed thread identifier; “the type of access that was attempted (e.g., read, write, or atomic)”, i.e., the claimed transaction identifier, to identify the type of faulted transaction; “the virtual memory address for which an attempted access caused a page fault”, i.e., the claimed virtual function unit. In addition, see “a fault buffer 216, which includes entries written by the PPU 202 in order to inform the CPU 102 of a page fault generated by the PPU 202” from [0031] of Deming, and thus it is reasonable to consider that a fault buffer entry, i.e., the claimed page fault signal, should also include information indicating that it is the PPU 202 instead of the CPU (see “CPU-based page fault” at [0076] from Deming) that generates the corresponding page fault, i.e., the claimed processor identifier);
discard any work performed on the at least one transaction in the set of transactions (see [0195] from Gschwind; “When a memory conflict is detected, the transaction rolls back (aborts), and all intermediate results are discarded”).
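
For illustration only, the examiner's mapping of the fault buffer entry of Deming onto the four claimed fields of the page fault signal can be sketched as a record type. This is a minimal sketch under stated assumptions; the field names, the AccessType enumeration and the report_fault helper are hypothetical and do not appear in Deming, and the source_processor field reflects the examiner's inference from [0031] rather than an express disclosure.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class AccessType(Enum):
    READ = "read"
    WRITE = "write"
    ATOMIC = "atomic"


@dataclass(frozen=True)
class FaultBufferEntry:
    faulting_thread: int      # mapped to the claimed thread identifier
    access_type: AccessType   # mapped to the claimed transaction identifier
    virtual_address: int      # mapped to the claimed virtual function unit
    source_processor: str     # inferred; mapped to the claimed processor identifier


fault_buffer = deque()        # stands in for a fault buffer held in system memory


def report_fault(thread, access, address, processor="PPU"):
    """Append one entry to the fault buffer and return it."""
    entry = FaultBufferEntry(thread, access, address, processor)
    fault_buffer.append(entry)
    return entry
```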

Regarding Claim 6, the rejection of Claim 1 is incorporated and further the combination of Deming, Gschwind, Grossman and Kacevas discloses: wherein: a hardware element detects the page fault and reports the page fault in a memory queue (see Figs. 1, 3-4, [0020] and [0097]-[0098] from Deming; “many graphics processing units (GPUs) are designed to perform parallel operations and computations and, thus, are considered to be a class of parallel processing unit (PPU)” and “The fault detector 450 processes the memory access fault” and “the fault detector 450 causes a fault buffer entry to be written to the fault buffer 216 of FIG. 2”. The fault detector 450 resides in the replay unit 350, which is part of the PPU 202, and thus the PPU/GPU, i.e., the claimed hardware element, detects the page fault and reports the page fault to a memory buffer/queue. Also see “The system memory 104 stores a fault buffer 216, which includes entries written by the PPU 202 in order to inform the CPU 102 of a page fault generated by the PPU 202” from [0031] of Deming. Note: based on “any virtual memory transactions from the SM 310(0) that are queued in the in-flight buffer 440” from [0098] of Deming and “re-queues the virtual memory transaction in the replay buffer 460” from [0100] of Deming (emphasis added), it is reasonable to consider a memory buffer of Deming as a memory queue).

Regarding Claim 8, Claim 8 is a method claim corresponding to system Claim 1 and is rejected for the same reason set forth in the rejection of Claim 1 above.

Regarding Claim 10, the rejection of Claim 8 is incorporated and further Claim 10 is a method claim corresponding to system Claim 3 and is rejected for the same reason set forth in the rejection of Claim 3 above.

Regarding Claim 13, the rejection of Claim 8 is incorporated and further Claim 13 is a method claim corresponding to system Claim 6 and is rejected for the same reason set forth in the rejection of Claim 6 above.

Regarding Claim 15, Claim 15 is a product claim corresponding to system Claim 1 and is rejected for the same reason set forth in the rejection of Claim 1 above.

Regarding Claim 17, the rejection of Claim 15 is incorporated and further Claim 17 is a product claim corresponding to system Claim 3 and is rejected for the same reason set forth in the rejection of Claim 3 above.

Claims 4-5, 11-12 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Deming et al. (US PGPUB 20140281263 A1, hereafter Deming) in view of Gschwind et al. (US 20150347134 A1, hereafter Gschwind), Grossman (US PGPUB 20170004647 A1), Kacevas et al. (US PGPUB 20180047131 A1, hereafter Kacevas) and further in view of Lee et al. (US PGPUB 20190272217 A1, hereafter Lee).
Deming, Grossman, Kacevas and Lee were cited in the previous Office action.

Regarding Claim 4, the rejection of Claim 3 is incorporated and further the combination of Deming, Gschwind, Grossman and Kacevas discloses: at least one of the processing resources to: initiate execution of a new thread (see [0022], [0029] and [0093] from Deming; there are multiple threads being executed, and thus the system of Deming inherently includes the action of initiating execution of a new thread); and re-execute the transaction after the page fault is resolved (see step 512 of Fig. 5 from Deming; “the replay unit 350(0) waits for the CPU 102 to signal that one or more faults have been resolved via the replay signal. Upon receiving the replay signal, the replay unit 350(0) invalidates the uTLB 430 and re-executes the virtual memory transactions that are stored in the replay buffer 460”).

The combination of Deming, Gschwind, Grossman and Kacevas does not disclose: the at least one transaction in the set of transactions for execution is queued after the page fault is resolved.
However, Lee discloses: a fault handling mechanism comprising queuing a program command for execution after the fault is resolved (see [0009]; “a recovery operation to a program command corresponding to the program fail, re-queues a recovered program command in the first queue and resumes providing the queued commands from the first queue”, emphasis added).
It would have been obvious to one with ordinary skill in the art, before the effective filing date of the claimed invention, to modify the queuing of the faulting command/transaction before the fault is resolved in the combination of Deming, Gschwind, Grossman and Kacevas by including the queuing of the faulting command/transaction after the fault is resolved from Lee, and thus the new combination would teach the limitation missing from Deming, Gschwind, Grossman and Kacevas. Both Deming and Lee discuss re-queueing the faulting transaction or command for re-execution; Deming discusses a mechanism of queuing such transaction/command before resolving the issue/fault (see steps 508-510 of Fig. 5 and [0105] from Deming), whereas Lee discusses a mechanism of queueing such transaction/command after resolving the issue/fault (see [0009] from Lee). Substituting the fault handling mechanism of Deming with that of Lee would achieve the predictable result of queuing the faulting transaction/command after the corresponding issue/fault is resolved.
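
For illustration only, the difference in ordering discussed above (queuing the faulting transaction before the fault is resolved, as Deming is read, versus queuing it after the fault is resolved, as Lee is read) can be sketched as follows. This is a minimal sketch under stated assumptions; the function names, the replay_queue and the event log are hypothetical and are not taken from either reference.

```python
from collections import deque


def handle_fault_queue_first(txn, replay_queue, resolve, log):
    """Ordering attributed to Deming above: queue first, then resolve."""
    replay_queue.append(txn)
    log.append("queued")
    resolve(txn)
    log.append("resolved")


def handle_fault_resolve_first(txn, replay_queue, resolve, log):
    """Ordering attributed to Lee above: resolve first, then queue."""
    resolve(txn)
    log.append("resolved")
    replay_queue.append(txn)
    log.append("queued")
```

In both orderings the same transaction ends up in the queue for re-execution; only the point at which it is queued relative to fault resolution differs.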

Regarding Claim 5, the rejection of Claim 4 is incorporated and further the combination of Deming, Gschwind, Grossman, Kacevas and Lee discloses: at least one of the processing resources to: load a subsequent transaction for execution (see steps 506-502 of Fig. 5 and [0104] from Deming; “the method 500 returns to step 502”. The method returns to step 502 from 506 for receiving/loading the next or subsequent memory transaction of the thread for execution. Also see steps 514-516-502 of Fig. 5 and [0107] from Deming; “causes the SM 310(0) to resume issuing virtual memory transactions from the SM 310(0), and the method 500 returns to step 502”. The SM 310(0) begins to load/issue a next/subsequent transaction for execution).

Regarding Claim 11, the rejection of Claim 10 is incorporated and further Claim 11 is a method claim corresponding to system Claim 4 and is rejected for the same reason set forth in the rejection of Claim 4 above.

Regarding Claim 12, the rejection of Claim 11 is incorporated and further Claim 12 is a method claim corresponding to system Claim 5 and is rejected for the same reason set forth in the rejection of Claim 5 above.

Regarding Claim 18, the rejection of Claim 17 is incorporated and further Claim 18 is a product claim corresponding to system Claim 4 and is rejected for the same reason set forth in the rejection of Claim 4 above.

Regarding Claim 19, the rejection of Claim 18 is incorporated and further Claim 19 is a product claim corresponding to system Claim 5 and is rejected for the same reason set forth in the rejection of Claim 5 above.

Regarding Claim 20, the rejection of Claim 19 is incorporated and further Claim 20 is a product claim corresponding to system Claim 6 and is rejected for the same reason set forth in the rejection of Claim 6 above.

Response to Arguments
Applicant’s arguments, filed 6/27/2022, with respect to the rejections of Claims 1, 3-6, 8, 10-13, 15 and 17-20 under 35 U.S.C. 103 have been fully considered. New grounds of rejection were made based on the amended limitations.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Context Switch Definition discloses: storing the CPU’s state (i.e., the context) for that process somewhere in memory (see 4th paragraph).
Brannock et al. (US 20170286318 A1) discloses: store the current execution state referred to as the context (see [0003]).
Pu et al. (US 20160174289 A1) discloses: derive at least one context information from application execution state information (see [0042]).
Bratanov (US 9459984 B2) discloses: execution states of programs include instruction addresses and task contexts (see lines 39-41 of col. 1).
Calciu et al. (US 20150277967 A1) discloses: during an abort phase, clearing information recorded during the transaction execution and restarting from the address stored during the begin phase (see [0069]).
Wecker (US 20090064141 A1) discloses: operations of a transaction may be rolled back to an initial starting point in the event that the transaction is restarted or aborted (see [0011]).
Jayasena et al. (US 20140149677 A1) discloses: in response to a page fault, terminating execution of a wavefront without saving intermediate state (see Claim 8).
Peters et al. (US 20100251010 A1) discloses: terminating a first instance of a function without storing state information associated with the function (see [0103]).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZHI CHEN whose telephone number is (571)272-0805.  The examiner can normally be reached on Monday-Friday 9:30AM-5PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emerson Puente can be reached on (571)272-3652.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Zhi Chen/
Patent Examiner, AU2196

/EMERSON C PUENTE/Supervisory Patent Examiner, Art Unit 2196