DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 4-12, 17, and 20 have been amended.
Claims 2 and 13 have been cancelled.
Claims 21 and 22 have been added.
Claims 1, 3-12, and 14-22 have been examined.
The specification and drawing objections in the previous Office Action have been addressed and are withdrawn.
The § 112 rejections in the previous Office Action have been addressed and are withdrawn.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-6, 10, 15, 17, 21, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over US Publication No. 2008/0115113 by Codrescu et al. (hereinafter referred to as “Codrescu”) in view of US Publication No. 2017/0091076 by Gao et al. (hereinafter referred to as “Gao”). 
Regarding claim 1, Codrescu discloses:
a method of debugging a processor while the processor executes…a software application…the method comprising: inspecting…[an identifier] when the first one of the…[threads] is allocated for execution by a first one of the threads (Codrescu discloses, at ¶ [0027], selecting a thread to be executed. This discloses inspecting the thread identifier.); 
comparing the…identifier with a…break identifier held in debug hardware (Codrescu discloses, at ¶ [0049], determining whether the thread identifier matches a predetermined value (break identifier) in a register (debug hardware).); and 
raising an instruction exception event for the first one of the threads in response to the…identifier matching the…break identifier in the debug hardware (Codrescu discloses, at ¶ [0049], if there is a match, then the process goes into debug mode, which discloses raising an instruction exception event.).  
Codrescu does not explicitly disclose each vertex being assigned to a respective programming thread and that the aforementioned identifiers are vertex identifiers.
However, in the same field of endeavor (e.g., debugging) Gao discloses:
assigning vertices to processing nodes and vertex identifiers (Gao discloses, at ¶ [0045], scheduling vertices to run on processing nodes and, at ¶ [0048], vertex identifiers.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. See Gao, ¶ [0008].
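For illustration only, and not as a characterization of how either reference is actually implemented, the mapping above (an identifier inspected at allocation, compared with a break identifier held in a debug register, and an exception raised on a match) can be sketched as follows. The names DebugHardware, BreakException, and allocate are hypothetical and appear in neither Codrescu nor Gao.

```python
from dataclasses import dataclass
from typing import Optional


class BreakException(Exception):
    """Hypothetical stand-in for an instruction exception event."""


@dataclass
class DebugHardware:
    # Hypothetical register holding the identifier to break on.
    # None means that no breakpoint is armed.
    break_id: Optional[int] = None


def allocate(thread_id: int, work_id: int, debug: DebugHardware) -> str:
    """Inspect the identifier at allocation time and compare it with the break identifier."""
    if debug.break_id is not None and work_id == debug.break_id:
        # A match raises the exception for this thread only.
        raise BreakException(f"thread {thread_id}: identifier {work_id} matched")
    return f"thread {thread_id} runs {work_id}"
```

In this sketch the identifier may equally be read as a thread identifier or a vertex identifier; the comparison itself does not change.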

Regarding claim 3, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
writing the vertex break identifier into the debug hardware (Codrescu discloses, at ¶ [0049], the thread ID (vertex break identifier) in a register, which discloses writing the thread ID thereto.).

Regarding claim 4, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
wherein the processor is configured to execute a plurality of worker threads in each of a plurality of time slots in a repeating sequence of inter-leaved time slots, with a program state of each of the worker threads being stored in a respective context register set associated with each of the worker threads wherein a first one of the context register sets stores the vertex identifier (Codrescu discloses, at ¶ [0029], executing threads (worker threads) involving interleaving instructions from different threads, which discloses a repeating sequence of time slots. Codrescu also discloses, at ¶ [0061], storing state in registers (context register sets) assigned to the threads.).  

Regarding claim 5, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
wherein a supervisor thread executed on the processor manages allocation of…worker threads (Codrescu discloses, at ¶ [0034], a supervisor managing execution of the threads.).  
Codrescu does not explicitly disclose the managing includes allocating vertices to the threads. 
However, in the same field of endeavor (e.g., debugging) Gao discloses:
assigning vertices (Gao discloses, at ¶ [0045], scheduling vertices to run on processing nodes and, at ¶ [0048], vertex identifiers.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. See Gao, ¶ [0008].

Regarding claim 6, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
…the…break identifier comprises the…identifier (Codrescu discloses, at ¶ [0049], determining there is a match between the thread ID and a predetermined value.).  
Codrescu does not explicitly disclose wherein the software application is represented by a graph of interconnected vertices, and the identifiers are vertex identifiers.
However, in the same field of endeavor (e.g., debugging) Gao discloses:
wherein the software application is represented by a graph of interconnected vertices, and the identifiers are vertex identifiers (Gao discloses, at ¶ [0045] and Figure 3, an execution structure of interconnected vertices, and, at ¶ [0048], vertex identifiers.).
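It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. See Gao, ¶ [0008].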

Regarding claim 10, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
wherein the processor is configured to execute a plurality of worker threads in each of a plurality of time slots in a repeating sequence of inter-leaved time slots (Codrescu discloses, at ¶ [0029], executing threads (worker threads) involving interleaving instructions from different threads, which discloses a repeating sequence of time slots.);
wherein each worker thread executes instructions in a codelet assigned to it…the codelet executing to an exit state unless it is excepted (Codrescu discloses, at ¶ [0027], threads contain sets of instructions (codelets) that execute, which discloses to an exit state unless excepted. That is, generating results, as taught by ¶ [0028], is an exit state reached unless there is an exception.).  
Codrescu does not explicitly disclose codelets representing respective vertices.
However, in the same field of endeavor (e.g., debugging) Gao discloses:
assigning vertices to processing nodes (Gao discloses, at ¶ [0045], scheduling vertices to run on processing nodes.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. See Gao, ¶ [0008].

Regarding claim 15, Codrescu discloses:
a processor configured to execute …a software application…assigned to a respective programming thread of the processor, the processor comprising: storage circuitry configured to hold for a first one of the programming threads a… identifier… allocated for execution to the first one of the programming threads (Codrescu discloses, at ¶ [0027], a processor executing instructions (a software application) involving assigning the instructions to selected threads to be executed. As disclosed at ¶ [0058], the threads have thread IDs stored in thread identifier registers (storage).); 
circuitry for allocating the first one of the programming threads to execution circuitry for execution (Codrescu discloses, at ¶ [0027], selecting a thread to be executed using issue logic circuitry.); 
debug hardware configured to hold a…break identifier; and a debug component configured to compare the…identifier with the…break identifier, and to raise an instruction exception event for the first one of the programming threads responsive to the…identifier matching the…break identifier in the debug hardware (Codrescu discloses, at ¶ [0049], determining whether the thread identifier matches a predetermined value (break identifier) in a register (debug hardware) and, at ¶ [0049], if there is a match, then the process goes into debug mode, which discloses raising an instruction exception event.).  
Codrescu does not explicitly disclose vertices and vertex identifiers.  
However, in the same field of endeavor (e.g., debugging) Gao discloses:
assigning vertices to processing nodes and vertex identifiers (Gao discloses, at ¶ [0045], scheduling vertices to run on processing nodes and, at ¶ [0048], vertex identifiers.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. See Gao, ¶ [0008].

Regarding claim 17, Codrescu, as modified, discloses the elements of claim 15, as discussed above. Codrescu also discloses:
wherein the processor is configured to execute a plurality of worker threads in each of a plurality of time slots in a repeating sequence of inter-leaved time slots, with a program state of each of the worker threads being stored in a respective context register set associated with each of the worker threads wherein a first one of the context register sets stores the vertex identifier (Codrescu discloses, at ¶ [0029], executing threads (worker threads) involving interleaving instructions from different threads, which discloses a repeating sequence of time slots. Codrescu also discloses, at ¶ [0061], storing state in registers (context register sets) assigned to the threads.).  

Regarding claim 21, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
wherein another instruction exception is raised responsive to a match between an executing instruction address and an instruction break address in the debug hardware (Codrescu discloses, at ¶ [0049], entering debug mode (raising an instruction exception) in response to the program counter (instruction address) matching a predetermined value in a register.).

Regarding claim 22, Codrescu, as modified, discloses the elements of claim 21, as discussed above. Codrescu also discloses:
setting an enable bit which in one state enables raising of the instruction exception event in response to the vertex identifier matching the vertex break identifier, and in another state enables raising of another instruction exception event responsive to the match between the executing instruction address and the instruction break address for any vertex identifier assigned to the first one of the threads (Codrescu discloses, at ¶ [0056], an enable bit that enables the hardware breakpoint for any thread, or alternatively, for only specified matching threads. Codrescu also discloses, at ¶ [0049], entering debug mode (raising an instruction exception) in response to the program counter (instruction address) matching a predetermined value in a register.).  
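For illustration only, one possible reading of the two enable-bit states recited above can be sketched as follows. The function should_break and its parameters are hypothetical; the sketch is not taken from Codrescu and does not purport to describe its hardware.

```python
def should_break(id_match_enabled: bool, pc: int, break_addr: int,
                 ident: int, break_id: int) -> bool:
    """Decide whether to raise an exception under a two-state enable bit (hypothetical)."""
    if id_match_enabled:
        # One state: break when the identifier matches the break identifier.
        return ident == break_id
    # Other state: break on an instruction address match, whatever the identifier is.
    return pc == break_addr
```

With the bit set, should_break(True, 0x20, 0x20, 4, 5) is False because the identifiers differ; with the bit clear, the same address match causes a break for any identifier.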

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Codrescu in view of Gao in view of US Publication No. 2012/0144240 by Rentschler et al. (hereinafter referred to as “Rentschler”). 
Regarding claim 7, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
selecting a group of…[identifiers] and writing their identifiers in the debug hardware for each of multiple processing units in the processor (Codrescu discloses, at ¶ [0049], the thread ID in registers (debug hardware), which discloses writing the thread ID thereto.). 
Codrescu does not explicitly disclose vertices, that the identifiers are vertex identifiers, and that the aforementioned group is randomly selected. 
However, in the same field of endeavor (e.g., debugging) Gao discloses:
vertices and vertex identifiers (Gao discloses, at ¶ [0045], scheduling vertices to run on processing nodes and, at ¶ [0048], vertex identifiers.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. See Gao, ¶ [0008].
Also in the same field of endeavor (e.g., debugging) Rentschler discloses:
selecting randomly (Rentschler discloses, at ¶ [0031], debug triggering using a random event.).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism to include random triggers, as disclosed by Rentschler, in order to flexibly implement debug for complicated systems. See Rentschler, ¶ [0004].

Claims 8, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Codrescu in view of Gao in view of “An Architecture and Compiler for Scalable On-Chip Communication” by Liang et al. (hereinafter referred to as “Liang”) in view of the Examiner’s taking official notice. 
Regarding claim 8, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu does not explicitly disclose the processor comprises an arrangement of tiles and an interconnect for communicating between tiles, wherein each tile comprises an execution unit for executing machine code instructions, and the interconnect is operable to conduct communications between a group of some or all of the tiles according to a bulk synchronous parallel scheme, whereby each of the tiles in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all the tiles in the group have completed the compute phase.  
However, in the same field of endeavor (e.g., parallel computing) Liang discloses:
the processor comprises an arrangement of tiles and an interconnect for communicating between tiles, wherein each tile comprises an execution unit for executing machine code instructions, and the interconnect is operable to conduct communications between a group of some or all of the tiles…whereby each of the tiles in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all the tiles in the group have completed the compute phase (Liang discloses, at § I, a 2-D mesh of tiles having cores (execution unit) and an on chip interconnect for communicating between the tiles. As disclosed in § IIA, computation and communications are separately scheduled. As disclosed at § V, communication between cores is synchronized at specified communication points, i.e., once computation is completed.).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism to use the system shown by Liang in order to provide high performance by integrating large amounts of resources and ensure efficient communication between the resources. See Liang, § I. 
The references do not explicitly disclose that the communication is according to a bulk synchronous parallel (BSP) scheme. However, the Examiner takes official notice that BSP is a well-known communication scheme for parallel computing. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to use BSP because there must be some scheme for organizing communication between the computing components, there are a finite number of predictable potential solutions, and one of ordinary skill in the art could have pursued the known potential options with a reasonable expectation of success. Therefore, it would have been obvious to try to utilize BSP.
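For illustration only, the general structure of a bulk synchronous parallel superstep (a compute phase, a barrier, then an exchange phase that is held back until every participant has finished computing) can be sketched as follows. This is a generic sketch using threads as stand-ins for tiles; run_bsp and its variables are hypothetical and are drawn from neither Liang nor the application.

```python
import threading


def run_bsp(num_tiles: int) -> list:
    """Run one BSP superstep across num_tiles simulated tiles (hypothetical helper)."""
    local = [None] * num_tiles   # per-tile results of the compute phase
    inbox = [None] * num_tiles   # per-tile data received in the exchange phase
    barrier = threading.Barrier(num_tiles)

    def tile(i: int) -> None:
        # Compute phase: on-tile work only, no communication.
        local[i] = i * i
        # Barrier: the exchange is held back until every tile has finished computing.
        barrier.wait()
        # Exchange phase: read the neighbouring tile's result, which is now complete.
        inbox[i] = local[(i - 1) % num_tiles]

    threads = [threading.Thread(target=tile, args=(i,)) for i in range(num_tiles)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return inbox
```

Without the barrier, a tile could read a neighbour's result before that neighbour had computed it; the barrier is what separates the two phases.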

Regarding claim 9, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
wherein the exchange phase is arranged to be performed by the supervisor thread (Codrescu discloses, at ¶ [0034], a supervisor managing execution of the threads.).
Codrescu does not explicitly disclose the processor comprises an arrangement of tiles and an interconnect for communicating between tiles; wherein each tile comprises an execution unit for executing machine code instructions, and the interconnect is operable to conduct communications between a group of some or all of the tiles according to a bulk synchronous parallel scheme, whereby each of the tiles in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all the tiles in the group have completed the compute phase.  
However, in the same field of endeavor (e.g., parallel computing) Liang discloses:
the processor comprises an arrangement of tiles and an interconnect for communicating between tiles; wherein each tile comprises an execution unit for executing machine code instructions, and the interconnect is operable to conduct communications between a group of some or all of the tiles…whereby each of the tiles in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all the tiles in the group have completed the compute phase…. (Liang discloses, at § I, a 2-D mesh of tiles having cores (execution unit) and an on chip interconnect for communicating between the tiles. As disclosed in § IIA, computation and communications are separately scheduled. As disclosed at § V, communication between cores is synchronized at specified communication points, i.e., once computation is completed.).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism to use the system shown by Liang in order to provide high performance by integrating large amounts of resources and ensure efficient communication between the resources. See Liang, § I. 
The references do not explicitly disclose that the communication is according to a bulk synchronous parallel (BSP) scheme. However, the Examiner takes official notice that BSP is a well-known communication scheme for parallel computing. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to use BSP because there must be some scheme for organizing communication between the computing components, there are a finite number of predictable potential solutions, and one of ordinary skill in the art could have pursued the known potential options with a reasonable expectation of success. Therefore, it would have been obvious to try to utilize BSP.

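Regarding claim 16, Codrescu, as modified, discloses the elements of claim 15, as discussed above. Codrescu does not explicitly disclose the processor comprising an arrangement of tiles and an interconnect for communicating between tiles, wherein each tile comprises an execution unit for executing machine code instructions, and the interconnect is operable to conduct communications between a group of some or all of the tiles according to a bulk synchronous parallel scheme, whereby each of the tiles in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all the tiles in the group have completed the compute phase.  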
However, in the same field of endeavor (e.g., parallel computing) Liang discloses:
the processor comprising an arrangement of tiles and an interconnect for communicating between tiles, wherein each tile comprises an execution unit for executing machine code instructions, and the interconnect is operable to conduct communications between a group of some or all of the tiles…whereby each of the tiles in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all the tiles in the group have completed the compute phase (Liang discloses, at § I, a 2-D mesh of tiles having cores (execution unit) and an on chip interconnect for communicating between the tiles. As disclosed in § IIA, computation and communications are separately scheduled. As disclosed at § V, communication between cores is synchronized at specified communication points, i.e., once computation is completed.).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism to use the system shown by Liang in order to provide high performance by integrating large amounts of resources and ensure efficient communication between the resources. See Liang, § I. 
The references do not explicitly disclose that the communication is according to a bulk synchronous parallel (BSP) scheme. However, the Examiner takes official notice that BSP is a well-known communication scheme for parallel computing. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to use BSP because there must be some scheme for organizing communication between the computing components, there are a finite number of predictable potential solutions, and one of ordinary skill in the art could have pursued the known potential options with a reasonable expectation of success. Therefore, it would have been obvious to try to utilize BSP.

Claims 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Codrescu in view of Gao in view of Liang in view of US Patent No. 7,100,021 by Marshall et al. (hereinafter referred to as “Marshall”). 
Regarding claim 11, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu also discloses:
wherein the processor is configured to execute a plurality of worker threads in each of a plurality of time slots in a repeating sequence of inter-leaved time slots (Codrescu discloses, at ¶ [0029], executing threads (worker threads) involving interleaving instructions from different threads, which discloses a repeating sequence of time slots.);
wherein each worker thread executes instructions in a codelet assigned to it … the codelet executing to an exit state unless it is excepted, and wherein a supervisor thread executes… (Codrescu discloses, at ¶ [0027], threads contain sets of instructions (codelets) that execute, which discloses to an exit state unless excepted. That is, generating results, as taught by ¶ [0028], is an exit state reached unless there is an exception. Codrescu also discloses, at ¶ [0034], a supervisor managing execution of the threads.).  
Codrescu does not explicitly disclose codelets representing respective vertices, a synchronisation instruction, and wherein a tile is configured to wait for all of the worker threads to reach their respective exit points, and then to execute a request for synchronisation, whereby the tile is paused until a synchronisation acknowledgement signal is received.
However, in the same field of endeavor (e.g., debugging) Gao discloses:
assigning vertices to processing nodes (Gao discloses, at ¶ [0045], scheduling vertices to run on processing nodes.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. See Gao, ¶ [0008]. 
Also in the same field of endeavor (e.g., parallel computing) Liang discloses:
a synchronisation instruction (Liang discloses, at § V, communication between cores is synchronized at specified communication points using blocking points, which discloses synchronization instructions.).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism to use the system shown by Liang in order to provide high performance by integrating large amounts of resources and ensure efficient communication between the resources. See Liang, § I. 
Also in the same field of endeavor (e.g., communications) Marshall discloses:
a synchronisation acknowledgement signal (Marshall discloses, at col. 11, lines 55-58, an acknowledge signal.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism to include an acknowledge signal to prevent unnecessary exceptions. See Marshall, col. 11, lines 50-51.

Regarding claim 12, Codrescu, as modified, discloses the elements of claim 11, as discussed above. Codrescu also discloses:
debugging a first worker thread which has raised an exception event, while the other worker threads on the tile continue to execute to their respective exits (Codrescu discloses, at ¶ [0073], transitioning selected threads to debug mode, while allowing other threads to continue.).

Claims 14 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Codrescu in view of Gao in view of US Publication No. 2018/0307985 by Appu et al. (hereinafter referred to as “Appu”).
Regarding claim 14, Codrescu, as modified, discloses the elements of claim 1, as discussed above. Codrescu does not explicitly disclose wherein the software application is a machine learning application.  
However, in the same field of endeavor (e.g., synchronization) Appu discloses:
wherein the software application is a machine learning application (Appu discloses, at ¶ [0158], a machine learning application.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to utilize Codrescu’s debugging mechanism in a machine learning context, as disclosed by Appu, in order to provide barriers across processors.

Regarding claim 18, Codrescu discloses:
a method comprising… a plurality of programming threads (Codrescu discloses, at ¶ [0027], a processor executing instructions involving assigning the instructions to selected threads to be executed.);  
…matching a first one of the… [identifiers] to a … break identifier stored in debug hardware on a processor executing the…application; and raising an instruction exception in a first one of the programming threads in response to the matching (Codrescu discloses, at ¶ [0049], determining whether the thread identifier matches a predetermined value (break identifier) in a register (debug hardware) and, at ¶ [0049], if there is a match, then the process goes into debug mode, which discloses raising an instruction exception event.).   
Codrescu does not explicitly disclose vertices and allocating a plurality of vertices of a graph of a machine learning application.
However, in the same field of endeavor (e.g., debugging) Gao discloses:
assigning vertices to processing nodes and vertex identifiers (Gao discloses, at ¶ [0045], scheduling vertices to run on processing nodes and, at ¶ [0048], vertex identifiers.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Codrescu’s debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. See Gao, ¶ [0008].
Also in the same field of endeavor (e.g., synchronization) Appu discloses:
wherein the software application is a machine learning application (Appu discloses, at ¶ [0158], a machine learning application.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to utilize Codrescu’s debugging mechanism in a machine learning context, as disclosed by Appu, in order to provide barriers across processors.


Regarding claim 19, Codrescu, as modified, discloses the elements of claim 18, as discussed above. Codrescu also discloses:
setting an enable bit which in one state enables raising of another instruction exception responsive to a match between an executing instruction address and an instruction break address and a match between a vertex identifier of the first one of the vertices matching the vertex break identifier, and in another state enables the raising of the other instruction exception responsive to a match between an executing instruction address and an instruction break address for any vertex identifier assigned to the first one of the threads (Codrescu discloses, at ¶ [0056], an enable bit that enables the hardware breakpoint for any thread, or alternatively, for only specified matching threads. Codrescu also discloses, at ¶ [0049], entering debug mode (raising an instruction exception) in response to the program counter (instruction address) matching a predetermined value in a register.).  

Regarding claim 20, Codrescu, as modified, discloses the elements of claim 18, as discussed above. Codrescu also discloses:
executing a plurality of worker threads in each of a plurality of time slots in a repeating sequence of inter-leaved time slots, with a program state of each of the worker threads being stored in a plurality of respective context register sets associated with each of the worker threads, and wherein a first one of the context register sets stores a vertex identifier of the first one of the vertices (Codrescu discloses, at ¶ [0029], executing threads (worker threads) involving interleaving instructions from different threads, which discloses a repeating sequence of time slots. Codrescu also discloses, at ¶ [0061], storing state in registers (context register sets) assigned to the threads.).  

Response to Arguments
On pages 11-12 of the response filed February 16, 2021 (“response”), the Applicant argues that the combination of Codrescu and Gao does not disclose, “each vertex being assigned to a respective programming thread.” In support of this position, the Applicant argues, “Even if, for the sake of argument, Gao discloses assigning vertices to processing nodes, that does not teach or suggest, "each vertex being assigned to a respective programming thread." It appears that the rejection confuses Gao's processing nodes for threads, but they are not the same. Rather, Gao discloses that the programming nodes may be network nodes, which are certainly not the same thing as threads. E.g., Gao, 0002. A person of ordinary skill in the art would not have taken Gao's assignment of vertices to processing nodes and found any suggestion to apply assignment of vertices to Codrescu's threads. Therefore, claim 1 is not obvious over the combination of Codrescu and Gao.” The Applicant also argues, “The rejection does not explain how a person of ordinary skill in the art with Gao's processing nodes would have come to think that Gao's actions would also apply to Codrescu's threads. In doing so, the rejection fails to explain why a person of ordinary skill in the art would have been motivated to combine Codrescu and Gao in the way that the claimed new invention does.”
Though fully considered, the Examiner respectfully disagrees. Codrescu discloses assigning sets of instructions to threads, which enables simultaneous execution of the sets of instructions. See, e.g., Codrescu, ¶ [0027]-[0028]. The only difference between Codrescu’s teachings and the limitation in question is that Codrescu does not explicitly refer to the sets of instructions as vertices. However, Gao explicitly discloses vertices, which are merely segments of code, i.e., sets of instructions. See, e.g., Gao, ¶ [0003]. These sets of instructions are also referred to as codelets by the Applicant. See Specification as filed, p. 3. Therefore, the combination of Codrescu and Gao primarily involves, regarding the limitation in question, the substitution of Gao’s term “vertex” for Codrescu’s “set of instructions.” Based on the current breadth of the claim language, the two terms are synonymous, and the combination merely involves simple substitution of one term for another. 
Gao also discloses that distributing vertices across multiple processing resources, i.e., nodes in Gao’s case, improves simultaneous execution of the vertices. See Gao, ¶ [0003]. However, Gao recognizes that the distribution of vertices and resulting simultaneous processing present challenges with 
Therefore, the fact that Gao discloses assigning vertices to nodes, rather than threads, is not dispositive, as it is Codrescu which is relied upon for disclosing the assignment of instructions to threads. Accordingly, the Applicant’s arguments are deemed unpersuasive.  

On pages 12-13 of the response the Applicant argues that neither Codrescu nor Gao discloses, “comparing a vertex identifier with the vertex break identifier held in debug hardware.” In support of this position, the Applicant argues that in Gao, “vertex identifiers are not compared with any identifier held in debug hardware.” The Applicant also argues, “The rejection states that it would have been obvious to modify Codrescu debugging mechanism, which includes per thread breakpoints, to include vertex identifiers, as disclosed by Gao, in order to enable debugging of large execution graphs. However, this is no more than a statement of what the inventors have invented and can only be made with the application of hindsight. Gao does not use breakpoints of any kind and avoids the need for such a debugging mechanism by identifying processing nodes with failed vertex code and extracting that failed vertex code onto a separate debugger.”
Though the arguments have been fully considered, the Examiner respectfully disagrees. Codrescu discloses comparing a thread identifier with an identifier held in a debug register. See Codrescu, ¶ [0049]. Therefore, whether or not Gao discloses comparing identifiers and using breakpoints is not dispositive, as it is Codrescu that is cited for these features, rather than Gao. The only difference between the teachings of Codrescu and the limitation in question is that Codrescu does not explicitly recite that the identifiers are vertex identifiers.
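For illustration only, the comparison discussed above can be sketched as follows. The sketch is not taken from Codrescu, Gao, or the Specification; the names DebugRegister, should_break, and allocate are hypothetical. It assumes that an identifier is inspected at the time of allocation and is compared against a single break identifier held in a register. On that assumption, the comparison logic is the same whether the identifier labels a thread or a vertex.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DebugRegister:
    """Hypothetical debug register holding one break identifier."""
    break_id: Optional[int] = None  # None means no breakpoint is armed


def should_break(allocated_id: int, reg: DebugRegister) -> bool:
    """Compare the identifier being allocated with the held break identifier."""
    return reg.break_id is not None and allocated_id == reg.break_id


def allocate(ids: List[int], reg: DebugRegister) -> Tuple[List[int], Optional[int]]:
    """Inspect identifiers in allocation order and stop at the first match.

    Returns the identifiers executed before the break and the identifier
    that triggered the break (None if no match occurred).
    """
    executed: List[int] = []
    for current in ids:
        if should_break(current, reg):
            return executed, current  # match: raise the debug event here
        executed.append(current)
    return executed, None
```

In this sketch, allocate([3, 7, 9], DebugRegister(break_id=7)) stops before identifier 7 is executed, while an unarmed register lets every identifier run.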


On page 13 of the response the Applicant argues that the remaining claims are not obvious over the combination of references for the same or similar reasons. 
Though the arguments have been fully considered, the Examiner respectfully disagrees. The reasons set forth in the remarks and rejections presented above, including those regarding the independent claims, are equally applicable to the remaining claims.

Conclusion
THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee J. Li, can be reached at 571-272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SHAWN DOMAN/
Examiner, Art Unit 2183