Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner’s Amendment
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Applicant agreed to the examiner’s amendment, which was authorized by Chris Pattillo (Reg. No. 76,601) on 06/02/22.

AMENDMENTS TO THE CLAIMS 

1.	(Currently Amended) A failure management method for responding to a failed or failing node in a plurality of nodes performing a first task comprising a plurality of calculations, the plurality of nodes forming a network, each node in the plurality of nodes performing a calculation of the plurality of calculations, each node in the plurality of nodes being connected to a respective storage medium via a link that is redirectable to connect the respective storage medium to a different node in the plurality of nodes, and each node being configured to locally back up its state to its respective storage medium where the local backup of the state of each node is coherent with one another so each local backup corresponds to a same state of the first task, the method comprising:
redirecting a redirectable link between a failed or failing first node in the plurality of nodes and a first storage medium to connect the first storage medium to an operational second node in the plurality of nodes, wherein the failed or failing first node was performing a first calculation of the plurality of calculations;
retrieving the local backup of the state of the failed or failing first node to the operational second node; 
adding a new third node to the plurality of nodes, the new third node not performing any calculation of the first task before failure of the failed or failing first node, the new third node being added to the plurality of nodes only after failure of the failed or failing first node to then relaunch the first 
transmitting the local backup of the state of the failed or failing first node to the third storage medium of the new third node; and
storing the transmitted local backup of the state of the failed or failing first node on the third storage medium of the new third node,
wherein, for at least one of the plurality of nodes performing said first task, the storage media are flash memories, and
wherein said first task is a distributed application running on said plurality of nodes, said plurality of nodes comprising at least 1000 compute nodes.

2.	(Previously Presented) The failure management method according to claim 1, wherein each of the plurality of nodes is within a first computing blade, and
wherein all local backups of the plurality of nodes within the first computing blade are transmitted to new nodes within a new computing blade, the new nodes of the new computing blade being added to the network upon failure of the failed or failing first node. 

3.	(Currently Amended) The failure management method according to claim 1, further comprising relaunching said first calculation from the local backups during the first task being performed during which the failed or failing first node failed.

4.	(Previously Presented) The failure management method according to claim 3, further comprising during the relaunching step:
relaunching nodes that are not failing and do not belong to a first computing blade that includes the failed or failing first computing node, the relaunching being carried out from the local backup of the state of the nodes that are not failing and do not belong to said first computing blade. 

5.	(Previously Presented) The failure management method according to claim 3, wherein after the relaunching step, the nodes are synchronized with one another to relaunch the nodes in the same state of said first task.

6.	(Cancelled)

7.	(Previously Presented) The failure management method according to claim 1, wherein a switch connects the plurality of nodes to the storage media of each of the nodes, and the link between each storage medium comprises the switch, and the redirection of the link between each node and the storage medium of each node so as to connect said storage medium of each node to said another one of the nodes, is performed by a switch changing of the switch connecting the nodes to the storage media of each of the nodes.

8.	(Previously Presented) The failure management method according to claim 1, wherein the retrieving step changes the attachment of the storage medium of the local backup of the state of the failed or failing first node via a switch to which is attached to the failed or failing first node and the storage medium of the failed or failing first node, the switch changing the attachment without passing through the failed or failing first node itself.

9.	(Previously Presented) The failure management method according to claim 8, wherein the change of attachment is achieved by sending a command to the switch, the command passing through one of plurality of nodes attached to the switch by a management port.

10.	(Previously Presented) The failure management method according to claim 7, wherein the switch is a PCIe switch.

11.	(Previously Presented) The failure management method according to claim 7, wherein 3 to 10 of the plurality of nodes are attached to the switch.

12.	(Previously Presented) The failure management method according to claim 1, further comprising, for all the nodes of the network performing said first task, including when none of the nodes performing said first task are failing, a global backup step for all the nodes, the global backup step being performed less often than any local backup steps for the nodes of the network performing said first task.

13.	(Cancelled)

14.	(Currently Amended) The failure management method according to claim 1

15.	(Previously Presented) The failure management method according to claim 2, wherein the failed or failing first node was performing a first calculation, and wherein the method further comprises:
after storing the transmitted local backup, a relaunching step to relaunch said first calculation from the local backups during the first task being performed during which the failed or failing first node failed. 

16-20.	(Cancelled)

21.	(Currently Amended) A failure management method for responding to a failed or failing node in a plurality of nodes performing a first task, the plurality of nodes forming a network, the first task comprising a plurality of calculations, comprising:
storing a local backup of the state of each node of the plurality of nodes to a different respective storage medium linked, via different respective links, to each node, each link between a storage medium and a node being redirectable to another node;
retrieving the local backup of the state of a first failed or failing node by redirecting said link between the first failed node and the storage medium of the first failed or failing node to connect the storage medium of the first failed or failing node to an operational node of the plurality of nodes, wherein the first failed or failing node was performing a first calculation of the plurality of calculations; 
transmitting the local backup of the state of the first failed or failing node to the storage medium of said operational node via the redirected link so that the storage medium of the operational node stores the local backup of the state of the first failed or failing node, the operational node already having performed a calculation of said plurality of calculations, 
wherein, the plurality of nodes have performed the first task together so that the local backup of the state of each node used in the retrieving step are coherent with one another so each local backup corresponds to a same state of said first task; and
after said first failed or failing node fails, adding a new node to the plurality of nodes, the new node not performing any calculation of the first task before failure of the first failed or failing 
wherein, for at least one of the plurality of nodes performing said first task, the storage media are flash memories, and
wherein said first task is a distributed application running on said plurality of nodes, said plurality of nodes comprising at least 1000 compute nodes. 

22.	(Previously Presented) The failure management method according to claim 1, wherein all nodes of said plurality of nodes being synchronized together so that the loss of a single node of said plurality of nodes because its local backup can no longer be recovered is followed by the loss of all calculation steps of all nodes of said plurality of nodes.

23.	(Cancelled) 

24.	(Previously Presented) A failure management method for responding to a failed or failing node in a plurality of computing nodes performing a first task, the plurality of computing nodes forming a network, the first task is a distributed application comprising a plurality of calculations, each computing node in the plurality of computing nodes performing a calculation of the plurality of calculations, each computing node in the plurality of nodes being allocated by a first resource manager and being operably connected to a respective storage medium via a link that is redirectable to connect the respective storage medium to a different computing node in the plurality of nodes, and each computing node being configured to locally back up its state to its respective storage medium where the local backup of the state of each computing node is coherent with one another so each local backup corresponds to a same state of the first task, the method comprising: 
determining, by the first resource manager, that a first computing node has failed or is failing;
redirecting a redirectable link between the failed or failing first computing node in the plurality of nodes and a first storage medium to connect the first storage medium to an operational second computing node in the plurality of nodes, wherein the failed or failing first computing node was performing a first calculation of the plurality of calculations;
retrieving the local backup of the state of the failed or failing first node to the operational second computing node;
adding a new third computing node to the plurality of computing nodes, the new third computing node not performing any calculation of the first task before failure of the failed or failing first node, the new third computing node being added to the plurality of computing nodes only after failure of the failed or failing first node to then relaunch not yet achieved calculation of the failed or failing first computing node within said first task, the new third computing node connected to a third storage medium via a redirectable link;
transmitting the local backup of the state of the failed or failing first computing node to the third storage medium; and
storing the transmitted local backup of the state of the failed or failing first computing node on the third storage medium,
wherein, for at least one of the plurality of nodes performing said first task, the storage media are flash memories, and
wherein said first task is a distributed application running on said plurality of nodes, said plurality of nodes comprising at least 1000 compute nodes.

25. 	(Cancelled)


Claims 1-5, 7-12, 14-15, 21-22, and 24 are allowed.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sulaiman Nooristany whose telephone number is (571) 270-1929. The examiner can normally be reached on M-F from 9 to 5. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Jeffrey Rutkowski, can be reached on (571) 270-1215. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/SULAIMAN NOORISTANY/
Primary Examiner, Art Unit 2415