Patent Number: 7,765,385

Title: Fault recovery on a parallel computer system with a torus network

Abstract: An apparatus and method for overcoming a torus network failure in a parallel computer system. A mesh routing mechanism in the service node of the computer system reconfigures the nodes from a torus network to a mesh network when a failure occurs in the torus network. The mesh routing mechanism uses cutoff registers in each node to route node-to-node data transfers around the faulty node or network connection.
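
The torus-to-mesh idea in the abstract can be illustrated with a minimal sketch for a single ring dimension. This is a simplified illustration under stated assumptions, not the patented mechanism: the record does not describe how the cutoff registers are actually encoded, so the function names (`route_direction`, `path`) and the `broken_after` parameter are hypothetical. The sketch treats the ring as a line whose two ends sit on either side of the failed link, so no route ever uses the wraparound across that link.

```python
def route_direction(src, dst, size, broken_after):
    """Pick a travel direction along one ring dimension that avoids a failed link.

    Assumption (hypothetical model): the failed link joins node `broken_after`
    and node `(broken_after + 1) % size`. Returns +1, -1, or 0 (already there).
    """
    if src == dst:
        return 0
    # Relabel nodes so the node just past the failed link is position 0.
    # The ring then behaves like a line 0..size-1 with no wraparound.
    start = (broken_after + 1) % size
    src_pos = (src - start) % size
    dst_pos = (dst - start) % size
    return 1 if dst_pos > src_pos else -1


def path(src, dst, size, broken_after):
    """List the nodes visited from src to dst without crossing the failed link."""
    step = route_direction(src, dst, size, broken_after)
    nodes = [src]
    cur = src
    while cur != dst:
        cur = (cur + step) % size
        nodes.append(cur)
    return nodes


if __name__ == "__main__":
    # Ring of 8 nodes with the link between nodes 3 and 4 failed.
    print(path(2, 5, 8, 3))  # [2, 1, 0, 7, 6, 5]
    print(path(5, 7, 8, 3))  # [5, 6, 7]
```

In the first call the shorter route 2, 3, 4, 5 would cross the failed link, so the sketch sends the transfer the long way around. A real torus would apply the same choice independently in each dimension.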

Inventors: Darrington; David L. (Rochester, MN), McCarthy; Patrick Joseph (Rochester, MN), Peters; Amanda (Rochester, MN), Sidelnik; Albert (St. Paul, MN)

Assignee: International Business Machines Corporation

International Classification: G06F 9/00 (20060101)

Expiration Date: 7/27/12018