Patent Number: 7,134,052

Title: Autonomic recovery from hardware errors in an input/output fabric

Abstract: An apparatus, program product and method propagate errors detected in an IO fabric element from an IO fabric that is used to couple a plurality of endpoint IO resources to processing elements in a computer. In particular, such errors are propagated to the endpoint IO resources affected by the IO fabric element in connection with recovering from the errors in the IO fabric element. By doing so, a device driver or other program code used to access each affected IO resource may be permitted to asynchronously recover from the propagated error in its associated IO resource, and often without requiring the recovery from the error in the IO fabric element to wait for recovery to be completed for each of the affected IO resources. In addition, an IO fabric may be dynamically configured to support both recoverable and non-recoverable endpoint IO resources. In particular, IO fabric elements within an IO fabric may be dynamically configured to enable machine check signaling in such IO fabric elements in response to detection that an endpoint IO resource is non-recoverable in nature. The IO fabric elements that are dynamically configured as such are disposed within a hardware path that is defined between the non-recoverable resource and a processor that accesses the non-recoverable resource.
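
Illustrative sketch (not part of the patent record): the two behaviors the abstract describes can be modeled, under simplifying assumptions, as operations on a tree of fabric elements. Every name below (FabricElement, Endpoint, attach_endpoint, recover_fabric_error, driver_recover) is hypothetical and chosen only for illustration; the patent does not define this interface, and real hardware would do this through registers and firmware rather than Python objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Endpoint:
    """Hypothetical endpoint IO resource (for example, an adapter slot)."""
    name: str
    recoverable: bool
    in_error: bool = False


@dataclass
class FabricElement:
    """Hypothetical IO fabric element (for example, a hub or bridge)."""
    name: str
    parent: Optional["FabricElement"] = field(
        default=None, repr=False, compare=False
    )
    children: List["FabricElement"] = field(default_factory=list)
    endpoints: List[Endpoint] = field(default_factory=list)
    machine_check_enabled: bool = False
    in_error: bool = False

    def add_child(self, child: "FabricElement") -> "FabricElement":
        child.parent = self
        self.children.append(child)
        return child


def attach_endpoint(element: FabricElement, endpoint: Endpoint) -> None:
    """Attach an endpoint; if it is non-recoverable, enable machine check
    signaling in every element on the path up toward the processor."""
    element.endpoints.append(endpoint)
    if not endpoint.recoverable:
        node = element
        while node is not None:
            node.machine_check_enabled = True
            node = node.parent


def affected_endpoints(element: FabricElement) -> List[Endpoint]:
    """Collect every endpoint reachable beneath a fabric element."""
    found = list(element.endpoints)
    for child in element.children:
        found.extend(affected_endpoints(child))
    return found


def recover_fabric_error(element: FabricElement) -> List[Endpoint]:
    """Propagate an error to the affected endpoints, then clear the fabric
    element without waiting for the endpoints to recover."""
    element.in_error = True
    affected = affected_endpoints(element)
    for endpoint in affected:
        endpoint.in_error = True  # each driver sees the error on next access
    element.in_error = False      # fabric recovery completes independently
    return affected


def driver_recover(endpoint: Endpoint) -> None:
    """Stand-in for a device driver recovering its own resource later."""
    endpoint.in_error = False
```

In this model the fabric element returns to service as soon as the error has been pushed down to its endpoints, and each driver calls driver_recover on its own schedule, which is the asynchronous recovery the abstract refers to.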

Inventors: Bailey; David Alan (Kasson, MN), Nguyen; Trung Ngoc (Rochester, MN), Nordstrom; Gregory Michael (Pine Island, MN), Patel; Kanisha (Cedar Park, TX), Thurber; Steven Mark (Austin, TX)

Assignee: International Business Machines Corporation

International Classification: G06F 11/00 (20060101)

Expiration Date: 2019-11-07