

Consistent State Restoration in Distributed Systems

NU author(s): Professor Brian Randell



This paper concerns an important aspect of the problem of designing fault-tolerant distributed computing systems. The concepts involved in "backward error recovery", i.e. restoring a system, or some part of a system, to a previous state which it is hoped or believed preceded the occurrence of any existing errors, are formalised and generalised so as to apply to concurrent, e.g. distributed, systems. The formalisation is based on the use of what we term "Occurrence Graphs" to represent the cause-effect relationships that exist between the events that occur when a system is operational, and to indicate existing possibilities for state restoration. A protocol is presented which could be used in each of the nodes in a distributed computing system in order to provide system recoverability even in the face of multiple faults. This presentation includes a proof of the protocol's correctness.
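As a rough illustration of the idea of recording cause-effect relationships between events, the following minimal Python sketch models a graph of events and works out which events would have to be discarded if one of them were found to be erroneous. It is not taken from the report: the class name `OccurrenceGraph`, its methods, and the event labels are all hypothetical, and the simple "discard every transitive effect" rule is an assumption made for illustration, not the protocol the report presents.

```python
from collections import defaultdict, deque


class OccurrenceGraph:
    """Hypothetical cause-effect graph over events (illustrative only)."""

    def __init__(self):
        self.effects = defaultdict(set)  # cause -> set of direct effects
        self.events = set()

    def add_cause(self, cause, effect):
        """Record that `cause` contributed to `effect`."""
        self.events.update((cause, effect))
        self.effects[cause].add(effect)

    def affected_by(self, faulty):
        """Return `faulty` plus every event reachable from it (breadth-first)."""
        seen = {faulty}
        queue = deque([faulty])
        while queue:
            event = queue.popleft()
            for nxt in self.effects.get(event, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def surviving_events(self, faulty):
        """Events that do not depend on `faulty`; assumed safe to keep."""
        return self.events - self.affected_by(faulty)


# Two hypothetical nodes, A and B; event a2 sends a message consumed by b1.
g = OccurrenceGraph()
g.add_cause("a1", "a2")
g.add_cause("a2", "a3")
g.add_cause("a2", "b1")  # message from node A to node B
g.add_cause("b0", "b1")
g.add_cause("b1", "b2")

print(sorted(g.affected_by("a2")))
print(sorted(g.surviving_events("a2")))
```

In this sketch, an error at `a2` on one node also invalidates `b1` and `b2` on the other node, because the message carried the effect across; only `a1` and `b0` precede the error on every causal path.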

Publication metadata

Author(s): Merlin PM, Randell B

Publication type: Report

Publication status: Published

Series Title: Computing Laboratory Technical Report Series

Year: 1977

Pages: 46

Report Number: 113

Institution: Computing Laboratory, University of Newcastle upon Tyne

Place Published: Newcastle upon Tyne