Lookup NU author(s): Dr Paul Ezhilchelvan
Atomic Broadcast (where all processes deliver broadcast messages in the same order) is a very useful group communication primitive for building fault-tolerant distributed systems. This paper presents an atomic broadcast protocol that can be claimed to be optimal in terms of failure detection, resilience, and latency. The protocol requires only the weakest of the useful failure detectors for liveness and permits up to (n-1)/2 processes to crash in a system of n processes; at most two communication steps and n broadcasts are needed in a run during which neither process crashes nor failure suspicions occur. We also introduce the notion of Notifying Broadcast, which can reduce the message overhead further in 'nice' runs, in which all processes are operational and communication delays do not exceed the assumed bound. If nice runs persist, the average message overhead is just one broadcast. That is, the protocol incurs no message overhead for providing crash tolerance if process failures and unanticipated fluctuations in communication delays do not occur. We are currently implementing our protocol as a CORBA component. All known ORBs use IIOP as the standard protocol for inter-process communication, which in turn uses TCP/IP as the common transport protocol. It turns out that Notifying Broadcast is straightforward to implement on top of the TCP transport layer.
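The resilience bound stated in the abstract can be illustrated with a short calculation. The following is a minimal sketch and not part of the paper: the function names are hypothetical, and it assumes that (n-1)/2 is read as integer (floor) division, as is usual for crash-tolerance bounds of this kind. The majority check is likewise an illustrative assumption, not a description of the protocol itself.

```python
def max_tolerated_crashes(n: int) -> int:
    """Largest number of crashed processes permitted in a system of n
    processes, reading the abstract's (n-1)/2 as floor division (assumption)."""
    if n < 1:
        raise ValueError("a system needs at least one process")
    return (n - 1) // 2


def majority(n: int) -> int:
    """Smallest strict majority of n processes (illustrative helper)."""
    return n // 2 + 1


# Under this reading, the surviving processes always form a majority.
for n in range(1, 10):
    f = max_tolerated_crashes(n)
    survivors = n - f
    print(f"n={n}: up to {f} crash(es), {survivors} survivor(s), "
          f"majority={majority(n)}")
```

For example, under this reading a system of 5 processes tolerates up to 2 crashes, and a system of 4 processes tolerates 1.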
Author(s): Ezhilchelvan P, Palmer D, Raynal M
Publication type: Conference Proceedings (inc. Abstract)
Publication status: Published
Conference Name: The Eighth IEEE International Workshop on Object-Oriented Real-Time Dependable Systems
Year of Conference: 2003
Publisher: IEEE Computer Society
Notes: WORDS 2003