

A Rollback-Recovery Protocol for Wide Area Pipelined Data Flow Computations

NU author(s): Dr James Smith, Professor Paul Watson



It is argued that there is a significant class of pipelined, large-grain data flow computations whose wide area distribution and long-running nature suggest a need for fault-tolerance, but for which existing approaches appear either costly or incomplete. An example, which motivated this paper, is the execution of queries over distributed databases. This paper presents an approach which exploits some limited input from the application layer in order to implement a low-overhead recovery protocol for such data flow computations. Over a large range of possible data flow graphs, the protocol is shown to support tolerance of a single machine failure per execution of the data flow computation, and in many cases to provide a greater degree of fault-tolerance.

Publication metadata

Author(s): Smith J, Watson P

Publication type: Report

Publication status: Published

Series Title: School of Computing Science Technical Report Series

Year: 2004

Pages: 16

Print publication date: 01/04/2004

Source Publication Date: April 2004

Report Number: 836

Institution: School of Computing Science, University of Newcastle upon Tyne

Place Published: Newcastle upon Tyne