Failure recovery alternatives in grid-based distributed query processing: A case study

Lookup NU author(s): Professor Paul Watson


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Fault-tolerance has long been a feature of database systems, with transactions supporting the structuring of applications so as to ensure continuation of updating applications in spite of machine failures. For read-only queries the perceived wisdom has been that support for fault-tolerance is too expensive to be worthwhile. Distributed query processing (DQP) is coming to be seen as a promising way of implementing applications that combine structured data and analysis operations in dynamic distributed settings such as computational grids. Accordingly, a number of protocols have been described that support tolerance to failure of intermediate machines, so as to permit continuation from surviving intermediate state. However, a distributed query can have a non-trivial mapping onto hardware resources. Because of this it is often possible to choose between a number of possible recovery strategies in the event of a failure. The work described here makes an initial investigation in this area in the context of an example query expressed over distributed resources in a Grid and shows that it can be worthwhile to make this choice between recovery alternatives dynamically, at the point a failure is detected rather than statically beforehand.

Publication metadata

Author(s): Smith J, Watson P

Editor(s): Talia, D; Bilas, A; Dikaiakos, MD

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: Knowledge and Data Management in GRIDs: 1st Workshop

Year of Conference: 2007

Pages: 51-63

Publisher: New York: Springer


DOI: 10.1007/978-0-387-37831-2_4


ISBN: 9780387378305