Lookup NU author(s): Rawaa Qasha, Dr Zhenyu Wen, Dr Jacek Cala, Professor Paul Watson
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND).
Scientific workflows play a vital role in modern science as they enable scientists to specify, share and reuse computational experiments. To maximise the benefits, workflows need to support the reproducibility of the experimental methods they capture. Reproducibility enables effective sharing as scientists can re-execute experiments developed by others and quickly derive new or improved results. However, achieving reproducibility in practice is problematic: previous analyses highlight issues due to uncontrolled changes in the input data, configuration parameters, workflow description and the software used to implement the workflow tasks. The resulting problems have become known as workflow decay. In this paper we present a novel framework that addresses workflow decay through the integration of system description, version control, container management and automated deployment techniques. It then introduces a set of performance optimization techniques that significantly reduce the runtime overheads caused by making workflows reproducible. The resulting system significantly improves the performance, repeatability and also the ability to share and re-use workflows by combining a method to uniquely identify task and workflow images with an automated image capture facility and a multi-level cache. The system is evaluated through an extensive set of experiments that validate the approach and highlight the key benefits of the proposed optimisations. This includes methods for reducing the runtime of workflows by up to an order of magnitude in cases where they are enacted concurrently on the same host VM and in different Clouds, and where they share tasks.
Author(s): Qasha R, Wen Z, Cala J, Watson P
Publication type: Article
Publication status: Published
Journal: Future Generation Computer Systems
Year: 2019
Volume: 29
Pages: 487-502
Print publication date: 01/09/2019
Online publication date: 03/04/2019
Acceptance date: 24/03/2019
Date deposited: 06/04/2019
ISSN (print): 0167-739X
ISSN (electronic): 1872-7115
Publisher: Elsevier BV
URL: https://doi.org/10.1016/j.future.2019.03.045
DOI: 10.1016/j.future.2019.03.045