Incremental workflow improvement through analysis of its data provenance

NU author(s): Professor Paolo Missier


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Repeated executions of resource-intensive workflows over a large number of runs are commonly observed in e-science practice. We explore the hypothesis that, in some cases, provenance traces recorded for past runs of a workflow can be used to make future runs more efficient. This investigation is an initial step into the systematic study of the role that provenance analysis can play in the broader context of self-managing software systems. We have tested our hypothesis on a concrete case study involving a Chemical Engineering workflow deployed on a cloud infrastructure, where we can measure the cost of its repeated execution. Our approach involves augmenting the workflow with a feedback loop in which incremental analysis of the provenance of past runs is used to control some of the workflow steps in subsequent executions. We present initial experimental results and hint at future improvements as part of ongoing work.

Publication metadata

Author(s): Missier P

Editor(s): Buneman P, Freire J

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP)

Year of Conference: 2011

Publisher: CEUR