Provenance graph abstraction by node grouping

NU author(s): Professor Paolo Missier, Dr Jeremy Bryans, Dr Carl Gamble



Data provenance is a form of metadata that records the activities involved in data production. It can help data consumers form judgments regarding data reliability. The PROV data model, released by the W3C in 2013, defines a relational model and constraints that provide a structural and semantic foundation for provenance. This enables the exchange of provenance between data producers and consumers. When the provenance content is sensitive and subject to disclosure restrictions, however, a complementary model is needed to enable producers to partially obfuscate provenance in a principled way. In this paper we propose such a formal model. It is embodied by a grouping operator, whereby a set of nodes in a PROV-compliant provenance graph is replaced by a new abstract node, leading to a new valid PROV graph. We define graph editing rules that allow existing dependencies to be removed but guarantee that no spurious dependencies are introduced in the abstracted graph. As grouping is closed with respect to composition, it can be used as a building block to achieve complex abstraction. The operator is implemented as part of a user tool that lets owners of sensitive provenance information specify custom abstraction policies.
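To illustrate why grouping needs editing rules at all, the following minimal Python sketch contracts a set of nodes into one abstract node and then reports any dependency between the remaining nodes that exists only after the contraction. It is not the operator defined in the report: the graph is assumed to be a plain set of directed (source, target) dependency edges, PROV node types and relation names are ignored, and the function names `reachable`, `group_nodes`, and `spurious_dependencies` are hypothetical.

```python
def reachable(edges, start):
    """Return the set of nodes reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def group_nodes(edges, group, abstract):
    """Naively contract every node in `group` into the node `abstract`.

    Assumption: this is plain node contraction, not the report's operator.
    """
    def rename(node):
        return abstract if node in group else node

    contracted = {(rename(src), rename(dst)) for src, dst in edges}
    # Drop self-loops created by edges that were internal to the group.
    return {(src, dst) for src, dst in contracted if src != dst}


def spurious_dependencies(edges, group, abstract):
    """List pairs of ungrouped nodes that are connected only after contraction."""
    new_edges = group_nodes(edges, group, abstract)
    kept = {node for edge in edges for node in edge} - set(group)
    return sorted(
        (a, b)
        for a in kept
        for b in reachable(new_edges, a) - reachable(edges, a)
        if b != abstract
    )


# "a" depends on g1, and g2 depends on "b", but g1 and g2 are unrelated.
unrelated = {("a", "g1"), ("g2", "b")}
# Here the group is internally connected, so a -> b already existed.
chained = {("a", "g1"), ("g1", "g2"), ("g2", "b")}

print(spurious_dependencies(unrelated, {"g1", "g2"}, "G"))
print(spurious_dependencies(chained, {"g1", "g2"}, "G"))
```

In the first graph, naive contraction makes "a" appear to depend on "b" through the abstract node, which is the kind of spurious dependency the editing rules are meant to rule out. In the second graph no new dependency appears, because the path from "a" to "b" was already present.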

Publication metadata

Author(s): Missier P, Bryans J, Gamble C, Curcin V, Danger R

Publication type: Report

Publication status: Published

Series Title: School of Computing Science Technical Report Series

Year: 2013

Pages: 13

Print publication date: 01/08/2013

Source Publication Date: August 2013

Report Number: 1393

Institution: School of Computing Science, University of Newcastle upon Tyne

Place Published: Newcastle upon Tyne