

Abstracting PROV provenance graphs: A validity-preserving approach

NU author(s): Professor Paolo Missier, Dr Jeremy Bryans, Dr Carl Gamble




© 2020

Data provenance is a structured form of metadata designed to record the activities and datasets involved in data production, as well as their dependency relationships. The PROV data model, released by the W3C in 2013, defines a schema and constraints that together provide a structural and semantic foundation for provenance. This enables the interoperable exchange of provenance between data producers and consumers. When the provenance content is sensitive and subject to disclosure restrictions, however, a principled way of hiding parts of the provenance before communicating it to certain parties is required. In this paper we present a provenance abstraction operator that achieves this goal. It maps a graphical representation of a PROV document, PG1, to a new abstract version, PG2, ensuring that (i) PG2 is a valid PROV graph, and (ii) the dependencies that appear in PG2 are justified by those that appear in PG1. These two properties ensure that further abstraction of abstract PROV graphs is possible. A guiding principle of the work is that of minimum damage: the resultant graph is altered as little as possible, while ensuring that the two properties are maintained. The operator is implemented as part of a user tool, described in a separate paper, that lets owners of sensitive provenance information control the abstraction by specifying an abstraction policy.
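To make property (ii) concrete, the following is a minimal toy sketch and not the operator defined in the paper. It assumes a provenance graph can be reduced to a set of untyped "depends on" edges, collapses a chosen group of nodes into a single abstract node, and checks that every edge of the abstract graph is backed by a path in the original. The function names (`collapse`, `reachable`, `is_justified`), the node names, and the `"hidden"` placeholder are all hypothetical, and PROV node types and validity constraints (property (i)) are deliberately ignored.

```python
from collections import defaultdict


def collapse(edges, group, abstract_node):
    """Replace every node in `group` with `abstract_node`, dropping self-loops."""
    def rename(node):
        return abstract_node if node in group else node

    renamed = {(rename(src), rename(dst)) for src, dst in edges}
    return {(src, dst) for src, dst in renamed if src != dst}


def reachable(edges, start):
    """Return the nodes reachable from `start` by following one or more edges."""
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)
    seen, stack = set(), list(successors[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(successors[node])
    return seen


def is_justified(pg1, pg2, group, abstract_node):
    """Check that each PG2 edge corresponds to some path in PG1.

    The abstract node stands for any member of `group` (toy assumption).
    """
    def members(node):
        return group if node == abstract_node else {node}

    return all(
        any(t in reachable(pg1, s) for s in members(src) for t in members(dst))
        for src, dst in pg2
    )


# Toy graph; an edge (a, b) reads "a depends on b".
pg1 = {("report", "publish"), ("publish", "draft"),
       ("draft", "edit"), ("edit", "raw")}
group = {"publish", "draft", "edit"}   # nodes to hide
pg2 = collapse(pg1, group, "hidden")
```

In this toy case `pg2` keeps only the edges from `report` to `hidden` and from `hidden` to `raw`, and `is_justified(pg1, pg2, group, "hidden")` holds because both edges trace back to paths in `pg1`. A real operator would also have to keep the result a valid PROV graph, which this sketch does not attempt.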

Publication metadata

Author(s): Missier P, Bryans J, Gamble C, Curcin V

Publication type: Article

Publication status: Published

Journal: Future Generation Computer Systems

Year: 2020

Volume: 111

Pages: 352-367

Print publication date: 01/10/2020

Online publication date: 15/05/2020

Acceptance date: 12/05/2020

ISSN (print): 0167-739X

Publisher: Elsevier


DOI: 10.1016/j.future.2020.05.015

