Lookup NU author(s): Professor Paolo Missier
© Springer Nature Switzerland AG 2018. The PROV data model assumes that entities are immutable and all changes to an entity e are represented by the creation of a new entity e’. This is reasonable for many provenance applications but may produce verbose results once we move towards fine-grained provenance due to the possibility of multiple binds (i.e., variables, elements of data structures) referring to the same mutable data objects (e.g., lists or dictionaries in Python). Changing a data object that is referenced by multiple immutable entities requires duplicating those immutable entities to keep consistency. This imposes an overhead on the provenance storage and makes it hard to represent data-changing operations and their effect on the provenance graph. In this paper, we propose a PROV extension to represent mutable data structures. We do this by adding reference derivations and checkpoints. We evaluate our approach by comparing it to plain PROV and PROV-Dictionary. Results indicate a reduction in the storage overhead for assignments and changes in data structures from O(N) and Ω(R × N), respectively, to O(1) in both cases when compared to plain PROV (N is the number of members in the data structure and R is the number of references to the data structure).
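The aliasing situation the abstract describes can be illustrated with a short Python sketch. The first part shows two names bound to the same mutable list. The two cost functions are hypothetical helpers, not taken from the paper: they only restate the abstract's asymptotic bounds as a toy count of new provenance records, with N members and R references, and do not reproduce the actual PROV encoding.

```python
# Two names bound to the same mutable list (aliasing).
a = [1, 2, 3]
b = a          # no copy is made: b refers to the same list object as a
b[0] = 99      # the change is visible through both names
assert a is b and a[0] == 99


def plain_prov_cost(n_members, n_references):
    """Toy record count with immutable entities (illustrative assumption)."""
    # Assigning the structure to a new name re-derives each member: O(N).
    assignment = n_members
    # Changing it re-creates the members seen by every reference: at least R x N.
    change = n_references * n_members
    return assignment, change


def reference_cost(n_members, n_references):
    """Toy record count when names share one mutable structure (assumption)."""
    # One record for an assignment and one for a change, whatever N and R are: O(1).
    return 1, 1


print(plain_prov_cost(n_members=3, n_references=2))
print(reference_cost(n_members=3, n_references=2))
```

With three members and two references, the toy model yields 3 and 6 new records for plain PROV against 1 and 1 for the shared-reference case, which is the gap the abstract reports in asymptotic terms.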
Author(s): Pimentel JFN, Missier P, Murta L, Braganholo V
Publication type: Conference Proceedings (inc. Abstract)
Publication status: Published
Conference Name: 7th International Provenance and Annotation Workshop, IPAW 2018
Year of Conference: 2018
Online publication date: 06/09/2018
Acceptance date: 09/07/2018
Publisher: Springer Verlag
Series Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)