Lookup NU author(s): Dr Giacomo Bergami, Ollie Fox, Professor Graham Morgan
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
Graph query languages such as Cypher are widely adopted to match and retrieve data in a graph representation, due to their ability to retrieve and transform information. Even though the most natural way to match and transform information is through rewriting rules, those are scarcely or only partially adopted in graph query languages. Their inability to do so has a major impact on the subsequent way the information is structured, as it might then appear more natural to provide major constraints over the data representation to fix the way the information should be represented. On the other hand, recent works are starting to move in the opposite direction, as the provision of a truly general semistructured model (GSM) makes it possible both to represent all the available data formats (Network-Based, Relational, and Semistructured) and to support a holistic query language expressing all major queries in such languages. In this paper, we show that the usage of GSM enables the definition of a general rewriting mechanism which can be expressed in current graph query languages only at the cost of tying the query to the specificity of the underlying data representation. We formalise the proposed query language in terms of declarative graph rewriting mechanisms described as a set of production rules L -> R, while both providing restrictions on the characterisation of L and extending it to support structural graph nesting operations, useful to aggregate similar information around an entry-point of interest. We further achieve our declarative requirements by determining the order in which the data should be rewritten and multiple rules should be applied, while ensuring that the application of such updates on the GSM database is persisted in subsequent rewriting calls. We discuss how GSM, by fully supporting index-based data representation, allows for a better physical model implementation leveraging the benefits of columnar database storage. Preliminary benchmarks show the scalability of this proposed implementation in comparison with state-of-the-art implementations.
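To make the idea of a production rule L -> R with a nesting operation more concrete, the following is a minimal illustrative sketch in Python. It is not the paper's GSM model or query language: the `Graph` class, the `apply_rule` function, and the toy data are all hypothetical names introduced here. The sketch assumes that L matches every edge carrying a given label, and that R removes the matched target node and nests its label under the source node, which acts as the entry point.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy in-memory graph (hypothetical; not the GSM data model)."""
    nodes: dict = field(default_factory=dict)   # node id -> label
    edges: set = field(default_factory=set)     # (src, edge_label, dst)
    nested: dict = field(default_factory=dict)  # entry-point id -> absorbed labels


def apply_rule(graph, edge_label):
    """Apply one illustrative rule L -> R to the graph, in place.

    L matches every edge (x)-[edge_label]->(y).
    R deletes y and its incident edges, nesting y's label under x.
    Returns the number of matches that were actually rewritten.
    """
    # Collect all matches first, in a fixed order, so that rewriting
    # does not interfere with matching.
    matches = sorted((s, d) for (s, lbl, d) in graph.edges if lbl == edge_label)
    applied = 0
    for src, dst in matches:
        if src not in graph.nodes or dst not in graph.nodes:
            continue  # one endpoint was already rewritten away
        graph.nested.setdefault(src, []).append(graph.nodes.pop(dst))
        graph.edges = {e for e in graph.edges if dst not in (e[0], e[2])}
        applied += 1
    return applied


# Usage: the graph is updated in place, so a later rule sees earlier rewrites.
g = Graph(
    nodes={1: "paper", 2: "Alice", 3: "Bob", 4: "venue"},
    edges={(1, "author", 2), (1, "author", 3), (1, "published_in", 4)},
)
apply_rule(g, "author")
apply_rule(g, "published_in")
```

Matching before rewriting, and mutating the same graph object across calls, loosely mirror the abstract's points about fixing a rewriting order and persisting updates between rewriting calls. The actual ordering strategy and storage model used in the paper are not reproduced here.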
Author(s): Bergami G, Fox OR, Morgan G
Publication type: Article
Publication status: Published
Journal: Mathematics
Year: 2024
Volume: 12
Issue: 17
Online publication date: 28/08/2024
Acceptance date: 25/08/2024
Date deposited: 28/08/2024
ISSN (electronic): 2227-7390
Publisher: MDPI AG
URL: https://doi.org/10.3390/math12172677
DOI: 10.3390/math12172677
Data Access Statement: The datasets are available at the following repositories: https://osf.io/btjqw/?view_only=f31eda86e7b04ac886734a26cd2ce43d and https://osf.io/rpu37/. The codebase associated with the implementation of the data model and query language interpretation is available on GitHub: https://github.com/datagram-db/datagram-db/releases/tag/v2.0. All the URLs were accessed on 21 April 2024.