Mining and Ranking of Generalized Multi-Dimensional Frequent Subgraphs

Petermann, A; Micale, G; Bergami, G; Pulvirenti, A; Rahm, E

doi:10.1109/ICDIM.2017.8244685


Newcastle University author(s): Dr Giacomo Bergami

Downloads

Full text for this publication is not currently held within this repository. Alternative links are provided below where available.

Abstract

Frequent pattern mining is an important research field and can be applied to different labeled data structures ranging from itemsets to graphs. There are scenarios where a label can be assigned to a taxonomy and generalized patterns can be mined by replacing labels by their ancestors. In this work, we propose a novel approach to generalized frequent subgraph mining. In contrast to existing work, our approach considers new requirements from use cases beyond molecular databases. In particular, we support directed multigraphs as well as multiple taxonomies to deal with the different semantic meaning of vertices. Since results of generalized frequent subgraph mining can be very large, we use a fast analytical method of p-value estimation to rank results by significance. We propose two extensions of the popular gSpan algorithm that mine frequent subgraphs across all taxonomy levels. We compare both algorithms in an experimental evaluation based on a database of business process executions represented by graphs.
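The idea of generalizing labels through a taxonomy can be illustrated with a minimal sketch. It is not the authors' gSpan extension: it handles only single-edge patterns, and the taxonomy, the labels, the example database, and the function names are all hypothetical. Vertex labels from different taxonomies are assumed to be distinct, so one child-to-parent map stands in for several taxonomies.

```python
from itertools import product

# Hypothetical taxonomy: child label -> parent label (roots have no entry).
TAXONOMY = {
    "Laptop": "Hardware",
    "Printer": "Hardware",
    "Hardware": "Product",
    "Alice": "Employee",
    "Employee": "Person",
}

# Hypothetical database: each graph is a list of directed, labeled edges
# (source label, edge label, target label). A list permits parallel edges,
# as in a multigraph.
DATABASE = [
    [("Alice", "orders", "Laptop")],
    [("Alice", "orders", "Printer")],
]


def generalizations(label, taxonomy):
    """Return the label followed by its ancestors, most specific first."""
    chain = [label]
    while chain[-1] in taxonomy:
        chain.append(taxonomy[chain[-1]])
    return chain


def generalized_edges(edge, taxonomy):
    """Enumerate every generalization of one directed labeled edge."""
    source, edge_label, target = edge
    return [
        (s, edge_label, t)
        for s, t in product(
            generalizations(source, taxonomy),
            generalizations(target, taxonomy),
        )
    ]


def support(pattern_edge, database, taxonomy):
    """Count the graphs with at least one edge that generalizes to pattern_edge."""
    return sum(
        any(pattern_edge in generalized_edges(e, taxonomy) for e in graph)
        for graph in database
    )


if __name__ == "__main__":
    # The specific pattern occurs in one graph; its generalization occurs in both.
    print(support(("Alice", "orders", "Laptop"), DATABASE, TAXONOMY))
    print(support(("Employee", "orders", "Hardware"), DATABASE, TAXONOMY))
```

In this toy database the specific edge pattern is supported by one graph, while the pattern generalized to `Employee` and `Hardware` is supported by both. This shows how a pattern can become frequent only at a higher taxonomy level, and why the number of candidate patterns grows quickly when every level is considered.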

Publication metadata

Author(s): Petermann A, Micale G, Bergami G, Pulvirenti A, Rahm E

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: 12th International Conference on Digital Information Management (ICDIM)

Year of Conference: 2017

Online publication date: 04/01/2018

Acceptance date: 08/08/2017

Publisher: IEEE

URL: https://doi.org/10.1109/ICDIM.2017.8244685

DOI: 10.1109/ICDIM.2017.8244685


ISBN: 9781538606643
