Lookup NU author(s): Professor Raj Ranjan
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
© 2025 The Author(s). Software: Practice and Experience published by John Wiley & Sons Ltd.
Background: Mining logic rules from structured knowledge bases is the basis of knowledge engineering. Because the rule mining problem is NP-hard, logic rules cannot be induced efficiently from knowledge bases, especially large-scale ones.
Idea: In this article, we propose a compact and efficient index structure for maintaining the intermediate data during top-down rule mining, so that memory consumption is reduced and mining efficiency is improved.
Developing Points: The index is based on a mapping from constant symbols to integers and the sorting of the mapped integers. Index update has been decomposed into four basic operations. Moreover, the index itself acts as the cache during top-down mining.
Value: Most contributions in existing works employ algorithmic and architectural optimizations to improve efficiency. Data-oriented optimizations have also been explored to some extent, but their data efficiency is relatively low, and memory consumption is thus becoming a new challenge for state-of-the-art systems. We tackle this challenge in this article, and our technique has been proven more efficient than state-of-the-art systems. We evaluate our method on six datasets that contain up to 160 K records and are frequently used as benchmarks in tasks related to knowledge engineering. The experimental results show that the proposed technique speeds up the rule mining procedure by (Formula presented.) on average and reduces memory consumption by up to 70%. The space overhead of the data structure is about twice that of the indexed records, which is more than 80% lower than that of the state-of-the-art technique.
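The abstract describes the index only in outline: constant symbols are mapped to integers, and the mapped integers are kept sorted. The following is a minimal sketch of that general idea, not the structure from the article. The names `ConstantMap`, `build_sorted_index`, and `lookup`, and the sample facts, are hypothetical and chosen purely for illustration; the four update operations and the caching behaviour mentioned above are not modelled.

```python
from bisect import bisect_left, bisect_right


class ConstantMap:
    """Hypothetical helper: assigns each constant symbol a dense integer id."""

    def __init__(self):
        self._ids = {}        # symbol -> integer id
        self._symbols = []    # integer id -> symbol

    def encode(self, symbol):
        # Allocate the next free integer the first time a symbol is seen.
        if symbol not in self._ids:
            self._ids[symbol] = len(self._symbols)
            self._symbols.append(symbol)
        return self._ids[symbol]

    def decode(self, idx):
        return self._symbols[idx]


def build_sorted_index(records, column):
    """Sort integer-encoded records by one argument position.

    Returns (keys, rows), where rows is the sorted record list and
    keys holds the value of `column` for each row, in the same order.
    """
    rows = sorted(records, key=lambda r: r[column])
    keys = [r[column] for r in rows]
    return keys, rows


def lookup(keys, rows, value):
    """Return all rows whose indexed column equals `value`, via binary search."""
    lo = bisect_left(keys, value)
    hi = bisect_right(keys, value)
    return rows[lo:hi]


# Illustrative data only: three binary facts over four constants.
cmap = ConstantMap()
facts = [("alice", "bob"), ("alice", "carol"), ("dave", "bob")]
encoded = [(cmap.encode(s), cmap.encode(o)) for s, o in facts]

# Index on the first argument, then fetch every record starting with "alice".
keys, rows = build_sorted_index(encoded, 0)
matches = lookup(keys, rows, cmap.encode("alice"))
decoded = [(cmap.decode(s), cmap.decode(o)) for s, o in matches]
```

Storing integers instead of symbol strings keeps each record small, and sorting lets an equality lookup run as two binary searches rather than a scan. This is one plausible reading of why such a design can reduce memory use and speed up mining.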
Author(s): Wang R, Wong R, Sun D, Ranjan R
Publication type: Article
Publication status: Published
Journal: Software - Practice and Experience
Year: 2025
Pages: epub ahead of print
Online publication date: 07/02/2025
Acceptance date: 21/01/2025
Date deposited: 17/02/2025
ISSN (print): 0038-0644
ISSN (electronic): 1097-024X
Publisher: John Wiley and Sons Ltd
URL: https://doi.org/10.1002/spe.3415
DOI: 10.1002/spe.3415
Data Access Statement: The data that support the findings of this study are openly available in the public GitHub repository of SIB at https://github.com/TramsWang/SIB/tree/main