Lookup NU author(s): Professor Raj Ranjan
© 2022 John Wiley & Sons Ltd. Knowledge data has been widely applied in artificial intelligence applications for interpretable and complex reasoning. Modern knowledge bases (KBs) are constructed via automatic knowledge extraction from openly accessible sources. As a result, KBs keep growing in size, which heavily burdens the maintenance and application of the knowledge data. Besides grammatical redundancies, semantically repeated information also appears frequently in knowledge bases but remains under-explored. Existing semantic compressors fail to discover expressive patterns efficiently and thus perform unsatisfactorily on knowledge data. This article proposes SInC, a semantic inductive compressor, to efficiently induce first-order Horn rules and semantically compress knowledge bases. SInC improves the scalability of top-down rule mining by batching correlated records in the cache, and further optimizes the pruning of duplication and specialization via an identifier structure of Horn rules. SInC was evaluated on real-world and synthetic datasets and compared against the state of the art. The results show that batched caching speeds up the rule mining procedure by more than two orders of magnitude while consuming less than three times the memory space. The identifier technique speeds up duplication and specialization pruning by orders of magnitude, with error rates of less than 5‰ and 15%, respectively. In terms of overall compression, SInC outperforms the state of the art in both scalability and compression effect.
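To make the idea of semantic compression with Horn rules concrete, the following is a minimal sketch and not SInC's algorithm. It assumes a single hand-written rule, grandparent(X, Z) <- parent(X, Y), parent(Y, Z), whereas the abstract describes inducing such rules automatically. The toy knowledge base and the names `apply_rule`, `compress`, and `decompress` are hypothetical and chosen only for illustration.

```python
# Toy illustration only: the rule is hand-written here, whereas a rule miner
# would have to induce it from the data.
# Facts are (predicate, subject, object) triples.
KB = {
    ("parent", "ann", "bob"),
    ("parent", "bob", "cat"),
    ("parent", "bob", "dan"),
    ("parent", "bob", "eli"),
    ("parent", "bob", "gus"),
    ("grandparent", "ann", "cat"),
    ("grandparent", "ann", "dan"),
    ("grandparent", "ann", "eli"),
    ("grandparent", "eve", "fay"),  # not derivable from any parent facts
}


def apply_rule(facts):
    """Horn rule: grandparent(X, Z) <- parent(X, Y), parent(Y, Z)."""
    parents = [(s, o) for (p, s, o) in facts if p == "parent"]
    return {
        ("grandparent", x, z)
        for (x, y) in parents
        for (y2, z) in parents
        if y == y2
    }


def compress(kb):
    """Split the KB into facts that must be stored and rule counterexamples."""
    entailed = apply_rule(kb)
    necessary = kb - entailed        # facts the rule cannot reproduce
    counterexamples = entailed - kb  # rule predictions absent from the KB
    return necessary, counterexamples


def decompress(necessary, counterexamples):
    """Rebuild the KB. One rule application suffices here because the rule
    body uses only parent facts, which are never removed by compress()."""
    return necessary | (apply_rule(necessary) - counterexamples)


necessary, counterexamples = compress(KB)
restored = decompress(necessary, counterexamples)
print(len(KB), len(necessary) + len(counterexamples), restored == KB)
```

In this toy case the nine original facts are replaced by six stored facts, one counterexample, and the rule itself, and decompression restores the original set exactly.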
Author(s): Wang R, Sun D, Wong R, Ranjan R
Publication type: Article
Publication status: Published
Journal: Software - Practice and Experience
Print publication date: 01/03/2023
Online publication date: 18/11/2022
Acceptance date: 16/10/2022
ISSN (print): 0038-0644
ISSN (electronic): 1097-024X
Publisher: John Wiley and Sons Ltd