Lookup NU author(s): Nicolay Rusnachenko, Dr Huizhi Liang, Dr Lei Shi
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
The escalating volume of textual data calls for capable and scalable Information Extraction (IE) systems in the field of Natural Language Processing (NLP) that can analyse massive text collections in detail. While most deep learning systems are designed to handle textual information as it is, the interface between a document and the annotation of its parts remains poorly addressed. At the same time, a major limitation of most deep-learning models is a constrained input size, caused by architectural and computational specifics. To address this, we introduce ARElight, a system designed to efficiently manage and extract information from sequences of large documents by dividing them into segments with mentioned object pairs. Through a pipeline comprising modules for text sampling, inference, optional graph operations, and visualisation, the proposed system transforms large volumes of text in a structured manner. Practical applications of ARElight are demonstrated across diverse use cases, including literature processing and social network analysis.
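The idea of dividing a document into segments with mentioned object pairs can be illustrated with a minimal sketch. This is not ARElight's actual API: the function name `sample_pair_segments`, the `max_tokens` parameter, the naive sentence splitter, and the substring-based entity matching are all assumptions made for illustration only.

```python
import re
from itertools import combinations


def sample_pair_segments(text, entities, max_tokens=50):
    """Yield (first, second, segment) for entity pairs sharing a sentence.

    Hypothetical helper, not part of ARElight. Sentences longer than
    max_tokens are skipped, mimicking a model's constrained input size.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences:
        # Whitespace tokenisation stands in for a real tokenizer.
        if len(sentence.split()) > max_tokens:
            continue
        # Plain substring matching stands in for a real entity recogniser.
        mentioned = [e for e in entities if e in sentence]
        # Every unordered pair of mentioned entities gives one segment.
        for first, second in combinations(mentioned, 2):
            yield first, second, sentence


text = "Alice met Bob in Paris. Carol stayed home. Bob later called Carol."
entities = ["Alice", "Bob", "Carol", "Paris"]
segments = list(sample_pair_segments(text, entities))
for first, second, segment in segments:
    print(f"({first}, {second}): {segment}")
```

Each resulting segment is short enough to be passed to a downstream relation classifier on its own, which is the motivation the abstract gives for sampling. A real system would use a proper tokenizer and named entity recognition in place of the stand-ins above.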
Author(s): Rusnachenko N, Liang H, Kalameyets M, Shi L
Publication type: Conference Proceedings (inc. Abstract)
Publication status: Published
Conference Name: The 46th European Conference on Information Retrieval (ECIR 2024)
Year of Conference: 2024
Pages: 229–235
Print publication date: 28/04/2024
Online publication date: 23/04/2024
Acceptance date: 15/12/2023
Date deposited: 15/12/2023
ISSN: 0302-9743
Publisher: Springer
URL: https://doi.org/10.1007/978-3-031-56069-9_23
DOI: 10.1007/978-3-031-56069-9_23
ePrints DOI: 10.57711/9kn1-b308
Series Title: Lecture Notes in Computer Science
ISBN: 9783031560682