ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction

Rusnachenko, N; Liang, H; Kalameyets, M; Shi, L

doi:10.1007/978-3-031-56069-9_23

NU author(s): Nicolay Rusnachenko, Dr Huizhi Liang, Dr Lei Shi

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).

Abstract

The escalating volume of textual data calls for capable and scalable Information Extraction (IE) systems in Natural Language Processing (NLP) that can analyse massive text collections in detail. While most deep learning systems are designed to handle textual information as it is, the interface between a document and the annotation of its parts remains poorly covered. In addition, most deep learning models accept only a limited input size because of architectural and computational constraints. To address this, we introduce ARElight, a system designed to efficiently manage and extract information from sequences of large documents by dividing them into segments that mention pairs of objects. Through a pipeline comprising modules for text sampling, inference, optional graph operations, and visualisation, the proposed system processes large volumes of text in a structured manner. Practical applications of ARElight are demonstrated across diverse use cases, including literature processing and social network analysis.
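
The segmentation idea in the abstract (splitting a large document into short segments, each tied to a pair of mentioned objects) can be illustrated with a minimal sketch. This is not ARElight's API or the authors' implementation: the names `Context` and `sample_contexts`, the naive sentence splitter, the exact-string object matching, and the `max_chars` limit are all assumptions made for illustration only.

```python
import re
from itertools import combinations
from typing import Iterator, NamedTuple


class Context(NamedTuple):
    """Hypothetical container for one sampled segment (not an ARElight type)."""
    text: str    # the segment that would be passed to a relation-extraction model
    source: str  # first object mentioned in the segment
    target: str  # second object mentioned in the segment


def sample_contexts(document: str, objects: list[str],
                    max_chars: int = 200) -> Iterator[Context]:
    """Yield one context per pair of distinct objects found in the same sentence."""
    # Assumption: sentences end with ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", document.strip()):
        if len(sentence) > max_chars:
            # Simplification: segments exceeding the assumed input limit are skipped.
            continue
        # Assumption: objects are matched as exact substrings, in list order.
        mentioned = [obj for obj in objects if obj in sentence]
        for source, target in combinations(mentioned, 2):
            yield Context(sentence, source, target)


doc = "Alice met Bob in Paris. The weather was fine. Bob later wrote to Carol."
contexts = list(sample_contexts(doc, ["Alice", "Bob", "Carol", "Paris"]))
for ctx in contexts:
    print(f"{ctx.source} -> {ctx.target}: {ctx.text}")
```

In this sketch, sentences that mention fewer than two objects produce no contexts, so only the segments relevant to relation extraction reach the model. A real system would rely on named entity recognition rather than substring matching and on a token-based rather than character-based size limit.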