Experience: A Comparative Analysis of Multivariate Time-Series Generative Models: A Case Study on Human Activity Data

NU author(s): Naif Alzahrani, Dr Jacek Cala, Professor Paolo Missier

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

© 2024 Copyright held by the owner/author(s). Human activity recognition (HAR) is an active research field that has seen great success in recent years due to advances in sensory data collection methods and activity recognition systems. Deep artificial intelligence (AI) models have recently contributed to the success of HAR systems, although they still suffer from limitations such as data scarcity, the high cost of labelling data instances, and dataset imbalance and bias. The temporal nature of human activity data, represented as time series, imposes an additional challenge to using AI models in HAR, because most state-of-the-art models do not account for the time component of the data instances. These limitations have inspired the time-series research community to design generative models for sequential data, but very little work has been done to evaluate the quality of such models. In this work, we conduct a comparative quality analysis of three generative models for time-series data, using a case study in which we aim to generate sensory human activity data from a seed public dataset. Additionally, we adapt and clearly explain four evaluation methods for synthetic time-series data from the literature and apply them to assess the quality of the synthetic activity data we generate. We show experimentally that high-quality human activity data can be generated using deep generative models, and that the synthetic data can thus be used in HAR systems to augment real activity data. We also demonstrate that the chosen evaluation methods effectively ensure that the generated data meets the essential quality benchmarks of realism, diversity, coherence, and utility. Our findings suggest that using deep generative models to produce synthetic human activity data can potentially address challenges related to data scarcity, bias, and expensive labelling. This holds promise for enhancing the efficiency and reliability of HAR systems.


Publication metadata

Author(s): Alzahrani N, Cala J, Missier P

Publication type: Article

Publication status: Published

Journal: ACM Journal of Data and Information Quality

Year: 2024

Volume: 16

Issue: 3

Online publication date: 04/10/2024

Acceptance date: 28/06/2024

Date deposited: 04/11/2024

ISSN (print): 1936-1955

ISSN (electronic): 1936-1963

Publisher: ACM

URL: https://doi.org/10.1145/3688393

DOI: 10.1145/3688393

