Tasks and Visualizations used for Data Profiling: A Survey and Interview Study

NU author(s): Dr Sara Fernstad

Licence

This is the authors' accepted manuscript of an article that has been published in its final definitive form by IEEE, 2023.

For re-use rights please refer to the publisher's terms and conditions.


Abstract

The use of good-quality data to inform decision making is entirely dependent on robust processes to ensure it is fit for purpose. Such processes vary between organisations, and between those tasked with designing and following them. In this paper we report on a survey of 53 data analysts from many industry sectors, 24 of whom also participated in in-depth interviews, about computational and visual methods for characterizing data and investigating data quality. The paper makes contributions in two key areas. The first is to data science fundamentals, because our lists of data profiling tasks and visualization techniques are more comprehensive than those published elsewhere. The second concerns the application question “what does good profiling look like to those who routinely perform it?,” which we answer by highlighting the diversity of profiling tasks, unusual practice and exemplars of visualization, and recommendations about formalizing processes and creating rulebooks.


Publication metadata

Author(s): Ruddle RA, Cheshire J, Johansson Fernstad S

Publication type: Article

Publication status: Published

Journal: IEEE Transactions on Visualization and Computer Graphics

Year: 2023

Pages: epub ahead of print

Online publication date: 06/01/2023

Acceptance date: 20/12/2022

Date deposited: 28/02/2023

ISSN (electronic): 1941-0506

Publisher: IEEE

URL: https://doi.org/10.1109/TVCG.2023.3234337

DOI: 10.1109/TVCG.2023.3234337

ePrints DOI: 10.57711/fqms-wx43

