
Open Access

Synthetic ALSPAC longitudinal datasets for the Big Data VR project

NU author(s): Dr Demetris Avraam, Dr Becca Wilson, Emeritus Professor Paul Burton



This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Three synthetic datasets, containing 15,000, 155,000 and 1,555,000 participants respectively, were created by simulating eleven cardiac and anthropometric variables from nine collection ages of the ALSPAC birth cohort study. The synthetic datasets retain data properties similar to those of the ALSPAC study data from which they are simulated (the covariance matrices, as well as the means and variances of the variables) without including the original data or disclosing participant information. Here, the three synthetic datasets have been used in an academia-industry collaboration to build prototype virtual reality data analysis software, but they could be used more broadly in method and software development projects where sensitive data cannot be freely shared.
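The idea of simulating data that preserve means, variances and covariances can be illustrated with a minimal sketch. The record does not state which simulation method the authors used, so the multivariate normal draw below is an assumption made purely for illustration. The function name `simulate_synthetic`, the three stand-in variables and their parameter values are all hypothetical and are not taken from ALSPAC.

```python
import numpy as np


def simulate_synthetic(real_data, n_synthetic, seed=0):
    """Draw synthetic rows from a multivariate normal fitted to real_data.

    Assumption: a multivariate normal is used here only as an example;
    it is not claimed to be the method used for the published datasets.
    """
    rng = np.random.default_rng(seed)
    mean = real_data.mean(axis=0)           # per-variable means
    cov = np.cov(real_data, rowvar=False)   # covariance matrix (columns are variables)
    return rng.multivariate_normal(mean, cov, size=n_synthetic)


# Hypothetical stand-in for sensitive study data: 3 variables, 5,000 rows.
stand_in_rng = np.random.default_rng(42)
stand_in = stand_in_rng.multivariate_normal(
    mean=[120.0, 70.0, 25.0],
    cov=[[100.0, 40.0, 10.0],
         [40.0, 64.0, 8.0],
         [10.0, 8.0, 9.0]],
    size=5000,
)

# Generate a synthetic dataset of 15,000 rows from the fitted moments only.
synthetic = simulate_synthetic(stand_in, n_synthetic=15000)
```

In this sketch only the fitted means and covariance matrix are passed to the random draw, so no original row is copied into the output. Real cohort variables are often skewed or bounded, so a plain normal draw may not reproduce their marginal distributions.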

Publication metadata

Author(s): Avraam D, Wilson RC, Burton P

Publication type: Article

Publication status: Published

Journal: Wellcome Open Research

Year: 2017

Volume: 2

Online publication date: 30/08/2017

Acceptance date: 23/08/2017

Date deposited: 07/12/2017

ISSN (electronic): 2398-502X

Publisher: Wellcome Trust


DOI: 10.12688/wellcomeopenres.12441.1

PubMed id: 28989981




Funder name: Wellcome Trust