Open Access

Descriptive inference using large, unrepresentative nonprobability samples: An introduction for ecologists

NU author(s): Dr Gavin Stewart

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

© 2023 The Authors. Ecology published by Wiley Periodicals LLC on behalf of The Ecological Society of America.

Biodiversity monitoring usually involves drawing inferences about some variable of interest across a defined landscape from observations made at a sample of locations within that landscape. If the variable of interest differs between sampled and nonsampled locations, and no mitigating action is taken, then the sample is unrepresentative and inferences drawn from it will be biased. It is possible to adjust unrepresentative samples so that they more closely resemble the wider landscape in terms of “auxiliary variables.” A good auxiliary variable is a common cause of sample inclusion and the variable of interest, and if it explains an appreciable portion of the variance in both, then inferences drawn from the adjusted sample will be closer to the truth. We applied six types of survey sample adjustment (subsampling, quasirandomization, poststratification, superpopulation modeling, a “doubly robust” procedure, and multilevel regression and poststratification) to a simple two-part biodiversity monitoring problem. The first part was to estimate the mean occupancy of the plant Calluna vulgaris in Great Britain in two time periods (1987–1999 and 2010–2019); the second was to estimate the difference between the two (i.e., the trend). We estimated the means and trend using large, but (originally) unrepresentative, samples from a citizen science dataset. Compared with the unadjusted estimates, the means and trends estimated using most adjustment methods were more accurate, although standard uncertainty intervals generally did not cover the true values. Completely unbiased inference is not possible from an unrepresentative sample without knowing and having data on all relevant auxiliary variables. Adjustments can reduce the bias if auxiliary variables are available and selected carefully, but the potential for residual bias should be acknowledged and reported.
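
As an illustration of one of the adjustments named in the abstract, poststratification can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' analysis (their code is an R Markdown document on Zenodo): the function name `poststratified_mean`, the single auxiliary variable with strata "upland" and "lowland", the landscape shares, and the toy occupancy records are all hypothetical.

```python
from collections import defaultdict


def poststratified_mean(sample, population_shares):
    """Weight each stratum's sample mean by that stratum's share of the landscape.

    sample: iterable of (stratum, value) pairs, e.g. value = 1 if occupied else 0.
    population_shares: dict mapping stratum -> known proportion of the landscape.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in sample:
        totals[stratum] += value
        counts[stratum] += 1

    # A stratum with no sampled sites has no mean to weight.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no sampled sites in strata: {sorted(missing)}")

    return sum(
        share * totals[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )


# Hypothetical toy data: recorders oversample upland sites, where the species
# is more often present, so the raw sample mean overstates occupancy.
sample = (
    [("upland", 1)] * 6 + [("upland", 0)] * 2      # 8 upland sites, 6 occupied
    + [("lowland", 1)] * 1 + [("lowland", 0)] * 1  # 2 lowland sites, 1 occupied
)
# Assumed (hypothetical) shares of the landscape in each stratum.
population_shares = {"upland": 0.25, "lowland": 0.75}

naive_mean = sum(value for _, value in sample) / len(sample)
adjusted_mean = poststratified_mean(sample, population_shares)

print(naive_mean)     # 0.7
print(adjusted_mean)  # 0.5625
```

In this toy case the adjustment moves the estimate from 0.7 to 0.5625 because upland sites make up 80% of the sample but only 25% of the assumed landscape. The adjustment helps only to the extent that the chosen auxiliary variable drives both sample inclusion and occupancy; any residual bias within strata is left untouched.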


Publication metadata

Author(s): Boyd RJ, Stewart GB, Pescott OL

Publication type: Article

Publication status: Published

Journal: Ecology

Year: 2024

Volume: 105

Issue: 2

Print publication date: 01/02/2024

Online publication date: 13/12/2023

Acceptance date: 20/10/2023

Date deposited: 22/01/2024

ISSN (print): 0012-9658

ISSN (electronic): 1939-9170

Publisher: Ecological Society of America

URL: https://doi.org/10.1002/ecy.4214

DOI: 10.1002/ecy.4214

Data Access Statement: The data and an R Markdown document containing all code to reproduce our analysis are available on Zenodo in Boyd (2023) at https://doi.org/10.5281/zenodo.10029669.


Funding

Funder name: Natural Environment Research Council (NERC)

Funder reference: NE/X010384/1
