

Exploration of sample size and diatom-based indicator performance in three North American phosphorus training sets

Lookup NU author(s): Emeritus Professor Steve Juggins


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Three large training sets were investigated to determine optimal sample sizes for diatom-based inference models. The sample sets represented (1) assemblages from Great Lakes coastlines, (2) phytoplankton from the pelagic Great Lakes and (3) surface sediment assemblages from Minnesota lakes. Diatom-based weighted average models to infer nutrient concentrations were developed for each training set. Training set sample sizes ranging from 10 to the maximum number of samples were created through random sample selection, and performance of each model was evaluated. For each model iteration, diatom-inferred (DI) nutrient data were related to stressor data (e.g., adjacent agricultural or urban development) to characterize the ability of each model to track human activities. The relationships between model performance parameters (DI-stressor correlations and model r², error and bias) and sample size were used to determine the minimum sample size needed to optimize models for each region. Depending on the training set, at least 40-70 samples were needed to capture the variation in diatom assemblages and environmental conditions to such a degree that non-analog situations should be rare and so should provide an unambiguous result if the model was applied to any sample assemblage from the region. It is recommended that one exercises caution when dealing with smaller training sets unless there is certainty that the selected samples reflect the regional variability in diatom assemblages and environmental conditions.
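The subsampling procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and several choices are assumptions made only for illustration: a basic weighted averaging model with inverse deshrinking, synthetic data with Gaussian taxon responses, and performance measured by applying each subsample model to the full data set. The function names (`wa_fit`, `wa_predict`, `performance_by_size`, `make_synthetic`) are hypothetical, and the DI-stressor correlation step is omitted.

```python
import numpy as np


def wa_fit(Y, x):
    """Fit a basic weighted averaging (WA) model with inverse deshrinking.

    Y: (n_samples, n_taxa) abundances; x: (n_samples,) observed variable.
    Taxa absent from the training samples are excluded from the model.
    """
    present = Y.sum(axis=0) > 0
    Yp = Y[:, present]
    # Taxon optimum = abundance-weighted mean of the observed variable.
    optima = (Yp * x[:, None]).sum(axis=0) / Yp.sum(axis=0)
    # Raw estimate = abundance-weighted mean of the taxon optima.
    raw = (Yp * optima).sum(axis=1) / Yp.sum(axis=1)
    # Inverse deshrinking: regress observed values on the raw estimates.
    slope, intercept = np.polyfit(raw, x, 1)
    return {"present": present, "optima": optima,
            "slope": slope, "intercept": intercept}


def wa_predict(model, Y):
    """Infer the variable for each sample (NaN if no modelled taxa occur)."""
    Yp = Y[:, model["present"]]
    raw = (Yp * model["optima"]).sum(axis=1) / Yp.sum(axis=1)
    return model["intercept"] + model["slope"] * raw


def performance_by_size(Y, x, sizes, n_iter=50, seed=0):
    """Mean r^2 and RMSE over random training subsets of each size.

    Assumption: each subsample model is scored on the full data set.
    """
    rng = np.random.default_rng(seed)
    out = {}
    for n in sizes:
        r2, rmse = [], []
        for _ in range(n_iter):
            idx = rng.choice(len(x), size=n, replace=False)
            pred = wa_predict(wa_fit(Y[idx], x[idx]), Y)
            r2.append(np.corrcoef(pred, x)[0, 1] ** 2)
            rmse.append(np.sqrt(np.mean((pred - x) ** 2)))
        out[n] = (float(np.mean(r2)), float(np.mean(rmse)))
    return out


def make_synthetic(n_samples=150, n_taxa=30, seed=1):
    """Synthetic training set: Gaussian taxon responses along one gradient."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, n_samples)
    optima = rng.uniform(0.0, 10.0, n_taxa)
    Y = np.exp(-((x[:, None] - optima[None, :]) ** 2) / (2 * 1.5 ** 2))
    Y *= rng.lognormal(0.0, 0.5, Y.shape)  # multiplicative noise
    return Y / Y.sum(axis=1, keepdims=True), x


if __name__ == "__main__":
    Y, x = make_synthetic()
    for n, (r2, rmse) in performance_by_size(Y, x, [10, 20, 40, 70, 150]).items():
        print(f"n={n:3d}  mean r2={r2:.3f}  mean RMSE={rmse:.3f}")
```

Plotting the mean r² and RMSE against training set size would show where the curves level off, which is the kind of relationship the abstract uses to identify a minimum sample size. The synthetic data say nothing about the 40-70 sample range reported for the three training sets.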

Publication metadata

Author(s): Reavie ED, Juggins S

Publication type: Article

Publication status: Published

Journal: Aquatic Ecology

Year: 2011

Volume: 45

Issue: 4

Pages: 529-538

Print publication date: 23/09/2011

ISSN (print): 1386-2588

ISSN (electronic): 1573-5125

Publisher: Springer


DOI: 10.1007/s10452-011-9373-9




Funder reference    Funder name
EPA/R-8286750       US Environmental Protection Agency
GL-00E23101         US Environmental Protection Agency