

Large-scale inference of correlation among mixed-type biological traits with phylogenetic multivariate probit models

NU author(s): Dr Rebecca Payne

Downloads

Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Abstract

© Institute of Mathematical Statistics, 2021. Inferring concerted changes among biological traits along an evolutionary history remains an important yet challenging problem. Besides adjusting for spurious correlation induced from the shared history, the task also requires sufficient flexibility and computational efficiency to incorporate multiple continuous and discrete traits as data size increases. To accomplish this, we jointly model mixed-type traits by assuming latent parameters for binary outcome dimensions at the tips of an unknown tree informed by molecular sequences. This gives rise to a phylogenetic multivariate probit model. With large sample sizes, posterior computation under this model is problematic, as it requires repeated sampling from a high-dimensional truncated normal distribution. Current best practices employ multiple-try rejection sampling that suffers from slow mixing and a computational cost that scales quadratically in sample size. We develop a new inference approach that exploits: (1) the bouncy particle sampler (BPS) based on piecewise deterministic Markov processes to simultaneously sample all truncated normal dimensions, and (2) novel dynamic programming that reduces the cost of likelihood and gradient evaluations for BPS to linear in sample size. In an application with 535 HIV viruses and 24 traits that necessitates sampling from a 12,840-dimensional truncated normal, our method makes it possible to estimate the across-trait correlation and detect factors that affect the pathogen’s capacity to cause disease. This inference framework is also applicable to a broader class of covariance structures beyond comparative biology.
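As a rough illustration of the sampling problem the abstract describes, the following is a minimal sketch of a bouncy particle sampler for a truncated normal. It is a toy under strong assumptions and not the authors' implementation: the covariance is the identity rather than a phylogenetic covariance, the dynamic programming step is omitted, and the function name `bps_truncated_normal`, its parameters, and the fixed-interval recording scheme are all hypothetical choices made for this example. Each entry of `signs` plays the role of one observed binary trait, which restricts the corresponding latent value to be positive or negative.

```python
import math
import random


def dot(u, w):
    """Inner product of two equal-length lists."""
    return sum(ui * wi for ui, wi in zip(u, w))


def bps_truncated_normal(signs, total_time=2000.0, dt=0.5,
                         refresh_rate=1.0, seed=0):
    """Toy BPS for N(0, I) restricted to {x : signs[i] * x[i] >= 0}.

    Assumption: identity covariance, so the potential is U(x) = x.x / 2
    and its gradient is x. Positions are recorded every `dt` time units.
    """
    rng = random.Random(seed)
    d = len(signs)
    x = [0.5 * s for s in signs]                  # start inside the orthant
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]   # initial velocity
    samples, clock, next_record = [], 0.0, dt

    while clock < total_time:
        # Gradient bounce: event rate along the ray is max(0, a + b * t).
        a, b = dot(v, x), dot(v, v)
        e = rng.expovariate(1.0)
        t_bounce = (-a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * b * e)) / b

        # Velocity refreshment at a constant rate (memoryless, so it can
        # be redrawn after every event).
        t_refresh = rng.expovariate(refresh_rate)

        # Earliest time at which some coordinate reaches its boundary x[i] = 0.
        t_wall, wall = math.inf, -1
        for i in range(d):
            if signs[i] * v[i] < 0.0:
                t_hit = -x[i] / v[i]
                if t_hit < t_wall:
                    t_wall, wall = t_hit, i

        t = min(t_bounce, t_refresh, t_wall)

        # Record positions on the time grid crossed by this linear segment.
        while next_record <= clock + t and next_record <= total_time:
            s = next_record - clock
            samples.append([xi + s * vi for xi, vi in zip(x, v)])
            next_record += dt

        # Move to the event location and update the velocity.
        x = [xi + t * vi for xi, vi in zip(x, v)]
        clock += t
        if t == t_wall:
            x[wall] = 0.0              # land exactly on the boundary
            v[wall] = -v[wall]         # reflect off the truncation wall
        elif t == t_bounce:
            scale = 2.0 * dot(v, x) / dot(x, x)
            v = [vi - scale * xi for vi, xi in zip(v, x)]  # reflect off gradient
        else:
            v = [rng.gauss(0.0, 1.0) for _ in range(d)]    # refresh velocity

    return samples
```

Calling `bps_truncated_normal([1, -1])` returns a list of two-dimensional points whose first coordinate is non-negative and whose second is non-positive, up to floating-point rounding. All coordinates move together along each linear segment, which is the sense in which the sampler updates every truncated dimension simultaneously. In this toy the per-event cost is trivial because the gradient is just `x`; with a dense covariance it would not be, and that cost is what the abstract's dynamic programming is said to address.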


Publication metadata

Author(s): Zhang Z, Nishimura A, Bastide P, Ji X, Payne RP, Goulder P, Lemey P, Suchard MA

Publication type: Article

Publication status: Published

Journal: Annals of Applied Statistics

Year: 2021

Volume: 15

Issue: 1

Pages: 230-251

Print publication date: 01/03/2021

Online publication date: 18/03/2021

Acceptance date: 02/04/2018

ISSN (print): 1932-6157

ISSN (electronic): 1941-7330

Publisher: Institute of Mathematical Statistics

URL: https://doi.org/10.1214/20-AOAS1394

DOI: 10.1214/20-AOAS1394

