Model misspecification misleads inference of the spatial dynamics of disease outbreaks

Edited by David Hillis, The University of Texas, Austin, TX; received August 24, 2022; accepted January 17, 2023

March 10, 2023

120 (11) e2213913120

Significance

Bayesian phylodynamic models have revolutionized epidemiology by enabling researchers to infer key aspects of the geographic history of disease outbreaks. These models contain many parameters that must be estimated from minimal information (the area from which each pathogen was sampled), rendering inferences under this approach inherently sensitive to the choice of priors on the model parameters. Here, we demonstrate that 1) the priors assumed in ≈93% of surveyed phylodynamic studies make strong and biologically unrealistic assumptions, and 2) these priors distort the conclusions of epidemiological studies. We offer strategies and tools to specify more reasonable priors that will enhance our ability to understand pathogen biology and, thereby, to mitigate disease.

Abstract

Epidemiology has been transformed by the advent of Bayesian phylodynamic models that allow researchers to infer the geographic history of pathogen dispersal over a set of discrete geographic areas [1, 2]. These models provide powerful tools for understanding the spatial dynamics of disease outbreaks, but contain many parameters that are inferred from minimal geographic information (i.e., the single area in which each pathogen was sampled). Consequently, inferences under these models are inherently sensitive to our prior assumptions about the model parameters. Here, we demonstrate that the default priors used in empirical phylodynamic studies make strong and biologically unrealistic assumptions about the underlying geographic process. We provide empirical evidence that these unrealistic priors strongly (and adversely) impact commonly reported aspects of epidemiological studies, including 1) the relative rates of dispersal between areas; 2) the importance of dispersal routes for the spread of pathogens among areas; 3) the number of dispersal events between areas; and 4) the ancestral area in which a given outbreak originated. We offer strategies to avoid these problems, and develop tools to help researchers specify more biologically reasonable prior models that will realize the full potential of phylodynamic methods to elucidate pathogen biology and, ultimately, inform surveillance and monitoring policies to mitigate the impacts of disease outbreaks.

Acknowledgments

We thank Jeff Thorne, an anonymous reviewer, and the editor for providing thoughtful comments that greatly improved the manuscript. This research was supported by the NSF grants DEB-0842181, DEB-0919529, DBI-1356737, and DEB-1457835 awarded to B.R.M., and the NIH grant RO1GM123306-S awarded to B.R.

Author contributions

J.G., M.R.M., B.R., and B.R.M. designed research; performed research; contributed new reagents/analytic tools; analyzed data; and wrote the paper.

Competing interests

The authors declare no competing interest.

References

1

P. Lemey, A. Rambaut, A. J. Drummond, M. A. Suchard, Bayesian phylogeography finds its roots. PLoS Comput. Biol. 5, e1000520 (2009).

2

C. J. Edwards et al., Ancient hybridization and an Irish origin for the modern polar bear matriline. Curr. Biol. 21, 1251–1258 (2011).

3

A. J. Drummond, M. A. Suchard, D. Xie, A. Rambaut, Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973 (2012).

4

M. A. Suchard et al., Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol. 4, vey016 (2018).

5

M. Worobey et al., The emergence of SARS-CoV-2 in Europe and North America. Science 370, 564–570 (2020).

6

D. S. Candido et al., Evolution and epidemic spread of SARS-CoV-2 in Brazil. Science 369, 1255–1260 (2020).

7

P. Lemey et al., Untangling introductions and persistence in COVID-19 resurgence in Europe. Nature 595, 713–717 (2021).

8

M. U. G. Kraemer et al., Spatiotemporal invasion dynamics of SARS-CoV-2 lineage B.1.1.7 emergence. Science 373, 889–895 (2021).

9

T. Alpert et al., Early introductions and transmission of SARS-CoV-2 variant B.1.1.7 in the United States. Cell 184, 2595–2604 (2021).

10

E. Wilkinson et al., A year of genomic surveillance reveals how the SARS-CoV-2 pandemic unfolded in Africa. Science 374, 423–431 (2021).

11

L. du Plessis et al., Establishment and lineage dynamics of the SARS-CoV-2 epidemic in the UK. Science 371, 708–712 (2021).

12

Z. Yang, Molecular Evolution: A Statistical Approach (Oxford University Press, 2014).

13

T. Bayes, LII. An essay towards solving a problem in the doctrine of chances. By the late Rev. Mr. Bayes, F.R.S. communicated by Mr. Price, in a letter to John Canton, A.M.F.R.S. Philos. Trans. R. Soc. Lond. 1, 370–418 (1763).

14

E. Jakeman, P. Pusey, Significance of K distributions in scattering experiments. Phys. Rev. Lett. 40, 546 (1978).

15

R. E. Kass, A. E. Raftery, Bayes factors. J. Am. Statis. Assoc. 90, 773–795 (1995).

16

A. Gelman, X. L. Meng, H. Stern, Posterior predictive assessment of model fitness via realized discrepancies. Statis. Sin. 6, 733–760 (1996).

17

J. P. Bollback, Bayesian model adequacy and choice in phylogenetics. Mol. Biol. Evol. 19, 1171–1180 (2002).

18

P. K. Dash et al., Complete genome sequencing and evolutionary phylogeography analysis of Indian isolates of Dengue virus type 1. Virus Res. 195, 124–134 (2015).

19

L. Wilfert et al., Deformed wing virus is a recent global epidemic in honeybees driven by Varroa mites. Science 351, 594–597 (2016).

20

N. R. Faria et al., The early spread and epidemic ignition of HIV-1 in human populations. Science 346, 56–61 (2014).

21

T. Bedford et al., Global circulation patterns of seasonal influenza viruses vary with antigenic drift. Nature 523, 217 (2015).

22

H. W. Yao et al., The spatiotemporal expansion of human Rabies and its probable explanation in mainland China, 2004–2013. PLoS Negl. Trop. Dis. 9, e0003502 (2015).

23

J. Gao, M. R. May, B. Rannala, B. R. Moore, New phylogenetic models incorporating interval-specific dispersal dynamics improve inference of disease spread. Mol. Biol. Evol. 39, msac159 (2022).

24

T. Bedford et al., Cryptic transmission of SARS-CoV-2 in Washington state. Science 370, 571–575 (2020).

25

J. O. Berger, An overview of robust Bayesian analysis. Test 3, 5–124 (1994).

26

C. P. Robert, Prior feedback: A Bayesian approach to maximum likelihood estimation. Comput. Statis. 8, 279–294 (1993).

27

S. R. Lele, B. Dennis, F. Lutscher, Data cloning: Easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods. Ecol. Lett. 10, 551–563 (2007).

28

J. M. Ponciano, M. L. Taper, B. Dennis, S. R. Lele, Hierarchical models in ecology: Confidence intervals, hypothesis testing, and model selection using data cloning. Ecology 90, 356–362 (2009).

29

J. M. Ponciano, J. G. Burleigh, E. L. Braun, M. L. Taper, Assessing parameter identifiability in phylogenetic models using data cloning. Syst. Biol. 61, 955–972 (2012).

30

J. Gao, M. R. May, B. Rannala, B. R. Moore, PrioriTree: A utility for improving phylodynamic analyses in BEAST. Bioinformatics 39, btac849 (2023).

Information & Authors

Information

Published in

Proceedings of the National Academy of Sciences

Vol. 120 | No. 11
March 14, 2023

Submission history

Received: August 24, 2022

Accepted: January 17, 2023

Published online: March 10, 2023

Published in issue: March 14, 2023

Keywords

  1. phylodynamics
  2. prior sensitivity
  3. biogeography
  4. viral evolution
  5. epidemiology

Notes

This article is a PNAS Direct Submission.

* The real constraint on the geographic model is that it must be irreducible. A model with fewer than k − 1 dispersal routes cannot be irreducible; however, a model with at least k − 1 dispersal routes is not guaranteed to be irreducible. See SI Appendix, section S2, for details.

Note that the gamma prior on the average dispersal rate is referred to as the CTMC-rate reference prior in the BEAUti program used to generate input files for BEAST analyses.
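The irreducibility constraint in the first note above can be illustrated with a minimal sketch. It assumes a symmetric geographic model, so that each dispersal route permits movement in both directions and irreducibility reduces to every area being reachable from every other. The function name `is_irreducible`, the integer labeling of areas as 0 to k − 1, and the `symmetric` flag are illustrative assumptions and are not taken from the paper or from BEAST.

```python
def is_irreducible(k, routes, symmetric=True):
    """Return True if every area can be reached from every other area.

    k         -- number of discrete geographic areas, labeled 0..k-1 (assumed labeling)
    routes    -- iterable of (i, j) pairs marking dispersal routes with nonzero rate
    symmetric -- if True, each route allows dispersal in both directions (assumption)
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    forward = {area: set() for area in range(k)}
    backward = {area: set() for area in range(k)}
    for i, j in routes:
        forward[i].add(j)
        backward[j].add(i)
        if symmetric:
            forward[j].add(i)
            backward[i].add(j)

    def reaches_all(adjacency):
        # Depth-first search from area 0 over the given adjacency sets.
        seen = {0}
        stack = [0]
        while stack:
            area = stack.pop()
            for neighbor in adjacency[area]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        return len(seen) == k

    # Irreducible means area 0 reaches every area and every area reaches area 0.
    return reaches_all(forward) and reaches_all(backward)


k = 4
too_few = [(0, 1), (1, 2)]               # k - 2 routes
path = [(0, 1), (1, 2), (2, 3)]          # k - 1 routes linking all areas
triangle = [(0, 1), (1, 2), (0, 2)]      # k - 1 routes, area 3 isolated

print(is_irreducible(k, too_few))
print(is_irreducible(k, path))
print(is_irreducible(k, triangle))
```

With k = 4, the two-route configuration cannot be irreducible, the three-route path is irreducible, and the three-route triangle is not, because area 3 has no route. This mirrors the point of the note: having k − 1 routes is necessary but not sufficient.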

Authors

Affiliations

Jiansi Gao

Department of Evolution and Ecology, University of California, Davis, CA 95616

Michael R. May

Department of Evolution and Ecology, University of California, Davis, CA 95616

Bruce Rannala

Department of Evolution and Ecology, University of California, Davis, CA 95616

Brian R. Moore

Department of Evolution and Ecology, University of California, Davis, CA 95616
