Rare penetrant mutations confer severe risk of common diseases | Science

Petko P. Fiziev https://orcid.org/0000-0002-1572-4621, Jeremy McRae https://orcid.org/0000-0003-3411-9248, Jacob C. Ulirsch https://orcid.org/0000-0002-7947-0827, Jacqueline S. Dron https://orcid.org/0000-0002-3045-6530, Tobias Hamp, Yanshen Yang, Pierrick Wainschtein https://orcid.org/0000-0002-5203-6481, Zijian Ni https://orcid.org/0000-0003-1181-8337, Joshua G. Schraiber, Hong Gao https://orcid.org/0000-0001-6274-4513, Dylan Cable, Yair Field https://orcid.org/0000-0002-5327-1678, Francois Aguet https://orcid.org/0000-0001-9414-300X, Marc Fasnacht, Ahmed Metwally https://orcid.org/0000-0002-0155-7412, Jeffrey Rogers https://orcid.org/0000-0002-7374-6490, Tomas Marques-Bonet https://orcid.org/0000-0002-5597-3075, Heidi L. Rehm https://orcid.org/0000-0002-6025-0015, Anne O’Donnell-Luria https://orcid.org/0000-0001-6418-9592, Amit V. Khera https://orcid.org/0000-0001-6535-5839, and Kyle Kai-How Farh https://orcid.org/0000-0001-6947-8537


2 Jun 2023

Vol 380, Issue 6648

Structured Abstract


Genome-wide association studies (GWASs) have identified thousands of common genetic variants that are predictive of common disease susceptibility, but these variants individually have mild effects on disease owing to the effects of natural selection. By contrast, rare genetic variants can have large effects on common disease risk, but their use in genetic risk prediction has been limited to date owing to the difficulty of distinguishing pathogenic from benign variants and estimating the magnitude of their effects.


PrimateAI-3D is a three-dimensional convolutional neural network for missense variant–effect prediction, which was trained with common genetic variants from the population sequencing of 233 primate species. By applying this method to estimate the pathogenicity of rare coding variants in 454,712 UK Biobank individuals, we aimed to improve rare-variant association tests and genetic risk prediction for common diseases and complex traits.


We performed rare-variant burden tests for 90 well-powered, clinically relevant phenotypes in the UK Biobank exome dataset. Stratifying missense variants with PrimateAI-3D greatly improved gene discovery, revealing 73% more significant gene-phenotype associations (false discovery rate <0.05) compared with not using PrimateAI-3D. When benchmarked against prior studies, gene-phenotype pairs identified with our method were better supported by orthogonal genetic evidence from GWAS and genes from related Mendelian disorders. In addition, PrimateAI-3D scores showed the strongest correlation among existing variant interpretation algorithms for predicting the quantitative effects of rare variants on continuous clinical phenotypes.
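The idea of stratifying missense variants before a burden test can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Variant` record, the `min_score` cutoff, the gene names, and the toy data are all assumptions made for illustration, and a simple least-squares fit stands in for whatever regression model the study used. The sketch counts, per individual, the rare variants in one gene whose pathogenicity score passes a threshold, then regresses a continuous phenotype on that count.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Variant:
    gene: str
    score: float          # hypothetical pathogenicity score in [0, 1]
    carriers: frozenset   # IDs of individuals carrying the rare allele


def gene_burden(variants, gene, individuals, min_score):
    """Count, per individual, rare variants in `gene` scoring >= min_score."""
    burden = {person: 0 for person in individuals}
    for v in variants:
        if v.gene == gene and v.score >= min_score:
            for person in v.carriers:
                if person in burden:
                    burden[person] += 1
    return burden


def ols_slope(x, y):
    """Least-squares fit y ~ a + b*x; return (b, standard error of b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sqrt(rss / (n - 2) / sxx)


def burden_test(variants, gene, phenotype, min_score):
    """Regress the phenotype on gene burden; return (effect, z statistic)."""
    people = sorted(phenotype)
    burden = gene_burden(variants, gene, people, min_score)
    slope, se = ols_slope([burden[p] for p in people],
                          [phenotype[p] for p in people])
    return slope, slope / se


# Toy data (made up for illustration only).
phenotype = {"a": 1.0, "b": 1.2, "c": 0.9, "d": 3.1, "e": 2.9, "f": 1.1}
variants = [
    Variant("GENE1", 0.9, frozenset({"d"})),
    Variant("GENE1", 0.8, frozenset({"e"})),
    Variant("GENE1", 0.2, frozenset({"a"})),   # predicted benign
    Variant("GENE2", 0.95, frozenset({"b"})),  # different gene, ignored
]

effect_all, z_all = burden_test(variants, "GENE1", phenotype, min_score=0.0)
effect_strat, z_strat = burden_test(variants, "GENE1", phenotype, min_score=0.7)
print(f"all missense: effect={effect_all:.2f}, z={z_all:.2f}")
print(f"stratified:   effect={effect_strat:.2f}, z={z_strat:.2f}")
```

In this toy cohort, dropping the low-scoring variant removes a carrier with an unremarkable phenotype from the burden group, so the stratified test yields a larger effect estimate and test statistic. A real analysis would also adjust for covariates and correct for multiple testing across genes and phenotypes.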

Having validated our method for finding gene-phenotype relationships, we next constructed a rare-variant polygenic risk score (PRS) model by combining the rare-variant genes for each phenotype, weighting variants by their PrimateAI-3D prediction score and the direction and effect size of each associated gene. For comparison, we constructed common-variant PRS models and evaluated the performance of the two models for genetic risk prediction in a withheld-test subset of the cohort. Although common variants better explained overall population variance, rare-variant PRSs had more power at the ends of the distribution to identify individuals at the greatest risk for disease, and thus may be more relevant for population genetic screening and risk management. In contrast to common-variant PRS models derived from European populations, which generalize poorly to non-Europeans, rare-variant PRSs were substantially more portable to different cohorts and ancestry groups that were not seen during model training. Moreover, because the two models incorporate orthogonal information from nonoverlapping sets of variants, we combined rare- and common-variant PRS models into a unified model and observed further improvement in genetic risk prediction for common diseases.
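A rare-variant score of the kind described above can be sketched as a weighted sum. This is a simplified illustration under stated assumptions, not the published model: the function `rare_variant_prs`, the placeholder gene names, and all numeric values are hypothetical. Each carried variant contributes its pathogenicity score multiplied by the signed effect size of its gene.

```python
def rare_variant_prs(carried, gene_effect):
    """Sum score-weighted gene effects over one person's rare variants.

    carried:     list of (gene, pathogenicity_score) pairs for that person
    gene_effect: dict mapping gene -> signed burden effect size
    Variants in genes without an association are skipped.
    """
    return sum(gene_effect[gene] * score
               for gene, score in carried if gene in gene_effect)


# Toy inputs (made up): one risk-increasing and one protective gene.
gene_effect = {"GENE_A": 2.0, "GENE_B": -1.0}
cohort = {
    "p1": [("GENE_A", 0.9)],
    "p2": [("GENE_B", 0.8), ("GENE_C", 0.99)],  # GENE_C is not associated
    "p3": [],                                   # carries no qualifying variant
    "p4": [("GENE_A", 0.5), ("GENE_B", 0.5)],
}

scores = {person: rare_variant_prs(v, gene_effect) for person, v in cohort.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # highest predicted risk first
print(ranked, scores)
```

Because the sum keeps the sign of each gene's effect, carriers of protective variants fall at the low end of the ranking and carriers of damaging variants at the high end, which is the tail behavior the abstract emphasizes.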

To understand the extent to which rare-variant PRSs can be expected to improve with increases in discovery cohort size, we repeated our analyses in down-sampled subsets of the UK Biobank cohort. We found that the number of genes contributing to the rare-variant PRS increased linearly, with no signs of plateauing at a half-million exomes. Newly discovered rare-variant genes were strongly enriched at GWAS loci, forming allelic series with effect sizes that were ~10-fold larger on average than the respective common GWAS variant. Among well-powered GWAS loci that could be unambiguously assigned to a single gene, the majority showed subthreshold signal on the rare-variant burden test, indicating that rare penetrant variants exist at a large fraction of GWAS loci and can be incorporated into the rare-variant PRS with further advances in cohort size and variant effect prediction.


Understanding the impact of rare variants in common diseases is of prime interest for both precision medicine and the discovery of drug targets. By leveraging advances in variant effect prediction, we have demonstrated major improvements in rare-variant burden testing and genetic risk prediction. Notably, we observed that nearly all individuals carried at least one rare penetrant variant for the phenotypes we examined, demonstrating the utility of personal genome sequencing for otherwise healthy individuals in the general population.

Polygenic contribution of rare genetic variants to complex human traits, shown for serum cholesterol as a representative example.

(Left) Rare-variant burden tests capture the direction and effect sizes of genes in known lipid biosynthesis pathways. (Top right) When used in a rare-variant polygenic risk score, individuals at opposite ends of the PRS separate into high- and low-cholesterol groups. (Bottom right) Rare variants in these genes have larger effects compared with common variants identified by GWAS and are strongly predictive of individuals who are phenotypic outliers.


We examined 454,712 exomes for genes associated with a wide spectrum of complex traits and common diseases and observed that rare, penetrant mutations in genes implicated by genome-wide association studies confer ~10-fold larger effects than common variants in the same genes. Consequently, an individual at the phenotypic extreme and at the greatest risk for severe, early-onset disease is better identified by a few rare penetrant variants than by the collective action of many common variants with weak effects. By combining rare variants across phenotype-associated genes into a unified genetic risk model, we demonstrate superior portability across diverse global populations compared with common-variant polygenic risk scores, greatly improving the clinical utility of genetic-based risk prediction.

Supplementary Materials

This PDF file includes:

Materials and Methods

Figs. S1 to S16

References (81–89)

Other Supplementary Material for this manuscript includes the following:

References and Notes


A. Buniello, J. A. L. MacArthur, M. Cerezo, L. W. Harris, J. Hayhurst, C. Malangone, A. McMahon, J. Morales, E. Mountjoy, E. Sollis, D. Suveges, O. Vrousgou, P. L. Whetzel, R. Amode, J. A. Guillen, H. S. Riat, S. J. Trevanion, P. Hall, H. Junkins, P. Flicek, T. Burdett, L. A. Hindorff, F. Cunningham, H. Parkinson, The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res.47, D1005–D1012 (2019).


T. A. Manolio, F. S. Collins, N. J. Cox, D. B. Goldstein, L. A. Hindorff, D. J. Hunter, M. I. McCarthy, E. M. Ramos, L. R. Cardon, A. Chakravarti, J. H. Cho, A. E. Guttmacher, A. Kong, L. Kruglyak, E. Mardis, C. N. Rotimi, M. Slatkin, D. Valle, A. S. Whittemore, M. Boehnke, A. G. Clark, E. E. Eichler, G. Gibson, J. L. Haines, T. F. C. Mackay, S. A. McCarroll, P. M. Visscher, Finding the missing heritability of complex diseases. Nature461, 747–753 (2009).


Y. Ding, K. Hou, K. S. Burch, S. Lapinska, F. Privé, B. Vilhjálmsson, S. Sankararaman, B. Pasaniuc, Large uncertainty in individual polygenic risk score estimation impacts PRS-based risk stratification. Nat. Genet.54, 30–39 (2022).


A. R. Martin, M. Kanai, Y. Kamatani, Y. Okada, B. M. Neale, M. J. Daly, Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet.51, 584–591 (2019).


R. Henderson, M. O’Kane, V. McGilligan, S. Watterson, The genetics and screening of familial hypercholesterolaemia. J. Biomed. Sci.23, 39 (2016).


K. B. Kuchenbaecker, J. L. Hopper, D. R. Barnes, K. A. Phillips, T. M. Mooij, M. J. Roos-Blom, S. Jervis, F. E. van Leeuwen, R. L. Milne, N. Andrieu, D. E. Goldgar, M. B. Terry, M. A. Rookus, D. F. Easton, A. C. Antoniou, L. McGuffog, D. G. Evans, D. Barrowdale, D. Frost, J. Adlard, K. R. Ong, L. Izatt, M. Tischkowitz, R. Eeles, R. Davidson, S. Hodgson, S. Ellis, C. Nogues, C. Lasset, D. Stoppa-Lyonnet, J. P. Fricker, L. Faivre, P. Berthet, M. J. Hooning, L. E. van der Kolk, C. M. Kets, M. A. Adank, E. M. John, W. K. Chung, I. L. Andrulis, M. Southey, M. B. Daly, S. S. Buys, A. Osorio, C. Engel, K. Kast, R. K. Schmutzler, T. Caldes, A. Jakubowska, J. Simard, M. L. Friedlander, S. A. McLachlan, E. Machackova, L. Foretova, Y. Y. Tan, C. F. Singer, E. Olah, A. M. Gerdes, B. Arver, H. Olsson; BRCA1 and BRCA2 Cohort Consortium, Risks of Breast, Ovarian, and Contralateral Breast Cancer for BRCA1 and BRCA2 Mutation Carriers. JAMA317, 2402–2416 (2017).


S. A. Cohen, C. C. Pritchard, G. P. Jarvik, Lynch Syndrome: From Screening to Diagnosis to Treatment in the Era of Modern Molecular Oncology. Annu. Rev. Genomics Hum. Genet.20, 293–307 (2019).


A. R. Kim, J. C. Ulirsch, S. Wilmes, E. Unal, I. Moraga, M. Karakukcu, D. Yuan, S. Kazerounian, N. J. Abdulhay, D. S. King, N. Gupta, S. B. Gabriel, E. S. Lander, T. Patiroglu, A. Ozcan, M. A. Ozdemir, K. C. Garcia, J. Piehler, H. T. Gazda, D. E. Klein, V. G. Sankaran, Functional Selectivity in Cytokine Signaling Revealed Through a Pathogenic EPO Mutation. Cell168, 1053–1064.e15 (2017).


K. L. Smith, C. Isaacs, BRCA mutation testing in determining breast cancer therapy. Cancer J.17, 492–499 (2011).


M. Delvecchio, C. Pastore, P. Giordano, Treatment Options for MODY Patients: A Systematic Review of Literature. Diabetes Ther.11, 1667–1685 (2020).


K. J. Karczewski, L. C. Francioli, G. Tiao, B. B. Cummings, J. Alföldi, Q. Wang, R. L. Collins, K. M. Laricchia, A. Ganna, D. P. Birnbaum, L. D. Gauthier, H. Brand, M. Solomonson, N. A. Watts, D. Rhodes, M. Singer-Berk, E. M. England, E. G. Seaby, J. A. Kosmicki, R. K. Walters, K. Tashman, Y. Farjoun, E. Banks, T. Poterba, A. Wang, C. Seed, N. Whiffin, J. X. Chong, K. E. Samocha, E. Pierce-Hoffman, Z. Zappala, A. H. O’Donnell-Luria, E. V. Minikel, B. Weisburd, M. Lek, J. S. Ware, C. Vittal, I. M. Armean, L. Bergelson, K. Cibulskis, K. M. Connolly, M. Covarrubias, S. Donnelly, S. Ferriera, S. Gabriel, J. Gentry, N. Gupta, T. Jeandet, D. Kaplan, C. Llanwarne, R. Munshi, S. Novod, N. Petrillo, D. Roazen, V. Ruano-Rubio, A. Saltzman, M. Schleicher, J. Soto, K. Tibbetts, C. Tolonen, G. Wade, M. E. Talkowski, B. M. Neale, M. J. Daly, D. G. MacArthur, D. Ardissino, G. Atzmon, J. Barnard, L. Beaugerie, E. J. Benjamin, M. Boehnke, L. L. Bonnycastle, E. P. Bottinger, D. W. Bowden, M. J. Bown, J. C. Chambers, J. C. Chan, D. Chasman, J. Cho, M. K. Chung, B. Cohen, A. Correa, D. Dabelea, M. J. Daly, D. Darbar, R. Duggirala, J. Dupuis, P. T. Ellinor, R. Elosua, J. Erdmann, T. Esko, M. Färkkilä, J. Florez, A. Franke, G. Getz, B. Glaser, S. J. Glatt, D. Goldstein, C. Gonzalez, L. Groop, C. Haiman, C. Hanis, M. Harms, M. Hiltunen, M. M. Holi, C. M. Hultman, M. Kallela, J. Kaprio, S. Kathiresan, B.-J. Kim, Y. J. Kim, G. Kirov, J. Kooner, S. Koskinen, H. M. Krumholz, S. Kugathasan, S. H. Kwak, M. Laakso, T. Lehtimäki, R. J. F. Loos, S. A. Lubitz, R. C. W. Ma, D. G. MacArthur, J. Marrugat, K. M. Mattila, S. McCarroll, M. I. McCarthy, D. McGovern, R. McPherson, J. B. Meigs, O. Melander, A. Metspalu, B. M. Neale, P. M. Nilsson, M. C. O’Donovan, D. Ongur, L. Orozco, M. J. Owen, C. N. A. Palmer, A. Palotie, K. S. Park, C. Pato, A. E. Pulver, N. Rahman, A. M. Remes, J. D. Rioux, S. Ripatti, D. M. Roden, D. Saleheen, V. Salomaa, N. J. Samani, J. Scharf, H. Schunkert, M. B. Shoemaker, P. 
Sklar, H. Soininen, H. Sokol, T. Spector, P. F. Sullivan, J. Suvisaari, E. S. Tai, Y. Y. Teo, T. Tiinamaija, M. Tsuang, D. Turner, T. Tusie-Luna, E. Vartiainen, M. P. Vawter, J. S. Ware, H. Watkins, R. K. Weersma, M. Wessman, J. G. Wilson, R. J. Xavier, B. M. Neale, M. J. Daly, D. G. MacArthur; Genome Aggregation Database Consortium, The mutational constraint spectrum quantified from variation in 141,456 humans. Nature581, 434–443 (2020).


H. Gao, T. Hamp, J. Ede, J. G. Schraiber, J. McRae, M. Singer-Berk, Y. Yang, A. Dietrich, P. Fiziev, L. Kuderna, L. Sundaram, Y. Wu, A. Adhikari, Y. Field, S. Chen, S. Batzoglou, F. Aguet, G. Lemire, R. Reimers, D. Balick, M. C. Janiak, M. Kuhlwilm, J. D. Orkin, S. Manu, A. Valenzuela, J. Bergman, M. Rouselle, F. E. Silva, L. Agueda, J. Blanc, M. Gut, D. de Vries, I. Goodhead, R. A. Harris, M. Raveendran, A. Jensen, I. S. Chuma, J. Horvath, C. Hvilsom, D. Juan, P. Frandsen, F. R. de Melo, F. Bertuol, H. Byrne, I. Sampaio, I. Farias, J. V. do Amaral, M. Messias, M. N. F. da Silva, M. Trivedi, R. Rossi, T. Hrbek, N. Andriaholinirina, C. J. Rabarivola, A. Zaramody, C. J. Jolly, J. Phillips-Conroy, G. Wilkerson, C. Abee, J. H. Simmons, E. Fernandez-Duque, S. Kanthaswamy, F. Shiferaw, D. Wu, L. Zhou, Y. Shao, G. Zhang, J. D. Keyyu, S. Knauf, M. D. Le, E. Lizano, S. Merker, A. Navarro, T. Batallion, T. Nadler, C. C. Khor, J. Lee, P. Tan, W. K. Lim, A. C. Kitchener, D. Zinner, I. Gut, A. Melin, K. Guschanski, M. H. Schierup, R. M. D. Beck, G. Umapathy, C. Roos, J. P. Boubli, M. Lek, S. Sunyaev, A. O’Donnell, H. Rehm, J. Xu, J. Rogers, T. Marques-Bonet, K. K.-H. Farh, The landscape of tolerated genetic variation in humans and primates. Science380, eabn8197 (2023).


C. Bycroft, C. Freeman, D. Petkova, G. Band, L. T. Elliott, K. Sharp, A. Motyer, D. Vukcevic, O. Delaneau, J. O’Connell, A. Cortes, S. Welsh, A. Young, M. Effingham, G. McVean, S. Leslie, N. Allen, P. Donnelly, J. Marchini, The UK Biobank resource with deep phenotyping and genomic data. Nature562, 203–209 (2018).


D. Taliun, D. N. Harris, M. D. Kessler, J. Carlson, Z. A. Szpiech, R. Torres, S. A. G. Taliun, A. Corvelo, S. M. Gogarten, H. M. Kang, A. N. Pitsillides, J. LeFaive, S.-B. Lee, X. Tian, B. L. Browning, S. Das, A.-K. Emde, W. E. Clarke, D. P. Loesch, A. C. Shetty, T. W. Blackwell, A. V. Smith, Q. Wong, X. Liu, M. P. Conomos, D. M. Bobo, F. Aguet, C. Albert, A. Alonso, K. G. Ardlie, D. E. Arking, S. Aslibekyan, P. L. Auer, J. Barnard, R. G. Barr, L. Barwick, L. C. Becker, R. L. Beer, E. J. Benjamin, L. F. Bielak, J. Blangero, M. Boehnke, D. W. Bowden, J. A. Brody, E. G. Burchard, B. E. Cade, J. F. Casella, B. Chalazan, D. I. Chasman, Y. I. Chen, M. H. Cho, S. H. Choi, M. K. Chung, C. B. Clish, A. Correa, J. E. Curran, B. Custer, D. Darbar, M. Daya, M. de Andrade, D. L. DeMeo, S. K. Dutcher, P. T. Ellinor, L. S. Emery, C. Eng, D. Fatkin, T. Fingerlin, L. Forer, M. Fornage, N. Franceschini, C. Fuchsberger, S. M. Fullerton, S. Germer, M. T. Gladwin, D. J. Gottlieb, X. Guo, M. E. Hall, J. He, N. L. Heard-Costa, S. R. Heckbert, M. R. Irvin, J. M. Johnsen, A. D. Johnson, R. Kaplan, S. L. R. Kardia, T. Kelly, S. Kelly, E. E. Kenny, D. P. Kiel, R. Klemmer, B. A. Konkle, C. Kooperberg, A. Köttgen, L. A. Lange, J. Lasky-Su, D. Levy, X. Lin, K.-H. Lin, C. Liu, R. J. F. Loos, L. Garman, R. Gerszten, S. A. Lubitz, K. L. Lunetta, A. C. Y. Mak, A. Manichaikul, A. K. Manning, R. A. Mathias, D. D. McManus, S. T. McGarvey, J. B. Meigs, D. A. Meyers, J. L. Mikulla, M. A. Minear, B. D. Mitchell, S. Mohanty, M. E. Montasser, C. Montgomery, A. C. Morrison, J. M. Murabito, A. Natale, P. Natarajan, S. C. Nelson, K. E. North, J. R. O’Connell, N. D. Palmer, N. Pankratz, G. M. Peloso, P. A. Peyser, J. Pleiness, W. S. Post, B. M. Psaty, D. C. Rao, S. Redline, A. P. Reiner, D. Roden, J. I. Rotter, I. Ruczinski, C. Sarnowski, S. Schoenherr, D. A. Schwartz, J.-S. Seo, S. Seshadri, V. A. Sheehan, W. H. Sheu, M. B. Shoemaker, N. L. Smith, J. A. Smith, N. Sotoodehnia, A. M. Stilp, W. Tang, K. D. 
Taylor, M. Telen, T. A. Thornton, R. P. Tracy, D. J. Van Den Berg, R. S. Vasan, K. A. Viaud-Martinez, S. Vrieze, D. E. Weeks, B. S. Weir, S. T. Weiss, L.-C. Weng, C. J. Willer, Y. Zhang, X. Zhao, D. K. Arnett, A. E. Ashley-Koch, K. C. Barnes, E. Boerwinkle, S. Gabriel, R. Gibbs, K. M. Rice, S. S. Rich, E. K. Silverman, P. Qasba, W. Gan, G. J. Papanicolaou, D. A. Nickerson, S. R. Browning, M. C. Zody, S. Zöllner, J. G. Wilson, L. A. Cupples, C. C. Laurie, C. E. Jaquish, R. D. Hernandez, T. D. O’Connor, G. R. Abecasis, A. Beitelshees, T. Benos, M. Bezerra, J. Bis, R. Bowler, U. Broeckel, J. Broome, K. Bunting, C. Bustamante, E. Buth, J. Cardwell, V. Carey, C. Carty, R. Casaburi, P. Castaldi, M. Chaffin, C. Chang, Y.-C. Chang, S. Chavan, B.-J. Chen, W.-M. Chen, L.-M. Chuang, R.-H. Chung, S. Comhair, E. Cornell, C. Crandall, J. Crapo, J. Curtis, C. Damcott, S. David, C. Davis, L. Fuentes, M. DeBaun, R. Deka, S. Devine, Q. Duan, R. Duggirala, J. P. Durda, C. Eaton, L. Ekunwe, A. El Boueiz, S. Erzurum, C. Farber, M. Flickinger, M. Fornage, C. Frazar, M. Fu, L. Fulton, S. Gao, Y. Gao, M. Gass, B. Gelb, X. P. Geng, M. Geraci, A. Ghosh, C. Gignoux, D. Glahn, D.-W. Gong, H. Goring, S. Graw, D. Grine, C. C. Gu, Y. Guan, N. Gupta, J. Haessler, N. L. Hawley, B. Heavner, D. Herrington, C. Hersh, B. Hidalgo, J. Hixson, B. Hobbs, J. Hokanson, E. Hong, K. Hoth, C. A. Hsiung, Y.-J. Hung, H. Huston, C. M. Hwu, R. Jackson, D. Jain, M. A. Jhun, C. Johnson, R. Johnston, K. Jones, S. Kathiresan, A. Khan, W. Kim, G. Kinney, H. Kramer, C. Lange, E. Lange, L. Lange, C. Laurie, M. LeBoff, J. Lee, S. S. Lee, W.-J. Lee, D. Levine, J. Lewis, X. Li, Y. Li, H. Lin, H. Lin, K. H. Lin, S. Liu, Y. Liu, Y. Liu, J. Luo, M. Mahaney, B. Make, J. A. Manson, L. Margolin, L. Martin, S. Mathai, S. May, P. McArdle, M.-L. McDonald, S. McFarland, D. McGoldrick, C. McHugh, H. Mei, L. Mestroni, N. Min, R. L. Minster, M. Moll, A. Moscati, S. Musani, S. Mwasongwe, J. C. Mychaleckyj, G. Nadkarni, R. Naik, T. 
Naseri, S. Nekhai, B. Neltner, H. Ochs-Balcom, D. Paik, J. Pankow, A. Parsa, J. M. Peralta, M. Perez, J. Perry, U. Peters, L. S. Phillips, T. Pollin, J. P. Becker, M. P. Boorgula, M. Preuss, D. Qiao, Z. Qin, N. Rafaels, L. Raffield, L. Rasmussen-Torvik, A. Ratan, R. Reed, E. Regan, M. S. Reupena, C. Roselli, P. Russell, S. Ruuska, K. Ryan, E. C. Sabino, D. Saleheen, S. Salimi, S. Salzberg, K. Sandow, V. G. Sankaran, C. Scheller, E. Schmidt, K. Schwander, F. Sciurba, C. Seidman, J. Seidman, S. L. Sherman, A. Shetty, W. H.-H. Sheu, B. Silver, J. Smith, T. Smith, S. Smoller, B. Snively, M. Snyder, T. Sofer, G. Storm, E. Streeten, Y. J. Sung, J. Sylvia, A. Szpiro, C. Sztalryd, H. Tang, M. Taub, M. Taylor, S. Taylor, M. Threlkeld, L. Tinker, D. Tirschwell, S. Tishkoff, H. Tiwari, C. Tong, M. Tsai, D. Vaidya, P. VandeHaar, T. Walker, R. Wallace, A. Walts, F. F. Wang, H. Wang, K. Watson, J. Wessel, K. Williams, L. K. Williams, C. Wilson, J. Wu, H. Xu, L. Yanek, I. Yang, R. Yang, N. Zaghloul, M. Zekavat, S. X. Zhao, W. Zhao, D. Zhi, X. Zhou, X. Zhu, G. J. Papanicolaou, D. A. Nickerson, S. R. Browning, M. C. Zody, S. Zöllner, J. G. Wilson, L. A. Cupples, C. C. Laurie, C. E. Jaquish, R. D. Hernandez, T. D. O’Connor, G. R. Abecasis; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature590, 290–299 (2021).


J. C. Denny, J. L. Rutter, D. B. Goldstein, A. Philippakis, J. W. Smoller, G. Jenkins, E. Dishman; All of Us Research Program Investigators, The “All of Us” Research Program. N. Engl. J. Med.381, 668–676 (2019).


See supplementary materials.


J. D. Backman, A. H. Li, A. Marcketta, D. Sun, J. Mbatchou, M. D. Kessler, C. Benner, D. Liu, A. E. Locke, S. Balasubramanian, A. Yadav, N. Banerjee, C. E. Gillies, A. Damask, S. Liu, X. Bai, A. Hawes, E. Maxwell, L. Gurski, K. Watanabe, J. A. Kosmicki, V. Rajagopal, J. Mighty, M. Jones, L. Mitnaul, E. Stahl, G. Coppola, E. Jorgenson, L. Habegger, W. J. Salerno, A. R. Shuldiner, L. A. Lotta, J. D. Overton, M. N. Cantor, J. G. Reid, G. Yancopoulos, H. M. Kang, J. Marchini, A. Baras, G. R. Abecasis, M. A. R. Ferreira; Regeneron Genetics Center; DiscovEHR, Exome sequencing and analysis of 454,787 UK Biobank participants. Nature599, 628–634 (2021).


J. Mbatchou, L. Barnard, J. Backman, A. Marcketta, J. A. Kosmicki, A. Ziyatdinov, C. Benner, C. O’Dushlaine, M. Barber, B. Boutkov, L. Habegger, M. Ferreira, A. Baras, J. Reid, G. Abecasis, E. Maxwell, J. Marchini, Computationally efficient whole-genome regression for quantitative and binary traits. Nat. Genet.53, 1097–1103 (2021).


J. L. Goldstein, M. S. Brown, Binding and degradation of low density lipoproteins by cultured human fibroblasts. Comparison of cells from a normal subject and from a patient with homozygous familial hypercholesterolemia. J. Biol. Chem.249, 5153–5162 (1974).


M. S. Brown, J. L. Goldstein, Expression of the familial hypercholesterolemia gene in heterozygotes: Mechanism for a dominant disorder in man. Science185, 61–63 (1974).


M. S. Sabatine, PCSK9 inhibitors: Clinical evidence and implementation. Nat. Rev. Cardiol.16, 155–165 (2019).


S. M. Grundy, N. J. Stone, A. L. Bailey, C. Beam, K. K. Birtcher, R. S. Blumenthal, L. T. Braun, S. de Ferranti, J. Faiella-Tommasino, D. E. Forman, R. Goldberg, P. A. Heidenreich, M. A. Hlatky, D. W. Jones, D. Lloyd-Jones, N. Lopez-Pajares, C. E. Ndumele, C. E. Orringer, C. A. Peralta, J. J. Saseen, S. C. Smith Jr., L. Sperling, S. S. Virani, J. Yeboah, 2018 AHA/ACC/AACVPR/AAPA/ABC/ACPM/ADA/AGS/APhA/ASPC/NLA/PCNA Guideline on the Management of Blood Cholesterol: A Report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. Circulation139, e1082–e1143 (2019).


American Diabetes Association, 2. Classification and Diagnosis of Diabetes: Standards of Medical Care in Diabetes—2019. Diabetes Care42, S13–S28 (2019).


C. C. Cowie, S. S. Casagrande, L. S. Geiss, “Prevalence and Incidence of Type 2 Diabetes and Prediabetes” in Diabetes in America, C. C. Cowie, S. S. Casagrande, A. Menke, M. A. Cissell, M. S. Eberhardt, J. B. Meig, E. W. Gregg, W. C. Knowler, E. Barrett-Connor, D. J. Becker, F. L. Brancati, E. J. Boyko, W. H. Herman, B. V. Howard, K. M. V. Narayan, M. Rewers, J. E. Fradkin, Eds. (National Institutes of Health, ed. 3, 2018).


W. Fu, T. D. O’Connor, G. Jun, H. M. Kang, G. Abecasis, S. M. Leal, S. Gabriel, M. J. Rieder, D. Altshuler, J. Shendure, D. A. Nickerson, M. J. Bamshad, J. M. Akey; NHLBI Exome Sequencing Project, Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature493, 216–220 (2013).


G. V. Kryukov, L. A. Pennacchio, S. R. Sunyaev, Most rare missense alleles are deleterious in humans: Implications for complex disease and association studies. Am. J. Hum. Genet.80, 727–739 (2007).


J. Zeng, R. de Vlaming, Y. Wu, M. R. Robinson, L. R. Lloyd-Jones, L. Yengo, C. X. Yap, A. Xue, J. Sidorenko, A. F. McRae, J. E. Powell, G. W. Montgomery, A. Metspalu, T. Esko, G. Gibson, N. R. Wray, P. M. Visscher, J. Yang, Signatures of negative selection in the genetic architecture of human complex traits. Nat. Genet.50, 746–753 (2018).


A. P. Schoech, D. M. Jordan, P.-R. Loh, S. Gazal, L. J. O’Connor, D. J. Balick, P. F. Palamara, H. K. Finucane, S. R. Sunyaev, A. L. Price, Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection. Nat. Commun.10, 790 (2019).


K. Jaganathan, S. Kyriazopoulou Panagiotopoulou, J. F. McRae, S. F. Darbandi, D. Knowles, Y. I. Li, J. A. Kosmicki, J. Arbelaez, W. Cui, G. B. Schwartz, E. D. Chow, E. Kanterakis, H. Gao, A. Kia, S. Batzoglou, S. J. Sanders, K. K.-H. Farh, Predicting Splicing from Primary Sequence with Deep Learning. Cell176, 535–548.e24 (2019).


A. V. Khera, M. Chaffin, K. G. Aragam, M. E. Haas, C. Roselli, S. H. Choi, P. Natarajan, E. S. Lander, S. A. Lubitz, P. T. Ellinor, S. Kathiresan, Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet.50, 1219–1224 (2018).


S. M. Purcell, N. R. Wray, J. L. Stone, P. M. Visscher, M. C. O’Donovan, P. F. Sullivan, P. Sklar; International Schizophrenia Consortium, Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature460, 748–752 (2009).


J. Luo, H. Yang, B.-L. Song, Mechanisms and regulation of cholesterol homeostasis. Nat. Rev. Mol. Cell Biol.21, 225–245 (2020).


K. E. Berge, H. Tian, G. A. Graf, L. Yu, N. V. Grishin, J. Schultz, P. Kwiterovich, B. Shan, R. Barnes, H. H. Hobbs, Accumulation of dietary cholesterol in sitosterolemia caused by mutations in adjacent ABC transporters. Science290, 1771–1775 (2000).


J. D. Horton, J. C. Cohen, H. H. Hobbs, Molecular biology of PCSK9: Its role in LDL metabolism. Trends Biochem. Sci.32, 71–77 (2007).


J. Behbodikhah, S. Ahmed, A. Elyasi, L. J. Kasselman, J. De Leon, A. D. Glass, A. B. Reiss, Apolipoprotein B and Cardiovascular Disease: Biomarker and Potential Therapeutic Target. Metabolites11, 690 (2021).


J. E. Nahon, M. Hoekstra, S. van Hulst, C. Manta, S. Goerdt, J. J. Geerling, C. Géraud, M. Van Eck, Hematopoietic Stabilin-1 deficiency does not influence atherosclerosis susceptibility in LDL receptor knockout mice. Atherosclerosis281, 47–55 (2019).


J. J. P. Kastelein, H. N. Ginsberg, G. Langslet, G. K. Hovingh, R. Ceska, R. Dufour, D. Blom, F. Civeira, M. Krempf, C. Lorenzato, J. Zhao, R. Pordy, M. T. Baccara-Dinet, D. A. Gipe, M. J. Geiger, M. Farnier, ODYSSEY FH I and FH II: 78 week results with alirocumab treatment in 735 patients with heterozygous familial hypercholesterolaemia. Eur. Heart J.36, 2996–3003 (2015).


M. Van Heek, C. F. France, D. S. Compton, R. L. McLeod, N. P. Yumibe, K. B. Alton, E. J. Sybertz, H. R. Davis Jr., In vivo metabolism-based discovery of a potent cholesterol absorption inhibitor, SCH58235, in the rat and rhesus monkey through the identification of the active metabolites of SCH48461. J. Pharmacol. Exp. Ther.283, 157–163 (1997).


D. J. Weiner, A. Nadig, K. A. Jagadeesh, K. K. Dey, B. M. Neale, E. B. Robinson, K. J. Karczewski, L. J. O’Connor, Polygenic architecture of rare coding variation across 400,000 exomes. medRxiv 2022.07.06.22277335 [Preprint] (2022) [cited 2023].


C. Marras, J. C. Beck, J. H. Bower, E. Roberts, B. Ritz, G. W. Ross, R. D. Abbott, R. Savica, S. K. Van Den Eeden, A. W. Willis, C. M. Tanner; Parkinson’s Foundation P4 Group, Prevalence of Parkinson’s disease across North America. NPJ Parkinsons Dis.4, 21 (2018).


M. T. Wallin, W. J. Culpepper, J. D. Campbell, L. M. Nelson, A. Langer-Gould, R. A. Marrie, G. R. Cutter, W. E. Kaye, L. Wagner, H. Tremlett, S. L. Buka, P. Dilokthornsakul, B. Topol, L. H. Chen, N. G. LaRocca; US Multiple Sclerosis Prevalence Workgroup, The prevalence of MS in the United States: A population-based estimate using health claims data. Neurology92, e1029–e1040 (2019).


A. Gupta, Y. Wang, J. A. Spertus, M. Geda, N. Lorenze, C. Nkonde-Price, G. D’Onofrio, J. H. Lichtman, H. M. Krumholz, Trends in acute myocardial infarction in young patients and differences by sex and race, 2001 to 2010. J. Am. Coll. Cardiol.64, 337–345 (2014).


J. M. Lawrence, J. Divers, S. Isom, S. Saydah, G. Imperatore, C. Pihoker, S. M. Marcovina, E. J. Mayer-Davis, R. F. Hamman, L. Dolan, D. Dabelea, D. J. Pettitt, A. D. Liese; SEARCH for Diabetes in Youth Study Group, Trends in Prevalence of Type 1 and Type 2 Diabetes in Children and Adolescents in the US, 2001-2017. JAMA326, 717–727 (2021).


A. V. Khera, M. Chaffin, K. H. Wade, S. Zahid, J. Brancale, R. Xia, M. Distefano, O. Senol-Cosar, M. E. Haas, A. Bick, K. G. Aragam, E. S. Lander, G. D. Smith, H. Mason-Suares, M. Fornage, M. Lebo, N. J. Timpson, L. M. Kaplan, S. Kathiresan, Polygenic Prediction of Weight and Obesity Trajectories from Birth to Adulthood. Cell177, 587–596.e9 (2019).


B. G. Nordestgaard, M. J. Chapman, S. E. Humphries, H. N. Ginsberg, L. Masana, O. S. Descamps, O. Wiklund, R. A. Hegele, F. J. Raal, J. C. Defesche, A. Wiegman, R. D. Santos, G. F. Watts, K. G. Parhofer, G. K. Hovingh, P. T. Kovanen, C. Boileau, M. Averna, J. Borén, E. Bruckert, A. L. Catapano, J. A. Kuivenhoven, P. Pajukanta, K. Ray, A. F. H. Stalenhoef, E. Stroes, M.-R. Taskinen, A. Tybjærg-Hansen; European Atherosclerosis Society Consensus Panel, Familial hypercholesterolaemia is underdiagnosed and undertreated in the general population: guidance for clinicians to prevent coronary heart disease: consensus statement of the European Atherosclerosis Society. Eur. Heart J.34, 3478–3490 (2013).


G. Thanabalasingham, K. R. Owen, Diagnosis and management of maturity onset diabetes of the young (MODY). BMJ343 (oct19 3), d6044 (2011).


A. Markham, Evinacumab: First Approval. Drugs81, 1101–1105 (2021).


D. Curtis, Polygenic risk score for schizophrenia is more strongly associated with ancestry than with schizophrenia. Psychiatr. Genet.28, 85–89 (2018).


C. Márquez-Luna, S. Gazal, P.-R. Loh, S. S. Kim, N. Furlotte, A. Auton, A. L. Price, 23andMe Research Team, LDpred-funct: incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets. bioRxiv 375337 [Preprint] (2018) [cited 2023].


E. W. Karlson, N. T. Boutin, A. G. Hoffnagle, N. L. Allen, Building the Partners HealthCare Biobank at Partners Personalized Medicine: Informed Consent, Return of Research Results, Recruitment Lessons and Operational Considerations. J. Pers. Med.6, 2 (2016).


E. M. Scott, A. Halees, Y. Itan, E. G. Spencer, Y. He, M. A. Azab, S. B. Gabriel, A. Belkadi, B. Boisson, L. Abel, A. G. Clark, F. S. Alkuraya, J.-L. Casanova, J. G. Gleeson; Greater Middle East Variome Consortium, Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery. Nat. Genet.48, 1071–1076 (2016).


N. Shah, Y.-C. C. Hou, H.-C. Yu, R. Sainger, C. T. Caskey, J. C. Venter, A. Telenti, Identification of Misclassified ClinVar Variants via Disease Population Prevalence. Am. J. Hum. Genet.102, 609–619 (2018).


M. Lek, K. J. Karczewski, E. V. Minikel, K. E. Samocha, E. Banks, T. Fennell, A. H. O’Donnell-Luria, J. S. Ware, A. J. Hill, B. B. Cummings, T. Tukiainen, D. P. Birnbaum, J. A. Kosmicki, L. E. Duncan, K. Estrada, F. Zhao, J. Zou, E. Pierce-Hoffman, J. Berghout, D. N. Cooper, N. Deflaux, M. DePristo, R. Do, J. Flannick, M. Fromer, L. Gauthier, J. Goldstein, N. Gupta, D. Howrigan, A. Kiezun, M. I. Kurki, A. L. Moonshine, P. Natarajan, L. Orozco, G. M. Peloso, R. Poplin, M. A. Rivas, V. Ruano-Rubio, S. A. Rose, D. M. Ruderfer, K. Shakir, P. D. Stenson, C. Stevens, B. P. Thomas, G. Tiao, M. T. Tusie-Luna, B. Weisburd, H.-H. Won, D. Yu, D. M. Altshuler, D. Ardissino, M. Boehnke, J. Danesh, S. Donnelly, R. Elosua, J. C. Florez, S. B. Gabriel, G. Getz, S. J. Glatt, C. M. Hultman, S. Kathiresan, M. Laakso, S. McCarroll, M. I. McCarthy, D. McGovern, R. McPherson, B. M. Neale, A. Palotie, S. M. Purcell, D. Saleheen, J. M. Scharf, P. Sklar, P. F. Sullivan, J. Tuomilehto, M. T. Tsuang, H. C. Watkins, J. G. Wilson, M. J. Daly, D. G. MacArthur; Exome Aggregation Consortium, Analysis of protein-coding genetic variation in 60,706 humans. Nature536, 285–291 (2016).


B. Kaufman, R. Shapira-Frommer, R. K. Schmutzler, M. William Audeh, M. Friedlander, J. Balmaña, G. Mitchell, G. Fried, K. Bowen, A. Fielding, S. M. Domchek, Olaparib monotherapy in patients with advanced cancer and a germ-line BRCA1/2 mutation: An open-label phase II study. J. Clin. Oncol.31 (15_suppl), 11024–11024 (2013).


C. P. Cannon, M. A. Blazing, R. P. Giugliano, A. McCagg, J. A. White, P. Theroux, H. Darius, B. S. Lewis, T. O. Ophuis, J. W. Jukema, G. M. De Ferrari, W. Ruzyllo, P. De Lucca, K. Im, E. A. Bohula, C. Reist, S. D. Wiviott, A. M. Tershakovec, T. A. Musliner, E. Braunwald, R. M. Califf; IMPROVE-IT Investigators, Ezetimibe Added to Statin Therapy after Acute Coronary Syndromes. N. Engl. J. Med. 372, 2387–2397 (2015).


J. P. Evans, B. C. Powell, J. S. Berg, Finding the Rare Pathogenic Variants in a Human Genome. JAMA 317, 1904–1905 (2017).


N. J. Schork, S. S. Murray, K. A. Frazer, E. J. Topol, Common vs. rare allele hypotheses for complex diseases. Curr. Opin. Genet. Dev. 19, 212–219 (2009).


H. Shi, S. Gazal, M. Kanai, E. M. Koch, A. P. Schoech, K. M. Siewert, S. S. Kim, Y. Luo, T. Amariuta, H. Huang, Y. Okada, S. Raychaudhuri, S. R. Sunyaev, A. L. Price, Population-specific causal disease effect sizes in functionally important regions impacted by selection. Nat. Commun. 12, 1098 (2021).


S. L. Spain, J. C. Barrett, Strategies for fine-mapping complex traits. Hum. Mol. Genet. 24 (R1), R111–R119 (2015).


P. M. Visscher, N. R. Wray, Q. Zhang, P. Sklar, M. I. McCarthy, M. A. Brown, J. Yang, 10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet. 101, 5–22 (2017).


J. Frazer, P. Notin, M. Dias, A. Gomez, J. K. Min, K. Brock, Y. Gal, D. S. Marks, Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95 (2021).


G. M. Findlay, R. M. Daza, B. Martin, M. D. Zhang, A. P. Leith, M. Gasperini, J. D. Janizek, X. Huang, L. M. Starita, J. Shendure, Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222 (2018).


L. Sundaram, H. Gao, S. R. Padigepati, J. F. McRae, Y. Li, J. A. Kosmicki, N. Fritzilas, J. Hakenberg, A. Dutta, J. Shon, J. Xu, S. Batzoglou, X. Li, K. K.-H. Farh, Predicting the clinical impact of human mutation with deep neural networks. Nat. Genet. 50, 1161–1170 (2018).


H3Africa Consortium, C. Rotimi, A. Abayomi, A. Abimiku, V. M. Adabayeri, C. Adebamowo, E. Adebiyi, A. D. Ademola, A. Adeyemo, D. Adu, D. Affolabi, G. Agongo, S. Ajayi, S. Akarolo-Anthony, R. Akinyemi, A. Akpalu, M. Alberts, O. Alonso Betancourt, A. M. Alzohairy, G. Ameni, O. Amodu, G. Anabwani, K. Andersen, F. Arogundade, O. Arulogun, D. Asogun, R. Bakare, M. L. Baniecki, C. Beiswanger, A. Benkahla, L. Bethke, M. Boehnke, V. Boima, J. Brandful, A. I. Brooks, F. C. Brosius, C. Brown, B. Bucheton, D. T. Burke, B. G. Burnett, S. Carrington-Lawrence, N. Carstens, J. Chisi, A. Christoffels, R. Cooper, H. Cordell, N. Crowther, T. Croxton, J. de Vries, L. Derr, P. Donkor, S. Doumbia, A. Duncanson, I. Ekem, A. El Sayed, M. E. Engel, J. C. K. Enyaru, D. Everett, F. M. Fadlelmola, E. Fakunle, K. H. Fischbeck, A. Fischer, O. Folarin, J. Gamieldien, R. F. Garry, S. Gaseitsiwe, R. Gbadegesin, A. Ghansah, M. Giovanni, P. Goesbeck, F. X. Gomez-Olive, D. S. Grant, R. Grewal, M. Guyer, N. A. Hanchard, C. T. Happi, S. Hazelhurst, B. J. Hennig, C. Hertz-Fowler, W. Hide, F. Hilderbrandt, C. Hugo-Hamman, M. E. Ibrahim, R. James, Y. Jaufeerally-Fakim, C. Jenkins, U. Jentsch, P.-P. Jiang, M. Joloba, V. Jongeneel, F. Joubert, M. Kader, K. Kahn, P. Kaleebu, S. H. Kapiga, S. K. Kassim, I. Kasvosve, J. Kayondo, B. Keavney, A. Kekitiinwa, S. H. Khan, P. Kimmel, M.-C. King, R. Kleta, M. Koffi, J. Kopp, M. Kretzler, J. Kumuthini, S. Kyobe, C. Kyobutungi, D. T. Lackland, K. A. Lacourciere, G. Landouré, R. Lawlor, T. Lehner, M. Lesosky, N. Levitt, K. Littler, Z. Lombard, J. F. Loring, S. Lyantagaye, A. Macleod, E. B. Madden, C. R. Mahomva, J. Makani, M. Mamven, M. Marape, G. Mardon, P. Marshall, D. P. Martin, D. Masiga, R. Mason, M. Mate-Kole, E. Matovu, M. Mayige, B. M. Mayosi, J. C. Mbanya, S. A. McCurdy, M. I. McCarthy, H. McIlleron, S. O. Mc’Ligeyo, C. Merle, A. O. Mocumbi, C. Mondo, J. V. Moran, A. Motala, M. Moxey-Mims, W. S. Mpoloka, C. L. Msefula, T. Mthiyane, N. Mulder, G.her Mulugeta, D. Mumba, J. Musuku, M. Nagdee, O. Nash, D. Ndiaye, A. Q. Nguyen, M. Nicol, O. Nkomazana, S. Norris, B. Nsangi, A. Nyarko, M. Nyirenda, E. Obe, R. Obiakor, A. Oduro, S. F. Ofori-Acquah, O. Ogah, S. Ogendo, K. Ohene-Frempong, A. Ojo, T. Olanrewaju, J. Oli, C. Osafo, O. Ouwe Missi Oukem-Boyer, B. Ovbiagele, A. Owen, M. O. Owolabi, L. Owolabi, E. Owusu-Dabo, G. Pare, R. Parekh, H. G. Patterton, M. B. Penno, J. Peterson, R. Pieper, J. Plange-Rhule, M. Pollak, J. Puzak, R. S. Ramesar, M. Ramsay, R. Rasooly, S. Reddy, P. C. Sabeti, K. Sagoe, T. Salako, O. Samassékou, M. S. Sandhu, O. Sankoh, F. S. Sarfo, M. Sarr, G. Shaboodien, I. Sidibe, G. Simo, M. Simuunza, L. Smeeth, E. Sobngwi, H. Soodyall, H. Sorgho, O. Sow Bah, S. Srinivasan, D. J. Stein, E. S. Susser, C. Swanepoel, G. Tangwa, A. Tareila, O. Tastan Bishop, B. Tayo, N. Tiffin, H. Tinto, E. Tobin, S. M. Tollman, M. Traoré, M. J. Treadwell, J. Troyer, M. Tsimako-Johnstone, V. Tukei, I. Ulasi, N. Ulenga, B. van Rooyen, A. P. Wachinou, S. P. Waddy, A. Wade, M. Wayengera, J. Whitworth, L. Wideroff, C. A. Winkler, S. Winnicki, A. Wonkam, M. Yewondwos, T. Sen, N. Yozwiak, H. Zar, Enabling the genomic revolution in Africa. Science 344, 1346–1348 (2014).


J. D. Wall, E. W. Stawiski, A. Ratan, H. L. Kim, C. Kim, R. Gupta, K. Suryamohan, E. S. Gusareva, R. W. Purbojati, T. Bhangale, V. Stepanov, V. Kharkov, M. S. Schröder, V. Ramprasad, J. Tom, S. Durinck, Q. Bei, J. Li, J. Guillory, S. Phalke, A. Basu, J. Stinson, S. Nair, S. Malaichamy, N. K. Biswas, J. C. Chambers, K. C. Cheng, J. T. George, S. S. Khor, J.-I. Kim, B. Cho, R. Menon, T. Sattibabu, A. Bassi, M. Deshmukh, A. Verma, V. Gopalan, J.-Y. Shin, M. Pratapneni, S. Santhosh, K. Tokunaga, B. M. Md-Zain, K. G. Chan, M. Parani, P. Natarajan, M. Hauser, R. R. Allingham, C. Santiago-Turla, A. Ghosh, S. G. K. Gadde, C. Fuchsberger, L. Forer, S. Schoenherr, H. Sudoyo, J. S. Lansing, J. Friedlaender, G. Koki, M. P. Cox, M. Hammer, T. Karafet, K. C. Ang, S. Q. Mehdi, V. Radha, V. Mohan, P. P. Majumder, S. Seshagiri, J.-S. Seo, S. C. Schuster, A. S. Peterson; GenomeAsia100K Consortium, The GenomeAsia 100K Project enables genetic discoveries across Asia. Nature 576, 106–111 (2019).


M. C. Mills, C. Rahal, The GWAS Diversity Monitor tracks diversity by disease in real time. Nat. Genet. 52, 242–243 (2020).


B.-J. Feng, PERCH: A Unified Framework for Disease Gene Prioritization. Hum. Mutat. 38, 243–251 (2017).


P. Rentzsch, M. Schubach, J. Shendure, M. Kircher, CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med. 13, 31 (2021).


N. Alirezaie, K. D. Kernohan, T. Hartley, J. Majewski, T. D. Hocking, ClinPred: Prediction Tool to Identify Disease-Relevant Nonsynonymous Single-Nucleotide Variants. Am. J. Hum. Genet. 103, 474–483 (2018).


M. F. Rogers, H. A. Shihab, M. Mort, D. N. Cooper, T. R. Gaunt, C. Campbell, FATHMM-XF: Accurate prediction of pathogenic point mutations via extended features. Bioinformatics 34, 511–513 (2018).


K. A. Jagadeesh, A. M. Wenger, M. J. Berger, H. Guturu, P. D. Stenson, D. N. Cooper, J. A. Bernstein, G. Bejerano, M-CAP eliminates a majority of variants of uncertain significance in clinical exomes at high sensitivity. Nat. Genet. 48, 1581–1586 (2016).


C. Dong, P. Wei, X. Jian, R. Gibbs, E. Boerwinkle, K. Wang, X. Liu, Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies. Hum. Mol. Genet. 24, 2125–2137 (2015).


B. Reva, Y. Antipin, C. Sander, Predicting the functional impact of protein mutations: Application to cancer genomics. Nucleic Acids Res. 39, e118–e118 (2011).


I. A. Adzhubei, S. Schmidt, L. Peshkin, V. E. Ramensky, A. Gerasimova, P. Bork, A. S. Kondrashov, S. R. Sunyaev, A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).


Y. Choi, G. E. Sims, S. Murphy, J. R. Miller, A. P. Chan, Predicting the functional effect of amino acid substitutions and indels. PLOS ONE 7, e46688 (2012).


N. M. Ioannidis, J. H. Rothstein, V. Pejaver, S. Middha, S. K. McDonnell, S. Baheti, A. Musolf, Q. Li, E. Holzinger, D. Karyadi, L. A. Cannon-Albright, C. C. Teerlink, J. L. Stanford, W. B. Isaacs, J. Xu, K. A. Cooney, E. M. Lange, J. Schleutker, J. D. Carpten, I. J. Powell, O. Cussenot, G. Cancel-Tassin, G. G. Giles, R. J. MacInnis, C. Maier, C.-L. Hsieh, F. Wiklund, W. J. Catalona, W. D. Foulkes, D. Mandal, R. A. Eeles, Z. Kote-Jarai, C. D. Bustamante, D. J. Schaid, T. Hastie, E. A. Ostrander, J. E. Bailey-Wilson, P. Radivojac, S. N. Thibodeau, A. S. Whittemore, W. Sieh, REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. Am. J. Hum. Genet. 99, 877–885 (2016).


N.-L. Sim, P. Kumar, J. Hu, S. Henikoff, G. Schneider, P. C. Ng, SIFT web server: Predicting effects of amino acid substitutions on proteins. Nucleic Acids Res. 40 (W1), W452–W457 (2012).


H. Carter, C. Douville, P. D. Stenson, D. N. Cooper, R. Karchin, Identifying Mendelian disease genes with the variant effect scoring tool. BMC Genomics 14 (Suppl 3), S3 (2013).


Y. Wu, E. M. Byrne, Z. Zheng, K. E. Kemper, L. Yengo, A. J. Mallett, J. Yang, P. M. Visscher, N. R. Wray, Genome-wide association study of medication-use and associated disease in the UK Biobank. Nat. Commun. 10, 1891 (2019).


A. G. Barnett, J. C. van der Pols, A. J. Dobson, Regression to the mean: What it is and how to deal with it. Int. J. Epidemiol. 34, 215–220 (2005).


C. C. Chang, C. C. Chow, L. C. Tellier, S. Vattikuti, S. M. Purcell, J. J. Lee, Second-generation PLINK: Rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).


K. J. Galinsky, G. Bhatia, P.-R. Loh, S. Georgiev, S. Mukherjee, N. J. Patterson, A. L. Price, Fast Principal-Component Analysis Reveals Convergent Evolution of ADH1B in Europe and East Asia. Am. J. Hum. Genet. 98, 456–472 (2016).


F. Hormozdiari, E. Kostem, E. Y. Kang, B. Pasaniuc, E. Eskin, Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508 (2014).


A. Battle, C. D. Brown, B. E. Engelhardt, S. B. Montgomery; GTEx Consortium; Laboratory, Data Analysis & Coordinating Center (LDACC)—Analysis Working Group; Statistical Methods groups—Analysis Working Group; Enhancing GTEx (eGTEx) groups; NIH Common Fund; NIH/NCI; NIH/NHGRI; NIH/NIMH; NIH/NIDA; Biospecimen Collection Source Site—NDRI; Biospecimen Collection Source Site—RPCI; Biospecimen Core Resource—VARI; Brain Bank Repository—University of Miami Brain Endowment Bank; Leidos Biomedical—Project Management; ELSI Study; Genome Browser Data Integration & Visualization—EBI; Genome Browser Data Integration & Visualization—UCSC Genomics Institute, University of California Santa Cruz; Lead analysts; Laboratory, Data Analysis & Coordinating Center (LDACC); NIH program management; Biospecimen collection; Pathology; eQTL manuscript working group, Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).


X. Liu, X. Jian, E. Boerwinkle, dbNSFP: A lightweight database of human nonsynonymous SNPs and their functional predictions. Hum. Mutat. 32, 894–899 (2011).


X. Liu, C. Li, C. Mou, Y. Dong, Y. Tu, dbNSFP v4: A comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Med. 12, 103 (2020).

Information & Authors


Published In


Volume 380 | Issue 6648
2 June 2023


Copyright © 2023 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

Submission history

Received: 14 January 2022

Accepted: 16 March 2023

Published in print: 2 June 2023



We thank D. MacArthur, J. Pritchard, M. Rivas, N. Ersaro, and I. Mitra for helpful discussions, and the participants and investigators in the UK Biobank (Resource Application Number 33751) and MGB studies (protocol 2018P001236) who made this work possible.

Funding: T.M.B. is supported by funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 864203), PID2021-126004NB-100 (MICIIN/FEDER, UE) and Secretaria d’Universitats i Recerca, and CERCA Programme del Departament d’Economia i Coneixement de la Generalitat de Catalunya (GRC 2021 SGR 00177).

Author contributions: P.P.F., J.M., J.C.U., J.S.D., T.H., Y.Y., P.W., Z.N., J.G.S., H.G., A.M., D.C., F.A., M.F., Y.F., and K.K.-H.F. performed the analysis and wrote the manuscript. J.R., T.M.B., H.L.R., A.O.L., A.V.K., and K.K.-H.F. supervised the work.

Competing interests: H.L.R. receives funding from Illumina, Inc. and Microsoft Corporation to support rare disease gene discovery and diagnosis. A.V.K. is an employee of Verve Therapeutics, Inc., has served as a scientific advisor to Amgen Inc., Novartis AG, Silence Therapeutics PLC, Korro Bio, Inc., Veritas International SL, Color Health, Inc., Third Rock Ventures, Illumina Inc., Ambry Genetics Corporation, and Foresite Labs. A.V.K. holds equity in Verve Therapeutics, Inc., Color Health, Inc., and Foresite Labs. Employees of Illumina, Inc. are indicated in the list of author affiliations. Patents related to this work are (i) “Covariate correction including drug use from temporal data”; filing no. 63/351317; P. Fiziev, J. McRae, and K.-H. Farh; (ii) “Optimized burden test based on nested t tests that maximize separation between carriers and non-carriers”; filing no. 63/351283; P. Fiziev, J. McRae, and K.-H. Farh; (iii) “Rare variant polygenic risk scores”; filing no. 63/351299; P. Fiziev, J. McRae, and K.-H. Farh; and (iv) “Transformer language model for variant pathogenicity”; filing no. US 17/975,536 and US 17/975,547; J. Ede, T. Hamp, A. Dietrich, Y. Wu, and K.-H. Farh.

Data and materials availability: PrimateAI-3D prediction scores are available with a non-commercial license upon request and are displayed at https://primad.basespace.illumina.com. Source code is available at https://github.com/Illumina/PrimateAI-3D, with archived versions of the rare variant burden test and polygenic score at (79) and (80).



Petko P. Fiziev

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Conceptualization, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, and Writing – original draft.

Jeremy McRae

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, and Writing – review & editing.

Jacob C. Ulirsch

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Conceptualization, Formal analysis, Methodology, Visualization, Writing – original draft, and Writing – review & editing.

Jacqueline S. Dron

Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA.

Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Roles: Validation and Writing – review & editing.

Tobias Hamp

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – original draft, and Writing – review & editing.

Yanshen Yang

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Software and Visualization.

Pierrick Wainschtein

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia.

Roles: Formal analysis and Writing – original draft.

Zijian Ni

Department of Statistics, University of Wisconsin–Madison, Madison, WI 53706, USA.

Roles: Formal analysis, Methodology, Software, Validation, and Writing – original draft.

Joshua G. Schraiber

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Writing – original draft and Writing – review & editing.

Hong Gao

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Formal analysis, Investigation, and Writing – review & editing.

Dylan Cable

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA 02142, USA.

Roles: Conceptualization, Formal analysis, Methodology, Software, and Writing – original draft.

Yair Field

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Formal analysis, Investigation, Methodology, Validation, and Writing – review & editing.

Francois Aguet

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Writing – original draft and Writing – review & editing.

Marc Fasnacht

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Role: Formal analysis.

Ahmed Metwally

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, and Writing – review & editing.

Jeffrey Rogers

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Wisconsin National Primate Research Center, University of Wisconsin–Madison, Madison, WI 53715, USA.

Roles: Data curation, Investigation, Project administration, Resources, Visualization, and Writing – review & editing.

Tomas Marques-Bonet

Institute of Evolutionary Biology (UPF-CSIC), 08003 Barcelona, Spain.

Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain.

CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain.

Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Barcelona, Spain.

Roles: Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, and Writing – review & editing.

Heidi L. Rehm

Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA.

Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA.

Roles: Conceptualization and Validation.

Anne O’Donnell-Luria

Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA.

Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115, USA.

Roles: Conceptualization and Supervision.

Amit V. Khera

Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA.

Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Verve Therapeutics, Cambridge, MA 02215, USA.

Roles: Supervision and Writing – review & editing.

Kyle Kai-How Farh

Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA.

Roles: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing – original draft, and Writing – review & editing.

Funding Information

NIH: 1R01HG010898-01A1


These authors contributed equally to this work.

Cite as

Rare penetrant mutations confer severe risk of common diseases. Science 380, eabo1131 (2023). DOI: 10.1126/science.abo1131

Author: Samatha Mote