Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations
Alicia R. Martin,
Elizabeth G. Atkinson,
Sinéad B. Chapman,
Rocky E. Stroud,
Fred K. Ashaba,
Lori B Chibnik,
Wilfred E. Injera,
Symon M. Kariuki,
Karestan C. Koenen,
Rehema M. Mwema,
Benjamin M Neale,
Carter P. Newman,
Charles R. J. C. Newton,
Joseph K. Pickrell,
Dan J. Stein,
Celia van der Merwe,
Posted 28 Apr 2020
bioRxiv DOI: 10.1101/2020.04.27.064832
Background: Genetic studies of biomedical phenotypes in underrepresented populations identify disproportionate numbers of novel associations. However, current genomics infrastructure--including most genotyping arrays and sequenced reference panels--best serves populations of European descent. A critical step for facilitating genetic studies in underrepresented populations is to ensure that genetic technologies accurately capture variation in all populations. Here, we quantify the accuracy of low-coverage sequencing in diverse African populations.

Results: We sequenced the whole genomes of 91 individuals to high coverage (>20X) from the Neuropsychiatric Genetics of African Population-Psychosis (NeuroGAP-Psychosis) study, in which participants were recruited from Ethiopia, Kenya, South Africa, and Uganda. We empirically tested two data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole genome sequencing data. We show that low-coverage sequencing at a depth of ≥4X captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5-1X) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation, with 4X sequencing detecting 45% of singletons and 95% of common variants identified in high-coverage African whole genomes.

Conclusion: These results indicate that low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, including those that capture variation most common in Europeans and Africans. Low-coverage sequencing effectively identifies novel variation (particularly in underrepresented populations), and presents opportunities to enhance variant discovery at a similar cost to traditional approaches.

### Competing Interest Statement

A.R.M. serves as a consultant for 23andMe and is a member of the Precise.ly Scientific Advisory Board. B.M.N. is a member of the Deep Genomics Scientific Advisory Board. He also serves as a consultant for the Camp4 Therapeutics Corporation, Takeda Pharmaceutical and Biogen. M.J.D. is a founder of Maze Therapeutics. J.K.P. is an employee of Gencove, Inc. D.J.S. has received research grants and/or consultancy honoraria from Lundbeck and Sun. The remaining authors declare no competing interests.
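
The Results describe scoring each data generation strategy by the concordance of its imputed variants with calls from deep whole genome sequencing. The abstract does not say which concordance metric was used, so the following is only a minimal sketch of one common choice, non-reference concordance, written under stated assumptions: genotypes are encoded as alternate-allele counts (0, 1, 2), missing calls are `None`, and the function and variable names are hypothetical rather than taken from the study.

```python
from typing import Optional, Sequence


def non_ref_concordance(
    truth: Sequence[Optional[int]],
    test: Sequence[Optional[int]],
) -> float:
    """Fraction of matching genotypes among sites where at least one
    call set carries a non-reference allele.

    Assumption: this metric is an illustrative choice, not necessarily
    the one used in the preprint.
    """
    if len(truth) != len(test):
        raise ValueError("genotype vectors must have the same length")

    compared = 0
    matched = 0
    for truth_gt, test_gt in zip(truth, test):
        # Skip sites with a missing call in either data set.
        if truth_gt is None or test_gt is None:
            continue
        # Skip sites where both calls are homozygous reference, so that
        # trivially matching sites do not inflate the score.
        if truth_gt == 0 and test_gt == 0:
            continue
        compared += 1
        if truth_gt == test_gt:
            matched += 1

    # No informative sites: return NaN rather than dividing by zero.
    if compared == 0:
        return float("nan")
    return matched / compared


# Toy data (made up for illustration): deep-sequencing calls versus
# imputed calls for one individual across seven sites.
truth_calls = [0, 1, 2, 0, 1, None, 0]
imputed_calls = [0, 1, 1, 1, 1, 2, 0]
score = non_ref_concordance(truth_calls, imputed_calls)
print(f"non-reference concordance: {score:.2f}")
```

In practice such a score would typically be computed separately within allele-frequency bins, which is how a statement like "captures variants of all frequencies more accurately" could be checked; the binning is omitted here for brevity.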
- Downloaded 1,216 times
- Download rankings, all-time:
  - Site-wide: 12,707 out of 117,931
  - In genomics: 1,430 out of 6,427
- Year to date:
  - Site-wide: 4,738 out of 117,931
- Since beginning of last month:
  - Site-wide: 5,861 out of 117,931