In-solution Y-chromosome capture-enrichment on ancient DNA libraries
Diana I Cruz-Dávalos,
María A Nieves-Colón,
G. David Poznik,
Anne C. Stone,
Carlos D. Bustamante,
María C. Ávila-Arcos
Posted 22 Nov 2017
bioRxiv DOI: 10.1101/223214 (published DOI: 10.1186/s12864-018-4945-x)
Background: As most ancient biological samples have low levels of endogenous DNA, it is advantageous to enrich for specific genomic regions prior to sequencing. One approach, in-solution capture-enrichment, retrieves sequences of interest and reduces the fraction of microbial DNA. In this work, we implement a capture-enrichment approach targeting informative regions of the Y chromosome in six human archaeological remains excavated in the Caribbean and dated between 200 and 3,000 years BP. We compare the recovery rates of Y-chromosome capture alone (YCC) and of whole-genome capture followed by YCC (WGC+Y) against those of non-enriched (pre-capture) libraries.
Results: We recovered 17-4,152 times more targeted unique Y-chromosome sequences after capture: 0.01-6.2% (WGC+Y) and 0.01-23.5% (YCC) of the sequence reads were on-target, compared to 0.0002-0.004% pre-capture. In samples with endogenous DNA content greater than 0.1%, WGC+Y yielded lower enrichment than YCC alone, due to the loss of complexity in consecutive capture experiments, whereas in samples with lower endogenous content, WGC+Y yielded greater enrichment than YCC alone. Finally, the increased recovery of informative sites enabled us to assign Y-chromosome haplogroups to some of the archaeological remains and to gain insights into their paternal lineages and origins.
Conclusions: To our knowledge, we present the first in-solution capture-enrichment method targeting the human Y chromosome in aDNA sequencing libraries. YCC and WGC+Y enrichments increase the amount of Y-DNA sequences recovered, compared to libraries not enriched for the Y chromosome. Our probe design effectively recovers regions of the Y chromosome bearing phylogenetically informative sites, allowing us to identify paternal lineages with less sequencing than is needed for pre-capture libraries. Finally, we recommend considering the endogenous content in the experimental design and avoiding consecutive rounds of capture for low-complexity libraries, as clonality increases considerably with each round.
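The on-target percentages and fold-enrichment figures quoted in the abstract are ratios of read counts. As a minimal sketch, assuming enrichment is taken as the ratio of on-target fractions after and before capture (the abstract does not spell out the authors' exact normalization), the arithmetic could look like the following. The function names and the read counts in the example are hypothetical and chosen only for illustration; they are not data from the study.

```python
def on_target_fraction(on_target_reads: int, total_reads: int) -> float:
    """Fraction of sequenced reads that map to the targeted Y-chromosome regions."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= on_target_reads <= total_reads:
        raise ValueError("on_target_reads must lie between 0 and total_reads")
    return on_target_reads / total_reads


def fold_enrichment(pre_on_target: int, pre_total: int,
                    post_on_target: int, post_total: int) -> float:
    """Ratio of the post-capture on-target fraction to the pre-capture one.

    Assumption: this depth-normalized ratio is one common definition of
    enrichment, not necessarily the one used by the authors.
    """
    pre = on_target_fraction(pre_on_target, pre_total)
    post = on_target_fraction(post_on_target, post_total)
    if pre == 0:
        raise ValueError("pre-capture on-target fraction is zero; ratio undefined")
    return post / pre


if __name__ == "__main__":
    # Made-up counts for illustration only (not taken from the paper):
    # 40 on-target reads per million before capture, 62,000 per million after.
    pre_frac = on_target_fraction(40, 1_000_000)
    post_frac = on_target_fraction(62_000, 1_000_000)
    fold = fold_enrichment(40, 1_000_000, 62_000, 1_000_000)
    print(f"pre-capture on-target:  {pre_frac:.4%}")
    print(f"post-capture on-target: {post_frac:.4%}")
    print(f"fold enrichment:        {fold:.0f}x")
```

Counting unique (deduplicated) reads instead of raw reads in the numerators would reflect the clonality caveat raised in the conclusions, since repeated capture rounds inflate raw on-target counts with duplicates.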