Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 67,655 bioRxiv papers from 298,484 authors.

Inferring the ancestry of everyone

By Jerome Kelleher, Yan Wong, Patrick K. Albers, Anthony W. Wohns, Gilean McVean

Posted 01 Nov 2018
bioRxiv DOI: 10.1101/458067 (published DOI: 10.1038/s41588-019-0483-y)

A central problem in evolutionary biology is to infer the full genealogical history of a set of DNA sequences. This history contains rich information about the forces that have influenced a sexually reproducing species. However, existing methods are limited: the most accurate is unable to cope with more than a few dozen samples. With modern genetic data sets rapidly approaching millions of genomes, there is an urgent need for efficient inference methods to exploit such rich resources. We introduce an algorithm to infer whole-genome history which has comparable accuracy to the state-of-the-art but can process around four orders of magnitude more sequences. Additionally, our method results in an "evolutionary encoding" of the original sequence data, enabling efficient access to genealogies and calculation of genetic statistics over the data. We apply this technique to human data from the 1000 Genomes Project, Simons Genome Diversity Project and UK Biobank, showing that the genealogies we estimate are both rich in biological signal and efficient to process.

Download data

  • Downloaded 2,610 times
  • Download rankings, all-time:
    • Site-wide: 1,520 out of 67,655
    • In evolutionary biology: 64 out of 4,487
  • Year to date:
    • Site-wide: 948 out of 67,655
  • Since beginning of last month:
    • Site-wide: 4,234 out of 67,655
