Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 62,709 bioRxiv papers from 278,266 authors.

Quantitative analysis of population-scale family trees using millions of relatives

By Joanna Kaplanis, Assaf Gordon, Mary Wahl, Michael Gershovits, Barak Markus, Mona Sheikh, Melissa Gymrek, Gaurav Bhatia, Daniel G. MacArthur, Alkes L. Price, Yaniv Erlich

Posted 07 Feb 2017
bioRxiv DOI: 10.1101/106427 (published DOI: 10.1126/science.aam9309)

Family trees have vast applications in multiple fields, from genetics to anthropology and economics. However, the collection of extended family trees is tedious and usually relies on resources with limited geographical scope and complex data usage restrictions. Here, we collected 86 million profiles from publicly available online data from genealogy enthusiasts. After extensive cleaning and validation, we obtained population-scale family trees, including a single pedigree of 13 million individuals. We leveraged the data to partition the genetic architecture of longevity by inspecting millions of relative pairs and to provide insights into population genetics theories on the dispersion of families. We also report a simple digital procedure to overlay other datasets with our resource in order to empower studies with population-scale genealogical data.

Download data

  • Downloaded 18,422 times
  • Download rankings, all-time:
    • Site-wide: 35 out of 62,709
    • In genomics: 9 out of 4,312
  • Year to date:
    • Site-wide: 2,057 out of 62,709
  • Since beginning of last month:
    • Site-wide: 5,525 out of 62,709
