Integrative haplotype estimation with sub-linear complexity

By Olivier Delaneau, Jean-François Zagury, Matthew Robinson, Jonathan Marchini, Emmanouil Dermitzakis

Posted 13 Dec 2018
bioRxiv DOI: 10.1101/493403 (published DOI: 10.1038/s41467-019-13225-y)

The number of human genomes being genotyped or sequenced is increasing exponentially, and efficient haplotype estimation methods able to handle this amount of data are now required. Here, we present a new method, SHAPEIT4, which substantially improves upon other methods for processing large genotype and high-coverage sequencing datasets. It notably exhibits sub-linear scaling with sample size, provides highly accurate haplotypes, and allows the integration of external phasing information such as large reference panels of haplotypes, collections of pre-phased variants, and long sequencing reads. We provide SHAPEIT4 in an open-source format at https://odelaneau.github.io/shapeit4/ and demonstrate its performance in terms of accuracy and running time on two gold-standard datasets: the UK Biobank data and Genome in a Bottle.

Download data

  • Downloaded 826 times
  • Download rankings, all-time:
    • Site-wide: 25,594
    • In bioinformatics: 3,037
  • Year to date:
    • Site-wide: 84,961
  • Since beginning of last month:
    • Site-wide: 84,961
