Hap10: reconstructing accurate and long polyploid haplotypes using linked reads

By Sina Majidian, Mohammad Hossein Kahaei, Dick de Ridder

Posted 09 Jan 2020
bioRxiv DOI: 10.1101/2020.01.08.899013 (published DOI: 10.1186/s12859-020-03584-5)

Background: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algorithms are needed to ensure long and accurate haplotypes. While a few linked-read haplotype assembly algorithms are available for diploid genomes, there are no algorithms yet for polyploids. Results: The first haplotyping algorithm designed for 10X linked reads generated from a polyploid genome is presented, built on a typical short-read haplotyping method, SDhaP. First, the haplotype-relevant information is extracted from the input aligned reads and called variants. Next, reads with the same barcode are combined to produce molecule-specific fragments. These fragments are then clustered into strongly connected components, which serve as input to a haplotype assembly core that estimates accurate and long haplotypes. Conclusions: Hap10 is a novel algorithm for haplotype assembly of polyploid genomes using linked reads. The performance of the algorithm is evaluated in a number of simulation scenarios, and its applicability is demonstrated on a real dataset of sweet potato.
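
The two preprocessing steps described in the abstract (combining reads by barcode, then grouping the resulting fragments) can be illustrated with a minimal Python sketch. This is not the Hap10 implementation. The function names, the input layout (a barcode paired with a mapping from variant index to allele), and the rule of dropping positions where reads of one barcode disagree are all assumptions made for illustration. The grouping step uses plain connected components over shared variant positions as a simple stand-in for the clustering into strongly connected components that the abstract mentions.

```python
from collections import defaultdict


def merge_by_barcode(reads):
    """Combine reads that share a barcode into one fragment per barcode.

    reads: iterable of (barcode, {variant_index: allele}) pairs (hypothetical layout).
    Assumption: positions where reads of the same barcode disagree are dropped.
    """
    fragments = defaultdict(dict)
    conflicts = defaultdict(set)
    for barcode, alleles in reads:
        frag = fragments[barcode]
        for pos, allele in alleles.items():
            if pos in frag and frag[pos] != allele:
                conflicts[barcode].add(pos)
            frag.setdefault(pos, allele)
    # Remove every position that was flagged as inconsistent within a barcode.
    for barcode, bad_positions in conflicts.items():
        for pos in bad_positions:
            fragments[barcode].pop(pos, None)
    return dict(fragments)


def connected_components(fragments):
    """Group fragments that are linked through shared variant positions.

    Simplification: plain connected components found with union-find,
    not the clustering procedure used by the paper.
    """
    parent = {barcode: barcode for barcode in fragments}

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # variant index -> first fragment that covers it
    for barcode, alleles in fragments.items():
        for pos in alleles:
            if pos in first_seen:
                parent[find(barcode)] = find(first_seen[pos])
            else:
                first_seen[pos] = barcode

    groups = defaultdict(set)
    for barcode in fragments:
        groups[find(barcode)].add(barcode)
    return sorted(sorted(group) for group in groups.values())
```

Each component returned by `connected_components` could then be passed independently to an assembly core, since fragments in different components share no variant and therefore carry no phasing information about each other.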

Download data

  • Downloaded 299 times
  • Download rankings, all-time:
    • Site-wide: 71,216 out of 118,129
    • In bioinformatics: 6,785 out of 9,572
  • Year to date:
    • Site-wide: 26,403 out of 118,129
  • Since beginning of last month:
    • Site-wide: 65,216 out of 118,129
