Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index

By Ali Ghaffaari, Tobias Marschall

Posted 25 Mar 2019
bioRxiv DOI: 10.1101/587717 (published DOI: 10.1093/bioinformatics/btz341)

Motivation: Sequence graphs are versatile data structures that can, for instance, represent the genetic variation found in a population and facilitate genome assembly. Read mapping to sequence graphs is an important step for many applications and is usually done by first finding exact seed matches, which are then extended by alignment. Existing methods for finding seed hits prune the graph in complex regions, leading to a loss of information, especially in highly polymorphic regions of the genome. While such complex graph structures can indeed lead to a combinatorial explosion of possible alleles, the query set of reads from a diploid individual realizes only two alleles per locus, a property that is not exploited by extant methods.

Results: We present the Pan-genome Seed Index (PSI), a fully-sensitive hybrid method for seed finding, which takes full advantage of this property by combining an index over selected paths in the graph with an index over the query reads. This enables PSI to find all seeds while eliminating the need to prune the graph. We demonstrate its performance with different parameter settings on both simulated data and on a whole human genome graph constructed from variants in the 1000 Genomes Project data set. On this graph, PSI outperforms GCSA2 in terms of index size, query time, and sensitivity.

Availability: The C++ implementation is publicly available at: https://github.com/cartoonist/psi
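To make the notion of exact seed matches between reads and an unpruned sequence graph concrete, here is a minimal Python sketch. It is not PSI's algorithm: PSI combines an index over selected graph paths with an index over the reads, whereas this toy version only indexes the reads and enumerates graph walks by brute force. The function names, the dictionary-based graph representation, and the example data are all hypothetical, and the sketch assumes an acyclic graph with non-empty node labels.

```python
from collections import defaultdict


def index_read_kmers(reads, k):
    """Map each k-mer to the (read_id, offset) positions where it occurs."""
    index = defaultdict(list)
    for read_id, read in enumerate(reads):
        for offset in range(len(read) - k + 1):
            index[read[offset:offset + k]].append((read_id, offset))
    return index


def graph_kmers(labels, edges, k):
    """Yield (kmer, node, offset) for every k-length walk in the graph."""
    def extend(node, pos, prefix):
        # Take as many characters from this node as are still needed.
        prefix += labels[node][pos:pos + k - len(prefix)]
        if len(prefix) == k:
            yield prefix
            return
        # Node exhausted: continue along every outgoing edge (no pruning).
        for nxt in edges.get(node, ()):
            yield from extend(nxt, 0, prefix)

    for node, label in labels.items():
        for pos in range(len(label)):
            for kmer in extend(node, pos, ""):
                yield kmer, node, pos


def find_seeds(labels, edges, reads, k):
    """Return all exact k-mer hits as (read_id, read_offset, node, node_offset)."""
    index = index_read_kmers(reads, k)
    seeds = set()
    for kmer, node, pos in graph_kmers(labels, edges, k):
        for read_id, offset in index.get(kmer, ()):
            seeds.add((read_id, offset, node, pos))
    return seeds


# Hypothetical toy graph with one SNP (T or A) between two shared segments.
labels = {1: "ACG", 2: "T", 3: "A", 4: "GGC"}
edges = {1: [2, 3], 2: [4], 3: [4]}
# Two reads, one per allele, as a diploid sample might produce.
reads = ["CGTGG", "CGAGG"]
seeds = find_seeds(labels, edges, reads, k=4)
```

Brute-force enumeration grows exponentially with the number of nearby variants, which is the combinatorial explosion the abstract refers to; the sketch is useful only for small examples.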

Download data

  • Downloaded 320 times
  • Download rankings, all-time:
    • Site-wide: 59,782 out of 103,809
    • In bioinformatics: 6,513 out of 9,474
  • Year to date:
    • Site-wide: 92,743 out of 103,809
  • Since beginning of last month:
    • Site-wide: 70,377 out of 103,809
