Raptor: A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences

By Enrico Seiler, Svenja Mehringer, Mitra Darvish, Etienne Turc, Knut Reinert

Posted 08 Oct 2020
bioRxiv DOI: 10.1101/2020.10.08.330985

We present Raptor, a tool for approximately searching many queries in large collections of nucleotide sequences. In comparison with similar tools like Mantis and COBS, Raptor is 12 to 144 times faster and uses up to 30 times less memory. Raptor uses winnowing minimizers to define a set of representative k-mers, an extension of the Interleaved Bloom Filters (IBF) as a set membership data structure, and probabilistic thresholding for minimizers. Our approach allows compression and a partitioning of the IBF to enable the effective use of secondary memory.

Competing Interest Statement

The authors have declared no competing interest.
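To make the ideas named in the abstract more concrete, the following is a minimal Python sketch of winnowing minimizers feeding a toy interleaved Bloom filter. It is not Raptor's implementation. The names `minimizers`, `ToyIBF`, and `count_hits` are hypothetical. The sketch assumes lexicographic ordering of k-mers on the forward strand only, a window size `w` counted in consecutive k-mers, and two salted BLAKE2b hash functions. The probabilistic threshold mentioned in the abstract is not modelled here; the sketch only reports how many query minimizers hit each bin.

```python
import hashlib


def minimizers(seq, k, w):
    """Return the set of smallest k-mers over every window of w consecutive k-mers."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return set()
    if len(kmers) < w:
        # Sequence shorter than one full window: treat it as a single window.
        return {min(kmers)}
    return {min(kmers[i:i + w]) for i in range(len(kmers) - w + 1)}


class ToyIBF:
    """Toy interleaved Bloom filter: one row per bit position, one bit per bin in each row."""

    def __init__(self, bins, bits_per_bin, num_hashes=2):
        self.bins = bins
        self.bits_per_bin = bits_per_bin
        self.num_hashes = num_hashes
        # Each row is a Python int used as a bitmask over all bins (the interleaving).
        self.rows = [0] * bits_per_bin

    def _positions(self, item):
        # Derive num_hashes row indices from differently salted BLAKE2b digests.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.bits_per_bin

    def insert(self, item, bin_index):
        for pos in self._positions(item):
            self.rows[pos] |= 1 << bin_index

    def query(self, item):
        """Return a bitmask; bit j set means the item may be present in bin j."""
        mask = (1 << self.bins) - 1
        for pos in self._positions(item):
            mask &= self.rows[pos]
        return mask


def count_hits(ibf, query, k, w):
    """Count, per bin, how many minimizers of the query are reported as present."""
    counts = [0] * ibf.bins
    mins = minimizers(query, k, w)
    for m in mins:
        mask = ibf.query(m)
        for j in range(ibf.bins):
            if (mask >> j) & 1:
                counts[j] += 1
    return counts, len(mins)
```

In this sketch a single lookup per minimizer answers the membership question for all bins at once, which is the point of interleaving. A caller would compare each entry of `counts` against a threshold to decide which bins are reported; how that threshold is chosen is the subject of the paper and is left out here.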

Download data

  • Downloaded 128 times
  • Download rankings, all-time:
    • Site-wide: 92,035 out of 103,764
    • In bioinformatics: 8,742 out of 9,474
  • Year to date:
    • Site-wide: 54,252 out of 103,764
  • Since beginning of last month:
    • Site-wide: 3,590 out of 103,764

