Enhancing sensitivity and controlling false discovery rate in somatic indel discovery

By Johannes Köster, Louis J. Dijkstra, Tobias Marschall, Alexander Schönhuth

Posted 21 Aug 2019
bioRxiv DOI: 10.1101/741256

As witnessed by various population-scale cancer genome sequencing projects, accurate discovery of somatic variants has become of central importance in modern cancer research. However, count statistics on the somatic insertions and deletions (indels) discovered so far indicate that a large number of indels must have been missed. The reason is that the combination of uncertainties relating to, for example, gap and alignment ambiguities, twilight zone indels, cancer heterogeneity, sample purity, sampling and strand bias is hard to quantify accurately. Here, a unifying statistical model is provided whose dependency structures make it possible to accurately quantify all inherent uncertainties in a short time. As a major consequence, the false discovery rate (FDR) in somatic indel discovery can now be controlled with utmost accuracy. As demonstrated on simulated and real data, this makes it possible to dramatically increase the number of true discoveries while safely suppressing the FDR. Specifically supported by workflow design, our approach can be integrated as a post-processing step in large-scale projects.
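
The abstract does not spell out how the FDR is controlled, so the following is only a minimal sketch of one common Bayesian approach, not the authors' implementation. It assumes each candidate indel call comes with a posterior error probability (PEP), that is, the probability that the call is not a true somatic indel. Under that assumption, the expected FDR of a set of calls is the mean of their PEPs, and the largest admissible set is found by accepting calls in order of increasing PEP. The function name `select_calls_at_fdr` and the example values are hypothetical.

```python
from typing import List, Sequence


def select_calls_at_fdr(peps: Sequence[float], alpha: float) -> List[int]:
    """Return indices of calls to keep so that the mean PEP of the kept set
    (the expected FDR under the assumption above) does not exceed alpha.

    Hypothetical helper for illustration; `peps` are assumed posterior
    error probabilities, one per candidate call.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Most confident calls (lowest PEP) first.
    order = sorted(range(len(peps)), key=lambda i: peps[i])

    kept: List[int] = []
    pep_sum = 0.0
    for i in order:
        pep_sum += peps[i]
        # Because PEPs are visited in ascending order, the running mean
        # never decreases, so we can stop at the first violation.
        if pep_sum / (len(kept) + 1) > alpha:
            break
        kept.append(i)
    return kept


if __name__ == "__main__":
    example_peps = [0.01, 0.5, 0.02, 0.9, 0.03]  # made-up values
    print(select_calls_at_fdr(example_peps, alpha=0.05))
```

With the made-up values above, only the three low-PEP calls are kept at a 5% threshold; loosening the threshold admits further calls as long as the mean PEP of the kept set stays within it.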
