Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 67,655 bioRxiv papers from 298,484 authors.

Comparison of three variant callers for human whole genome sequencing

By Anna Supernat, Oskar Valdimar Vidarsson, Vidar M. Steen, Tomasz Stokowy

Posted 05 Nov 2018
bioRxiv DOI: 10.1101/461798 (published DOI: 10.1038/s41598-018-36177-7)

Testing of patients with genetics-related disorders is shifting from single-gene assays to gene panel sequencing, whole-exome sequencing (WES) and whole-genome sequencing (WGS). Since WGS is becoming a new foundation for molecular analyses, we compared three currently used tools for variant calling of human whole-genome sequencing data. We tested DeepVariant, a new TensorFlow machine learning-based variant caller, and compared it to GATK 4.0 and SpeedSeq, using 30x, 15x and 10x WGS data of the well-known NA12878 DNA reference sample. In our comparison, performance on SNV calling was similar in 30x data, with all three variant callers reaching F-Scores (i.e. the harmonic mean of recall and precision) of 0.98. In contrast, DeepVariant was more precise in indel calling than GATK and SpeedSeq, as demonstrated by F-Scores of 0.94, 0.90 and 0.84, respectively. We conclude that DeepVariant has great potential for analysis of WGS data in medical genetics.
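The F-Score mentioned in the abstract can be illustrated with a minimal Python sketch. The function name `f_score` and the variant counts below are hypothetical and are not taken from the paper; the sketch only shows how a harmonic mean of precision and recall could be computed from true positive, false positive and false negative calls.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F-Score: the harmonic mean of precision and recall.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if true_positives == 0:
        # No correct calls: precision and recall are zero or undefined,
        # so the F-Score is reported as 0.0 here.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Made-up counts, chosen only to show the calculation.
example = f_score(true_positives=90, false_positives=10, false_negatives=10)
print(round(example, 2))
```

Because the harmonic mean is pulled toward the smaller of the two values, a caller with high precision but low recall (or the reverse) scores lower than one that balances both.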

Download data

  • Downloaded 570 times
  • Download rankings, all-time:
    • Site-wide: 18,078 out of 67,655
    • In genomics: 2,190 out of 4,583
  • Year to date:
    • Site-wide: 20,571 out of 67,655
  • Since beginning of last month:
    • Site-wide: no ranking available (out of 67,655)
