

GBStools: A Unified Approach for Reduced Representation Sequencing and Genotyping

By Thomas F Cooke, Muh-Ching Yee, Marina Muzzio, Alexandra Sockell, Ryan Bell, Omar E. Cornejo, Joanna L Kelley, Graciela Bailliet, Claudio M. Bravi, Carlos D. Bustamante, Eimear E Kenny

Posted 03 Nov 2015
bioRxiv DOI: 10.1101/030494 (published DOI: 10.1371/journal.pgen.1005631)

Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth.
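For readers unfamiliar with the baseline the abstract compares against, here is a minimal sketch of a Hardy-Weinberg equilibrium (HWE) p-value filter. It is not the GBStools method and is not taken from the paper. The function names, the chi-square form of the test, and the `alpha` threshold are all assumptions chosen for illustration. The idea is that when a restriction site polymorphism hides one allele, heterozygotes tend to be miscalled as homozygotes, and the resulting heterozygote deficit lowers the HWE p-value.

```python
import math


def hwe_chi_square_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-d.f. chi-square test for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts at one biallelic site.
    (Hypothetical helper; not part of GBStools.)
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic site: nothing to test
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Chi-square survival function with 1 d.f.: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_hwe_filter(n_aa, n_ab, n_bb, alpha=1e-3):
    """Keep a site only if its HWE p-value is at least alpha (assumed cutoff)."""
    return hwe_chi_square_pvalue(n_aa, n_ab, n_bb) >= alpha


# A site in perfect equilibrium is kept; one with no heterozygotes is dropped.
print(passes_hwe_filter(25, 50, 25))
print(passes_hwe_filter(50, 0, 50))
```

A filter like this can only flag sites where the distortion is strong enough to reach significance, which is one plausible reason a method that models restriction site polymorphisms directly might do better, as the abstract reports for the simulations.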

Download data

  • Downloaded 509 times
  • Download rankings, all-time:
    • Site-wide: 21,900 out of 70,230
    • In genomics: 2,482 out of 4,700
  • Year to date:
    • Site-wide: 52,187 out of 70,230
  • Since beginning of last month:
    • Site-wide: 32,784 out of 70,230


