A massively parallel strategy for STR marker development, capture, and genotyping

By Logan Kistler, Stephen M. Johnson, Mitchell T. Irwin, Edward E. Louis, Aakrosh Ratan, George H Perry

Posted 13 Jul 2016
bioRxiv DOI: 10.1101/063727 (published DOI: 10.1093/nar/gkx574)

Short tandem repeat (STR, or microsatellite) variants are highly polymorphic markers that facilitate powerful, high-precision population genetic analyses. STRs are especially valuable in conservation and ecological genetic research, yielding detailed information on population structure and short-term demographic fluctuations. However, STR marker development and analysis by conventional PCR-based methods impose a workflow bottleneck and are suboptimal for non-invasive sampling strategies such as fecal DNA recovery. Massively parallel sequencing has not previously been leveraged for scalable, efficient STR recovery. Here we present a pipeline for developing STR markers directly from high-throughput shotgun sequencing data without requiring a reference genome assembly, and a methodological approach for highly parallel recovery of enriched STR loci. We first employed our approach to design and capture a panel of 5,000 STR loci from a test group of diademed sifakas (Propithecus diadema, n=3), endangered Malagasy rainforest lemurs, and we report extremely efficient recovery of targeted loci: 97.3-99.6% of STRs were characterized with ≥10x non-redundant coverage. Second, we tested our STR capture strategy on a P. diadema fecal DNA preparation, and we report robust initial results and methodological suggestions for future implementations. In addition to STR targets, this approach also generates large, genome-wide single nucleotide polymorphism (SNP) panels from regions flanking the STR loci. Our method provides a cost-effective and highly scalable solution for rapid recovery of large STR and SNP datasets in any species without the need for a reference genome, and it can be used even with suboptimal DNA, which is more easily acquired in conservation and ecological genetic studies.
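
To make the idea of finding STR candidates directly in shotgun reads concrete, here is a minimal Python sketch. It is not the authors' pipeline, which the abstract does not describe in detail. It simply scans a single read for a 2-6 bp motif repeated in tandem and keeps the hit only if enough flanking sequence remains on both sides. The thresholds `MIN_REPEATS` and `MIN_FLANK`, the function name `find_strs`, and the example read are all assumptions made for illustration.

```python
import re

# Hypothetical thresholds, chosen for illustration only.
MIN_REPEATS = 5   # minimum number of tandem copies of the motif
MIN_FLANK = 20    # minimum number of bases required on each side

# A 2-6 base motif followed by at least MIN_REPEATS - 1 further copies.
# The lazy quantifier makes the shortest repeating unit win.
STR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))


def find_strs(read):
    """Return (start, end, motif, copies) for each candidate STR in a read."""
    hits = []
    for match in STR_PATTERN.finditer(read):
        motif = match.group(1)
        # Skip homopolymer runs such as "AA" or "TTT".
        if len(set(motif)) == 1:
            continue
        start, end = match.span()
        # Keep only repeats with enough flanking sequence on both sides.
        if start >= MIN_FLANK and len(read) - end >= MIN_FLANK:
            hits.append((start, end, motif, (end - start) // len(motif)))
    return hits


# Made-up read: 24 bp flank, six copies of AGAT, 24 bp flank.
left_flank = "GCTAGCATCGATCGGATCCATGAC"
right_flank = "CCGTTAGCAGGTCATGCTAACGTC"
example_read = left_flank + "AGAT" * 6 + right_flank

print(find_strs(example_read))  # [(24, 48, 'AGAT', 6)]
```

The flank requirement reflects the general point that a repeat is only useful as a marker when unique sequence is available on either side. The flanks are also where the abstract says SNPs are recovered.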

Download data

  • Downloaded 1,152 times
  • Download rankings, all-time:
    • Site-wide: 23,002
    • In genomics: 2,130
  • Year to date:
    • Site-wide: 144,641
  • Since beginning of last month:
    • Site-wide: 143,088
