Genomic epidemiology is an established tool for investigating outbreaks of infectious diseases and for wider public health applications. It traces the transmission of pathogens using whole-genome sequencing of colony picks from culture plates that enrich the target organism(s). In this article, we introduce the mGEMS pipeline for performing genomic epidemiology directly on plate sweeps, which represent mixed samples of the target pathogen in a culture plate, skipping the colony pick step entirely. By requiring only a single culturing and library preparation step per analyzed sample, we address several key issues in the current approach relating to its cost, practical application and sensitivity. Our pipeline significantly improves upon the state of the art in analyzing mixed short-read sequencing data from bacteria, reaching accuracy in downstream analyses close to that obtained with colony pick sequencing data and sufficient for reliable SNP calling and subsequent phylogenetic analyses. The key novel components enabling these analyses are the mGEMS read binner, which assigns sequencing reads probabilistically, and the high-throughput exact pseudoaligner Themisto. Together with recent advances in probabilistic modeling of mixed bacterial samples and in genome assembly techniques, these tools form the mGEMS pipeline. We demonstrate the effectiveness of our approach using closely related samples in a nosocomial setting for the three major pathogens Enterococcus faecalis, Escherichia coli and Staphylococcus aureus. Our results lend firm support to more widespread consideration of genomic epidemiology with mixed infection samples.