Linking high GC content to the repair of double strand breaks in prokaryotic genomes
By Jake L Weissman, William F. Fagan, Philip L.F. Johnson
Posted 08 Feb 2019
bioRxiv DOI: 10.1101/544924
(published DOI: 10.1371/journal.pgen.1008493)
Genomic GC content varies widely among microbes for reasons unknown. While mutation bias partially explains this variation, prokaryotes near-universally have a higher GC content than predicted solely by this bias. Debate surrounds the relative importance of the remaining explanations: selection versus biased gene conversion favoring GC alleles. Some environments (e.g., soils) are associated with a high genomic GC content of their inhabitants, which implies either that high GC content is a selective adaptation to particular habitats, or that certain habitats favor increased rates of gene conversion. Here, we report a novel association between the presence of the non-homologous end-joining DNA double-strand break repair pathway and GC content; this observation suggests that DNA damage may be a fundamental driver of GC content, leading in part to the many environmental patterns observed to date. We discuss potential mechanisms accounting for the observed association, and provide preliminary evidence that sites experiencing higher rates of double-strand breaks are under selection for increased GC content relative to the genomic background.
Author Summary
The overall nucleotide composition of an organism’s genome varies greatly between species. Previous work has identified certain environmental factors (e.g., oxygen availability) associated with the relative number of GC bases as opposed to AT bases in the genomes of species. Many of the environments associated with high GC content are also associated with relatively high rates of DNA damage. We show that organisms possessing the non-homologous end-joining DNA repair pathway, which is one mechanism to repair DNA double-strand breaks, have an elevated GC content relative to expectation. We also show that certain sites on the genome that are particularly susceptible to double-strand breaks have an elevated GC content. This leads us to suggest that an important underlying driver of variability in nucleotide composition across environments is the rate of DNA damage (specifically double-strand breaks) to which an organism living in each environment is exposed.