Comprehensive, high-resolution binding energy landscapes reveal context dependencies of transcription factor binding
Transcription factors (TFs) are primary regulators of gene expression in cells, where they bind specific genomic target sites to control transcription. Quantitative measurements of TF-DNA binding energies can improve the accuracy of predictions of TF occupancy and downstream gene expression in vivo and further shed light on how transcriptional networks are rewired throughout evolution. Here, we present a novel sequencing-based TF binding assay and analysis pipeline capable of providing quantitative estimates of binding energies for more than one million DNA sequences in parallel at high energetic resolution. Using this platform, we measured the binding energies associated with all possible combinations of 10 nucleotides flanking the known consensus DNA target for two model yeast TFs, Pho4 and Cbf1. A large fraction of these flanking mutations change overall binding energies by an amount equal to or greater than consensus site mutations, suggesting that current definitions of TF binding sites may be too restrictive. By systematically comparing estimates of binding energies output by deep neural networks (NN) and biophysical models trained on these data, we establish that dinucleotide specificities are sufficient to explain essentially all variance in observed binding behavior, with Cbf1 binding exhibiting significantly more epistasis than Pho4. NN-derived binding energies agree with orthogonal biochemical measurements and reveal that dynamically occupied sites in vivo are both energetically and mutationally distant from the highest-affinity sites.
- Downloaded 1,086 times
- Download rankings, all-time:
- Site-wide: 10,489 out of 92,758
- In biophysics: 326 out of 4,028
- Year to date:
- Site-wide: 74,303 out of 92,758
- Since beginning of last month:
- Site-wide: 81,052 out of 92,758
Downloads over time
Distribution of downloads per paper, site-wide
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!