Burden analysis of missense variants in 1,330 disease-associated genes on 3D provides insights into the mutation effects
Jakob B Jespersen,
Henrike O. Heyne,
Shehab S Ahmed,
Zaara T Rifat,
M. Sohel Rahman,
Jeffrey R Cottrell,
Florence F Wagner,
Arthur J Campbell,
Posted 04 Jul 2019
bioRxiv DOI: 10.1101/693259
Interpretation of the colossal number of genetic variants identified from sequencing applications is one of the major bottlenecks in clinical genetics, and inferring the effect of amino acid-substituting missense variants on protein structure and function is especially challenging. Here we evaluated the burden of amino acids affected by pathogenic variants (n=32,923) compared to variants from the general population (n=164,915) in 1,330 disease-associated genes across 40 protein features, using over 14,000 experimentally solved 3D structures. By analyzing the whole gene/variant set jointly, we identified 18 features associated with 3D mutational hotspots that are generally important for protein fitness and stability. Individual analyses performed for 24 protein functional classes further revealed 240 characteristics of mutational hotspots in total, including new associations that recapitulate the sheer diversity across proteins' essential structural regions. We demonstrated that the function-specific features of variants correspond to the readouts of mutagenesis experiments and positively correlate with clinically interpreted pathogenic and benign missense variants. Finally, we made our results available through a web server to foster accessibility and downstream research. Our findings represent a crucial step towards translational genetics, from highlighting the impact of mutations on protein structure to rationalizing the pathogenicity of variants in terms of the perturbed molecular mechanisms.
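As a rough illustration of what a per-feature burden comparison could look like, the sketch below computes an odds ratio with a Wald confidence interval from a 2x2 table (variants inside vs. outside a protein feature, pathogenic vs. population). The abstract does not state which statistical test the authors used, so the choice of statistic is an assumption, as are the function name `burden_odds_ratio` and the in-feature counts. Only the two variant totals (32,923 and 164,915) come from the abstract.

```python
import math


def burden_odds_ratio(path_in, path_total, pop_in, pop_total, z=1.96):
    """Odds ratio and Wald confidence interval for one protein feature.

    Hypothetical helper, not the authors' method. An odds ratio above 1
    means pathogenic variants fall in the feature more often than
    population variants do.
    """
    a = path_in                 # pathogenic variants inside the feature
    b = path_total - path_in    # pathogenic variants outside the feature
    c = pop_in                  # population variants inside the feature
    d = pop_total - pop_in      # population variants outside the feature
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells of the 2x2 table must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's method).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return odds_ratio, lower, upper


# Totals are from the abstract; the in-feature counts are made up
# purely for illustration.
or_, lo, hi = burden_odds_ratio(
    path_in=9_000, path_total=32_923,
    pop_in=30_000, pop_total=164_915,
)
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With 40 features tested across 24 functional classes, any real analysis along these lines would also need a multiple-testing correction, which this sketch omits.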
- Downloaded 1,208 times
- Download rankings, all-time:
  - Site-wide: 12,862 out of 118,184
  - In genetics: 686 out of 5,132
- Year to date:
  - Site-wide: 12,663 out of 118,184
- Since beginning of last month:
  - Site-wide: 29,488 out of 118,184