tappAS: a comprehensive computational framework for the analysis of the functional impact of differential splicing
Lorena de la Fuente,
Héctor del Risco,
Posted 03 Jul 2019
bioRxiv DOI: 10.1101/690743
Posted 03 Jul 2019
Traditionally, the functional analysis of gene expression data has used pathway and network enrichment algorithms. These methods are usually gene rather than transcript centric and hence fall short to unravel functional roles associated to posttranscriptional regulatory mechanisms such as Alternative Splicing (AS) and Alternative PolyAdenylation (APA), jointly referred here as Alternative Transcript Processing (AltTP). Moreover, short-read RNA-seq has serious limitations to resolve full-length transcripts, further complicating the study of isoform expression. Recent advances in long-read sequencing open exciting opportunities for studying isoform biology and function. However, there are no established bioinformatics methods for the functional analysis of isoform-resolved transcriptomics data to fully leverage these technological advances. Here we present a novel framework for Functional Iso-Transcriptomics analysis (FIT). This framework uses a rich isoform-level annotation database of functional domains, motifs and sites –both coding and non-coding- and introduces novel analysis methods to interrogate different aspects of the functional relevance of isoform complexity. The Functional Diversity Analysis (FDA) evaluates the variability at the inclusion/exclusion of functional domains across annotated transcripts of the same gene. Parameters can be set to evaluate if AltTP partially or fully disrupts functional elements. FDA is a measure of the potential of a multiple isoform transcriptome to have a functional impact. By combining these functional labels with expression data, the Differential Analysis Module evaluates the relative contribution of transcriptional (i.e. gene level) and post-transcriptional (i.e. transcript/protein levels) regulation on the biology of the system. Measures of isoform relevance such as Minor Isoform Filtering, Isoform Switching Events and Total Isoform Usage Change contribute to restricting analysis to biologically meaningful changes. Finally, novel methods for Differential Feature Inclusion, Co-Feature Inclusion, and the combination of UTR-lengthening with Alternative Polyadenylation analyses carefully dissects the contextual regulation of functional elements resulting from differential isoforms usage. These methods are implemented in the software tappAS, a user-friendly Java application that brings FIT to the hands of non-expert bioinformaticians supporting several model and non-model species. tappAS complements statistical analyses with powerful browsing tools and highly informative gene/transcript/CDS graphs. We applied tappAS to the analysis of two mouse Neural Precursor Cells (NPCs) and Oligodendrocyte Precursor Cells (OPCs) whose transcriptome was defined by PacBio and quantified by Illumina. Using FDA we confirmed the high potential of AltTP regulation in our system, in which 90% of multi-isoform genes presented variation in functional features at the transcript or protein level. The Differential Analysis module revealed a high interplay between transcriptional and AltTP regulation in neural development, mainly controlled by differential expression, but where AltTP acts the main driver of important neural development biological mechanisms such as vesicle trafficking, signal transduction and RNA processing. The DFI analysis revealed that, globally, AltTP increased the availability of functional features in differentiated neural cells. DFI also showed that AltTP is a mechanism for altering gene function by changing cellular localization and binding properties of proteins, via the differential inclusion of NLS, transmembrane domains or DNA binding motifs, for example. Some of these findings were experimentally validated by others and us. In summary, we propose a novel framework for the functional analysis of transcriptomes at isoform resolution. We anticipate the tappAS tool will be an important resource for the adoption of the Functional Iso-Transcriptomics analysis by functional genomics community.
- Downloaded 876 times
- Download rankings, all-time:
- Site-wide: 22,094
- In bioinformatics: 2,702
- Year to date:
- Site-wide: 41,040
- Since beginning of last month:
- Site-wide: 48,331
Downloads over time
Distribution of downloads per paper, site-wide
- 27 Nov 2020: The website and API now include results pulled from medRxiv as well as bioRxiv.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!