BAGSE: a Bayesian hierarchical model approach for gene set enrichment analysis

By Abhay Hukku, Corbin Quick, Francesca Luca, Roger Pique-Regi, Xiaoquan Wen

Posted 06 Jun 2019
bioRxiv DOI: 10.1101/662171 (published DOI: 10.1093/bioinformatics/btz831)

Gene set enrichment analysis has been shown to be effective in identifying relevant biological pathways underlying complex diseases. Existing approaches cannot quantify enrichment levels accurately, which prevents the enrichment information from being further utilized in both upstream and downstream analyses. A modernized and rigorous approach for gene set enrichment analysis that emphasizes both hypothesis testing and enrichment estimation is much needed. We propose a novel computational method, Bayesian Analysis of Gene Set Enrichment (BAGSE), for gene set enrichment analysis. BAGSE is built on a Bayesian hierarchical model and fully accounts for the uncertainty embedded in the association evidence of individual genes. We adopt an empirical Bayes inference framework to fit the proposed hierarchical model by implementing an efficient EM algorithm. Through simulation studies, we illustrate that BAGSE yields accurate enrichment quantification while achieving power similar to that of state-of-the-art methods. Further simulation studies show that BAGSE can effectively utilize the enrichment information to improve power in gene discovery. Finally, we demonstrate the application of BAGSE to real data from a differential expression experiment and a transcriptome-wide association study (TWAS). Our results indicate that the proposed statistical framework is effective in aiding the discovery of potentially causal pathways and gene networks. BAGSE is implemented in C++ and is freely available from <https://github.com/xqwen/bagse/>. Simulated and real data used in this paper are also available in the GitHub repository for reproducibility.
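
To make the empirical Bayes and EM idea in the abstract more concrete, the following Python sketch fits a deliberately simplified two-group mixture model. It is not the BAGSE implementation (which is written in C++) and may differ from the model in the paper. The function names, the single binary gene-set annotation, the fixed effect variance, and the simulated data are all assumptions made for illustration. Each gene has a z-score that is N(0, 1) if unassociated and N(0, 1 + effect_var) if associated, and the prior log-odds of association is alpha0 + alpha1 * d, where d indicates gene-set membership. The enrichment estimate is alpha1.

```python
import math
import random


def normal_pdf(x, var):
    """Density of a zero-mean normal with variance `var` at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def logit(p):
    return math.log(p / (1.0 - p))


def fit_enrichment_em(z, in_set, effect_var=4.0, n_iter=200):
    """EM for a toy two-group model (illustrative assumption, not BAGSE itself).

    z       : per-gene z-scores
    in_set  : 0/1 gene-set membership per gene (both groups must be non-empty)
    Returns (alpha0, alpha1, posterior association probabilities).
    """
    pi = [0.1, 0.1]  # prior P(associated) for genes outside / inside the set
    null_like = [normal_pdf(zi, 1.0) for zi in z]
    alt_like = [normal_pdf(zi, 1.0 + effect_var) for zi in z]
    post = [0.0] * len(z)
    for _ in range(n_iter):
        # E-step: posterior probability that each gene is associated
        for i, d in enumerate(in_set):
            num = pi[d] * alt_like[i]
            post[i] = num / (num + (1.0 - pi[d]) * null_like[i])
        # M-step: with one binary annotation the update is a group-wise mean
        for d in (0, 1):
            members = [post[i] for i, di in enumerate(in_set) if di == d]
            mean_post = sum(members) / len(members)
            pi[d] = min(max(mean_post, 1e-6), 1.0 - 1e-6)  # keep logit finite
    alpha0 = logit(pi[0])
    alpha1 = logit(pi[1]) - alpha0  # log odds ratio: the enrichment estimate
    return alpha0, alpha1, post


def simulate(n=4000, effect_var=4.0, seed=1):
    """Simulate z-scores with more associated genes inside the gene set."""
    rng = random.Random(seed)
    z, in_set = [], []
    for _ in range(n):
        d = 1 if rng.random() < 0.2 else 0
        p_assoc = 0.4 if d else 0.05
        assoc = rng.random() < p_assoc
        sd = math.sqrt(1.0 + effect_var) if assoc else 1.0
        z.append(rng.gauss(0.0, sd))
        in_set.append(d)
    return z, in_set


if __name__ == "__main__":
    z, in_set = simulate()
    alpha0, alpha1, post = fit_enrichment_em(z, in_set)
    print("alpha0:", alpha0, "alpha1 (enrichment):", alpha1)
```

In this toy setting, the posterior probabilities returned by the E-step already incorporate the estimated enrichment, which is one simple way to see how enrichment information could feed back into gene discovery.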

Download data

  • Downloaded 389 times
  • Download rankings, all-time:
    • Site-wide: 61,443
    • In bioinformatics: 6,119
  • Year to date:
    • Site-wide: 104,447
  • Since beginning of last month:
    • Site-wide: 85,079
