
Empowering Multi-Cohort Gene Expression Analysis to Increase Reproducibility

By Winston A. Haynes, Francesco Vallania, Charles Liu, Erika Bongen, Aurelie Tomczak, Marta Andres-Terrè, Shane Lofgren, Andrew Tam, Cole A. Deisseroth, Matthew D Li, Timothy E Sweeney, Purvesh Khatri

Posted 25 Aug 2016
bioRxiv DOI: 10.1101/071514

A major contributor to the scientific reproducibility crisis has been that the results from homogeneous, single-center studies do not generalize to heterogeneous, real-world populations. Multi-cohort gene expression analysis has helped to increase reproducibility by aggregating data from diverse populations into a single analysis. To make the multi-cohort analysis process more feasible, we have assembled an analysis pipeline that implements rigorously studied meta-analysis best practices. We have compiled and made publicly available the results of our own multi-cohort gene expression analysis of 103 diseases, spanning 615 studies and 36,915 samples, through a novel and interactive web application. As a result, we have made both the process of and the results from multi-cohort gene expression analysis more approachable for non-technical users.

Download data

  • Downloaded 1,564 times
  • Download rankings, all-time:
    • Site-wide: 15,110
    • In bioinformatics: 1,666
  • Year to date:
    • Site-wide: 49,017
  • Since beginning of last month:
    • Site-wide: 33,974
