When to use Quantile Normalization?

By Stephanie C. Hicks, Rafael Irizarry

Posted 04 Dec 2014
bioRxiv DOI: 10.1101/012203

Normalization and preprocessing are essential steps in the analysis of high-throughput data, including next-generation sequencing and microarrays. Multi-sample global normalization methods, such as quantile normalization, have been successfully used to remove technical variation from noisy data. These methods rely on the assumption that observed global changes across samples are due to unwanted technical variability. Transforming the data to remove these differences has the potential to remove interesting biologically driven global variation and therefore may not be appropriate, depending on the type and source of variation. Currently, it is up to subject matter experts, for example biologists, to determine whether the stated assumptions are appropriate. Here, we propose a data-driven method to test the assumptions of global normalization methods. We demonstrate the utility of our method (quantro) by applying it to multiple gene expression and DNA methylation datasets, and show examples of when global normalization methods are not appropriate. We also perform a Monte Carlo simulation study to illustrate how our method generally outperforms the current approach. An R package implementing our method is available on Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/quantro.html).
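To make the assumption discussed in the abstract concrete, here is a minimal Python sketch of textbook quantile normalization. It is not the quantro method and not the R package's API; the function name `quantile_normalize`, the simulated group means, and the tie handling (ties broken by order of appearance) are all assumptions made for illustration. The sketch shows that after normalization every sample has exactly the same distribution, so a global shift between two groups is removed whether it is technical or biological.

```python
import numpy as np


def quantile_normalize(x):
    """Quantile-normalize the columns (samples) of a features-by-samples array.

    Hypothetical helper for illustration only. Ties are broken by order of
    appearance, which is a simplification.
    """
    x = np.asarray(x, dtype=float)
    # Row indices that sort each column from smallest to largest.
    order = np.argsort(x, axis=0, kind="stable")
    # Reference distribution: mean of the k-th smallest values across samples.
    reference = np.sort(x, axis=0).mean(axis=1)
    out = np.empty_like(x)
    # Place the k-th reference value where each column's k-th smallest value was.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


# Simulated data (assumed values): two groups of three samples each,
# with a global shift in the mean between the groups.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.0, scale=1.0, size=(1000, 3))
group_b = rng.normal(loc=7.0, scale=1.0, size=(1000, 3))
data = np.hstack([group_a, group_b])

normalized = quantile_normalize(data)

print("column means before:", data.mean(axis=0).round(2))
print("column means after: ", normalized.mean(axis=0).round(2))
```

In this simulation the difference between the two groups disappears after normalization. That is the desired outcome if the shift is technical, and a loss of signal if it is biological, which is the situation a test of the normalization assumptions is meant to flag.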

Download data

  • Downloaded 16,155 times
  • Download rankings, all-time:
    • Site-wide: 95 out of 83,642
    • In genomics: 15 out of 5,408
  • Year to date:
    • Site-wide: 351 out of 83,642
  • Since beginning of last month:
    • Site-wide: 458 out of 83,642
