Tools and best practices for allelic expression analysis

By Stephane E. Castel, Ami Levy Moonshine, Pejman Mohammadi, Eric Banks, Tuuli Lappalainen

Posted 06 Mar 2015
bioRxiv DOI: 10.1101/016097 (published DOI: 10.1186/s13059-015-0762-6)

Allelic expression (AE) analysis has become an important tool for integrating genome and transcriptome data to characterize biological phenomena such as cis-regulatory variation and nonsense-mediated decay. In this paper, we systematically analyze the properties of AE read count data and technical sources of error, such as low-quality or double-counted RNA-seq reads, genotyping errors, allelic mapping bias, technical covariates due to sample preparation and sequencing, and variation in total read depth. We provide guidelines for correcting and filtering for such errors, and show that the resulting AE data have extremely low technical noise. Finally, we introduce novel software for high-throughput production of AE data from RNA-sequencing data, implemented in the GATK framework. These improved tools and best practices yield higher-quality AE data by reducing technical bias, providing a practical framework for wider adoption of AE analysis by the genomics community.
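To make the notion of AE read count data concrete, the following is a minimal sketch of how reference and alternate allele counts at a single heterozygous site might be summarized and tested for imbalance. It is an illustration under stated assumptions, not the authors' pipeline or the GATK tool: the function name `allelic_imbalance`, the `min_depth` threshold of 8 reads, and the default expected reference ratio of 0.5 are all hypothetical choices made for this example. In practice the expected ratio may need to be adjusted for allelic mapping bias.

```python
from math import comb


def allelic_imbalance(ref_count, alt_count, expected_ref_ratio=0.5, min_depth=8):
    """Return (ref_ratio, p_value) for one heterozygous site.

    Returns None when total depth is below min_depth (an assumed,
    illustrative threshold), since low-coverage sites carry little signal.
    """
    total = ref_count + alt_count
    if total < min_depth:
        return None

    ref_ratio = ref_count / total

    # Binomial probability of every possible reference count under the null.
    pmf = [
        comb(total, k)
        * expected_ref_ratio ** k
        * (1 - expected_ref_ratio) ** (total - k)
        for k in range(total + 1)
    ]

    # Two-sided exact test: sum the outcomes no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    observed = pmf[ref_count]
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return ref_ratio, min(1.0, p_value)


if __name__ == "__main__":
    print(allelic_imbalance(10, 10))  # balanced site
    print(allelic_imbalance(20, 0))   # fully monoallelic site
    print(allelic_imbalance(3, 2))    # below the assumed depth threshold
```

A plain binomial test is used here only because it is the simplest model of two-allele count data; it does not account for the other error sources listed in the abstract, such as genotyping errors or double-counted reads.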

Download data

  • Downloaded 8,237 times
  • Download rankings, all-time:
    • Site-wide: 756 out of 116,126
    • In genomics: 75 out of 6,424
  • Year to date:
    • Site-wide: 9,224 out of 116,126
  • Since beginning of last month:
    • Site-wide: 9,942 out of 116,126
