
Strategies for quantitative RNA-seq analyses among closely related species

By Swati Parekh, Beate Vieth, Christoph Ziegenhain, Wolfgang Enard, Ines Hellmann

Posted 09 Apr 2018
bioRxiv DOI: 10.1101/297408

With the growing appreciation for the role of regulatory differences in evolution, researchers need to reliably quantify expression levels within and among species. However, for non-model organisms, genome assemblies and annotations are often unavailable or of inferior quality, biasing the inference of expression changes to an unknown extent. Here, we explore the possibility of mapping RNA-seq reads from diverged species to one high-quality reference genome. As a test case, we used a small primate phylogeny ranging from human to marmoset, spanning 12% nucleotide divergence. To distinguish the effects of sequence divergence and genome quality, we used in silico evolved genomes and existing genomes to simulate RNA-seq reads. These were then mapped to the genome of origin (self-mapping) as well as to one common reference (cross-mapping) to infer the quantification biases. We find that the bias due to cross-mapping is small for the closely related great apes (≤ 4% divergence), and that cross-mapping is preferable to self-mapping given current genome qualities. For closely related species, cross-mapping provides easy access, high power and a well-controlled false discovery rate for both the analysis of intra-species expression differences and the detection of relative differences between species. If divergence increases so that a substantial fraction of reads exceeds the limits of the mapper used, we find that gene-specific corrections and effect-size cutoffs can limit the bias before self-mapping becomes unavoidable. In summary, we systematically quantify, for the first time, biases in cross-species RNA-seq studies, providing guidance on best practices for these important evolutionary studies.
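The self-mapping versus cross-mapping comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `mapping_bias` and `flag_biased`, the pseudocount, and the cutoff of one log2 unit are all hypothetical choices made for illustration. The sketch assumes that per-gene read counts from the same simulated reads are available for both mapping strategies, and it expresses the quantification bias as a per-gene log2 ratio.

```python
import math


def mapping_bias(self_counts, cross_counts, pseudocount=1.0):
    """Per-gene log2 ratio of cross-mapped to self-mapped read counts.

    Both arguments are dicts of gene -> read count (hypothetical inputs).
    The pseudocount is an assumption that avoids division by zero and log(0).
    """
    bias = {}
    for gene, self_n in self_counts.items():
        cross_n = cross_counts.get(gene, 0)  # gene missing after cross-mapping
        bias[gene] = math.log2((cross_n + pseudocount) / (self_n + pseudocount))
    return bias


def flag_biased(bias, cutoff=1.0):
    """Genes whose absolute log2 bias reaches the (assumed) effect-size cutoff."""
    return sorted(gene for gene, b in bias.items() if abs(b) >= cutoff)


# Toy counts, invented for illustration only.
self_counts = {"geneA": 99, "geneB": 99, "geneC": 0}
cross_counts = {"geneA": 99, "geneB": 24}

bias = mapping_bias(self_counts, cross_counts)
biased_genes = flag_biased(bias)
```

In this toy input, geneB loses three quarters of its reads under cross-mapping and is the only gene flagged. A per-gene ratio of this kind is one plausible basis for the gene-specific corrections and effect-size cutoffs the abstract mentions, though the paper itself should be consulted for the actual procedure.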

Download data

  • Downloaded 904 times
  • Download rankings, all-time:
    • Site-wide: 10,964 out of 77,039
    • In genomics: 1,581 out of 5,061
  • Year to date:
    • Site-wide: 18,474 out of 77,039
  • Since beginning of last month:
    • Site-wide: 15,786 out of 77,039
