Leveraging multiple layers of data to predict Drosophila complex traits

By Fabio Morgante, Wen Huang, Peter Sørensen, Christian Maltecca, Trudy F. C. Mackay

Posted 30 Oct 2019
bioRxiv DOI: 10.1101/824896 (published DOI: 10.1534/g3.120.401847)

An important challenge in genetics is to predict complex traits accurately. Despite recent advances, prediction accuracy for most complex traits remains low. Here, we used the Drosophila Genetic Reference Panel (DGRP), a collection of 200 lines with whole-genome sequences and deep RNA sequencing data, to evaluate whether high-quality gene expression levels predict three complex traits more accurately than genotypes. We found that expression levels provided higher accuracy than genotypes for starvation resistance, similar accuracy for chill coma recovery, and lower accuracy for startle response. Models including both genotypes and expression levels did not outperform the best single-component model. However, accuracy increased considerably for all three traits when we included another layer of information, namely gene ontology (GO). We found that a limited number of GO terms, some of which had a clear biological interpretation, were strongly predictive of the traits. In summary, this study shows that integrating different sources of information can improve prediction accuracy, especially when large samples are not available.

Download data

  • Downloaded 334 times
  • Download rankings, all-time:
    • Site-wide: 74,119
    • In genetics: 3,456
  • Year to date:
    • Site-wide: 111,572
  • Since beginning of last month:
    • Site-wide: None
