learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data

By Cathy C. Westhues, Henner Simianer, Timothy Mathes Beissinger

Posted 15 Dec 2021
bioRxiv DOI: 10.1101/2021.12.13.472185

We introduce the R package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial (MET) breeding data with machine learning-based models. learnMET allows genomic information to be combined with environmental data such as climate and/or soil characteristics. Notably, the package can incorporate weather data from field weather stations or retrieve global meteorological datasets from a NASA database. Daily weather data can be aggregated over time windows defined by naive approaches (for instance, windows with a fixed number of days) or by phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient boosted trees, random forests, stacked ensemble models, and multi-layer perceptrons. These prediction models can be evaluated in a user-friendly way via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with MET experimental data. The package is fully open source and accessible on GitHub.

Download data

  • Downloaded 341 times
  • Download rankings, all-time:
    • Site-wide: 120,334
    • In genetics: 5,241
  • Year to date:
    • Site-wide: 16,280
  • Since beginning of last month:
    • Site-wide: 18,202
