
GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies

By Runmin Wei, Jingye Wang, Erik Jia, Tianlu Chen, Yan Ni, Wei Jia

Posted 26 Aug 2017
bioRxiv DOI: 10.1101/177410 (published DOI: 10.1371/journal.pcbi.1005973)

Left-censored missing values are common in targeted metabolomics datasets and can be treated as missing not at random (MNAR). Improper handling of missing values adversely affects subsequent statistical analyses. However, few imputation methods have been developed for, or applied to, MNAR data in metabolomics. Thus, a practical left-censored missing value imputation method is urgently needed. We have developed an iterative Gibbs sampler-based left-censored missing value imputation approach (GSimp). We compared GSimp with three other imputation methods on two real-world targeted metabolomics datasets and one simulated dataset using our imputation evaluation pipeline. The results show that GSimp outperforms the other imputation methods in terms of imputation accuracy, observation distribution, univariate and multivariate analyses, and statistical sensitivity. The R code for GSimp, the evaluation pipeline, a vignette, and the real-world and simulated targeted metabolomics datasets are available at: https://github.com/WandeRum/GSimp.
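To make the idea of left-censored imputation concrete, the following is a minimal univariate sketch in Python. It is not the GSimp algorithm, whose R implementation lives in the repository linked above. It only illustrates the general principle of repeatedly drawing censored values from a distribution truncated at the detection limit and then re-estimating that distribution. The function name `impute_left_censored`, its parameters, the normality assumption, and the fixed iteration count are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def impute_left_censored(values, lod, n_iter=20, seed=0):
    """Hypothetical helper: fill None entries with draws from a normal
    distribution truncated to (-inf, lod], re-estimating the parameters
    after each sweep. Assumes at least two distinct observed values."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    # Start from observed-only estimates (biased upward under left-censoring).
    mu, sigma = mean(observed), stdev(observed)
    filled = list(values)
    for _ in range(n_iter):
        dist = NormalDist(mu, sigma)
        # Probability mass below the detection limit, kept strictly inside (0, 1).
        upper = min(max(dist.cdf(lod), 2e-12), 1 - 1e-12)
        for i, v in enumerate(values):
            if v is None:
                # Inverse-CDF draw restricted to the region below lod.
                u = rng.uniform(1e-12, upper)
                filled[i] = min(dist.inv_cdf(u), lod)
        # Re-estimate the parameters from the completed data, then repeat.
        mu, sigma = mean(filled), stdev(filled)
    return filled


# Example: two measurements fell below an assumed detection limit of 4.5.
data = [5.1, 4.8, None, 6.0, None, 5.5]
completed = impute_left_censored(data, lod=4.5)
```

Observed values pass through unchanged, and every imputed value is at most `lod`. A multivariate method can additionally use the other measured variables to inform each draw, which this sketch deliberately leaves out.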

Download data

  • Downloaded 568 times
  • Download rankings, all-time:
    • Site-wide: 28,450 out of 94,912
    • In bioinformatics: 3,792 out of 8,837
  • Year to date:
    • Site-wide: 75,170 out of 94,912
  • Since beginning of last month:
    • Site-wide: 69,148 out of 94,912


