Varying-Censoring Aware Matrix Factorization for Single Cell RNA-Sequencing

By F. William Townes, Stephanie C. Hicks, Martin J. Aryee, Rafael A. Irizarry

Posted 21 Jul 2017
bioRxiv DOI: 10.1101/166736

Single cell RNA-Seq (scRNA-Seq) has become the most widely used high-throughput technology for gene expression profiling of individual cells. The ability to measure cell-to-cell variability at a high-dimensional genomic scale opens numerous new lines of investigation in basic and clinical research. For example, by identifying groups of cells with expression profiles unlike those observed in cells with known phenotypes, new cell types may be discovered. Dimension reduction followed by unsupervised clustering is the quantitative approach typically used to facilitate such discoveries. However, a challenge for this approach is that most scRNA-Seq datasets are sparse, with the percentage of measurements reported as zero ranging from 35% to 99% across cells, and these zeros are partially explained by experimental inefficiencies that lead to censored data. Furthermore, the observed across-cell differences in the percentage of zeros are partly due to technical artifacts rather than biological differences. Unfortunately, standard dimension reduction approaches treat these censored values as true zeros, which leads to the identification of distorted low-dimensional factors. When these factors are used for clustering, the distortion leads to incorrect identification of biological groups. Here, we propose an approach that accounts for cell-specific censoring with a varying-censoring aware matrix factorization (VAMF) model that permits the identification of factors in the presence of the above-described systematic bias. We demonstrate the advantages of our approach on published scRNA-Seq data and confirm these on simulated data.
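
The distortion described in the abstract can be illustrated with a small toy simulation. The sketch below is not the VAMF model and is not taken from the paper; it only assumes a single homogeneous cell population whose cells differ in a hypothetical detection rate, censors values to zero accordingly, and then applies ordinary PCA. The function name `pc1_vs_detection` and all parameter values are illustrative assumptions.

```python
import numpy as np


def pc1_vs_detection(n_cells=200, n_genes=500, seed=0):
    """Toy illustration: correlation between PC1 and a cell's detection rate.

    All settings are hypothetical; there is no biological structure in the
    simulated data, only cell-specific censoring.
    """
    rng = np.random.default_rng(seed)

    # One homogeneous population: every cell shares the same gene means.
    gene_means = rng.uniform(2.0, 8.0, size=n_genes)
    true_expr = gene_means + rng.normal(0.0, 0.5, size=(n_cells, n_genes))

    # Assumed technical artifact: each cell has its own detection rate.
    detection = np.linspace(0.2, 0.9, n_cells)
    observed_mask = rng.random((n_cells, n_genes)) < detection[:, None]

    # Censored values are recorded as zeros, as standard pipelines see them.
    observed = np.where(observed_mask, true_expr, 0.0)

    # Ordinary PCA via SVD on the gene-centered matrix.
    centered = observed - observed.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pc1 = u[:, 0] * s[0]

    # The sign of a principal component is arbitrary, so callers should
    # look at the magnitude of this correlation.
    return np.corrcoef(pc1, detection)[0, 1]


corr = pc1_vs_detection()
print(f"|correlation between PC1 and detection rate| = {abs(corr):.2f}")
```

In this toy setting the leading component tracks the detection rate even though the cells are biologically identical, which is the kind of technical factor that a censoring-aware model is meant to separate from biological variation.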

Download data

  • Downloaded 1,573 times
  • Download rankings, all-time:
    • Site-wide: 4,999 out of 83,433
    • In genomics: 858 out of 5,384
  • Year to date:
    • Site-wide: 53,278 out of 83,433
  • Since beginning of last month:
    • Site-wide: 53,993 out of 83,433
