
RISC: robust integration of single-cell RNA-seq datasets with different extents of cell cluster overlap

By Yang Liu, Tao Wang, Deyou Zheng

Posted 29 Nov 2018
bioRxiv DOI: 10.1101/483297

Single-cell RNA-seq (scRNA-seq) has remarkably advanced our understanding of cellular heterogeneity and dynamics in tissue development, diseases, and cancers. Integrated data analysis can often uncover molecular and cellular links among individual datasets and thus provide new biological insights, such as developmental relationships. Because of differences in experimental platforms and biological sample batches, integrating multiple scRNA-seq datasets is challenging. To address this, we developed a novel computational method for robust integration of scRNA-seq datasets (RISC) using principal component regression (PCR). Because the eigenvectors of the PCR model are naturally compatible with those used for dimension reduction, RISC can accurately integrate scRNA-seq datasets and avoid over-integration. Compared with existing software, RISC shows particular improvement in integrating datasets that contain cells of the same types (more precisely, the same clusters) but in distinct functional states. To demonstrate the value of RISC in finding small groups of cells shared between otherwise heterogeneous datasets, we applied it to scRNA-seq datasets of normal and malignant cells and successfully identified small clusters of cells in healthy kidney tissues that may be related to the origin of renal tumors.
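
For readers unfamiliar with principal component regression, the following is a minimal, generic sketch of the technique in Python with NumPy. It is not the RISC implementation, and it does not reproduce how RISC applies PCR to scRNA-seq data. The function names `pcr_fit` and `pcr_predict` and the parameter `n_components` are hypothetical, chosen only for illustration. The sketch shows the core idea: the response is regressed on the leading principal components of the predictors, so the regression and the dimension reduction share the same eigenvectors.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Generic PCR: regress y on the top principal components of X.

    Illustrative only; this is an assumption-laden sketch, not RISC's code.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean(axis=0)
    Xc = X - x_mean
    # SVD of the centered data; the rows of Vt are the principal axes,
    # i.e. the eigenvectors of the sample covariance matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T              # shape: (features, n_components)
    scores = Xc @ V                      # shape: (samples, n_components)
    # Ordinary least squares in the reduced principal component space.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                     # coefficients in the original feature space
    return x_mean, y_mean, beta

def pcr_predict(X, x_mean, y_mean, beta):
    """Predict responses for new rows of X using fitted PCR coefficients."""
    return (X - x_mean) @ beta + y_mean
```

When `n_components` equals the rank of the centered predictor matrix, this reduces to ordinary least squares. Keeping fewer components restricts the fitted coefficients to the subspace spanned by the leading eigenvectors.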

Download data

  • Downloaded 1,023 times
  • Download rankings, all-time:
    • Site-wide: 17,913
    • In bioinformatics: 2,214
  • Year to date:
    • Site-wide: 27,804
  • Since beginning of last month:
    • Site-wide: 58,492
