Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 62,292 bioRxiv papers from 276,522 authors.

Pan-cancer classifications of tumor histological images using deep learning

By Javad Noorbakhsh, Saman Farahmand, Mohammad Soltanieh-ha, Sandeep Namburi, Kourosh Zarringhalam, Jeff Chuang

Posted 26 Jul 2019
bioRxiv DOI: 10.1101/715656

Histopathological images are essential for the diagnosis of cancer type and selection of optimal treatment. However, the current clinical process of manual inspection of images is time-consuming and prone to intra- and inter-observer variability. Here we show that key aspects of cancer image analysis can be performed by deep convolutional neural networks (CNNs) across a wide spectrum of cancer types. In particular, we implement CNN architectures based on transfer learning with Google Inception v3 to analyze 27,815 H&E slides from 23 cohorts in The Cancer Genome Atlas in studies of tumor/normal status, cancer subtype, and mutation status. For 19 solid cancer types we are able to classify tumor/normal status of whole slide images with extremely high AUCs (0.995±0.008). We are also able to classify cancer subtypes within 10 tissue types with AUC values well above random expectation (micro-average 0.87±0.1). We then perform a cross-classification analysis of tumor/normal status across tumor types. We find that classifiers trained on one cancer type are often effective in distinguishing tumor from normal in other cancer types, with the relationships among classifiers matching known cancer tissue relationships. For the more challenging problem of mutational status, we are able to classify TP53 mutations in three cancer types with AUCs from 0.65 to 0.80 using a fully-trained CNN, with similar cross-classification accuracy across tissues. These studies demonstrate the power of CNNs not only for classifying histopathological images in diverse cancer types, but also for revealing shared biology between tumors. We have made software available at: https://github.com/javadnoorb/HistCNN
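
For readers unfamiliar with the micro-average AUC quoted for the subtype classifiers, the following is a minimal sketch of one common definition: every (slide, class) pair is pooled into a single binary problem and a single AUC is computed over the pool. It is not taken from the authors' HistCNN code, and the abstract does not state exactly how the average was computed, so treat the definition as an assumption. The function names `roc_auc` and `micro_average_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def micro_average_auc(
    one_hot_labels: Sequence[Sequence[int]],
    class_scores: Sequence[Sequence[float]],
) -> float:
    """Micro-average (assumed definition): pool every (sample, class) pair
    into one binary problem and compute a single AUC over the pool."""
    flat_labels = [y for row in one_hot_labels for y in row]
    flat_scores = [s for row in class_scores for s in row]
    return roc_auc(flat_labels, flat_scores)


# Toy data, made up for illustration: 4 slides, 3 hypothetical subtypes.
toy_labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
toy_scores = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.2, 0.6], [0.5, 0.4, 0.1]]
print(micro_average_auc(toy_labels, toy_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the values reported above should be read. The pairwise loop is quadratic in the number of examples; it is meant for clarity, not for datasets the size of the one described in the abstract.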

Download data

  • Downloaded 1,944 times
  • Download rankings, all-time:
    • Site-wide: 2,313 out of 62,292
    • In bioinformatics: 480 out of 6,225
  • Year to date:
    • Site-wide: 399 out of 62,292
  • Since beginning of last month:
    • Site-wide: 15 out of 62,292
