Deep learning-based cross-classifications reveal conserved spatial behaviors within tumor histological images
Ali Foroughi pour, Jeffrey H. Chuang
Posted 26 Jul 2019
bioRxiv DOI: 10.1101/715656
Histopathological images are a rich but incompletely explored data type for studying cancer. Manual inspection is time-consuming, which makes these images challenging to use for data mining. Here we show that convolutional neural networks (CNNs) can be systematically applied across cancer types, enabling comparisons that reveal shared spatial behaviors. We develop CNN architectures to analyze 27,815 hematoxylin and eosin slides from The Cancer Genome Atlas for tumor/normal, cancer subtype, and mutation classification. Our CNNs classify the tumor/normal status of whole slide images (WSIs) in 19 cancer types with consistently high AUCs (0.995±0.008), and classify subtypes with lower but significant accuracy (AUC 0.87±0.1). Remarkably, tumor/normal CNNs trained on one tissue are effective in others (AUC 0.88±0.11), with classifier relationships also recapitulating known adenocarcinoma, carcinoma, and developmental biology. Moreover, classifier comparisons reveal intra-slide spatial similarities, with an average tile-level correlation of 0.45±0.16 between classifier pairs. Breast, bladder, and uterine cancers have spatial patterns that are particularly easy to detect, suggesting that these cancers can serve as canonical types for image analysis. Patterns for TP53 mutations can also be detected, with WSI self- and cross-tissue AUCs ranging from 0.65 to 0.80. Finally, we comparatively evaluate CNNs on 170 breast and colon cancer images with pathologist-annotated nuclei, finding that both cellular and intercellular regions contribute to CNN accuracy. These results demonstrate the power of CNNs not only for histopathological classification, but also for cross-comparisons that reveal conserved spatial biology.
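The two kinds of numbers quoted in the abstract, slide-level AUCs and tile-level correlations between classifier pairs, can be illustrated with a minimal sketch. This is not the authors' pipeline: the mean-over-tiles aggregation, the function names, and the toy probabilities below are all assumptions made for illustration only.

```python
from statistics import mean

def slide_score(tile_probs):
    """Collapse per-tile tumor probabilities into one slide-level score.
    Assumption: a simple mean; the preprint may aggregate differently."""
    return mean(tile_probs)

def auc(scores, labels):
    """AUC as the probability that a random positive slide outscores
    a random negative slide (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical tile probabilities from one classifier on four slides
# (label 1 = tumor, 0 = normal).
tiles_by_slide = [
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.3],
    [0.6, 0.9, 0.8],
    [0.4, 0.2, 0.1],
]
labels = [1, 0, 1, 0]
slide_auc = auc([slide_score(t) for t in tiles_by_slide], labels)

# Hypothetical outputs of two different tissue classifiers on the
# same three tiles of a single slide.
classifier_a_tiles = [0.9, 0.8, 0.7]
classifier_b_tiles = [0.8, 0.9, 0.5]
tile_corr = pearson(classifier_a_tiles, classifier_b_tiles)

print(f"slide-level AUC: {slide_auc:.3f}")
print(f"tile-level correlation: {tile_corr:.3f}")
```

A cross-classification AUC in this sketch would be obtained the same way, with the tile probabilities coming from a classifier trained on a different tissue than the slides being scored.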