Classification of human white blood cells using machine learning for stain-free imaging flow cytometry
Louise J Michaelis,
Posted 24 Jun 2019
bioRxiv DOI: 10.1101/680975 (published DOI: 10.1002/cyto.a.23920)
Posted 24 Jun 2019
Imaging flow cytometry (IFC) produces up to 12 different information-rich images of single cells at a throughput of 5000 cells per second. Yet often, cell populations are still studied using manual gating, a technique that has several drawbacks. Firstly, it is hard to reproduce. Secondly, it is subjective and biased. And thirdly, it is time-consuming for large experiments. Therefore, it would be advantageous to replace manual gating with an automated process, which could be based on stain-free measurements originating from the brightfield and darkfield image channels. To realise this potential, advanced data analysis methods are required, in particular, machine learning. Previous works have successfully tested this approach on cell cycle phase classification with both a classical machine learning approach based on manually engineered features, and a deep learning approach. In this work, we compare both approaches extensively on the complex problem of white blood cell classification. Four human whole blood samples were assayed on an ImageStream-X MK II imaging flow cytometer. Two samples were stained for the identification of 8 white blood cell types, while two other sample sets were stained for the identification of resting and active eosinophils. For both datasets, four machine learning classifiers were evaluated on stain-free imagery using stratified 5-fold cross-validation. On the white blood cell dataset the best obtained results were 0.776 and 0.697 balanced accuracy for classical machine learning and deep learning, respectively. On the eosinophil dataset this was 0.866 and 0.867 balanced accuracy. From the experiments we conclude that classifying distinct cell types based on only stain-free images is possible with these techniques. However, both approaches did not always succeed in making reliable cell subtype classifications. Also, depending on the cell type, we find that even though the deep learning approach requires less expert input, it performs on par with a classical approach.
- Downloaded 642 times
- Download rankings, all-time:
- Site-wide: 20,840 out of 83,819
- In bioinformatics: 3,040 out of 8,035
- Year to date:
- Site-wide: 8,430 out of 83,819
- Since beginning of last month:
- Site-wide: 6,232 out of 83,819
Downloads over time
Distribution of downloads per paper, site-wide
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!