Biological datasets are large and complex, and machine learning models are therefore essential to capture relationships in the data. Unfortunately, the inferred models are often difficult to understand, and interpretation is limited to a list of features ranked by their importance in the model. We propose a computational approach, called Foresight, which enables interpretation of the patterns uncovered by Random Forest models trained on biological datasets. Foresight exploits the correlation structure in the data to uncover relevant groups of features and the interactions between them. This facilitates interpretation of the computational model and can provide more detailed insight into the underlying biological relationships than a simple feature ranking. We demonstrate Foresight on both an artificial dataset and a large gene expression dataset of breast cancer patients. Using the latter dataset, we show that our approach retrieves biologically relevant features and provides a rich description of the interactions and correlation structure between these features.
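The abstract does not spell out how Foresight works, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea of interpreting a model through groups of correlated features rather than a flat ranking. It assumes that per-feature importance scores are already available (for example from a trained Random Forest), links features whose absolute Pearson correlation reaches a threshold, and sums importance within each group. The feature names, the importance values, the 0.8 threshold, and the helper functions `pearson`, `correlated_groups`, and `group_importance` are all hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant column has no defined correlation; treat it as uncorrelated.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def correlated_groups(columns, threshold=0.8):
    """Group features linked by |correlation| >= threshold (union-find)."""
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


def group_importance(groups, importance):
    """Sum per-feature importance scores within each group."""
    return {tuple(g): sum(importance[f] for f in g) for g in groups}


# Hypothetical toy data: gene_b closely tracks gene_a, gene_c does not.
columns = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.1, 3.9, 6.2, 8.0, 9.9],
    "gene_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
# Hypothetical importance scores, standing in for a trained model's output.
importance = {"gene_a": 0.30, "gene_b": 0.25, "gene_c": 0.45}

groups = correlated_groups(columns)
scores = group_importance(groups, importance)
```

In this toy case a flat ranking would place `gene_c` first, while grouping shows that the correlated pair `gene_a` and `gene_b` together carries more importance. That is the kind of effect that can make a plain feature ranking misleading when features are correlated.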