Big data approaches to decomposing heterogeneity across the autism spectrum
By Michael V. Lombardo, Meng-Chuan Lai, Simon Baron-Cohen
Posted 09 Mar 2018
bioRxiv DOI: 10.1101/278788
(published DOI: 10.1038/s41380-018-0321-0)
Autism is a diagnostic label based on behavior. While the diagnostic criteria attempt to maximize clinical consensus, they also mask a wide degree of heterogeneity between and within individuals at multiple levels of analysis. Understanding this multi-level heterogeneity is of high clinical and translational importance. Here we present organizing principles to frame the work examining multi-level heterogeneity in autism. Theoretical concepts such as 'spectrum' or 'autisms' reflect non-mutually exclusive explanations regarding continuous/dimensional or categorical/qualitative variation between and within individuals. However, common practices of small sample size studies and case-control models are suboptimal for tackling heterogeneity. Big data is an important ingredient for furthering our understanding of heterogeneity in autism. In addition to being 'feature-rich', big data should be both 'broad' (i.e. large sample size) and 'deep' (i.e. multiple levels of data collected on the same individuals). These characteristics help ensure the results from a population are more generalizable and facilitate evaluation of the utility of different models of heterogeneity. A model's utility can be shown by its ability to explain clinically or mechanistically important phenomena, but also by explaining how variability manifests across different levels of analysis. The directionality for explaining variability across levels can be bottom-up or top-down, and should take into account the importance of development for characterizing change within individuals. While progress can be made with 'supervised' models built upon a priori or theoretically predicted distinctions or dimensions of importance, it will become increasingly important to complement such work with unsupervised data-driven discoveries that leverage unknown and multivariate distinctions within big data. Without a better understanding of how to model heterogeneity between autistic people, progress towards the goal of precision medicine may be limited.