Meta-matching: a simple framework to translate phenotypic predictive models from big to small data

By Tong He, Lijun An, Jiashi Feng, Danilo Bzdok, Avram Holmes, Simon B. Eickhoff, B.T. Thomas Yeo

Posted 11 Aug 2020
bioRxiv DOI: 10.1101/2020.08.10.245373

There is significant interest in using brain imaging data to predict non-brain-imaging phenotypes in individual participants. However, most prediction studies are underpowered, relying on fewer than a few hundred participants, leading to low reliability and inflated prediction performance. Yet, small sample sizes are unavoidable when studying clinical populations or addressing focused neuroscience questions. Here, we propose a simple framework - "meta-matching" - to translate predictive models from large-scale datasets to new unseen non-brain-imaging phenotypes in boutique studies. The key observation is that many large-scale datasets collect a wide range of inter-correlated phenotypic measures. Therefore, a unique phenotype from a boutique study likely correlates with (but is not the same as) some phenotypes in some large-scale datasets. Meta-matching exploits these correlations to boost prediction in the boutique study. We applied meta-matching to the problem of predicting non-brain-imaging phenotypes using resting-state functional connectivity (RSFC). Using the UK Biobank (N = 36,848), we demonstrated that meta-matching can boost the prediction of new phenotypes in small independent datasets by 100% to 400% in many scenarios. When considering relative prediction performance, meta-matching significantly improved phenotypic prediction even in samples with 10 participants. When considering absolute prediction performance, meta-matching significantly improved phenotypic prediction when there were at least 50 participants. With a growing number of large-scale population-level datasets collecting an increasing number of phenotypic measures, our results represent a lower bound on the potential of meta-matching to elevate small-scale boutique studies.

Competing Interest Statement

The authors have declared no competing interest.
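The core idea in the abstract, reusing a model trained on many phenotypes in a large dataset to predict a new but correlated phenotype in a small study, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes synthetic random features in place of RSFC, closed-form ridge regression in place of whatever model the paper uses, and one simple matching rule: pick the source phenotype whose predictions correlate best with the new phenotype in K labelled participants. The names `fit_ridge` and `meta_match`, and all sizes and coefficients, are hypothetical.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; returns weights of shape (features, outputs)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def meta_match(W, X_k, y_k):
    """Choose the source phenotype whose prediction best tracks y_k.

    Returns the index of the selected source phenotype and a sign (+1 or -1)
    so that negatively correlated phenotypes can also be reused.
    """
    preds = X_k @ W  # shape: (K, n_source_phenotypes)
    corrs = np.array(
        [np.corrcoef(preds[:, j], y_k)[0, 1] for j in range(preds.shape[1])]
    )
    best = int(np.argmax(np.abs(corrs)))
    sign = 1.0 if corrs[best] >= 0 else -1.0
    return best, sign


rng = np.random.default_rng(0)
n_large, n_small, n_features, n_pheno = 2000, 200, 30, 5

# Synthetic "large-scale" dataset with several phenotypes (assumption).
B = rng.normal(size=(n_features, n_pheno))
X_large = rng.normal(size=(n_large, n_features))
Y_large = X_large @ B + rng.normal(size=(n_large, n_pheno))

# Synthetic "boutique" dataset: the new phenotype is correlated with,
# but not identical to, source phenotype 2 (assumption).
X_small = rng.normal(size=(n_small, n_features))
y_new = 0.8 * (X_small @ B[:, 2]) + rng.normal(size=n_small)

# Train once on the large dataset, then match using only K participants.
W = fit_ridge(X_large, Y_large)
K = 20
best, sign = meta_match(W, X_small[:K], y_new[:K])

# Evaluate on the remaining boutique participants.
y_pred = sign * (X_small[K:] @ W[:, best])
test_corr = np.corrcoef(y_pred, y_new[K:])[0, 1]
print(f"selected source phenotype: {best}, held-out correlation: {test_corr:.2f}")
```

In this sketch no model is trained on the small dataset; the K participants are used only to select which existing prediction to reuse, which is why the approach can remain usable at very small sample sizes.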

Download data

  • Downloaded 1,923 times
  • Download rankings, all-time:
    • Site-wide: 12,562
    • In neuroscience: 3,228
  • Year to date:
    • Site-wide: 2,132
  • Since beginning of last month:
    • Site-wide: 1,061
