
Bayesian analysis of genetic association across tree-structured routine healthcare data in the UK Biobank

By Adrian Cortes, Calliope A. Dendrou, Allan Motyer, Luke Jostins, Damjan Vukcevic, Alexander Dilthey, Peter Donnelly, Stephen Leslie, Lars Fugger, Gil McVean

Posted 01 Feb 2017
bioRxiv DOI: 10.1101/105122 (published DOI: 10.1038/ng.3926)

Genetic discovery from the multitude of phenotypes extractable from routine healthcare data has the ability to radically transform our understanding of the human phenome, thereby accelerating progress towards precision medicine. However, a critical question when analysing high-dimensional and heterogeneous data is how to interrogate increasingly specific subphenotypes whilst retaining statistical power to detect genetic associations. Here we develop and employ a novel Bayesian analysis framework that exploits the hierarchical structure of diagnosis classifications to jointly analyse genetic variants against UK Biobank healthcare phenotypes. Our method displays a more than 20% increase in power to detect genetic effects over other approaches, such that we uncover the broader burden of genetic variation: we identify associations with over 2,000 diagnostic terms. We find novel associations with common immune-mediated diseases (IMD), we reveal the extent of genetic sharing between specific IMDs, and we expose differences in disease perception or diagnosis with potential clinical implications.

Download data

  • Downloaded 1,613 times
  • Download rankings, all-time:
    • Site-wide: 4,792 out of 83,434
    • In genetics: 363 out of 4,385
  • Year to date:
    • Site-wide: 42,335 out of 83,434
  • Since beginning of last month:
    • Site-wide: 34,693 out of 83,434
