Mortality Selection in a Genetic Sample and Implications for Association Studies
Mortality selection is a general concern in the social and health sciences. Recently, existing health and social science cohorts have begun to collect genomic data. Causes of selection into a genomic dataset can influence results from genomic analyses. Selective non-participation, which is specific to a particular study and its participants, has received attention in the literature. But mortality selection---the very general phenomenon that genomic data collected at a particular age represents selective participation by only the subset of birth cohort members who have survived to the time of data collection---has been largely ignored. Here we test the hypothesis that such mortality selection may significantly alter estimates in polygenic association studies of both health and non-health traits. We demonstrate mortality selection into genome-wide SNP data collection at older ages using the U.S.-based Health and Retirement Study (HRS). We then model the selection process. Finally, we test whether mortality selection alters estimates from genetic association studies. We find evidence for mortality selection. Healthier and more socioeconomically advantaged individuals are more likely to survive to be eligible to participate in the genetic sample of the HRS. Mortality selection leads to modest drift in estimating time-varying genetic effects, a drift that is enhanced when estimates are produced from data that has additional mortality selection. There is no general solution for correcting for mortality selection in a birth cohort prior to entry into a longitudinal study. We illustrate how genetic association studies using HRS data can adjust for mortality selection from study entry to time of genetic data collection by including probability weights that account for mortality selection. Mortality selection should be investigated more broadly in genetically-informed samples from other cohort studies.
- Downloaded 633 times
- Download rankings, all-time:
- Site-wide: 39,782
- In genetics: 1,892
- Year to date:
- Site-wide: 102,970
- Since beginning of last month:
- Site-wide: 118,825
Downloads over time
Distribution of downloads per paper, site-wide
- 27 Nov 2020: The website and API now include results pulled from medRxiv as well as bioRxiv.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!