Accurate ethnicity prediction from placental DNA methylation data
E Magda Price,
Giulia F Del Gobbo,
Alexandra M Binder,
Karin B Michels,
Carmen J Marsit,
Wendy P. Robinson
Posted 30 Apr 2019
bioRxiv DOI: 10.1101/618470 (published DOI: 10.1186/s13072-019-0296-3)
Background: The influence of genetics on variation in DNA methylation (DNAme) is well documented, yet confounding from population stratification is often unaccounted for in DNAme association studies. Existing approaches that use DNAme data to address this confounding may not generalize to populations or tissues outside those in which they were developed. To help future placental DNAme studies assess population stratification, we developed an ethnicity classifier, PlaNET (Placental DNAme Elastic Net Ethnicity Tool), using Infinium HumanMethylation450 BeadChip (HM450K) array data from placental samples in five cohorts. The classifier is also compatible with the newer EPIC platform.

Results: Data from 509 placental samples were used to develop PlaNET and to show that it accurately predicts (accuracy = 0.938, kappa = 0.823) major classes of self-reported ethnicity/race (African: n = 58, Asian: n = 53, Caucasian: n = 389). It also produces ethnicity probabilities that are highly correlated with genetic ancestry inferred from genome-wide SNP arrays (>2.5 million SNPs) and ancestry informative markers (n = 50 SNPs). PlaNET's ethnicity classification relies on 1860 HM450K microarray sites, and over half of these (n = 955) were linked to nearby genetic polymorphisms. Our placental-optimized method outperforms existing approaches in assessing population stratification in placental samples from individuals of Asian, African, and Caucasian ethnicities.

Conclusion: PlaNET provides an improved approach to address population stratification in placental DNAme association studies. The method can be applied to predict ethnicity as a discrete or continuous variable and will be especially useful when self-reported ethnicity information is missing and genotyping markers are unavailable. PlaNET is available as an R package at <https://github.com/wvictor14/planet>.
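The Results report both accuracy and kappa. With classes as unbalanced as these (Caucasian samples are the large majority), kappa is informative because it discounts the agreement expected by chance. The following is a minimal sketch of how the two metrics are computed from true and predicted labels, assuming that kappa here means Cohen's kappa. The function name `accuracy_and_kappa` and the ten toy labels are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences.

    Hypothetical helper for illustration; not part of the PlaNET package.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement: fraction of samples whose prediction matches the label.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, the product of its frequency among the
    # true labels and among the predictions, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Only one class present in both sequences: kappa is undefined.
        return p_o, float("nan")
    return p_o, (p_o - p_e) / (1.0 - p_e)


# Toy labels (invented): 2 AFR, 2 ASI, 6 CAU, with one ASI sample
# misclassified as CAU.
toy_true = ["AFR", "AFR", "ASI", "ASI"] + ["CAU"] * 6
toy_pred = ["AFR", "AFR", "ASI", "CAU"] + ["CAU"] * 6

toy_accuracy, toy_kappa = accuracy_and_kappa(toy_true, toy_pred)
print(f"accuracy = {toy_accuracy:.3f}, kappa = {toy_kappa:.3f}")
```

In this toy case a single error among ten samples gives an accuracy of 0.9 but a lower kappa, because a classifier that mostly predicts the majority class would already agree with the labels fairly often by chance.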
Abbreviations:
- PlaNET: Placental DNAme Elastic Net Ethnicity Tool
- DNAme: DNA methylation
- CpG: Cytosine-phosphate-guanine
- SNP: Single-nucleotide polymorphism
- AIMs: Ancestry informative genotyping markers
- mQTL: Methylation quantitative trait loci
- PCA: Principal component analysis
- PC: Principal component
- HM450K: Infinium HumanMethylation450 BeadChip
- EPIC: Infinium MethylationEPIC BeadChip
- LODOCV: Leave-one-dataset-out cross validation
- GLMNET: Generalized logistic regression with an elastic net penalty
- SVM: Support vector machines
- KNN: K-nearest neighbours
- NSC: Nearest shrunken centroids
- USA: United States of America
- AFR: African
- ASI: Asian
- CAU: Caucasian
- BMIQ: Beta-mixture interquantile normalization
- NOOB: Normal exponential out-of-band normalization