Whole-genome reference panel of 1,781 Northeast Asians improves imputation accuracy of rare and low-frequency variants

By Seong-Keun Yoo, Chang-Uk Kim, Hie Lim Kim, Sungjae Kim, Jong-Yeon Shin, Namcheol Kim, Joshua SungWoo Yang, Kwok-Wai Lo, Belong Cho, Fumihiko Matsuda, Stephan C. Schuster, Changhoon Kim, Jong-Il Kim, Jeong-Sun Seo

Posted 17 Apr 2019
bioRxiv DOI: 10.1101/600353 (published DOI: 10.1186/s13073-019-0677-z)

Genotype imputation using a reference panel is a cost-effective strategy for filling in millions of missing genotypes for use in various genetic analyses. Here, we present the Northeast Asian Reference Database (NARD), which comprises whole-genome sequencing data from 1,781 individuals from Korea, Mongolia, Japan, China, and Hong Kong. NARD captures the genetic diversity of Korean (n=850) and Mongolian (n=386) ancestries, which were not represented in the 1000 Genomes Project Phase 3 (1KGP3). We combined and re-phased the genotypes from NARD and 1KGP3 to construct a union set of haplotypes. This approach established a robust imputation reference panel for Northeast Asian populations that yields higher imputation accuracy for rare and low-frequency variants than the existing panels. We also illustrate that NARD can potentially improve disease variant discovery by reducing the number of pathogenic candidates. Overall, this study provides a decent reference panel for genetic studies in Northeast Asia.

Download data

  • Downloaded 973 times
  • Download rankings, all-time:
    • Site-wide: 21,727
    • In genomics: 2,145
  • Year to date:
    • Site-wide: 44,690
  • Since beginning of last month:
    • Site-wide: 43,003
