Most downloaded biology preprints, all time
in category genetics
6,203 results found.
49,239 downloads bioRxiv genetics
Wolfgang Haak, Iosif Lazaridis, Nick Patterson, Nadin Rohland, Swapan Mallick, Bastien Llamas, Guido Brandt, Susanne Nordenfelt, Eadaoin Harney, Kristin Stewardson, Qiaomei Fu, Alissa Mittnik, Eszter Bánffy, Christos Economou, Michael Francken, Susanne Friederich, Rafael Garrido Pena, Fredrik Hallgren, Valery Khartanovich, Aleksandr Khokhlov, Michael Kunst, Pavel Kuznetsov, Harald Meller, Oleg Mochalov, Vayacheslav Moiseyev, Nicole Nicklisch, Sandra L. Pichler, Roberto Risch, Manuel A. Rojo Guerra, Christina Roth, Anna Szécsényi-Nagy, Joachim Wahl, Matthias Meyer, Johannes Krause, Dorcas Brown, David Anthony, Alan Cooper, Kurt Werner Alt, David Reich
We generated genome-wide data from 69 Europeans who lived between 8,000 and 3,000 years ago by enriching ancient DNA libraries for a target set of almost four hundred thousand polymorphisms. Enrichment of these positions decreases the sequencing required for genome-wide ancient DNA analysis by a median of around 250-fold, allowing us to study an order of magnitude more individuals than previous studies and to obtain new insights about the past. We show that the populations of western and far eastern Europe followed opposite trajectories between 8,000 and 5,000 years ago. At the beginning of the Neolithic period in Europe, ~8,000-7,000 years ago, closely related groups of early farmers appeared in Germany, Hungary, and Spain, different from indigenous hunter-gatherers, whereas Russia was inhabited by a distinctive population of hunter-gatherers with high affinity to a ~24,000 year old Siberian. By ~6,000-5,000 years ago, a resurgence of hunter-gatherer ancestry had occurred throughout much of Europe, but in Russia, the Yamnaya steppe herders of this time were descended not only from the preceding eastern European hunter-gatherers, but from a population of Near Eastern ancestry. Western and Eastern Europe came into contact ~4,500 years ago, as the Late Neolithic Corded Ware people from Germany traced ~3/4 of their ancestry to the Yamnaya, documenting a massive migration into the heartland of Europe from its eastern periphery. This steppe ancestry persisted in all sampled central Europeans until at least ~3,000 years ago, and is ubiquitous in present-day Europeans. These results provide support for the theory of a steppe origin of at least some of the Indo-European languages of Europe.
39,437 downloads bioRxiv genetics
Iain Mathieson, Iosif Lazaridis, Nadin Rohland, Swapan Mallick, Nick Patterson, Songül Alpaslan Roodenberg, Eadaoin Harney, Kristin Stewardson, Daniel Fernandes, Mario Novak, Kendra Sirak, Cristina Gamba, Eppie R. Jones, Bastien Llamas, Stanislav Dryomov, Joseph Pickrell, Juan Luís Arsuaga, José María Bermúdez de Castro, Eudald Carbonell, Fokke Gerritsen, Aleksandr Khokhlov, Pavel Kuznetsov, Marina Lozano, Harald Meller, Oleg Mochalov, Vayacheslav Moiseyev, Manuel A. Rojo Guerra, Jacob Roodenberg, Josep Maria Vergès, Johannes Krause, Alan Cooper, Kurt W. Alt, Dorcas Brown, David Anthony, Carles Lalueza-Fox, Wolfgang Haak, Ron Pinhasi, David Reich
The arrival of farming in Europe around 8,500 years ago necessitated adaptation to new environments, pathogens, diets, and social organizations. While indirect evidence of adaptation can be detected in patterns of genetic variation in present-day people, ancient DNA makes it possible to witness selection directly by analyzing samples from populations before, during and after adaptation events. Here we report the first genome-wide scan for selection using ancient DNA, capitalizing on the largest genome-wide dataset yet assembled: 230 West Eurasians dating to between 6500 and 1000 BCE, including 163 with newly reported data. The new samples include the first genome-wide data from the Anatolian Neolithic culture, who we show were members of the population that was the source of Europe's first farmers, and whose genetic material we extracted by focusing on the DNA-rich petrous bone. We identify genome-wide significant signatures of selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height.
27,234 downloads bioRxiv genetics
Accurate understanding of the global spread of emerging viruses is critically important for public health response and for anticipating and preventing future outbreaks. Here, we elucidate when, where and how the earliest sustained SARS-CoV-2 transmission networks became established in Europe and the United States (US). Our results refute prior findings erroneously linking cases in January 2020 with outbreaks that occurred weeks later. Instead, rapid interventions successfully prevented onward transmission of those early cases in Germany and Washington State. Other, later introductions of the virus from China to both Italy and Washington State founded the earliest sustained European and US transmission networks. Our analyses reveal an extended period of missed opportunity when intensive testing and contact tracing could have prevented SARS-CoV-2 from becoming established in the US and Europe. Competing Interest Statement: JOW has received funding from Gilead Sciences, LLC (completed) and the CDC (ongoing) via grants and contracts to his institution unrelated to this research. MAS receives funding from Janssen Research & Development, IQVIA and Private Health Management via contracts unrelated to this research.
25,969 downloads bioRxiv genetics
Iosif Lazaridis, Dani Nadel, Gary Rollefson, Deborah C Merrett, Nadin Rohland, Swapan Mallick, Daniel Fernandes, Mario Novak, Beatriz Gamarra, Kendra Sirak, Sarah Connell, Kristin Stewardson, Eadaoin Harney, Qiaomei Fu, Gloria Gonzalez-Fortes, Songül Alpaslan Roodenberg, György Lengyel, Fanny Bocquentin, Boris Gasparian, Janet M. Monge, Michael Gregg, Vered Eshed, Ahuva-Sivan Mizrahi, Christopher Meiklejohn, Fokke Gerritsen, Luminita Bejenaru, Matthias Blueher, Archie Campbell, Gianpero Cavalleri, David Comas, Philippe Froguel, Edmund Gilbert, Shona M. Kerr, Peter Kovacs, Johannes Krause, Darren McGettigan, Michael Merrigan, D. Andrew Merriwether, Seamus O’Reilly, Martin B. Richards, Ornella Semino, Michel Shamoon-Pour, Gheorghe Stefanescu, Michael Stumvoll, Anke Tönjes, Antonio Torroni, James F Wilson, Loic Yengo, Nelli A. Hovhannisyan, Nick Patterson, Ron Pinhasi, David Reich
We report genome-wide ancient DNA from 44 ancient Near Easterners ranging in time between ~12,000-1,400 BCE, from Natufian hunter-gatherers to Bronze Age farmers. We show that the earliest populations of the Near East derived around half their ancestry from a 'Basal Eurasian' lineage that had little if any Neanderthal admixture and that separated from other non-African lineages prior to their separation from each other. The first farmers of the southern Levant (Israel and Jordan) and Zagros Mountains (Iran) were strongly genetically differentiated, and each descended from local hunter-gatherers. By the time of the Bronze Age, these two populations and Anatolian-related farmers had mixed with each other and with the hunter-gatherers of Europe to drastically reduce genetic differentiation. The impact of the Near Eastern farmers extended beyond the Near East: farmers related to those of Anatolia spread westward into Europe; farmers related to those of the Levant spread southward into East Africa; farmers related to those from Iran spread northward into the Eurasian steppe; and people related to both the early farmers of Iran and to the pastoralists of the Eurasian steppe spread eastward into South Asia.
20,650 downloads bioRxiv genetics
Clare Bycroft, Colin Freeman, Desislava Petkova, Gavin Band, Lloyd T Elliott, Kevin Sharp, Allan Motyer, Damjan Vukcevic, Olivier Delaneau, Jared O’Connell, Adrian Cortes, Samantha Welsh, Gil McVean, Stephen Leslie, Peter Donnelly, Jonathan Marchini
The UK Biobank project is a large prospective cohort study of ~500,000 individuals from across the United Kingdom, aged between 40-69 at recruitment. A rich variety of phenotypic and health-related information is available on each participant, making the resource unprecedented in its size and scope. Here we describe the genome-wide genotype data (~805,000 markers) collected on all individuals in the cohort and its quality control procedures. Genotype data on this scale offers novel opportunities for assessing quality issues, although the wide range of ancestries of the individuals in the cohort also creates particular challenges. We also conducted a set of analyses that reveal properties of the genetic data (such as population structure and relatedness) that can be important for downstream analyses. In addition, we phased and imputed genotypes into the dataset, using computationally efficient methods combined with the Haplotype Reference Consortium (HRC) and UK10K haplotype resource. This increases the number of testable variants by over 100-fold to ~96 million variants. We also imputed classical allelic variation at 11 human leukocyte antigen (HLA) genes, and as a quality control check of this imputation, we replicate signals of known associations between HLA alleles and many common diseases. We describe tools that allow efficient genome-wide association studies (GWAS) of multiple traits and fast phenome-wide association studies (PheWAS), which work together with a new compressed file format that has been used to distribute the dataset. As a further check of the genotyped and imputed datasets, we performed a test-case genome-wide association scan on a well-studied human trait, standing height.
20,553 downloads bioRxiv genetics
Iain Mathieson, Songül Alpaslan Roodenberg, Cosimo Posth, Kurt W. Alt, Nadin Rohland, Swapan Mallick, Iñigo Olalde, Nasreen Broomandkhoshbacht, Francesca Candilio, Olivia Cheronet, Daniel M Fernandes, Matthew Ferry, Beatriz Gamarra, Gloria González Fortes, Wolfgang Haak, Eadaoin Harney, Eppie Jones, Denise Keating, Ben Krause-Kyora, Isil Kucukkalipci, Megan Michel, Alissa Mittnik, Kathrin Nägele, Mario Novak, Jonas Oppenheimer, Nick Patterson, Saskia Pfrengle, Kendra Sirak, Kristin Stewardson, Stefania Vai, Stefan Alexandrov, Kurt W. Alt, Radian Andreescu, Dragana Antonović, Abigail Ash, Nadezhda Atanassova, Krum Bacvarov, Mende Balázs Gusztáv, Hervé Bocherens, Michael Bolus, Adina Boroneanţ, Yavor Boyadzhiev, Alicja Budnik, Josip Burmaz, Stefan Chohadzhiev, Nicholas J. Conard, Richard Cottiaux, Maja Čuka, Christophe Cupillard, Dorothée G. Drucker, Nedko Elenski, Michael Francken, Borislava Galabova, Georgi Ganetovski, Bernard Gély, Tamás Hajdu, Veneta Handzhyiska, Katerina Harvati, Thomas Higham, Stanislav Iliev, Ivor Janković, Ivor Karavanić, Douglas J. Kennett, Darko Komšo, Alexandra Kozak, Damian Labuda, Martina Lari, Catalin Lazar, Maleen Leppek, Krassimir Leshtakov, Domenico Lo Vetro, Dženi Los, Ivaylo Lozanov, Maria Malina, Fabio Martini, Kath McSweeney, Harald Meller, Marko Menđušić, Pavel Mirea, Vyacheslav Moiseyev, Vanya Petrova, T. Douglas Price, Angela Simalcsik, Luca Sineo, Mario Šlaus, Vladimir Slavchev, Petar Stanev, Andrej Starović, Tamás Szeniczey, Sahra Talamo, Maria Teschler-Nicola, Corinne Thevenet, Ivan Valchev, Frédérique Valentin, Sergey Vasilyev, Fanica Veljanovska, Svetlana Venelinova, Elizaveta Veselovskaya, Bence Viola, Cristian Virag, Joško Zaninović, Steve Zäuner, Philipp W. Stockhammer, Giulio Catalano, Raiko Krauß, David Caramelli, Gunita Zariņa, Bisserka Gaydarska, Malcolm Lillie, Alexey G. Nikitin, Inna Potekhina, Anastasia Papathanasiou, Dušan Borić, Clive Bonsall, Johannes Krause, Ron Pinhasi, David Reich
Farming was first introduced to southeastern Europe in the mid-7th millennium BCE - brought by migrants from Anatolia who settled in the region before spreading throughout Europe. To clarify the dynamics of the interaction between the first farmers and indigenous hunter-gatherers where they first met, we analyze genome-wide ancient DNA data from 223 individuals who lived in southeastern Europe and surrounding regions between 12,000 and 500 BCE. We document previously uncharacterized genetic structure, showing a West-East cline of ancestry in hunter-gatherers, and show that some Aegean farmers had ancestry from a different lineage than the northwestern Anatolian lineage that formed the overwhelming ancestry of other European farmers. We show that the first farmers of northern and western Europe passed through southeastern Europe with limited admixture with local hunter-gatherers, but that some groups mixed extensively, with relatively sex-balanced admixture compared to the male-biased hunter-gatherer admixture that prevailed later in the North and West. Southeastern Europe continued to be a nexus between East and West after farming arrived, with intermittent genetic contact from the Steppe up to 2000 years before the migration that replaced much of northern Europe's population.
20,538 downloads bioRxiv genetics
The Armenians are a culturally isolated population who historically inhabited a region in the Near East bounded by the Mediterranean and Black seas and the Caucasus, but remain underrepresented in genetic studies and have a complex history including a major geographic displacement during World War One. Here, we analyse genome-wide variation in 173 Armenians and compare them to 78 other worldwide populations. We find that Armenians form a distinctive cluster linking the Near East, Europe, and the Caucasus. We show that Armenian diversity can be explained by several mixtures of Eurasian populations that occurred between ~3,000 and ~2,000 BCE, a period characterized by major population migrations after the domestication of the horse, appearance of chariots, and the rise of advanced civilizations in the Near East. However, genetic signals of population mixture cease after ~1,200 BCE when Bronze Age civilizations in the Eastern Mediterranean world suddenly and violently collapsed. Armenians have since remained isolated and genetic structure within the population developed ~500 years ago when Armenia was divided between the Ottomans and the Safavid Empire in Iran. Finally, we show that Armenians have higher genetic affinity to Neolithic Europeans than other present-day Near Easterners, and that 29% of the Armenian ancestry may originate from an ancestral population best represented by Neolithic Europeans.
20,002 downloads bioRxiv genetics
Iosif Lazaridis, Nick Patterson, Alissa Mittnik, Gabriel Renaud, Swapan Mallick, Karola Kirsanow, Peter H Sudmant, Joshua G. Schraiber, Sergi Castellano, Mark Lipson, Bonnie Berger, Christos Economou, Ruth Bollongino, Qiaomei Fu, Kirsten I. Bos, Susanne Nordenfelt, Heng Li, Cesare de Filippo, Kay Prüfer, Susanna Sawyer, Cosimo Posth, Wolfgang Haak, Fredrik Hallgren, Elin Fornander, Nadin Rohland, Dominique Delsate, Michael Francken, Jean-Michel Guinet, Joachim Wahl, George Ayodo, Hamza A. Babiker, Graciela Bailliet, Elena Balanovska, Oleg Balanovsky, Ramiro Barrantes, Gabriel Bedoya, Haim Ben-Ami, Judit Bene, Fouad Berrada, Claudio M. Bravi, Francesca Brisighelli, George Busby, Francesco Cali, Mikhail Churnosov, David E. C. Cole, Daniel Corach, Larissa Damba, George van Driem, Stanislav Dryomov, Jean-Michel Dugoujon, Sardana A. Fedorova, Irene Gallego Romero, Marina Gubina, Michael Hammer, Brenna Henn, Tor Hervig, Ugur Hodoglugil, Aashish R. Jha, Sena Karachanak-Yankova, Rita Khusainova, Elza Khusnutdinova, Rick Kittles, Toomas Kivisild, William Klitz, Vaidutis Kučinskas, Alena Kushniarevich, Leila Laredj, Sergey Litvinov, Theologos Loukidis, Robert W. Mahley, Béla Melegh, Ene Metspalu, Julio Molina, Joanna Mountain, Klemetti Näkkäläjärvi, Desislava Nesheva, Thomas Nyambo, Ludmila Osipova, Jüri Parik, Fedor Platonov, Olga Posukh, Valentino Romano, Francisco Rothhammer, Igor Rudan, Ruslan Ruizbakiev, Hovhannes Sahakyan, Antti Sajantila, Antonio Salas, Elena B. Starikovskaya, Ayele Tarekegn, Draga Toncheva, Shahlo Turdikulova, Ingrida Uktveryte, Olga Utevska, René Vasquez, Mercedes Villena, Mikhail Voevoda, Cheryl Winkler, Levon Yepiskoposyan, Pierre Zalloua, Tatijana Zemunik, Alan Cooper, Cristian Capelli, Mark G. Thomas, Andres Ruiz-Linares, Sarah A. Tishkoff, Lalji Singh, Kumarasamy Thangaraj, Richard Villems, David Comas, Rem Sukernik, Mait Metspalu, Matthias Meyer, Evan E. Eichler, Joachim Burger, Montgomery Slatkin, Svante Pääbo, Janet Kelso, David Reich, Johannes Krause
We sequenced genomes from a ~7,000 year old early farmer from Stuttgart in Germany, an ~8,000 year old hunter-gatherer from Luxembourg, and seven ~8,000 year old hunter-gatherers from southern Sweden. We analyzed these data together with other ancient genomes and 2,345 contemporary humans to show that the great majority of present-day Europeans derive from at least three highly differentiated populations: West European Hunter-Gatherers (WHG), who contributed ancestry to all Europeans but not to Near Easterners; Ancient North Eurasians (ANE), who were most closely related to Upper Paleolithic Siberians and contributed to both Europeans and Near Easterners; and Early European Farmers (EEF), who were mainly of Near Eastern origin but also harbored WHG-related ancestry. We model these populations' deep relationships and show that EEF had ~44% ancestry from a "Basal Eurasian" lineage that split prior to the diversification of all other non-African lineages.
19,816 downloads bioRxiv genetics
Ashot Margaryan, Daniel Lawson, Martin Sikora, Fernando Racimo, Simon Rasmussen, Ida Moltke, Lara Cassidy, Emil Jørsboe, Andrés Ingason, Mikkel Pedersen, Thorfinn Korneliussen, Helene Wilhelmson, Magdalena Buś, Peter de Barros Damgaard, Rui Martiniano, Gabriel Renaud, Claude Bhérer, J. Víctor Moreno-Mayar, Anna Fotakis, Marie Allen, Martyna Molak, E. Cappellini, Gabriele Scorrano, Alexandra Buzhilova, Allison Fox, Anders Albrechtsen, Berit Schütz, Birgitte Skar, Caroline Arcini, Ceri Falys, Charlotte Hedenstierna Jonson, Dariusz Błaszczyk, Denis Pezhemsky, Gordon Turner-Walker, Hildur Gestsdóttir, Inge Lundstrøm, Ingrid Gustin, Ingrid Mainland, Inna Potekhina, Italo Muntoni, Jade Cheng, Jesper Stenderup, Jilong Ma, Julie Gibson, Jüri Peets, Jörgen Gustafsson, Katrine Iversen, Linzi Simpson, Lisa Strand, Louise Loe, Maeve Sikora, Marek Florek, Maria Vretemark, Mark Redknap, Monika Bajka, Tamara Pushkina, Morten Søvsø, Natalia Grigoreva, Tom Christensen, Ole Kastholm, Otto Uldum, Pasquale Favia, Per Holck, Raili Allmäe, Sabine Sten, Símun Arge, Sturla Ellingvåg, Vayacheslav Moiseyev, Wiesław Bogdanowicz, Yvonne Magnusson, Ludovic Orlando, Daniel Bradley, Marie Louise Jørkov, Jette Arneborg, Niels Lynnerup, Neil Price, M. Thomas Gilbert, Morten Allentoft, Jan Bill, Søren Sindbæk, Lotte Hedeager, Kristian Kristiansen, Rasmus Nielsen, Thomas Werge, Eske Willerslev
The Viking maritime expansion from Scandinavia (Denmark, Norway, and Sweden) marks one of the swiftest and most far-flung cultural transformations in global history. During this time (c. 750 to 1050 CE), the Vikings reached most of western Eurasia, Greenland, and North America, and left a cultural legacy that persists till today. To understand the genetic structure and influence of the Viking expansion, we sequenced the genomes of 442 ancient humans from across Europe and Greenland ranging from the Bronze Age (c. 2400 BC) to the early Modern period (c. 1600 CE), with particular emphasis on the Viking Age. We find that the period preceding the Viking Age was accompanied by foreign gene flow into Scandinavia from the south and east: spreading from Denmark and eastern Sweden to the rest of Scandinavia. Despite the close linguistic similarities of modern Scandinavian languages, we observe genetic structure within Scandinavia, suggesting that regional population differences were already present 1,000 years ago. We find evidence for a majority of Danish Viking presence in England, Swedish Viking presence in the Baltic, and Norwegian Viking presence in Ireland, Iceland, and Greenland. Additionally, we see substantial foreign European ancestry entering Scandinavia during the Viking Age. We also find that several of the members of the only archaeologically well-attested Viking expedition were close family members. By comparing Viking Scandinavian genomes with present-day Scandinavian genomes, we find that pigmentation-associated loci have undergone strong population differentiation during the last millennia. Finally, we are able to trace the allele frequency dynamics of positively selected loci with unprecedented detail, including the lactase persistence allele and various alleles associated with the immune response. 
We conclude that the Viking diaspora was characterized by substantial foreign engagement: distinct Viking populations influenced the genomic makeup of different regions of Europe, while Scandinavia also experienced increased contact with the rest of the continent.
17,169 downloads bioRxiv genetics
Alissa Mittnik, Chuan-Chao Wang, Saskia Pfrengle, Mantas Daubaras, Gunita Zarina, Fredrik Hallgren, Raili Allmäe, Valery Khartanovich, Vyacheslav Moiseyev, Anja Furtwängler, Aida Andrades Valtueña, Michal Feldman, Christos Economou, Markku Oinonen, Andrejs Vasks, Mari Tõrv, Oleg Balanovsky, David Reich, Rimantas Jankauskas, Wolfgang Haak, Stephan Schiffels, Johannes Krause
Recent ancient DNA studies have revealed that the genetic history of modern Europeans was shaped by a series of migration and admixture events between deeply diverged groups. While these events are well described in Central and Southern Europe, genetic evidence from Northern Europe surrounding the Baltic Sea is still sparse. Here we report genome-wide DNA data from 24 ancient North Europeans ranging from ~7,500 to 200 calBCE spanning the transition from a hunter-gatherer to an agricultural lifestyle, as well as the adoption of bronze metallurgy. We show that Scandinavia was settled after the retreat of the glacial ice sheets from a southern and a northern route, and that the first Scandinavian Neolithic farmers derive their ancestry from Anatolia 1000 years earlier than previously demonstrated. The range of Western European Mesolithic hunter-gatherers extended to the east of the Baltic Sea, where these populations persisted without gene-flow from Central European farmers until around 2,900 calBCE when the arrival of steppe pastoralists introduced a major shift in economy and established wide-reaching networks of contact within the Corded Ware Complex.
16,834 downloads bioRxiv genetics
Detection of recent natural selection is a challenging problem in population genetics, as standard methods generally integrate over long timescales. Here we introduce the Singleton Density Score (SDS), a powerful measure to infer very recent changes in allele frequencies from contemporary genome sequences. When applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past 2,000 years. We see strong signals of selection at lactase and HLA, and in favor of blond hair and blue eyes. Turning to signals of polygenic adaptation we find, remarkably, that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we report suggestive new evidence for polygenic shifts affecting many other complex traits. Our results suggest that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
14,786 downloads bioRxiv genetics
Iosif Lazaridis, Anna Belfer-Cohen, Swapan Mallick, Nick Patterson, Olivia Cheronet, Nadin Rohland, Guy Bar-Oz, Ofer Bar-Yosef, Nino Jakeli, Eliso Kvavadze, David Lordkipanidze, Zinovi Matzkevich, Tengiz Meshveliani, Brendan J Culleton, Douglas J. Kennett, Ron Pinhasi, David Reich
The earliest ancient DNA data of modern humans from Europe dates to ~40 thousand years ago, but that from the Caucasus and the Near East to only ~14 thousand years ago, from populations who lived long after the Last Glacial Maximum (LGM) ~26.5-19 thousand years ago. To address this imbalance and to better understand the relationship of Europeans and Near Easterners, we report genome-wide data from two ~26 thousand year old individuals from Dzudzuana Cave in Georgia in the Caucasus from around the beginning of the LGM. Surprisingly, the Dzudzuana population was more closely related to early agriculturalists from western Anatolia ~8 thousand years ago than to the hunter-gatherers of the Caucasus from the same region of western Georgia of ~13-10 thousand years ago. Most of the Dzudzuana population's ancestry was deeply related to the post-glacial western European hunter-gatherers of the 'Villabruna cluster', but it also had ancestry from a lineage that had separated from the great majority of non-African populations before they separated from each other, proving that such 'Basal Eurasians' were present in West Eurasia twice as early as previously recorded. We document major population turnover in the Near East after the time of Dzudzuana, showing that the highly differentiated Holocene populations of the region were formed by 'Ancient North Eurasian' admixture into the Caucasus and Iran and North African admixture into the Natufians of the Levant. We finally show that the Dzudzuana population contributed the majority of the ancestry of post-Ice Age people in the Near East, North Africa, and even parts of Europe, thereby becoming the largest single contributor of ancestry of all present-day West Eurasians.
14,758 downloads bioRxiv genetics
Eric W Stawiski, Devan Diwanji, Kushal Suryamohan, Ravi Gupta, Frederic A Fellouse, J. Fah Sathirapongsasuti, Jiang Liu, Ying-Ping Jiang, Aakrosh Ratan, Monika Mis, Devi Santhosh, Sneha Somasekar, Sangeetha Mohan, Sameer Phalke, Boney Kuriakose, Aju Antony, Jagath R Junutula, Stephan C. Schuster, Natalia Jura, Somasekar Seshagiri
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of coronavirus disease (COVID-19) that has resulted in a global pandemic. It is a highly contagious positive strand RNA virus and its clinical presentation includes severe to critical respiratory disease that appears to be fatal in ~3-5% of the cases. The viral spike (S) coat protein engages the human angiotensin-converting enzyme 2 (ACE2) cell surface protein to invade the host cell. The SARS-CoV-2 S-protein has acquired mutations that increase its affinity to human ACE2 by ~10-15-fold compared to SARS-CoV S-protein, making it highly infectious. In this study, we assessed if ACE2 polymorphisms might alter host susceptibility to SARS-CoV-2 by affecting the ACE2 S-protein interaction. Our comprehensive analysis of several large genomic datasets that included over 290,000 samples representing >400 population groups identified multiple ACE2 protein-altering variants, some of which mapped to the S-protein-interacting ACE2 surface. Using recently reported structural data and a recent S-protein-interacting synthetic mutant map of ACE2, we have identified natural ACE2 variants that are predicted to alter the virus-host interaction and thereby potentially alter host susceptibility. In particular, human ACE2 variants S19P, I21V, E23K, K26R, T27A, N64K, T92I, Q102P and H378R are predicted to increase susceptibility. The T92I variant, part of a consensus NxS/T N-glycosylation motif, confirmed the role of N90 glycosylation in immunity from non-human CoVs. Other ACE2 variants K31R, N33I, H34R, E35K, E37K, D38V, Y50F, N51S, M62V, K68E, F72V, Y83H, G326E, G352V, D355N, Q388L and D509Y are putative protective variants predicted to show decreased binding to SARS-CoV-2 S-protein. Overall, ACE2 variants are rare, consistent with the lack of selection pressure given the recent history of SARS-CoV epidemics; however, they are likely to play an important role in altering susceptibility to CoVs.
13,560 downloads bioRxiv genetics
Pierrick Wainschtein, Deepti P Jain, Zhili Zheng, TOPMed Anthropometry Working Group, Trans-Omics for Precision Medicine Consortium, L. Adrienne Cupples, Aladdin H Shadyab, Barbara McKnight, Benjamin M Shoemaker, Braxton D Mitchell, Bruce M Psaty, Charles Kooperberg, Ching-Ti Liu, Christine M Albert, Dan Roden, Daniel I. Chasman, Dawood Darbar, Donald M Lloyd-Jones, Donna K Arnett, Elizabeth A Regan, Eric Boerwinkle, Jerome Rotter, Jeffrey R O'Connell, Lisa R Yanek, Mariza de Andrade, Matthew A Allison, Merry-Lynn N McDonald, Mina K Chung, Myriam Fornage, Nathalie Chami, Nicholas L Smith, Patrick T Ellinor, Ramachandran S. Vasan, Rasika A. Mathias, Ruth JF Loos, Stephen Rich, Steven A. Lubitz, Susan R Heckbert, Susan Redline, Xiuqing Guo, Y-D Ida Chen, Cecelia A. Laurie, Ryan D Hernandez, Stephen T. McGarvey, Michael E Goddard, Cathy C Laurie, Kari E North, Leslie A Lange, Bruce S. Weir, Loic Yengo, Jian Yang, Peter M Visscher
Heritability, the proportion of phenotypic variance explained by genetic factors, can be estimated from pedigree data, but such estimates are uninformative with respect to the underlying genetic architecture. Analyses of data from genome-wide association studies (GWAS) on unrelated individuals have shown that for human traits and disease, approximately one-third to two-thirds of heritability is captured by common SNPs. It is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular if the causal variants are rare, or other reasons such as over-estimation of heritability from pedigree data. Here we show that pedigree heritability for height and body mass index (BMI) appears to be largely recovered from whole-genome sequence (WGS) data on 25,465 unrelated individuals of European ancestry. We assigned 33.7 million genetic variants to groups based upon their minor allele frequencies (MAF) and linkage disequilibrium (LD) with variants nearby, and estimated and partitioned genetic variance accordingly. The estimated heritability was 0.68 (SE 0.10) for height and 0.30 (SE 0.10) for BMI, with a range of ~0.60 - 0.71 for height and ~0.25 - 0.35 for BMI, depending on quality control and analysis strategies. Low-MAF variants in low LD with neighbouring variants were enriched for heritability, to a greater extent for protein-altering variants, consistent with negative selection thereon. Cumulatively, variants with 0.0001 < MAF < 0.1 explained 0.47 (SE 0.07) and 0.30 (SE 0.10) of heritability for height and BMI, respectively. Our results imply that rare variants, in particular those in regions of low LD, are a major source of the still missing heritability of complex traits and disease.
13,468 downloads bioRxiv genetics
Over the past 500 years, North America has been the site of ongoing mixing of Native Americans, European settlers, and Africans brought largely by the Trans-Atlantic slave trade, shaping the early history of what became the United States. We studied the genetic ancestry of 5,269 self-described African Americans, 8,663 Latinos, and 148,789 European Americans who are 23andMe customers and show that the legacy of these historical interactions is visible in the genetic ancestry of present-day Americans. We document pervasive mixed ancestry and asymmetrical male and female ancestry contributions in all groups studied. We show that regional ancestry differences reflect historical events, such as early Spanish colonization, waves of immigration from many regions of Europe, and forced relocation of Native Americans within the US. This study sheds light on the fine-scale differences in ancestry within and across the United States, and informs our understanding of the relationship between racial and ethnic identities and genetic ancestry.
12,771 downloads bioRxiv genetics
A major constraint on the evolution of large body sizes in animals is an increased risk of developing cancer. There is no correlation, however, between body size and cancer risk. This lack of correlation is often referred to as "Peto's Paradox". Here we show that the elephant genome encodes 20 copies of the tumor suppressor gene TP53 and that the increase in TP53 copy number occurred coincident with the evolution of large body sizes in the elephant (Proboscidean) lineage. Furthermore we show that several of the TP53 retrogenes are transcribed and translated and contribute to an enhanced sensitivity of elephant cells to DNA damage and the induction of apoptosis via a hyperactive TP53 signaling pathway. These results suggest that an increase in the copy number of TP53 may have played a direct role in the evolution of very large body sizes and the resolution of Peto's paradox in Proboscideans.
12,622 downloads bioRxiv genetics
Genetic clustering algorithms, implemented in popular programs such as STRUCTURE and ADMIXTURE, have been used extensively in the characterisation of individuals and populations based on genetic data. A successful example is reconstruction of the genetic history of African Americans, who are a product of recent admixture between highly differentiated populations. Histories can also be reconstructed using the same procedure for groups that have no admixture in their recent history, that have experienced strong recent genetic drift, or that deviate in other ways from the underlying inference model. Unfortunately, such histories can be misleading. We have implemented an approach (badMIXTURE, available at github.com/danjlawson/badMIXTURE) to assess the goodness of fit of the model using the ancestry 'palettes' estimated by CHROMOPAINTER and apply it to both simulated and real examples. Combining these complementary analyses with additional methods that are designed to test specific hypotheses allows a richer and more robust analysis of recent demographic history based on genetic data.
12,002 downloads bioRxiv genetics
Ditte Demontis, Raymond K Walters, Joanna Martin, Manuel Mattheisen, Thomas Damm Als, Esben Agerbo, Rich Belliveau, Jonas Bybjerg-Grauholm, Marie Baekvad-Hansen, Felecia Cerrato, Kimberly Chambert, Claire Churchhouse, Ashley Dumont, Nicholas Eriksson, Michael Gandal, Jacqueline Goldstein, Jakob Grove, Christine S. Hansen, Mads E Hauberg, Mads V Hollegaard, Daniel P Howrigan, Hailiang Huang, Julian Maller, Alicia R. Martin, Jennifer Moran, Jonatan Pallesen, Duncan S Palmer, Carsten B Pedersen, Marianne G Pedersen, Timothy Poterba, Jesper B Poulsen, Stephan Ripke, Elise B Robinson, Kyle F Satterstrom, Christine Stevens, Patrick Turley, Hyejung Won, ADHD Working Group of the Psychiatric Genomics Consortium (PGC), Early Lifecourse & Genetic Epidemiology (EAGLE) Consortium, 23andMe Research Team, Ole Andreassen, Christie Burton, Dorret Boomsma, Bru Cormand, Sören Dalsgaard, Barbara Franke, Joel Gelernter, Daniel Geschwind, Hakon Hakonarson, Jan Haavik, Henry Kranzler, Jonna Kuntsi, Kate Langley, Klaus-Peter Lesch, Christel Middeldorp, Andreas Reif, Luis A. Rohde, Panos Roussos, Russell Schachar, Pamela Sklar, Edmund Sonuga-Barke, Patrick F Sullivan, Anita Thapar, Joyce Tung, Irwin Waldman, Merete Nordentoft, David M Hougaard, Thomas Werge, Ole Mors, Preben Bo Mortensen, Mark J Daly, Stephen V Faraone, Anders Børglum, Benjamin Neale
Attention-Deficit/Hyperactivity Disorder (ADHD) is a highly heritable childhood behavioral disorder affecting 5% of school-age children and 2.5% of adults. Common genetic variants contribute substantially to ADHD susceptibility, but no individual variants have been robustly associated with ADHD. We report a genome-wide association meta-analysis of 20,183 ADHD cases and 35,191 controls that identifies variants surpassing genome-wide significance in 12 independent loci, revealing new and important information on the underlying biology of ADHD. Associations are enriched in evolutionarily constrained genomic regions and loss-of-function intolerant genes, as well as around brain-expressed regulatory marks. These findings, based on clinical interviews and/or medical records, are supported by additional analyses of a self-reported ADHD sample and a study of quantitative measures of ADHD symptoms in the population. Meta-analyzing these data with our primary scan yielded a total of 16 genome-wide significant loci. The results support the hypothesis that clinical diagnosis of ADHD is an extreme expression of one or more continuous heritable traits.
11,762 downloads bioRxiv genetics
Jin Wei, Mia Madel Alfajaro, Ruth E Hanna, Peter C DeWeirdt, Madison S. Strine, William J. Lu-Culligan, Shang-Min Zhang, Vincent R. Graziano, Cameron O. Schmitz, Jennifer S. Chen, Madeleine C. Mankowski, Renata B. Filler, Victor Gasque, Fernando de Miguel, Huacui Chen, Kasopefoluwa Oguntuyo, Laura Abriola, Yulia V Surovtseva, Robert C. Orchard, Benhur Allison Lee, Brett Lindenbach, Katerina Politi, David van Dijk, Matthew D Simon, Qin Yan, John G. Doench, Craig B Wilen
Identification of host genes essential for SARS-CoV-2 infection may reveal novel therapeutic targets and inform our understanding of COVID-19 pathogenesis. Here we performed a genome-wide CRISPR screen with SARS-CoV-2 and identified known SARS-CoV-2 host factors including the receptor ACE2 and protease Cathepsin L. We additionally discovered novel pro-viral genes and pathways including the SWI/SNF chromatin remodeling complex and key components of the TGF-β signaling pathway. Small molecule inhibitors of these pathways prevented SARS-CoV-2-induced cell death. We also revealed that the alarmin HMGB1 is critical for SARS-CoV-2 replication. In contrast, loss of the histone H3.3 chaperone complex sensitized cells to virus-induced death. Together this study reveals potential therapeutic targets for SARS-CoV-2 and highlights host genes that may regulate COVID-19 pathogenesis.
### Competing Interest Statement
Yale University (CBW) has a patent pending related to this work entitled: 'Compounds and Compositions for Treating, Ameliorating, and/or Preventing SARS-CoV-2 Infection and/or Complications Thereof.' Yale University has committed to rapidly executable non-exclusive royalty-free licenses to intellectual property rights for the purpose of making and distributing products to prevent, diagnose and treat COVID-19 infection during the pandemic and for a short period thereafter. JGD consults for Foghorn Therapeutics, Maze Therapeutics, Merck, Agios, and Pfizer; JGD consults for and has equity in Tango Therapeutics.
11,551 downloads bioRxiv genetics
Mike A Nalls, Cornelis Blauwendraat, Costanza L. Vallerga, Karl Heilbron, Sara Bandres Ciga, Diana Chang, Manuela Tan, Demis A Kia, Alastair J. Noyce, Angli Xue, Jose Bras, Emily Young, Rainer von Coelln, Javier Simón-Sánchez, Claudia Schulte, Manu Sharma, Lynne Krohn, Lasse Pihlstrom, Ari Siitonen, Hirotaka Iwaki, Hampton Leonard, Faraz Faghri, J Raphel Gibbs, Dena G Hernandez, Sonja W. Scholz, Juan A. Botia, Maria Martinez, Jean-Christophe Corvol, Suzanne Lesage, Joseph Jankovic, Lisa M. Shulman, The 23andMe Research Team, System Genomics of Parkinson’s Disease (SGPD) Consortium, Margaret Sutherland, Pentti Tienari, Kari Majamaa, Mathias Toft, Ole Andreassen, Tushar Bangale, Alexis Brice, Jian Yang, Ziv Gan-Or, Thomas Gasser, Peter Heutink, Joshua M. Shulman, Nicolas Wood, David A. Hinds, John A. Hardy, Huw Morris, Jacob Gratten, Peter M Visscher, Robert R Graham, Andrew B. Singleton, for the International Parkinson’s Disease Genomics Consortium
We performed the largest genome-wide association study of Parkinson's disease (PD) to date, involving the analysis of 7.8M SNPs in 37.7K cases, 18.6K UK Biobank proxy-cases, and 1.4M controls. We identified 90 independent genome-wide significant signals across 78 loci, including 38 independent risk signals in 37 novel loci. These variants explained 26-36% of the heritable risk of PD. Tests of causality within a Mendelian randomization framework identified putatively causal genes for 70 risk signals. Tissue expression enrichment analysis suggested that signatures of PD loci were heavily brain-enriched, consistent with specific neuronal cell types being implicated from single cell expression data. We found significant genetic correlations with brain volumes, smoking status, and educational attainment. In sum, these data provide the most comprehensive understanding of the genetic architecture of PD to date by revealing many additional PD risk loci, providing a biological context for these risk factors, and demonstrating that a considerable genetic component of this disease remains unidentified.