Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 62,709 bioRxiv papers from 278,266 authors.

Most downloaded bioRxiv papers, all time

in category evolutionary biology

4,155 results found.

61: Flexible methods for estimating genetic distances from nucleotide data
Posted to bioRxiv 14 Apr 2014
2,524 downloads evolutionary biology

Simon Joly, David Bryant, Peter J Lockhart

With the increasing use of massively parallel sequencing approaches in evolutionary biology, the need for fast and accurate methods suitable to investigate genetic structure and evolutionary history is more important than ever. We propose new distance measures for estimating genetic distances between individuals when allelic variation, gene dosage and recombination could compromise standard approaches. We present four distance measures based on single nucleotide polymorphisms (SNP) and evaluate them against previously published measures using coalescent-based simulations. Simulations were used to test (i) whether the measures give unbiased and accurate distance estimates, (ii) whether they can accurately identify the genomic mixture of hybrid individuals and (iii) whether they give precise (low variance) estimates. The effect of rate variation among genes and recombination was also investigated. The results showed that the SNP-based GENPOFAD distance we propose appears to work well in the widest range of circumstances. It was the most accurate and precise method for estimating genetic distances and is also relatively good at estimating the genomic mixture of hybrid individuals. Our simulations provide benchmarks to compare the performance of different methods that estimate genetic distances between organisms.

62: RADpainter and fineRADstructure: population inference from RADseq data
Posted to bioRxiv 07 Jun 2016
2,517 downloads evolutionary biology

Milan Malinsky, Emiliano Trucchi, Daniel John Lawson, Daniel Falush

Powerful approaches to inferring recent or current population structure based on nearest neighbour haplotype coancestry have so far been inaccessible to users without high quality genome-wide haplotype data. With a boom in non-model organism genomics, there is a pressing need to bring these methods to communities without access to such data. Here we present RADpainter, a new program designed to infer the coancestry matrix from restriction-site-associated DNA sequencing (RADseq) data. We combine this program with a previously published MCMC clustering algorithm into fineRADstructure, a complete, easy-to-use, and fast population inference package for RADseq data (https://github.com/millanek/fineRADstructure). Finally, with two example datasets, we illustrate its use, benefits, and robustness to missing RAD alleles in double digest RAD sequencing.

63: Ultra-rare variants drive substantial cis-heritability of human gene expression
Posted to bioRxiv 14 Nov 2017
2,506 downloads evolutionary biology

Ryan D. Hernandez, Lawrence H. Uricchio, Kevin Hartman, Chun Ye, Andrew Dahl, Noah Zaitlen

The vast majority of human mutations have minor allele frequencies (MAF) under 1%, with the plurality observed only once (i.e., “singletons”). While Mendelian diseases are predominantly caused by rare alleles, their cumulative contribution to complex phenotypes remains largely unknown. We develop and rigorously validate an approach to jointly estimate the contribution of all alleles, including singletons, to phenotypic variation. We apply our approach to transcriptional regulation, an intermediate between genetic variation and complex disease. Using whole genome DNA and lymphoblastoid cell line RNA sequencing data from 360 European individuals, we conservatively estimate that singletons contribute ~25% of cis-heritability across genes (dwarfing the contributions of other frequencies). Strikingly, the majority (~76%) of singleton heritability derives from ultra-rare variants absent from thousands of additional samples. We develop a novel inference procedure to demonstrate that our results are consistent with rampant purifying selection shaping the regulatory architecture of most human genes.

64: How complexity originates: The evolution of animal eyes
Posted to bioRxiv 26 Mar 2015
2,470 downloads evolutionary biology

Todd H Oakley, Daniel I Speiser

Learning how complex traits like eyes originate is fundamental for understanding evolution. Here, we first sketch historical perspectives on trait origins and argue that new technologies offer key new insights. Next, we articulate four open questions about trait origins. To address them, we define a research program to break complex traits into components and study the individual evolutionary histories of those parts. By doing so, we can learn when the parts came together and perhaps understand why they stayed together. We apply the approach to five structural innovations critical for complex eyes, reviewing the history of the parts of each of those innovations. Photoreceptors evolved within animals by bricolage, recombining genes that originated far earlier. Multiple genes used in eyes today had ancestral roles in stress responses. We hypothesize that photo-stress could have increased the chance those genes were expressed together in places on animals where light was abundant.

65: Convergent evolution of psilocybin biosynthesis by psychedelic mushrooms
Posted to bioRxiv 25 Jul 2018
2,464 downloads evolutionary biology

Ali R Awan, Jaclyn M Winter, Daniel Turner, William M Shaw, Laura M Suz, Alexander J Bradshaw, Tom Ellis, Bryn T.M. Dentinger

Psilocybin, a psychoactive compound with clinical applications, is produced by dozens of mushroom species. There has been a longstanding interest in psilocybin research with regard to treatment for addiction, depression, and end-of-life suffering. However, until recently very little was known about psilocybin biosynthesis and its ecological role. Here we confirm and refine recent findings about the genes underpinning psilocybin biosynthesis, discover that there is more than one psilocybin biosynthesis cluster in mushrooms, and provide the first data directly addressing psilocybin's ecological role. By analysing independent genome assemblies for the hallucinogenic mushrooms Psilocybe cyanescens and Pluteus salicinus we recapture the recently discovered psilocybin biosynthesis cluster and show that a transcription factor previously implicated in its regulation is actually not part of the cluster. Further, we show that the mushroom Inocybe corydalina produces psilocybin but does not contain the established biosynthetic cluster, and we present an alternative cluster. Finally, a meta-transcriptome analysis of wild-collected mushrooms provides evidence for intra-mushroom insect gene expression of flies whose larvae grow inside Psilocybe cyanescens. These larvae were successfully reared into adults. Our results show that psilocybin does not confer complete protection against insect mycophagy, and the hypothesis that it is produced as an adaptive defense compound may need to be reconsidered.

66: Gene expression variability across cells and species shapes innate immunity
Posted to bioRxiv 15 May 2017
2,453 downloads evolutionary biology

Tzachi Hagai, Xi Chen, Ricardo J Miragaia, Tomás Gomes, Raghd Rostom, Natalia Kunowska, Valentina Proserpio, Giacomo Donati, Lara Bossini-Castillo, Guy Naamati, Guy Emerton, Gosia Trynka, Ivanela Kondova, Mike Denis, Sarah A Teichmann

As the first line of defence against pathogens, cells mount an innate immune response, which is highly variable from cell to cell. The response must be potent yet carefully controlled to avoid self-damage. How these constraints have shaped the evolution of innate immunity remains poorly understood. Here, we characterise this programme's transcriptional divergence between species and expression variability across cells. Using bulk and single-cell transcriptomics in primate and rodent fibroblasts challenged with an immune stimulus, we reveal a striking architecture of the innate immune response. Transcriptionally diverging genes, including cytokines and chemokines, vary across cells and have distinct promoter structures. Conversely, genes involved in response regulation, such as transcription factors and kinases, are conserved between species and display low cell-to-cell variability. We suggest that this unique expression pattern, observed across species and conditions, has evolved as a mechanism for fine-tuned regulation, to achieve an effective but balanced response.

67: Identifying genetic variants that affect viability in large cohorts
Posted to bioRxiv 07 Nov 2016
2,438 downloads evolutionary biology

Hakhamanesh Mostafavi, Tomaz Berisa, Felix R Day, John R B Perry, Molly Przeworski, Joseph K. Pickrell

A number of open questions in human evolutionary genetics would become tractable if we were able to directly measure evolutionary fitness. As a step towards this goal, we developed a method to examine whether individual genetic variants, or sets of genetic variants, currently influence viability. The approach consists of testing whether the frequency of an allele varies across ages, accounting for variation in ancestry. We applied it to the Genetic Epidemiology Research on Aging (GERA) cohort and to the parents of participants in the UK Biobank. Across the genome, we find only a few common variants with large effects on age-specific mortality: tagging the APOE ϵ4 allele and near CHRNA3. These results suggest that, when large, even late-onset effects are kept at low frequency by purifying selection. Testing viability effects of sets of genetic variants that jointly influence one of 42 traits, we detect a number of strong signals. In participants of the UK Biobank study of British ancestry, we find that variants that delay puberty timing are enriched in longer-lived parents (P~6×10⁻⁶ for fathers and P~2×10⁻³ for mothers), consistent with epidemiological studies. Similarly, in mothers, variants associated with later age at first birth are associated with a longer lifespan (P~1×10⁻³). Signals are also observed for variants influencing cholesterol levels, risk of coronary artery disease, body mass index, as well as risk of asthma. These signals exhibit consistent effects in the GERA cohort and among participants of the UK Biobank of non-British ancestry. Moreover, we see marked differences between males and females, most notably at the CHRNA3 locus, and variants associated with risk of coronary artery disease and cholesterol levels. Beyond our findings, the analysis serves as a proof of principle for how upcoming biomedical datasets can be used to learn about selection effects in contemporary humans.

68: Testing for ancient selection using cross-population allele frequency differentiation
Posted to bioRxiv 06 Apr 2015
2,433 downloads evolutionary biology

Fernando Racimo

A powerful way to detect selection in a population is by modeling local allele frequency changes in a particular region of the genome under scenarios of selection and neutrality, and finding which model is most compatible with the data. Chen et al. (2010) developed a composite likelihood method called XP-CLR that uses an outgroup population to detect departures from neutrality which could be compatible with hard or soft sweeps, at linked sites near a beneficial allele. However, this method is most sensitive to recent selection and may miss selective events that happened a long time ago. To overcome this, we developed an extension of XP-CLR that jointly models the behavior of a selected allele in a three-population tree. Our method - called 3P-CLR - outperforms XP-CLR when testing for selection that occurred before two populations split from each other, and can distinguish between those events and events that occurred specifically in each of the populations after the split. We applied our new test to population genomic data from the 1000 Genomes Project, to search for selective sweeps that occurred before the split of Yoruba and Eurasians, but after their split from Neanderthals, and that could have led to the spread of modern-human-specific phenotypes. We also searched for sweep events that occurred in East Asians, Europeans and the ancestors of both populations, after their split from Yoruba. In both cases, we are able to confirm a number of regions identified by previous methods, and find several new candidates for selection in recent and ancient times. For some of these, we also find suggestive functional mutations that may have driven the selective events.

69: Inferring the ancestry of everyone
Posted to bioRxiv 01 Nov 2018
2,414 downloads evolutionary biology

Jerome Kelleher, Yan Wong, Patrick K. Albers, Anthony Wilder Wohns, Gil McVean

A central problem in evolutionary biology is to infer the full genealogical history of a set of DNA sequences. This history contains rich information about the forces that have influenced a sexually reproducing species. However, existing methods are limited: the most accurate is unable to cope with more than a few dozen samples. With modern genetic data sets rapidly approaching millions of genomes, there is an urgent need for efficient inference methods to exploit such rich resources. We introduce an algorithm to infer whole-genome history which has comparable accuracy to the state-of-the-art but can process around four orders of magnitude more sequences. Additionally, our method results in an "evolutionary encoding" of the original sequence data, enabling efficient access to genealogies and calculation of genetic statistics over the data. We apply this technique to human data from the 1000 Genomes Project, Simons Genome Diversity Project and UK Biobank, showing that the genealogies we estimate are both rich in biological signal and efficient to process.

70: Extensive farming in Estonia started through a sex-biased migration from the Steppe
Posted to bioRxiv 02 Mar 2017
2,412 downloads evolutionary biology

Lehti Saag, Liivi Varul, Christiana Lyn Scheib, Jesper Stenderup, Morten E. Allentoft, Lauri Saag, Luca Pagani, Maere Reidla, Kristiina Tambets, Ene Metspalu, Aivar Kriiska, Eske Willerslev, Toomas Kivisild, Mait Metspalu

Farming-based economies appear relatively late in Northeast Europe and the extent to which they involve genetic ancestry change is still poorly understood. Here we present the analyses of low coverage whole genome sequence data from five hunter-gatherers and five farmers of Estonia dated to 4,500 to 6,300 years before present. We find evidence of significant differences between the two groups in the composition of autosomal as well as mtDNA, X and Y chromosome ancestries. We find that Estonian hunter-gatherers of Comb Ceramic Culture are closest to Eastern hunter-gatherers. The Estonian first farmers of Corded Ware Culture show high similarity in their autosomes with Steppe Belt Late Neolithic/Bronze Age individuals, Caucasus hunter-gatherers and Iranian farmers while their X chromosomes are most closely related to the European Early Farmers of Anatolian descent. These findings suggest that the shift to intensive cultivation and animal husbandry in Estonia was triggered by the arrival of new people with predominantly Steppe ancestry, but whose ancestors had undergone sex-specific admixture with early farmers with Anatolian ancestry.

71: Model adequacy and the macroevolution of angiosperm functional traits
Posted to bioRxiv 07 Apr 2014
2,409 downloads evolutionary biology

Matthew W Pennell, Richard G. FitzJohn, William K. Cornwell, Luke J. Harmon

Making meaningful inferences from phylogenetic comparative data requires a meaningful model of trait evolution. It is thus important to determine whether the model is appropriate for the data and the question being addressed. One way to assess this is to ask whether the model provides a good statistical explanation for the variation in the data. To date, researchers have focused primarily on the explanatory power of a model relative to alternative models. Methods have been developed to assess the adequacy, or absolute explanatory power, of phylogenetic trait models but these have been restricted to specific models or questions. Here we present a general statistical framework for assessing the adequacy of phylogenetic trait models. We use our approach to evaluate the statistical performance of commonly used trait models on 337 comparative datasets covering three key Angiosperm functional traits. In general, the models we tested often provided poor statistical explanations for the evolution of these traits. This was true for many different groups and at many different scales. Whether such statistical inadequacy will qualitatively alter inferences drawn from comparative datasets will depend on the context. Regardless, assessing model adequacy can provide interesting biological insights: how and why a model fails to describe variation in a dataset gives us clues about what evolutionary processes may have driven trait evolution across time.

72: A statistical approach to genome size evolution: Observations and explanations
Posted to bioRxiv 16 Nov 2015
2,377 downloads evolutionary biology

Dirson Jian Li

Genome size evolution is a fundamental problem in molecular evolution. Statistical analysis of genome sizes brings new insight into the evolution of genome size. Although the variation of genome sizes is complicated, the analysis indicates that genome size evolution can be explained more clearly at the taxon level than at the species level. I find that the genome size distribution for species in a taxon fits a log-normal distribution. I also find a relationship between the phylogeny of life and the statistical features of genome size distributions among taxa, and I observe different statistical features of genome size distributions between animal taxa and plant taxa. A log-normal stochastic process model is developed to simulate genome size evolution. The simulation results on the log-normal distributions of genome sizes and their statistical features agree with the observations.

73: orco mutagenesis causes loss of antennal lobe glomeruli and impaired social behavior in ants
Posted to bioRxiv 28 Feb 2017
2,376 downloads evolutionary biology

Waring Trible, Ni-Chen Chang, Benjamin J Matthews, Sean K McKenzie, Leonora Olivos-Cisneros, Peter R Oxley, Jonathan Saragosti, Daniel JC Kronauer

Life inside ant colonies is orchestrated with a diverse set of pheromones, but it is not clear how ants perceive these social cues. It has been proposed that pheromone perception in ants evolved via expansions in the numbers of odorant receptors (ORs) and antennal lobe glomeruli. Here we generate the first mutant lines in ants by disrupting orco, a gene required for the function of all ORs. We find that orco mutants exhibit severe deficiencies in social behavior and fitness, suggesting that they are unable to perceive pheromones. Surprisingly, unlike in Drosophila melanogaster, orco mutant ants also lack most of the approximately 500 antennal lobe glomeruli found in wild-types. These results illustrate that ORs are essential for ant social organization, and raise the possibility that, similar to mammals, receptor function is required for the development and/or maintenance of the highly complex olfactory processing areas in the ant brain.

74: Does linked selection explain the narrow range of genetic diversity across species?
Posted to bioRxiv 07 Mar 2016
2,375 downloads evolutionary biology

Graham Coop

The relatively narrow range of genetic polymorphism levels across species has been a major source of debate since the inception of molecular population genetics. Recently, Corbett-Detig et al. found evidence that linked selection strongly constrains levels of polymorphism in species with large census sizes. Here I reexamine this claim and find weak support for this conclusion. While linked selection is an important determinant of polymorphism levels along the genome in many species, we currently lack compelling evidence that it is a major determinant of polymorphism levels among obligately sexual species.

75: Widespread transcriptional scanning in the testis modulates gene evolution rates
Posted to bioRxiv 14 Mar 2018
2,353 downloads evolutionary biology

Bo Xia, Yun Yan, Maayan Baron, Florian Wagner, Dalia Barkley, Marta Chiodin, Sang Y. Kim, David L. Keefe, Joseph P. Alukal, Jef D. Boeke, Itai Yanai

The testis expresses the largest number of genes of any mammalian organ, a finding that has long puzzled molecular biologists. Analyzing our single-cell transcriptomic maps of human and mouse spermatogenesis, we provide evidence that this widespread transcription serves to maintain DNA sequence integrity in the male germline by correcting DNA damage through 'transcriptional scanning'. Supporting this model, we find that genes expressed during spermatogenesis display lower mutation rates on the transcribed strand and have low diversity in the population. Moreover, this effect is fine-tuned by the level of gene expression during spermatogenesis. The unexpressed genes, which in our model do not benefit from transcriptional scanning, diverge faster over evolutionary time-scales and are enriched for sensory and immune-defense functions. Collectively, we propose that transcriptional scanning modulates germline mutation rates in a gene-specific manner, maintaining DNA sequence integrity for the bulk of genes but allowing for fast evolution in a specific subset.

76: Evolutionary mysteries in meiosis.
Posted to bioRxiv 29 Apr 2016
2,325 downloads evolutionary biology

Thomas Lenormand, Jan Engelstaedter, Susan E Johnston, Erik Wijnker, Christoph R Haag

Meiosis is a key event of sexual life cycles in eukaryotes. Its mechanistic details have been uncovered in several model organisms, and most of its essential features have received various and often contradictory evolutionary interpretations. In this perspective, we present an overview of these often "weird" features. We discuss the origin of meiosis (origin of ploidy reduction and recombination, two-step meiosis), its secondary modifications (in polyploids or asexuals, inverted meiosis), its importance in punctuating life cycles (meiotic arrests, epigenetic resetting, meiotic asymmetry, meiotic fairness) and features associated with recombination (disjunction constraints, heterochiasmy, crossover interference and hotspots). We present the various evolutionary scenarios and selective pressures that have been proposed to account for these features, and we highlight that their evolutionary significance often remains largely mysterious. Resolving these mysteries will likely provide decisive steps towards understanding why sex and recombination are found in the majority of eukaryotes.

77: Migration and interaction in a contact zone: mtDNA variation among Bantu-speakers in southern Africa
Posted to bioRxiv 18 Feb 2014
2,262 downloads evolutionary biology

Chiara Barbieri, Mário Vicente, Sandra Oliveira, Koen Bostoen, Jorge Rocha, Mark Stoneking, Brigitte Pakendorf

Bantu speech communities expanded over large parts of sub-Saharan Africa within the last 4000-5000 years, reaching different parts of southern Africa 1200-2000 years ago. The Bantu languages subdivide into several major branches, with languages belonging to the Eastern and Western Bantu branches spreading over large parts of Central, Eastern, and Southern Africa. There is still debate whether this linguistic divide is correlated with a genetic distinction between Eastern and Western Bantu speakers. During their expansion, Bantu speakers would have come into contact with diverse local populations, such as the Khoisan hunter-gatherers and pastoralists of southern Africa, with whom they may have intermarried. In this study, we analyze complete mtDNA genome sequences from over 900 Bantu-speaking individuals from Angola, Zambia, Namibia, and Botswana to investigate the demographic processes at play during the last stages of the Bantu expansion. Our results show that most of these Bantu-speaking populations are genetically very homogeneous, with no genetic division between speakers of Eastern and Western Bantu languages. Most of the mtDNA diversity in our dataset is due to different degrees of admixture with autochthonous populations. Only the pastoralist Himba and Herero stand out due to high frequencies of particular L3f and L3d lineages; the latter are also found in the neighboring Damara, who speak a Khoisan language and were foragers and small-stock herders. In contrast, the close cultural and linguistic relatives of the Herero and Himba, the Kuvale, are genetically similar to other Bantu-speakers. Nevertheless, as demonstrated by resampling tests, the genetic divergence of Herero, Himba, and Kuvale is compatible with a common shared ancestry with high levels of drift and differential female admixture with local pre-Bantu populations.

78: Origin and evolution of the self-organizing cytoskeleton in the network of eukaryotic organelles
Posted to bioRxiv 04 Jun 2014
2,246 downloads evolutionary biology

Gáspár Jékely

The eukaryotic cytoskeleton evolved from prokaryotic cytomotive filaments. Prokaryotic filament systems show bewildering structural and dynamic complexity, and in many aspects prefigure the self-organizing properties of the eukaryotic cytoskeleton. Here I compare the dynamic properties of the prokaryotic and eukaryotic cytoskeleton, and discuss how these relate to function and the evolution of organellar networks. The evolution of new aspects of filament dynamics in eukaryotes, including severing and branching, and the advent of molecular motors converted the eukaryotic cytoskeleton into a self-organizing ‘active gel’, the dynamics of which can only be described with computational models. Advances in modeling and comparative genomics hold promise of a better understanding of the evolution of the self-organizing cytoskeleton in early eukaryotes, and its role in the evolution of novel eukaryotic functions, such as amoeboid motility, mitosis, and ciliary swimming.

79: The marbled crayfish as a paradigm for saltational speciation by autopolyploidy and parthenogenesis in animals
Posted to bioRxiv 21 Aug 2015
2,245 downloads evolutionary biology

Günter Vogt, Cassandra Falckenhayn, Anne Schrimpf, Katharina Schmid, Katharina Hanna, Jörn Panteleit, Mark Helm, Ralf Schulz, Frank Lyko

The parthenogenetic all-female marbled crayfish is a novel research model and potent invader of freshwater ecosystems. It is a triploid descendant of the sexually reproducing slough crayfish, Procambarus fallax, but its taxonomic status has remained unsettled. By cross-breeding experiments and parentage analysis we show here that marbled crayfish and P. fallax are reproductively separated. Both crayfish copulate readily, suggesting that the reproductive barrier is set at the cytogenetic rather than the behavioural level. Analysis of complete mitochondrial genomes of marbled crayfish from laboratory lineages and wild populations demonstrates genetic identity and indicates a single origin. Flow cytometric comparison of DNA contents of haemocytes and analysis of nuclear microsatellite loci confirm triploidy and suggest autopolyploidization as its cause. Global DNA methylation is significantly reduced in marbled crayfish implying the involvement of molecular epigenetic mechanisms in its origination. Morphologically, both crayfish are very similar but growth and fecundity are considerably larger in marbled crayfish, making it a different animal with superior fitness. These data and the high probability of a divergent future evolution of the marbled crayfish and P. fallax clusters suggest that marbled crayfish should be considered as an independent asexual species. Our findings also establish the P. fallax-marbled crayfish pair as a novel paradigm for rare chromosomal speciation by autopolyploidy and parthenogenesis in animals and for saltational evolution in general.

80: Whole genome phylogenies reflect long-tailed distributions of recombination rates in many bacterial species
Posted to bioRxiv 07 Apr 2019
2,241 downloads evolutionary biology

Thomas Sakoparnig, Chris Field, Erik van Nimwegen

Although homologous recombination is accepted to be common in bacteria, so far it has been challenging to accurately quantify its impact on genome evolution within bacterial species. We here introduce methods that use the statistics of single-nucleotide polymorphism (SNP) splits in the core genome alignment of a set of strains to show that, for many bacterial species, recombination dominates genome evolution. Each genomic locus has been overwritten so many times by recombination that it is impossible to reconstruct the clonal phylogeny and, instead of a consensus phylogeny, the phylogeny typically changes many thousands of times along the core genome alignment. We also show how SNP splits can be used to quantify the relative rates with which different subsets of strains have recombined in the past. We find that virtually every strain has a unique pattern of recombination frequencies with other strains and that the relative rates with which different subsets of strains share SNPs follow long-tailed distributions. Our findings show that bacterial populations are neither clonal nor freely recombining, but structured such that recombination rates between different lineages vary along a continuum spanning several orders of magnitude, with a unique pattern of rates for each lineage. Thus, rather than reflecting clonal ancestry, whole genome phylogenies reflect these long-tailed distributions of recombination rates.
