Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 62,709 bioRxiv papers from 278,266 authors.
Most downloaded bioRxiv papers, all time
in category evolutionary biology
4,155 results found.
2,524 downloads evolutionary biology
With the increasing use of massively parallel sequencing approaches in evolutionary biology, the need for fast and accurate methods suitable to investigate genetic structure and evolutionary history is more important than ever. We propose new distance measures for estimating genetic distances between individuals when allelic variation, gene dosage and recombination could compromise standard approaches. We present four distance measures based on single nucleotide polymorphisms (SNP) and evaluate them against previously published measures using coalescent-based simulations. Simulations were used to test (i) whether the measures give unbiased and accurate distance estimates, (ii) whether they can accurately identify the genomic mixture of hybrid individuals and (iii) whether they give precise (low variance) estimates. The effect of rate variation among genes and recombination was also investigated. The results showed that the SNP-based GENPOFAD distance we propose appears to work well in the widest range of circumstances. It was the most accurate and precise method for estimating genetic distances and is also relatively good at estimating the genomic mixture of hybrid individuals. Our simulations provide benchmarks to compare the performance of different methods that estimate genetic distances between organisms.
2,517 downloads evolutionary biology
Powerful approaches to inferring recent or current population structure based on nearest neighbour haplotype coancestry have so far been inaccessible to users without high quality genome-wide haplotype data. With a boom in non-model organism genomics, there is a pressing need to bring these methods to communities without access to such data. Here we present RADpainter, a new program designed to infer the coancestry matrix from restriction-site-associated DNA sequencing (RADseq) data. We combine this program together with a previously published MCMC clustering algorithm into fineRADstructure - a complete, easy to use, and fast population inference package for RADseq data (https://github.com/millanek/fineRADstructure). Finally, with two example datasets, we illustrate its use, benefits, and robustness to missing RAD alleles in double digest RAD sequencing.
2,506 downloads evolutionary biology
The vast majority of human mutations have minor allele frequencies (MAF) under 1%, with the plurality observed only once (i.e., “singletons”). While Mendelian diseases are predominantly caused by rare alleles, their cumulative contribution to complex phenotypes remains largely unknown. We develop and rigorously validate an approach to jointly estimate the contribution of all alleles, including singletons, to phenotypic variation. We apply our approach to transcriptional regulation, an intermediate between genetic variation and complex disease. Using whole genome DNA and lymphoblastoid cell line RNA sequencing data from 360 European individuals, we conservatively estimate that singletons contribute ~25% of cis-heritability across genes (dwarfing the contributions of other frequencies). Strikingly, the majority (~76%) of singleton heritability derives from ultra-rare variants absent from thousands of additional samples. We develop a novel inference procedure to demonstrate that our results are consistent with rampant purifying selection shaping the regulatory architecture of most human genes.
2,470 downloads evolutionary biology
Learning how complex traits like eyes originate is fundamental for understanding evolution. Here, we first sketch historical perspectives on trait origins and argue that new technologies offer key new insights. Next, we articulate four open questions about trait origins. To address them, we define a research program to break complex traits into components and study the individual evolutionary histories of those parts. By doing so, we can learn when the parts came together and perhaps understand why they stayed together. We apply the approach to five structural innovations critical for complex eyes, reviewing the history of the parts of each of those innovations. Photoreceptors evolved within animals by bricolage, recombining genes that originated far earlier. Multiple genes used in eyes today had ancestral roles in stress responses. We hypothesize that photo-stress could have increased the chance those genes were expressed together in places on animals where light was abundant.
2,464 downloads evolutionary biology
Psilocybin is a psychoactive compound with clinical applications produced by dozens of mushroom species. There has been a longstanding interest in psilocybin research with regard to treatment for addiction, depression, and end-of-life suffering. However, until recently very little was known about psilocybin biosynthesis and its ecological role. Here we confirm and refine recent findings about the genes underpinning psilocybin biosynthesis, discover that there is more than one psilocybin biosynthesis cluster in mushrooms, and we provide the first data directly addressing psilocybin's ecological role. By analysing independent genome assemblies for the hallucinogenic mushrooms Psilocybe cyanescens and Pluteus salicinus we recapture the recently discovered psilocybin biosynthesis cluster and show that a transcription factor previously implicated in its regulation is actually not part of the cluster. Further, we show that the mushroom Inocybe corydalina produces psilocybin but does not contain the established biosynthetic cluster, and we present an alternative cluster. Finally, a meta-transcriptome analysis of wild-collected mushrooms provides evidence for intra-mushroom insect gene expression of flies whose larvae grow inside Psilocybe cyanescens. These larvae were successfully reared into adults. Our results show that psilocybin does not confer complete protection against insect mycophagy, and the hypothesis that it is produced as an adaptive defense compound may need to be reconsidered.
2,453 downloads evolutionary biology
Tzachi Hagai, Xi Chen, Ricardo J Miragaia, Tomás Gomes, Raghd Rostom, Natalia Kunowska, Valentina Proserpio, Giacomo Donati, Lara Bossini-Castillo, Guy Naamati, Guy Emerton, Gosia Trynka, Ivanela Kondova, Mike Denis, Sarah A Teichmann
As the first line of defence against pathogens, cells mount an innate immune response, which is highly variable from cell to cell. The response must be potent yet carefully controlled to avoid self-damage. How these constraints have shaped the evolution of innate immunity remains poorly understood. Here, we characterise this programme's transcriptional divergence between species and expression variability across cells. Using bulk and single-cell transcriptomics in primate and rodent fibroblasts challenged with an immune stimulus, we reveal a striking architecture of the innate immune response. Transcriptionally diverging genes, including cytokines and chemokines, vary across cells and have distinct promoter structures. Conversely, genes involved in response regulation, such as transcription factors and kinases, are conserved between species and display low cell-to-cell variability. We suggest that this unique expression pattern, observed across species and conditions, has evolved as a mechanism for fine-tuned regulation, to achieve an effective but balanced response.
2,438 downloads evolutionary biology
A number of open questions in human evolutionary genetics would become tractable if we were able to directly measure evolutionary fitness. As a step towards this goal, we developed a method to examine whether individual genetic variants, or sets of genetic variants, currently influence viability. The approach consists in testing whether the frequency of an allele varies across ages, accounting for variation in ancestry. We applied it to the Genetic Epidemiology Research on Aging (GERA) cohort and to the parents of participants in the UK Biobank. Across the genome, we find only a few common variants with large effects on age-specific mortality: tagging the APOE ϵ4 allele and near CHRNA3. These results suggest that when large, even late onset effects are kept at low frequency by purifying selection. Testing viability effects of sets of genetic variants that jointly influence one of 42 traits, we detect a number of strong signals. In participants of the UK Biobank study of British ancestry, we find that variants that delay puberty timing are enriched in longer-lived parents (P~6×10⁻⁶ for fathers and P~2×10⁻³ for mothers), consistent with epidemiological studies. Similarly, in mothers, variants associated with later age at first birth are associated with a longer lifespan (P~1×10⁻³). Signals are also observed for variants influencing cholesterol levels, risk of coronary artery disease, body mass index, as well as risk of asthma. These signals exhibit consistent effects in the GERA cohort and among participants of the UK Biobank of non-British ancestry. Moreover, we see marked differences between males and females, most notably at the CHRNA3 locus, and variants associated with risk of coronary artery disease and cholesterol levels. Beyond our findings, the analysis serves as a proof of principle for how upcoming biomedical datasets can be used to learn about selection effects in contemporary humans.
2,433 downloads evolutionary biology
A powerful way to detect selection in a population is by modeling local allele frequency changes in a particular region of the genome under scenarios of selection and neutrality, and finding which model is most compatible with the data. Chen et al. (2010) developed a composite likelihood method called XP-CLR that uses an outgroup population to detect departures from neutrality which could be compatible with hard or soft sweeps, at linked sites near a beneficial allele. However, this method is most sensitive to recent selection and may miss selective events that happened a long time ago. To overcome this, we developed an extension of XP-CLR that jointly models the behavior of a selected allele in a three-population tree. Our method - called 3P-CLR - outperforms XP-CLR when testing for selection that occurred before two populations split from each other, and can distinguish between those events and events that occurred specifically in each of the populations after the split. We applied our new test to population genomic data from the 1000 Genomes Project, to search for selective sweeps that occurred before the split of Yoruba and Eurasians, but after their split from Neanderthals, and that could have led to the spread of modern-human-specific phenotypes. We also searched for sweep events that occurred in East Asians, Europeans and the ancestors of both populations, after their split from Yoruba. In both cases, we are able to confirm a number of regions identified by previous methods, and find several new candidates for selection in recent and ancient times. For some of these, we also find suggestive functional mutations that may have driven the selective events.
2,414 downloads evolutionary biology
A central problem in evolutionary biology is to infer the full genealogical history of a set of DNA sequences. This history contains rich information about the forces that have influenced a sexually reproducing species. However, existing methods are limited: the most accurate is unable to cope with more than a few dozen samples. With modern genetic data sets rapidly approaching millions of genomes, there is an urgent need for efficient inference methods to exploit such rich resources. We introduce an algorithm to infer whole-genome history which has comparable accuracy to the state-of-the-art but can process around four orders of magnitude more sequences. Additionally, our method results in an "evolutionary encoding" of the original sequence data, enabling efficient access to genealogies and calculation of genetic statistics over the data. We apply this technique to human data from the 1000 Genomes Project, Simons Genome Diversity Project and UK Biobank, showing that the genealogies we estimate are both rich in biological signal and efficient to process.
2,412 downloads evolutionary biology
Lehti Saag, Liivi Varul, Christiana Lyn Scheib, Jesper Stenderup, Morten E. Allentoft, Lauri Saag, Luca Pagani, Maere Reidla, Kristiina Tambets, Ene Metspalu, Aivar Kriiska, Eske Willerslev, Toomas Kivisild, Mait Metspalu
Farming-based economies appear relatively late in Northeast Europe and the extent to which they involve genetic ancestry change is still poorly understood. Here we present the analyses of low coverage whole genome sequence data from five hunter-gatherers and five farmers of Estonia dated to 4,500 to 6,300 years before present. We find evidence of significant differences between the two groups in the composition of autosomal as well as mtDNA, X and Y chromosome ancestries. We find that Estonian hunter-gatherers of Comb Ceramic Culture are closest to Eastern hunter-gatherers. The Estonian first farmers of Corded Ware Culture show high similarity in their autosomes with Steppe Belt Late Neolithic/Bronze Age individuals, Caucasus hunter-gatherers and Iranian farmers while their X chromosomes are most closely related with the European Early Farmers of Anatolian descent. These findings suggest that the shift to intensive cultivation and animal husbandry in Estonia was triggered by the arrival of new people with predominantly Steppe ancestry, but whose ancestors had undergone sex-specific admixture with early farmers with Anatolian ancestry.
2,409 downloads evolutionary biology
Making meaningful inferences from phylogenetic comparative data requires a meaningful model of trait evolution. It is thus important to determine whether the model is appropriate for the data and the question being addressed. One way to assess this is to ask whether the model provides a good statistical explanation for the variation in the data. To date, researchers have focused primarily on the explanatory power of a model relative to alternative models. Methods have been developed to assess the adequacy, or absolute explanatory power, of phylogenetic trait models but these have been restricted to specific models or questions. Here we present a general statistical framework for assessing the adequacy of phylogenetic trait models. We use our approach to evaluate the statistical performance of commonly used trait models on 337 comparative datasets covering three key Angiosperm functional traits. In general, the models we tested often provided poor statistical explanations for the evolution of these traits. This was true for many different groups and at many different scales. Whether such statistical inadequacy will qualitatively alter inferences drawn from comparative datasets will depend on the context. Regardless, assessing model adequacy can provide interesting biological insights -- how and why a model fails to describe variation in a dataset gives us clues about what evolutionary processes may have driven trait evolution across time.
2,377 downloads evolutionary biology
Genome size evolution is a fundamental problem in molecular evolution. Statistical analysis of genome sizes brings new insight into the evolution of genome size. Although the variation of genome sizes is complicated, it is indicated that the genome size evolution can be explained more clearly at taxon level than at species level. I find that the genome size distribution for species in a taxon fits log-normal distribution. And I find a relationship between the phylogeny of life and the statistical features of genome size distributions among taxa. I observed different statistical features of genome size distributions between animal taxa and plant taxa. A log-normal stochastic process model is developed to simulate the genome size evolution. The simulation results on the log-normal distributions of genome sizes and their statistical features agree with the observations.
2,376 downloads evolutionary biology
Life inside ant colonies is orchestrated with a diverse set of pheromones, but it is not clear how ants perceive these social cues. It has been proposed that pheromone perception in ants evolved via expansions in the numbers of odorant receptors (ORs) and antennal lobe glomeruli. Here we generate the first mutant lines in ants by disrupting orco, a gene required for the function of all ORs. We find that orco mutants exhibit severe deficiencies in social behavior and fitness, suggesting that they are unable to perceive pheromones. Surprisingly, unlike in Drosophila melanogaster, orco mutant ants also lack most of the approximately 500 antennal lobe glomeruli found in wild-types. These results illustrate that ORs are essential for ant social organization, and raise the possibility that, similar to mammals, receptor function is required for the development and/or maintenance of the highly complex olfactory processing areas in the ant brain.
2,375 downloads evolutionary biology
The relatively narrow range of genetic polymorphism levels across species has been a major source of debate since the inception of molecular population genetics. Recently, Corbett-Detig et al. found evidence that linked selection strongly constrains levels of polymorphism in species with large census sizes. Here I reexamine this claim and find weak support for this conclusion. While linked selection is an important determinant of polymorphism levels along the genome in many species, we currently lack compelling evidence that it is a major determinant of polymorphism levels among obligately sexual species.
2,353 downloads evolutionary biology
The testis expresses the largest number of genes of any mammalian organ, a finding that has long puzzled molecular biologists. Analyzing our single-cell transcriptomic maps of human and mouse spermatogenesis, we provide evidence that this widespread transcription serves to maintain DNA sequence integrity in the male germline by correcting DNA damage through 'transcriptional scanning'. Supporting this model, we find that genes expressed during spermatogenesis display lower mutation rates on the transcribed strand and have low diversity in the population. Moreover, this effect is fine-tuned by the level of gene expression during spermatogenesis. The unexpressed genes, which in our model do not benefit from transcriptional scanning, diverge faster over evolutionary time-scales and are enriched for sensory and immune-defense functions. Collectively, we propose that transcriptional scanning modulates germline mutation rates in a gene-specific manner, maintaining DNA sequence integrity for the bulk of genes but allowing for fast evolution in a specific subset.
2,325 downloads evolutionary biology
Meiosis is a key event of sexual life cycles in eukaryotes. Its mechanistic details have been uncovered in several model organisms, and most of its essential features have received various and often contradictory evolutionary interpretations. In this perspective, we present an overview of these often "weird" features. We discuss the origin of meiosis (origin of ploidy reduction and recombination, two-step meiosis), its secondary modifications (in polyploids or asexuals, inverted meiosis), its importance in punctuating life cycles (meiotic arrests, epigenetic resetting, meiotic asymmetry, meiotic fairness) and features associated with recombination (disjunction constraints, heterochiasmy, crossover interference and hotspots). We present the various evolutionary scenarios and selective pressures that have been proposed to account for these features, and we highlight that their evolutionary significance often remains largely mysterious. Resolving these mysteries will likely provide decisive steps towards understanding why sex and recombination are found in the majority of eukaryotes.
2,262 downloads evolutionary biology
Bantu speech communities expanded over large parts of sub-Saharan Africa within the last 4000-5000 years, reaching different parts of southern Africa 1200-2000 years ago. The Bantu languages subdivide in several major branches, with languages belonging to the Eastern and Western Bantu branches spreading over large parts of Central, Eastern, and Southern Africa. There is still debate whether this linguistic divide is correlated with a genetic distinction between Eastern and Western Bantu speakers. During their expansion, Bantu speakers would have come into contact with diverse local populations, such as the Khoisan hunter-gatherers and pastoralists of southern Africa, with whom they may have intermarried. In this study, we analyze complete mtDNA genome sequences from over 900 Bantu-speaking individuals from Angola, Zambia, Namibia, and Botswana to investigate the demographic processes at play during the last stages of the Bantu expansion. Our results show that most of these Bantu-speaking populations are genetically very homogenous, with no genetic division between speakers of Eastern and Western Bantu languages. Most of the mtDNA diversity in our dataset is due to different degrees of admixture with autochthonous populations. Only the pastoralist Himba and Herero stand out due to high frequencies of particular L3f and L3d lineages; the latter are also found in the neighboring Damara, who speak a Khoisan language and were foragers and small-stock herders. In contrast, the close cultural and linguistic relatives of the Herero and Himba, the Kuvale, are genetically similar to other Bantu-speakers. Nevertheless, as demonstrated by resampling tests, the genetic divergence of Herero, Himba, and Kuvale is compatible with a common shared ancestry with high levels of drift and differential female admixture with local pre-Bantu populations.
2,246 downloads evolutionary biology
The eukaryotic cytoskeleton evolved from prokaryotic cytomotive filaments. Prokaryotic filament systems show bewildering structural and dynamic complexity, and in many aspects prefigure the self-organizing properties of the eukaryotic cytoskeleton. Here I compare the dynamic properties of the prokaryotic and eukaryotic cytoskeleton, and discuss how these relate to function and the evolution of organellar networks. The evolution of new aspects of filament dynamics in eukaryotes, including severing and branching, and the advent of molecular motors converted the eukaryotic cytoskeleton into a self-organizing active gel, the dynamics of which can only be described with computational models. Advances in modeling and comparative genomics hold promise of a better understanding of the evolution of the self-organizing cytoskeleton in early eukaryotes, and its role in the evolution of novel eukaryotic functions, such as amoeboid motility, mitosis, and ciliary swimming.
2,245 downloads evolutionary biology
The parthenogenetic all-female marbled crayfish is a novel research model and potent invader of freshwater ecosystems. It is a triploid descendant of the sexually reproducing slough crayfish, Procambarus fallax, but its taxonomic status has remained unsettled. By cross-breeding experiments and parentage analysis we show here that marbled crayfish and P. fallax are reproductively separated. Both crayfish copulate readily, suggesting that the reproductive barrier is set at the cytogenetic rather than the behavioural level. Analysis of complete mitochondrial genomes of marbled crayfish from laboratory lineages and wild populations demonstrates genetic identity and indicates a single origin. Flow cytometric comparison of DNA contents of haemocytes and analysis of nuclear microsatellite loci confirm triploidy and suggest autopolyploidization as its cause. Global DNA methylation is significantly reduced in marbled crayfish implying the involvement of molecular epigenetic mechanisms in its origination. Morphologically, both crayfish are very similar but growth and fecundity are considerably larger in marbled crayfish, making it a different animal with superior fitness. These data and the high probability of a divergent future evolution of the marbled crayfish and P. fallax clusters suggest that marbled crayfish should be considered as an independent asexual species. Our findings also establish the P. fallax-marbled crayfish pair as a novel paradigm for rare chromosomal speciation by autopolyploidy and parthenogenesis in animals and for saltational evolution in general.
2,241 downloads evolutionary biology
Although homologous recombination is accepted to be common in bacteria, so far it has been challenging to accurately quantify its impact on genome evolution within bacterial species. We here introduce methods that use the statistics of single-nucleotide polymorphism (SNP) splits in the core genome alignment of a set of strains to show that, for many bacterial species, recombination dominates genome evolution. Each genomic locus has been overwritten so many times by recombination that it is impossible to reconstruct the clonal phylogeny and, instead of a consensus phylogeny, the phylogeny typically changes many thousands of times along the core genome alignment. We also show how SNP splits can be used to quantify the relative rates with which different subsets of strains have recombined in the past. We find that virtually every strain has a unique pattern of recombination frequencies with other strains and that the relative rates with which different subsets of strains share SNPs follow long-tailed distributions. Our findings show that bacterial populations are neither clonal nor freely recombining, but structured such that recombination rates between different lineages vary along a continuum spanning several orders of magnitude, with a unique pattern of rates for each lineage. Thus, rather than reflecting clonal ancestry, whole genome phylogenies reflect these long-tailed distributions of recombination rates.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!