Rxivist uses download data on preprints from bioRxiv to help you find the papers being discussed in your field. Currently indexing 100,570 bioRxiv papers from 424,791 authors.
Most downloaded bioRxiv papers, all time
in category systems biology
2,528 results found.
2,738 downloads systems biology
Florian Meier, Andreas-David Brunner, Scarlet Koch, Heiner Koch, Markus Lubeck, Michael Krause, Niels Goedecke, Jens Decker, Thomas Kosinski, Melvin A Park, Nicolai Bache, Ole Hoerning, Jürgen Cox, Oliver Räther, Matthias Mann
In bottom-up proteomics, peptides are separated by liquid chromatography with elution peak widths in the range of seconds, while mass spectra are acquired in about 100 microseconds with time-of-flight (TOF) instruments. This allows adding ion mobility as a third dimension of separation. Among several formats, trapped ion mobility spectrometry (TIMS) is attractive due to its small size, low voltage requirements and high efficiency of ion utilization. We have recently demonstrated a scan mode termed parallel accumulation - serial fragmentation (PASEF), which multiplies the sequencing speed without any loss in sensitivity (Meier et al., PMID: 26538118). Here we introduce the timsTOF Pro instrument, which optimally implements online PASEF. It features an orthogonal ion path into the ion mobility device, limiting the amount of debris entering the instrument and making it very robust in daily operation. We investigate different precursor selection schemes for shotgun proteomics to optimally allocate in excess of 100 fragmentation events per second. More than 800,000 fragmentation spectra in standard 120 min LC runs are easily achievable, which can be used for near exhaustive precursor selection in complex mixtures or re-sequencing weak precursors. MaxQuant identified more than 6,400 proteins in single run HeLa analyses without matching to a library, and with high quantitative reproducibility (R > 0.97). Online PASEF achieves a remarkable sensitivity with more than 2,900 proteins identified in 30 min runs of only 10 ng HeLa digest. We also show that highly reproducible collisional cross sections can be acquired on a large scale (R > 0.99). PASEF on the timsTOF Pro is a valuable addition to the technological toolbox in proteomics, with a number of unique operating modes that are only beginning to be explored.
2,707 downloads systems biology
The mammalian liver is composed of repeating hexagonal units termed lobules. Spatially-resolved single-cell transcriptomics revealed that about half of hepatocyte genes are differentially expressed across the lobule. Technical limitations impede reconstructing similar global spatial maps of other hepatocyte features. Here, we used zonated surface markers to sort hepatocytes from defined lobule zones with high spatial resolution. We applied transcriptomics, microRNA array measurements and mass-spectrometry proteomics to reconstruct spatial atlases of multiple zonated hepatocyte features. We found that protein zonation largely overlapped mRNA zonation. We identified zonation of key microRNAs such as miR-122, and inverse zonation of microRNAs and their hepatocyte gene targets, implying potential regulation through zonated mRNA degradation. These targets included the pericentral Wnt receptors Fzd7 and Fzd8 and the periportal Wnt inhibitors Tcf7l1 and Ctnnbip1. Our approach facilitates reconstruction of spatial atlases of multiple cellular features in the liver and in other structured tissues.
2,676 downloads systems biology
Olivia Wilkins, Christoph Hafemeister, Anne Plessis, Meisha-Marika Holloway-Phillips, Gina M. Pham, Adrienne B Nicotra, Glenn B. Gregorio, S.V. Krishna Jagadish, Endang M. Septiningsih, Richard Bonneau, Michael Purugganan
Environmental Gene Regulatory Influence Networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental and developmental signals. EGRINs encompass many layers of regulation, which culminate in changes in the level of accumulated transcripts. Here we infer EGRINs for the response of five tropical Asian rice cultivars to high temperatures, water deficit, and agricultural field conditions, by systematically integrating time series transcriptome data (720 RNA-seq libraries), patterns of nucleosome-free chromatin (18 ATAC-seq libraries), and the occurrence of known cis-regulatory elements. First, we identify 5,447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes with known cis-regulatory motifs in nucleosome-free chromatin regions proximal to transcriptional start sites (TSS) of genes. We then use network component analysis to estimate the regulatory activity of these TFs (TF activity, TFA) from the expression of these putative target genes. Finally, we inferred an EGRIN using the estimated TFA as the regulator. The EGRIN included regulatory interactions between 4,052 target genes regulated by 113 TFs. We resolved distinct regulatory roles for members of a large TF family, including a putative regulatory connection between abiotic stress and the circadian clock, as well as specific regulatory functions for TFs in the drought response. We show that TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference, and that supplementing data from controlled experimental conditions with data from outdoor field conditions increases the resolution for EGRIN inference.
2,675 downloads systems biology
In recent years, cellular life science research has experienced a significant shift, moving away from conducting bulk cell interrogation towards single-cell analysis. It is only through single-cell analysis that a complete understanding of cellular heterogeneity, and the interplay between various cell types that are fundamental to specific biological phenotypes, can be achieved. Single-cell assays at the protein level have been predominantly limited to targeted, antibody-based methods. However, here we present an experimental and computational pipeline, which establishes a comprehensive single-cell mass spectrometry-based proteomics workflow. By exploiting a leukemia culture system, containing functionally-defined leukemic stem cells, progenitors and terminally differentiated blasts, we demonstrate that our workflow is able to explore the cellular heterogeneity within this aberrant developmental hierarchy. We show our approach is capable of quantifying hundreds of proteins across hundreds of single cells using limited instrument time. Furthermore, we developed a computational pipeline (SCeptre) that effectively clusters the data and permits the extraction of cell-specific proteins and functional pathways. This proof-of-concept work lays the foundation for future global single-cell proteomics studies.
2,669 downloads systems biology
Samuel W. Lukowski, Camden Y. Lo, Alexei Sharov, Quan H. Nguyen, Lyujie Fang, Sandy SC Hung, Ling Zhu, Ting Zhang, Tu Nguyen, Anne Senabouth, Jafar S. Jabbari, Emily Welby, Jane C. Sowden, Hayley S. Waugh, Adrienne Mackey, Graeme Pollock, Trevor D. Lamb, Peng-Yuan Wang, Alex W. Hewitt, Mark Gillies, Joseph E. Powell, Raymond C.B. Wong
The retina is a highly specialized neural tissue that senses light and initiates image processing. Although the functional organisation of specific cells within the retina has been well-studied, the molecular profile of many cell types remains unclear in humans. To comprehensively profile cell types in the human retina, we performed single cell RNA-sequencing on 20,009 cells obtained post-mortem from three donors and compiled a reference transcriptome atlas. Using unsupervised clustering analysis, we identified 18 transcriptionally distinct clusters representing all known retinal cells: rod photoreceptors, cone photoreceptors, Müller glia cells, bipolar cells, amacrine cells, retinal ganglion cells, horizontal cells, retinal astrocytes and microglia. Notably, our data captured molecular profiles for healthy and early degenerating rod photoreceptors, and revealed a novel role of MALAT1 in putative rod degeneration. We also demonstrated the use of this retina transcriptome atlas to benchmark pluripotent stem cell-derived cone photoreceptors and an adult Müller glia cell line. This work provides an important reference with unprecedented insights into the transcriptional landscape of human retinal cells, which is fundamental to our understanding of retinal biology and disease.
2,658 downloads systems biology
Precision medicine is an emerging paradigm that requires realistic, mechanistic models capturing the complexity of the human body. We present two comprehensive molecular to physiological-level, gender-specific whole-body metabolism (WBM) reconstructions, named Harvey, in recognition of William Harvey, and Harvetta. These validated, knowledge-based WBM reconstructions capture the metabolism of 20 organs, six sex organs, six blood cells, the gastrointestinal lumen, systemic blood circulation, and the blood-brain barrier. They represent 99% of the human body weight, when excluding the weight of the skeleton. Harvey and Harvetta can be parameterized based on physiological, dietary, and omics data. They correctly predict inter-organ metabolic cycles, basal metabolic rates, and energy use. We demonstrate the integration of microbiome data thereby allowing the assessment of individual-specific, organ-level modulation of host metabolism by the gut microbiota. The WBM reconstructions and the individual organ reconstructions are available under http://vmh.life. Harvey and Harvetta represent a pivotal step towards virtual physiological humans.
2,650 downloads systems biology
Nucleosomes cover most of the genome and are thought to be displaced by transcription factors (TFs) in regions that direct gene expression. However, the modes of interaction between TFs and nucleosomal DNA remain largely unknown. Here, we use nucleosome consecutive affinity-purification systematic evolution of ligands by exponential enrichment (NCAP-SELEX) to systematically explore interactions between the nucleosome and 220 TFs representing diverse structural families. Consistent with earlier observations, we find that the vast majority of TFs have less access to nucleosomal DNA than to free DNA. The motifs recovered from TFs bound to nucleosomal and free DNA are generally similar; however, steric hindrance and scaffolding by the nucleosome result in specific positioning and orientation of the motifs. Many TFs preferentially bind close to the end of nucleosomal DNA, or to periodic positions at its solvent-exposed side. TFs often also bind nucleosomal DNA in a particular orientation, because the nucleosome breaks the local rotational symmetry of DNA. Some TFs also specifically interact with DNA located at the dyad position where only one DNA gyre is wound, whereas other TFs prefer sites spanning two DNA gyres and bind specifically to each of them. Our work reveals striking differences in TF binding to free and nucleosomal DNA, and uncovers a rich interaction landscape between the TFs and the nucleosome.
2,630 downloads systems biology
Single-cell time-lapse studies have advanced the quantitative understanding of cell-to-cell variability. However, as the information content of individual experiments is limited, methods to integrate data collected under different conditions are required. Here we present a multi-experiment nonlinear mixed effect modeling approach for mechanistic pathway models, which allows the integration of multiple single-cell perturbation experiments. We apply this approach to the translation of green fluorescent protein after transfection using a massively parallel read-out of micropatterned single-cell arrays. We demonstrate that the integration of data from perturbation experiments allows the robust reconstruction of cell-to-cell variability, i.e., parameter densities, while each individual experiment provides insufficient information. Indeed, we show that the integration of the datasets on the population level also improves the estimates for individual cells by breaking symmetries, although each of them is only measured in one experiment. Moreover, we confirmed that the suggested approach is robust with respect to batch effects across experimental replicates and can provide mechanistic insights into the nature of batch effects. We anticipate that the proposed multi-experiment nonlinear mixed effect modeling approach will serve as a basis for the analysis of cellular heterogeneity in single-cell dynamics.
2,609 downloads systems biology
We hypothesized that transcription factors (TFs) recognize DNA shape without nucleotide sequence recognition. Motivating an independent role for shape, many TF binding sites lack a sequence-motif, DNA shape adds specificity to sequence-motifs, and different sequences can encode similar shapes. We therefore asked if binding sites of a TF are enriched for specific patterns of DNA shape-features, e.g., helical twist. We developed ShapeMF, which discovers these shape-motifs de novo without taking sequence information into account. We find that most TFs assayed in ENCODE have shape-motifs and bind regulatory regions recognizing shape-motifs in the absence of sequence-motifs. When shape- and sequence-recognition co-occur, the two types of motifs can be overlapping, flanking, or separated by consistent spacing. Shape-motifs are prevalent in regions co-bound by multiple TFs. Finally, TFs with identical sequence motifs have different shape-motifs, explaining their binding at distinct locations. These results establish shape-motifs as drivers of TF-DNA recognition complementary to sequence-motifs.
2,608 downloads systems biology
Bridging genotype to phenotype, the proteome has increasingly become of major importance to generate large, longitudinal sample series for data-driven biology and personalized medicine. Major improvements in laboratory automation, chromatography and software have increased the scale and precision of proteomics. Still missing, however, are mass spectrometric acquisition techniques that can deal with very fast chromatographic gradients. Here we present scanning SWATH, a data-independent acquisition (DIA) method, in which the DIA-typical stepwise windowed acquisition is replaced by a continuous movement of the precursor isolation window. Scanning SWATH accelerates the duty cycles to a few hundred milliseconds, and enables precursor mass assignment to the MS2 fragment traces for improving true positive precursor identification in fast proteome experiments. In combination with 800 µL/min high-flow chromatography, we report the quantification of 270 precursors per second, increasing the precursor identifications by 70% or more compared to previous methods. Scanning SWATH quantified 1,410 human protein groups in conjunction with chromatographic gradients as fast as 30 seconds, 2,250 with 60-second gradients, and 4,586 in conjunction with 5-minute gradients. At high quantitative precision, our method hence increases the proteomic throughput to hundreds of samples per day per mass spectrometer. Scanning SWATH hence enables a broad range of new proteomic applications that depend on large numbers of cheap yet quantitatively precise proteomes.
Competing Interest Statement: N.B., G.I., F.W. and S.T. are employees of SCIEX.
2,580 downloads systems biology
In the post-genomics era, exploration of phenotypic adaptation is limited by our ability to experimentally control selection conditions, including multi-variable and dynamic pressure regimes. While automated cell culture systems offer real-time monitoring and fine control over liquid cultures, they are difficult to scale to high-throughput, or require cumbersome redesign to meet diverse experimental requirements. Here we describe eVOLVER, a multipurpose, scalable DIY framework that can be easily configured to conduct a wide variety of growth fitness experiments at scale and cost. We demonstrate eVOLVER's versatility by configuring it for diverse growth and selection experiments that would be otherwise challenging for other systems. We conduct high-throughput evolution of yeast across different population density niches. We perform growth selection on a yeast knockout library under temporally varying temperature regimes. Finally, inspired by large-scale integration in electronics and microfluidics, we develop novel millifluidic multiplexing modules that enable complex fluidic routines including multiplexed media routing, cleaning, vial-to-vial transfers, and automated yeast mating. We propose eVOLVER to be a versatile design framework in which to study, characterize, and evolve biological systems.
2,519 downloads systems biology
A key goal of developmental biology is to understand how a single cell transforms into a full-grown organism consisting of many different cell types. Single-cell RNA-sequencing (scRNA-seq) has become a widely-used method due to its ability to identify all cell types in a tissue or organ in a systematic manner. However, a major challenge is to organize the resulting taxonomy of cell types into lineage trees revealing the developmental origin of cells. Here, we present a strategy for simultaneous lineage tracing and transcriptome profiling in thousands of single cells. By combining scRNA-seq with computational analysis of lineage barcodes generated by genome editing of transgenic reporter genes, we reconstruct developmental lineage trees in zebrafish larvae and adult fish. In future analyses, LINNAEUS (LINeage tracing by Nuclease-Activated Editing of Ubiquitous Sequences) can be used as a systematic approach for identifying the lineage origin of novel cell types, or of known cell types under different conditions.
2,491 downloads systems biology
During in vitro differentiation, pluripotent stem cells undergo extensive remodeling of their gene expression profile. While studied extensively at the transcriptome level, much less is known about protein dynamics. Here, we measured mRNA and protein levels of 7459 genes during differentiation of embryonic stem cells (ESCs). This comprehensive data set revealed pervasive discordance between mRNA and protein. The high temporal resolution of the data made it possible to determine protein turnover rates genome-wide by fitting a kinetic model. This model further enabled us to systematically identify dynamic post-transcriptional regulation. Moreover, we linked different modes of regulation to the function of specific gene sets. Finally, we showed that the kinetic model can be applied to single-cell transcriptomics data to predict protein levels in differentiated cell types. In conclusion, our comprehensive data set, easily accessible through a web application, is a valuable resource for the discovery of post-transcriptional regulation in ESC differentiation.
2,467 downloads systems biology
Gene expression heterogeneity in the pluripotent state of mouse embryonic stem cells (mESCs) has been increasingly well-characterized. In contrast, exit from pluripotency and lineage commitment have not been studied systematically at the single-cell level. Here we measured the gene expression dynamics of retinoic acid driven mESC differentiation using an unbiased single-cell transcriptomics approach. We found that the exit from pluripotency marks the start of a lineage bifurcation as well as a transient phase of susceptibility to lineage specifying signals. Our study revealed several transcriptional signatures of this phase, including a sharp increase of gene expression variability. Importantly, we observed a handover between two classes of transcription factors. The early-expressed class has potential roles in lineage biasing, the late-expressed class in lineage commitment. In summary, we provide a comprehensive analysis of lineage commitment at the single cell level, a potential stepping stone to improved lineage control through timing of differentiation cues.
2,454 downloads systems biology
Michael Getz, Yafei Wang, Gary An, Andrew Becker, Chase Cockrell, Nicholson Collier, Morgan Craig, Courtney L. Davis, James Faeder, Ashlee N. Ford Versypt, Juliano F. Gianlupi, James A. Glazier, Sara Hamis, Randy Heiland, Thomas Hillen, Dennis Hou, Mohammad Aminul Islam, Adrianne Jenner, Furkan Kurtoglu, Bing Liu, Fiona Macfarlane, Pablo Maygrundter, Penelope A Morel, Aarthi Narayanan, Jonathan Ozik, Elsje Pienaar, P. Rangamani, Jason Edward Shoemaker, Amber M. Smith, Paul Macklin
The 2019 novel coronavirus, SARS-CoV-2, is an emerging pathogen of critical significance to international public health. Knowledge of the interplay between molecular-scale virus-receptor interactions, single-cell viral replication, intracellular-scale viral transport, and emergent tissue-scale viral propagation is limited. Moreover, little is known about immune system-virus-tissue interactions and how these can result in low-level (asymptomatic) infections in some cases and acute respiratory distress syndrome (ARDS) in others, particularly with respect to presentation in different age groups or pre-existing inflammatory risk factors like diabetes. Given the nonlinear interactions within and among each of these processes, multiscale simulation models can shed light on the emergent dynamics that lead to divergent outcomes, identify actionable “choke points” for pharmacologic interventions, screen potential therapies, and identify potential biomarkers that differentiate patient outcomes. Given the complexity of the problem and the acute need for an actionable model to guide therapy discovery and optimization, we introduce and iteratively refine a prototype of a multiscale model of SARS-CoV-2 dynamics in lung tissue. The first prototype model was built and shared internationally as open source code and an online interactive model in under 12 hours, and community domain expertise is driving rapid refinements with a two-to-four week release cycle. In a sustained community effort, this consortium is integrating data and expertise across virology, immunology, mathematical biology, quantitative systems physiology, cloud and high performance computing, and other domains to accelerate our response to this critical threat to international health.
Competing Interest Statement: The authors have declared no competing interest.
2,395 downloads systems biology
Several methods were developed to mine gene-gene relationships from expression data. Examples include correlation and mutual information methods for co-expression analysis, clustering and undirected graphical models for functional assignments and directed graphical models for pathway reconstruction. Using a novel encoding for gene expression data, followed by deep neural networks analysis, we present a framework that can successfully address all these diverse tasks. We show that our method, CNNC, improves upon prior methods in tasks ranging from predicting transcription factor targets to identifying disease related genes to causality inference. CNNC's encoding provides insights about some of the decisions it makes and their biological basis. CNNC is flexible and can easily be extended to integrate additional types of genomics data leading to further improvements in its performance.
2,389 downloads systems biology
Stéphane Pesant, Fabrice Not, Marc Picheral, Stefanie Kandels-Lewis, Noan Le Bescot, Gabriel Gorsky, Daniele Iudicone, Eric Karsenti, Sabrina Speich, Romain Troublé, Céline Dimier, Sarah Searson, Tara Oceans Consortium Coordinators
The Tara Oceans expedition (2009-2013) sampled contrasting ecosystems of the world oceans, collecting environmental data and plankton, from viruses to metazoans, for later analysis using modern sequencing and state-of-the-art imaging technologies. It surveyed 210 ecosystems in 20 biogeographic provinces, collecting over 35,000 samples of seawater and plankton. The interpretation of such an extensive collection of samples in their ecological context requires means to explore, assess and access raw and validated data sets. To address this challenge, the Tara Oceans Consortium offers open science resources, including the use of open access archives for nucleotides (ENA) and for environmental, biogeochemical, taxonomic and morphological data (PANGAEA), and the development of online discovery tools and collaborative annotation tools for sequences and images. Here, we present an overview of Tara Oceans Data, and we provide detailed registries (data sets) of all campaigns (from port-to-port), stations and sampling events.
2,352 downloads systems biology
CRISPR-Cas9 gene editing strategies have revolutionized our ability to engineer the human genome for robust functional interrogation of complex biological processes. We have recently adapted this technology to primary human T cells to generate a high-throughput platform for analyzing the role of host factors in pathogen infection and lifecycle. Here, we describe applications of this system to investigate HIV pathogenesis in CD4+ T cells. Briefly, CRISPR-Cas9 ribonucleoproteins (crRNPs) are synthesized in vitro and delivered to activated primary human CD4+ T cells by nucleofection. These edited cells are then validated and expanded for use in downstream cellular, genetic, or protein-based assays. Our platform supports the arrayed generation of several gene manipulations in only a few hours' time and is widely adaptable across culture conditions, infection protocols, and downstream applications. We present detailed protocols for crRNP synthesis, primary T cell culture, 96-well nucleofection, molecular validation, and HIV infection with additional considerations for guide and screen design as well as crRNP multiplexing.
2,345 downloads systems biology
Helena García-Castro, Nathan J. Kenny, Patricia Álvarez-Campos, Vincent Mason, Anna Schönauer, Victoria A Sleight, Jakke Neiro, Aziz Aboobaker, Jon Permanyer, Marta Iglesias, Manuel Irimia, Arnau Sebé-Pedrós, Jordi Solana
Single-cell sequencing technologies are revolutionizing biology, but are limited by the need to dissociate fresh samples that can only be fixed at later stages. We present ACME (ACetic-MEthanol) dissociation, a cell dissociation approach that fixes cells as they are being dissociated. ACME-dissociated cells have high RNA integrity, can be cryopreserved multiple times, can be sorted by Fluorescence-Activated Cell Sorting (FACS) and are permeable, enabling combinatorial single-cell transcriptomic approaches. As a proof of principle, we have performed SPLiT-seq with ACME cells to obtain around 34K single-cell transcriptomes from two planarian species and identified all previously described cell types in similar proportions. ACME is based on affordable reagents, can be done in most laboratories and even in the field, and thus will accelerate our knowledge of cell types across the tree of life.
Competing Interest Statement: The authors have declared no competing interest.
2,326 downloads systems biology
Claire D. McWhite, Ophelia Papoulas, Kevin Drew, Rachael M. Cox, Viviana June, Oliver Xiaoou Dong, Taejoon Kwon, Cuihong Wan, Mari L. Salmi, Stanley J. Roux, Karen S Browning, Z. Jeffrey Chen, Pamela C. Ronald, Edward M. Marcotte
Plants are foundational to global ecological and economic systems, yet most plant proteins remain uncharacterized. Protein interaction networks often suggest protein functions and open new avenues to characterize genes and proteins. We therefore systematically determined protein complexes from 13 plant species of scientific and agricultural importance, greatly expanding the known repertoire of stable protein complexes in plants. Using co-fractionation mass spectrometry, we recovered known complexes, confirmed complexes predicted to occur in plants, and identified novel interactions conserved over 1.1 billion years of green plant evolution. Several novel complexes are involved in vernalization and pathogen defense, traits critical to agriculture. We also uncovered plant analogs of animal complexes with distinct molecular assemblies, including a megadalton-scale tRNA multi-synthetase complex. The resulting map offers the first cross-species view of conserved, stable protein assemblies shared across plant cells and provides a mechanistic, biochemical framework for interpreting plant genetics and mutant phenotypes.
- 20 Oct 2020: Support for sorting preprints using Twitter activity has been removed, at least temporarily, until a new source of social media activity data becomes available.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!