Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 71,082 bioRxiv papers from 310,116 authors.
Most downloaded bioRxiv papers, all time
in category systems biology
1,931 results found.
3,291 downloads systems biology
Non-genetic factors can cause individual cells to fluctuate substantially in gene expression levels over time. Yet it remains unclear whether these fluctuations can persist for much longer than the time of one cell division. Current methods for measuring gene expression in single cells mostly rely on single time point measurements, making the duration of gene expression fluctuations or cellular memory difficult to measure. Here, we report a method combining Luria and Delbrück’s fluctuation analysis with population-based RNA sequencing (MemorySeq) for identifying genes transcriptome-wide whose fluctuations persist for several cell divisions. MemorySeq revealed multiple gene modules that are expressed together in rare cells within otherwise homogeneous clonal populations. Further, we found that these rare cell subpopulations are associated with biologically distinct behaviors, such as the ability to proliferate in the face of anti-cancer therapeutics, in different cancer cell lines. The identification of non-genetic, multigenerational fluctuations has the potential to reveal new forms of biological memory at the level of single cells and suggests that non-genetic heritability of cellular state may be a quantitative property.
3,249 downloads systems biology
Determining the three dimensional structures of macromolecules is a major goal of biological research because of the close relationship between structure and function. Structure determination usually relies on physical techniques including x-ray crystallography, NMR spectroscopy and cryo-electron microscopy. Here we present a method that allows the high-resolution three-dimensional structure of a biological macromolecule to be determined only from measurements of the activity of mutant variants of the molecule. This genetic approach to structure determination relies on the quantification of genetic interactions (epistasis) between mutations and the discrimination of direct from indirect interactions. This provides a new experimental strategy for structure determination, with the potential to reveal functional and in vivo structural conformations at low cost and high throughput.
3,248 downloads systems biology
Nathanael G. Lintner, Kim F. McClure, Donna Petersen, Allyn T. Londregan, David W. Piotrowski, Liuqing Wei, Jun Xiao, Michael Bolt, Paula M. Loria, Bruce Maguire, Kieran F. Geoghegan, Austin Huang, Tim Rolph, Spiros Liras, Jennifer A. Doudna, Robert G. Dullea, Jamie H.D. Cate
Proprotein Convertase Subtilisin/Kexin Type 9 (PCSK9) plays a key role in regulating the levels of plasma low density lipoprotein cholesterol (LDL-C). Here we demonstrate that the compound PF-06446846 inhibits translation of PCSK9 by inducing the ribosome to stall around codon 34, mediated by the sequence of the nascent chain within the exit tunnel. We further show that PF-06446846 reduces plasma PCSK9 and total cholesterol levels in rats following oral dosing. Using ribosome profiling, we demonstrate that PF-06446846 is highly selective for the inhibition of PCSK9 translation. The mechanism of action employed by PF-06446846 reveals a previously unexpected tunability of the human ribosome, which allows small molecules to specifically block translation of individual transcripts.
3,243 downloads systems biology
Piper longum L. (P. longum, also called long pepper) is a common culinary herb and has been extensively used as an important constituent of various indigenous medicines, specifically in the traditional Indian medicinal system known as Ayurveda. Towards obtaining a global regulatory framework of P. longum's constituents, in this work we first reviewed the phytochemicals present in this herb and then studied their pharmacological and medicinal features using a network pharmacology approach. We developed high-confidence tripartite networks consisting of phytochemical-protein target-disease associations and explain the role of its phytochemicals in various chronic diseases. Seven drug-like phytochemicals in this herb were found to be potential regulators of five FDA-approved drug targets, and 28 novel drug targets were also reported. A total of 105 phytochemicals were linked with immunomodulatory potency by pathway-level mapping in the human metabolic network. A sub-network of human PPI regulated by its phytochemicals was derived, and various modules in this sub-network were successfully associated with specific diseases.
3,140 downloads systems biology
RNA profiling is an excellent phenotype of cellular responses and tissue states, but can be costly to generate at the massive scale required for studies of regulatory circuits, genetic states or perturbation screens. Here, we draw on a series of advances over the last decade in the field of mathematics to establish a rigorous link between biological structure, data compressibility, and efficient data acquisition. We propose that very few random composite measurements - in which gene abundances are combined in a random linear combination - are needed to approximate the high-dimensional similarity between any pair of gene abundance profiles. We then show how finding latent, sparse representations of gene expression data would enable us to 'decompress' a small number of random composite measurements and recover high-dimensional gene expression levels that were not measured (unobserved). We present a new algorithm for finding sparse, modular structure, which improves the ability to interpret samples in terms of small numbers of active modules, and show that the modular structure we find is sufficient to recover gene expression profiles from composite measurements (with ~100-fold fewer composite measurements than genes). Moreover, the knowledge that sparse, modular structures exist allows us to recover expression profiles from composite measurements, even without access to any training data. Finally, we present a proof-of-concept experiment for making composite measurements in the laboratory, involving the measurement of linear combinations of RNA abundances. Altogether, our results suggest new compressive modalities in experimental biology that can form a foundation for massive scaling in high-throughput measurements, while also offering new insights into the interpretation of high-dimensional data. A recorded seminar presentation of this work is available at: https://www.youtube.com/watch?v=2dBZEOXqKHs
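The claim that a few random composite measurements preserve the similarity between abundance profiles can be illustrated with a small sketch. This is not the authors' algorithm or data: the profiles are synthetic, the Gaussian weights and the sizes (2,000 genes, 200 measurements) are assumptions chosen for illustration, and the function names are hypothetical. It only shows that cosine similarity computed on random linear combinations stays close to the similarity computed on the full profiles.

```python
import math
import random


def composite_measurements(profile, weights):
    """Return one random linear combination of gene abundances per weight row."""
    return [sum(w * x for w, x in zip(row, profile)) for row in weights]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_genes, n_measurements = 2000, 200  # assumed sizes, 10-fold fewer measurements

# Synthetic profiles (assumption): b is a noisy copy of a, c is unrelated to a.
a = [rng.gauss(0, 1) for _ in range(n_genes)]
b = [x + 0.5 * rng.gauss(0, 1) for x in a]
c = [rng.gauss(0, 1) for _ in range(n_genes)]

# Each row of weights defines one composite measurement.
weights = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_measurements)]
ya, yb, yc = (composite_measurements(p, weights) for p in (a, b, c))

print("similar pair:   full", cosine(a, b), "compressed", cosine(ya, yb))
print("unrelated pair: full", cosine(a, c), "compressed", cosine(ya, yc))
```

Recovering the unmeasured expression levels themselves, as the abstract describes, additionally requires a sparse, modular representation; this sketch covers only the similarity-preservation step.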
3,112 downloads systems biology
Although mRNAs are key molecules for understanding life, there exists no method to determine the full-length sequence of endogenous mRNAs including their poly(A) tails. Moreover, although poly(A) tails can be modified in functionally important ways, there also exists no method to accurately sequence them. Here, we present FLAM-seq, a rapid and simple method for high-quality sequencing of entire mRNAs. We report a cDNA library preparation method coupled to single-molecule sequencing to perform FLAM-seq. Using human cell lines, brain organoids, and C. elegans we show that FLAM-seq delivers high-quality full-length mRNA sequences for thousands of different genes per sample. We find that (a) 3' UTR length is correlated with poly(A) tail length, (b) alternative polyadenylation sites and alternative promoters for the same gene are linked to different tail lengths, (c) tails contain a significant number of cytosines. Thus, we provide a widely useful method and fundamental insights into poly(A) tail regulation.
3,023 downloads systems biology
Benoit Lehallier, David Gate, Nicholas Schaum, Tibor Nanasi, Song Eun Lee, Hanadie Yousef, Patricia Moran Losada, Daniela Berdnik, Verena Keller, Joe Verghese, Sanish Sathyan, Claudio Franceschi, Sofiya Milman, Nir Barzilai, Tony Wyss-Coray
Aging is the predominant risk factor for numerous chronic diseases that limit healthspan. Mechanisms of aging are thus increasingly recognized as therapeutic targets. Blood from young mice reverses aspects of aging and disease across multiple tissues, pointing to the intriguing possibility that age-related molecular changes in blood can provide novel insight into disease biology. We measured 2,925 plasma proteins from 4,331 young adults to nonagenarians and developed a novel bioinformatics approach which uncovered profound non-linear alterations in the human plasma proteome with age. Waves of changes in the proteome in the fourth, seventh, and eighth decades of life reflected distinct biological pathways, and revealed differential associations with the genome and proteome of age-related diseases and phenotypic traits. This new approach to the study of aging led to the identification of unexpected signatures and pathways of aging and disease and offers potential pathways for aging interventions.
3,008 downloads systems biology
In morphological profiling, quantitative data are extracted from microscopy images of cells to identify biologically relevant similarities and differences among samples based on these profiles. This protocol describes the design and execution of experiments using Cell Painting, a morphological profiling assay multiplexing six fluorescent dyes imaged in five channels, to reveal eight broadly relevant cellular components or organelles. Cells are plated in multi-well plates, perturbed with the treatments to be tested, stained, fixed, and imaged on a high-throughput microscope. Then, automated image analysis software identifies individual cells and measures ~1,500 morphological features (various measures of size, shape, texture, intensity, etc.) to produce a rich profile suitable for detecting subtle phenotypes. Profiles of cell populations treated with different experimental perturbations can be compared to suit many goals, such as identifying the phenotypic impact of chemical or genetic perturbations, grouping compounds and/or genes into functional pathways, and identifying signatures of disease. Cell culture and image acquisition take two weeks; feature extraction and data analysis take an additional 1-2 weeks.
2,989 downloads systems biology
Single-cell, spatially resolved 'omics analysis of tissues is poised to transform biomedical research and clinical practice. We have developed a computational histology topography cytometry analysis toolbox (histoCAT) to enable the interactive, quantitative, and comprehensive exploration of phenotypes of individual cells, cell-to-cell interactions, microenvironments, and morphological structures within intact tissues. histoCAT will be useful in all areas of tissue-based research. We highlight the unique abilities of histoCAT by analysis of highly multiplexed mass cytometry images of human breast cancer tissues.
2,886 downloads systems biology
Omics data contain signal from the molecular, physical, and kinetic inter- and intra-cellular interactions that control biological systems. Matrix factorization techniques can reveal low-dimensional structure from high-dimensional data that reflect these interactions. These techniques can uncover new biological knowledge from diverse high-throughput omics data in topics ranging from pathway discovery to time course analysis. We review exemplary applications of matrix factorization for systems-level analyses. We discuss appropriate application of these methods, their limitations, and focus on analysis of results to facilitate optimal biological interpretation. The inference of biologically relevant features with matrix factorization enables discovery from high-throughput data beyond the limits of current biological knowledge, answering questions from high-dimensional data that we have not yet thought to ask.
2,831 downloads systems biology
Nucleosomes restrict DNA accessibility throughout eukaryotic genomes, with repercussions for replication, transcription, and other DNA-templated processes. How this globally restrictive organization emerged from a presumably more open ancestral state remains poorly understood. Here, to better understand the challenges associated with establishing globally restrictive chromatin, we express histones in a naïve bacterial system that has not evolved to deal with nucleosomal structures: Escherichia coli. We find that histone proteins from the archaeon Methanothermus fervidus assemble on the E. coli chromosome in vivo and protect DNA from micrococcal nuclease digestion, allowing us to map binding footprints genome-wide. We provide evidence that nucleosome occupancy along the E. coli genome tracks intrinsic sequence preferences but is disturbed by ongoing transcription and replication. Notably, we show that higher nucleosome occupancy at promoters and across gene bodies is associated with lower transcript levels, consistent with local repressive effects. Surprisingly, however, this sudden enforced chromatinization has only mild repercussions for growth, suggesting that histones can become established as ubiquitous chromatin proteins without interfering critically with key DNA-templated processes. Our results have implications for the evolvability of transcriptional ground states and highlight chromatinization by archaeal histones as a potential avenue for controlling genome accessibility in synthetic prokaryotic systems.
2,774 downloads systems biology
In embryology, image processing methods such as segmentation are applied to acquiring quantitative criteria from time-series three-dimensional microscopic images. When used to segment cells or intracellular organelles, several current deep learning techniques outperform traditional image processing algorithms. However, segmentation algorithms still have unsolved problems, especially in bioimage processing. The most critical issue is that the existing deep learning-based algorithms for bioimages can perform only semantic segmentation, which distinguishes whether a pixel is within an object (for example, nucleus) or not. In this study, we implemented a novel segmentation algorithm, based on deep learning, which segments each nucleus and adds different labels to the detected objects. This segmentation algorithm is called instance segmentation. Our instance segmentation algorithm, implemented as a neural network, which we named QCA Net, substantially outperformed 3D U-Net, which is the best semantic segmentation algorithm that uses deep learning. Using QCA Net, we quantified the nuclear number, volume, surface area, and center of gravity coordinates during the development of mouse embryos. In particular, QCA Net distinguished nuclei of embryonic cells from those of polar bodies formed in meiosis. We consider that QCA Net can greatly contribute to bioimage segmentation in embryology by generating quantitative criteria from segmented images.
2,757 downloads systems biology
Tapio Lönnberg, Valentine Svensson, Kylie R James, Daniel Fernandez-Ruiz, Ismail Sebina, Ruddy Montandon, Megan S F Soon, Lily G Fogg, Michael J. T. Stubbington, Frederik Otzen Bagger, Max Zwiessele, Neil Lawrence, Fernando Souza-Fonseca-Guimaraes, William R Heath, Oliver Billker, Oliver Stegle, Ashraful Haque, Sarah A Teichmann
Differentiation of naïve CD4+ T cells into functionally distinct T helper subsets is crucial for the orchestration of immune responses. Due to multiple levels of heterogeneity and multiple overlapping transcriptional programs in differentiating T cell populations, this process has remained a challenge for systematic dissection in vivo. By using single-cell RNA transcriptomics and computational modelling of temporal mixtures, we reconstructed the developmental trajectories of Th1 and Tfh cell populations during Plasmodium infection in mice at single-cell resolution. These cell fates emerged from a common, highly proliferative and metabolically active precursor. Moreover, by tracking clonality from T cell receptor sequences, we infer that ancestors derived from the same naïve CD4+ T cell can concurrently populate both Th1 and Tfh subsets. We further found that precursor T cells were coached towards a Th1 but not a Tfh fate by monocytes/macrophages. The integrated genomic and computational approach we describe is applicable for analysis of any cellular system characterized by differentiation towards multiple fates.
2,720 downloads systems biology
Recent studies using single cell RNA-seq (scRNA-seq) data derived from differentiating systems have raised fundamental questions regarding the discrete vs continuous nature of both differentiation and cell fate. Here we present Palantir, an algorithm that models trajectories of differentiating cells, which treats cell-fate as a probabilistic process, and leverages entropy to measure the changing nature of cell plasticity along the differentiation trajectory. Palantir generates a high resolution pseudotime ordering of cells, and assigns each cell state with its probability to differentiate into each terminal state. We apply Palantir to human bone marrow scRNA-seq data and detect key landmarks of hematopoietic differentiation. Palantir's resolution enables identification of key transcription factors driving lineage fate choices, as these TFs closely track when cells lose plasticity. We demonstrate that Palantir is generalizable to diverse tissue types and well-suited to resolve less studied differentiating systems.
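The idea of using entropy over fate probabilities as a measure of plasticity can be sketched in a few lines. This is not Palantir's API or its trajectory model: the function name and the example probabilities are hypothetical, and the sketch covers only the final step of turning a cell's terminal-state probabilities into a Shannon entropy.

```python
import math


def differentiation_entropy(fate_probs):
    """Shannon entropy (in bits) of a cell's terminal-state probabilities.

    High entropy means the cell could still reach several fates;
    zero entropy means it is committed to exactly one.
    """
    total = sum(fate_probs)
    if total <= 0:
        raise ValueError("fate probabilities must sum to a positive value")
    normalized = [p / total for p in fate_probs]
    # Terms with p == 0 contribute nothing and are skipped to avoid log2(0).
    return -sum(p * math.log2(p) for p in normalized if p > 0)


# Hypothetical cells with probabilities over four terminal states.
multipotent = [0.25, 0.25, 0.25, 0.25]
biased = [0.70, 0.20, 0.05, 0.05]
committed = [1.0, 0.0, 0.0, 0.0]

for name, probs in [("multipotent", multipotent), ("biased", biased), ("committed", committed)]:
    print(name, differentiation_entropy(probs))
```

Tracking this quantity along a pseudotime ordering would show where entropy drops, which is how the abstract describes locating the points at which cells lose plasticity.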
2,663 downloads systems biology
The observations of phenotypic plasticity have stimulated the revival of 'epigenetics'. Over the past 70 years the term has come in many colors and flavors, depending on the biological discipline and time period. The meanings span from Waddington's "epigenotype" and "epigenetic landscape" to the molecular biologists' "epigenetic marks" embodied by DNA methylation and histone modifications. Here we seek to quell the ambiguity of the name. First we place "epigenetic" in the various historical contexts. Then, by presenting the formal concepts of dynamical systems theory we show that the "epigenetic landscape" is more than a metaphor: it has specific mathematical foundations. The latter explains how gene regulatory networks produce multiple attractor states, the self-stabilizing patterns of gene activation across the genome that account for "epigenetic memory". This network dynamics approach replaces the reductionist correspondence of molecular epigenetic modifications with the concept of the epigenetic landscape, by providing a concrete and crisp correspondence.
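How a gene regulatory network can produce more than one attractor state can be illustrated with a two-gene mutual-repression circuit (a toggle switch). This model is a standard textbook example and an assumption here, not something taken from the abstract: the equations, the parameter values (alpha = 4, Hill coefficient n = 2), and the function name are all illustrative. Two initial conditions that differ only in which gene starts higher settle into two different stable expression patterns.

```python
def simulate_toggle(x, y, alpha=4.0, n=2, dt=0.01, steps=5000):
    """Euler-integrate two mutually repressing genes and return the final state.

    Assumed model: dx/dt = alpha / (1 + y**n) - x
                   dy/dt = alpha / (1 + x**n) - y
    """
    for _ in range(steps):
        dx = alpha / (1.0 + y ** n) - x  # y represses x; x decays linearly
        dy = alpha / (1.0 + x ** n) - y  # x represses y; y decays linearly
        x, y = x + dt * dx, y + dt * dy
    return x, y


# Same network, same parameters, different starting points.
state_a = simulate_toggle(2.0, 1.0)  # gene x starts higher
state_b = simulate_toggle(1.0, 2.0)  # gene y starts higher

print("start (2, 1) settles at", state_a)
print("start (1, 2) settles at", state_b)
```

Each final state persists under small perturbations without any change to the network itself, which is the sense in which the abstract links attractor states to "epigenetic memory".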
2,630 downloads systems biology
Integrated -omics approaches are quickly spreading across microbiology research labs, making it possible to i) detect previously hidden features of microbial cells, such as multi-scale spatial organisation, and ii) trace molecular components across multiple cellular functional states. This promises to reduce the knowledge gap between genotype and phenotype and poses new challenges for computational microbiologists. We underline how the capability to unravel the complexity of microbial life will strongly depend on the integration of the huge and diverse amount of information that can be derived today from -omics experiments. In this work, we present opportunities and challenges of multi-omics data integration in current systems biology pipelines. We discuss which layers of biological information are important for biotechnological and clinical purposes, with a special focus on bacterial metabolism and modelling procedures. A general review of the most recent computational tools for performing large-scale dataset integration is also presented, together with a possible framework to guide the design of systems biology experiments by microbiologists.
2,617 downloads systems biology
The intestinal epithelium is a highly structured tissue composed of repeating crypt-villus units. Enterocytes, which constitute the most abundant cell type, perform the diverse tasks of absorbing a wide range of nutrients while protecting the body from the harsh bacterial-rich environment. It is unknown if these tasks are equally performed by all enterocytes or whether they are spatially zonated along the villus axis. Here, we performed whole-transcriptome measurements of laser-capture-microdissected villus segments to extract a large panel of landmark genes, expressed in a zonated manner. We used these genes to localize single sequenced enterocytes along the villus axis, thus reconstructing a global spatial expression map. We found that most enterocyte genes were zonated. Enterocytes at villi bottoms expressed an anti-bacterial Reg gene program in a microbiome-dependent manner, potentially reducing the crypt pathogen exposure. Translation, splicing and respiration genes steadily decreased in expression towards the villi tops, whereas distinct mid-top villus zones sub-specialized in the absorption of carbohydrates, peptides and fat. Enterocytes at the villi tips exhibited a unique gene-expression signature consisting of Klf4, Egfr, Neat1, Malat1, cell adhesion and purine metabolism genes. Our study exposes broad spatial heterogeneity of enterocytes, which could be important for achieving their diverse tasks.
2,530 downloads systems biology
Motivation: Parameter estimation methods for ordinary differential equation (ODE) models of biological processes can exploit gradients and Hessians of objective functions to achieve convergence and computational efficiency. However, the computational complexity of established methods to evaluate the Hessian scales linearly with the number of state variables and quadratically with the number of parameters. This limits their application to low-dimensional problems. Results: We introduce second order adjoint sensitivity analysis for the computation of Hessians and a hybrid optimization-integration based approach for profile likelihood computation. Second order adjoint sensitivity analysis scales linearly with the number of parameters and state variables. The Hessians are effectively exploited by the proposed profile likelihood computation approach. We evaluate our approaches on published biological models with real measurement data. Our study reveals an improved computational efficiency and robustness of optimization compared to established approaches, when using Hessians computed with adjoint sensitivity analysis. The hybrid computation method was more than two-fold faster than the best competitor. Thus, the proposed methods and implemented algorithms allow for the improvement of parameter estimation for medium and large scale ODE models. Availability: The algorithms for second order adjoint sensitivity analysis are implemented in the Advanced MATLAB Interface for CVODES and IDAS (AMICI, https://github.com/ICB-DCM/AMICI/). The algorithm for hybrid profile likelihood computation is implemented in the parameter estimation toolbox (PESTO, https://github.com/ICB-DCM/PESTO/). Both toolboxes are freely available under the BSD license.
2,523 downloads systems biology
We hypothesized that transcription factors (TFs) recognize DNA shape without nucleotide sequence recognition. Motivating an independent role for shape, many TF binding sites lack a sequence-motif, DNA shape adds specificity to sequence-motifs, and different sequences can encode similar shapes. We therefore asked if binding sites of a TF are enriched for specific patterns of DNA shape-features, e.g., helical twist. We developed ShapeMF, which discovers these shape-motifs de novo without taking sequence information into account. We find that most TFs assayed in ENCODE have shape-motifs and bind regulatory regions recognizing shape-motifs in the absence of sequence-motifs. When shape- and sequence-recognition co-occur, the two types of motifs can be overlapping, flanking, or separated by consistent spacing. Shape-motifs are prevalent in regions co-bound by multiple TFs. Finally, TFs with identical sequence motifs have different shape-motifs, explaining their binding at distinct locations. These results establish shape-motifs as drivers of TF-DNA recognition complementary to sequence-motifs.
2,521 downloads systems biology
Technologies that visualize multiple biomolecules at the nanometer scale in cells will enable deeper understanding of biological processes that proceed at the molecular scale. Current fluorescence-based methods for microscopy are constrained by a combination of spatial resolution limitations, limited parameters per experiment, and detector systems for the wide variety of biomolecules found in cells. We present here super-resolution ion beam imaging (srIBI), a secondary ion mass spectrometry approach capable of high-parameter imaging in 3D of targeted biological entities and exogenously added small molecules. Uniquely, the atomic constituents of the biomolecules themselves can often be used in our system as the "tag". We visualized the subcellular localization of the chemotherapy drug cisplatin simultaneously with localization of five other nuclear structures, with further carbon elemental mapping and secondary electron visualization, down to ~30 nm lateral resolution. Cisplatin was preferentially enriched in nuclear speckles and excluded from closed-chromatin regions, indicative of a role for cisplatin in active regions of chromatin. These data highlight how multiplexed super-resolution techniques, such as srIBI, will enable studies of biomolecule distributions in biologically relevant subcellular microenvironments.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!