Rxivist.org combines preprints from bioRxiv.org with data from Twitter to help you find the papers being discussed in your field.
Currently indexing 84,482 bioRxiv papers from 363,659 authors.

Most downloaded bioRxiv papers, all time

Results 1 through 20 out of 772 in category synthetic biology


1: Daisyfield gene drive systems harness repeated genomic elements as a generational clock to limit spread

John Min, Charleston Noble et al.

15,555 downloads (posted 06 Feb 2017)

Methods of altering wild populations are most useful when inherently limited to local geographic areas. Here we describe a novel form of gene drive based on the introduction of multiple copies of an engineered 'daisy' sequence into repeated elements of the genome. Each introduced copy encodes guide RNAs that target one or more engineered loci carrying the CRISPR nuclease gene and the desired traits. When organisms encoding a drive system are released into the environment, each generation of mating with wild-type organisms will reduce the average number of the guide RNA elements per 'daisyfield' organism by half, serving as a generational clock. The loci encoding the nuclease and payload will exhibit drive only as long as a single copy remains, placing an inherent limit on the extent of spread.
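The halving described in this abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes an idealized, deterministic model in which the average number of guide RNA elements is exactly halved in each generation of mating with wild-type organisms, and the function names are hypothetical.

```python
def average_daisy_copies(initial_copies: float, generations: int) -> float:
    """Average guide RNA elements per organism after a number of generations.

    Assumption (illustrative only): each generation of mating with
    wild-type organisms exactly halves the average copy number.
    """
    return initial_copies / (2 ** generations)


def generations_until_below_one(initial_copies: float) -> int:
    """Count the halvings needed before the average drops below one copy."""
    generations = 0
    copies = float(initial_copies)
    while copies >= 1.0:
        copies /= 2.0
        generations += 1
    return generations


if __name__ == "__main__":
    # Hypothetical starting point of 64 copies per released organism.
    for g in range(8):
        print(g, average_daisy_copies(64, g))
    print("generations until the average falls below one copy:",
          generations_until_below_one(64))
```

Under these assumptions the number of generations grows only logarithmically with the starting copy number, which is one way to read the "generational clock" in the abstract. Real populations would show variation around this average.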


2: DNA Fountain enables a robust and efficient storage architecture

Yaniv Erlich, Dina Zielinski

13,406 downloads (posted 09 Sep 2016)

DNA is an attractive medium to store digital information. Here, we report a storage strategy, called DNA Fountain, that is highly robust and approaches the information capacity per nucleotide. Using our approach, we stored a full computer operating system, movie, and other files with a total of 2.14x10^6 bytes in DNA oligos and perfectly retrieved the information from a sequencing coverage equivalent of a single tile of Illumina sequencing. We also tested a process that can allow 2.18x10^15 retrievals using the original...


3: Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences

Alexander Rives, Siddharth Goyal et al.

12,664 downloads (posted 29 Apr 2019)

In the field of artificial intelligence, a combination of scale in data and model capacity enabled by unsupervised learning has led to major advances in representation learning and statistical generation. In biology, the anticipated growth of sequencing promises unprecedented data on natural sequence diversity. Learning the natural distribution of evolutionary protein sequence variation is a logical step toward predictive and generative modeling for biology. To this end we use unsupervised learning to train a deep conte...


4: Unified rational protein engineering with sequence-only deep representation learning

Ethan C Alley, Grigory Khimulya et al.

8,854 downloads (posted 26 Mar 2019)

Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabelled amino acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily, and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space. Our data-driven approach reaches near state-o...


5: Optimization of Golden Gate assembly through application of ligation sequence-dependent fidelity and bias profiling

Vladimir Potapov, Jennifer L. Ong et al.

7,813 downloads (posted 15 May 2018)

Modern synthetic biology depends on the manufacture of large DNA constructs from libraries of genes, regulatory elements or other genetic parts. Type IIS-restriction enzyme-dependent DNA assembly methods (e.g., Golden Gate) enable rapid one-pot, ordered, multi-fragment DNA assembly, facilitating the generation of high-complexity constructs. The order of assembly of genetic parts is determined by the ligation of flanking Watson-Crick base-paired overhangs. The ligation of mismatched overhangs leads to erroneous assembly,...


6: Programmable patterns in a DNA-based reaction-diffusion system

Sifang Chen, Georg Seelig

7,354 downloads (posted 21 Feb 2019)

Biology offers compelling proof that macroscopic "living materials" can emerge from reactions between diffusing biomolecules. Here, we show that molecular self-organization could be a similarly powerful approach for engineering functional synthetic materials. We introduce a programmable DNA-hydrogel that produces tunable patterns at the centimeter length scale. We generate these patterns by implementing chemical reaction networks through synthetic DNA complexes, embedding the complexes in hydrogel, and triggering with l...


7: RNA-guided gene drives can efficiently and reversibly bias inheritance in wild yeast

James J. DiCarlo, Kevin M. Esvelt

7,141 downloads (posted 16 Jan 2015)

Inheritance-biasing “gene drives” may be capable of spreading genomic alterations made in laboratory organisms through wild populations. We previously considered the potential for RNA-guided gene drives based on the versatile CRISPR/Cas9 genome editing system to serve as a general method of altering populations. Here we report molecularly contained gene drive constructs in the yeast Saccharomyces cerevisiae that are typically copied at rates above 99% when mated to wild yeast. We successfully targeted both non-essential...


8: Rapidly evolving homing CRISPR barcodes

Reza Kalhor, Prashant Mali et al.

5,979 downloads (posted 27 May 2016)

We present here an approach for engineering evolving DNA barcodes in living cells. The methodology entails using a homing guide RNA (hgRNA) scaffold that directs the Cas9-hgRNA complex to target the DNA locus of the hgRNA itself. We show that this homing CRISPR-Cas9 system acts as an expressed genetic barcode that diversifies its sequence and that the rate of diversification can be controlled in cultured cells. We further evaluate these barcodes in cultured cell populations and show that they can record lineage history ...


9: Toward machine-guided design of proteins

Surojit Biswas, Gleb Kuznetsov et al.

5,825 downloads (posted 02 Jun 2018)

Proteins, molecular machines that underpin all biological life, are of significant therapeutic and industrial value. Directed evolution is a high-throughput experimental approach for improving protein function, but has difficulty escaping local maxima in the fitness landscape. Here, we investigate how supervised learning in a closed loop with DNA synthesis and high-throughput screening can be used to improve protein design. Using the green fluorescent protein (GFP) as an illustrative example, we demonstrate the opport...


10: Enabling large-scale genome editing by reducing DNA nicking

Cory J. Smith, Oscar Castanon et al.

5,777 downloads (posted 15 Mar 2019)

To extend the frontier of genome editing and enable the radical redesign of mammalian genomes, we developed a set of dead-Cas9 base editor (dBE) variants that allow editing at tens of thousands of loci per cell by overcoming the cell death associated with DNA double-strand breaks (DSBs) and single-strand breaks (SSBs). We used a set of gRNAs targeting repetitive elements – ranging in target copy number from about 31 to 124,000 per cell. dBEs enabled survival after large-scale base editing, allowing targeted mutations at...


11: Highly-efficient Cas9-mediated transcriptional programming

Alejandro Chavez, Jonathan Scheiman et al.

5,714 downloads (posted 20 Dec 2014)

The RNA-guided bacterial nuclease Cas9 can be reengineered as a programmable transcription factor by a series of changes to the Cas9 protein in addition to the fusion of a transcriptional activation domain (AD). However, the modest levels of gene activation achieved by current Cas9 activators have limited their potential applications. Here we describe the development of an improved transcriptional regulator through the rational design of a tripartite activator, VP64-p65-Rta (VPR), fused to Cas9. We demonstrate its utili...


12: Deep learning enables therapeutic antibody optimization in mammalian cells by deciphering high-dimensional protein sequence space

Derek M Mason, Simon Friedensohn et al.

5,261 downloads (posted 24 Apr 2019)

Therapeutic antibody optimization is time and resource intensive, largely because it requires low-throughput screening (10^3 variants) of full-length IgG in mammalian cells, typically resulting in only a few optimized leads. Here, we use deep learning to interrogate and predict antigen-specificity from a massively diverse sequence space to identify globally optimized antibody variants. Using a mammalian display platform and the therapeutic antibody trastuzumab, rationally designed site-directed mutagenesis libraries are ...


13: Daisy-chain gene drives for the alteration of local populations

Charleston Noble, John Min et al.

5,186 downloads (posted 07 Jun 2016)

RNA-guided gene drive elements could address many ecological problems by altering the traits of wild organisms, but the likelihood of global spread tremendously complicates ethical development and use. Here we detail a localized form of CRISPR-based gene drive composed of genetic elements arranged in a daisy-chain such that each element drives the next. "Daisy drive" systems can duplicate any effect achievable using an equivalent global drive system, but their capacity to spread is limited by the successive loss of non-...


14: Enzymatic DNA synthesis for digital information storage

Henry H Lee, Reza Kalhor et al.

4,690 downloads (posted 16 Jun 2018)

DNA is an emerging storage medium for digital data but its adoption is hampered by limitations of phosphoramidite chemistry, which was developed for single-base accuracy required for biological functionality. Here, we establish a de novo enzymatic DNA synthesis strategy designed from the bottom-up for information storage. We harness a template-independent DNA polymerase for controlled synthesis of sequences with user-defined information content. We demonstrate retrieval of 144-bits, including addressing, from perfectly ...


15: Low-N protein engineering with data-efficient deep learning

Surojit Biswas, Grigory Khimulya et al.

4,469 downloads (posted 24 Jan 2020)

Protein engineering has enormous academic and industrial potential. However, it is limited by the lack of experimental assays that are consistent with the design goal and sufficiently high-throughput to find rare, enhanced variants. Here we introduce a machine learning-guided paradigm that can use as few as 24 functionally assayed mutant sequences to build an accurate virtual fitness landscape and screen ten million sequences via in silico directed evolution. As demonstrated in two highly dissimilar proteins, avGFP and ...


16: Human 5′ UTR design and variant effect prediction from a massively parallel translation assay

Paul J. Sample, Ban Wang et al.

4,034 downloads (posted 29 Apr 2018)

Predicting the impact of cis-regulatory sequence on gene expression is a foundational challenge for biology. We combine polysome profiling of hundreds of thousands of randomized 5′ UTRs with deep learning to build a predictive model that relates human 5′ UTR sequence to translation. Together with a genetic algorithm, we use the model to engineer new 5′ UTRs that accurately target specified levels of ribosome loading, providing the ability to tune sequences for optimal protein expression. We show that the same approach c...


17: Continuous Genetic Recording with Self-Targeting CRISPR-Cas in Human Cells

Samuel D Perli, Cheryl H. Cui et al.

3,994 downloads (posted 20 May 2016)

The ability to longitudinally track and record molecular events in vivo would provide a unique opportunity to monitor signaling dynamics within cellular niches and to identify critical factors in orchestrating cellular behavior. We present a self-contained analog memory device that enables the recording of molecular stimuli in the form of DNA mutations in human cells. The memory unit consists of a self-targeting guide RNA (stgRNA) cassette that repeatedly directs Streptococcus pyogenes Cas9 nuclease activity towards the...


18: Current CRISPR gene drive systems are likely to be highly invasive in wild populations

Charleston Noble, Ben Adlam et al.

3,885 downloads (posted 16 Nov 2017)

Recent reports have suggested that CRISPR-based gene drives are unlikely to invade wild populations due to drive-resistant alleles that prevent cutting. Here we develop mathematical models based on existing empirical data to explicitly test this assumption. We show that although resistance prevents drive systems from spreading to fixation in large populations, even the least effective systems reported to date are highly invasive. Releasing a small number of organisms often causes invasion of the local population, follow...


19: Resource usage and gene circuit performance characterization in a cell-free "breadboard"

Dan Siegal-Gaskins, Zoltan A. Tuza et al.

3,870 downloads (posted 25 Nov 2013)

The many successes of synthetic biology have come in a manner largely different from those in other engineering disciplines; in particular, without well-characterized and simplified prototyping environments to play a role analogous to wind-tunnels in aerodynamics and breadboards in electrical engineering. However, as the complexity of synthetic circuits increases, the benefits, in cost savings and design cycle time, of a more traditional engineering approach can be significant. We have recently developed an in vitro "brea...


20: Marionette: E. coli containing 12 highly-optimized small molecule sensors

Adam J. Meyer, Thomas H. Segall-Shapiro et al.

3,816 downloads (posted 20 Mar 2018)

Cellular processes are carried out by many interacting genes and their study and optimization requires multiple levers by which they can be independently controlled. The most common method is via a genetically-encoded sensor that responds to a small molecule (an "inducible system"). However, these sensors are often suboptimal, exhibiting high background expression and low dynamic range. Further, using multiple sensors in one cell is limited by cross-talk and the taxing of cellular resources. Here, we have developed a di...