
Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 67,110 bioRxiv papers from 295,231 authors.

Toward machine-guided design of proteins

By Surojit Biswas, Gleb Kuznetsov, Pierce J Ogden, Nicholas J Conway, Ryan P Adams, George M Church

Posted 02 Jun 2018
bioRxiv DOI: 10.1101/337154

Proteins, the molecular machines that underpin all biological life, are of significant therapeutic and industrial value. Directed evolution is a high-throughput experimental approach for improving protein function, but it has difficulty escaping local maxima in the fitness landscape. Here, we investigate how supervised learning in a closed loop with DNA synthesis and high-throughput screening can be used to improve protein design. Using the green fluorescent protein (GFP) as an illustrative example, we demonstrate the opportunities and challenges of generating training datasets conducive to selecting strongly generalizing models. With prospectively designed wet lab experiments, we then validate that these models can generalize to unseen regions of the fitness landscape, even when constrained to explore combinations of non-trivial mutations. Taken together, these results suggest a hybrid optimization strategy for protein design in which a predictive model is used to explore difficult-to-access but promising regions of the fitness landscape that directed evolution can then exploit at scale.
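To make the closed-loop idea in the abstract concrete, the following is a minimal toy sketch, not the authors' method or code. It assumes a made-up 4-letter alphabet, a hidden target sequence standing in for the wet-lab screen, and a simple additive per-site model as the supervised learner. All names (`true_fitness`, `fit_additive_model`, `closed_loop`, and so on) are hypothetical and chosen only for illustration.

```python
import random

ALPHABET = "ACDE"    # toy 4-letter alphabet (assumption, not the real amino-acid set)
TARGET = "ACDEACDE"  # hidden optimum of the toy landscape (assumption)

def true_fitness(seq):
    """Toy stand-in for a screen: fraction of positions matching TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET)) / len(TARGET)

def fit_additive_model(measured):
    """Supervised step: mean measured fitness per (position, residue) pair."""
    sums, counts = {}, {}
    for seq, y in measured.items():
        for key in enumerate(seq):
            sums[key] = sums.get(key, 0.0) + y
            counts[key] = counts.get(key, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}

def predict(model, seq, default):
    """Score a sequence as the average of its per-site estimates."""
    return sum(model.get(key, default) for key in enumerate(seq)) / len(seq)

def mutate(seq, n_mut, rng):
    """Return a copy of seq with n_mut randomly chosen positions resampled."""
    chars = list(seq)
    for pos in rng.sample(range(len(seq)), n_mut):
        chars[pos] = rng.choice(ALPHABET)
    return "".join(chars)

def closed_loop(start, rounds=5, n_candidates=200, batch=10, seed=0):
    """Alternate between model fitting, candidate ranking, and 'screening'."""
    rng = random.Random(seed)
    measured = {start: true_fitness(start)}
    # Seed the training set with single mutants of the starting sequence.
    for _ in range(batch):
        s = mutate(start, 1, rng)
        measured[s] = true_fitness(s)
    for _ in range(rounds):
        model = fit_additive_model(measured)
        default = sum(measured.values()) / len(measured)
        best = max(measured, key=measured.get)
        # Propose multi-site variants that have not been screened yet.
        proposals = (mutate(best, 3, rng) for _ in range(n_candidates))
        candidates = sorted({s for s in proposals if s not in measured})
        ranked = sorted(candidates,
                        key=lambda s: predict(model, s, default),
                        reverse=True)
        # "Synthesize and screen" only the top-ranked batch.
        for s in ranked[:batch]:
            measured[s] = true_fitness(s)
    return measured

if __name__ == "__main__":
    results = closed_loop("EEEEEEEE")
    best_seq = max(results, key=results.get)
    print(best_seq, results[best_seq])
```

In this sketch the model only decides which multi-mutation variants are worth measuring; every reported fitness still comes from the simulated screen. A real pipeline would replace `true_fitness` with experimental measurements and the additive model with whatever learner generalizes well on the training data.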

Download data

  • Downloaded 4,953 times
  • Download rankings, all-time:
    • Site-wide: 440 out of 67,110
    • In synthetic biology: 12 out of 638
  • Year to date:
    • Site-wide: 377 out of 67,110
  • Since beginning of last month:
    • Site-wide: 1,361 out of 67,110


