
Towards Practical and Robust DNA-based Data Archiving by Codec System Named ‘Yin-Yang’

By Zhi Ping, Shihong Chen, Guangyu Zhou, Xiaoluo Huang, Sha Joe Zhu, Chen Chai, Haoling Zhang, Henry H Lee, Tsan-Yu Chiu, Tai Chen, Huanming Yang, Xun Xu, George M. Church, Yue Shen

Posted 05 Nov 2019
bioRxiv DOI: 10.1101/829721

Motivation: DNA has been reported as a promising data storage medium because of its remarkable durability and high storage density. Here, we propose a robust DNA-based data storage method built on a new codec algorithm named ‘Yin-Yang’.

Results: Using this strategy, we successfully stored files of different formats in a single synthetic DNA oligonucleotide pool. Compared with most well-established DNA-based data storage coding schemes presented to date, this codec system can meet a variety of user goals (e.g. limiting homopolymer length to at most 3 or 4, keeping GC content balanced between 40% and 60%, and keeping secondary structure simple, with Gibbs free energy above −30 kcal/mol). It also shows enhanced robustness when transcoding different data structures, as well as practical feasibility. We tested the codec in an end-to-end experiment comprising encoding, DNA synthesis, sequencing and decoding. With the successful retrieval of 3 files totaling 2.02 Megabits after sequencing and decoding, our strategy achieves high storage capacity per nucleotide (427.1 PB/gram) and high fidelity of data recovery.
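The sequence-level goals named in the abstract (a cap on homopolymer length and a GC content window of 40% to 60%) can be illustrated with a short screening check. The following is a minimal sketch, not the Yin-Yang codec itself: the function names and default thresholds are hypothetical choices for illustration, and the Gibbs free energy criterion is left out because it would require an external secondary-structure prediction tool.

```python
from itertools import groupby


def max_homopolymer_run(seq: str) -> int:
    """Return the length of the longest run of one repeated base (0 if empty)."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 if empty)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def passes_constraints(seq: str, max_run: int = 4,
                       gc_min: float = 0.40, gc_max: float = 0.60) -> bool:
    """Check a candidate sequence against the two constraints from the abstract.

    The defaults (max_run=4, GC between 0.40 and 0.60) mirror the ranges quoted
    in the abstract; the function itself is an assumption for illustration.
    """
    seq = seq.upper()
    return (max_homopolymer_run(seq) <= max_run
            and gc_min <= gc_fraction(seq) <= gc_max)
```

For example, `passes_constraints("ACGTACGTAC")` accepts a sequence with no repeated bases and 50% GC content, while `passes_constraints("AAAAACGCGC")` rejects one that contains a run of five A bases. Passing `max_run=3` applies the stricter of the two homopolymer limits mentioned in the abstract.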

Download data

  • Downloaded 560 times
  • Download rankings, all-time:
    • Site-wide: 28,072 out of 92,180
    • In synthetic biology: 431 out of 871
  • Year to date:
    • Site-wide: 8,864 out of 92,180
  • Since beginning of last month:
    • Site-wide: 7,463 out of 92,180
