Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 62,472 bioRxiv papers from 277,337 authors.

Privacy-preserving generative deep neural networks support clinical data sharing

By Brett K Beaulieu-Jones, Zhiwei Steven Wu, Chris Williams, Ran Lee, Sanjeev P Bhavnani, James Brian Byrd, Casey S. Greene

Posted 05 Jul 2017
bioRxiv DOI: 10.1101/159756 (published DOI: 10.1161/CIRCOUTCOMES.118.005122)

Background: Data sharing accelerates scientific progress, but sharing individual-level data while preserving patient privacy presents a barrier. Methods and Results: Using pairs of deep neural networks, we generated simulated, synthetic "participants" that closely resemble participants of the SPRINT trial. We showed that such paired networks can be trained with differential privacy, a formal privacy framework that limits the likelihood that queries of the synthetic participants' data could identify a real participant in the trial. Machine-learning predictors built on the synthetic population generalize to the original dataset. This finding suggests that the synthetic data can be shared with others, enabling them to perform hypothesis-generating analyses as though they had the original trial data. Conclusions: Deep neural networks that generate synthetic participants facilitate secondary analyses and reproducible investigation of clinical datasets by enhancing data sharing while preserving participant privacy.

Download data

  • Downloaded 11,049 times
  • Download rankings, all-time:
    • Site-wide: 87 out of 62,472
    • In bioinformatics: 15 out of 6,231
  • Year to date:
    • Site-wide: 153 out of 62,472
  • Since beginning of last month:
    • Site-wide: 1,137 out of 62,472

