Curation of BIDS (CuBIDS): a workflow and software package for streamlining reproducible curation of large BIDS datasets
By Sydney Covitz, Tinashe Tapera, Azeez Adebimpe, Aaron Alexander-Bloch, Maxwell A. Bertolero, Eric Feczko, Alexandre R. Franco, Raquel E Gur, Ruben C Gur, Timothy Hendrickson, Audrey Houghton, Kahini Mehta, Kristin Murtha, Anders J. Perrone, Tim Robert-Fitzgerald, Jenna M. Schabdach, Russell T Shinohara, Jacob W. Vogel, Chenying Zhao, Damien A. Fair, Michael P. Milham, Matthew Cieslak, Theodore D Satterthwaite
Posted 05 May 2022
bioRxiv DOI: 10.1101/2022.05.04.490620
The Brain Imaging Data Structure (BIDS) is a specification accompanied by a software ecosystem that was designed to create reproducible and automated workflows for processing neuroimaging data. BIDS Apps flexibly build workflows based on the metadata detected in a dataset. However, even BIDS-valid metadata can include incorrect values or omissions that result in inconsistent processing across sessions. Additionally, in large-scale, heterogeneous neuroimaging datasets, hidden variability in metadata is difficult to detect and classify. To address these challenges, we created a Python-based software package called "Curation of BIDS" (CuBIDS), which provides an intuitive workflow that helps users validate and manage the curation of their neuroimaging datasets. CuBIDS includes a robust implementation of BIDS validation that scales to large samples, and it incorporates DataLad, a version control software package for data, to ensure reproducibility and provenance tracking throughout the entire curation process. CuBIDS provides tools to help users perform quality control on their images' metadata and identify unique combinations of imaging parameters. Users can then execute BIDS Apps on a subset of participants that represents the full range of acquisition parameters present in the dataset, accelerating pipeline testing on large datasets.
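The idea of identifying unique combinations of imaging parameters and then testing on a representative subset can be illustrated with a minimal sketch. This is not the CuBIDS API or its algorithm. The filenames, the metadata values, the `PARAMS` tuple, and the helper names `group_by_parameters` and `pick_exemplars` are all assumptions made for illustration; only the metadata field names follow common BIDS sidecar usage.

```python
from collections import defaultdict

# Hypothetical sidecar metadata keyed by BIDS-style filename (illustrative values only).
scans = {
    "sub-01_task-rest_bold.json": {"RepetitionTime": 2.0, "EchoTime": 0.03, "FlipAngle": 90},
    "sub-02_task-rest_bold.json": {"RepetitionTime": 2.0, "EchoTime": 0.03, "FlipAngle": 90},
    "sub-03_task-rest_bold.json": {"RepetitionTime": 2.5, "EchoTime": 0.03, "FlipAngle": 90},
    "sub-04_task-rest_bold.json": {"RepetitionTime": 2.0, "EchoTime": 0.03},  # FlipAngle omitted
}

# Assumed set of parameters to compare; a real curation tool may use a different set.
PARAMS = ("RepetitionTime", "EchoTime", "FlipAngle")


def group_by_parameters(scans, params=PARAMS):
    """Group files by their tuple of parameter values; a missing value becomes None."""
    groups = defaultdict(list)
    for name in sorted(scans):
        meta = scans[name]
        key = tuple(meta.get(p) for p in params)
        groups[key].append(name)
    return dict(groups)


def pick_exemplars(groups):
    """Return one file per parameter group (the first in sorted order)."""
    return sorted(files[0] for files in groups.values())


groups = group_by_parameters(scans)
exemplars = pick_exemplars(groups)

for key, files in groups.items():
    print(dict(zip(PARAMS, key)), "->", len(files), "file(s)")
print("Exemplars:", exemplars)
```

In this toy dataset the four scans fall into three groups, because one scan has a different repetition time and another omits its flip angle, so running a pipeline on three exemplar scans would cover every parameter combination present.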