TractoFlow: A robust, efficient and reproducible diffusion MRI pipeline leveraging Nextflow & Singularity
Posted 09 May 2019
bioRxiv DOI: 10.1101/631952 (published DOI: 10.1016/j.neuroimage.2020.116889)
A diffusion MRI (dMRI) tractography processing pipeline should be: i) reproducible in immediate test-retest, ii) reproducible in time, iii) efficient and iv) easy to use. Two runs of the same processing pipeline with the same input data should give the same output today, tomorrow and in two years. However, processing dMRI data requires a large number of steps (20+) that, at this time, may not be reproducible between runs or over time. If parameters such as the number of threads or the random number generator are not carefully set in the brain extraction, registration and fiber tracking steps, the final tractography results can be far from reproducible, which limits brain connectivity studies. Moreover, processing a large database can take several hours to days of computation, even more so if the steps run sequentially. To handle these issues, we present TractoFlow, a fully automated pipeline that processes datasets from the raw diffusion weighted images (DWI) to tractography. It also outputs classical diffusion tensor imaging measures (fractional anisotropy (FA) and diffusivities) and several HARDI measures (Number of Fiber Orientation (NuFO), Apparent Fiber Density (AFD)). The pipeline requires a DWI and a T1-weighted image as NIfTI files and b-values/b-vectors in FSL format. An optional reversed phase encoded b=0 image can also be used. The pipeline is built on two technologies, Nextflow and Singularity, and on pre-processing and processing steps recommended by the dMRI community. In this work, the TractoFlow pipeline is evaluated on three databases and shown to be efficient and 98% to 100% reproducible, depending on parameter choices. For example, 105 subjects from the Human Connectome Project (HCP) were fully processed in 25 hours to produce, for each subject, a whole-brain tractogram with 4 million streamlines.
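As an illustration of the input requirement, the sketch below checks that a pair of gradient tables follows the FSL text layout: one whitespace-separated row of b-values, and three rows of b-vector components (x, y, z) with one entry per volume. The function name `parse_fsl_gradients` and the sample values are hypothetical and chosen only for this example; this is not TractoFlow's own validation code.

```python
from typing import List, Tuple

Vector = Tuple[float, float, float]


def parse_fsl_gradients(bval_text: str, bvec_text: str) -> Tuple[List[float], List[Vector]]:
    """Parse FSL-style gradient tables (hypothetical helper for illustration).

    bval_text: whitespace-separated b-values, one per DWI volume.
    bvec_text: three rows (x, y, z), each with one entry per DWI volume.
    """
    bvals = [float(tok) for tok in bval_text.split()]
    # Skip blank lines so a trailing newline does not count as a row.
    rows = [[float(tok) for tok in line.split()]
            for line in bvec_text.splitlines() if line.strip()]
    if len(rows) != 3:
        raise ValueError(f"expected 3 bvec rows, found {len(rows)}")
    if any(len(row) != len(bvals) for row in rows):
        raise ValueError("each bvec row must have one entry per b-value")
    # Transpose the three rows into one (x, y, z) tuple per volume.
    bvecs = list(zip(rows[0], rows[1], rows[2]))
    return bvals, bvecs


# Made-up example: one b=0 volume followed by two b=1000 volumes.
bvals, bvecs = parse_fsl_gradients(
    "0 1000 1000",
    "0 1 0\n0 0 1\n0 0 0\n",
)
print(len(bvals), bvecs[1])
```

A check of this kind fails early on mismatched tables, before any long-running processing step starts.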
The contribution of this paper is to highlight the importance of a pipeline that is robust in terms of runtime and reproducibility over time. In the era of open data and open science, efficiency and reproducibility are critical in neuroimaging projects. Our TractoFlow pipeline is publicly available for academic research and is an important step forward for better structural brain connectivity mapping.
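The effect of an unset random number generator on run-to-run reproducibility can be seen in a toy setting. The sketch below is not TractoFlow code: `toy_seed_points` is a hypothetical stand-in for any stochastic step (for example, placing tracking seeds), and the seed value 1234 is arbitrary. It compares two runs by hashing their outputs, once with a fixed seed and once without.

```python
import hashlib
import random
from typing import List, Optional


def toy_seed_points(n: int, seed: Optional[int]) -> List[float]:
    """Hypothetical stand-in for a stochastic processing step."""
    rng = random.Random(seed)  # seed=None initializes from system entropy
    return [rng.random() for _ in range(n)]


def digest(values: List[float]) -> str:
    """Hash the exact float representations so any difference is detected."""
    payload = ",".join(repr(v) for v in values).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Two runs with the same fixed seed.
run_a = digest(toy_seed_points(1000, seed=1234))
run_b = digest(toy_seed_points(1000, seed=1234))

# Two runs with no seed set.
unseeded_a = digest(toy_seed_points(1000, seed=None))
unseeded_b = digest(toy_seed_points(1000, seed=None))

print("seeded runs identical:  ", run_a == run_b)
print("unseeded runs identical:", unseeded_a == unseeded_b)
```

Comparing digests rather than raw outputs keeps the check cheap for large results, at the cost of only reporting whether two runs differ, not by how much.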
- Downloaded 1,572 times
- Download rankings, all-time:
  - Site-wide: 10,786
  - In neuroscience: 1,182
- Year to date:
  - Site-wide: 21,186
- Since beginning of last month:
  - Site-wide: 26,288