The Brain Imaging Data Structure (BIDS) is a standardized format for organizing and describing neuroimaging data and study outputs (Gorgolewski et al., 2016).
Having your data in BIDS format is helpful in several ways:
The BIDS Starter Kit is a “community-curated collection of tutorials, wikis, and templates to get you started with creating BIDS compliant datasets.” As the name implies, this is a good place to start.
Also spend some time checking out the BIDS website and looking over the specification.
There's a great, simple description of the BIDS folder hierarchy here: https://github.com/bids-standard/bids-starter-kit/wiki/The-BIDS-folder-hierarchy
As well, there are three main types of files you’ll find in a BIDS dataset:
This is my directory structure from the oxytocin grief study (resting state data only), where we administered two different treatments (“A” and “B”) at two sessions a week apart:
restingstate/
├── sourcedata/
│   └── <DICOMS go here>
├── sub-101/
│   ├── ses-txA/
│   │   ├── anat/
│   │   └── func/
│   └── ses-txB/
│       ├── anat/
│       └── func/
├── derivatives/
│   └── fmriprep/
│       ├── dataset_description.json
│       └── sub-101/
│           ├── sub-101.html
│           ├── anat/
│           ├── figures/
│           ├── ses-txA/
│           │   ├── anat/
│           │   └── func/
│           └── ses-txB/
│               ├── anat/
│               └── func/
├── README
├── dataset_description.json
└── participants.tsv
/restingstate is the main BIDS directory. /sourcedata contains the DICOMs, in whatever haphazard organization they came in from Osirix. Note the BIDS distinction between “raw” and “source” data: raw = unprocessed, or minimally processed due to file format conversion; source = data before conversion to BIDS. /sub-101 contains two sub-directories, one for each session. Each has an anat and a func folder for the T1w and EPI images, respectively. These are where the NIFTIs go after they’re imported from /sourcedata. /derivatives contains any files that result from doing anything to the raw data: brain masks, processed images, reports, logs, metadata files, and so on. Each pipeline/software/BIDS app you use gets its own subdirectory; here, I just have fmriprep. Different apps organize their outputs differently, but they do it automatically, so you don’t have to worry about it.
A valid BIDS dataset also needs these three files at the top level: README, dataset_description.json, and participants.tsv (shown in the directory tree above).
The naming of both directories and files is highly specific, and detailed in the BIDS Specification document.
This is probably the steepest curve in using BIDS, but luckily there are a multitude of software packages and approaches to make this happen.
Some of these only convert, some convert and create JSON sidecars, some do all of that + organize your files for you.
For the oxytocin study, there were a ton of inconsistencies in how the DICOMs were named and organized for each participant/session (and I wasn’t savvy enough to successfully modify the custom shell script), so dcm2niibatch ended up being the best option for my dissertation.
dcm2niibatch “performs a batch conversion of multiple dicoms using dcm2niibatch, which is run by passing a configuration file e.g dcm2niibatch batch_config.yml” (manual)
N.B.: All of this assumes you’re using OSX.
First, install Homebrew, which is a package manager for OSX.
$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
The script explains what it will do and then pauses before it does it.
Then install dcm2niix, with a flag to install dcm2niibatch too:
$ brew install dcm2niix --with-batch
Then make the subdirectories where each subject’s raw NIFTIs will ultimately live (assuming you already set up the top level directory, in this case /restingstate
, and moved your DICOMs into /restingstate/sourcedata
):
$ cd <your BIDS directory>
$ mkdir -p sub-{101,102,103,104,105,107,110,113,114,115,117,118,119,120,121,122,123,125,126,127,128,129,130,131,132,133,134,135,137,138,139,140,141,142,144,145,146,147,148,149}/ses-{txA,txB}/{func,anat}
The next step is to build the configuration files, which move each subject’s data from /sourcedata
into where it should be according to the BIDS spec.
Configuration files for dcm2niibatch need to be in a very specific format called YAML that keeps data stored as key-value pairs. This is a good overview of YAML.
Note: YAML uses whitespace for formatting, so be very careful about which text editor you use (I used Atom). TextEdit (the default on OSX) does not work. You need an editor that won’t insert any invisible formatting whatsoever.
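Because invisible characters are the usual culprit, it can help to check a config file before feeding it to dcm2niibatch. This is just a sketch of my own (check_tabs is a made-up helper name, not part of dcm2niix); it flags tab characters, which YAML forbids for indentation and which some editors insert silently:

```shell
# check_tabs: warn if a YAML file contains tab characters, which YAML
# forbids for indentation and which some editors insert invisibly.
check_tabs() {
  if grep -n "$(printf '\t')" "$1"; then
    echo "tabs found in $1 -- fix these before running dcm2niibatch"
    return 1
  fi
  echo "no tabs in $1"
}

# Usage: check_tabs batch_config.yml
```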
In order to make the configuration file for dcm2niibatch, I needed to first get the paths for all of the DICOM source data:
$ find . -type d -name "*MPRAGE*" > ~/Desktop/restingstate/sourcedata/mprage-files.txt
$ find . -type d -name "*Rest*" > ~/Desktop/restingstate/sourcedata/rest-files.txt
The config files follow the format:
Options:
  isGz: false
  isFlipY: false
  isVerbose: false
  isCreateBIDS: true
  isOnlySingleFile: false
Files:
  -
    in_dir: /path/to/first/folder
    out_dir: /path/to/output/folder
    filename: firstfile
  -
    in_dir: /path/to/second/folder
    out_dir: /path/to/output/folder
    filename: secondfile
isCreateBIDS: true makes a BIDS-compliant JSON sidecar that contains metadata from the NIFTI header. You can specify as many files as you want, as long as each entry starts with a dash.
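Writing these stanzas by hand for 40 subjects is tedious; since each entry follows the same pattern, you can generate them from the path lists collected with find above. A rough sketch (assuming, as in my data, that each DICOM folder name embeds the subject and session as D<ID><A|B>, e.g. D101A; the sed pattern and the gen_anat_stanzas name are my own, so adjust for your naming):

```shell
# gen_anat_stanzas: read DICOM directory paths on stdin (one per line) and
# print a dcm2niibatch "Files" stanza for each. Assumes paths contain a
# D<3-digit ID><A|B> folder name, e.g. ./sourcedata/D101A/T1-MPRAGE - 12.
gen_anat_stanzas() {
  while IFS= read -r dir; do
    id=$(printf '%s\n' "$dir" | sed -n 's/.*D\([0-9]\{3\}\)\([AB]\).*/\1/p')
    ses=$(printf '%s\n' "$dir" | sed -n 's/.*D\([0-9]\{3\}\)\([AB]\).*/\2/p')
    [ -n "$id" ] || continue            # skip paths that don't match
    printf " -\n  in_dir: '%s'\n" "$dir"
    printf "  out_dir: ./sub-%s/ses-tx%s/anat\n" "$id" "$ses"
    printf "  filename: sub-%s_ses-tx%s_T1w\n" "$id" "$ses"
  done
}

# Usage: gen_anat_stanzas < sourcedata/mprage-files.txt >> batch_config_anat.yaml
```

Always eyeball the generated YAML before running the batch conversion.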
For T1w images, filenames are sub-1**_ses-tx*_T1w. Note the very specific naming, including the sub- prefix, the ses- prefix (for multi-session/longitudinal data), and the modality suffix (T1w):
Options:
  isGz: false
  isFlipY: false
  isVerbose: true
  isCreateBIDS: true
  isOnlySingleFile: false
Files:
  -
    in_dir: './sourcedata/D101A/T1-MPRAGE - 12'
    out_dir: ./sub-101/ses-txA/anat
    filename: sub-101_ses-txA_T1w
  -
    in_dir: './sourcedata/D101B/T1-MPRAGE - 8'
    out_dir: ./sub-101/ses-txB/anat
    filename: sub-101_ses-txB_T1w
For the functional images, filenames will be sub-1**_ses-tx*_task-*_bold. Again, note the very specific naming, including the subject ID, session ID, task (task- prefix), and modality suffix (bold).
Options:
  isGz: false
  isFlipY: false
  isVerbose: true
  isCreateBIDS: true
  isOnlySingleFile: false
Files:
  -
    in_dir: './sourcedata/D101A/RestingState - 11'
    out_dir: ./sub-101/ses-txA/func
    filename: sub-101_ses-txA_task-rest_bold
  -
    in_dir: './sourcedata/D101B/RestingState - 7'
    out_dir: ./sub-101/ses-txB/func
    filename: sub-101_ses-txB_task-rest_bold
The validator will throw a “NOT_INCLUDED” error due to the configuration files being in the dataset.
To avoid this, add a .bidsignore file containing the following:
/restingstate
*.yaml
This is pretty simple, assuming the config files are all in order:
$ dcm2niibatch batch_config_anat.yaml
$ dcm2niibatch batch_config_rest.yaml
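After conversion, a quick way to spot missing files before the validator does is to count the NIfTIs in each session folder. This is just a sketch (check_counts is my own helper name), assuming the sub-*/ses-* layout used above:

```shell
# check_counts: from the top-level BIDS directory, report how many T1w and
# BOLD NIfTIs each sub-*/ses-* folder contains (each should have one).
check_counts() {
  for d in sub-*/ses-*; do
    [ -d "$d" ] || continue
    n_anat=$(find "$d/anat" -name '*_T1w.nii*' 2>/dev/null | wc -l | tr -d ' ')
    n_func=$(find "$d/func" -name '*_bold.nii*' 2>/dev/null | wc -l | tr -d ' ')
    printf '%s: %s anat, %s func\n' "$d" "$n_anat" "$n_func"
  done
}

# Usage: cd ~/Desktop/restingstate && check_counts
```

Any line reporting 0 anat or 0 func points at a conversion that silently failed.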
The BIDS validator can be run online. (You can also install it as a command-line tool, but I haven’t.) The site notes: “Selecting a dataset only performs validation. Files are never uploaded.”
Note: Works in Chrome or Firefox only.
Select your top-level BIDS directory (here, /restingstate) and wait for it to finish validating. See the Problems and Solutions section below.
NOTE: This documentation is based on my experience with fmriprep version 1.1.8. As of 05/16/2019, the current version is 1.4.0 (see https://neurostars.org/t/fmriprep-1-4-0-just-released/4265)
Generally, a BIDS app is “a container image capturing a neuroimaging pipeline that takes a BIDS-formatted dataset as input. Since the input is a whole dataset, apps are able to combine multiple modalities, sessions, and/or subjects, but at the same time need to implement ways to query input datasets. Each BIDS App has the same core set of command-line arguments, making them easy to run and integrate into automated platforms.” (Gorgolewski et al., 2017)
Containers are similar to virtual machines, but are more lightweight because they share the host operating system’s kernel rather than emulating a full OS.
BIDS apps rely on two technologies for container computing: Docker (for personal computers) and Singularity (for multi-user systems such as HPC clusters).
Container software such as Docker bundles all relevant software for processing. This lets you avoid “dependency hell” – especially important for something like fMRIPrep, which uses modules from various neuroimaging packages (FreeSurfer, FSL, AFNI, etc.). As well, having the exact version numbers of the bundled software allows for reproducibility.
Many BIDS apps can also be run in the cloud via a GUI on OpenNeuro.org (based on the idea of “science as a service”; note that datasets are automatically published after 3 years if they contain >2 subjects).
fMRIPrep is one of many BIDS apps, or “portable neuroimaging pipelines that understand BIDS datasets”. fMRIPrep is a generic fMRI preprocessing pipeline providing results robust to the input data quality as well as informative reports.
fMRIPrep was built around three principles: robustness (the pipeline adapts to the input dataset), ease of use (minimal parameters required, thanks to BIDS), and a “glass box” philosophy (automated reports let you inspect what was done).
From https://fmriprep.readthedocs.io/en/latest/workflows.html:
Source: https://fmriprep.readthedocs.io/en/latest/outputs.html
Install pip:
$ sudo easy_install pip
You must be logged in as an admin user to use sudo. Some people suggest using Homebrew to install pip instead of messing with the system Python, but then you have two versions of Python on your computer, which also seems like it has high potential to lead to issues. So I’m not sure which approach is better.
Install fmriprep: https://fmriprep.readthedocs.io/en/stable/installation.html. To install the Docker wrapper (the recommended way of running fMRIPrep):
$ pip install --user --upgrade fmriprep-docker
The first time I tried to use the wrapper, I kept getting -bash: fmriprep-docker: command not found. Turns out this is because pip installed fmriprep-docker into /Users/sarenseeley/Library/Python/2.7/bin/. which didn’t turn it up, but this worked:
$ yes n | pip uninstall fmriprep-docker | grep bin
What does this command do? yes n | pip uninstall fmriprep-docker | grep bin tells you where the program is (as if it were going to uninstall it) but doesn’t actually uninstall it, because the yes n part answers “no” to the confirmation prompt. Once you know where it is, you can add that location to your global environment. See effigies’ response in this thread: https://github.com/poldracklab/fmriprep/issues/909#issuecomment-353322728
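A more general way to find where pip --user puts scripts is to ask Python itself for its “user base” directory. A sketch (user_bin_dir is my own name; I use python3 here, so substitute python for the 2.7 setup described above):

```shell
# user_bin_dir: print the directory where `pip install --user` puts
# executables (the user base plus /bin), so you can add it to PATH.
user_bin_dir() {
  printf '%s/bin\n' "$(python3 -m site --user-base)"
}

# Usage: export PATH="$PATH:$(user_bin_dir)"
```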
fMRIPrep also had issues locating the FreeSurfer license for some reason, or when it could find it, wanted to treat it as an executable file.
To solve both of these issues, I had to add their paths to the global environment.
To add this path permanently to the global environment, follow the instructions here: Setting permanent environment variable using terminal
$ cd ~/
$ nano .bash_profile
When nano opens up, add the following (change to reflect where your stuff is):
export PATH=$PATH:/Users/sarenseeley/Library/Python/2.7/bin
export FS_LICENSE=$HOME/Desktop/restingstate/derivatives/license.txt
Then save and exit nano.
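After opening a new Terminal window (so .bash_profile is re-read), you can sanity-check both settings. This is just a sketch of mine (check_env is a made-up name):

```shell
# check_env: confirm fmriprep-docker is on PATH and the FreeSurfer
# license path in FS_LICENSE points at a readable file.
check_env() {
  if command -v fmriprep-docker >/dev/null 2>&1; then
    echo "fmriprep-docker: found"
  else
    echo "fmriprep-docker: still not on PATH"
  fi
  if [ -n "$FS_LICENSE" ] && [ -r "$FS_LICENSE" ]; then
    echo "license found: $FS_LICENSE"
  else
    echo "FS_LICENSE not set or unreadable"
  fi
}
```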
If you only have 8GB RAM on your computer, you will receive a warning:
$ fmriprep-docker /Users/sarenseeley/Desktop/test/ /Users/sarenseeley/Desktop/test/derivatives participant
Warning: <8GB of RAM is available within your Docker environment.
Some parts of fMRIPrep may fail to complete.
Continue anyway? [y/N]
2GB is the default RAM available with Docker for Mac. fmriprep did fail while running the test subject with 2GB RAM, so I increased the RAM available to Docker in Docker’s preferences, then re-ran fmriprep-docker. I still get the warning about RAM, but it runs. FYI: this will probably slow down your system like crazy, so don’t forget to quit Docker when you’re done.
Upgrading to 24GB RAM and increasing the memory available to Docker to 16GB RAM sped things up significantly: 4 hours/subject vs. 12 hours/subject (no FreeSurfer). (My lab computer is a late-2013 iMac with 3.2 GHz Intel Core i5 processor.)
Helpful links:
Using the Docker wrapper is recommended:
$ fmriprep-docker /Users/sarenseeley/Desktop/restingstate/ /Users/sarenseeley/Desktop/restingstate/derivatives --longitudinal --participant_label sub-110 sub-113 sub-114 sub-115 sub-117
But you can also invoke Docker directly:
$ docker run -ti --rm \
-v filepath/to/data/dir:/data:ro \
-v filepath/to/output/dir:/out \
poldracklab/fmriprep:latest \
/data /out/out \
participant
Lots of different options!
From https://fmriprep.readthedocs.io/en/stable/usage.html#command-line-arguments:
[-h] [--version]
[--participant_label PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]]
[-t TASK_ID] [--debug] [--nthreads NTHREADS]
[--omp-nthreads OMP_NTHREADS] [--mem_mb MEM_MB] [--low-mem]
[--use-plugin USE_PLUGIN] [--anat-only] [--boilerplate]
[--ignore-aroma-denoising-errors] [-v]
[--ignore {fieldmaps,slicetiming,sbref} [{fieldmaps,slicetiming,sbref} ...]]
[--longitudinal] [--t2s-coreg] [--bold2t1w-dof {6,9,12}]
[--output-space {T1w,template,fsnative,fsaverage,fsaverage6,fsaverage5} [{T1w,template,fsnative,fsaverage,fsaverage6,fsaverage5} ...]]
[--force-bbr] [--force-no-bbr]
[--template {MNI152NLin2009cAsym}]
[--output-grid-reference OUTPUT_GRID_REFERENCE]
[--template-resampling-grid TEMPLATE_RESAMPLING_GRID]
[--medial-surface-nan] [--use-aroma]
[--aroma-melodic-dimensionality AROMA_MELODIC_DIMENSIONALITY]
[--skull-strip-template {OASIS,NKI}]
[--skull-strip-fixed-seed] [--fmap-bspline] [--fmap-no-demean]
[--use-syn-sdc] [--force-syn] [--fs-license-file PATH]
[--no-submm-recon] [--cifti-output | --fs-no-reconall]
[-w WORK_DIR] [--resource-monitor] [--reports-only]
[--run-uuid RUN_UUID] [--write-graph] [--stop-on-first-crash]
[--notrack]
bids_dir output_dir {participant}
Example of how you might use these arguments:
fmriprep-docker --low-mem --resource-monitor --stop-on-first-crash --longitudinal --use-syn-sdc --use-aroma /Users/sarenseeley/Desktop/restingstate/ /Users/sarenseeley/Desktop/restingstate/derivatives -w /Users/sarenseeley/Desktop/restingstate/derivatives/scratch --participant_label sub-110 sub-113 sub-114 sub-115 sub-117
(I am only running a few participants at a time because I’m not yet sure how many I can run together before it crashes.)
These are some of the ones I am using:
--use-aroma: “Given a motion-corrected fMRI, a brain mask, mcflirt movement parameters and a segmentation, the discover_wf sub-workflow calculates potential confounds per volume. Calculated confounds include the mean global signal, mean tissue class signal, tCompCor, aCompCor, Frame-wise Displacement, 6 motion parameters, DVARS, and, if the –use-aroma flag is enabled, the noise components identified by ICA-AROMA (those to be removed by the “aggressive” denoising strategy)” & see the section on ICA-AROMA (source)
--use-syn-sdc: “In the absence of direct measurements of fieldmap data, we provide an (experimental) option to estimate the susceptibility distortion based on the ANTs symmetric normalization (SyN) technique. This feature may be enabled, using the –use-syn-sdc flag, and will only be applied if fieldmaps are unavailable.” (source)
--longitudinal: “In the case of multiple T1w images (across sessions and/or runs), T1w images are merged into a single template image using FreeSurfer’s mri_robust_template. This template may be unbiased, or equidistant from all source images, or aligned to the first image (determined lexicographically by session label). For two images, the additional cost of estimating an unbiased template is trivial and is the default behavior, but, for greater than two images, the cost can be a slowdown of an order of magnitude. Therefore, in the case of three or more images, fmriprep constructs templates aligned to the first image, unless passed the –longitudinal flag, which forces the estimation of an unbiased template.” (source)
--low-mem: Option to reduce memory usage for large BOLD series (will increase disk usage in the working directory). “This will wait until the end of the pipeline to compress the resampled BOLD series, which allows tasks that need to read these files to read only the necessary parts of the file into memory.” (https://neurostars.org/t/memory-usage-of-fmriprep/1552/3)
--fs-no-reconall: Disables surface preprocessing, which saves a ton of time. If your registration looks okay without it, then great! If you’re seeing issues with the registration, like “brain” being identified outside of the brain, then try registration with FreeSurfer’s bbregister instead (this is the default, so you don’t need to specify anything; just remove --fs-no-reconall).
-w /Users/sarenseeley/Desktop/restingstate/derivatives/scratch: Specifies your own local scratch directory (vs. having interim files written somewhere in the Docker container). This is really helpful because if fMRIPrep crashes, it can use the previously computed outputs to pick up where it left off, saving you from having to wait for it to rerun the whole thing.
(SECTION IN PROGRESS)
The figure shows on top several confounds estimated for the BOLD series: global signals (‘GlobalSignal’, ‘WM’, ‘GM’), standardized DVARS (‘stdDVARS’), and framewise-displacement (‘FramewiseDisplacement’). At the bottom, a ‘carpetplot’ summarizing the BOLD series. The colormap on the left-side of the carpetplot denotes signals located in cortical gray matter regions (blue), subcortical gray matter (orange), cerebellum (green) and the union of white-matter and CSF compartments (red). (source)
$ docker run -it poldracklab/mriqc:latest --version
It will say first that it’s unable to find poldracklab/mriqc, then proceed to download what it needs. When it’s done, it will display something like this in your Terminal window:
Unable to find image 'poldracklab/mriqc:latest' locally
latest: Pulling from poldracklab/mriqc
c83208261473: Already exists
6e1a85c1d66a: Already exists
f1320ef45e20: Already exists
5a6ab6e6fbf6: Already exists
6fd240c27767: Already exists
58a3bd8fa030: Pull complete
f3e3661defbc: Pull complete
47da0cb1bc78: Pull complete
ef820cb9cdfe: Pull complete
3888bc11a283: Pull complete
a4cca34e324b: Pull complete
Digest: sha256:6609a2427d6f270947f466c00591c3948b7682360be8259b661dc4009455af94
Status: Downloaded newer image for poldracklab/mriqc:latest
mriqc v0.14.2
$ docker run -it --rm -v /Users/sarenseeley/Desktop/restingstate:/data:ro -v /Users/sarenseeley/Desktop/restingstate/derivatives/mriqc:/out poldracklab/mriqc:latest /data /out participant -m T1w bold
This runs both the participant- and group-level analysis.
/Users/sarenseeley/Desktop/restingstate is the input directory, which must be a valid BIDS directory. /Users/sarenseeley/Desktop/restingstate/derivatives/mriqc is wherever you want MRIQC to put the output. -m T1w bold indicates that the dataset contains images in the T1w and BOLD modalities.
To run a single subject (or set of subjects):
$ docker run -it --rm -v /Users/sarenseeley/Desktop/restingstate:/data:ro -v /Users/sarenseeley/Desktop/restingstate/derivatives/mriqc:/out poldracklab/mriqc:latest /data /out participant --participant_label 138 -m T1w bold
Note that you only have to put the number, not the sub- prefix (i.e., --participant_label 138 vs. --participant_label sub-138; the latter will not run).
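To run several subjects back to back without retyping the command, you can loop over labels. A sketch (run_mriqc_per_subject and the DRYRUN idea are my own; the paths are from my setup, so substitute yours):

```shell
# run_mriqc_per_subject: run MRIQC once per subject label (no sub- prefix).
# Set DRYRUN=echo to preview the docker commands without executing them.
bids=/Users/sarenseeley/Desktop/restingstate
out=$bids/derivatives/mriqc
run_mriqc_per_subject() {
  for subj in "$@"; do
    $DRYRUN docker run -it --rm -v "$bids":/data:ro -v "$out":/out \
      poldracklab/mriqc:latest /data /out participant \
      --participant_label "$subj" -m T1w bold
  done
}

# Preview only:  DRYRUN=echo run_mriqc_per_subject 110 113 114
# Really run it: run_mriqc_per_subject 110 113 114
```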
Example (MRIQC on the ABIDE dataset; this is a clinical population, so there are more quality issues than you might see in a non-clinical dataset):
These are the summary metrics currently provided by MRIQC (with thanks to my MRIQCEPTION team members at Neurohackademy 2019 for their help creating more user-friendly definitions):
| Type of scan metric applies to | Abbreviation | Name | Description |
| --- | --- | --- | --- |
| Structural | cjv | Coefficient of joint variation | Coefficient of joint variation between white matter and gray matter. Higher values indicate more head motion and/or intensity non-uniformity artifacts. |
| Structural | cnr | Contrast-to-noise ratio | Contrast-to-noise ratio, reflecting separation between GM & WM. Higher values indicate higher quality. |
| Structural | snr_dietrich | Dietrich’s SNR | Dietrich et al. (2007)’s signal-to-noise ratio. Higher values indicate higher quality. |
| Structural | art_qi2 | Mortamet’s quality index 2 | A quality index accounting for effects of both clustered and subtle artifacts in the air background. Higher values indicate lower quality. |
| Structural | art_qi1 | Mortamet’s quality index 1 | The proportion of voxels outside the brain with artifacts to the total number of voxels outside the brain. Higher values indicate lower quality. |
| Structural | wm2max | White matter-to-maximum intensity ratio | Captures skewed distributions within the WM mask, caused by fat and vascular-related hyperintensities. Ideal values fall within the interval [0.6, 0.8]. |
| Structural | fwhm_ | Full-width half-maximum smoothness | Image blurriness (full-width half-maximum). Higher values indicate a blurrier image. |
| Structural | volume_fraction | Volume fraction | Summary statistics for the intra-cranial volume fractions of CSF, GM, and WM. Be aware of potential outliers. |
| Structural | rpve | Residual partial voluming error | Residual partial volume error. Higher values indicate lower quality. |
| Structural | overlap_ | Overlap of tissue probabilities | How well the image tissue probability maps overlap with those from the MNI ICBM 2009 template. Higher values indicate better spatial normalization. |
| Structural, Functional | efc | Entropy-focus criterion | Shannon entropy criterion. Higher values indicate more ghosting and/or head motion blurring. |
| Structural, Functional | fber | Foreground-background energy ratio | The variance of voxels inside the brain divided by the variance of voxels outside the brain. Higher values indicate higher quality. |
| Structural, Functional | inu_ | Intensity non-uniformity | Intensity non-uniformity (bias field) summary statistics. Values closer to 1 indicate higher quality; values further from 1 indicate greater RF field inhomogeneity. |
| Structural, Functional | snr | Signal-to-noise ratio | Signal-to-noise ratio within the tissue mask. Higher values indicate higher quality. |
| Structural, Functional | summary_stats | Summary stats | Summary statistics for average intensities in CSF, GM, and WM. |
| Functional | dvars | Derivatives of variance | The average change in mean intensity between each pair of fMRI volumes in a series. Higher values indicate more dramatic changes (e.g., due to motion or spiking). |
| Functional | gcor | Global correlation | Average correlation of all pairs of voxel time series inside of the brain. Illustrates differences between data due to motion/physiological noise/imaging artifacts. Values closer to zero are better. |
| Functional | tsnr | Temporal signal-to-noise ratio | Temporal signal-to-noise ratio taking into account mean signal over time. Higher values indicate higher quality. |
| Functional | fd_mean | Framewise displacement (mean) | A measure of subject head motion, which compares the motion between the current and previous volumes. Higher values indicate lower quality. |
| Functional | fd_num | Framewise displacement (number) | Number of timepoints with framewise displacement >0.2mm. Higher values indicate lower quality. |
| Functional | fd_perc | Framewise displacement (percent) | Percent of timepoints with framewise displacement >0.2mm. Higher values indicate lower quality. |
| Functional | gsr | Ghost-to-signal ratio | Ghost-to-signal ratio along the x or y encoding axes. Higher values indicate lower quality. |
| Functional | aor | AFNI’s outlier ratio | Mean fraction of outliers per fMRI volume, from AFNI’s 3dToutcount. Higher values indicate lower quality. |
| Functional | aqi | AFNI’s quality index | Mean quality index, from AFNI’s 3dTqual. Values close to 0 indicate higher quality. |
| Functional | dummy | Dummy scans | Number of volumes at the beginning of the fMRI timeseries identified as non-steady state. |
Note that many of the IQMs calculated are “no-reference” metrics: “A no-reference IQM is a measurement of some aspect of the actual image which cannot be compared to a reference value for the metric since there is no ground-truth about what this number should be.” (source)
We’re working on a project to help you interpret your MRIQC results…stay tuned :)
Read more about the classifier here: https://mriqc.readthedocs.io/en/stable/classifier.html
Usage with docker:
$ docker run -v $PWD:/scratch -w /scratch --entrypoint=mriqc_clf poldracklab/mriqc:latest --load-classifier -X group_T1w.tsv
Explanation from https://groups.google.com/forum/#!topic/mriqc-users/P3LwhuIagaU (commands modified to work on my data):
- docker run invokes Docker.
- -v $PWD:/scratch provides a folder to communicate data into and off of the container.
- -w /scratch changes the working directory to read the input file and write the results.
- --entrypoint=mriqc_clf tells Docker to run a different binary (mriqc_clf) rather than the default (mriqc).
- poldracklab/mriqc:latest pulls the latest version of MRIQC. You can also have it pull a specific version (e.g., poldracklab/mriqc:0.9.6).
- --load-classifier -X loads the classifier trained on the ABIDE dataset (the default) or your custom classifier, if you created one.
- group_T1w.tsv tells MRIQC to apply the classifier to the group T1w report.
There is currently no MRIQC classifier for BOLD or T2 images.
Here’s a template I made for inspecting the MRIQC group reports, listing each metric and its definition: https://docs.google.com/spreadsheets/d/1OsWJFxzXaDFjSwCXbAwv1YonPEd4f9a7cFb7ITLVVnI/edit?usp=sharing
Broadly, in order to run fMRIPrep or MRIQC on the HPC, several things need to happen:
For example, you make your data visible inside the container with Singularity’s -B <host_folder>:<container_folder> bind argument. See https://fmriprep.readthedocs.io/en/latest/installation.html#singularity-container
See also Chidi’s slides from the Nov. 16th ’18 BMW meeting.
Below are some miscellaneous problems we have encountered, and how we solved them.
What does dcm2niix’s message “slices stacked despite varying acquisition numbers (if this is not desired please recompile)” mean?
Look at your data to know whether this is okay or not. You can look at the DICOMs (converted to .nii) in your image viewer of choice (SPM, Mango, FSLview…) against the .nii files generated by dcm2niix
to check that they look the same.
From Dianne Patterson:
There are different ways to order the data, especially 4d data like fmri… slice 1 vol1, slice 1 vol2, slice 1 vol3… vs vol1, slice1, slice 2, slice3 etc….vol2 slice1, slice2, slice 3 etc….
I think there can be a difference between the way the scanner exports them and the program stacks them, though I generally don’t learn these gory details unless something breaks. Did you look at the fmris to make sure the volumes display all their slices in order (I suspect it’ll seem okay or look like a total hot mess)?
This means exactly what it says.
When running dcm2niibatch, I specified isVerbose: true and copied the command-line output into a .txt file to inspect for any issues with files that dcm2niix was unable to convert. For D110B, there was the following error:
Found 192 DICOM file(s) #repeated 192x
Unsupported transfer syntax '1.2.840.10008.1.2.4.90' (see www.nitrc.org/plugins/mwiki/index.php/dcm2nii:MainPage)
No valid DICOM images were found
Conversion required 0.165132 seconds.
The problem is that for some weird reason, these DICOMs were JPEG 2000-compressed (that is what transfer syntax 1.2.840.10008.1.2.4.90 means), so dcm2niix doesn’t know what to do with them.
I had to decompress the files, using GDCM (https://github.com/malaterre/GDCM/releases; http://gdcm.sourceforge.net/wiki/index.php/Main_Page).
How to install and compile GDCM, following the instructions in the INSTALL.txt file in /gdcm (included when you clone /gdcm to your local machine):
$ git clone --branch release git://git.code.sf.net/p/gdcm/gdcm
$ mkdir gdcmbin
$ cd gdcmbin
$ ccmake ../gdcm
Then you need to add the path to /gdcm
so that the software can be located and used:
$ export PATH=$PATH:~/gdcmbin/bin
$ echo $PATH
For-loop to decompress each of the DICOMs in the specified directory, printing converting [filename]... as it goes (note that GDCM cannot handle spaces in the directory names):
$ DIR='/Users/sarenseeley/Desktop/restingstate/data/sourcedata/D110B/AnonymizedD1D0D305/OconnorSequences/T1MPRAGE5/*.dcm'; for f in $DIR; do echo "converting $f..."; gdcmconv -w $f $f; done
$ DIR='/Users/sarenseeley/Desktop/restingstate/data/sourcedata/D110B/AnonymizedD1D0D305/OconnorSequences/RestingState8/*.dcm'; for f in $DIR; do echo "converting $f..."; gdcmconv -w $f $f; done
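Since GDCM chokes on spaces in directory names with the unquoted glob above, a find-based loop that quotes each path avoids the problem. A sketch (decompress_dicoms is my own wrapper name; it assumes gdcmconv is on your PATH):

```shell
# decompress_dicoms: decompress every .dcm under a directory in place,
# quoting "$f" so paths containing spaces survive.
decompress_dicoms() {
  find "$1" -name '*.dcm' | while IFS= read -r f; do
    echo "converting $f..."
    gdcmconv -w "$f" "$f"
  done
}

# Usage: decompress_dicoms /path/to/sourcedata/D110B
```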
Then the next step is to build two more configuration files (anat and rest) specific to D110B, and run dcm2niibatch on just those files:
$ dcm2niibatch /Users/sarenseeley/Desktop/restingstate/data/batch_config_anat-D110B.yaml
$ dcm2niibatch /Users/sarenseeley/Desktop/restingstate/data/batch_config_rest-D110B.yaml
After this, I went back to the original configuration files and updated them to reflect the new directory path for D110B, then ran the BIDS validator again on the dataset.
Below are some errors and warnings that the validator gave me. Warnings can be ignored (at your own peril), but errors mean that your dataset is not BIDS-compliant.
This means you’re probably missing some files for certain participants in your BIDS dataset. Example:
Warning: 1 (4 files)
Not all subjects contain the same files. Each subject should contain the same number of files with the same naming unless some files are known to be missing.

sub-110_ses-txB_T1w.json
Location: /sub-110/ses-txB/anat/sub-110_ses-txB_T1w.json
Reason: This file is missing for subject sub-110, but is present for at least one other subject.

sub-110_ses-txB_T1w.nii
Location: /sub-110/ses-txB/anat/sub-110_ses-txB_T1w.nii
Reason: This file is missing for subject sub-110, but is present for at least one other subject.

sub-110_ses-txB_task-rest_bold.json
Location: /sub-110/ses-txB/func/sub-110_ses-txB_task-rest_bold.json
Reason: This file is missing for subject sub-110, but is present for at least one other subject.

sub-110_ses-txB_task-rest_bold.nii
Location: /sub-110/ses-txB/func/sub-110_ses-txB_task-rest_bold.nii
Reason: This file is missing for subject sub-110, but is present for at least one other subject.
The BIDS validator will tell you which subjects are missing files, as shown above. Figure out why they are missing files (for me, it was the “dcm2niibatch doesn’t read compressed DICOMs” issue described above), fix it, and re-run the validator.
Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a “.bidsignore” file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder.
Why did this happen? The YAML configuration files that I used for dcm2niibatch are in /data.
Added a .bidsignore file containing the following:
/data
*.yaml
Example:
Error: 1 (79 files)
You have to define 'TaskName' for this file.
sub-101_ses-txA_task-rest_bold.nii 78759.712 KB
Location:
derivatives/sub-101/ses-txA/func/sub-101_ses-txA_task-rest_bold.nii
Reason:
You have to define 'TaskName' for this file. It can be included one of the following locations: /bold.json, /task-rest_bold.json, /sub-101/sub-101_bold.json, /sub-101/sub-101_task-rest_bold.json, /sub-101/ses-txA/sub-101_ses-txA_bold.json, /sub-101/ses-txA/sub-101_ses-txA_task-rest_bold.json, /sub-101/ses-txA/func/sub-101_ses-txA_task-rest_bold.json
That info (TaskName) wasn’t stored in the image header, so dcm2niix can’t pull it. See Chris G.’s response here: https://github.com/rordenlab/dcm2niix/issues/148. But you don’t need to add a TaskName field individually to each of the .json sidecars for each subject/session. Just stick a task-rest_bold.json file containing the task name (as shown below) into the top level of your BIDS directory, and that will apply to all of the task-rest_bold files in that location:
The .json file should contain the following (adapted for your task name):
{
"TaskName": "rest"
}
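Creating that file from the command line (a sketch; the python3 call is only there to confirm the JSON parses before you re-run the validator):

```shell
# Create the dataset-level sidecar; this one file covers every
# task-rest_bold image in the dataset. Run from the top-level BIDS directory.
printf '{\n  "TaskName": "rest"\n}\n' > task-rest_bold.json

# Optional: confirm it is valid JSON.
python3 -m json.tool task-rest_bold.json
```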
$ fmriprep-docker /Users/sarenseeley/Desktop/test/ /Users/sarenseeley/Desktop/test/derivatives participant
Warning: <8GB of RAM is available within your Docker environment.
Some parts of fMRIPrep may fail to complete.
Continue anyway? [y/N]
2GB is the default RAM available in Docker for Mac, at least for my installation. fMRIPrep needs at least 8GB to run (more is better), even if you use the --low-mem flag to reduce memory usage.
While fmriprep-docker is running, you might see something like this in the command window:
181026-21:01:38,622 nipype.interface INFO:
stderr 2018-10-26T21:01:38.622139:*+ WARNING: If you are performing spatial transformations on an oblique dset,
181026-21:01:38,623 nipype.interface INFO:
stderr 2018-10-26T21:01:38.622139: such as /tmp/work/fmriprep_wf/single_subject_120_wf/func_preproc_ses_txB_task_rest_wf/bold_reference_wf/gen_ref/slice.nii.gz,
181026-21:01:38,624 nipype.interface INFO:
stderr 2018-10-26T21:01:38.622139: or viewing/combining it with volumes of differing obliquity,
181026-21:01:38,625 nipype.interface INFO:
stderr 2018-10-26T21:01:38.622139: you should consider running:
181026-21:01:38,625 nipype.interface INFO:
stderr 2018-10-26T21:01:38.622139: 3dWarp -deoblique
181026-21:01:38,626 nipype.interface INFO:
stderr 2018-10-26T21:01:38.622139: on this and other oblique datasets in the same session.
181026-21:01:38,627 nipype.interface INFO:
stderr 2018-10-26T21:01:38.622139: See 3dWarp -help for details.
181026-21:01:38,628 nipype.interface INFO:
stderr 2018-10-26T21:01:38.622139:++ Oblique dataset:/tmp/work/fmriprep_wf/single_subject_120_wf/func_preproc_ses_txB_task_rest_wf/bold_reference_wf/gen_ref/slice.nii.gz is 13.716262 degrees from plumb.
It seems that this warning can be safely ignored.
If fmriprep can’t find the license file, this error message appears:
RuntimeError: ERROR: a valid license file is required for FreeSurfer to run. FMRIPREP looked for an existing license file at several paths, in this order: 1) command line argument ``--fs-license-file``; 2) ``$FS_LICENSE`` environment variable; and 3) the ``$FREESURFER_HOME/license.txt`` path. Get it (for free) by registering at https://surfer.nmr.mgh.harvard.edu/registration.html
See the section on installing fMRIPrep. Note that if you get an error message saying that you need an updated license file, you actually may not (especially if you just downloaded one). I encountered that issue when fMRIPrep was still having trouble finding the license.
This means that pip installed the Docker wrapper somewhere obscure. Find out where pip installed it, and add that path to your global environment as described here.
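As a sanity check, here's a Python sketch of the same idea: an executable isn't found until its directory is on PATH. The stub script is purely hypothetical, standing in for wherever pip actually put fmriprep-docker:

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path

# Simulate pip installing a script into an out-of-the-way bin directory
# (this stub is hypothetical; find your real path with `pip show -f fmriprep-docker`)
bin_dir = Path(tempfile.mkdtemp())
script = bin_dir / "fmriprep-docker"
script.write_text("#!/bin/sh\necho hello\n")
script.chmod(script.stat().st_mode | stat.S_IEXEC)

# Not found yet (unless a real copy is already installed)...
print(shutil.which("fmriprep-docker"))

# ...then prepend the directory to PATH; the shell equivalent is
# `export PATH="$HOME/.local/bin:$PATH"` in your ~/.bash_profile
os.environ["PATH"] = f"{bin_dir}{os.pathsep}{os.environ['PATH']}"
print(shutil.which("fmriprep-docker"))  # -> the stub's full path
```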
This is a memory allocation issue.
From the fMRIPrep documentation:
When running on Linux platforms (or containerized environments, because they are built around Ubuntu), there is a Python bug that affects fMRIPrep that drives the Linux kernel to kill processes as a response to running out of memory. Depending on the process killed by the kernel, fMRIPrep may crash with a BrokenProcessPool error or hang indefinitely, depending on settings. While we are working on finding a solution that does not run up against this bug, this may take some time. This can be most easily resolved by allocating more memory to the process, if possible.
Additionally, consider using the --low-mem flag, which will make some memory optimizations at the cost of disk space in the working directory.
Making this its own section since I’ve had lots of questions on this topic.
See also: https://fmriprep.readthedocs.io/en/latest/workflows.html#confounds-estimation
As described by Chris Markiewicz here, ICA-AROMA has two denoising strategies: aggressive and non-aggressive.
When you pass the --use-aroma flag, fmriprep performs non-aggressive denoising on the preprocessed data as a last step and generates those non-aggressively denoised images. Even with the flag, fmriprep will still produce the regular (non-denoised) preprocessed files (*_bold_space-MNI152NLin2009cAsym_preproc.nii) in addition to the denoised ones (*smoothAROMAnonaggr_preproc.nii.gz).
For more details and a simulation comparing the two strategies, see Chris’s notebook here. This is a really helpful resource for understanding what the strategies do and how they differ.
Note again that the ICA-AROMA algorithm is trained to pick up motion artifacts specifically, so it won’t take care of any physiological denoising that needs to occur. For this, you may want to use the “aCompCor” regressors. Some people suggest that you also use the cosine regressors in the confounds.tsv file if using “aCompCor” (or even if not, since they perform high-pass filtering).
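To illustrate, here's a minimal Python sketch of pulling the aCompCor and cosine columns out of a confounds file. The column names below follow the older fMRIPrep naming scheme; newer releases use e.g. a_comp_cor_00 and cosine00, so check your own file's header:

```python
import csv
import io

# A tiny mock confounds.tsv; real fMRIPrep output has many more columns,
# and column names differ across versions
tsv = """aCompCor00\taCompCor01\tCosine00\tWhiteMatter
0.1\t0.0\t1.0\t510.2
-0.2\t0.1\t0.0\t511.0
"""

rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))

# Keep the aCompCor components plus the cosine (high-pass) regressors
keep = [c for c in rows[0] if c.startswith(("aCompCor", "Cosine"))]
nuisance = [[float(r[c]) for c in keep] for r in rows]
print(keep)      # -> ['aCompCor00', 'aCompCor01', 'Cosine00']
print(nuisance)  # -> [[0.1, 0.0, 1.0], [-0.2, 0.1, 0.0]]
```

The resulting nuisance matrix is what you would hand to your GLM or regression tool of choice.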
A: Yes, the preprocessed images (*_bold_space-MNI152NLin2009cAsym_preproc.nii) are motion-corrected, based on the following parts of the documentation at https://fmriprep.readthedocs.io/en/stable/workflows.html:
Using the previously estimated reference scan, FSL mcflirt is used to estimate head-motion. As a result, one rigid-body transform with respect to the reference image is written for each BOLD time-step. Additionally, a list of 6-parameters (three rotations, three translations) per time-step is written and fed to the confounds workflow. For a more accurate estimation of head-motion, we calculate its parameters before any time-domain filtering (i.e. slice-timing correction), as recommended in [Power2017].
Given a motion-corrected fMRI [my note: so these motion-corrected images must have been generated at some point, if they’re required as inputs for this step], a brain mask, mcflirt movement parameters and a segmentation, the discover_wf sub-workflow calculates potential confounds per volume. Calculated confounds include the mean global signal, mean tissue class signal, tCompCor, aCompCor, Frame-wise Displacement, 6 motion parameters, DVARS, and, if the --use-aroma flag is enabled, the noise components identified by ICA-AROMA (those to be removed by the “aggressive” denoising strategy).
A: If you used the --use-aroma flag, confounds are calculated before ICA-AROMA is applied to the preprocessed *_bold_space-MNI152NLin2009cAsym_preproc.nii files, as confirmed here: https://neurostars.org/t/fmriprep-confounds-extracted-before-or-after-ica-aroma/4077.
James Kent’s simulation suggests a slight benefit to extracting confounds prior to ICA-AROMA. See also the discussions here and here.
When using AROMA-denoised data (*_bold_space-MNI152NLin2009cAsym_variant-smoothAROMAnonaggr_preproc.nii files), you would likely NOT want to regress out motion-related variables from confounds.tsv, as this may reintroduce motion. However, the ICA-AROMA algorithm is fairly specific to motion-related artifacts and will not address physiological artifacts such as those from CSF and blood flow. (For what it’s worth, I’ve seen this in my data when using the GIFT toolbox to run a subsequent ICA on both the non-AROMA and AROMA-denoised data from fmriprep. ICA-AROMA handles motion beautifully, as evidenced by the number of motion-related components being vastly reduced. However, I see basically the same physiological components either way.)
CSF and WM are the “CSF” and “WhiteMatter” columns in confounds.tsv.
If you’re using the non-aggressively denoised AROMA files (*_bold_space-MNI152NLin2009cAsym_variant-smoothAROMAnonaggr_preproc.nii), you do NOT need to use the confounds “X”, “Y”, “Z”, “RotX”, “RotY”, “RotZ” as you would with the regular preprocessed files.
Discussion of best practices for ICA-AROMA and fMRIPrep
It is up to you as the researcher. fMRIPrep gives you lots of options but as part of its “minimal preprocessing” philosophy, does not dictate which you should or shouldn’t use.
See this lengthy discussion: https://neurostars.org/t/confounds-from-fmriprep-which-one-would-you-use-for-glm/326 (the six motion parameters in the confounds file are “X”, “Y”, “Z”, “RotX”, “RotY”, “RotZ”).
Suggestions for how to select number of aCompCor components: https://neurostars.org/t/confounds-from-fmriprep-which-one-would-you-use-for-glm/326/35
See also the new tools fMRIdenoise and Denoiser: A nuisance regression tool for fMRI BOLD data.
Various things I and/or others using these tools at UA have discovered along the way.
In some cases, you may want to go back to using a previous version of fmriprep (for example, you upgrade, then realize you need to go back and reprocess some participants you did earlier). In this case, you can run from Terminal:
$ pip install fmriprep-docker==<version number, e.g. 1.1.8>
I initially got this error: [Errno 13] Permission denied: '/Library/Python/2.7/site-packages/fmriprep_docker.py'
As it says on the label, this is a permissions issue. Use $ login <yourusername> to switch to a user with admin privileges, then run sudo pip install fmriprep-docker==1.1.8.
Alternatively, if you don’t have root access on the server, you can try appending the --user flag to the pip command to install the package in your local home directory:
$ pip install fmriprep-docker==1.1.8 --user
You might get the following warning:
The script fmriprep-docker is installed in ‘/Users/rblair/.local/bin’ which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
If this doesn’t work, you may need to set up a virtual environment using a program like conda or virtualenv.
If you get this error, you could try adding $HOME/.local/bin to your PATH variable, or call it directly with $ $HOME/.local/bin/fmriprep-docker
See https://neurostars.org/t/how-to-use-earlier-version-of-fmriprep-docker-wrapper/3645
See this article comparing denoising approaches for multi-echo data: https://doi.org/10.1371/journal.pone.0173289
Example: You want to use the T1w scan from session 2 for both sessions, since there was too much movement in the one from session 1.
See https://neurostars.org/t/multiple-scan-sessions-some-bad-anatomicals/3851
Another option is to copy the “good” image from session 2 into the session 1 anat folder, making sure to rename it as session 1. This isn’t optimal, IMO, because it would be nice to make it obvious that this image is from a different session, but that’s how I made it work. You could note the substitution in the metadata associated with the dataset (i.e., the README file).
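Here's a minimal Python sketch of that copy-and-rename workaround, using a temporary directory and fake files in place of a real dataset:

```python
import shutil
import tempfile
from pathlib import Path

# Stand-in anat folders; in practice these live under your real BIDS root
root = Path(tempfile.mkdtemp())
ses1 = root / "sub-101" / "ses-txA" / "anat"
ses2 = root / "sub-101" / "ses-txB" / "anat"
ses1.mkdir(parents=True)
ses2.mkdir(parents=True)

good = ses2 / "sub-101_ses-txB_T1w.nii.gz"
good.write_bytes(b"fake nifti")  # placeholder for the real scan

# Copy the good session-2 T1w into session 1, renamed for session 1
# (and note the substitution in your README)
target = ses1 / "sub-101_ses-txA_T1w.nii.gz"
shutil.copy2(good, target)
print(target.exists())  # -> True
```

In practice you'd also copy any accompanying .json sidecar the same way.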
From https://fmriprep.readthedocs.io/en/stable/workflows.html#surface-preprocessing:
Surface processing will be skipped if the outputs already exist. In order to bypass reconstruction in fmriprep, place existing reconstructed subjects in < output dir >/freesurfer prior to the run. fmriprep will perform any missing recon-all steps, but will not perform any steps whose outputs already exist.
Indeed, this works well (when data in the Freesurfer folder are organized in a BIDS-compliant manner).
See this thread I started on NeuroStars: https://neurostars.org/t/fmriprep-weird-epi-t1-registration-artifact-problems/3811/5. Summary: it’s not an artifact in your data; it’s an artifact of the interpolation method used to visualize the registration between the two images in the reports. Check the registration in SPM (or another program), and if it looks good there, then it’s good.
Example: I ran fmriprep-docker on sub-138. Reports showed that registration failed (because, for some random reason, the files in their T1w folder from one of the sessions were…scans of a phantom). I swapped out those images for the correct ones and re-ran. Even after deleting the local scratch directory that I was using, fMRIPrep was still “collecting precomputed outputs” from the initial run and generating the same results in both the reports and images. I was ultimately successful after I replaced sub-138 anywhere in the file or folder names with sub-138b, reran fmriprep-docker, and then changed the names back to sub-138. I need to ask on NeuroStars if there’s a better way to do this.
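A rough Python sketch of that rename trick, run here on a toy directory tree (the helper function is mine, not part of any BIDS tooling):

```python
import tempfile
from pathlib import Path

# Toy directory tree standing in for the BIDS dataset
root = Path(tempfile.mkdtemp())
(root / "sub-138" / "ses-txA" / "anat").mkdir(parents=True)
(root / "sub-138" / "ses-txA" / "anat" / "sub-138_ses-txA_T1w.nii.gz").write_bytes(b"")

def rename_subject(root, old, new):
    # Walk deepest-first so files are renamed before their parent folders
    for path in sorted(root.rglob(f"*{old}*"),
                       key=lambda p: len(p.parts), reverse=True):
        path.rename(path.with_name(path.name.replace(old, new)))

rename_subject(root, "sub-138", "sub-138b")
print([p.name for p in root.iterdir()])  # -> ['sub-138b']
```

Run it once to rename, rerun fmriprep-docker, then call it again with the arguments swapped to change the names back.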
Thanks to Dianne Patterson, Ramsey Wilcox, Jessica Andrews-Hanna, Aneta Kielar, others in the UA Brain Mapping Workgroup, and the Neurostars community for contributing questions and answers to this document.
Major thanks also to the teams behind the BIDS initiative, fMRIPrep, and MRIQC.