babs init: Initialize a BABS project
Command-Line Arguments
Initialize a BABS project and bootstrap scripts that will be used later.
usage: babs init [-h] [--list_sub_file LIST_SUB_FILE] --container_ds
CONTAINER_DS --container_name CONTAINER_NAME
[--container_config CONTAINER_CONFIG] --processing_level
{subject,session} --queue {slurm} [--keep_if_failed]
PATH
Positional Arguments
- PATH
Absolute path to the directory where the BABS project will be located. This folder will be automatically created.
Named Arguments
- --list_sub_file, --list-sub-file
Path to the CSV file that lists the subjects (and sessions) to analyze. If there is no such file, do not specify this flag. Single-session data: a 'sub_id' column; multi-session data: 'sub_id' and 'ses_id' columns.
- --container_ds, --container-ds
Path to the container DataLad dataset
- --container_name, --container-name
The name of the BIDS App container, i.e., the <image NAME> used when running datalad containers-add <image NAME>. Importantly, this should include the BIDS App's name to make sure the bootstrap scripts are set up correctly. The version number should be added, too. babs init is not case sensitive to this --container_name. Example: toybidsapp-0-0-7 for toy BIDS App version 0.0.7.
- --container_config, --container-config
Path to a YAML file that contains the configurations of how to run the BIDS App container
- --processing_level, --processing-level
Possible choices: subject, session
Whether jobs should be run on a per-subject or per-session (within subject) basis.
- --queue
Possible choices: slurm
The name of the job scheduling queue that you will use.
- --keep_if_failed, --keep-if-failed
If babs init fails with an error, whether to keep the created BABS project. By default, you don't need to turn this option on. However, when babs init fails and you hope to use babs check-setup to diagnose, please turn it on to rerun babs init, then run babs check-setup. Please refer to the section below, 'What if babs init fails?', for details.
Default: False
Detailed description
How do I prepare the input dataset, container, and container's YAML file?
Please see the document 'Step I: Get prepared' for how to prepare these inputs.
How do I define the input dataset's name <name> in babs init --datasets?
General guideline: a string that you find informative.
Examples are BIDS, freesurfer.
Specific restrictions:
- If you have more than one input BIDS dataset (i.e., more than one --datasets), please make sure the <name> is different for each dataset;
- If an input BIDS dataset is a zipped dataset, i.e., files are zipped files, such as BIDS data derivatives from another BABS project:
  - You must name it with a pattern found in the zip filenames so that babs init knows which zip file you want to use for a subject or session. For example, one of your input datasets is BIDS derivatives of fMRIPrep, which includes zip files of sub-xx*_freesurfer*.zip and sub-xx*_fmriprep*.zip. If you'd like to feed the freesurfer results zip files into the current BABS project, then you should name this input dataset freesurfer. If you give it an arbitrary name like BIDS_derivatives, babs init will fail, because this is not a pattern found in these zip files.
  - In addition, the zip files named with such a pattern (e.g., *freesurfer*.zip) should include a folder with the same name (e.g., a folder called freesurfer).
  - For example, in multi-session, zipped fMRIPrep derivatives data (e.g., https://osf.io/k9zw2/):

sub-01_ses-A_freesurfer-20.2.3.zip
├── freesurfer
│   ├── fsaverage
│   └── sub-01
sub-01_ses-B_freesurfer-20.2.3.zip
├── freesurfer
│   ├── fsaverage
│   └── sub-02
etc
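The two naming restrictions for zipped datasets can be checked before running babs init. The following is a minimal sketch, not part of BABS: the helper name check_zipped_dataset is hypothetical, and it assumes the zip files sit at the top level of the dataset directory and that their content is available locally (e.g., already retrieved with datalad get).

```python
import zipfile
from pathlib import Path


def check_zipped_dataset(dataset_dir, name):
    """Return a list of problems found for a zipped input dataset called `name`.

    Hypothetical pre-flight helper; it only mirrors the two rules described
    in the text and is not how babs init itself validates the input.
    """
    problems = []
    # Rule 1: the dataset name must be a pattern found in the zip filenames.
    zips = sorted(Path(dataset_dir).glob(f"*{name}*.zip"))
    if not zips:
        problems.append(f"no zip file matches *{name}*.zip")
    # Rule 2: each matching zip must contain a folder with the same name.
    for zip_path in zips:
        with zipfile.ZipFile(zip_path) as zf:
            top_level = {entry.split("/")[0] for entry in zf.namelist()}
        if name not in top_level:
            problems.append(f"{zip_path.name} has no top-level folder '{name}'")
    return problems
```

An empty list means both rules hold for the chosen name. A name such as BIDS_derivatives would be reported as matching no zip file.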
How is the list of subjects (and sessions) determined?
A list of subjects (and sessions) will be determined when running babs init,
and will be saved in a CSV file named processing_inclusion.csv
(for both single-session and multi-session datasets),
located at /path/to/my_BABS_project/analysis/code.
To filter subjects and sessions, use babs init with --list-sub-file /path/to/subject/list/csv/file. Examples: Single-session example, Multi-session example.
See List of included subjects (and sessions) to process for how this list is determined.
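A file for --list-sub-file can be produced with any CSV tool. As a minimal sketch, the Python snippet below writes one using the column names given above ('sub_id', plus 'ses_id' for multi-session data). The function name write_subject_list, the output filename, and the example IDs are illustrative assumptions, not part of BABS.

```python
import csv


def write_subject_list(csv_path, rows, multi_session):
    """Write a subject list with the columns described for --list-sub-file."""
    # Single-session data needs only 'sub_id'; multi-session data also needs 'ses_id'.
    columns = ["sub_id", "ses_id"] if multi_session else ["sub_id"]
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        for row in rows:
            writer.writerow({col: row[col] for col in columns})


# Illustrative multi-session list; the IDs are placeholders.
write_subject_list(
    "subject_list.csv",
    [
        {"sub_id": "sub-01", "ses_id": "ses-A"},
        {"sub_id": "sub-01", "ses_id": "ses-B"},
    ],
    multi_session=True,
)
```

The resulting file can then be passed to babs init with --list-sub-file.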
What if babs init fails?
If babs init fails, by default it will remove ("clean up") the created, failed BABS project.
When this happens, if you hope to use babs check-setup to debug what's wrong, you'll notice that
the failed BABS project has been cleaned up and is not ready for babs check-setup. What you need
to do is as follows:
- Run babs init with --keep-if-failed turned on. In this way, the failed BABS project will be kept.
- Then you can run babs check-setup for diagnosis.
- After you know what's wrong, please remove the failed BABS project with the following commands:
cd <project_root>/analysis    # replace `<project_root>` with the path to your BABS project

# Remove input dataset(s) one by one:
datalad remove -d inputs/data/<input_ds_name>    # replace `<input_ds_name>` with each input dataset's name
# repeat above step until all input datasets have been removed.
# if above command leads to "drop impossible" due to modified content, add `--reckless modification` at the end

git annex dead here
datalad push --to input
datalad push --to output

cd ..
pwd      # this prints `<project_root>`; you can copy it in case you forgot
cd ..    # outside of `<project_root>`
rm -rf <project_root>
If you don't remove the failed BABS project, you cannot overwrite it by running babs init again.
Example commands
Example babs init command for toy BIDS App + multi-session data on
a SLURM cluster:
babs init \
--datasets BIDS=/path/to/BIDS_datalad_dataset \
--container_ds /path/to/toybidsapp-container \
--container_name toybidsapp-0-0-7 \
--container_config /path/to/container_toybidsapp.yaml \
--processing_level session \
--queue slurm \
/path/to/a/folder/holding/BABS/project/my_BABS_project
Example command if you have more than one input dataset, e.g., raw BIDS data and fMRIPrep results with FreeSurfer results ingressed. The 2nd dataset is also the result of another BABS project: a zipped dataset whose filenames follow the pattern 'sub-xx*_freesurfer*.zip'. Therefore, the 2nd input dataset should be named 'freesurfer', a keyword in the filenames:
babs init \
... \
--datasets \
BIDS=/path/to/BIDS_datalad_dataset \
freesurfer=/path/to/freesurfer_results_datalad_dataset \
...
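Because each --datasets entry must use a different <name>, a wrapper script can check the NAME=PATH pairs before calling babs init. This is a hedged sketch of such a check. The function parse_datasets is hypothetical and does not reflect how babs init parses its arguments internally.

```python
def parse_datasets(pairs):
    """Split NAME=PATH strings into a dict and reject repeated names.

    Hypothetical pre-flight check for the values passed to --datasets.
    """
    datasets = {}
    for pair in pairs:
        name, sep, path = pair.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected NAME=PATH, got {pair!r}")
        if name in datasets:
            raise ValueError(f"dataset name {name!r} is used more than once")
        datasets[name] = path
    return datasets


datasets = parse_datasets([
    "BIDS=/path/to/BIDS_datalad_dataset",
    "freesurfer=/path/to/freesurfer_results_datalad_dataset",
])
```

Passing the same name twice raises a ValueError before any project directory is created.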
Debugging
Error when cloning an input dataset
What happened: After babs init prints out a message like this:
Cloning input dataset #x: '/path/to/input_dataset', there was an error message that includes this information:
err: 'fatal: repository '/path/to/input_dataset' does not exist'.
Diagnosis: This means that the specified path to this input dataset (i.e., in --datasets) was not valid;
there is no DataLad dataset there.
How to solve the problem: Fix this path. To confirm the updated path is valid, you can try cloning
it to a temporary directory with datalad clone /updated/path/to/input_dataset. If it is successful,
you can go ahead and rerun babs init.
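A quick local check can catch an invalid path before any cloning is attempted. The sketch below is only a heuristic under stated assumptions: it treats a directory as a DataLad dataset when it is a Git repository (has .git) and carries a .datalad/config file. The function name looks_like_datalad_dataset is hypothetical, and a successful datalad clone remains the authoritative test.

```python
from pathlib import Path


def looks_like_datalad_dataset(path):
    """Heuristic: a Git repository that also carries a .datalad/config file."""
    root = Path(path)
    # .git may be a directory or a file (e.g., in a subdataset), so only test existence.
    return (root / ".git").exists() and (root / ".datalad" / "config").is_file()
```

If this returns False for a path given in --datasets, the "repository ... does not exist" error described above is a likely outcome.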