Prepare input BIDS dataset(s) as DataLad dataset(s)

You can provide one or more BIDS datasets as input. Each input can be either a raw BIDS dataset or a zipped BIDS derivatives dataset (e.g., zipped results from another BABS project). BABS is compatible with both single-session datasets and multi-session datasets (more than one session per subject).

In either case, each BIDS dataset should be provided as a DataLad dataset, which is version-tracked by DataLad and can be cloned to another directory. A DataLad dataset can be created with the DataLad command datalad create. For more details, please refer to its documentation and the DataLad Handbook.

Raw (unzipped) BIDS dataset, or zipped BIDS dataset?

If the dataset contains zip files, like sub-*.zip, it is considered a zipped BIDS derivatives dataset.
If the dataset contains only unzipped folders in the BIDS file/directory structure, like folder sub-*/, it is considered a raw (unzipped) BIDS dataset.
If both zipped files sub-*.zip and unzipped folders sub-* are present, it is considered a zipped BIDS derivatives dataset.
- Therefore, if you have a raw BIDS dataset, please do not include zip files named sub-*.zip in this dataset.

When running babs init, you will see printed messages describing whether each input BIDS dataset was categorized as a raw (unzipped) dataset or a zipped dataset.
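The categorization rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the code BABS itself runs; the function name categorize_input_dataset is hypothetical.

```python
from pathlib import Path


def categorize_input_dataset(dataset_dir):
    """Return "zipped" or "raw" for a dataset folder, following the rules above.

    Hypothetical helper for illustration; not part of the BABS API.
    """
    root = Path(dataset_dir)
    # Only names are checked for zip files: in a DataLad dataset an annexed
    # zip file may be a symlink whose content has not been retrieved yet.
    has_zips = any(True for _ in root.glob("sub-*.zip"))
    has_folders = any(p.is_dir() for p in root.glob("sub-*"))
    if has_zips:
        # Zip files take precedence, even if sub-*/ folders are also present.
        return "zipped"
    if has_folders:
        return "raw"
    raise ValueError(f"no sub-*.zip files or sub-*/ folders found in {root}")
```

Running such a check on a cloned input dataset before babs init is one way to catch stray sub-*.zip files in a dataset that is meant to be raw.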

Requirements for a zipped BIDS derivatives input dataset 

There are several requirements for a zipped BIDS derivatives dataset:

Note: an input dataset's name is defined when you run babs init --name.

Naming zip files 

For a single-session dataset, the zip filename should follow the pattern sub-*_<name>*.zip, where <name> is the name of this input dataset.
- Here, *ses-* is allowed in the zip filenames.
Similarly, for a multi-session dataset, the zip filename should follow the pattern sub-*_ses-*_<name>*.zip.
In this dataset, for each subject/session pair, there should be only one zip file whose filename contains the input dataset's name.
- For example, say we have sub-01_ses-A_freesurfer-20-2-3.zip, where freesurfer is the input dataset's name. There should not be another zip file containing freesurfer for this session, e.g., sub-01_ses-A_freesurfer-xxx.zip.
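As a rough way to check these naming rules before running babs init, the sketch below matches filenames against the two patterns and reports subject/session prefixes that have more than one matching zip file. The helper names follows_pattern and find_duplicates are hypothetical, and the sketch assumes the dataset name contains no glob special characters.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def follows_pattern(filename, name, multi_session=False):
    """Check a zip filename against the pattern for this input dataset name."""
    if multi_session:
        pattern = f"sub-*_ses-*_{name}*.zip"
    else:
        pattern = f"sub-*_{name}*.zip"
    return fnmatchcase(filename, pattern)


def find_duplicates(filenames, name, multi_session=False):
    """Return prefixes (subject or subject/session) with more than one zip file."""
    groups = defaultdict(list)
    for filename in sorted(filenames):
        if follows_pattern(filename, name, multi_session):
            # Everything before "_<name>" identifies the subject (and session).
            prefix = filename.split(f"_{name}", 1)[0]
            groups[prefix].append(filename)
    return {prefix: files for prefix, files in groups.items() if len(files) > 1}
```

For the example above, passing both sub-01_ses-A_freesurfer-20-2-3.zip and sub-01_ses-A_freesurfer-xxx.zip with the name freesurfer would report the prefix sub-01_ses-A as having two zip files.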

Content of the zip files 

Within the zip file of a specific subject (or session), there should be a folder named after this input dataset, e.g., a folder freesurfer inside sub-01_ses-A_freesurfer-20-2-3.zip in the input dataset freesurfer.
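One way to verify this requirement without unzipping is to list the archive members with Python's standard zipfile module. The sketch below assumes the folder is expected at the top level of the archive; the function name has_dataset_folder is hypothetical.

```python
import zipfile


def has_dataset_folder(zip_path, name):
    """Return True if the zip file has a top-level folder called `name`."""
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            # Member names in a zip archive always use "/" as the separator.
            top, sep, _rest = member.partition("/")
            if top == name and sep:
                return True
    return False
```

In a DataLad dataset, the zip file's content has to be present locally (e.g., after datalad get) before it can be opened this way.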

For more explanations and examples, please refer to "See also" below.

Using results from another BABS project as an input BIDS dataset 

If you want to use zipped results from another BABS project ("BABS project A") as an input dataset for a new BABS project ("BABS project B"), you may follow these steps:

Test the path you plan to use. This step is optional but highly recommended, as it makes sure that the input dataset path you will provide is correct. To do so, try cloning the results from the output RIA of BABS project A:
- If BABS project A is on a local file system that is accessible from the current directory, you may clone the results from its output RIA by:
  datalad clone ria+file:///absolute/path/to/my_BABS_project_A/output_ria#~data
- For more details and/or other RIA scenarios, please refer to datalad clone's documentation and the DataLad Handbook section about cloning from RIA stores.
If you successfully cloned the results, the path you used is correct. You can go ahead and use this path as the input dataset path for generating BABS project B.
- Please make sure you use the entire string after datalad clone as the input dataset path. For the example above, this path is:
  ria+file:///absolute/path/to/my_BABS_project_A/output_ria#~data
You may then remove the cloned results of BABS project A (from the first step):
```
datalad remove -d <cloned_results_of_BABS_project_A>
```
Warning: Please refer to the docs listed below for detailed requirements before you run babs init:
- How do I define the input dataset's name <name> in babs init --datasets?: for restrictions on naming a zipped dataset as input.
- Requirements for a zipped BIDS derivatives input dataset: for requirements on zip file naming and contents.
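Because the whole string, including the ria+file:// prefix and the #~data suffix, has to be passed as the input dataset path, it can help to build it programmatically. The sketch below covers only the local file system case from the steps above; the function name output_ria_url is hypothetical, and the output_ria folder name and data alias are taken from that example.

```python
from pathlib import PurePosixPath


def output_ria_url(project_root, alias="data"):
    """Build the RIA URL for the output RIA of a BABS project on a local disk."""
    root = PurePosixPath(project_root)
    if not root.is_absolute():
        raise ValueError("project_root must be an absolute path")
    # "ria+file://" followed by an absolute path yields three slashes in total.
    return f"ria+file://{root / 'output_ria'}#~{alias}"


print(output_ria_url("/absolute/path/to/my_BABS_project_A"))
```

The printed string can be used both with datalad clone when testing the path and as the input dataset path for BABS project B.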

Example input BIDS datasets for BABS 

Example input datasets available on OSF
	single-session data	multi-session data
raw BIDS data	https://osf.io/t8urc/	https://osf.io/w2nu3/
zipped BIDS derivatives from fMRIPrep	https://osf.io/2jvub/	https://osf.io/k9zw2/
zipped BIDS derivatives from QSIPrep	https://osf.io/8t9sf/	https://osf.io/d3js6/

Notes:

All images have been zeroed out.

To clone a dataset:

conda activate <datalad_env>
# Here, `<datalad_env>`: the conda environment where DataLad is installed

datalad clone https://osf.io/<id>/ <local_foldername>
# Please replace `<id>` and `<local_foldername>` accordingly
# e.g., `datalad clone https://osf.io/t8urc/ raw_BIDS_single-ses`
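
If you script the setup, the table above can be turned into a small lookup that builds the datalad clone command for a given example dataset. This is only a convenience sketch: the names EXAMPLE_DATASETS and clone_command, and the keys used for the dataset kinds, are hypothetical labels, while the URLs are the ones listed in the table.

```python
# OSF URLs of the example datasets, keyed by (kind, sessions).
EXAMPLE_DATASETS = {
    ("raw", "single-session"): "https://osf.io/t8urc/",
    ("raw", "multi-session"): "https://osf.io/w2nu3/",
    ("fmriprep", "single-session"): "https://osf.io/2jvub/",
    ("fmriprep", "multi-session"): "https://osf.io/k9zw2/",
    ("qsiprep", "single-session"): "https://osf.io/8t9sf/",
    ("qsiprep", "multi-session"): "https://osf.io/d3js6/",
}


def clone_command(kind, sessions, local_foldername):
    """Return the `datalad clone` command (as an argument list) for one dataset."""
    url = EXAMPLE_DATASETS[(kind, sessions)]
    return ["datalad", "clone", url, local_foldername]


print(" ".join(clone_command("raw", "single-session", "raw_BIDS_single-ses")))
```

The returned list can be passed to subprocess.run(..., check=True) from within the environment where DataLad is installed.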