How C-PAC NodeBlocks and resource pool enable modular testing and cross-pipeline compatibility

Poster No:

2271 

Submission Type:

Abstract Submission 

Authors:

Elizabeth Kenneally1, Steven Giavasis2, Jon Clucas2, Michael Milham2, Gregory Kiar3

Institutions:

1Child Mind Institute, New York, NY - New York, 2Child Mind Institute, New York, NY, 3Child Mind Institute, Montreal, Quebec

First Author:

Elizabeth Kenneally  
Child Mind Institute
New York, NY - New York

Co-Author(s):

Steven Giavasis  
Child Mind Institute
New York, NY
Jon Clucas  
Child Mind Institute
New York, NY
Michael Milham  
Child Mind Institute
New York, NY
Gregory Kiar  
Child Mind Institute
Montreal, Quebec

Introduction:

C-PAC is a pipeline builder for structural and functional MRI preprocessing and denoising that sits between researchers and core tool developers. We developed an engine built on NodeBlocks, an internal abstraction that wraps Nipype [1] commands into interchangeable pipeline steps. A common challenge with highly configurable software is testing tool inter-compatibility and validating results, since the number of possible combinations is large. As tools are added and possible pipelines grow longer, exhaustive testing becomes computationally intractable. Instead, we need to test individual components so that their composition results in high-quality pipelines. Because exhaustive integration testing is infeasible, we turn to robust unit testing.
In this abstract, we focus on how the NodeBlock abstraction developed for C-PAC supports component-wise unit testing, and on our proposal for a streamlined testing workflow to ensure pipeline quality as we issue new releases or integrate new algorithms. We also highlight the ways that researchers can compose pipelines across tools commonly used in the field, in particular how they can ingress data derivatives from fMRIPrep [2] and FreeSurfer [3].

Methods:

Nodes in Nipype are objects that contain and execute one specific function. NodeBlocks are defined and implemented in C-PAC as groupings of Nipype nodes that together comprise a pipeline step. By wrapping Nipype nodes into intuitive chunks, C-PAC NodeBlocks provide an interface between Nipype nodes and the processing pipeline as a whole, allowing users to create pipelines without having to interact with Nipype or the C-PAC engine directly. Users edit the pipeline configuration file to switch on desired processing steps, each of which connects to a NodeBlock that encompasses all of the relevant processes. Because the C-PAC NodeBlocks are modular, the inputs and outputs of each NodeBlock remain identical regardless of internal modifications. This lends itself well to a NodeBlock testing suite built on mock data, in which each pipeline step can be run and tested in isolation.
In addition to the NodeBlock architecture, C-PAC relies on a resource pool design in which data are injected into the pool when they are initially loaded, when they are produced via processing, or when they are pulled from another source. In the resource pool, each file has a strict definition, and NodeBlocks use this information to map inputs and outputs across each node of the pipeline.
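The NodeBlock and resource pool design described above can be illustrated with a minimal sketch. This is a toy model in plain Python, not C-PAC's implementation: the names `ResourcePool`, `NodeBlock`, `mask_by_threshold`, and the resource keys are hypothetical, and the real engine wraps Nipype nodes rather than bare functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResourcePool:
    """Toy resource pool: maps a resource name to data and records its source."""
    resources: Dict[str, object] = field(default_factory=dict)
    provenance: Dict[str, str] = field(default_factory=dict)

    def add(self, name: str, value: object, source: str) -> None:
        self.resources[name] = value
        self.provenance[name] = source

    def get(self, name: str) -> object:
        if name not in self.resources:
            raise KeyError(f"resource '{name}' is not in the pool")
        return self.resources[name]


@dataclass
class NodeBlock:
    """Toy pipeline step with a fixed input/output contract (hypothetical)."""
    name: str
    inputs: List[str]
    outputs: List[str]
    func: Callable[..., Dict[str, object]]

    def run(self, pool: ResourcePool) -> None:
        # Pull declared inputs from the pool, run the step, push declared outputs.
        kwargs = {key: pool.get(key) for key in self.inputs}
        results = self.func(**kwargs)
        missing = set(self.outputs) - set(results)
        if missing:
            raise ValueError(f"{self.name} did not produce {sorted(missing)}")
        for key in self.outputs:
            pool.add(key, results[key], source=self.name)


def mask_by_threshold(T1w):
    """Stand-in for a real brain extraction tool: keep values above zero."""
    mask = [1 if value > 0 else 0 for value in T1w]
    return {"brain_mask": mask, "brain_T1w": [v * m for v, m in zip(T1w, mask)]}


pool = ResourcePool()
pool.add("T1w", [0, 5, 7, 0], source="ingress")  # data loaded at the start of a run
block = NodeBlock("brain_extraction", ["T1w"], ["brain_mask", "brain_T1w"], mask_by_threshold)
block.run(pool)
print(pool.get("brain_mask"))  # [0, 1, 1, 0]
```

In this sketch the internals of `mask_by_threshold` could be swapped for another method without touching the rest of the pipeline, as long as the declared inputs and outputs stay the same.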

Results:

With the introduction of the NodeBlock infrastructure, it is now possible to set up piecewise testing. If a resource pool exists from a prior run, developers can run and test a single NodeBlock in isolation.
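A minimal sketch of such an isolated test follows. It assumes, purely for illustration, that a prior run's resource pool has been saved as a JSON file and that a pipeline step is a function with declared inputs and outputs; `load_pool`, `run_step`, `demean`, and the JSON layout are hypothetical and do not reflect C-PAC's actual test suite.

```python
import json
import tempfile
from pathlib import Path


def load_pool(path):
    """Load a resource pool saved by a prior run (hypothetical JSON layout)."""
    return json.loads(Path(path).read_text())


def run_step(step, pool):
    """Run one step against the pool and return only its declared outputs."""
    missing = [key for key in step["inputs"] if key not in pool]
    if missing:
        raise KeyError(f"pool is missing inputs: {missing}")
    results = step["func"](*(pool[key] for key in step["inputs"]))
    return {key: results[key] for key in step["outputs"]}


def demean(bold):
    """Stand-in processing step: subtract the mean from a time series."""
    mean = sum(bold) / len(bold)
    return {"demeaned_bold": [value - mean for value in bold]}


def test_demean_step_in_isolation():
    # Mock data plays the role of a resource pool left by an earlier run.
    with tempfile.TemporaryDirectory() as tmp:
        pool_file = Path(tmp) / "pool.json"
        pool_file.write_text(json.dumps({"bold": [1.0, 2.0, 3.0]}))
        pool = load_pool(pool_file)

    step = {"inputs": ["bold"], "outputs": ["demeaned_bold"], "func": demean}
    out = run_step(step, pool)

    # The contract (output names) and the values are both checked.
    assert set(out) == {"demeaned_bold"}
    assert out["demeaned_bold"] == [-1.0, 0.0, 1.0]


test_demean_step_in_isolation()
```

Because the test depends only on the declared inputs and outputs, the same test can be rerun unchanged after the internals of the step are modified.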
C-PAC gives users the opportunity to run community-standard processing pipelines, such as the DCAN Labs ABCD-HCP pipeline [6] and fMRIPrep. These pre-built pipelines are modeled after their respective reference pipelines and maintained to produce results that are maximally similar to them. Users can also build pipelines entirely from scratch. The modular NodeBlock structure allows for this seamless mixing and matching of tools and preprocessing steps. In addition, users can now import non-C-PAC output directories to carry out further processing and calculate derivatives in C-PAC. Currently, C-PAC is compatible with fMRIPrep and FreeSurfer output directories, and this functionality will be expanded to accept any BIDS-compliant directory [4-5]. In an effort to maximally standardize the pipelines, we also now have an option to preprocess data in C-PAC and then run them through FreeSurfer according to the ABCD-HCP pipeline.
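As a rough illustration of ingressing an external output directory, the sketch below parses BIDS-style derivative filenames into key-value entities and registers each file in a dictionary standing in for the resource pool. The key scheme (the `space` and `desc` entities plus the suffix) and the function names are assumptions made for this example; how C-PAC actually maps fMRIPrep and FreeSurfer outputs is not described here.

```python
from pathlib import Path


def parse_derivative_name(filename):
    """Split a BIDS-style filename into (entities, suffix, extension)."""
    name = Path(filename).name
    stem, _, extension = name.partition(".")
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, extension


def ingress(filenames):
    """Register files under a key built from selected entities (hypothetical scheme)."""
    pool = {}
    for filename in filenames:
        entities, suffix, _ = parse_derivative_name(filename)
        parts = [f"{key}-{entities[key]}" for key in ("space", "desc") if key in entities]
        pool["_".join(parts + [suffix])] = filename
    return pool


files = [
    "sub-01/func/sub-01_task-rest_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz",
    "sub-01/anat/sub-01_desc-brain_mask.nii.gz",
]
pool = ingress(files)
for key in sorted(pool):
    print(key, "<-", pool[key])
```

Downstream steps could then request resources by key without knowing which tool produced the file.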

Conclusions:

C-PAC is a highly configurable and flexible tool that supports researchers not only in processing their data end to end, but also in extending the processing performed on other platforms. This configurability is due in large part to the NodeBlock infrastructure, and a testing suite for individual NodeBlocks further supports and demonstrates C-PAC's stability and utility.

Modeling and Analysis Methods:

fMRI Connectivity and Network Modeling
Task-Independent and Resting-State Analysis 2

Neuroinformatics and Data Sharing:

Workflows 1

Keywords:

Computational Neuroscience
Computing
Data analysis
Development
FUNCTIONAL MRI
Open-Source Code
Open-Source Software
Workflows

1|2 Indicates the priority used for review
Supporting Image: OHBM-gkedits1.png
 

References:

[1] Gorgolewski, Krzysztof, et al. "Nipype: a flexible, lightweight and extensible neuroimaging data processing framework in Python." Frontiers in Neuroinformatics 5 (2011): 13.
[2] Esteban, Oscar, et al. "fMRIPrep: a robust preprocessing pipeline for functional MRI." Nature Methods 16.1 (2019): 111-116.
[3] Fischl, Bruce. "FreeSurfer." NeuroImage 62.2 (2012): 774-781.
[4] Gorgolewski, Krzysztof J., et al. "The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments." Scientific Data 3.1 (2016): 1-9.
[5] Gorgolewski, Krzysztof J., et al. "BIDS apps: Improving ease of use, accessibility, and reproducibility of neuroimaging data analysis methods." PLoS Computational Biology 13.3 (2017): e1005209.
[6] Feczko, Eric, et al. "Adolescent Brain Cognitive Development (ABCD) community MRI collection and utilities." bioRxiv (2021): 2021-07.