Working closely together from far away - the Nipoppy and Neurobagel tools for decentralized data harmonization and discovery
Saturday, Jun 28: 9:00 AM - 10:15 AM
Symposium
Brisbane Convention & Exhibition Centre
Room: Great Hall (Mezzanine Level) Doors 5, 6 & 7
To create large, globally representative neuroimaging datasets for biomarker discovery, we have to collaborate with and integrate data from institutes and data platforms across the world. But it is not only the sample size that grows with increased collaboration: working with distributed sites that often have heterogeneous data practices and limited resources for harmonization can consume considerable time and human effort, ultimately impeding research progress. The increasing international adoption of strong data privacy frameworks creates further uncertainty and inconsistency about what sites can share with each other. Together, these challenges create a need for easy-to-use neuroinformatics tools that help establish consistent data workflows at each site and enable an efficient, decentralized way of working together.

In my talk I will focus on two projects I am involved in, each of which addresses one of these challenges. The Neurobagel project is an ecosystem for answering federated cohort discovery questions across distributed data sites, such as "How many participants across our network fit our cohort inclusion criteria and have been processed with FreeSurfer 7?" It is built around the idea that data remain under the control of the collecting institution but are made discoverable by harmonizing them through annotation with standardized terminology from existing FAIR vocabularies.

The Nipoppy project is a lightweight framework for standardizing the curation and processing of an individual dataset. Its aim is to reduce the time and effort needed to adopt a new processing protocol, or to reprocess data with an upgraded version of a pipeline, consistently across many sites. Nipoppy builds on existing standards and tools for reproducible processing, and maintains a collection of automatic extraction tools for data availability and imaging-derived phenotypes for existing pipelines.

I will present a practical case study of how these tools are integrated into existing collaborations, outline plans for their ongoing development, and discuss both the successes and the social challenges that come with adopting these standards-based, decentralized models of collaboration.
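As an illustration of the kind of federated query Neurobagel is designed to answer, the sketch below shows roughly how a researcher's script might pose the example question above to a federation endpoint. The endpoint URL, parameter names, and vocabulary term are placeholders introduced for illustration only and are not the authoritative Neurobagel API; the project's documentation describes the actual query interface.

```python
# Minimal sketch of a federated cohort query (illustrative only).
# The endpoint URL, parameter names, and vocabulary term below are
# placeholder assumptions, not the authoritative Neurobagel API.
import requests

FEDERATION_URL = "https://federation.example.org/query"  # placeholder federation endpoint

params = {
    "diagnosis": "vocab:ExampleDiagnosisTerm",  # placeholder standardized-vocabulary term
    "min_age": 18,
    "max_age": 80,
    "pipeline_name": "freesurfer",   # assumed filter for processing pipeline
    "pipeline_version": "7",         # assumed filter for pipeline version
}

# Each participating node answers with harmonized metadata (e.g. matching
# participant counts), while the underlying data stay at the collecting site.
response = requests.get(FEDERATION_URL, params=params, timeout=30)
response.raise_for_status()
print(response.json())
```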