Impact of cross-site imbalance in ComBat-based harmonization methods

Poster No:

1499 

Submission Type:

Abstract Submission 

Authors:

Nicolás Nieto (1), Simon Eickhoff (1), Christian Jung (2), Martin Reuter (3), Kersten Diers (3), Malte Kelm (2), Artur Lichtenberg (2), Federico Raimondo (4), Kaustubh Patil (4)

Institutions:

(1) Research Centre Jülich, Jülich, NRW; (2) Heinrich-Heine University, Düsseldorf, NRW; (3) German Center for Neurodegenerative Diseases, Bonn, NRW; (4) Research Center Jülich, Jülich, NRW

Introduction:

Machine learning (ML) techniques have contributed substantially to advances in neuroscience. ML benefits from large datasets (DS), so integrating multiple DS is an attractive option. However, differences in data collection conditions, such as different scanners, introduce systematic unwanted variability known as effects of site (EoS). Harmonization techniques aim to remove EoS while retaining biological information, and ComBat-based methods such as neuroHarmonize [Pomponio et al. 2020] are widely used for Magnetic Resonance Imaging (MRI) data [Hu et al. 2023]. However, integrating them into ML workflows raises data leakage issues, particularly when target imbalance across sites creates site-target dependence, e.g., when patients and controls are collected at different sites. This study shows how site-target dependence and independence affect ComBat-based harmonization in age regression tasks using MRI data. We also introduce "PrettYharmonize", a harmonization method designed to prevent data leakage.

Methods:

Voxel-based morphometry was performed using CAT12.8 [Gaser et al. 2024] to obtain modulated gray matter (GM) volume, which was linearly resampled to 8 × 8 × 8 mm³ voxels, resulting in 3747 features. For site-target dependence, 118 images were sampled in disjoint age ranges with balanced sex representation from four DS: the Enhanced Nathan Kline Institute (eNKI) [Nooner et al. 2012], the Amsterdam Open MRI Collection (AOMIC-ID1000) [Snoek et al. 2021], 1000Brains [Caspers et al. 2014], and CamCAN [Krieger et al. 2017]. For site-target independence, eNKI (N=300), CamCAN (N=288), and the Southwest University Adult Lifespan Dataset (SALD, N=200) [Wei et al. 2018] were sampled in the range of 18 to 80 years, with equal sex representation.
Five harmonization schemes were tested. PrettYharmonize employed leakage-free harmonization using pretended labels and stacking. Whole Data Harmonization (WDH) applied harmonization to the pooled DS before cross-validation (CV). Test Target Leakage (TTL) retained target variance using test labels in a CV-consistent manner. WDH and TTL are leakage-prone schemes because they require test labels. The No Target (NT) scheme does not explicitly preserve target variance and does not need test labels. Lastly, the Unharmonized scheme applied no harmonization.
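For reference, the difference between schemes that preserve target variance and the NT scheme can be read off the standard ComBat model. The abstract does not restate this model, so the notation below is an assumption taken from the usual formulation: feature v of participant j at site i has a feature-wise intercept, a covariate term, and additive and multiplicative site effects.

```latex
y_{ijv} = \alpha_v + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}_v + \gamma_{iv} + \delta_{iv}\,\varepsilon_{ijv},
\qquad
y_{ijv}^{\mathrm{harm}} = \frac{y_{ijv} - \hat{\alpha}_v - \mathbf{x}_{ij}^{\top}\hat{\boldsymbol{\beta}}_v - \hat{\gamma}_{iv}}{\hat{\delta}_{iv}} + \hat{\alpha}_v + \mathbf{x}_{ij}^{\top}\hat{\boldsymbol{\beta}}_v
```

Under this reading, preserving target variance means including the target (here, age) in the covariate vector x, which requires the age of every participant being harmonized, test participants included. Omitting the target, as in NT, lets any between-site age difference be absorbed into the site term γ and removed with it.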
Relevance Vector Regression with a linear kernel served as the ML model. A 5-fold CV scheme was used and the Mean Absolute Error (MAE) was calculated on the test sets.
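The CV-consistent placement of harmonization can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: per-site mean centering stands in for ComBat (no scale term, no empirical Bayes, no preserved covariates, so it is closest to the NT scheme), a one-feature least-squares line stands in for Relevance Vector Regression, and all function names and the simulated data are hypothetical.

```python
import random
from statistics import mean


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split into k nearly equal test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_site_means(x, site, rows):
    """Per-site feature mean, estimated only from the given rows."""
    by_site = {}
    for r in rows:
        by_site.setdefault(site[r], []).append(x[r])
    return {s: mean(v) for s, v in by_site.items()}


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (stand-in for the RVR model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def cv_mae(x, age, site, k=5, seed=0):
    """Mean test MAE over k folds; site means are learned on training rows only."""
    fold_maes = []
    for test in kfold_indices(len(x), k, seed):
        held_out = set(test)
        train = [r for r in range(len(x)) if r not in held_out]
        site_mean = fit_site_means(x, site, train)  # no test rows, no test labels
        grand = mean(x[r] for r in train)

        def harmonize(r):
            # Remove the site mean and restore the training grand mean.
            return x[r] - site_mean[site[r]] + grand

        a, b = fit_line([harmonize(r) for r in train], [age[r] for r in train])
        errors = [abs(a + b * harmonize(r) - age[r]) for r in test]
        fold_maes.append(mean(errors))
    return mean(fold_maes)


# Hypothetical data: three sites sharing one age range, one GM-like feature.
rng = random.Random(42)
site = [rng.choice(["A", "B", "C"]) for _ in range(300)]
age = [rng.uniform(18, 80) for _ in range(300)]
offset = {"A": 0.0, "B": 5.0, "C": -5.0}  # assumed additive scanner offsets
x = [100 - 0.5 * a + offset[s] + rng.gauss(0, 2) for a, s in zip(age, site)]
mae = cv_mae(x, age, site)
print(f"5-fold MAE: {mae:.2f} years")
```

By contrast, a WDH-style scheme would call `fit_site_means` once on all rows before the fold loop, so the harmonization parameters would already have seen the test participants.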

Results:

In site-target dependence scenarios, the Unharmonized scheme achieved an MAE of 6.20 years, while the WDH and TTL schemes reduced the MAE by ~2 years (Figure 1-A). PrettYharmonize performed similarly to WDH and TTL without leakage. The NT scheme exhibited the highest error because it removed age-related signal from the features, causing the model to predict the mean age across sites (Figure 1-B).
In site-target independence scenarios, the Unharmonized model achieved an MAE of 6.31 years. None of the harmonization schemes, including PrettYharmonize, improved performance. NT performed comparably to the other schemes because the biologically relevant variance was shared across sites (Figure 2).
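The contrast between the two scenarios for a target-blind scheme can be reproduced in a toy simulation. The sketch below is an illustration under assumptions, not the study's analysis: per-site mean centering stands in for NT-style harmonization, and the single feature, the age slope, the scanner offsets, and the sample size per site are all hypothetical.

```python
import random
from statistics import mean


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


def center_by_site(x, site):
    """Subtract each site's feature mean (target-blind, NT-like stand-in)."""
    means = {s: mean(xi for xi, si in zip(x, site) if si == s) for s in set(site)}
    return [xi - means[si] for xi, si in zip(x, site)]


def simulate(age_ranges, n_per_site=100, seed=0):
    """One feature that declines with age, plus an additive offset per site."""
    rng = random.Random(seed)
    age, site, x = [], [], []
    for s, (lo, hi) in enumerate(age_ranges):
        offset = rng.gauss(0, 5)  # hypothetical scanner offset
        for _ in range(n_per_site):
            a = rng.uniform(lo, hi)
            age.append(a)
            site.append(s)
            x.append(100 - 0.5 * a + offset + rng.gauss(0, 2))
    return age, site, x


dependent = [(18, 33), (33, 48), (48, 63), (63, 80)]  # disjoint age ranges
independent = [(18, 80)] * 3                          # shared age range

for name, ranges in [("dependent", dependent), ("independent", independent)]:
    age, site, x = simulate(ranges)
    r = pearson(center_by_site(x, site), age)
    print(f"{name}: correlation of centered feature with age = {r:.2f}")
```

With disjoint age ranges, the site mean of the feature carries most of the age information, so centering by site leaves only a weak correlation with age. With a shared age range, the site means mostly reflect the scanner offset and the correlation with age stays strong.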
Supporting Image: Figure1.jpg
A: Age regression in site-target dependence scenarios. B: Predicted age versus true age for the NT scheme. Each dot represents a participant in the dataset. The diagonal represents perfect age prediction.
Supporting Image: Figure2.png
Age regression in site-target independence scenarios.

Conclusions:

This study highlights the importance of carefully integrating harmonization into ML pipelines. Under site-target dependence, the leakage-prone schemes (WDH and TTL) improve performance but rely on test label information. PrettYharmonize achieved results similar to those of the leakage-prone schemes without requiring test labels, offering a solution for integrating harmonization into ML pipelines. The method is available at github.com/juaml/PrettYharmonize. In site-target independence scenarios, contrary to reports in the literature, harmonization did not improve performance.
Although we demonstrate the problem in an age regression task, similar results were obtained in sex classification and dementia-MCI prediction. Further research is needed to examine the impact of harmonization on model interpretability and on scenarios with varying degrees of site-target dependence.

Modeling and Analysis Methods:

Classification and Predictive Modeling
Methods Development 1
Other Methods 2

Keywords:

Data analysis
Machine Learning
MRI
STRUCTURAL MRI
Other - Harmonization

1|2Indicates the priority used for review

Abstract Information

Please indicate below if your study was a "resting state" or "task-activation” study.

Other

Healthy subjects only or patients (note that patient studies may also involve healthy subjects):

Healthy subjects

Was this research conducted in the United States?

No

Were any human subjects research approved by the relevant Institutional Review Board or ethics panel? NOTE: Any human subjects studies without IRB approval will be automatically rejected.

Not applicable

Were any animal research approved by the relevant IACUC or other animal research panel? NOTE: Any animal studies without IACUC approval will be automatically rejected.

Not applicable

Please indicate which methods were used in your research:

Structural MRI

For human MRI, what field strength scanner do you use?

3.0T

Which processing packages did you use for your study?

Other, Please list  -   CAT12.8

References:

Caspers, S., et al. (2014). Studying variability in human brain aging in a population-based German cohort—rationale and design of 1000BRAINS. Frontiers in Aging Neuroscience, 6, 149.
Gaser, C., et al. (2024). CAT: a computational anatomy toolbox for the analysis of structural MRI data. GigaScience, 13, giae049.
Hu, F., et al. (2023). Image harmonization: A review of statistical and deep learning methods for removing batch effects and evaluation metrics for effective harmonization. NeuroImage, 274, 120125.
Krieger, D., et al. (2017, November). Shared high value research resources: The CamCAN human lifespan neuroimaging dataset processed on the open science grid. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 1815-1822). IEEE.
Nooner, K. B., et al. (2012). The NKI-Rockland sample: a model for accelerating the pace of discovery science in psychiatry. Frontiers in Neuroscience, 6, 152.
Pomponio, R., et al. (2020). Harmonization of large MRI datasets for the analysis of brain imaging patterns throughout the lifespan. NeuroImage, 208, 116450.
Snoek, L., et al. (2021). The Amsterdam Open MRI Collection, a set of multimodal MRI datasets for individual difference analyses. Scientific Data, 8(1), 85.
Wei, D., et al. (2018). Structural and functional brain scans from the cross-sectional Southwest University adult lifespan dataset. Scientific Data, 5(1), 1-10.
