ComBatLS: A location- and scale-preserving method for multi-site image harmonization

Presented During:

Wednesday, June 26, 2024: 11:30 AM - 12:45 PM
COEX  
Room: Hall D 2  

Poster No:

1889 

Submission Type:

Abstract Submission 

Authors:

Margaret Gardner1, Russell Shinohara1, Richard Bethlehem2, Rafael Romero-García3, Varun Warrier4, Sheila Shanmugan1, Jakob Seidlitz1, Aaron Alexander-Bloch1, Andrew Chen5

Institutions:

1University of Pennsylvania, Philadelphia, PA, 2Autism Research Centre, Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom, 3University of Seville, Seville, Spain, 4University of Cambridge, Cambridge, Cambridgeshire, 5Medical University of South Carolina, Charleston, SC

First Author:

Margaret Gardner  
University of Pennsylvania
Philadelphia, PA

Co-Author(s):

Russell Shinohara  
University of Pennsylvania
Philadelphia, PA
Richard Bethlehem  
Autism Research Centre, Department of Psychiatry, University of Cambridge
Cambridge, United Kingdom
Rafael Romero-García  
University of Seville
Seville, Spain
Varun Warrier  
University of Cambridge
Cambridge, Cambridgeshire
Sheila Shanmugan  
University of Pennsylvania
Philadelphia, PA
Jakob Seidlitz  
University of Pennsylvania
Philadelphia, PA
Aaron Alexander-Bloch  
University of Pennsylvania
Philadelphia, PA
Andrew Chen  
Medical University of South Carolina
Charleston, SC

Introduction:

Recent work has leveraged massive datasets and advanced image harmonization algorithms to construct normative models of imaging-derived phenotypes (IDPs)[1,2]. These brain chart models, which can produce centile or z-scores to benchmark individuals' morphology within a population, are often fit on magnetic resonance imaging data collected across hundreds of scanners. One popular method for harmonizing these data is ComBat, which preserves the effects of specified covariates on the IDPs' means. However, evidence suggests that biological factors, such as sex, also impact an IDP's variance across a population[3]. These scale effects, which directly impact centile and z-score distributions, are not preserved by current harmonization methods. Thus, harmonization may induce error in centile and z-scores, particularly when factors that impact scale are distributed unequally across sites.

Here, we propose a new method in the ComBat family of harmonization tools, ComBatLS, that preserves biological variance in IDPs' location and scale. We tested ComBatLS's ability to preserve variation in scale and its impacts on centile and z-scores by harmonizing across sex-imbalanced artificial "sites" in data from the UK Biobank.

Methods:

As in all ComBat versions, ComBatLS harmonizes across sites by targeting each IDP to a pooled mean while estimating and preserving the effects of designated covariates. However, prior ComBat models assumed that the variance of each IDP's distribution was consistent across sites. In ComBatLS, we incorporate a log-linear relationship between the error's standard deviation and the covariates[4], enabling the estimation and preservation of covariates' effects on IDP variance.
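
As a rough illustration of the log-linear scale model, the sketch below assumes a single binary covariate (sex coded 0/1) and no site effects. Under that assumption, the model y = b0 + b1*x + exp(t0 + t1*x)*e can be fit in two steps in the spirit of Harvey[4]: estimate the mean, then regress the log squared residuals on the covariate. The function name, variable names, and simulated values are illustrative assumptions. They are not taken from the ComBatFamily package, which may estimate these terms differently.

```python
import math
import random
from statistics import fmean, pstdev

# E[log(e^2)] for e ~ N(0, 1); subtracting it debiases the scale intercept.
LOG_CHI2_MEAN = -1.2703628454614782


def fit_location_scale(y, x):
    """Two-step fit of y = b0 + b1*x + exp(t0 + t1*x) * e for binary x (0/1)."""
    by_group = {g: [v for v, xi in zip(y, x) if xi == g] for g in (0, 1)}
    # Step 1: with one binary covariate, least squares reduces to group means.
    mean = {g: fmean(vals) for g, vals in by_group.items()}
    # Step 2: regress log squared residuals on x (again just group means).
    log_r2 = {
        g: fmean(math.log((v - mean[g]) ** 2) for v in vals)
        for g, vals in by_group.items()
    }
    b0, b1 = mean[0], mean[1] - mean[0]
    t0 = (log_r2[0] - LOG_CHI2_MEAN) / 2  # log SD when x = 0
    t1 = (log_r2[1] - log_r2[0]) / 2      # change in log SD when x = 1
    return b0, b1, t0, t1


# Simulated IDP (hypothetical values): x = 1 shifts the mean by 5 and
# multiplies the SD by exp(0.2).
rng = random.Random(0)
sex = [rng.randint(0, 1) for _ in range(20000)]
y = [100 + 5 * s + math.exp(math.log(10) + 0.2 * s) * rng.gauss(0, 1) for s in sex]

b0, b1, t0, t1 = fit_location_scale(y, sex)

# Standardized residuals should have SD near 1 in both groups.
z = [(v - b0 - b1 * s) / math.exp(t0 + t1 * s) for v, s in zip(y, sex)]
sd_by_sex = {g: pstdev([zi for zi, s in zip(z, sex) if s == g]) for g in (0, 1)}
print(f"b1={b1:.2f}  t1={t1:.3f}  sd_by_sex={sd_by_sex}")
```

In a harmonization setting, site effects would be estimated on residuals standardized in this way, and multiplying by the same covariate-dependent SD afterwards would restore the covariate's effect on variance.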

To test this new method, we used IDPs derived from the structural MRIs of 28,619 participants in the UK Biobank (49.7% female, age 50-80 years), collected across 3 scanners using identical hardware and protocols[5]. T1-weighted structural images, along with T2 images for ~98% of subjects, were processed using FreeSurfer v. 6 to extract global volumes for each tissue class (cortical gray matter, subcortical gray matter, white matter, and CSF), as well as surface area, thickness, and volume for each cortical region[6]. Subjects were randomly assigned to one of three simulated sites such that the sites had male:female ratios of 1:1, 1:4, and 4:1, respectively. We then harmonized IDPs across the simulated sites using 4 different ComBat configurations: the first preserved no covariate effects, while the remainder preserved the effects of sex and age estimated by a linear model[7], ComBat-GAM[8], and ComBatLS, respectively. Data harmonized by each configuration were then used to fit simple brain charts for each IDP using generalized additive models for location, scale, and shape.
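
One possible way to build sex-imbalanced simulated sites is sketched below. It assumes a roughly sex-balanced sample and sites of roughly equal size, and it assigns each subject at random with sex-specific weights. The weights, function name, and sample size are hypothetical choices for illustration; the assignment procedure used in the study may differ.

```python
import random
from collections import Counter

# Relative odds of landing in sites 1, 2, 3, chosen so that equal-size sites
# end up with M:F ratios of about 1:1, 1:4, and 4:1 in a sex-balanced sample.
SITE_WEIGHTS = {"M": (5, 2, 8), "F": (5, 8, 2)}


def assign_sites(sexes, seed=0):
    """Randomly assign each subject to a simulated site based on their sex."""
    rng = random.Random(seed)
    return [rng.choices((1, 2, 3), weights=SITE_WEIGHTS[s])[0] for s in sexes]


# Hypothetical sex-balanced sample.
rng = random.Random(1)
sexes = [rng.choice("MF") for _ in range(30000)]
sites = assign_sites(sexes)

# Check the realized male:female ratio at each simulated site.
counts = Counter(zip(sites, sexes))
ratios = {site: counts[(site, "M")] / counts[(site, "F")] for site in (1, 2, 3)}
print(ratios)
```

Changing the seed gives a new permutation of site assignments with the same target ratios.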

Results:

We assessed the influence of each ComBat configuration by comparing sex effects on scale in each brain chart and, crucially, by examining how the centile and z-scores derived from the charts were affected by each harmonization technique. Models fit on the complete, unharmonized data showed the true sex effect on scale in this dataset. Despite the confounding of sex and site effects in our simulations, ComBatLS accurately preserved the effects of sex on IDP variance (Fig 1). For both centile and z-scores, pairwise t-tests revealed that the magnitude of error for each subject's scores differed significantly across ComBat methods, with ComBatLS producing more accurate scores than the next most accurate method, ComBat-GAM (Fig 2), in 207 of 208 IDPs (pFDR range for centiles: <0.001 to 0.67; pFDR range for z-scores: <0.001 to 0.134). These results were stable across 10 permutations of subjects' site assignments.
Supporting Image: fig_1.jpg
Supporting Image: fig_2.png
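
To make the link between scale and score error concrete, the sketch below assumes a simple Gaussian chart, whereas the brain charts above were fit with generalized additive models of location, scale, and shape. It scores one hypothetical subject twice: once with a sex-specific SD and once with an SD from which the sex effect has been removed. All numeric values and names are made up for illustration.

```python
from statistics import NormalDist


def scores(y, mu, sigma):
    """Return (z-score, centile) of y under a Gaussian chart N(mu, sigma^2)."""
    z = (y - mu) / sigma
    return z, 100 * NormalDist().cdf(z)


# Hypothetical subject and chart parameters.
y, mu = 120.0, 100.0
z_true, c_true = scores(y, mu, sigma=12.0)  # chart keeps the sex effect on scale
z_flat, c_flat = scores(y, mu, sigma=10.0)  # sex effect on scale was removed

# Absolute errors of the kind compared across harmonization methods.
z_err = abs(z_flat - z_true)
c_err = abs(c_flat - c_true)
print(f"z-score error: {z_err:.3f}, centile error: {c_err:.2f}")
```

Even though the subject's value and the chart's mean are unchanged, the mis-specified SD alone shifts both scores.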
 

Conclusions:

We propose ComBatLS as a robust method for harmonizing neuroimaging data across sites while preserving biologically meaningful differences in scale for accurate centile and z-score estimation. ComBatLS is available alongside other ComBat harmonization tools at https://github.com/andy1764/ComBatFamily.

Modeling and Analysis Methods:

Methods Development 1

Neuroanatomy, Physiology, Metabolism and Neurotransmission:

Neuroanatomy Other 2

Keywords:

Informatics
Modeling
MRI
Open-Source Code
Sexual Dimorphism
Statistical Methods
STRUCTURAL MRI
Other - multisite harmonization

1|2 Indicates the priority used for review

References:

[1] Frangou, S. (2022). 'Cortical thickness across the lifespan: Data from 17,075 healthy individuals aged 3-90 years', Human Brain Mapping, 43(1), 431–451. https://doi.org/10.1002/hbm.25364
[2] Schabdach, J. M. (2023). 'Brain Growth Charts for Quantitative Analysis of Pediatric Clinical Brain MRI Scans with Limited Imaging Pathology', Radiology. https://doi.org/10.1148/radiol.230096
[3] Wierenga, L. M. (2022). 'Greater male than female variability in regional brain structure across the lifespan', Human Brain Mapping, 43(1), 470. https://doi.org/10.1002/HBM.25204
[4] Harvey, A. C. (1976). 'Estimating Regression Models with Multiplicative Heteroscedasticity', Econometrica, 44(3), 461–465. https://doi.org/10.2307/1913974
[5] Littlejohns, T. J. (2020). 'The UK Biobank imaging enhancement of 100,000 participants: Rationale, data collection, management and future directions', Nature Communications, 11(1), Article 1. https://doi.org/10.1038/s41467-020-15948-9
[6] Desikan, R. S. (2006). 'An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest', NeuroImage, 31(3), 968–980. https://doi.org/10.1016/J.NEUROIMAGE.2006.01.021
[7] Fortin, J. P. (2018). 'Harmonization of cortical thickness measurements across scanners and sites', NeuroImage, 167, 104. https://doi.org/10.1016/J.NEUROIMAGE.2017.11.024
[8] Pomponio, R. (2020). 'Harmonization of large MRI datasets for the analysis of brain imaging patterns throughout the lifespan', NeuroImage, 208. https://doi.org/10.1016/J.NEUROIMAGE.2019.116450