Poster No:
1640
Submission Type:
Abstract Submission
Authors:
Pedro M. Gordaliza1,2, Yasser Alemán-Gómez3,4, Jaume Banus2, Meritxell Bach-Cuadra1,2
Institutions:
1CIBM Center for Biomedical Imaging, Lausanne, Switzerland, 2Department of Radiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland, 3Connectomics Lab, Department of Radiology, Lausanne University Hospital (CHUV), Lausanne, Switzerland, 4Department of Radiology, Lausanne University Hospital, Lausanne, Switzerland
First Author:
Pedro M. Gordaliza
CIBM Center for Biomedical Imaging|Department of Radiology, Lausanne University Hospital and University of Lausanne
Lausanne, Switzerland|Lausanne, Switzerland
Co-Author(s):
Yasser Alemán-Gómez
Connectomics Lab, Department of Radiology, Lausanne University Hospital (CHUV)|Department of Radiology, Lausanne University Hospital
Lausanne, Switzerland|Lausanne, Switzerland
Jaume Banus
Department of Radiology, Lausanne University Hospital and University of Lausanne
Lausanne, Switzerland
Meritxell Bach-Cuadra
CIBM Center for Biomedical Imaging|Department of Radiology, Lausanne University Hospital and University of Lausanne
Lausanne, Switzerland|Lausanne, Switzerland
Introduction:
Accurate volumetric segmentation of brain structures is key for tracking neurological disorders[1-3]. FreeSurfer (FS)[4], a widely used reference method for brain segmentation, relies on Bayesian estimation for subcortical segmentation and atlas-based methods for cortical parcellation, and requires hours of processing. Deep learning (DL) tools such as SynthSeg[5] segment raw images in seconds and have been integrated into FSv8.0.0. Given this methodological transition, understanding the systematic differences between traditional and DL-based measurements is essential for continuity in neuroimaging research. Here we assess these differences through a multi-site reliability analysis.
Methods:
Materials: We used the SRPBS Traveling Subject Dataset[6], comprising T1-weighted MRI scans from 9 healthy male participants (age 25-30) acquired across 9 scanners (81 images in total). The dataset's homogeneous demographics and multi-site protocol provide an ideal framework for a comprehensive segmentation comparison.
We evaluated four segmentation approaches: 1) FSv7.4.1, using recon-all with traditional subcortical segmentation and the Desikan atlas for cortical parcellation; 2) FSv8.0.0 default, which replaces FSv7.4.1's traditional registration and skull-stripping with the DL-based SynthMorph[7] and SynthStrip[8], respectively, and uses SynthSegv2.0.0 for subcortical segmentation; 3) FSv8.0.0robust, the same pipeline but with SynthSegv2.0.0's robust option (SynthSeg+); 4) direct application of SynthSeg+ to raw images for complete brain segmentation. All approaches generated volumes for 101 brain regions (68 cortical, 33 subcortical).
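As an illustration of the fourth approach, the sketch below builds a command line for running SynthSeg+ directly on one raw image. It is a hedged example, not the authors' actual invocation: it assumes the `mri_synthseg` command shipped with FreeSurfer and its `--i`, `--o`, `--parc`, `--robust`, and `--vol` options, and the file names and the `synthseg_plus_command` helper are hypothetical.

```python
from pathlib import Path


def synthseg_plus_command(t1_path, out_dir):
    """Build (but do not run) a SynthSeg+ call for one raw T1-weighted image.

    Assumption: FreeSurfer's mri_synthseg is on PATH and accepts the flags
    used here. Output file names are illustrative only.
    """
    t1_path = Path(t1_path)
    out_dir = Path(out_dir)

    # Strip a trailing .nii.gz or .nii to get a stem for the output names.
    stem = t1_path.name
    for suffix in (".nii.gz", ".nii"):
        if stem.endswith(suffix):
            stem = stem[: -len(suffix)]
            break

    return [
        "mri_synthseg",
        "--i", str(t1_path),                               # input image
        "--o", str(out_dir / f"{stem}_synthseg.nii.gz"),   # label map
        "--parc",                                          # add cortical parcellation
        "--robust",                                        # robust variant (SynthSeg+)
        "--vol", str(out_dir / f"{stem}_volumes.csv"),     # regional volumes
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` for each of the 81 images; the resulting volume tables are the kind of input the statistical comparison needs.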
Statistical Analysis: Algorithms were compared in two ways: 1) pairwise regional volume comparisons across the 81 images using Wilcoxon signed-rank tests with Bonferroni correction (p < 0.05), complemented by concordance assessment via Spearman's correlation and Bland-Altman analysis; 2) segmentation reliability, evaluated through within-subject reproducibility (coefficient of variation, CV, across scanner sites) and between-subject discriminability. For discriminability, we averaged the pairwise between-subject Kendall's τ distance (range 0-1). The two metrics were combined into a between/within ratio, τ/CV, to assess each algorithm's overall performance per region.
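The reliability metrics can be sketched as follows for a single region and a single algorithm. This is one possible reading, not the authors' implementation: the abstract does not specify which vectors enter the Kendall's τ distance, so the sketch assumes each subject is represented by that region's volume at every site (same site order for all subjects), takes the normalized τ distance as the fraction of discordant site pairs, and averages the CV over subjects. All function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def coefficient_of_variation(site_volumes):
    """Within-subject CV: sample standard deviation over mean across sites."""
    return stdev(site_volumes) / mean(site_volumes)


def kendall_tau_distance(x, y):
    """Normalized Kendall tau distance in [0, 1].

    Fraction of index pairs (i, j) that x and y order in opposite ways.
    Tied pairs are counted as concordant (an assumption of this sketch).
    """
    pairs = list(combinations(range(len(x)), 2))
    discordant = sum(
        1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) < 0
    )
    return discordant / len(pairs)


def tau_cv_ratio(volumes):
    """Between/within ratio for one region.

    volumes: list of per-subject lists, each holding the region's volume
    at every site. Assumes a non-zero mean CV.
    """
    within = mean([coefficient_of_variation(v) for v in volumes])
    between = mean(
        [kendall_tau_distance(a, b) for a, b in combinations(volumes, 2)]
    )
    return between / within
```

Under these assumptions a higher ratio means more between-subject separation per unit of within-subject, across-scanner variability, which is how the ratio is used to rank algorithms per region.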
Results:
Pairwise comparisons (Fig. 1a) between FSv7.4.1 and the DL-based methods revealed significant volumetric differences (p < 0.05). Cortical regions differed significantly in 46%, 49%, and 96% of cases for FSv8.0.0, FSv8.0.0robust, and SynthSeg+, respectively, with the majority (18, 21, and 60 regions) being highly significant (p < 0.001). Methods using FSv8.0.0 and SynthSeg consistently showed differences in temporal regions. Subcortical regions showed significant differences in 82%, 88%, and 85% of regions, respectively, exceeding the cortical proportions for both FSv8.0.0 variants (see Fig. 1b for the left thalamus). Fig. 2a shows that SynthSeg+ achieved the best τ/CV ratio in 68 regions, compared to 17 for FSv8.0.0, 9 for FSv7.4.1, and 7 for FSv8.0.0robust. Fig. 2b highlights the 10 regions with the best ratios showing significant differences for SynthSeg+. Complementary region-wise analyses and supplementary discriminability metrics under alternative (non-)Gaussianity assumptions are available at shorturl.at/oyw0l and doi.org/10.5281/zenodo.14511324, showing similar or stronger tendencies.
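The percentages of significantly different regions come from the pairwise comparison described in the Statistical Analysis. A minimal sketch of such a comparison between two methods is given below, assuming NumPy and SciPy are available and that volumes are stored as arrays of shape (images, regions); the function names, the array layout, and the choice to correct over the number of regions passed in are assumptions of this example, not details stated in the abstract.

```python
import numpy as np
from scipy.stats import spearmanr, wilcoxon


def compare_methods(vol_a, vol_b, alpha=0.05):
    """Region-wise paired comparison of two segmentation methods.

    vol_a, vol_b: arrays of shape (n_images, n_regions) with matching rows.
    Returns one dict per region. Note: wilcoxon may fail if all paired
    differences in a region are exactly zero.
    """
    vol_a = np.asarray(vol_a, dtype=float)
    vol_b = np.asarray(vol_b, dtype=float)
    n_regions = vol_a.shape[1]

    results = []
    for r in range(n_regions):
        a, b = vol_a[:, r], vol_b[:, r]
        diff = a - b

        # Paired, non-parametric test on the volume differences.
        _, p_raw = wilcoxon(a, b)
        # Bonferroni: scale by the number of tests, capped at 1.
        p_adj = min(1.0, float(p_raw) * n_regions)

        # Concordance: rank correlation and Bland-Altman summary.
        rho, _ = spearmanr(a, b)
        bias = float(diff.mean())
        half_width = 1.96 * float(diff.std(ddof=1))

        results.append({
            "p_adjusted": p_adj,
            "significant": bool(p_adj < alpha),
            "spearman_rho": float(rho),
            "bias": bias,
            "limits_of_agreement": (bias - half_width, bias + half_width),
        })
    return results


def fraction_significant(results):
    """Share of regions with a significant difference after correction."""
    return sum(r["significant"] for r in results) / len(results)
```

Applying `fraction_significant` separately to the cortical and subcortical region subsets would yield proportions of the kind reported above, under the assumptions stated.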

Conclusions:
Our study demonstrates that DL approaches, particularly SynthSeg+, significantly improve brain segmentation reliability, notably in temporal regions, where traditional FS segmentation faces known limitations due to partial volume effects. The differences between traditional and DL-based measurements call for careful interpretation when pooling studies across methodologies. At the same time, DL-based segmentation offers an enhanced capability to detect structural differences in both existing and future neuroimaging research, particularly subtle pathological alterations.
Modeling and Analysis Methods:
Segmentation and Parcellation 1
Neuroinformatics and Data Sharing:
Workflows 2
Novel Imaging Acquisition Methods:
Anatomical MRI
Keywords:
Cortex
Segmentation
STRUCTURAL MRI
Sub-Cortical
Workflows
1|2 Indicates the priority used for review
The Open Science Special Interest Group (OSSIG) is introducing a reproducibility challenge for OHBM 2025. This new initiative aims to enhance the reproducibility of scientific results and foster collaborations between labs. Teams will consist of a “source” party and a “reproducing” party, and will be evaluated on the success of their replication, the openness of the source work, and additional deliverables.
Propose your OHBM abstract(s) as source work for future OHBM meetings by selecting one of the following options:
I am submitting this abstract as an original work to be reproduced. I am available to be the “source party” in an upcoming team and consent to have this work listed on the OSSIG website. I agree to be contacted by OSSIG regarding the challenge and may share data used in this abstract with another team.
Please indicate below if your study was a "resting state" or "task-activation" study.
Other
Healthy subjects only or patients (note that patient studies may also involve healthy subjects):
Healthy subjects
Was this research conducted in the United States?
No
Was any human subjects research approved by the relevant Institutional Review Board or ethics panel?
NOTE: Any human subjects studies without IRB approval will be automatically rejected.
Not applicable
Was any animal research approved by the relevant IACUC or other animal research panel?
NOTE: Any animal studies without IACUC approval will be automatically rejected.
Not applicable
Please indicate which methods were used in your research:
Structural MRI
Other, Please specify
-
Segmentation Workflows
For human MRI, what field strength scanner do you use?
3.0T
Which processing packages did you use for your study?
FreeSurfer
Other, Please list
-
SynthSeg
Provide references using APA citation style.
1-Sebenius, I., Dorfschmidt, L., Seidlitz, J., Alexander-Bloch, A., Morgan, S. E., & Bullmore, E. (2024). Structural MRI of brain similarity networks. Nature Reviews Neuroscience, 1–18. https://doi.org/10.1038/s41583-024-00882-2
2-Gordaliza, P. M., Molchanova, N., Wynen, M., Maggi, P., Janssen, J., Banus, J., Cagol, A., Graziera, C., & Cuadra, M. B. (2024). Towards Longitudinal Characterization of Multiple Sclerosis Atrophy Employing SynthSeg Framework and Normative Modeling (p. 2024.09.17.613272). bioRxiv. https://doi.org/10.1101/2024.09.17.613272
3-Janssen, J., Alloza, C., Díaz-Caneja, C. M., Santonja, J., Pina-Camacho, L., Gordaliza, P. M., Fernández-Pena, A., Lois, N., Buimer, E. E. L., Van Haren, N. E. M., Cahn, W., Vieta, E., Castro-Fornieles, J., Bernardo, M., Arango, C., Kahn, R. S., Hulshoff Pol, H. E., & Schnack, H. G. (2022). Longitudinal allometry of sulcal morphology in health and schizophrenia. Journal of Neuroscience
4-Fischl, B. (2012). FreeSurfer. NeuroImage, 62(2), 774–781. https://doi.org/10.1016/J.NEUROIMAGE.2012.01.021
5-Billot, B., Magdamo, C., Cheng, Y., Arnold, S. E., Das, S., & Iglesias, J. E. (2023). Robust machine learning segmentation for large-scale analysis of heterogeneous clinical brain MRI datasets. Proceedings of the National Academy of Sciences of the United States of America, 120(9), e2216399120. https://doi.org/10.1073/pnas.2216399120
6-Tanaka, S. C., Yamashita, A., Yahata, N., Itahashi, T., Lisi, G., Yamada, T., Ichikawa, N., Takamura, M., Yoshihara, Y., Kunimatsu, A., Okada, N., Hashimoto, R., Okada, G., Sakai, Y., Morimoto, J., Narumoto, J., Shimada, Y., Mano, H., Yoshida, W., … Imamizu, H. (2021). A multi-site, multi-disorder resting-state magnetic resonance image database. Scientific Data, 8(1), Article 1. https://doi.org/10.1038/s41597-021-01004-8
7-Hoffmann, M., Billot, B., Greve, D. N., Iglesias, J. E., Fischl, B., & Dalca, A. V. (2022). SynthMorph: Learning Contrast-Invariant Registration Without Acquired Images. IEEE Transactions on Medical Imaging, 41(3), 543–558. https://doi.org/10.1109/TMI.2021.3116879
8-Hoopes, A., Mora, J. S., Dalca, A. V., Fischl, B., & Hoffmann, M. (2022). SynthStrip: Skull-stripping for any brain image. NeuroImage, 260, 119474. https://doi.org/10.1016/j.neuroimage.2022.119474
No