Poster No:
1115
Submission Type:
Abstract Submission
Authors:
Lucas Backes1, Simon Eickhoff1, Christian Rubbert2, Christophe Phillips3, Georgios Antonopoulos1, Kaustubh Patil1
Institutions:
1Research Centre Jülich, Jülich, Germany, 2Institute of Diagnostic and Interventional Radiology, Düsseldorf, Germany, 3Université de Liège, Liège, Belgium
Introduction:
As life expectancy increases and the population ages, quantifying the aging process is becoming increasingly important, which calls for biomarkers of biological age. Magnetic resonance imaging (MRI) can provide such a biomarker in the form of an overall 'brain age', intended to capture disease and mortality risk more accurately than chronological age does. However, because of site and scanner differences, brain age models usually do not generalize well to new datasets that were not used for training. This limitation hinders the clinical use of 'brain age' as an informative and relevant biomarker.
Methods:
Here, we compiled three different models, namely Kalc (Kalc et al., 2024), brainageR (Hobday et al., 2022) and More (More et al., 2023), all trained on the same seven datasets (IXI, AIBL, DLBS, GSP, NKIRSE, OASIS-1, SALD) to predict chronological age from healthy brain scans. All three models use Gaussian Process Regression (GPR) but differ slightly in implementation, preprocessing pipeline and feature space: Kalc applies CAT12-based preprocessing to extract non-linearly registered and modulated gray matter (GM) and white matter (WM) features, which then undergo PCA and ensemble strategies; brainageR uses SPM12 for segmentation and normalization, deriving features from GM, WM and cerebrospinal fluid (CSF) after masking, followed by PCA retaining 80% of the variance; More uses only GM volume features extracted with CAT12, which are smoothed with a 4 mm Gaussian kernel and resampled to 4 mm before PCA retaining 100% of the variance. We first used the UK Biobank (UKB, N=6883) as the out-of-sample dataset, testing the three models without any further training (Figure 1 (a)). We then calibrated each model individually using linear regression, with a training set of N=4818 and a test set of N=2065, stratified by age and sex to preserve the original data distribution (Figure 1 (b) shows the More model). Finally, we stacked the three predictions (with and without sex as an additional feature) using ridge regression and random forest. We systematically evaluated how the sample size available for training affects the stacking models by progressively increasing it from 100 to 4000. The process was repeated 100 times with different training samples, each preserving the age and sex distribution (Figure 1 (c)). We quantified performance using the mean absolute error (MAE) and Pearson's correlation (r).
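The calibration and stacking steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins for the three models' predictions, the split is a plain random permutation rather than one stratified by age and sex, the ridge penalty `alpha=1.0` is arbitrary, only 10 repetitions per sample size are run, and the random forest stacker is omitted. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def fit_linear_calibration(pred, age):
    """Least-squares fit of age ~ slope * pred + intercept."""
    slope, intercept = np.polyfit(pred, age, deg=1)
    return slope, intercept

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalised intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                           Xc.T @ (y - y_mean))
    return coef, y_mean - x_mean @ coef

# Synthetic data (assumption): true ages, sex, and three biased, noisy
# "brain age" predictions standing in for the three base models.
n = 6883
age = rng.uniform(45, 82, n)
sex = rng.integers(0, 2, n).astype(float)
preds = np.column_stack([
    0.6 * age + 12 + rng.normal(0, 5, n),
    0.4 * age + 15 + rng.normal(0, 7, n),
    0.5 * age + 10 + rng.normal(0, 6, n),
])

# Train/test split with the sizes quoted in the text (not stratified here).
idx = rng.permutation(n)
train, test = idx[:4818], idx[4818:]

# Step 1 and 2: out-of-the-box MAE versus MAE after linear calibration.
raw_mae, calibrated_mae = [], []
for j in range(preds.shape[1]):
    slope, intercept = fit_linear_calibration(preds[train, j], age[train])
    raw_mae.append(mae(age[test], preds[test, j]))
    calibrated_mae.append(mae(age[test], slope * preds[test, j] + intercept))

# Step 3: ridge stacking of the three predictions plus sex, trained on
# random subsamples of increasing size and averaged over repetitions.
X = np.column_stack([preds, sex])
learning_curve = {}
for size in (100, 500, 4000):
    scores = []
    for _ in range(10):
        sub = rng.choice(train, size=size, replace=False)
        coef, intercept = fit_ridge(X[sub], age[sub])
        scores.append(mae(age[test], X[test] @ coef + intercept))
    learning_curve[size] = float(np.mean(scores))

print("raw MAE:", raw_mae)
print("calibrated MAE:", calibrated_mae)
print("stacked MAE by training size:", learning_curve)
```

On these synthetic data the stacker combines three independently noisy predictions, so it can beat each calibrated model on its own; whether and by how much this holds on real data depends on how correlated the base models' errors are.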
Results:
The More model gave the best out-of-the-box results (MAE=7.35, r=0.81) compared with the other two models (brainageR: MAE=15.12, R²=-4.40; Kalc: MAE=16.60, R²=-4.42; see Table 1). Linear calibration also worked best with the More model (MAE=3.50, r=0.81; brainageR: MAE=4.56, r=0.44; Kalc: MAE=4.01, r=0.55). Among the stacking models, ridge regression (with and without sex as a feature) performed well with few training samples but reached a plateau at N=500. Random forest performed worse with small training samples, improved as the training size grew, and outperformed ridge regression beyond 1500 training samples (see Figure 1 (c)). Stacking models that included sex as a feature performed better on average, but the improvement was marginal.
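Two properties of the metrics help when reading these numbers: a linear calibration with positive slope leaves Pearson's r unchanged while it can reduce MAE considerably, and R² (also listed in Table 1) becomes negative when predictions are biased enough to do worse than predicting the mean age. The sketch below illustrates both on synthetic data; the numbers are assumptions chosen for illustration, and the calibration is fitted and evaluated on the same samples purely for brevity.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient, always within [-1, 1]."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def r_squared(y_true, y_pred):
    """Coefficient of determination; negative if worse than the mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(1)
age = rng.uniform(45, 82, 2000)

# Hypothetical out-of-the-box prediction: correlated with age but
# systematically too low and compressed towards a narrow range.
raw = 0.5 * age + 15 + rng.normal(0, 4, 2000)

# Linear calibration: regress true age on the raw prediction.
slope, intercept = np.polyfit(raw, age, deg=1)
calibrated = slope * raw + intercept

for name, pred in (("raw", raw), ("calibrated", calibrated)):
    print(name, "MAE=%.2f" % mae(age, pred),
          "r=%.2f" % pearson_r(age, pred),
          "R2=%.2f" % r_squared(age, pred))
```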

Figure 1: Comparison of brain age prediction performance across models and conditions

Table 1: MAE, R² and Pearson correlation for each individual model, for the linear calibration of each model, and for the stacking models (with and without sex)
Conclusions:
Although trained on the same data, the three out-of-the-box models performed differently on the UKB test data, with large MAEs overall. Stacking models improved brain age prediction as the training sample grew, and random forest surpassed ridge regression at larger sample sizes. Including sex as a feature improved accuracy only marginally, which warrants further investigation. These results encourage research teams to integrate meta-learning into their brain age prediction pipelines, which also makes it straightforward to include sex as a feature. Our results underline the potential of stacking models for creating robust and generalizable brain age models, ultimately supporting their integration into clinical and research applications.
Disorders of the Nervous System:
Neurodegenerative/ Late Life (eg. Parkinson’s, Alzheimer’s)
Lifespan Development:
Aging 2
Lifespan Development Other
Modeling and Analysis Methods:
Classification and Predictive Modeling 1
Keywords:
Aging
Data analysis
Machine Learning
MRI
1|2Indicates the priority used for review
Please indicate below if your study was a "resting state" or "task-activation” study.
Other
Healthy subjects only or patients (note that patient studies may also involve healthy subjects):
Healthy subjects
Was this research conducted in the United States?
No
Were any human subjects research approved by the relevant Institutional Review Board or ethics panel?
NOTE: Any human subjects studies without IRB approval will be automatically rejected.
Yes
Were any animal research approved by the relevant IACUC or other animal research panel?
NOTE: Any animal studies without IACUC approval will be automatically rejected.
Not applicable
Please indicate which methods were used in your research:
Structural MRI
For human MRI, what field strength scanner do you use?
3.0T
Which processing packages did you use for your study?
SPM
References:
Hobday, H., Cole, J. H., Stanyard, R. A., Daws, R. E., Giampietro, V., O’Daly, O., . . . Váša, F. (2022). Tissue volume estimation and age prediction using rapid structural brain scans. *Scientific Reports, 12*(1), 12005. https://doi.org/10.1038/s41598-022-14904-5
Kalc, P., Dahnke, R., Hoffstaedter, F., Gaser, C., & Alzheimer's Disease Neuroimaging Initiative. (2024). BrainAGE: Revisited and reframed machine learning workflow. *Human Brain Mapping, 45*(3), e26632. https://doi.org/10.1002/hbm.26632
More, S., Antonopoulos, G., Hoffstaedter, F., Caspers, J., Eickhoff, S. B., & Patil, K. R. (2023). Brain-age prediction: A systematic comparison of machine learning workflows. *NeuroImage, 270,* 119947. https://doi.org/10.1016/j.neuroimage.2023.119947