Presented During:
Thursday, June 26, 2025: 11:30 AM - 12:45 PM
Brisbane Convention & Exhibition Centre
Room:
Great Hall
Poster No:
157
Submission Type:
Abstract Submission
Authors:
Bruno Hebling Vieira (1), Camille Ellaume (1), Dorothea Floris (1), Franziskus Liem (2), Gaël Varoquaux (3), Nicolas Langer (1)
Institutions:
(1) University of Zurich, Zurich, Zurich; (2) ETHZ, Zurich, Zurich; (3) SODA - Inria, Paris, France
Introduction:
Current efforts to predict cognitive decline from demographic, genetic, and brain imaging features primarily focus on (1) predicting future diagnoses [1], (2) generating point estimates (e.g., expected values) [2], and (3) using fixed time windows [3,4]. A probabilistic approach to predicting future cognitive decline trajectories offers significant advantages: it captures the uncertainty of predictions, accommodates arbitrary time intervals, and enables personalized trajectories that account for individual variability in disease progression. Despite these benefits, probabilistic forecasts are harder to evaluate than point estimates because they require robust assessments of both calibration and discrimination. Such assessments ensure that predicted probabilities can support adjustable decision thresholds with confidence guarantees, a requirement for translating these models into actionable insights [5]. In this work, we introduce probabilistic forecasting of the future Clinical Dementia Rating Sum of Boxes (CDR-SOB) [6], often used as a primary outcome measure in clinical trials of Alzheimer's disease (AD), together with a multidimensional evaluation of performance.
Methods:
We trained gradient-boosted decision trees, which are well suited for tabular data [7], using LightGBM on the ADNI dataset to predict CDR-SOB from four feature sets: (1) past anatomical MRI features, (2) past CDR assessments, (3) a combination of both, and (4) demographic and clinical covariates alone. Training used 80% of ADNI participants, with the remaining 20% reserved for validation. External generalizability was tested on the OASIS dataset. CDR-SOB was predicted by combining the estimated probabilities of its six constituent sub-scores (see Fig 1A for details). Features included demographics, APOE4 status, and MRI-derived measures such as cortical thickness and brain volume (Fig 1B). Model performance was evaluated on both ADNI and OASIS using the Brier score and its decomposition into uncertainty, resolution, and reliability (URR) [5], which assess the variability of the observed outcomes, the separability of outcomes, and the alignment of predicted probabilities with observed event rates, respectively (Fig 1C). Biological validity was evaluated using SHAP-based feature attributions to highlight key contributors across data modalities.
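As an illustration of how per-box probabilities could be combined into a distribution over CDR-SOB, the sketch below convolves six box-score distributions. It is not the authors' implementation: it assumes the six box scores are conditionally independent given the features, which the abstract does not state, and the probabilities in `example` are hypothetical. The function names are introduced here for illustration only.

```python
# Illustrative sketch only. Assumption (not stated in the abstract): the six
# CDR box scores are conditionally independent given the input features.

BOX_LEVELS = (0.0, 0.5, 1.0, 2.0, 3.0)  # possible values of one CDR box score
STEP = 0.5                               # CDR-SOB moves in half-point steps


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent variables on the half-point grid."""
    out = [0.0] * (len(dist_a) + len(dist_b) - 1)
    for i, pa in enumerate(dist_a):
        for j, pb in enumerate(dist_b):
            out[i + j] += pa * pb
    return out


def box_to_grid(box_probs):
    """Place the five box-level probabilities on the grid 0, 0.5, ..., 3.0."""
    grid = [0.0] * (int(BOX_LEVELS[-1] / STEP) + 1)
    for level, p in zip(BOX_LEVELS, box_probs):
        grid[int(level / STEP)] += p
    return grid


def sob_distribution(per_box_probs):
    """Combine per-box probabilities into P(CDR-SOB = s) for s = 0, 0.5, ..., 18."""
    total = [1.0]  # point mass at a sum of zero
    for box_probs in per_box_probs:
        total = convolve(total, box_to_grid(box_probs))
    return total


# Hypothetical predicted probabilities for the six boxes (each row sums to 1).
example = [[0.6, 0.3, 0.1, 0.0, 0.0]] * 6
sob = sob_distribution(example)

# Summaries that a probabilistic forecast makes available.
expected_sob = sum(i * STEP * p for i, p in enumerate(sob))
prob_sob_at_least_2 = sum(p for i, p in enumerate(sob) if i * STEP >= 2.0)
```

Working with the full distribution rather than only its mean is what allows threshold probabilities such as `prob_sob_at_least_2` to be read off directly. A box that does not use one of the levels simply receives probability zero at that level.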

Figure 1
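The Brier score decomposition used for evaluation can be sketched for a single binary event, for example "future CDR-SOB at or above some threshold". This is a minimal sketch of the standard decomposition (Brier score = reliability - resolution + uncertainty), not the authors' evaluation code. The choice of a binary event, the rounding of forecasts to form groups, and the data below are assumptions made for illustration.

```python
from collections import defaultdict


def brier_decomposition(forecasts, outcomes, decimals=1):
    """Return (brier, reliability, resolution, uncertainty) for a binary event.

    Forecasts are rounded to `decimals` places so that each distinct forecast
    value forms one group. For the rounded forecasts the identity
    brier = reliability - resolution + uncertainty then holds exactly.
    """
    rounded = [round(f, decimals) for f in forecasts]
    n = len(outcomes)
    base_rate = sum(outcomes) / n

    # Collect the observed outcomes for each distinct forecast value.
    groups = defaultdict(list)
    for f, o in zip(rounded, outcomes):
        groups[f].append(o)

    reliability = 0.0  # lower is better: forecast vs. observed rate
    resolution = 0.0   # higher is better: observed rate vs. base rate
    for f, obs in groups.items():
        observed_rate = sum(obs) / len(obs)
        weight = len(obs) / n
        reliability += weight * (f - observed_rate) ** 2
        resolution += weight * (observed_rate - base_rate) ** 2

    # Depends only on the outcome prevalence, not on the model.
    uncertainty = base_rate * (1.0 - base_rate)
    brier = sum((f - o) ** 2 for f, o in zip(rounded, outcomes)) / n
    return brier, reliability, resolution, uncertainty


# Hypothetical forecasts and outcomes, for illustration only.
forecasts = [0.1, 0.1, 0.2, 0.8, 0.9, 0.9, 0.3, 0.7]
outcomes = [0, 0, 0, 1, 1, 1, 1, 0]
bs, rel, res, unc = brier_decomposition(forecasts, outcomes)
```

Because the uncertainty term depends only on prevalence, two datasets with different outcome prevalence have different uncertainty terms regardless of the model.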
Results:
Models combining MRI features and prior CDR scores exhibited superior discrimination (AUC) and calibration (reliability) compared to models based on past CDR scores or MRI features alone (Fig 2A). For short time windows (≤5 years), past CDR assessments were the strongest predictors of future CDR-SOB. For long time windows (>5 years), brain-based models outperformed models based on past CDR assessments. Despite high performance in ADNI (Fig 2B), calibration did not generalize to OASIS (Fig 2C), highlighting the need for dataset-specific recalibration. This is evidenced by the difference between the two datasets in the prevalence of CDR-SOB levels, as measured by the uncertainty term. Feature attributions confirmed known pathological markers, with APOE4 emerging as the most important covariate (Fig 2D). Key brain features included hippocampal and medial temporal volumes (Fig 2E), alongside the inferior lateral ventricles, which are known to be associated with AD [8].
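One simple form that dataset-specific recalibration could take, when miscalibration stems mainly from a difference in outcome prevalence, is a prior-shift correction. The abstract does not say which recalibration method would be used, so the sketch below is an assumption. It further assumes a binary event and that only the prevalence differs between datasets, not the relationship between features and outcome within each class. The function name and the prevalence values are hypothetical.

```python
def adjust_for_prevalence(p, train_prevalence, target_prevalence):
    """Rescale a predicted probability for a dataset with a different base rate.

    Assumes prior (label) shift only: class-conditional feature distributions
    are taken to be the same in the training and target datasets.
    """
    pos = p * target_prevalence / train_prevalence
    neg = (1.0 - p) * (1.0 - target_prevalence) / (1.0 - train_prevalence)
    return pos / (pos + neg)


# Hypothetical prevalences: the event is rarer in the external dataset.
train_prev = 0.5
target_prev = 0.2
adjusted = [adjust_for_prevalence(p, train_prev, target_prev)
            for p in (0.1, 0.5, 0.9)]
```

The correction preserves the ranking of predictions, so discrimination is unchanged and only calibration is affected. When the shift is not purely one of prevalence, a recalibration model fitted on held-out data from the target dataset would be needed instead.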

Figure 2
Conclusions:
Probabilistic forecasting enhances the translational potential of machine learning models by quantifying uncertainty while maintaining accuracy comparable to point-prediction methods. The best predictions were obtained by combining MRI features and past CDR scores, which consistently improved discrimination and resolution across time windows compared to models using only one modality, demonstrating the value of both modalities for predicting long-term cognitive trajectories. The failure of calibration in OASIS, i.e., a high reliability score (lower values indicate better calibration), underscores the necessity of tailored recalibration for external datasets. Our work advances personalized medicine and clinical trial design by enabling dynamic and confident predictions of cognitive impairment trajectories.
Disorders of the Nervous System:
Neurodegenerative/Late Life (e.g., Parkinson’s, Alzheimer’s) 1
Lifespan Development:
Aging
Modeling and Analysis Methods:
Classification and Predictive Modeling 2
Neuroinformatics and Data Sharing:
Workflows
Keywords:
ADULTS
Aging
Degenerative Disease
Machine Learning
Statistical Methods
STRUCTURAL MRI
1|2 Indicates the priority used for review
Please indicate below if your study was a "resting state" or "task-activation” study.
Other
Healthy subjects only or patients (note that patient studies may also involve healthy subjects):
Patients
Was this research conducted in the United States?
No
Was any human subjects research approved by the relevant Institutional Review Board or ethics panel?
Not applicable
Was any animal research approved by the relevant IACUC or other animal research panel?
Not applicable
Please indicate which methods were used in your research:
Structural MRI
Neuropsychological testing
Computational modeling
For human MRI, what field strength scanner do you use?
1.5T
3.0T
Which processing packages did you use for your study?
FreeSurfer
References:
1. Karaman BK, Mormino EC, Sabuncu MR, for the Alzheimer’s Disease Neuroimaging Initiative. Machine learning based multi-modal prediction of future decline toward Alzheimer’s disease: An empirical study. PLOS ONE. 2022;17(11):e0277322. doi:10.1371/journal.pone.0277322
2. Vieira BH, Liem F, Dadi K, Engemann DA, Gramfort A, Bellec P, Craddock RC, Damoiseaux JS, Steele CJ, Yarkoni T, Langer N, Margulies DS, Varoquaux G. Predicting future cognitive decline from non-brain and multimodal brain imaging data in healthy and pathological aging. Neurobiol Aging. 2022;118:55-65. doi:10.1016/j.neurobiolaging.2022.06.008
3. Marinescu RV, Oxtoby NP, Young AL, Bron EE, Toga AW, Weiner MW, ..., The EuroPOND Consortium, The Alzheimer’s Disease Neuroimaging Initiative. The Alzheimer’s Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up. Mach Learn Biomed Imaging. 2021;1(December 2021 issue):1-60. doi:10.59275/j.melba.2021-2dcc
4. Bhagwat N, Viviano JD, Voineskos AN, Chakravarty MM, Alzheimer’s Disease Neuroimaging Initiative. Modeling and prediction of clinical symptom trajectories in Alzheimer’s disease using longitudinal data. PLoS Comput Biol. 2018;14(9):e1006376. doi:10.1371/journal.pcbi.1006376
5. Mitchell K. Score Decompositions in Forecast Verification. University of Exeter; 2019.
6. Cedarbaum JM, Jaros M, Hernandez C, Coley N, Andrieu S, Grundman M, Vellas B, Alzheimer’s Disease Neuroimaging Initiative. Rationale for use of the Clinical Dementia Rating Sum of Boxes as a primary outcome measure for Alzheimer’s disease clinical trials. Alzheimers Dement. 2013;9(1S):S45-S55. doi:10.1016/j.jalz.2011.11.002
7. Grinsztajn L, Oyallon E, Varoquaux G. Why do tree-based models still outperform deep learning on typical tabular data? In: Advances in Neural Information Processing Systems 35 (NeurIPS 2022). Vol 35. 2022:507-520. Accessed November 22, 2024. https://proceedings.neurips.cc/paper_files/paper/2022/hash/0378c7692da36807bdec87ab043cdadc-Abstract-Datasets_and_Benchmarks.html
8. Ezzati A, Katz MJ, Zammit AR, Lipton ML, Zimmerman ME, Sliwinski MJ, Lipton RB. Differential association of left and right hippocampal volumes with verbal episodic and spatial memory in older adults. Neuropsychologia. 2016;93:380-385. doi:10.1016/j.neuropsychologia.2016.08.016