Poster No:
1134
Submission Type:
Abstract Submission
Authors:
Myrthe van Haaften1, Kaouther Mouheb1, Tavia Evans1, Henri Vrooman1, Harro Seelaar1, Frank Wolters1, Meike Vernooij1, Esther Bron1
Institutions:
1Erasmus MC University Medical Center, Rotterdam, the Netherlands
First Author:
Co-Author(s):
Kaouther Mouheb
Erasmus MC University Medical Center
Rotterdam, the Netherlands
Tavia Evans
Erasmus MC University Medical Center
Rotterdam, the Netherlands
Henri Vrooman
Erasmus MC University Medical Center
Rotterdam, the Netherlands
Frank Wolters
Erasmus MC University Medical Center
Rotterdam, the Netherlands
Esther Bron, PhD
Erasmus MC University Medical Center
Rotterdam, the Netherlands
Introduction:
Dementia is a syndrome that can be caused by different underlying diseases, such as Alzheimer's disease (AD) and frontotemporal dementia (FTD). Differentiating between AD and FTD is difficult because symptoms and neuroimaging patterns overlap and both diseases vary in clinical presentation. Data-driven deep learning models can discover complex patterns in imaging data, which can support the diagnostic process. Several large neuroimaging cohorts of patients with dementia provide the sample size needed to study such deep learning models, but their generalizability to local settings is uncertain. The aim of this study is twofold: first, to develop an MRI-based deep learning model on public AD and FTD datasets and evaluate its generalizability to a local memory clinic (MC) cohort; second, to evaluate the effect of finetuning the model on the local MC cohort.
Methods:
We included T1-weighted brain MRI scans of AD and FTD patients from four cohorts: ADNI, NIFD, NACC (NIA-funded Alzheimer's Disease Research Centers, grant U24 AG072122), and ACE (our in-house MC cohort). Only AD patients with an age up to 75 years were included, as a first FTD diagnosis in older patients is uncommon, which makes the classification task less relevant beyond this age. ADNI, NIFD and NACC were combined into the source dataset (N=593 AD, N=192 FTD), whereas ACE served as the local MC cohort (N=125 AD, N=113 FTD). Each dataset was randomly split 10 times into training, validation and test sets (source: 80%/10%/10%, ACE: 40%/10%/50%), stratified by clinical diagnosis and, for the source dataset, by original cohort. The MRI scans were processed with a voxel-based morphometry pipeline to construct gray matter density maps, which served as input to DenseNet-121, a widely used deep learning architecture, for AD vs. FTD classification. For each data split, we first trained the model on the source dataset and then finetuned all model layers on ACE. Hyperparameters were tuned on the first split, and the optimal combination was used for all 10 splits. Class weights were used to correct for class imbalance. The models before and after finetuning were tested on the respective source and ACE test sets, and mean performance metrics were computed over the 10 splits.
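The repeated stratified splitting and the class weighting described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names `stratified_split` and `inverse_frequency_weights` are hypothetical, and the inverse-frequency weighting scheme is an assumption, since the abstract does not state how the class weights were derived.

```python
import random
from collections import Counter, defaultdict


def stratified_split(strata, fractions, seed):
    """Split sample indices into train/val/test, preserving strata proportions.

    strata    -- one hashable label per sample, e.g. "AD" or ("AD", "ADNI")
    fractions -- (train, val, test) fractions summing to 1
    seed      -- seed for this particular random split
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for idx, stratum in enumerate(strata):
        by_stratum[stratum].append(idx)

    train, val, test = [], [], []
    for indices in by_stratum.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    return {c: n_samples / (len(counts) * k) for c, k in counts.items()}


# Diagnosis counts of the local cohort as reported above (125 AD, 113 FTD).
ace_labels = ["AD"] * 125 + ["FTD"] * 113
# Ten random 40%/10%/50% splits, one seed per split.
ace_splits = [stratified_split(ace_labels, (0.4, 0.1, 0.5), seed)
              for seed in range(10)]

# Class weights for the source dataset (593 AD, 192 FTD).
source_weights = inverse_frequency_weights(["AD"] * 593 + ["FTD"] * 192)
```

For the source dataset, the same function could be called with (diagnosis, cohort) tuples as strata, so that each split preserves both the diagnosis ratio and the cohort mix. Under the assumed weighting, the minority FTD class receives the larger weight.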
Results:
The models trained only on the source data (pre-finetuning) showed good overall performance (Fig. 1), with a small drop between the internal source test set (balanced accuracy (BA)=0.87) and the external ACE test set (BA=0.82). Notably, finetuning the models on ACE (post-finetuning) slightly lowered performance rather than improving it (BA=0.80 on ACE). FTD images from NACC were harder to classify (accuracy (ACC)=0.70) than those from NIFD (ACC=0.89) and ACE (ACC=0.77) (Fig. 2a), which might be due to greater data heterogeneity (e.g. in imaging protocol) and a smaller FTD sample size in NACC. We did not observe this effect in AD, however, possibly because the AD sample in NACC was larger than the FTD sample. In ACE, model performance was higher on AD than on FTD (ACC 0.87 vs. 0.77 pre-finetuning, 0.84 vs. 0.77 post-finetuning). For both the pre- and post-finetuning models, performance differed across the AD and FTD subtypes (Fig. 2b). The pre-finetuning model showed a large performance drop between the source data and ACE for progressive non-fluent aphasia, an FTD language variant (ACC 0.86 vs. 0.44), which was not resolved by finetuning.
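The abstract does not define balanced accuracy. Assuming the usual definition (the mean of the per-class accuracies, i.e. per-class recall), the reported ACE figures are mutually consistent, as the sketch below illustrates. The function name `balanced_accuracy` and the toy predictions are illustrative only and are not study data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (accuracy within each true class)."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        preds_for_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in preds_for_c) / len(preds_for_c))
    return sum(recalls) / len(recalls)


# Toy example (made-up predictions): 3 of 4 AD and 1 of 2 FTD are correct.
y_true = ["AD", "AD", "AD", "AD", "FTD", "FTD"]
y_pred = ["AD", "AD", "AD", "FTD", "FTD", "AD"]
toy_ba = balanced_accuracy(y_true, y_pred)  # (0.75 + 0.5) / 2

# Consistency check with the rounded per-class accuracies reported for ACE
# before finetuning (AD 0.87, FTD 0.77).
ace_pre_ba = (0.87 + 0.77) / 2
```

With the rounded pre-finetuning values, the mean of the AD and FTD accuracies reproduces the reported BA of 0.82 on ACE. Unlike plain accuracy, this metric is not dominated by the larger class.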


Conclusions:
Models developed on ADNI, NIFD and NACC generally showed good generalizability to a local tertiary MC cohort, implying that these datasets are representative at least for specialized referral centers. Finetuning all layers had no beneficial effect on performance, but its added value might be limited by the heterogeneity and small size of the MC validation set. Other finetuning strategies (e.g. tuning only the last layers, few-shot learning) or more extensive hyperparameter tuning could be explored further.
Disorders of the Nervous System:
Neurodegenerative/ Late Life (eg. Parkinson’s, Alzheimer’s) 2
Modeling and Analysis Methods:
Classification and Predictive Modeling 1
Novel Imaging Acquisition Methods:
Anatomical MRI
Keywords:
Degenerative Disease
Machine Learning
STRUCTURAL MRI
Other - Dementia Diagnosis
1|2 Indicates the priority used for review
Please indicate below if your study was a "resting state" or "task-activation” study.
Other
Healthy subjects only or patients (note that patient studies may also involve healthy subjects):
Patients
Was this research conducted in the United States?
No
Were any human subjects research approved by the relevant Institutional Review Board or ethics panel?
NOTE: Any human subjects studies without IRB approval will be automatically rejected.
Yes
Were any animal research approved by the relevant IACUC or other animal research panel?
NOTE: Any animal studies without IACUC approval will be automatically rejected.
Not applicable
Please indicate which methods were used in your research:
Structural MRI
Computational modeling