Poster No:
1538
Submission Type:
Abstract Submission
Authors:
Danny Dongyeop Han1, Ahhyun Lee2, Junbeom Kwon3, Jiook Cha4
Institutions:
1Seoul National University, Daegu, Korea, Republic of, 2Seoul National University, Seoul, Seoul, 3The University of Texas at Austin, Austin, TX, 4Seoul National University, Seoul, Korea, Republic of
First Author:
Co-Author(s):
Jiook Cha
Seoul National University
Seoul, Korea, Republic of
Introduction:
Deep learning holds promise for brain imaging, but the scarcity and cost of labeled clinical data limit its application. One solution is to pretrain models with self-supervised learning on large unlabeled brain imaging datasets. During pretraining, the model compares and contrasts different views of the same data (contrastive learning) to learn useful features, and it is then fine-tuned on smaller, labeled datasets. Previous pretraining methods often overlooked metadata, a limitation given its availability and its correlation with mental disorders (Eaton et al., 2011; McCrea et al., 2012; Solmi et al., 2022). A recent contrastive learning method addressed this by incorporating readily available metadata (patient age) and detected Alzheimer's disease better than traditional contrastive learning and fully supervised methods (y-Aware; Dufumier et al., 2021). Building on this, we propose the Multi-label Aware loss, which integrates multiple metadata types (age, BMI, sex) during pretraining on a dataset six times larger (60,000 vs. 10,000 scans) and improves adaptation to clinical tasks with limited data. Our method shows promise for enhancing prediction of psychiatric disorders through effective pretraining on unlabeled data.
Methods:
T1 MRI scans from the Adolescent Brain Cognitive Development (ABCD) Study (N=11365, male: 53%) of 8-14 year olds and the UK Biobank (UKB) (N=40986, male: 47%) of 40-70 year olds were used. Scans were prepared by linearly transforming the FreeSurfer-preprocessed data from native space to Talairach space. The linear transform ensured consistent image shapes while avoiding nonlinear warping. Three approaches were tested: (1) our proposed method, which extends y-Aware by redefining the InfoNCE loss based on distances in a multi-label space (age, BMI, sex); (2) a supervised baseline trained from scratch with no pretraining, using cross-validation with folds divided by iterative stratification on age, sex, and BMI to mitigate overfitting from the low sample size (~80); and (3) a y-Aware pretraining baseline that uses supervised contrastive learning with the continuous proxy metadata InfoNCE loss, in which samples with similar ages are weighted more strongly as positive pairs. Approaches (1) and (3) were followed by fine-tuning with the same procedure as the supervised baseline. All training included hyperparameter optimization on a separate validation set.
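The idea of an InfoNCE loss weighted by distances in a multi-label space can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that each metadata column is z-scored, that sex is encoded as 0/1 and treated as one more dimension, and that pair weights come from a Gaussian (RBF) kernel with a hypothetical bandwidth `sigma`. The function names, `sigma`, and the temperature `tau` are illustrative choices, and the loss uses only cross-view similarities for simplicity.

```python
import numpy as np


def label_kernel(labels, sigma=1.0):
    """RBF kernel over z-scored metadata rows (assumed columns: age, BMI, sex)."""
    labels = np.asarray(labels, dtype=float)
    std = labels.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    z = (labels - labels.mean(axis=0)) / std
    # Pairwise squared Euclidean distances in the standardized label space.
    sq_dist = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def multilabel_aware_infonce(emb_a, emb_b, labels, sigma=1.0, tau=0.1):
    """Kernel-weighted InfoNCE between two augmented views of one batch."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / tau  # cosine similarity scaled by temperature
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Row-normalized weights: pairs with similar metadata act as softer positives.
    w = label_kernel(labels, sigma)
    w = w / w.sum(axis=1, keepdims=True)
    return float(-(w * log_prob).sum(axis=1).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(4, 8))  # stand-in embeddings of view 1
    view_b = rng.normal(size=(4, 8))  # stand-in embeddings of view 2
    # Hypothetical metadata rows: age (years), BMI, sex (0/1).
    meta = np.array([[9.0, 17.0, 0.0], [12.0, 21.0, 1.0],
                     [45.0, 26.0, 0.0], [68.0, 31.0, 1.0]])
    print(multilabel_aware_infonce(view_a, view_b, meta))
```

As `sigma` shrinks toward zero the kernel approaches the identity matrix, so only the two views of the same scan count as a positive pair and the expression reduces to a standard one-directional InfoNCE loss. Restricting the kernel to the age column gives an age-only weighting in the spirit of y-Aware.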
Results:
Our model achieved a higher AUROC (area under the receiver operating characteristic curve) than both the y-Aware and supervised baselines across all downstream classification tasks, including classification of Attention-Deficit/Hyperactivity Disorder (ADHD), Anxiety Disorder, and Major Depressive Disorder. Specifically, it achieved an AUROC of 0.609 in ADHD classification, a performance gain of over 29% compared to the baseline models (29.5% vs. y-Aware, 30.0% vs. scratch). It also achieved a 27.7% performance gain (vs. y-Aware) in classifying Major Depressive Disorder. Additionally, the model's superior performance on sex and BMI classification tasks supports its robustness and indicates that pretraining was effective.
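For readers who want to see how such figures are typically computed, the sketch below evaluates AUROC from its pairwise-ranking definition and a percentage gain over a baseline. The abstract does not say whether the reported gains are relative or absolute, so the relative form used here is an assumption. The function names and all numbers in the usage lines are hypothetical and are not the study's data.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def relative_gain(auroc_model, auroc_baseline):
    """Relative improvement over a baseline (assumed definition of 'performance gain')."""
    return (auroc_model - auroc_baseline) / auroc_baseline


if __name__ == "__main__":
    # Hypothetical labels and scores, for illustration only.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(auroc(labels, scores))
    print(f"{relative_gain(0.60, 0.50):.1%}")
```

An AUROC of 0.5 corresponds to chance-level ranking, which is worth keeping in mind when interpreting relative gains over a weak baseline.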
Conclusions:
Our approach demonstrates the effectiveness of incorporating metadata during the pretraining of T1 MRI models. Our model achieves superior performance in psychiatric disorder classification compared to both baseline models. Our Multi-label Aware loss brings multi-label metadata into contrastive learning by extending the y-Aware InfoNCE loss from a single continuous variable ("age") to heterogeneous data types: continuous variables such as "age" and "BMI" and discrete variables such as "sex." This research opens new possibilities for effective pretraining strategies that leverage diverse covariates (including genetic, biological, environmental, and psychological data) to develop better predictive clinical models.
Disorders of the Nervous System:
Neurodevelopmental/Early Life (e.g., ADHD, autism)
Psychiatric (e.g., Depression, Anxiety, Schizophrenia)
Modeling and Analysis Methods:
Classification and Predictive Modeling 2
Methods Development 1
Multivariate Approaches
Keywords:
Affective Disorders
Attention Deficit Disorder
Computational Neuroscience
Data analysis
Modeling
MRI
Psychiatric Disorders
STRUCTURAL MRI
Other - Deep Learning
1|2 Indicates the priority used for review
Please indicate below if your study was a "resting state" or "task-activation" study.
Other
Healthy subjects only or patients (note that patient studies may also involve healthy subjects):
Patients
Was this research conducted in the United States?
No
Were any human subjects research approved by the relevant Institutional Review Board or ethics panel?
NOTE: Any human subjects studies without IRB approval will be automatically rejected.
Not applicable
Were any animal research approved by the relevant IACUC or other animal research panel?
NOTE: Any animal studies without IACUC approval will be automatically rejected.
Not applicable
Please indicate which methods were used in your research:
Structural MRI
Computational modeling
Which processing packages did you use for your study?
FreeSurfer
Provide references using APA citation style.
Chen, T. (2020). A simple framework for contrastive learning of visual representations. arXiv. https://arxiv.org/abs/2002.05709
Dufumier, B. (2021). Contrastive learning with continuous proxy meta-data for 3D MRI classification. arXiv. https://arxiv.org/abs/2106.08808
Eaton, N. (2011). An invariant dimensional liability model of gender differences in mental disorder prevalence: Evidence from a national sample. Journal of Abnormal Psychology, 120(1), 282–288.
Huang, G. (2018). Densely connected convolutional networks. arXiv. https://arxiv.org/abs/1608.06993
Khosla, P. (2021). Supervised contrastive learning. arXiv. https://arxiv.org/abs/2004.11362
McCrea, R. (2012). Body mass index and common mental disorders: exploring the shape of the association and its moderation by age, gender and education. International Journal of Obesity, 36, 414–421.
Solmi, M. (2022). Age at onset of mental disorders worldwide: large-scale meta-analysis of 192 epidemiological studies. Molecular Psychiatry, 27, 281–295.
Wang, W. (2023). A Review of Predictive and Contrastive Self-supervised Learning for Medical Images. Machine Intelligence Research, 20, 483–513.
Zhang-James, Y. (2021). Evidence for similar structural brain anomalies in youth and adult attention-deficit/hyperactivity disorder: A machine learning analysis. Translational Psychiatry, 11(1), 82.