Poster No:
1862
Submission Type:
Late-Breaking Abstract Submission
Authors:
Cheng Wang1, Yu Jiang1, Zhihao Peng1, Chenxin Li1, Changbei Bang2, Carl Yang3, Lifang He4, Daniel Barron5, Randy Hirschtick6, Byung-Hoon Kim2, Xiang Li6, Yixuan Yuan1
Institutions:
1The Chinese University of Hong Kong, Hong Kong, Hong Kong, 2Yonsei University, Seoul, Seoul, 3Emory University, Atlanta, GA, 4Lehigh University, Bethlehem, PA, 5Brigham and Women's Hospital, Boston, MA, 6Massachusetts General Hospital, Boston, MA
First Author:
Cheng Wang
The Chinese University of Hong Kong
Hong Kong, Hong Kong
Co-Author(s):
Yu Jiang
The Chinese University of Hong Kong
Hong Kong, Hong Kong
Zhihao Peng
The Chinese University of Hong Kong
Hong Kong, Hong Kong
Chenxin Li
The Chinese University of Hong Kong
Hong Kong, Hong Kong
Xiang Li
Massachusetts General Hospital
Boston, MA
Yixuan Yuan
The Chinese University of Hong Kong
Hong Kong, Hong Kong
Late Breaking Reviewer(s):
Jaehee Kim
Duksung Women's University
Seoul, Seoul
Introduction:
Functional Magnetic Resonance Imaging (fMRI) is essential for studying brain function and neurological disorders, yet its analysis is hindered by low signal-to-noise ratio, test-retest variability, complex preprocessing, and limited dataset sizes. These challenges contribute to a reproducibility crisis and limit model transferability across tasks and populations. In response, we introduce fMRI-GPT, a foundation model for fMRI analysis pre-trained on 55,000+ multi-site 4D fMRI volumes. Evaluation on a diverse array of downstream tasks shows that fMRI-GPT achieves state-of-the-art performance while requiring fewer training samples than existing methods. By learning whole-brain voxel-wise representations, fMRI-GPT provides a scalable and generalizable framework for fMRI analysis. It enhances decoding of brain function for perception, memory, emotion, and decision-making while improving reproducibility across studies. By bridging deep learning with neuroimaging, fMRI-GPT advances precision psychiatry, brain-computer interfaces, and early disease detection.

Figure: Overview of the proposed fMRI-GPT model.
Methods:
fMRI-GPT is pre-trained on 55,000+ fMRI sequences from multi-center datasets, including UK Biobank, ABCD, and HCP, covering diverse demographics and clinical conditions. Data preprocessing includes motion correction, slice-timing correction, spatial normalization to MNI152 space, and Z-score intensity normalization. The model employs a Masked Autoencoder (MAE) framework to learn latent representations of brain activity. A Spatiotemporal Redundancy Dropout (STRD) module enhances noise resilience by filtering out redundant information, improving test-retest reliability, and a shifted-window Mamba backbone enables efficient 4D volume processing, reducing GPU memory usage while preserving long-range dependencies in brain signals.
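To make the masked-reconstruction objective concrete, below is a minimal PyTorch sketch over flattened spatiotemporal fMRI patches. All dimensions, the 75% mask ratio, and the plain Transformer encoder (standing in for the shifted-window Mamba backbone; STRD omitted) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class MaskedPatchAutoencoder(nn.Module):
    """Toy MAE-style objective: hide random spatiotemporal patches and
    reconstruct them. Mask tokens are passed through the encoder for
    brevity; a true MAE encodes only the visible patches."""
    def __init__(self, patch_dim=512, embed_dim=256, mask_ratio=0.75):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.embed = nn.Linear(patch_dim, embed_dim)
        self.mask_token = nn.Parameter(torch.zeros(embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # stand-in backbone
        self.decoder = nn.Linear(embed_dim, patch_dim)             # voxel reconstruction

    def forward(self, patches):  # patches: (batch, n_patches, patch_dim)
        B, N, _ = patches.shape
        mask = torch.rand(B, N, device=patches.device) < self.mask_ratio
        tokens = self.embed(patches)
        tokens[mask] = self.mask_token                  # hide the masked patches
        recon = self.decoder(self.encoder(tokens))
        # Reconstruction loss is computed on the masked positions only.
        return nn.functional.mse_loss(recon[mask], patches[mask])

# Each 4D volume is assumed to be flattened into spatiotemporal patches upstream.
patches = torch.randn(2, 196, 512)  # (batch, patches, voxels per patch)
loss = MaskedPatchAutoencoder()(patches)
loss.backward()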
Pretraining follows a self-supervised approach, where the model reconstructs masked fMRI sequences to learn intrinsic brain activity patterns. Fine-tuning is performed using Task-specific Prompt Tuning, which updates only a small subset of parameters, allowing efficient adaptation to new tasks with minimal labeled data. fMRI-GPT is evaluated across five key fMRI tasks: age and gender prediction, phenotype prediction, disease diagnosis, fMRI retrieval, and task-state classification. Performance is compared against state-of-the-art ROI-based and volume-based models using accuracy, Pearson correlation, mean squared error (MSE), and area under the curve (AUC).
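The sketch below illustrates the prompt-tuning recipe: the pre-trained encoder is frozen, and only a handful of learnable prompt tokens plus a small task head are updated. The prompt count, dimensions, mean pooling, and linear head are assumptions for illustration, not the paper's configuration.

import torch
import torch.nn as nn

class PromptTunedClassifier(nn.Module):
    """Freeze a pre-trained encoder; train only prompt tokens and a task head."""
    def __init__(self, encoder, embed_dim=256, n_prompts=8, n_classes=2):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():  # backbone weights stay fixed
            p.requires_grad = False
        self.prompts = nn.Parameter(torch.randn(n_prompts, embed_dim) * 0.02)
        self.head = nn.Linear(embed_dim, n_classes)  # lightweight task head

    def forward(self, tokens):  # tokens: (batch, n_tokens, embed_dim)
        prompts = self.prompts.unsqueeze(0).expand(tokens.size(0), -1, -1)
        z = self.encoder(torch.cat([prompts, tokens], dim=1))
        return self.head(z.mean(dim=1))  # mean-pool, then classify

# Only the prompts and the head receive gradient updates.
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(256, nhead=4, batch_first=True), num_layers=2)
model = PromptTunedClassifier(backbone)
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3)
logits = model(torch.randn(4, 100, 256))  # e.g., features for a diagnosis task
loss = nn.functional.cross_entropy(logits, torch.randint(0, 2, (4,)))
loss.backward()
optimizer.step()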
Results:
fMRI-GPT achieves state-of-the-art (SOTA) performance across five key fMRI tasks while requiring significantly fewer training samples. For age prediction, it reduces mean squared error (MSE) by 12.5% compared to SwiFT (Kim et al., 2023), the best-performing volume-based method. In gender classification, it achieves 93.3% accuracy, outperforming BrainGNN (Li et al., 2021) at 85.6%. For phenotype prediction, fMRI-GPT surpasses existing models, achieving a Pearson correlation coefficient (PCC) of 0.429 in emotion-related phenotype estimation, 15.2% higher than previous SOTA models such as BrainLM (Ortega Caro et al., 2023). In disease diagnosis, it outperforms BrainLM and Com-BrainTF (Kan et al., 2022) on schizophrenia classification (HCP-EP dataset), reaching 75.2% accuracy, a 15.3% improvement over ROI-based approaches. Notably, fMRI-GPT is the first model to perform 4D volume-based fMRI retrieval, achieving 79.4% accuracy, whereas the SOTA method MindEyeV2 (Scotti et al., 2024) relies on pre-defined vision-related ROIs. In task-state classification, it attains 92.6% accuracy, exceeding SwiFT's 88.1%.
Conclusions:
fMRI-GPT introduces a new paradigm in fMRI analysis by leveraging large-scale data and a powerful pre-trained model. It provides a scalable and versatile solution with broad applications in cognitive neuroscience, precision psychiatry, and brain-computer interfaces.
Modeling and Analysis Methods:
Activation (e.g., BOLD task-fMRI)
fMRI Connectivity and Network Modeling 2
Neuroinformatics and Data Sharing:
Workflows 1
Keywords:
FUNCTIONAL MRI
Modeling
Other - Foundation Model
1|2 indicates the priority used for review
By submitting your proposal, you grant permission for the Organization for Human Brain Mapping (OHBM) to distribute your work in any format, including video, audio, print, and electronic text through OHBM OnDemand, social media channels, the OHBM website, or other electronic publications and media.
I accept
The Open Science Special Interest Group (OSSIG) is introducing a reproducibility challenge for OHBM 2025. This new initiative aims to enhance the reproducibility of scientific results and foster collaborations between labs. Teams will consist of a "source" party and a "reproducing" party, and will be evaluated on the success of their replication, the openness of the source work, and additional deliverables.
Propose your OHBM abstract(s) as source work for future OHBM meetings by selecting one of the following options:
I am submitting this abstract as an original work to be reproduced. I am available to be the “source party” in an upcoming team and consent to have this work listed on the OSSIG website. I agree to be contacted by OSSIG regarding the challenge and may share data used in this abstract with another team.
Please indicate below if your study was a "resting state" or "task-activation" study.
Other
Healthy subjects only or patients (note that patient studies may also involve healthy subjects):
Patients
Was this research conducted in the United States?
No
Was any human subjects research approved by the relevant Institutional Review Board or ethics panel?
NOTE: Any human subjects studies without IRB approval will be automatically rejected.
Not applicable
Was any animal research approved by the relevant IACUC or other animal research panel?
NOTE: Any animal studies without IACUC approval will be automatically rejected.
Not applicable
Please indicate which methods were used in your research:
Functional MRI
For human MRI, what field strength scanner do you use?
-
If Other, please list
varies
Which processing packages did you use for your study?
FreeSurfer
Provide references using APA citation style.
Casey, B. J., Cannonier, T., Conley, M. I., Cohen, A. O., Barch, D. M., Heitzeg, M. M., … & Garavan, H. (2018). The Adolescent Brain Cognitive Development (ABCD) study: Imaging acquisition across 21 sites. Developmental Cognitive Neuroscience, 32, 43–54.
Kan, X., et al. (2022). Brain network transformer. Advances in Neural Information Processing Systems, 35, 25586–25599.
Kim, P., et al. (2023). SwiFT: Swin 4D fMRI transformer. Advances in Neural Information Processing Systems, 36, 42015–42037.
Li, X., et al. (2021). BrainGNN: Interpretable brain graph neural network for fMRI analysis. Medical Image Analysis, 74, 102233.
Miller, K. L., Alfaro-Almagro, F., Bangerter, N. K., Thomas, D. L., Yacoub, E., Xu, J., … & Smith, S. M. (2016). Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nature Neuroscience, 19(11), 1523–1536.
Ortega Caro, J., et al. (2023). BrainLM: A foundation model for brain activity recordings. bioRxiv.
Scotti, P. S., et al. (2024). MindEyeV2: Shared-subject models enable fMRI-to-image with 1 hour of data. arXiv preprint arXiv:2403.11207.
Van Essen, D. C., Smith, S. M., Barch, D. M., Behrens, T. E., Yacoub, E., & Ugurbil, K. (2013). The WU-Minn Human Connectome Project: An overview. NeuroImage, 80, 62–79.