NeuroConText: Contrastive Text-to-Brain Mapping for Neuroscientific Literature

Poster No:

1162 

Submission Type:

Late-Breaking Abstract Submission 

Authors:

Fateme Ghayem1, Raphaël Meudec2, Himanshu Aggarwal3, Jérôme Dockès3, Demian Wassermann4, Bertrand Thirion3

Institutions:

1Inria Paris Saclay - MIND team, Palaiseau, Ile de France, 2Inria Paris Saclay - MIND team, Palaiseau, ile de france, 3INRIA Saclay, Paris, Ile-de-France, 4MIND Team, Inria Saclay, Université Paris-Saclay, Palaiseau, France, Palaiseau, France

First Author:

Fateme Ghayem  
Inria Paris Saclay - MIND team
Palaiseau, Ile de France

Co-Author(s):

Raphaël Meudec  
Inria Paris Saclay - MIND team
Palaiseau, ile de france
Himanshu Aggarwal  
INRIA Saclay
Paris, Ile-de-France
Jérôme Dockès  
INRIA Saclay
Paris, Ile-de-France
Demian Wassermann  
MIND Team, Inria Saclay, Université Paris-Saclay, Palaiseau, France
Palaiseau, France
Bertrand Thirion  
INRIA Saclay
Paris, Ile-de-France

Late Breaking Reviewer(s):

Fernando Barrios, Ph.D.  
Universidad Nacional Autónoma de México
Querétaro, Querétaro
Yi-Ju Lee, Dr.  
Academia Sinica
Taipei City, Taipei City
Casey Paquola  
Institute for Neuroscience and Medicine, INM-7, Forschungszentrum Jülich
Jülich, NA

Introduction:

Meta-analysis, particularly coordinate-based meta-analysis (CBMA), is widely used in neuroscience to synthesize findings across studies (Yarkoni, 2011). However, existing CBMA tools rely on bag-of-words approaches, so they struggle with inconsistent terminology, long texts, and semantic meaning. To address these limitations, we propose NeuroConText, a contrastive learning-based model that strengthens the association between text and brain activations. Unlike previous methods such as NeuroQuery (Dockès, 2020) and Text2Brain (Ngo, 2021), NeuroConText leverages large language models (LLMs) to process neuroscientific text of varying lengths and link it to brain activations. NeuroConText substantially improves text-to-brain retrieval performance while reconstructing brain maps from text on par with state-of-the-art methods.

Methods:

We collected 20K neuroscientific articles from PubMed Central, extracting their text, metadata, and activation coordinates. We encoded the text using Mistral-7B (Jiang, 2023) embeddings and handled the model's token-length limit by splitting long texts into chunks and averaging the chunk embeddings. Activation coordinates were transformed into kernel density estimation (KDE) maps. To mitigate high-dimensionality challenges, we reduced the dimensionality of the KDE maps by projecting them onto the Dictionary of Functional Modes (DiFuMo) atlas with 512 components (Dadi, 2020).
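The two preprocessing steps above can be illustrated with a minimal sketch. It is not the authors' pipeline: `toy_embed` is a hypothetical stand-in for the Mistral-7B encoder, the unweighted mean over chunks and the un-normalised isotropic Gaussian kernel are assumptions, and the chunk length, `sigma`, and grid are arbitrary illustration values.

```python
import math


def chunk_tokens(tokens, max_len):
    """Split a token list into consecutive chunks of at most max_len tokens."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def mean_embedding(tokens, embed_chunk, max_len):
    """Embed each chunk separately, then take the unweighted mean (assumption)."""
    if not tokens:
        raise ValueError("cannot embed an empty token list")
    vectors = [embed_chunk(chunk) for chunk in chunk_tokens(tokens, max_len)]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def toy_embed(chunk):
    """Hypothetical stand-in for an LLM encoder: a 2-D vector of simple statistics."""
    return [float(len(chunk)), sum(len(t) for t in chunk) / len(chunk)]


def kde_map(coords, grid, sigma):
    """Sum an isotropic Gaussian kernel centred on each reported peak coordinate."""
    values = []
    for g in grid:
        total = 0.0
        for c in coords:
            sq_dist = sum((gi - ci) ** 2 for gi, ci in zip(g, c))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        values.append(total)
    return values


# Usage: a five-token "text" with a two-token limit yields three chunks.
text_vector = mean_embedding(["a", "bb", "ccc", "dddd", "eeeee"], toy_embed, max_len=2)
# Usage: one peak at the origin, evaluated at two grid points (coordinates in mm).
density = kde_map([(0, 0, 0)], [(0, 0, 0), (10, 0, 0)], sigma=5.0)
```

In practice the density would be evaluated on every voxel of a brain mask and then projected onto the DiFuMo components; that projection is omitted here.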

NeuroConText follows a contrastive learning paradigm inspired by CLIP (Radford, 2021). We designed two encoders: a projection head for text embeddings and a residual head for DiFuMo representations. These encoders were trained with the InfoNCE loss to align text and brain activation coordinates within a shared latent space (Fig.1-A). Additionally, we developed a decoder to generate brain activation maps from latent text embeddings to enable brain image reconstruction from text queries (Fig.1-B).
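As an illustration of the training objective, the following is a minimal sketch of a CLIP-style InfoNCE loss over a batch of paired latents. The symmetric (text-to-brain plus brain-to-text) form, the cosine similarity, and the `temperature` value are assumptions borrowed from CLIP rather than details confirmed by this abstract, and the function names are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def info_nce(text_latents, brain_latents, temperature=0.1):
    """Symmetric InfoNCE: pair i is the only positive for row i and column i."""
    t = [l2_normalize(v) for v in text_latents]
    b = [l2_normalize(v) for v in brain_latents]
    n = len(t)
    # Cosine similarity between every text and every brain latent, scaled.
    logits = [[sum(x * y for x, y in zip(t[i], b[j])) / temperature
               for j in range(n)] for i in range(n)]
    # Text-to-brain direction: each row must pick out its own column.
    text_to_brain = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Brain-to-text direction: each column must pick out its own row.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    brain_to_text = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (text_to_brain + brain_to_text)


# Usage: matched pairs give a loss near zero, swapped pairs a large loss.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

When all latents are identical the loss equals the logarithm of the batch size, which is the value an uninformative model would obtain.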
Supporting Image: Fig1.jpg
 

Results:

We evaluated NeuroConText against NeuroQuery and Text2Brain using our dataset split into 19K training and 1K test samples, employing 15-fold cross-validation. Performance was measured using Recall@K (K={10,100}) and Mix&Match metrics. NeuroConText outperformed the baselines in all association tasks. For full-body text queries, NeuroConText achieved a Recall@10 of 22.6%, surpassing NeuroQuery (7%) and Text2Brain (1.4%). Similar trends were observed across the other text sections (title and abstract) and evaluation metrics, confirming the advantage of our contrastive framework in linking text with brain activation maps (Fig.2-A).
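For readers unfamiliar with the retrieval metric, here is a minimal sketch of Recall@K. It assumes a square similarity matrix in which entry (i, j) scores text i against brain map j and the correct match for text i is map i; resolving ties in favour of the true map is an arbitrary choice, and the function name is hypothetical.

```python
def recall_at_k(similarity, k):
    """Fraction of texts whose true brain map ranks among the top k candidates."""
    hits = 0
    for i, row in enumerate(similarity):
        # Zero-based rank of the true map: candidates scoring strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return hits / len(similarity)


# Usage: texts 0 and 2 rank their own map first; text 1 ranks it last.
scores = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.5],
    [0.1, 0.2, 0.7],
]
recall_1 = recall_at_k(scores, 1)
recall_3 = recall_at_k(scores, 3)
```

Assuming retrieval is performed among the 1K test maps, a random ranking would give a Recall@10 of about 1%, which helps put the reported percentages in context.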

Additionally, we evaluated the capability of NeuroConText in reconstructing brain maps from text using NeuroVault descriptions and contrast maps (Pinho, 2018). Results show that NeuroConText performed on par with NeuroQuery and Text2Brain in this task. NeuroConText successfully maintained reconstruction quality while benefiting from the advantages of a contrastive learning framework (Fig.2-B).
Supporting Image: Fig2.jpg
 

Conclusions:

We introduced NeuroConText, a contrastive coordinate-based meta-analysis model designed to link neuroscientific text with brain activation coordinates. It combines LLM embeddings of full-body text, which give a rich text representation, with DiFuMo coefficients derived from KDE maps of activation coordinates. By establishing a shared latent space between these two modalities, NeuroConText enhances text-to-brain association and enables brain map reconstruction from textual descriptions. Our results demonstrate its superior performance in associating text with brain activations while achieving reconstruction quality comparable to existing methods.

Modeling and Analysis Methods:

Activation (e.g. BOLD task-fMRI)
Classification and Predictive Modeling 1
Methods Development
Multivariate Approaches 2
Other Methods

Keywords:

Data analysis
FUNCTIONAL MRI
Machine Learning
Meta-Analysis
MRI
Multivariate
Statistical Methods

1|2 Indicates the priority used for review

Abstract Information

Please indicate below if your study was a "resting state" or "task-activation" study.

Resting state
Task-activation

Healthy subjects only or patients (note that patient studies may also involve healthy subjects):

Patients

Was this research conducted in the United States?

No

Were any human subjects research approved by the relevant Institutional Review Board or ethics panel? NOTE: Any human subjects studies without IRB approval will be automatically rejected.

Not applicable

Were any animal research approved by the relevant IACUC or other animal research panel? NOTE: Any animal studies without IACUC approval will be automatically rejected.

Not applicable

Please indicate which methods were used in your research:

Functional MRI

For human MRI, what field strength scanner do you use?

If Other, please list  -   NA

References:

Dadi, K., Varoquaux, G., Machlouzarides-Shalit, A., Gorgolewski, K. J., Wassermann, D., Thirion, B., & Mensch, A. (2020). Fine-grain atlases of functional modes for fMRI analysis. NeuroImage, 221, 117126.

Dockès, J., Poldrack, R. A., Primet, R., Gözükan, H., Yarkoni, T., Suchanek, F., Thirion, B., & Varoquaux, G. (2020). NeuroQuery, comprehensive meta-analysis of human brain mapping. eLife, 9, e53385.

Jiang, A. Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D. S., Casas, D. d. l., Bressand, F., Lengyel, G., Lample, G., & Saulnier, L. (2023). Mistral 7B. arXiv preprint arXiv:2310.06825.

Ngo, G. H., Nguyen, M., Chen, N. F., & Sabuncu, M. R. (2021). Text2Brain: Synthesis of brain activation maps from free-form text query. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part VII 24 (pp. 605–614). Springer.

Pinho, A. L., Amadon, A., Ruest, T., Fabre, M., Dohmatob, E., Denghien, I., Ginisty, C., Becuwe-Desmidt, S., Roger, S., Laurier, L., Joly-Testault, V., Médiouni-Cloarec, G., Doublé, C., Martins, B., Pinel, P., Eger, E., Varoquaux, G., Pallier, C., Dehaene, S., Hertz-Pannier, L., & Thirion, B. (2018). Individual Brain Charting, a high-resolution fMRI dataset for cognitive mapping. Scientific Data, 5, 180105.

Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al. (2021). Learning transferable visual models from natural language supervision. In International Conference on Machine Learning (pp. 8748–8763). PMLR.

Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing & Management, 24(5), 513–523.

Yarkoni, T., Poldrack, R. A., Nichols, T. E., Van Essen, D. C., & Wager, T. D. (2011). Large-scale automated synthesis of human functional neuroimaging data. Nature Methods, 8(8), 665–670.
