Poster No:
804
Submission Type:
Abstract Submission
Authors:
Ahmad Samara1, Zaid Zada2, Uri Hasson2, Tamara Vanderwal1, Sam Nastase2
Institutions:
1University of British Columbia, Vancouver, BC, 2Princeton University, Princeton, NJ
First Author:
Co-Author(s):
Introduction:
Different regions of the cortical language network contribute differently to natural language comprehension (1). How do these regions coordinate with one another? In this study, we hypothesized that these different cortical areas are coupled via a shared linguistic space (i.e., a high-dimensional vector space).
To pursue this question, we first used inter-subject functional connectivity (ISFC) analysis to delineate the shared, stimulus-driven component of network connectivity (2). ISFC maps tell us where and how much connectivity is driven by the stimulus, but not what stimulus features are driving the shared network structure.
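The leave-one-out ISFC computation described above can be sketched as follows (a minimal illustration with synthetic array shapes, not the exact analysis code; the `isfc` helper name is ours):

```python
import numpy as np

def isfc(data):
    """Leave-one-out inter-subject functional connectivity.

    data: array of shape (n_subjects, n_timepoints, n_parcels).
    For each subject, correlate that subject's parcel time series with
    the average time series of the remaining subjects, across all pairs
    of parcels; average the resulting matrices and symmetrize.
    """
    n_subjects, n_tr, _ = data.shape
    # z-score over time so a scaled dot product is a Pearson correlation
    z = lambda x: (x - x.mean(0)) / x.std(0)
    mats = []
    for s in range(n_subjects):
        left_out = data[s]                                      # (T, P)
        others = data[np.arange(n_subjects) != s].mean(axis=0)  # (T, P)
        mats.append(z(left_out).T @ z(others) / n_tr)
    m = np.mean(mats, axis=0)
    return (m + m.T) / 2

```

Because the left-out subject is only correlated with the average of the other subjects, idiosyncratic and spontaneous fluctuations cancel out, isolating the stimulus-driven component of connectivity.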
We thus developed a model-based connectivity analysis (3) to assess how well explicit model features generalize from one brain area to another. We decomposed the story stimulus into acoustic, speech, and language features extracted from the Whisper speech recognition model (4). This framework allows us to explicitly model what features are shared between different regions of the language network (5).
Methods:
We used fMRI data from the Narratives collection (6), in which 46 subjects listened to two different ~13-minute spoken stories. Data were minimally preprocessed using fMRIPrep, and time series were averaged within 1000 cortical parcels (7). For each story, we extracted three types of linguistic features from Whisper: acoustic embeddings from the input layer of the encoder, speech embeddings from the final layer of the encoder, and text-based language embeddings from the decoder (8) (Fig. 1).
Banded ridge regression was used to estimate joint parcel-wise encoding models combining all three feature spaces in a training story (9). Model-predicted BOLD activity was generated for each of the three feature spaces in a test story. Predictions were evaluated within and across parcels by computing model-based ISFC (mISFC) matrices: we correlated each subject's model-predicted time series with the average actual time series of the remaining N – 1 subjects across all pairs of parcels.
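A banded ridge fit with per-band regularization can be sketched with numpy via a standard identity: scaling each feature band by 1/√α and solving ordinary ridge with α = 1 is equivalent to penalizing each band separately (in practice a package such as himalaya would tune the per-band α values; the helper names and shapes below are illustrative):

```python
import numpy as np

def banded_ridge_fit(bands, y, alphas):
    """Joint ridge regression with one penalty per feature band.

    bands: list of (T, F_b) design matrices (e.g., acoustic, speech,
    language embeddings); y: (T, P) BOLD matrix; alphas: per-band
    regularization. Returns per-band weight matrices so that each
    band's prediction can be generated separately at test time.
    """
    # Rescale bands, solve a single ridge problem with alpha = 1
    X = np.hstack([Xb / np.sqrt(a) for Xb, a in zip(bands, alphas)])
    w = np.linalg.solve(X.T @ X + np.eye(X.shape[1]), X.T @ y)
    # Split the joint weights back into bands, undoing the scaling
    out, i = [], 0
    for Xb, a in zip(bands, alphas):
        out.append(w[i:i + Xb.shape[1]] / np.sqrt(a))
        i += Xb.shape[1]
    return out  # list of (F_b, P) weight matrices

def band_prediction(Xb_test, wb):
    """Model-predicted BOLD for a single feature band on a test story."""
    return Xb_test @ wb  # (T_test, P)

```

Generating a separate prediction per band is what allows the subsequent mISFC analysis to ask which feature space accounts for which component of network connectivity.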
Each of the three mISFC matrices was compared to the original data-driven ISFC matrices. The original ISFC matrices serve as an index of the reliable, stimulus-driven connectivity between cortical areas, while the mISFC matrices allow us to test which features drive this connectivity. To aid interpretability, we grouped parcels into auditory, language, default, and somatomotor networks (6).
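The mISFC computation and the comparison to data-driven ISFC within a network-by-network block can be sketched as follows (a simplified illustration; the `misfc` and `variance_explained` helper names are ours, and the variance-explained summary is computed here as the squared Pearson correlation between corresponding matrix entries):

```python
import numpy as np

def misfc(pred, actual):
    """Model-based ISFC: correlate each subject's model-predicted time
    series with the average actual time series of the remaining N - 1
    subjects, across all pairs of parcels.

    pred, actual: arrays of shape (n_subjects, n_timepoints, n_parcels).
    """
    n, n_tr, _ = actual.shape
    z = lambda x: (x - x.mean(0)) / x.std(0)
    mats = []
    for s in range(n):
        others = actual[np.arange(n) != s].mean(axis=0)
        mats.append(z(pred[s]).T @ z(others) / n_tr)
    m = np.mean(mats, axis=0)
    return (m + m.T) / 2

def variance_explained(model_mat, data_mat, rows, cols):
    """Fraction of stimulus-driven (ISFC) variance captured by one
    feature space within a network-by-network block: squared correlation
    between the corresponding mISFC and ISFC entries."""
    a = model_mat[np.ix_(rows, cols)].ravel()
    b = data_mat[np.ix_(rows, cols)].ravel()
    return np.corrcoef(a, b)[0, 1] ** 2

```

Computing this block-wise (e.g., auditory-to-language, language-to-default) is what yields the per-network variance-explained figures reported in the Results.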

Results:
The parcel-wise encoding models yielded strong predictions of brain activity across stories in auditory, language, and default areas. Encoding performance for the acoustic embeddings was localized to early auditory cortex, while the speech embeddings captured superior temporal gyrus more broadly. The language embeddings predicted activity across a widespread network of lateral temporal and inferior frontal language areas, as well as default areas.
In terms of network structure, we found that acoustic and speech embeddings best captured connectivity within early auditory areas (explaining ~60% of ISFC variance), whereas language embeddings best captured connectivity across areas within the language network (~58% of ISFC) and within the default network (~40% of ISFC). Interestingly, all three feature bands predicted components of the connectivity between auditory and language areas (~44% of ISFC). Conversely, connectivity between language and default areas was best predicted by the language embeddings (~58% of ISFC) (Fig. 2).

Conclusions:
These findings demonstrate that encoding models provide a powerful framework for interpreting network connectivity in terms of explicit linguistic features during narrative comprehension. Acoustic, speech, and language features recapitulate connectivity within networks, and reveal that different networks are aligned along shared subsets of features. We speculate that different language areas may "agree" on a common subset of shared linguistic features in order to harmonize their individual contributions to language comprehension (10).
Language:
Language Comprehension and Semantics 1
Speech Perception
Modeling and Analysis Methods:
Connectivity (e.g., functional, effective, structural)
fMRI Connectivity and Network Modeling 2
Methods Development
Keywords:
Cortex
FUNCTIONAL MRI
Language
Modeling
Other - Naturalistic, Encoding
1|2Indicates the priority used for review
By submitting your proposal, you grant permission for the Organization for Human Brain Mapping (OHBM) to distribute your work in any format, including video, audio, print and electronic text through OHBM OnDemand, social media channels, the OHBM website, or other electronic publications and media.
I accept
The Open Science Special Interest Group (OSSIG) is introducing a reproducibility challenge for OHBM 2025. This new initiative aims to enhance the reproducibility of scientific results and foster collaborations between labs. Teams will consist of a "source" party and a "reproducing" party, and will be evaluated on the success of their replication, the openness of the source work, and additional deliverables.
Propose your OHBM abstract(s) as source work for future OHBM meetings by selecting one of the following options:
I do not want to participate in the reproducibility challenge.
Please indicate below if your study was a "resting state" or "task-activation" study.
Task-activation
Healthy subjects only or patients (note that patient studies may also involve healthy subjects):
Healthy subjects
Was this research conducted in the United States?
Yes
Are you Internal Review Board (IRB) certified?
Please note: Failure to have IRB approval, if applicable, will lead to automatic rejection of the abstract.
Not applicable
Was any human subjects research approved by the relevant Institutional Review Board or ethics panel?
NOTE: Any human subjects studies without IRB approval will be automatically rejected.
Yes
Was any animal research approved by the relevant IACUC or other animal research panel?
NOTE: Any animal studies without IACUC approval will be automatically rejected.
Not applicable
Please indicate which methods were used in your research:
Functional MRI
Computational modeling
For human MRI, what field strength scanner do you use?
3.0T
Which processing packages did you use for your study?
Other, Please list
-
fMRIPrep
Provide references using APA citation style.
1. Fedorenko, E., Ivanova, A. A., & Regev, T. I. (2024). The language network as a natural kind within the broader landscape of the human brain. Nature Reviews Neuroscience, 25, 289–312.
2. Simony, E., Honey, C. J., Chen, J., Lositsky, O., Yeshurun, Y., Wiesel, A., & Hasson, U. (2016). Dynamic reconfiguration of the default mode network during narrative comprehension. Nature Communications, 7, 12141.
3. Meschke, E. X., Visconti di Oleggio Castello, M., Dupré la Tour, T., & Gallant, J. L. (2023). Model connectivity: Leveraging the power of encoding models to overcome the limitations of functional connectivity. bioRxiv.
4. Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., & Sutskever, I. (2023). Robust speech recognition via large-scale weak supervision. In Proceedings of the 40th International Conference on Machine Learning, PMLR, 202 (pp. 28492–28518). https://proceedings.mlr.press/v202/radford23a.html
5. Zada, Z., Goldstein, A., Michelmann, S., Simony, E., Price, A., ... Nastase, S. A., & Hasson, U. (2024). A shared model-based linguistic space for transmitting our thoughts from brain to brain in natural conversations. Neuron, 112(18), 3211–3222.
6. Nastase, S. A., Liu, Y. F., Hillman, H., Zadbood, A., Hasenfratz, L., Keshavarzian, N., ... & Hasson, U. (2021). The "Narratives" fMRI dataset for evaluating models of naturalistic language comprehension. Scientific Data, 8, 250.
7. Kong, R., Yang, Q., Gordon, E., Xue, A., Yan, X., Orban, C., ... & Yeo, B. T. (2021). Individual-specific areal-level parcellations improve functional connectivity prediction of behavior. Cerebral Cortex, 31(10), 4477–4500.
8. Goldstein, A., Wang, H., Niekerken, L., Zada, Z., Aubrey, B., Sheffer, T., ... & Singh, A. (2023). Deep speech-to-text models capture the neural basis of spontaneous speech in everyday conversations. bioRxiv.
9. Dupré la Tour, T., Eickenberg, M., Nunez-Elizalde, A. O., & Gallant, J. L. (2022). Feature-space selection with banded ridge regression. NeuroImage, 264, 119728.
10. Elhage, N., Nanda, N., Olsson, C., Henighan, T., Joseph, N., Mann, B., ... & Olah, C. (2021). A mathematical framework for transformer circuits. Transformer Circuits Thread.
No