Generating Stimuli for Simultaneously Mapping Retinotopic and Category-Selective Brain Regions.

Presented During:

Wednesday, June 26, 2024: 11:30 AM - 12:45 PM
COEX  
Room: Grand Ballroom 104-105  

Poster No:

2549 

Submission Type:

Abstract Submission 

Authors:

Insub Kim1, Zhenzhen Weng2, Kalanit Grill-Spector1,3

Institutions:

1Psychology, Stanford University, Stanford, CA, 2Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, 3Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA

First Author:

Insub Kim  
Psychology, Stanford University
Stanford, CA

Co-Author(s):

Zhenzhen Weng  
Institute for Computational and Mathematical Engineering, Stanford University
Stanford, CA
Kalanit Grill-Spector  
Psychology, Stanford University and Wu Tsai Neurosciences Institute, Stanford University
Stanford, CA

Introduction:

Mapping retinotopic and category-selective regions in individual human brains using fMRI has become a routine task across many labs. Because retinotopic and category-selective regions are selective to different properties of visual stimuli, two distinct experiments are typically conducted to map them. For example, traveling-wave stimuli with bars (Dumoulin and Wandell, 2008; Benson et al., 2018; Finzi et al., 2021; Kim et al., 2023) or with wedges and rings (Engel et al., 1997; Benson et al., 2018) are used to map population receptive fields (pRFs; Dumoulin and Wandell, 2008), identify visual field maps, and delineate borders of retinotopic visual regions (V1, V2, V3, hV4, VO, LO, TO, V3ab, and IPS). In contrast, functional localizer experiments (Kanwisher et al., 1997; Stigliani et al., 2015) using images of various categories, such as faces, bodies, scenes, and objects, are used to define category-selective regions (mFus-faces, pFus-faces, mOTS-words, pOTS-words, OTS-bodies, CoS-places). Here, we developed a method that generates optimal stimuli for simultaneously mapping retinotopic and category-selective regions in a single fMRI experiment.

Methods:

Using a large generative model (Stable Diffusion; Rombach et al., 2022) trained on large-scale datasets with classifier-free guidance (Nichol et al., 2021), we iteratively generated a sequence of naturalistic stimuli that contain visual features from different categories (Fig. 1) while maximizing the ability to estimate pRF parameters. To ensure successful mapping of category-selective regions, the sequence of generated images followed a block design in which 12 different images of the same category were presented for 12 seconds (1 second per image), and each block was repeated four times in a randomized order. To also guarantee mapping of retinotopic regions, images within each category block were generated to maximize the estimation of pRF parameters. This was achieved by assessing the ability of the stimuli to generate distinct fMRI time courses across 100 simulated pRFs that tiled a 12° visual field with varying pRF locations, sizes, and temporal parameters (Fig. 2A).
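The pRF simulation described above can be sketched as follows. This is a minimal illustration, assuming the standard Gaussian pRF model of Dumoulin and Wandell (2008): each simulated pRF is a 2D Gaussian over the visual field, and its predicted response to a candidate image sequence is the frame-by-frame overlap between the image contrast map and the pRF. The grid resolution, pRF layout, size-eccentricity scaling, and placeholder stimulus frames below are all illustrative assumptions; the actual method also varies temporal pRF parameters and (presumably) convolves predictions with a hemodynamic response function, both omitted here.

```python
import numpy as np

def gaussian_prf(x0, y0, sigma, xx, yy):
    """2D Gaussian pRF on a visual-field grid (degrees), normalized to unit sum."""
    rf = np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2 * sigma ** 2))
    return rf / rf.sum()

def simulate_timecourses(stimulus, prfs, xx, yy):
    """Predict each pRF's response to a stimulus sequence.

    stimulus: (T, H, W) array of per-frame contrast/aperture maps.
    prfs: list of (x0, y0, sigma) tuples.
    Returns an (n_prfs, T) array of linear predictions (no HRF convolution).
    """
    T = stimulus.shape[0]
    flat = stimulus.reshape(T, -1)
    out = np.zeros((len(prfs), T))
    for i, (x0, y0, s) in enumerate(prfs):
        out[i] = flat @ gaussian_prf(x0, y0, s, xx, yy).ravel()
    return out

# Tile a 12-degree visual field with 100 simulated pRFs, as in the abstract;
# sizes grow with eccentricity (an assumed, typical scaling).
xs = np.linspace(-6, 6, 64)
xx, yy = np.meshgrid(xs, xs)
prfs = [(x, y, 1.0 + 0.2 * np.hypot(x, y))
        for x in np.linspace(-5, 5, 10) for y in np.linspace(-5, 5, 10)]

# Placeholder stimulus: 240 random binary contrast frames (one per second),
# standing in for the generated category images.
rng = np.random.default_rng(0)
stim = (rng.random((240, 64, 64)) > 0.5).astype(float)
tc = simulate_timecourses(stim, prfs, xx, yy)
print(tc.shape)  # (100, 240)
```

In this framing, "maximizing the ability to estimate pRF parameters" amounts to selecting image sequences whose predicted time courses `tc` are maximally distinct across the 100 simulated pRFs.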
Supporting Image: fig1.jpg
 

Results:

We generated an optimal set of stimuli that were naturalistic and contained visual features associated with five categories (faces, bodies, objects, places, and words). Despite the naturalistic and categorical nature of these generated images, local contrast varied among them. This variability across images allowed us to test the feasibility of estimating pRF parameters. Indeed, we found that in a simulated 240-second experiment with additive noise at a level similar to that of typical fMRI experiments, the estimated pRF parameters were similar to the ground-truth parameters, with better estimation of central pRFs than of pRFs close to the edges of the images (Fig. 2B). The median absolute percentage errors for pRF locations (x and y) and size (σ) were 26.4%, 29.73%, and 19.9%, respectively.
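The error metric reported above can be computed as below. This is a standard definition of median absolute percentage error; the study's exact formulation (e.g., how near-zero ground-truth locations are normalized) is not specified in the abstract, and the example values are hypothetical, not the study's data.

```python
import numpy as np

def median_ape(estimated, ground_truth):
    """Median absolute percentage error between estimated and ground-truth
    parameter values (e.g., pRF x, y, or sigma across simulated voxels)."""
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    return float(np.median(np.abs(est - gt) / np.abs(gt)) * 100)

# Hypothetical ground-truth and estimated values for illustration only
gt = np.array([1.0, 2.0, 3.0, 4.0])
est = np.array([1.1, 1.8, 3.3, 4.0])
print(median_ape(est, gt))  # 10.0
```

The median (rather than the mean) makes the summary robust to the large errors expected for pRFs near the image edges.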
Supporting Image: fig2.jpg
 

Conclusions:

We demonstrate the feasibility of an approach that systematically generates optimized images for an fMRI experiment from natural images, such that a single experiment can be used to define category selectivity while maximizing the estimation of pRF parameters for each voxel. This approach can be used to optimize any fMRI experimental sequence based on any image-computable model. It opens up new possibilities for designing optimal experiments that can: 1) reduce experimental time and resources, especially benefiting populations that are difficult to scan, such as infants, children, and patients; 2) specifically target different subsets of brain regions; and 3) validate encoding models of the brain.

Modeling and Analysis Methods:

Methods Development 2

Perception, Attention and Motor Behavior:

Perception: Visual 1

Keywords:

Experimental Design
Perception
Vision
Other - pRF; FFA; PPA; fMRI

1|2 indicates the priority used for review

References:

Benson, N. C., Jamison, K. W., Arcaro, M. J., Vu, A. T., Glasser, M. F., Coalson, T. S., ... & Kay, K. (2018). The Human Connectome Project 7 Tesla retinotopy dataset: Description and population receptive field analysis. Journal of Vision, 18(13), 23-23.
Dumoulin, S. O., & Wandell, B. A. (2008). Population receptive field estimates in human visual cortex. NeuroImage, 39(2), 647-660.
Engel, S. A., Glover, G. H., & Wandell, B. A. (1997). Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cerebral Cortex, 7(2), 181-192.
Finzi, D., Gomez, J., Nordt, M., Rezai, A. A., Poltoratski, S., & Grill-Spector, K. (2021). Differential spatial computations in ventral and lateral face-selective regions are scaffolded by structural connections. Nature Communications, 12(1), 2278.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: a module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17(11), 4302-4311.
Kim, I., Kupers, E. R., Lerma-Usabiaga, G., & Grill-Spector, K. (2023). Characterizing spatiotemporal population receptive fields in human visual cortex with fMRI. bioRxiv, 2023-05.
Nichol, A., Dhariwal, P., Ramesh, A., Shyam, P., Mishkin, P., McGrew, B., ... & Chen, M. (2021). GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models. arXiv preprint arXiv:2112.10741.
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10684-10695).
Stigliani, A., Weiner, K. S., & Grill-Spector, K. (2015). Temporal processing capacity in high-level visual cortex is domain specific. Journal of Neuroscience, 35(36), 12412-12424.