Towards generative AI-based fMRI paradigms: reinforcement learning via real-time brain feedback

Presented During:

Thursday, June 27, 2024: 11:30 AM - 12:45 PM
COEX  
Room: Grand Ballroom 104-105  

Poster No:

2053 

Submission Type:

Abstract Submission 

Authors:

Giuseppe Gallitto1, Robert Englert2, Balint Kincses1, Raviteja Kotikalapudi1, Jialin Li1, Kevin Hoffschlag1, Ulrike Bingel1, Tamas Spisak2

Institutions:

1Department of Neurology, University Medicine Essen, Germany, Essen, NRW, 2Department of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, NRW

First Author:

Giuseppe Gallitto  
Department of Neurology, University Medicine Essen, Germany
Essen, NRW

Co-Author(s):

Robert Englert  
Department of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen
Essen, NRW
Balint Kincses  
Department of Neurology, University Medicine Essen, Germany
Essen, NRW
Raviteja Kotikalapudi  
Department of Neurology, University Medicine Essen, Germany
Essen, NRW
Jialin Li  
Department of Neurology, University Medicine Essen, Germany
Essen, NRW
Kevin Hoffschlag  
Department of Neurology, University Medicine Essen, Germany
Essen, NRW
Ulrike Bingel  
Department of Neurology, University Medicine Essen, Germany
Essen, NRW
Tamas Spisak  
Department of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen
Essen, NRW

Introduction:

In traditional human neuroimaging experiments, researchers design experimental paradigms with psychological/behavioral validity and infer the corresponding neural correlates. Here, we introduce a novel approach, Reinforcement Learning via Brain Feedback (RLBF), which inverts the direction of inference: it seeks the optimal stimulation or paradigm to maximize (or minimize) the response in predefined brain regions or networks (Fig. 1). The stimulation/paradigm is found via a reinforcement learning algorithm (Kaelbling et al., 1996) rewarded based on real-time fMRI data (Sulzer et al., 2013). Specifically, the reinforcement learning agent manipulates the paradigm space (e.g., via generative AI) to drive neural activity in a specific direction. Rewarded by the measured brain responses, the agent gradually learns to adjust its choices and converges towards an optimal solution. Here, we present the results of a proof-of-concept study that aimed to confirm the viability of the proposed approach with simulated and empirical real-time fMRI data.
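
To make the closed loop concrete, the sketch below caricatures RLBF as a simple bandit over three hypothetical candidate paradigms, with a simulated ROI response standing in for real-time fMRI feedback. All names and numbers are illustrative stand-ins; the actual study uses a soft Q-learner over a continuous paradigm space (see Methods).

```python
# Conceptual sketch of the RLBF loop: the agent proposes a paradigm, the
# "brain" returns a (here simulated) target-ROI response, and that response
# acts as the reward that shapes the agent's next choice.
import random

paradigms = ["A", "B", "C"]                       # candidate stimulation paradigms
true_response = {"A": 0.2, "B": 0.8, "C": 0.5}    # unknown to the agent

def simulated_roi_response(p):                    # stand-in for real-time fMRI feedback
    return true_response[p] + random.gauss(0, 0.1)

value = {p: 0.0 for p in paradigms}               # agent's value estimates
for trial in range(50):
    # epsilon-greedy choice: mostly exploit the best-so-far paradigm
    p = random.choice(paradigms) if random.random() < 0.2 else max(value, key=value.get)
    reward = simulated_roi_response(p)            # brain response acts as reward
    value[p] += 0.1 * (reward - value[p])         # incremental value update
print(max(value, key=value.get))                  # converges towards paradigm "B"
```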

Methods:

We built a streamlined setup: a soft Q-learner (Haarnoja et al., 2017) with a smooth reward function (Fig. 1, "Reinforcement Learning") and simple visual stimulation implementing the paradigm space (Fig. 1, "Paradigm Generator"). We presented participants with various versions of a flickering checkerboard, with contrast and flicker frequency as the free parameters of the paradigm space; a contrast of zero corresponds to no stimulation. The reward signal is calculated from responses in the primary visual cortex (Fig. 2b), estimated with a linear model fitted to a single block of fMRI data (5 s stimulation, 11 s rest), with the hypothesis function convolved with a conventional double-gamma HRF. The agent's task is to determine the contrast-frequency configuration that maximizes the participant's brain activity in the primary visual cortex.
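
The following minimal sketch illustrates this block-wise reward computation under stated assumptions: TR = 1 s, a pre-extracted mean V1 ROI time series, and SPM-style double-gamma HRF constants. The function names (double_gamma_hrf, block_reward) and the exact HRF parameters are illustrative, not the study's actual implementation.

```python
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(t):
    """Canonical double-gamma HRF sampled at times t (seconds)."""
    peak = gamma.pdf(t, 6)            # positive response peaking ~5-6 s
    undershoot = gamma.pdf(t, 16)     # late undershoot
    return peak - undershoot / 6.0

def block_reward(roi_signal, tr=1.0, stim_dur=5):
    """Fit a single-regressor GLM to one block and return the stimulus beta."""
    n = len(roi_signal)
    boxcar = np.zeros(n)
    boxcar[:int(stim_dur / tr)] = 1.0                 # 5 s on, 11 s off
    hrf = double_gamma_hrf(np.arange(0, 32, tr))
    regressor = np.convolve(boxcar, hrf)[:n]          # expected BOLD shape
    X = np.column_stack([regressor, np.ones(n)])      # regressor + intercept
    beta, *_ = np.linalg.lstsq(X, roi_signal, rcond=None)
    return beta[0]                                    # reward = stimulus beta

# Toy usage: a synthetic, noisy V1 time series for one 16 s block (TR = 1 s)
rng = np.random.default_rng(0)
boxcar = (np.arange(16) < 5).astype(float)
toy_signal = 2.0 * np.convolve(boxcar, double_gamma_hrf(np.arange(0, 32)))[:16]
toy_signal += rng.normal(0, 0.5, size=16)
print("reward (stimulus beta):", block_reward(toy_signal))
```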
First, we test our method through simulations with realistic effect size estimates. The optimal ground truth is defined as a linear function of contrast and frequency, with maximum activation at maximum contrast and 7Hz frequency. The agent has 100 trials in which it selects a contrast and frequency value and updates its Q-table using the reward calculated by our ground truth equation, with Gaussian noise added. We fine-tune the hyperparameters for the models using realistic initial conditions (signal-to-noise: 0.5 - 3.0; q-table smoothness: 0.5 - 4.0; soft-Q temperature: 0.2; learning rate: 0.05 - 0.9). Then, with parameters chosen on the simulation results, we measure data for n=5 participants. In the scanner, we presented the checkerboard in 45 blocks with a TR of 1 second (10 minutes) and allowed the reinforcement learner to optimize the visual stimulation based on brain feedback.
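
As an illustration of the simulation loop, the sketch below assumes a discretized contrast x frequency grid, a Boltzmann (soft-Q) policy, a Gaussian-smoothed Q-table update (standing in for the "Q-table smoothness" hyperparameter), and a noisy ground truth peaking at maximum contrast and 7 Hz. The grid resolution, ground-truth scaling, and exact update rule are illustrative choices, not the authors' exact implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
contrasts = np.linspace(0, 1, 10)            # stimulus contrast
freqs = np.linspace(1, 14, 14)               # flicker frequency (Hz)

def ground_truth(c, f, snr=1.0):
    """Noisy 'brain response': increases with contrast, peaks at 7 Hz."""
    signal = c * (1.0 - abs(f - 7.0) / 7.0)
    return signal + rng.normal(0, 1.0 / snr)

Q = np.zeros((len(contrasts), len(freqs)))
alpha, temperature, smooth = 0.3, 0.2, 1.5   # learning rate, soft-Q temperature, Q-table smoothness

for trial in range(100):
    # Boltzmann (softmax) action selection over the whole contrast x frequency grid
    p = np.exp(Q / temperature).ravel()
    p /= p.sum()
    i, j = np.unravel_index(rng.choice(Q.size, p=p), Q.shape)

    reward = ground_truth(contrasts[i], freqs[j], snr=1.0)

    # Spatially smoothed Q-update: neighbouring stimuli share credit
    delta = np.zeros_like(Q)
    delta[i, j] = reward - Q[i, j]
    Q += alpha * gaussian_filter(delta, sigma=smooth)

best = np.unravel_index(Q.argmax(), Q.shape)
print("best contrast:", contrasts[best[0]], "best frequency (Hz):", freqs[best[1]])
```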

Results:

Simulation results show that the proposed implementation provides a robust solution across a relatively wide range of initial conditions and within a small number of trials. Strong Q-table smoothing appears to work well at higher SNRs, while lower SNRs seem to require lower learning rates for optimal training (Fig. 2a). The models display remarkable stability across a wide range of learning rates. Results from the empirical measurements (Fig. 2b) are in line with prior knowledge about the contrast and frequency dependence of the checkerboard response (Victor et al., 1997) and provide initial confirmation of the feasibility of the proposed approach.

Conclusions:

Here we presented a proof of concept for RLBF, a novel experimental approach that aims to find the optimal stimulation paradigm to modulate individual brain activity in predefined regions or networks. While this proof-of-concept study employed a simplified setup, future work will extend the approach to paradigm spaces constructed by generative AI. By inverting the direction of inference ("brain -> behavior" instead of "behavior -> brain"), the proposed approach may emerge as a novel tool for basic and translational research.

Modeling and Analysis Methods:

Activation (e.g., BOLD task-fMRI) 2
Methods Development

Motor Behavior:

Brain Machine Interface 1

Perception, Attention and Motor Behavior:

Perception: Visual

Keywords:

Data analysis
Machine Learning
MRI
Other - Brain Machine Interface

1|2 indicates the priority used for review
Supporting Image: Fig1.png
Supporting Image: Fig2.png
 

References:

Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285.

Sulzer, J., Haller, S., Scharnowski, F., Weiskopf, N., Birbaumer, N., Blefari, M. L., ... & Sitaram, R. (2013). Real-time fMRI neurofeedback: Progress and challenges. NeuroImage, 76, 386-399.

Haarnoja, T., Tang, H., Abbeel, P., & Levine, S. (2017). Reinforcement learning with deep energy-based policies. In International Conference on Machine Learning (pp. 1352-1361). PMLR.

Victor, J. D., Conte, M. M., & Purpura, K. P. (1997). Dynamic shifts of the contrast-response function. Visual Neuroscience, 14(3), 577-587.