fMRI-based Encoding for Self-supervised Deep Predictive Coding in the Human Brain

Poster No:

2544 

Submission Type:

Abstract Submission 

Authors:

Jungmin Lee1,2,3, Sungwoo Lee1,2,3, Sunghyoung Hong1,2, Choong-Wan Woo1,2,3, Seok Jun Hong1,2,3,4

Institutions:

1Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Korea, Republic of, 2Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Korea, Republic of, 3Life-inspired Neural networks for Prediction and Optimization (LNPO) Group, Suwon, Korea, Republic of, 4Center for the Developing Brain, Child Mind Institute, New York, NY, United States

First Author:

Jungmin Lee  
Center for Neuroscience Imaging Research, Institute for Basic Science|Department of Biomedical Engineering, Sungkyunkwan University|Life-inspired Neural networks for Prediction and Optimization (LNPO) Group
Suwon, Korea, Republic of|Suwon, Korea, Republic of|Suwon, Korea, Republic of

Co-Author(s):

Sungwoo Lee  
Center for Neuroscience Imaging Research, Institute for Basic Science|Department of Biomedical Engineering, Sungkyunkwan University|Life-inspired Neural networks for Prediction and Optimization (LNPO) Group
Suwon, Korea, Republic of|Suwon, Korea, Republic of|Suwon, Korea, Republic of
Sunghyoung Hong  
Center for Neuroscience Imaging Research, Institute for Basic Science|Department of Biomedical Engineering, Sungkyunkwan University
Suwon, Korea, Republic of|Suwon, Korea, Republic of
Choong-Wan Woo  
Center for Neuroscience Imaging Research, Institute for Basic Science|Department of Biomedical Engineering, Sungkyunkwan University|Life-inspired Neural networks for Prediction and Optimization (LNPO) Group
Suwon, Korea, Republic of|Suwon, Korea, Republic of|Suwon, Korea, Republic of
Seok Jun Hong  
Center for Neuroscience Imaging Research, Institute for Basic Science|Department of Biomedical Engineering, Sungkyunkwan University|Life-inspired Neural networks for Prediction and Optimization (LNPO) Group|Center for the Developing Brain, Child Mind Institute, New York
Suwon, Korea, Republic of|Suwon, Korea, Republic of|Suwon, Korea, Republic of|NY, United States

Introduction:

The ability to anticipate future outcomes is central to human cognition. Predictive coding theory[3] offers a parsimonious account of this ability, conceptualizing the brain as a self-supervised hierarchical generative model that minimizes prediction errors by reconciling top-down information with bottom-up sensory input. Despite the theory's influence, neuroscientific evidence supporting it remains scarce, especially at the whole-brain level. Here, we employed PredNet[2], a predictive coding-inspired deep learning model, and correlated its temporal activations with movie-watching fMRI data to explore how prediction and prediction error are represented across the human brain.

Methods:

The model was trained on a 3-hour movie, 'Titanic', and subsequently tested on a 2-hour movie, 'Forrest Gump'. fMRI data for the test movie were obtained from StudyForrest[1], an open-source repository providing audio-visual movie-watching fMRI data from 15 subjects. For the next-frame video prediction task, we employed PredNet[2], a convLSTM model with four hierarchical layers, each containing four computational units: R (representation), Ahat (prediction), A (input), and E (error) (Fig1A). Features from the prediction and prediction error units were extracted for voxel-wise encoding. Principal component analysis (PCA) was applied to this feature matrix, retaining the components that preserved 90% of its variance, followed by hemodynamic response convolution of 4 seconds and down-sampling to the fMRI sampling rate. The resulting feature matrix was used to train an encoding model via ridge regression, evaluated with 9-fold cross-validation. During training, we estimated regression coefficients (β), which were then used to predict BOLD signals in the held-out data. Finally, we correlated the predicted signals with the measured fMRI data to assess encoding performance.
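The encoding steps described above can be sketched roughly as follows. This is a minimal NumPy illustration on synthetic data, not the authors' code. Several details are assumptions made for the example: the "4 seconds" of the hemodynamic convolution is read as a gamma-shaped kernel peaking at 4 s, the frame interval (`frame_dt`), the repetition time (`tr`), the ridge penalty (`alpha`), the use of contiguous folds, and all function names are hypothetical.

```python
import numpy as np

def pca_reduce(X, var_keep=0.90):
    """Project X (time x features) onto the fewest PCs reaching var_keep variance."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    cum = np.cumsum(S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(cum, var_keep)) + 1
    return Xc @ Vt[:k].T

def hrf_kernel(dt, peak=4.0, length=20.0):
    """Gamma-shaped kernel peaking at `peak` seconds (assumed HRF shape)."""
    t = np.arange(0.0, length, dt)
    shape = 5.0
    scale = peak / (shape - 1.0)  # mode of a gamma density is (shape - 1) * scale
    h = t ** (shape - 1.0) * np.exp(-t / scale)
    return h / h.sum()

def convolve_and_downsample(Z, frame_dt, tr):
    """Convolve each feature column with the kernel, then sample once per TR."""
    h = hrf_kernel(frame_dt)
    conv = np.apply_along_axis(lambda c: np.convolve(c, h)[: len(c)], 0, Z)
    step = int(round(tr / frame_dt))
    return conv[::step]

def ridge_fit(X, Y, alpha):
    """Closed-form ridge solution: beta = (X'X + alpha I)^-1 X'Y."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ Y)

def columnwise_pearson(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.sqrt((A ** 2).sum(axis=0) * (B ** 2).sum(axis=0)) + 1e-12
    return (A * B).sum(axis=0) / denom

def cv_encoding(X, Y, n_folds=9, alpha=1.0):
    """Mean held-out correlation per voxel over contiguous folds."""
    n = X.shape[0]
    folds = np.array_split(np.arange(n), n_folds)
    r = np.zeros((n_folds, Y.shape[1]))
    for i, test in enumerate(folds):
        train = np.setdiff1d(np.arange(n), test)
        mu, sd = X[train].mean(axis=0), X[train].std(axis=0) + 1e-8
        mu_y = Y[train].mean(axis=0)
        beta = ridge_fit((X[train] - mu) / sd, Y[train] - mu_y, alpha)
        pred = ((X[test] - mu) / sd) @ beta + mu_y
        r[i] = columnwise_pearson(pred, Y[test])
    return r.mean(axis=0)

# Synthetic stand-ins for PredNet features and voxel time series.
rng = np.random.default_rng(0)
frame_dt, tr = 0.5, 2.0  # assumed sampling intervals in seconds
latents = rng.standard_normal((1800, 5))
features = latents @ rng.standard_normal((5, 40)) + 0.1 * rng.standard_normal((1800, 40))
X = convolve_and_downsample(pca_reduce(features), frame_dt, tr)
bold = X @ rng.standard_normal((X.shape[1], 30)) + rng.standard_normal((X.shape[0], 30))
scores = cv_encoding(X, bold)
print(scores.shape, round(float(scores.mean()), 2))
```

In this sketch `scores` holds one cross-validated correlation per voxel, which plays the role of the encoding performance values reported in the abstract. Contiguous folds are used here because neighbouring fMRI volumes are temporally correlated; the abstract does not state how its folds were formed.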

Results:

We identified distinct patterns of PredNet-brain correspondence for prediction and prediction error across the whole brain (Fig1B,D). Specifically, prediction correlated more strongly in the early visual areas (i.e., V1-4), with average encoding performance values of 0.05, 0.13, and 0.19 following the PredNet hierarchy (Fig1B). In comparison, prediction error showed stronger performance across widespread brain areas, present even at the lowest PredNet layer, and increased layer by layer, with average encoding accuracy values of 0.26, 0.33, and 0.40 (Fig1D). This suggests that prediction error is computed at the whole-brain level, emerging as soon as sensory information enters processing and intensifying from lower visual to higher dorsal attention networks. Subsequently, we focused on 8 regions of interest, including V1-4, MT, and FFC, known for their relevance to visual processing, to assess how prediction and prediction error each mapped onto the order of PredNet layers (Fig1C,E). For further analysis, we also mapped β to identify the most explanatory features of prediction and prediction error within each layer, and examined their dominance ratio (Fig2A,C). Notably, prediction error dominated at layer 1, whereas by layer 2 dominance had shifted towards prediction. At the highest layer, prediction and prediction error displayed a comparatively balanced dominance ratio (Fig2E).
Supporting Image: figure1.jpg
Supporting Image: figure2.jpg
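The abstract does not spell out how the dominance ratio in the Results was computed. One plausible reading, shown here purely as an illustration, is the share of absolute β weight per voxel that falls on prediction error features rather than prediction features. The function name `dominance_ratio`, the boolean mask `is_error`, and the choice of absolute weights are assumptions, not the authors' stated procedure.

```python
import numpy as np

def dominance_ratio(beta, is_error):
    """Fraction of absolute weight carried by error features, per voxel.

    beta     : (n_features, n_voxels) regression coefficients
    is_error : boolean mask over features (True = prediction error feature)
    Values above 0.5 mean error features dominate; below 0.5, prediction features.
    """
    w = np.abs(beta)
    is_error = np.asarray(is_error, dtype=bool)
    err = w[is_error].sum(axis=0)
    pred = w[~is_error].sum(axis=0)
    return err / (err + pred + 1e-12)

# Toy example: three features (one error, two prediction) and two voxels.
beta = np.array([[1.0, 0.0],
                 [-1.0, 2.0],
                 [2.0, 2.0]])
ratio = dominance_ratio(beta, [True, False, False])
print(ratio)
```

If the coefficients were estimated on PCA components, as in the Methods, they would first need to be projected back to the original feature space before a mask like `is_error` could be applied.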
 

Conclusions:

The study revealed distinct signal pathways for prediction and prediction error across the macroscale network, indicating a dynamic shift in processing. By providing a proof of concept for the biological plausibility of predictive coding, our study opens an important avenue for understanding the intricate dynamics of macroscale information processing in the human brain.

Modeling and Analysis Methods:

Activation (e.g. BOLD task-fMRI) 2

Perception, Attention and Motor Behavior:

Perception: Visual 1

Keywords:

Cognition
Computational Neuroscience
Modeling
Perception

1|2 Indicates the priority used for review

Provide references using author date format

1. Hanke, M., Adelhöfer, N., Kottke, D., Iacovella, V., Sengupta, A., Kaule, F. R., ... & Stadler, J. (2016). A studyforrest extension, simultaneous fMRI and eye gaze recordings during prolonged natural stimulation. Scientific Data, 3(1), 1-15.
2. Lotter, W., Kreiman, G., & Cox, D. (2016). Deep predictive coding networks for video prediction and unsupervised learning. arXiv preprint arXiv:1605.08104.
3. Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79–87.