Imagery-based image retrieval in a closed-loop condition using electrocorticograms

Poster No:

770 

Submission Type:

Abstract Submission 

Authors:

Ryohei Fukuma1,2, Takufumi Yanagisawa1,2,3, Hidenori Sugano4, Kentaro Tamura5, Satoru Oshino1, Naoki Tani1, Yasushi Iimura4, Hui Ming Khoo1, Hiroharu Suzuki4, Huixiang Yang1, Takamitsu Iwata1, Madoka Nakajima4, Shinji Nishimoto1,6, Yukiyasu Kamitani2,7, Haruhiko Kishima1,3

Institutions:

1Osaka University, Suita, Japan, 2ATR Computational Neuroscience Laboratories, Seika-cho, Japan, 3Osaka University Hospital Epilepsy Center, Suita, Japan, 4Juntendo University, Tokyo, Japan, 5Nara Medical University, Kashihara, Japan, 6National Institute of Information and Communications Technology (NICT), Center for Information and Neural Networks (CiNet), Suita, Japan, 7Kyoto University, Kyoto, Japan

First Author:

Ryohei Fukuma  
Osaka University
Suita, Osaka

Co-Author(s):

Takufumi Yanagisawa  
Osaka University
Suita, Osaka
Hidenori Sugano  
Juntendo University
Bunkyo-ku, Tokyo
Kentaro Tamura  
Nara Medical University
Kashihara, Nara
Satoru Oshino  
Osaka University
Suita, Osaka
Naoki Tani  
Osaka University
Suita, Osaka
Yasushi Iimura  
Juntendo University
Bunkyo-ku, Tokyo
Hui Ming Khoo  
Osaka University
Suita, Osaka
Hiroharu Suzuki  
Juntendo University
Bunkyo-ku, Tokyo
Huixiang Yang  
Osaka University
Suita, Osaka
Takamitsu Iwata  
Osaka University
Suita, Osaka
Madoka Nakajima  
Juntendo University
Bunkyo-ku, Tokyo
Shinji Nishimoto  
Osaka University; National Institute of Information and Communications Technology (NICT), Center for Information and Neural Networks (CiNet)
Suita, Japan
Yukiyasu Kamitani  
Kyoto University
Kyoto, Kyoto
Haruhiko Kishima  
Osaka University
Suita, Osaka

Introduction:

Recent advances in brain-computer interface (BCI) technologies, particularly those decoding motor cortical activity, have restored communication to patients with severe paralysis, including amyotrophic lateral sclerosis (ALS) (Card et al., 2024; Metzger et al., 2023; Moses et al., 2021; Willett, Avansino, Hochberg, Henderson, & Shenoy, 2021). However, motor-related cortical activity deteriorates in ALS patients who progress to a completely locked-in state (CLIS), making such BCIs difficult to use (Chaudhary et al., 2022; Vansteensel et al., 2024). Neural decoding via the latent space of deep neural network models can infer perceived and imagined images that are novel to both the subject and the decoder (Horikawa & Kamitani, 2017). A BCI that combines such a latent space with activity of the visual cortex, which is known to remain stable in patients in CLIS, may therefore allow a subject to retrieve an intended image from a large dataset; however, such a BCI has not yet been realised.

Methods:

We developed a BCI that uses a linear decoder in a closed-loop condition to retrieve images of instructed categories from 2.3 million candidate images via imagery. During the task, high-γ power (80-150 Hz) of electrocorticograms (ECoGs) from the visual cortex was decoded to infer a latent vector (real-time vector), and the feedback image was selected as the one with the highest cosine similarity to this vector. The BCI was tested in an online task by subjects with drug-resistant epilepsy using one of two latent spaces: that of the contrastive language-image pretraining (CLIP) model (Radford et al., 2021) or that of the AlexNet model (Krizhevsky, Sutskever, & Hinton, 2012). Subjects were given one of two instructions (e.g., "animal" vs. "tool") and used imagery to make the feedback image display the instructed meaning. Controllability was assessed by the similarity of the real-time vector to the latent vectors corresponding to the target and nontarget instructions. In addition, to assess the effect of imagery on cortical activity during the perception of an image that differed from the imagined content, nine subjects performed another task (modulation task), in which they imagined an image of one category while watching an image of another category.
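
The following is a minimal sketch of the real-time decoding and retrieval step described above, assuming NumPy/SciPy and a ridge-style linear decoder; the function and variable names (high_gamma_power, decode_latent, image_latents, W, bias) are hypothetical, since the abstract does not specify the implementation:

```python
# Hypothetical sketch of the closed-loop decoding step (NumPy/SciPy assumed).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def high_gamma_power(ecog, fs, band=(80.0, 150.0)):
    """Mean high-gamma (80-150 Hz) power per channel.

    ecog: (n_channels, n_samples) array; fs: sampling rate in Hz.
    """
    b, a = butter(4, band, btype="bandpass", fs=fs)
    envelope = np.abs(hilbert(filtfilt(b, a, ecog, axis=-1), axis=-1))
    return (envelope ** 2).mean(axis=-1)  # average power per channel

def decode_latent(features, W, bias):
    """Linear decoder mapping high-gamma features to a latent (real-time)
    vector; W and bias would be fit on training data (e.g., by ridge
    regression, an assumption here)."""
    return features @ W + bias

def retrieve_feedback_image(real_time_vector, image_latents):
    """Index of the image whose latent vector has the highest cosine
    similarity to the decoded real-time vector."""
    v = real_time_vector / np.linalg.norm(real_time_vector)
    M = image_latents / np.linalg.norm(image_latents, axis=1, keepdims=True)
    return int(np.argmax(M @ v))
```

In such a system, image_latents would hold precomputed CLIP or AlexNet embeddings of the 2.3 million candidate images, so retrieval at each feedback update reduces to a single matrix-vector product.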

Results:

All three subjects who performed the online task with the CLIP latent space controlled the BCI significantly above chance in the two-choice setting, with a highest accuracy of 92.50%. In contrast, none of the three subjects using the AlexNet latent space could control the feedback images. Moreover, for cortical activity in the higher visual cortex during the modulation task, imagery moved the vector inferred via the CLIP latent space significantly closer to the vector corresponding to the imagined meaning, whereas the vector inferred via the AlexNet latent space moved significantly further away.
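
One plausible way to score the two-choice controllability reported here is sketched below: a trial counts as correct when the real-time vector is closer, by cosine similarity, to the target instruction's latent vector than to the nontarget's. The one-sided binomial test against the 50% chance level is an assumption, as the abstract does not name the statistical test used:

```python
# Hypothetical scoring of two-choice controllability (NumPy/SciPy assumed).
import numpy as np
from scipy.stats import binomtest

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def two_choice_score(real_time_vectors, target_vectors, nontarget_vectors):
    """Fraction of trials whose real-time vector is more similar to the
    target than to the nontarget latent vector, with a one-sided binomial
    test against the 50% chance level."""
    hits = [cosine(v, t) > cosine(v, n)
            for v, t, n in zip(real_time_vectors,
                               target_vectors, nontarget_vectors)]
    result = binomtest(sum(hits), n=len(hits), p=0.5, alternative="greater")
    return float(np.mean(hits)), result.pvalue
```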

Conclusions:

In the present study, we demonstrated that image retrieval based on CLIP vectors decoded from ECoGs in the visual cortex allows subjects to retrieve images representing an intended category, in a two-choice manner, from 2.3 million images under closed-loop conditions with greater than 90% accuracy. Interestingly, the controllability of the feedback images appeared to depend on the latent space used to annotate the images. Similar differences between the two latent spaces were observed in the imagery-induced changes of the inferred vectors in the modulation task, even though the same ECoGs were decoded. These results suggest that, during the online task, imagery, particularly its semantic component (Pylyshyn, 2003), altered neural activity in the higher visual cortex to steer the inferred vector in the CLIP latent space toward the instructed meaning.

Higher Cognitive Functions:

Imagery 1

Motor Behavior:

Brain Machine Interface 2

Perception, Attention and Motor Behavior:

Perception: Visual

Keywords:

ELECTROCORTICOGRAPHY
Machine Learning
Vision
Other - Imagery, Decoding, Brain-Machine Interface

1|2 indicates the priority used for review

Abstract Information

By submitting your proposal, you grant permission for the Organization for Human Brain Mapping (OHBM) to distribute your work in any format, including video, audio print and electronic text through OHBM OnDemand, social media channels, the OHBM website, or other electronic publications and media.

I accept

The Open Science Special Interest Group (OSSIG) is introducing a reproducibility challenge for OHBM 2025. This new initiative aims to enhance the reproducibility of scientific results and foster collaborations between labs. Teams will consist of a "source" party and a "reproducing" party, and will be evaluated on the success of their replication, the openness of the source work, and additional deliverables. Propose your OHBM abstract(s) as source work for future OHBM meetings by selecting one of the following options:

I do not want to participate in the reproducibility challenge.

Please indicate below if your study was a "resting state" or "task-activation" study.

Other

Healthy subjects only or patients (note that patient studies may also involve healthy subjects):

Patients

Was this research conducted in the United States?

No

Was any human subjects research approved by the relevant Institutional Review Board or ethics panel? NOTE: Any human subjects studies without IRB approval will be automatically rejected.

Yes

Was any animal research approved by the relevant IACUC or other animal research panel? NOTE: Any animal studies without IACUC approval will be automatically rejected.

Not applicable

Please indicate which methods were used in your research:

Other, Please specify  -   ECoG, Decoding

Which processing packages did you use for your study?

FreeSurfer

Provide references using APA citation style.

Card, N. S., Wairagkar, M., Iacobacci, C., Hou, X., Singer-Clark, T., Willett, F. R., . . . Brandman, D. M. (2024). An accurate and rapidly calibrating speech neuroprosthesis. N. Engl. J. Med., 391(7), 609-618.
Chaudhary, U., Vlachos, I., Zimmermann, J. B., Espinosa, A., Tonin, A., Jaramillo-Gonzalez, A., . . . Birbaumer, N. (2022). Spelling interface using intracortical signals in a completely locked-in patient enabled via auditory neurofeedback training. Nat. Commun., 13(1), 1236.
Horikawa, T., & Kamitani, Y. (2017). Generic decoding of seen and imagined objects using hierarchical visual features. Nat. Commun., 8, 15037.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst., 25, 1097-1105.
Metzger, S. L., Littlejohn, K. T., Silva, A. B., Moses, D. A., Seaton, M. P., Wang, R., . . . Chang, E. F. (2023). A high-performance neuroprosthesis for speech decoding and avatar control. Nature, 620(7976), 1037-1046.
Moses, D. A., Metzger, S. L., Liu, J. R., Anumanchipalli, G. K., Makin, J. G., Sun, P. F., . . . Chang, E. F. (2021). Neuroprosthesis for decoding speech in a paralyzed person with anarthria. N. Engl. J. Med., 385(3), 217-227.
Pylyshyn, Z. (2003). Return of the mental image: are there really pictures in the brain? Trends Cogn. Sci., 7(3), 113-118.
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., . . . Clark, J. (2021). Learning transferable visual models from natural language supervision. Paper presented at the International Conference on Machine Learning.
Vansteensel, M. J., Leinders, S., Branco, M. P., Crone, N. E., Denison, T., Freudenburg, Z. V., . . . Ramsey, N. F. (2024). Longevity of a brain-computer interface for amyotrophic lateral sclerosis. N. Engl. J. Med., 391(7), 619-626.
Willett, F. R., Avansino, D. T., Hochberg, L. R., Henderson, J. M., & Shenoy, K. V. (2021). High-performance brain-to-text communication via handwriting. Nature, 593(7858), 249-254.

UNESCO Institute of Statistics and World Bank Waiver Form

I attest that I currently live, work, or study in a country on the UNESCO Institute of Statistics and World Bank List of Low and Middle Income Countries list provided.

No