3. Deep Speech-to-Text Models Capture the Neural Basis of Spontaneous Speech in Everyday Conversations

Leonard Niekerken (Presenter)
Princeton University
Princeton, NJ 
United States
Tuesday, Jun 25: 12:00 PM - 1:15 PM
2602 
Oral Sessions 
COEX 
Room: Grand Ballroom 101-102 
One of the most distinctively human behaviors is our ability to use language for communication during spontaneous conversations. Here, we collected continuous speech recordings and concurrent neural signals from epilepsy patients during their week-long hospital stays, resulting in a uniquely large ECoG dataset of 100 hours of speech recorded during spontaneous, open-ended conversations. Deep learning provides a novel computational framework that embraces the multidimensional, context-dependent nature of language (Goldstein et al., 2022; Schrimpf et al., 2021). We use Whisper, a deep multimodal speech-to-text model (Radford et al., 2022), to investigate the neural basis of speech processing.