
UEC Int’l Mini-Conference No.54









                 Advanced EEG-to-Text Translation Using Pre-trained Language
                                  Models and Multi-Modal Transformers


                             Jose Manuel CARRICHI CHAVEZ*1 and Toru NAKASHIKA2

                                   1 UEC Exchange Study Program (JUSST Program)
                                   2 Department of Computer and Network Engineering
                                The University of Electro-Communications, Tokyo, Japan



             Keywords: Brain-Computer Interfaces (BCIs), EEG-to-Text, Large Language Models (LLMs), Multi-
             Modal Transformers, Neural Signal Decoding.



                                                        Abstract
                    This research proposes an advanced system for translating non-invasive electroencephalography
                 (EEG) signals into coherent natural-language text, leveraging recent developments in both Brain-
                 Computer Interfaces (BCIs) and Large Language Models (LLMs). While EEG-based text generation
                 has shown promising results, current methods are often constrained by closed vocabularies and external
                 dependencies such as eye-tracking. Building on the EEG2TEXT framework, this study integrates
                 a pre-trained convolutional transformer encoder with a multi-view transformer for spatial modeling of
                 brain activity. The resulting representations are decoded by a multimodally adapted LLM, such as Llama,
                 enabling open-vocabulary, semantically rich text generation. Self-supervised pre-training and
                 multi-modal fine-tuning are employed to enhance signal understanding and language coherence.
                 Experiments will be conducted on the ZuCo dataset, with the aim of outperforming existing methods on
                 metrics such as BLEU and ROUGE. The anticipated outcome is a robust, generalizable EEG-to-text system
                 with improved fluency, accuracy, and semantic depth, contributing novel tools and insights to the BCI
                 and NLP communities.
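Since the proposed system is to be scored with BLEU (among other metrics), the following is a minimal sentence-level BLEU sketch in plain Python showing what that evaluation computes: modified n-gram precision combined with a brevity penalty. This is an illustrative implementation only, not the scorer the study will necessarily use; standard toolkits apply corpus-level statistics and more careful smoothing.

```python
from collections import Counter
import math

def ngram_counts(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with uniform weights and a brevity penalty.

    Zero n-gram overlaps are floored at a tiny epsilon so the geometric
    mean stays defined (a crude stand-in for proper smoothing).
    """
    precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append((overlap or 1e-9) / total)
    # brevity penalty: punish candidates shorter than the reference
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 1.0, while partial overlap yields a score between 0 and 1; ROUGE is computed analogously but recall-oriented against the reference.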




























                * The author is supported by JASSO Scholarship.