UEC Int'l Mini-Conference No.52
Odor Detection Based on Semantic Context in Video Subtitles
Yi Chen Shen∗1, Haruka MATSUKURA2, and Maki SAKAMOTO2
1 UEC Exchange Study Program (JUSST Program)
2 Graduate School of Informatics and Engineering
The University of Electro-Communications, Tokyo, Japan
Abstract
Our research introduces an innovative approach to enhancing multimedia experiences by developing
a software system that uses advanced language models to detect and predict relevant odors from
video subtitles. This enables the addition of olfactory information to conventional multimedia, which
typically comprises only visual and audio elements. By applying fine-tuning and modern prompt
engineering techniques to large language models, we achieved over 95% accuracy in odor prediction
tasks. Our comprehensive evaluation, including both model comparisons and a user study with a
13-component olfactory display, demonstrates the system’s effectiveness in terms of accuracy, cost-
efficiency, and user engagement. This work showcases the potential for further advancements in
multimedia experiences through synchronized odor integration, paving the way for more immersive
and engaging content consumption.
Keywords: Olfactory Display, Virtual Reality, Artificial Intelligence, Large Language Model, Human-
Computer Interaction.
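The subtitle-to-odor prediction summarized above can be pictured as a prompt-construction and label-parsing step around a language model. The sketch below is a hypothetical illustration only: the odor labels, prompt wording, and function names are assumptions for clarity (not the authors' actual 13-component label set), and the model call itself is left as a stub.

```python
# Hypothetical sketch of prompt-based odor prediction from a subtitle line.
# Labels and prompt text are illustrative assumptions; a real system would
# send the prompt to a large language model instead of using a stubbed reply.

ODOR_LABELS = ["coffee", "ocean", "grass", "smoke", "flowers", "none"]

def build_prompt(subtitle: str) -> str:
    """Construct a zero-shot classification prompt for one subtitle line."""
    return (
        "You are an odor annotator for video scenes.\n"
        f"Choose the single most relevant odor from {ODOR_LABELS} "
        "for the subtitle below, or 'none' if no odor is implied.\n"
        f'Subtitle: "{subtitle}"\n'
        "Odor:"
    )

def parse_label(model_reply: str) -> str:
    """Map a raw model reply onto a known label, defaulting to 'none'."""
    reply = model_reply.strip().lower()
    for label in ODOR_LABELS:
        if label in reply:
            return label
    return "none"

# Usage with a stubbed model reply (an LLM call would go here):
prompt = build_prompt("She poured a fresh cup of coffee.")
label = parse_label("Coffee")
```

Mapping the free-form model reply back onto a fixed label set keeps the downstream olfactory display driver simple, since each predicted label can correspond to one scent component.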
1 Introduction

In today's world, multimedia experiences are evolving beyond traditional audiovisual content to create more immersive and engaging interactions. While significant advancements have been made in visual and auditory technologies, the integration of olfactory stimuli, our sense of smell, remains largely unexplored and underutilized. This research aims to bridge this sensory gap by developing an innovative software system that analyzes video subtitles to detect and suggest corresponding scents, thereby enhancing the overall multimedia experience.

Our approach harnesses the power of advanced language models to understand the context and nuances of language used in subtitles. By identifying phrases or words that hint at specific scents, our system enables a dynamic and responsive scent experience synchronized with video content. This paper explores the development of our software, focusing on how it processes textual information and predicts relevant odor labels using state-of-the-art language models, fine-tuning techniques, and modern prompt engineering methods.

The key objectives of this research are:

1. To develop a robust system for detecting and predicting odors based on semantic analysis of video subtitles.
2. To evaluate and compare the performance of various large language models in the context of odor prediction.
3. To assess the real-world effectiveness and user perception of olfactory-enhanced multimedia through a comprehensive user study.

Our work builds upon existing research in olfactory displays and multi-modal sensory integration, while introducing novel techniques

∗ The author is supported by JASSO Scholarship.