UEC Int’l Mini-Conference No.52 33
in natural language processing for odor detection. By combining advanced AI techniques with practical implementation strategies, we aim to demonstrate a new video viewing experience that incorporates sophisticated text-based odor analysis, making media experiences more immersive and accessible.

2 Related Work

2.1 Relationship of Odor and Emotion

The relationship between odor and emotion is intricate and plays a significant role in evoking sensory memories and experiences. The Proustian Memory Effect, which describes the vivid recollection of memories triggered by specific odors, underscores how smell can elicit powerful emotional connections [1, 2]. The challenge lies in accurately interpreting the linguistic context in which odors are mentioned and mapping it to a precise odor label.

2.2 Limitations of Odor Fusion

Current olfactory display technologies rely on odor libraries composed of available odorants that are mixed to replicate desired smells. This approach, however, has significant limitations. First, creating a comprehensive library that covers all possible odors is currently infeasible due to technological constraints, as each additional odor complicates the system and increases its cost. Moreover, the human olfactory system is highly sophisticated, utilizing complex neural pathways and pattern recognition capabilities that are challenging to mimic artificially [3].

Recognizing these limitations, our project focuses on a practical set of 12 predefined smells commonly encountered in everyday life and media in Japan, such as food and natural odors [4]. This selection aims to maintain a manageable range of odors that can be accurately reproduced and controlled.

2.3 Odor Detection from Visual Modalities

A related line of our research [5] explored detecting odor-related objects in images using an object detection model trained on 12 categories of common odors in everyday life in Japan. That work demonstrated the feasibility of extracting odor information from visual data, achieving around 60% accuracy in detecting odor-related objects in images. While Eda et al. focused on the visual modality, our research aims to complement their approach by detecting odors from textual data, specifically video subtitles. By leveraging advanced language models and prompt engineering techniques, we seek to understand the semantic context and nuanced language cues that suggest the presence of specific odors.

Our text-based system could analyze subtitles to detect odor-related phrases or descriptions, while the object detection model could simultaneously identify odor-related objects in the corresponding video frames. The combined information from both modalities could be fused to enhance the accuracy and specificity of odor predictions, offering a more robust and reliable solution.

3 Methodology

The system architecture is designed to process a video source and export a subtitle file for further use. First, we extract the transcript from the video's audio track using Whisper, an automatic speech recognition (ASR) model from OpenAI that can run locally [6]. Next, for each subtitle in the transcript, we send a request to obtain a potential odor label prediction. Finally, we update the subtitle file to record each prediction and export it. The proposed system flow is shown in Figure 1.

3.1 Olfactory Display

Our research employs the latest model of an advanced olfactory display system developed by Nakamoto's laboratory [7]. This compact device features 13 odor components and operates on the principle of rapid solenoid valve switching
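As a minimal sketch of the three-step flow described in Section 3 (transcribe, predict an odor label per subtitle, write the predictions back into the subtitle file), the Python below annotates Whisper-style segments and renders them as SRT text. Everything specific in it is an assumption made for illustration: the keyword lookup `keyword_predictor` stands in for the actual prediction request, the label names are placeholders rather than the 12 categories used in this work, and the `[odor: ...]` suffix is just one possible way to record a prediction in a subtitle.

```python
from typing import Callable, Dict, List, Optional

# Placeholder labels (assumption): the 12 odor categories used in the paper
# are not listed in this section.
KEYWORDS: Dict[str, str] = {"coffee": "coffee", "curry": "curry", "flower": "flower"}


def keyword_predictor(text: str) -> Optional[str]:
    """Hypothetical stand-in for the per-subtitle prediction request."""
    lowered = text.lower()
    for keyword, label in KEYWORDS.items():
        if keyword in lowered:
            return label
    return None


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def annotate(segments: List[dict], predict: Callable[[str], Optional[str]]) -> str:
    """Attach a predicted odor label to each subtitle and render SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        label = predict(text)
        if label is not None:
            text = f"{text} [odor: {label}]"  # annotation format is an assumption
        timing = f"{srt_time(seg['start'])} --> {srt_time(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


# Segments shaped like the "segments" list in the result of Whisper's
# transcribe() (dicts with "start", "end", "text"); hard-coded here so the
# sketch runs without the ASR model.
segments = [
    {"start": 0.0, "end": 2.5, "text": " She pours a cup of coffee."},
    {"start": 2.5, "end": 5.0, "text": " The train leaves at noon."},
]
srt_text = annotate(segments, keyword_predictor)
print(srt_text)
```

Passing the predictor in as a callable keeps the transcription and export steps independent of how labels are obtained, so the keyword stub can be swapped for a language-model request without touching the rest of the flow.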