
UEC Int’l Mini-Conference No.52                                                               33







            in natural language processing for odor detec-    2.3 Odor Detection from Visual Modalities
            tion. By combining advanced AI techniques
            with practical implementation strategies, we
            aim to demonstrate a new video viewing experi-    A related line of our research [5] explored de-
            ence that incorporates sophisticated text-based   tecting odor-related objects in images using an
            odor analysis, making media experiences more      object detection model trained on 12 categories
            immersive and accessible.                         of common odors in everyday life in Japan.
                                                              Their work demonstrated the feasibility of
                                                              extracting odor information from visual data,
                                                              achieving around 60% accuracy in detecting
            2 Related Work                                    odor-related objects in images. While Eda et
                                                              al. focused on the visual modality, our research
            2.1 Relationship of Odor and Emotion              aims to complement their approach by detect-
                                                              ing odors from textual data, specifically video
                                                              subtitles.  By leveraging advanced language
            The relationship between odor and emotion is      models and prompt engineering techniques,
            intricate and profoundly significant in evoking   we seek to understand the semantic context
            sensory memories and experiences. The Prous-      and nuanced language cues that suggest the
            tian Memory Effect, which describes the vivid     presence of specific odors.
            recollection of memories triggered by specific
            odors, underscores how smell can elicit power-      Our text-based system could analyze subtitles
            ful emotional connections [1, 2]. The challenge   to detect odor-related phrases or descriptions,
            lies in accurately interpreting the linguistic con-  while the object detector could simultaneously identify
            text of odors and mapping it onto a precise       odor-related objects in the corresponding video
            odor label.                                       frames. The combined information from both
                                                              modalities could be fused to enhance the
                                                              accuracy and specificity of odor predictions,
            2.2 Limitations of Odor Fusion                    offering a more robust and reliable solution.

            The current technologies for olfactory displays
            rely on odor libraries composed of available
            odorants mixed to replicate desired smells.       3 Methodology
            This approach, however, has significant lim-      The system architecture is designed to process a
            itations. First, creating a comprehensive         video source and export a subtitle file for further
            library to cover all possible odors is currently  use. First, we extract the transcript from the
            infeasible due to technological constraints, as   video's audio track using Whisper, an automatic
            each additional odor complicates the system       speech recognition (ASR) model by OpenAI that
            and increases its cost. Moreover, the human       can run locally [6]. Next, for each subtitle in the
            olfactory system is highly sophisticated,         transcript, we send a request to obtain a predicted
            utilizing complex neural pathways and pattern     odor label. Finally, we update the subtitle
            recognition capabilities that are challenging to  file to record each prediction and export it. The
            mimic artificially [3].
                                                              proposed system flow is shown in Figure 1.
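
The three steps of the methodology (transcribe, predict a label per subtitle, export the annotated subtitle file) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Whisper calls (`whisper.load_model`, `transcribe`, and the `segments` list in the result) follow the openai-whisper package. Everything else is an assumption made for illustration: the SRT output format, the `[odor: ...]` annotation style, the function names, and `keyword_predictor`, a hypothetical keyword matcher that stands in for the odor-label request, with placeholder labels that are not the paper's 12 categories.

```python
from typing import Callable, Optional

# Hypothetical keyword-to-label table; NOT the 12 odor categories of the paper.
HYPOTHETICAL_KEYWORDS = {"coffee": "coffee", "curry": "curry"}


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def transcribe(video_path: str, model_name: str = "base") -> list:
    """Step 1: run Whisper locally (needs the openai-whisper package and ffmpeg)."""
    import whisper  # imported lazily so the rest of the sketch runs without it

    result = whisper.load_model(model_name).transcribe(video_path)
    return result["segments"]  # dicts containing "start", "end" and "text"


def keyword_predictor(text: str) -> Optional[str]:
    """Hypothetical stand-in for the odor-label request described in the text."""
    lowered = text.lower()
    for keyword, label in HYPOTHETICAL_KEYWORDS.items():
        if keyword in lowered:
            return label
    return None


def annotate(segments: list, predict_odor: Callable[[str], Optional[str]]) -> list:
    """Step 2: request an odor label for each subtitle segment."""
    return [{**seg, "odor": predict_odor(seg["text"])} for seg in segments]


def to_srt(segments: list) -> str:
    """Step 3: serialize segments as SRT, appending the label when one exists."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        if seg.get("odor"):
            text += f" [odor: {seg['odor']}]"  # assumed annotation style
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

With Whisper installed, `to_srt(annotate(transcribe("video.mp4"), keyword_predictor))` would produce the annotated subtitle text, where `video.mp4` is a placeholder path. Passing the predictor as a function keeps the transcription and export steps unchanged if the keyword matcher is swapped for a language-model request.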
              Recognizing these limitations, our project fo-  3.1 Olfactory Display
            cuses on a practical set of 12 predefined smells
            commonly encountered in everyday life and me-     Our research employs the latest model of an
            dia in Japan, such as food and natural odors [4].  advanced olfactory display system developed by
            This selection aims to maintain a manageable      Nakamoto’s laboratory [7]. This compact device
            range of odors that can be accurately repro-      features 13 odor components and operates on
            duced and controlled.                             the principle of rapid solenoid valve switching