
UEC Int’l Mini-Conference No.54









                 A Novel Lightweight Pipeline for Real-Time Lesion Detection in

                                             Bronchoscopic Imaging


                     Tawhid Ahmed Komol∗¹, Norihiro Koizumi², Yu Nishiyama³, and Peiji Chen⁴
                                   1 UEC Exchange Study Program (JUSST Program)
                            2,4 Department of Mechanical and Intelligent Systems Engineering
                                   3 Department of Computer and Network Engineering
                                The University of Electro-Communications, Tokyo, Japan




             Keywords: Bronchoscopy, Real-Time Detection, Lesion Localization, Deep Learning, Medical Image
             Analysis, RT-DETR



                                                        Abstract
                    Bronchoscopy is a vital diagnostic tool for identifying lesions within the respiratory tract, but
                 most computer-aided detection systems still rely on static image classification models because
                 lightweight, optimized detection architectures are lacking. In this study, we present a lightweight
                 variant of the Real-Time Detection Transformer (RT-DETR) optimized for bronchoscopic lesion
                 detection. Using the BI2K dataset, which contains 1550 annotated bronchoscopic images (600 benign
                 and 950 malignant), we evaluate and compare the performance of four real-time object detection
                 models: YOLOv8-S, YOLO-NAS, standard RT-DETR, and our proposed RT-DETR Lite. RT-DETR Lite
                 achieves a competitive mean average precision (mAP) of 0.87 while reducing computational
                 complexity to 32 GFLOPs and running at 92 FPS on an RTX 3060 GPU. The architectural
                 optimizations are: replacing the ResNet50 backbone with MobileNetV3-Small, reducing the encoder
                 and decoder layers from 6 to 3, lowering the hidden dimension to 192, and decreasing the input
                 resolution to 416×416. This balance of speed and accuracy demonstrates the suitability of
                 RT-DETR Lite for real-time clinical use in bronchoscopy.
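
The architectural changes listed in the abstract can be summarized as a small configuration sketch. This is an illustration only, not the authors' implementation: `DetectorConfig`, `changed_fields`, and `frame_budget_ms` are hypothetical names introduced here, and baseline values that the abstract does not state (hidden dimension, input resolution) are left as `None` rather than guessed.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class DetectorConfig:
    """Hypothetical container for the hyperparameters named in the abstract."""
    backbone: str
    encoder_layers: int
    decoder_layers: int
    hidden_dim: Optional[int]   # None = not stated in the abstract
    input_size: Optional[int]   # side of the square input in pixels; None = not stated


# Baseline RT-DETR, limited to what the abstract states.
RT_DETR_BASE = DetectorConfig("ResNet50", 6, 6, None, None)

# RT-DETR Lite, with the values listed in the abstract.
RT_DETR_LITE = DetectorConfig("MobileNetV3-Small", 3, 3, 192, 416)


def changed_fields(base: DetectorConfig, lite: DetectorConfig) -> dict:
    """Return {field_name: (base_value, lite_value)} for every field that differs."""
    return {
        f.name: (getattr(base, f.name), getattr(lite, f.name))
        for f in fields(base)
        if getattr(base, f.name) != getattr(lite, f.name)
    }


def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given throughput."""
    return 1000.0 / fps


if __name__ == "__main__":
    for name, (old, new) in changed_fields(RT_DETR_BASE, RT_DETR_LITE).items():
        print(f"{name}: {old} -> {new}")
    print(f"frame budget at 92 FPS: {frame_budget_ms(92):.1f} ms")
```

At the reported 92 FPS, the average per-frame budget works out to roughly 10.9 ms, which gives a sense of the latency the reduced model has to fit within on the stated GPU.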


























               ∗ The author is supported by a JASSO Scholarship.