
UEC Int’l Mini-Conference No.53                                                               75









               Bengali Diff: Diffusion Model for One-Shot Bengali Font Generation


                     Md Bilayet HOSSAIN¹, Honghui YUAN², and Keiji YANAI²
                                   1 UEC Exchange Study Program (JUSST Program)
                                               2 Department of Informatics
                                The University of Electro-Communications, Tokyo, Japan



             Keywords: Font Generation, Deep Learning, Diffusion Model



                                                        Abstract
                    Bengali is written in a script with intricate characters and ligatures, which has left it
                 underrepresented in font generation research. Existing methods, including FontDiffuser, struggle
                 to produce high-quality Bengali fonts because of the script’s complexity. To address this
                 challenge, we propose Bengali Diff, a diffusion-based model that generates Bengali fonts with
                 high fidelity. Our approach builds on image-to-image translation, preserving character structure
                 and keeping style consistent across fonts. We use multi-scale content aggregation (MCA) to
                 maintain the structural integrity of each character and style contrastive refinement (SCR) to
                 align the generated fonts with the desired style. To evaluate FontDiffuser’s effectiveness for
                 Bengali fonts, we conduct initial experiments on a small-scale dataset, which provide insights
                 into model performance. A larger and more diverse dataset is essential for robust, high-quality
                 font generation, although it requires significant training time and computational resources. As
                 future work, we plan to expand the dataset and optimize training strategies for better
                 efficiency. This research also introduces a public Bengali font dataset and a pre-trained model,
                 supporting digital publishing, handwriting recognition, and custom typography.
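
                 The style-alignment idea behind SCR can be illustrated with a generic InfoNCE-style
                 contrastive loss. The sketch below is not the authors' implementation: the function names,
                 the use of cosine similarity, the temperature value, and the toy two-dimensional "style
                 embeddings" are all assumptions made for illustration. It only shows the general principle
                 that the loss is small when the generated glyph's style embedding is closer to the target
                 style than to other styles.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def style_contrastive_loss(generated, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss over style embeddings (illustrative assumption).

    generated: embedding of the generated glyph
    positive:  embedding of the target style
    negatives: embeddings of other styles
    """
    # The positive logit comes first, followed by one logit per negative.
    logits = [cosine(generated, positive) / temperature]
    logits += [cosine(generated, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability assigned to the positive pair.
    return log_denominator - logits[0]


# Toy embeddings (hypothetical values, not taken from the paper).
generated = [1.0, 0.0]
target_style = [0.9, 0.1]
other_styles = [[0.0, 1.0], [-1.0, 0.0]]

matched = style_contrastive_loss(generated, target_style, other_styles)
mismatched = style_contrastive_loss(
    generated, other_styles[0], [target_style, other_styles[1]]
)
print(matched < mismatched)
```

                 In a real training setup the embeddings would come from a learned style encoder and the
                 loss would be computed on batched tensors; the pure-Python form here is only meant to make
                 the objective concrete.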































               ∗ The author is supported by a JASSO Scholarship.