UEC Int’l Mini-Conference No.53

Bengali Diff: Diffusion Model for One-Shot Bengali Font Generation

Md Bilayet HOSSAIN 1, Honghui YUAN 2, and Keiji YANAI 2
1 UEC Exchange Study Program (JUSST Program)
2 Department of Informatics
The University of Electro-Communications, Tokyo, Japan

Keywords: Font Generation, Deep Learning, Diffusion Model

Abstract
Bengali is a script-rich language with intricate characters and ligatures, which has left it
largely unexplored in font generation research. Existing methods, including FontDiffuser, struggle
to produce high-quality Bengali fonts because of the script's complexity. To address this challenge,
we propose Bengali Diff, a diffusion-based model that generates Bengali fonts with high fidelity.
Our approach builds on image-to-image translation techniques, preserving character structure and
keeping style consistent across fonts. We use multi-scale content aggregation (MCA) to maintain the
structural integrity of each character and style contrastive refinement (SCR) to align the generated
fonts with the desired style. To evaluate FontDiffuser's effectiveness for Bengali fonts, we conduct
initial experiments on a small-scale dataset, which provide insights into model performance. A larger
and more diverse dataset is essential for robust, high-quality font generation, but it requires
significant training time and computational resources. As part of our long-term vision, we plan to
expand the dataset and optimize training strategies for better efficiency. This research also
introduces a public Bengali font dataset and a pre-trained model, enabling better support for digital
publishing, handwriting recognition, and custom typography.
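
As an illustration of the idea behind style contrastive refinement, the sketch below computes an InfoNCE-style contrastive loss between a generated glyph's style feature and the target style feature, with other styles as negatives. The abstract does not state the exact loss used, so the InfoNCE form, the function names (cosine, info_nce), the temperature value, and the toy 3-dimensional feature vectors are all assumptions made for this example, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.07):
    """InfoNCE loss: -log(exp(s_pos / t) / sum_k exp(s_k / t)).

    anchor    : style feature of the generated glyph (hypothetical)
    positive  : style feature of the target font (hypothetical)
    negatives : style features of other fonts (hypothetical)
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating for stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]

# Toy style features; a real system would take these from a style encoder.
generated = [0.9, 0.1, 0.0]
target_style = [1.0, 0.0, 0.0]
other_styles = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Loss when the positive really is the target style.
loss_matched = info_nce(generated, target_style, other_styles)
# Loss when a different style is (wrongly) treated as the positive.
loss_mismatched = info_nce(generated, other_styles[0],
                           [target_style, other_styles[1]])
```

The loss is small when the generated feature lies close to the target style and far from the other styles, and grows when the roles are swapped; minimizing such a term is one way a model can be pushed toward the desired style.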

∗ The author is supported by JASSO Scholarship.
