Page 14 - 2025S
UEC Int’l Mini-Conference No.54
BengaliDiff: Diffusion Model for Few-Shot Bengali Font Generation
Md Bilayet HOSSAIN ∗1, Honghui YUAN, Shabnur Annona AKHY, and Keiji YANAI 2
1 UEC Exchange Study Program (JUSST Program)
2 Department of Informatics
The University of Electro-Communications, Tokyo, Japan
Abstract
Bengali is a script-rich language with complex characters and ligatures, and it has received little
attention in font generation research. Existing font generation methods have achieved good results
for Chinese, English, and other scripts. However, owing to the complexity of Bengali characters,
recent methods such as FontDiffuser do not produce high-quality Bengali fonts. We propose
BengaliDiff, a novel generative model that combines a diffusion-based architecture, style-content
fusion, and adversarial supervision to synthesize Bengali characters in a target font style. We adopt
an image-to-image translation methodology, which improves font generation by preserving character
structure while applying a uniform style across different fonts. We build our approach on
FontDiffuser but use a dual aggregation cross-attention scheme that injects content and style
features at the channel and spatial levels separately into the reverse denoising process. In
addition, we embed an adversarial discriminator that promotes stylistically coherent and
perceptually accurate generations. Experiments on a predefined set of Bengali fonts show that
BengaliDiff achieves better content preservation and style consistency than existing baselines. To
the best of our knowledge, our method is the first to apply a diffusion model to the Bengali font
generation task. We also provide a publicly available Bengali font dataset and a pre-trained model
to better support digital publishing, handwritten text recognition, and custom typography.
Keywords: Font Generation, Diffusion Model, Bengali Font
∗ The author is supported by a JASSO Scholarship.
1 Introduction
Bengali is among the most widely spoken languages in the world, with more than 200 million
speakers. It has a rich and distinctive script that forms a significant part of its culture and
heritage. However, high-quality Bengali fonts remain scarce compared with those for Latin and other
scripts. Designing new Bengali fonts by hand requires skilled designers and significant time
because Bengali letters have complex shapes. This motivates automatic Bengali font generation,
which can save design effort and encourage the creation of new fonts. One of the long-standing
challenges in digital typography has been producing visually coherent and stylistically consistent
fonts from a small number of references [2, 9, 18]. Bengali script poses unique problems with its
multi-glyph structure and diverse glyph composition, which are not addressed by generic font
generation methods developed for Latin or logographic scripts (such as Chinese). A robust model for
Bengali font synthesis must address several underlying problems, such as conjunct forms, complex
ligatures, and matra (the rendering and baseline positioning of floating vowel signs). Traditional
generative approaches [7, 10, 14], from style transfer to GAN-based font generation [16], struggle
with these script-specific subtleties. Recently, diffusion models built on a noise-to-denoise
paradigm have achieved promising results on Latin and Chinese characters, such as FontDiffuser [19]
and Diff-