UEC Int’l Mini-Conference No.53
Bengali Diff: Diffusion Model for One-Shot Bengali Font Generation
Md Bilayet Hossain*, Honghui Yuan and Keiji Yanai
UEC Exchange Study Program (JUSST Program)
Department of Informatics
The University of Electro-Communications, Tokyo, Japan
h2495009@gl.cc.uec.ac.jp
1. Introduction

Bengali is a widely spoken language with a unique script that includes vowels, consonants, and complex characters. However, it has been less explored in the field of font generation, and traditional methods often struggle to create accurate Bengali fonts that capture all the details of the script. Recently, new AI techniques, especially diffusion models, have shown great potential in font creation. This research presents Bengali Diff, a model that uses diffusion-based methods [2] to generate high-quality Bengali fonts, inspired by the FontDiffuser [1] approach.
2. Research Objectives

- Develop a generative model capable of producing high-quality Bengali fonts from a single reference style.
- Preserve intricate strokes and conjunct characters through multi-scale content aggregation (MCA) blocks. [1]
- Implement a Style Contrastive Refinement (SCR) module to enhance style adaptation across different font types. [1]
- Evaluate the model’s effectiveness using structural similarity, perceptual loss, and human evaluation [2]; a small structural-similarity sketch follows this list.
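As a concrete illustration of the structural-similarity criterion named above, the sketch below scores a generated glyph against its ground-truth rendering with SSIM. The file names, the 96x96 image size, and the use of scikit-image are assumptions for illustration, not details from the original work.

```python
# Minimal sketch: scoring a generated glyph against its ground truth with SSIM.
# File names and the 96x96 size are illustrative assumptions.
import numpy as np
from PIL import Image
from skimage.metrics import structural_similarity as ssim

def load_glyph(path, size=(96, 96)):
    """Load a glyph image as a grayscale uint8 array."""
    return np.array(Image.open(path).convert("L").resize(size))

generated = load_glyph("generated_ka.png")        # hypothetical model output
ground_truth = load_glyph("ground_truth_ka.png")  # hypothetical target rendering

# SSIM is 1.0 for identical images; higher means structurally closer.
score = ssim(generated, ground_truth, data_range=255)
print(f"SSIM: {score:.3f}")
```

FID and the human study would complement this per-glyph score, aggregated over a full test set.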
Figure 1: Overview of Font Generation [2]

3. Methodology

We propose Bengali Diff, a font generation framework built on the diffusion-based FontDiffuser approach [1].

Figure 2: Overview of our proposed method [1]. (a) Conditional diffusion for font generation: a source encoder with MCA blocks and a reference style encoder condition a UNet that produces the generated image. (b) Style Contrastive Refinement: a VGG style extractor and style projector map the generated and reconstructed images into a style vector space.
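To make the conditional-diffusion branch of Figure 2(a) concrete, below is a hedged sketch of one training step in this style of model: noise is added to the target glyph, and a UNet conditioned on content features from the source glyph and a style embedding from the reference glyph learns to predict that noise. The module names, tensor shapes, and the `denoise_unet` interface are illustrative assumptions, not the authors' actual implementation.

```python
# Hedged sketch of one conditional-diffusion training step (assumed interfaces).
import torch
import torch.nn.functional as F

def training_step(denoise_unet, content_encoder, style_encoder,
                  source_glyph, reference_glyph, target_glyph, alphas_cumprod):
    """DDPM-style step: predict the noise added to the target glyph,
    conditioned on source content and reference style (illustrative only)."""
    batch = target_glyph.size(0)
    t = torch.randint(0, len(alphas_cumprod), (batch,), device=target_glyph.device)

    # Forward diffusion: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps
    noise = torch.randn_like(target_glyph)
    a_bar = alphas_cumprod[t].view(batch, 1, 1, 1)
    noisy = a_bar.sqrt() * target_glyph + (1 - a_bar).sqrt() * noise

    # Conditioning: multi-scale content features (MCA input) and a style vector.
    content_feats = content_encoder(source_glyph)
    style_emb = style_encoder(reference_glyph)

    pred_noise = denoise_unet(noisy, t, content_feats, style_emb)
    return F.mse_loss(pred_noise, noise)
```

In a FontDiffuser-style design, the MCA blocks would inject `content_feats` at several UNet resolutions so that fine strokes survive the denoising process.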
Our method’s working procedure is as follows:

- Input Processing: converts a reference Bengali font into high-resolution glyph images.
- Diffusion Model: gradually refines noisy input to learn structure and style. [2]
- Enhancement: MCA preserves strokes; SCR ensures style adaptation [1] (a contrastive-loss sketch follows this list).
- Output & Evaluation: generates fonts, assessed by SSIM, FID, and human tests.
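The SCR idea can be illustrated with an InfoNCE-style contrastive loss that pulls the generated glyph's style embedding toward the matching reference style and pushes it away from other fonts. The temperature, tensor shapes, and the way negatives are batched are assumptions for illustration, not the exact formulation in [1].

```python
# Hedged sketch of a style-contrastive (InfoNCE-style) loss for SCR (assumed shapes).
import torch
import torch.nn.functional as F

def style_contrastive_loss(gen_style, pos_style, neg_styles, temperature=0.07):
    """gen_style:  (B, D) style embeddings of generated glyphs
       pos_style:  (B, D) embeddings of the matching reference style
       neg_styles: (B, K, D) embeddings of K other (negative) fonts"""
    gen = F.normalize(gen_style, dim=-1)
    pos = F.normalize(pos_style, dim=-1)
    neg = F.normalize(neg_styles, dim=-1)

    pos_logit = (gen * pos).sum(-1, keepdim=True)       # (B, 1)
    neg_logits = torch.einsum("bd,bkd->bk", gen, neg)   # (B, K)
    logits = torch.cat([pos_logit, neg_logits], dim=1) / temperature

    # The positive style is always class 0.
    target = torch.zeros(gen.size(0), dtype=torch.long, device=gen.device)
    return F.cross_entropy(logits, target)
```

Here `gen_style` and `pos_style` would come from the VGG style extractor and style projector shown in Figure 2(b).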
4. Experimental Results

Generation Results:

(Results figure: generated glyphs comparing the source character, the reference style, the FontDiffuser output, and ours.)

Our method could generate unseen characters and styles based on the reference image.

Result Discussion:

- Bengali is rarely covered in the font generation field, which is why FontDiffuser could not produce good results on it.
- The style encoder struggles with the intricacies of Bengali fonts, leading to blurred results.
- A low guidance scale and too few diffusion steps lead to noisy and incomplete outputs (see the sampling sketch at the end of this section).
- Unicode or font rendering inconsistencies affect accuracy.

Improvements: better preprocessing, fine-tuned hyperparameters, Bengali-specific training, and GPU usage.
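To make the guidance-scale and step-count discussion concrete, below is a hedged sketch of classifier-free-guidance sampling: with a guidance scale near 1 or very few steps, outputs remain noisy and weakly conditioned, which matches the failure mode noted above. The scheduler and the `denoise_unet` interface are assumptions carried over from the training sketch, not the actual implementation.

```python
# Hedged sketch of DDPM sampling with classifier-free guidance (assumed interfaces).
import torch

@torch.no_grad()
def sample(denoise_unet, content_feats, style_emb, shape, betas, guidance_scale=7.5):
    """Iteratively denoise from pure noise; more steps (len(betas)) and a larger
    guidance_scale generally give cleaner, more style-faithful glyphs."""
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)

    for t in reversed(range(len(betas))):
        t_batch = torch.full((shape[0],), t, dtype=torch.long)

        # Classifier-free guidance: mix conditional and unconditional predictions.
        eps_cond = denoise_unet(x, t_batch, content_feats, style_emb)
        eps_uncond = denoise_unet(x, t_batch, None, None)
        eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)

        # Standard DDPM posterior mean; add noise on all but the final step.
        a_t, a_bar_t = alphas[t], alphas_cumprod[t]
        x = (x - (1 - a_t) / (1 - a_bar_t).sqrt() * eps) / a_t.sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x
```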
5. Expected Outcomes

✅ A novel diffusion-based Bengali font generation model.
✅ A publicly available Bengali font dataset and pre-trained model.
✅ Comparative analysis demonstrating improvements over existing font generation techniques. [1]
✅ Applications in digital publishing, handwriting recognition, and personalized typography.

6. Conclusion

Bengali Diff leverages diffusion models for one-shot Bengali font generation, preserving intricate details with high accuracy [1]. This research contributes to digital typography and automated font synthesis, with applications in publishing, handwriting recognition, and personalized design. Future work will enhance efficiency and expand dataset diversity for broader usability.
[1] Yang, Z., Peng, D., Kong, Y., Zhang, Y., Yao, C., Jin, L.: FontDiffuser: One-shot font generation via denoising diffusion with multi-scale content aggregation and style contrastive learning. In: AAAI, 2024.
[2] Yuan, H., Yanai, K.: KuzushijiFontDiff: Diffusion model for Japanese Kuzushiji font generation. In: International Conference on Multimedia Modeling, Springer Nature Singapore, 2025.