
UEC Int’l Mini-Conference No.54                                                                7








             BengaliDiff: Diffusion Model for Few-Shot Bengali Font Generation


              Md Bilayet HOSSAIN∗1, Honghui YUAN2, Shabnur Annona AKHY1, and Keiji YANAI2
                                  1 UEC Exchange Study Program (JUSST Program)
                                         2 Department of Informatics
                              The University of Electro-Communications, Tokyo, Japan





                                                       Abstract


                   Bengali is a script-rich language with complex characters and ligatures, and it has rarely been
                studied in the field of font generation. Existing font generation methods have achieved good results
                on Chinese, English, and other scripts. However, due to the complexity of Bengali characters, recent
                methods such as FontDiffuser do not produce high-quality Bengali fonts. We propose BengaliDiff, a
                novel generative model that uses a diffusion-based architecture, style-content fusion, and adversarial
                supervision to synthesize Bengali characters in a target font style. We adopt an image-to-image
                translation methodology, which improves font generation by preserving the structure of each character
                while applying a uniform style across different fonts. We build our approach on FontDiffuser but use
                a dual aggregation cross-attention scheme to inject content and style features separately at the
                channel and spatial levels into the reverse denoising process. In addition, we embed an adversarial
                discriminator that promotes stylistically coherent and perceptually accurate generations. Experiments
                on a predefined set of Bengali fonts show that BengaliDiff outperforms existing baselines in content
                preservation and style consistency. To the best of our knowledge, our method is the first to apply a
                diffusion model to the Bengali font generation task. This study also provides a publicly available
                Bengali font dataset and a pre-trained model to better support digital publishing, handwritten text
                recognition, and custom typography.
            Keywords: Font Generation, Diffusion model, Bengali Font


               ∗ The author is supported by JASSO Scholarship.

            1    Introduction

            Bengali is among the most widely spoken languages in the world, with more than 200 million
            speakers. It has a rich and distinctive script that constitutes a significant component of its
            culture and heritage. However, high-quality Bengali fonts are very limited compared to fonts for
            Latin and other scripts. Building new Bengali fonts by hand requires skilled designers and
            significant time, as Bengali letters have complex shapes. This motivates the automatic generation
            of Bengali fonts, which can save design time and encourage the creation of new fonts. One of the
            long-standing challenges in digital typography has been producing visually coherent and
            stylistically consistent fonts from a small number of references [2, 9, 18]. Bengali script, with
            its multi-glyph structure and diverse glyph composition, raises unique issues that are not
            addressed by generic font generation methods developed for Latin or logographic scripts (such as
            Chinese). A robust model for Bengali font synthesis must address a number of underlying problems,
            such as conjunct forms, complex ligatures, and matra (the rendering and baseline positioning of
            floating vowel signs). Traditional generative approaches [7, 10, 14], from style transfer to
            GAN-based font generation [16], have trouble with these script-specific subtleties. Recently,
            promising results on Latin and Chinese characters have been reported for diffusion models built on
            a noise-to-noise formulation, such as FontDiffuser [19] and Diff-