
UEC Int’l Mini-Conference No.54                                                               15
            Figure 5: Qualitative comparison on UFUC between our method and the previous state-of-the-art
            method, FontDiffuser. The red boxes highlight the areas that are challenging for FontDiffuser.


            titative evaluation results of ablation studies. Specifically, Fig. 6 compares the visual
            quality of generated Bangla characters across several model configurations: Baseline,
            Baseline + Cross-Attention (CA), and Baseline + Cross-Attention with Discriminator (CA + D),
            alongside the ground truth (Target) and the reference font. Examples of the input character
            and the style appear in the source and reference columns. In the Baseline column, the
            overall shape of the characters remains intact, but the complicated strokes show a
            significant number of distortions and the ligatures are misrepresented. This indicates that
            the baseline cannot capture detailed style information.

            Figure 6: Qualitative evaluation results of ablation studies, illustrating several modules.
            CA and D denote Cross-Attention and Discriminator, respectively. Red boxes mark the missing
            strokes, while green boxes mark the corresponding improvements.

              When we add the Cross-Attention (CA) module, the visual results improve significantly. The
            CA module allows better alignment between the reference style and the target character
            structure, leading to clearer glyph shapes and more accurate stroke positioning. For example,
            in the second row, the complex character shows better integration of the matra (horizontal
            stroke) and conjunct formation compared to the baseline. However, although CA improves style
            transfer, some inconsistencies remain in the finer details,