     14                                                                UEC Int’l Mini-Conference No.54
            Figure 4: Qualitative comparison on SFUC between our method and the previous state-of-the-art method, FontDiffuser. The red boxes highlight areas where FontDiffuser struggles.
FID and LPIPS, with improvements of 0.1622 and 0.0345, respectively, over FontDiffuser. This shows that our model not only produces more realistic images, but that its outputs are also perceptually closer to the target fonts. Our approach yielded slightly higher L1 and RMSE scores, but the values are close enough to imply only a slight trade-off in pixel-level accuracy. Structural similarity is also well preserved, since the SSIM scores of the two methods are fairly close to each other. Table 2 demonstrates that for unseen fonts and characters, BengaliDiff is more effective than FontDiffuser. Its results are more realistic and perceptually closer to the targets (lower FID and LPIPS), and it maintains the structure properly. The small differences in L1 and RMSE indicate that the trade-offs at the pixel level are minor, and the SSIM scores suggest that shape is also preserved well. Fig. 5 visually compares both methods on UFUC. BengaliDiff generates characters with correct shapes but sometimes fails to match the style of the reference font, such as stroke thickness or curve design. FontDiffuser, in contrast, often shows distorted or broken glyphs. Overall, BengaliDiff is better at preserving content, but style transfer to unseen fonts still needs improvement.

All these results show that our model produces better-quality font images on average, generalizes well to a wide variety of Bangla characters, and produces font images of comparatively better aesthetic quality, making it more useful in real-world OCR applications and digital typography.

4.4 Ablation Studies

We conducted ablation tests to evaluate the efficacy of every component of our approach. The qualitative results presented in Fig. 6 illustrate the effectiveness of the various elements of our model, and Table 3 shows the Quan-





