14 UEC Int’l Mini-Conference No.54

Figure 4: Qualitative comparison on SFUC between our method and the previous state-of-the-art method, FontDiffuser. The red boxes highlight areas where FontDiffuser struggles.

FID and LPIPS, with improvements of 0.1622 and 0.0345, respectively, over FontDiffuser. This shows that our model not only produces more realistic images, but that its outputs are also perceptually closer to the target fonts. Our approach yielded slightly higher L1 and RMSE scores, but the scores are close enough to imply only a slight trade-off in pixel-level accuracy. Structural similarity is also well preserved, since the SSIM scores of the two methods are fairly close.

Table 2 demonstrates that for unseen fonts and characters, BengaliDiff is more effective than FontDiffuser. Its results are more realistic and perceptually closer to the targets (lower FID and LPIPS), and it maintains structure properly. The small differences in L1 and RMSE indicate that the trade-offs at the pixel level are minor, and the SSIM scores suggest that shape is well preserved. Fig. 5 visually compares both methods on UFUC. BengaliDiff generates characters with correct shapes but sometimes fails to match the style of the reference font, such as stroke thickness or curve design. FontDiffuser, in contrast, often shows distorted or broken glyphs. Overall, BengaliDiff is better at preserving content, but style transfer to unseen fonts still needs improvement.

All these results show that our model produces better-quality font images on average, generalizes well to a wide variety of Bangla characters, and produces font images of comparatively better aesthetic quality, making it more useful in real-world OCR applications and digital typography.

4.4 Ablation Studies

We conducted ablation tests to evaluate the efficacy of every component of our approach. The qualitative results presented in Fig. 6 illustrate the effectiveness of the various elements of our model, and Table 3 shows the Quan-
