
UEC Int’l Mini-Conference No.52                                                               59

                Beyond Word Count: Exploring Approximated Target Lengths for

                                                     CIF-RNNT


                                          Wen Shen TEO∗, Yasuhiro MINAMI
                                   Department of Computer and Network Engineering
                                       The University of Electro-Communications
                                                      Tokyo, Japan


             Keywords: streaming speech recognition, self-information, decoding speed, word segmentation



                                                        Abstract
                    Our previous work proposed the CIF-RNNT architecture, a combination of Continuous
                 Integrate-and-Fire (CIF) and RNN-Transducers (RNN-T) that compresses speech into units
                 equivalent to linguistic words to achieve efficient decoding. This work extends that research
                 by investigating the impact of different target length definitions, approximated from
                 self-information and token count. Our results on English and Japanese datasets show that
                 approximated target length types based on self-information outperform simpler approaches,
                 and that CIF-RNNT models even surpass topline models on the Japanese dataset at smaller
                 chunk sizes. Furthermore, our comparisons demonstrate an inherent ability of CIF-RNNT to
                 produce output tokens in groups of words, regardless of the target length type. These
                 results showcase the potential of the CIF-RNNT architecture for efficient and accurate
                 speech recognition.

               ∗ The author is supported by (AiQuSci) MEXT Scholarship.