TY - GEN
T1 - Character Expressions in Meta-Learning for Extremely Low Resource Language Speech Recognition
AU - Zhou, Rui
AU - Ito, Akinori
AU - Nose, Takashi
N1 - Publisher Copyright:
© 2024 Owner/Author.
PY - 2024/2/2
Y1 - 2024/2/2
N2 - Constructing a high-quality speech recognition system requires a substantial volume of annotated speech data. However, preparing such expansive datasets is impracticable for the vast majority of the world's languages. Therefore, we need to develop speech recognition systems for low-resource languages. In this paper, we propose a method in which the model is first pretrained on speech data from ten languages using meta-learning. The model then undergoes fine-tuning on a small amount of speech data from the target language. Using only about 15 minutes of speech, we achieved a character error rate (CER) of less than 30%. Although the alphabet differs from language to language, alphabets often show phonetic resemblances. Capitalizing on this observation, we propose an alphabet unification method that uses English pronunciation rules based on the Latin alphabet as a standard to align the alphabetic representation across the languages under study. Our results show that this alphabet unification approach enhances performance.
AB - Constructing a high-quality speech recognition system requires a substantial volume of annotated speech data. However, preparing such expansive datasets is impracticable for the vast majority of the world's languages. Therefore, we need to develop speech recognition systems for low-resource languages. In this paper, we propose a method in which the model is first pretrained on speech data from ten languages using meta-learning. The model then undergoes fine-tuning on a small amount of speech data from the target language. Using only about 15 minutes of speech, we achieved a character error rate (CER) of less than 30%. Although the alphabet differs from language to language, alphabets often show phonetic resemblances. Capitalizing on this observation, we propose an alphabet unification method that uses English pronunciation rules based on the Latin alphabet as a standard to align the alphabetic representation across the languages under study. Our results show that this alphabet unification approach enhances performance.
KW - Character unification
KW - Low resource
KW - Meta-learning
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85196183062&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85196183062&partnerID=8YFLogxK
U2 - 10.1145/3651671.3651730
DO - 10.1145/3651671.3651730
M3 - Conference contribution
AN - SCOPUS:85196183062
T3 - ACM International Conference Proceeding Series
SP - 525
EP - 529
BT - Proceedings of the 2024 16th International Conference on Machine Learning and Computing, ICMLC 2024
PB - Association for Computing Machinery
T2 - 16th International Conference on Machine Learning and Computing, ICMLC 2024
Y2 - 2 February 2024 through 5 February 2024
ER -