Character Expressions in Meta-Learning for Extremely Low Resource Language Speech Recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Building a high-quality speech recognition system requires a substantial volume of annotated speech data. However, preparing such large datasets is impracticable for the vast majority of the world's languages, so speech recognition systems that work for low-resource languages are needed. In this paper, we propose a method in which the model is first pretrained on speech data from ten languages using meta-learning and then fine-tuned on a small amount of speech data in the target language. Using only about 15 minutes of speech, we achieved a character error rate (CER) of less than 30%. Although the alphabet differs from language to language, alphabets often show phonetic resemblances. Capitalizing on this observation, we propose an alphabet unification method that uses English pronunciation rules based on the Latin alphabet as a standard to align the alphabetic representations of the languages under study. Our results show that this alphabet unification approach improves performance.
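The abstract reports its result as a character error rate (CER). As a minimal sketch of how that metric is conventionally computed (character-level edit distance divided by reference length), and not the authors' evaluation code, the following may help. The function names `edit_distance` and `cer` are illustrative assumptions, and the abstract does not say whether any text normalization was applied before scoring.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted character out of five reference characters.
    print(cer("hello", "hallo"))
```

Under this definition, a CER below 30% means that fewer than 30 character edits per 100 reference characters are needed to turn the system output into the reference transcript.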

Original language: English
Title of host publication: Proceedings of the 2024 16th International Conference on Machine Learning and Computing, ICMLC 2024
Publisher: Association for Computing Machinery
Pages: 525-529
Number of pages: 5
ISBN (Electronic): 9798400709234
DOIs
Publication status: Published - 2024 Feb 2
Event: 16th International Conference on Machine Learning and Computing, ICMLC 2024 - Shenzhen, China
Duration: 2024 Feb 2 to 2024 Feb 5

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 16th International Conference on Machine Learning and Computing, ICMLC 2024
Country/Territory: China
City: Shenzhen
Period: 24/2/2 to 24/2/5

Keywords

  • Character unification
  • Low resource
  • Meta-learning
  • Speech recognition
