Language modeling by stochastic dependency grammar for Japanese speech recognition

Akinori Ito, Chiori Hori, Masaharu Kotow, Masaki Kohda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper describes a language modeling technique based on a restricted kind of stochastic context-free grammar (SCFG), called stochastic dependency grammar (SDG). Two improvements are made to the general CFG-based SCFG model. The first is to use a restricted grammar instead of a general CFG: the dependency grammar used here is a restricted CFG that expresses modification between two words or phrases. The derivation probabilities are estimated by the inside-outside algorithm, and the computational complexity of the estimation is reduced from O(N^3 L^3) to O(N^2 L^3), where N is the number of nonterminals and L is the length of a sentence. The second is word grouping, which is introduced to reduce the estimation time further. The basic idea is to apply a regular grammar within a group and to use the CFG to express relationships between groups; a new algorithm is introduced to realize this idea. When a group contains two words on average, the learning time is reduced to about one-eighth. Two experiments were carried out to investigate the performance of the proposed model. In the first experiment, various kinds of SCFG were compared in terms of perplexity (PP). The results showed that the proposed model has much lower perplexity than the original model. As for training speed, the restricted grammar made the training process twenty times faster, and word grouping made it eight times faster. In the second experiment, the proposed model was used as the language model for large-vocabulary continuous speech recognition (LVCSR). The results showed that the proposed model performed as well as bigram and trigram models, and that combining the trigram model with the proposed model further improved the word error rate (WER).
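The complexity reduction described in the abstract can be illustrated with a small sketch of the inside pass. This is not the authors' implementation: it assumes, purely for illustration, that every binary rule has the form A -> B A (a modifier B attaches to a following head A, and the phrase keeps the head's symbol), so a rule is identified by a pair of nonterminals rather than a triple. The function name, the rule tables, and the toy grammar are all hypothetical.

```python
from collections import defaultdict


def inside_probability(words, binary, lexical, start):
    """Inside pass for a restricted SCFG (illustrative assumption, not the paper's code).

    binary[(a, b)] = P(a -> b a): modifier b followed by head a, result labelled a.
    lexical[(a, w)] = P(a -> w).
    Returns the probability that `start` derives the whole word sequence.
    """
    n = len(words)
    # beta[i, j, a] = probability that nonterminal a derives words[i:j]
    beta = defaultdict(float)

    # Spans of length 1 come from lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[i, i + 1, a] += p

    # Longer spans: O(L^2) spans, O(L) split points, and at most N^2 binary
    # rules, since a rule is a pair (a, b) rather than a triple of symbols.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b), p in binary.items():
                    beta[i, j, a] += p * beta[i, k, b] * beta[k, j, a]

    return beta[0, n, start]


# Hypothetical toy grammar: N = nominal, V = verbal.
# For each left-hand side the rule probabilities sum to 1.
BINARY = {("V", "N"): 0.4, ("N", "N"): 0.3}
LEXICAL = {("V", "runs"): 0.6, ("N", "dog"): 0.7}

if __name__ == "__main__":
    print(inside_probability(["dog", "runs"], BINARY, LEXICAL, "V"))
```

With this toy grammar, "dog runs" has a single derivation with probability 0.4 × 0.7 × 0.6, or approximately 0.168. In a general SCFG in Chomsky normal form the innermost loop would range over up to N^3 rules A -> B C, which is where the O(N^3 L^3) figure comes from.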

Original language: English
Title of host publication: 6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher: International Speech Communication Association
ISBN (Electronic): 7801501144, 9787801501141
Publication status: Published - 2000
Externally published: Yes
Event: 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China
Duration: 2000 Oct 16 → 2000 Oct 20

Other

Other: 6th International Conference on Spoken Language Processing, ICSLP 2000
Country/Territory: China
City: Beijing
Period: 00/10/16 → 00/10/20

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
