Abstract
This paper describes a language modeling technique based on a kind of stochastic context-free grammar (SCFG), the stochastic dependency grammar (SDG). Two improvements are made to the general CFG-based SCFG model. The first is to use a restricted grammar instead of a general CFG. The dependency grammar used here is a restricted CFG that expresses modification between two words or phrases. The derivation probabilities are estimated by the inside-outside algorithm. The computational complexity of the estimation is reduced from O(N^3 L^3) to O(N^2 L^3), where N is the number of nonterminals and L is the length of a sentence. The second improvement is word grouping, which is introduced to further reduce the estimation time. The basic idea is that a regular grammar is applied within a group and a CFG is used to express intergroup relationships. A new algorithm is introduced to realize this idea. When a group contains two words on average, the training time is reduced to about one-eighth. Two experiments were carried out to investigate the performance of the proposed model. In the first experiment, various kinds of SCFGs were compared in terms of perplexity (PP). The results showed that the proposed model has much lower PP than the original model. As for training speed, the restricted grammar made the training process twenty times faster, and word grouping made it eight times faster. In the second experiment, the proposed model was used as a language model for large-vocabulary continuous speech recognition (LVCSR). The results showed that the proposed model was as good as the bigram and trigram models, and that combining the trigram with the proposed model achieved a further improvement in word error rate (WER).
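For context on the baseline cost quoted in the abstract, the following is a minimal sketch of the standard inside pass for a general SCFG in Chomsky normal form. It is not the restricted dependency-grammar algorithm proposed in the paper, which the abstract does not spell out. The function name, the rule-dictionary layout, and the toy grammar are assumptions made for illustration. The nested loops visit every span, every split point, and every binary rule, which is where a cost that is cubic in both sentence length and number of nonterminals comes from when all rule combinations are allowed.

```python
from collections import defaultdict


def inside_probability(words, binary_rules, lexical_rules, start="S"):
    """Inside pass for a CNF SCFG (illustrative, hypothetical interface).

    binary_rules:  dict mapping (A, B, C) -> P(A -> B C)
    lexical_rules: dict mapping (A, word) -> P(A -> word)
    Returns the total probability that `start` derives `words`.
    """
    n = len(words)
    # inside[(i, j)][A] = probability that A derives words[i:j]
    inside = defaultdict(lambda: defaultdict(float))

    # Spans of width 1 are filled from the lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical_rules.items():
            if word == w:
                inside[(i, i + 1)][a] += p

    # Wider spans: O(L^2) spans x O(L) split points x up to N^3 binary rules.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (a, b, c), p in binary_rules.items():
                    left = inside[(i, k)].get(b, 0.0)
                    right = inside[(k, j)].get(c, 0.0)
                    if left and right:
                        inside[(i, j)][a] += p * left * right

    return inside[(0, n)].get(start, 0.0)


# Toy grammar (an assumption for illustration): S -> S S (0.4), S -> a (0.6)
toy_binary = {("S", "S", "S"): 0.4}
toy_lexical = {("S", "a"): 0.6}
p_two = inside_probability(["a", "a"], toy_binary, toy_lexical)
p_three = inside_probability(["a", "a", "a"], toy_binary, toy_lexical)
```

In the full inside-outside procedure, a matching outside pass and a re-estimation step would follow; they are omitted here to keep the sketch short.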
Original language | English |
---|---|
Title of host publication | 6th International Conference on Spoken Language Processing, ICSLP 2000 |
Publisher | International Speech Communication Association |
ISBN (Electronic) | 7801501144, 9787801501141 |
Publication status | Published - 2000 |
Externally published | Yes |
Event | 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China; Duration: 2000 Oct 16 → 2000 Oct 20 |
Other
Other | 6th International Conference on Spoken Language Processing, ICSLP 2000 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 00/10/16 → 00/10/20 |
ASJC Scopus subject areas
- Linguistics and Language
- Language and Linguistics