A NEW METRIC FOR STOCHASTIC LANGUAGE MODEL EVALUATION

Akinori Ito, Masaki Kohda, Mari Ostendorf

Research output: Contribution to conference › Paper › peer-review

Abstract

Although perplexity correlates well with word error rate (WER) within a simple n-gram framework such as the Wall Street Journal task, it has been reported that perplexity correlates poorly with WER when a more complicated LM is used. In this paper, a global measure for language model evaluation is proposed which achieves higher correlation with word accuracy. The metric is based on the difference in LM score between a word in the evaluation text and the word that gives the maximum score in that context. Two experiments were carried out to investigate the correlation between word accuracy and the proposed measure. In the first experiment, LMs were created using n-gram adaptation by n-gram count mixture; 47 LMs were created by changing the mixture weight and the vocabulary cut-off threshold. The correlation between perplexity and word accuracy was very poor (correlation coefficient -0.36), whereas the proposed metric gave a much higher correlation (correlation coefficient 0.82). In the second experiment, a simple mixture trigram model was employed to recognize Switchboard task data. The highest correlation between word accuracy and the proposed measure was 0.81, much higher than the correlation between perplexity (PP) and accuracy, 0.59.
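The abstract describes the metric only in words, so the following is a minimal sketch of one possible reading, not the authors' exact definition. It assumes the "LM score" is a log probability and that the per-word differences are averaged over the evaluation text; both choices, the toy bigram table `TOY_LM`, and the function names `score_gap` and `perplexity` are hypothetical and introduced here for illustration only.

```python
import math

# Hypothetical toy bigram model: P(word | previous word).
# The probabilities are made up for illustration, not taken from the paper.
TOY_LM = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "mat": 0.2},
    "cat": {"sat": 0.7, "ran": 0.3},
}


def score_gap(lm, text):
    """Average, over the text, of log P(actual word | context) minus the
    log probability of the best-scoring word in the same context.

    Assumed reading of the abstract: the result is always <= 0, and it is 0
    when every word in the text is the model's top choice in its context.
    """
    total = 0.0
    prev = "<s>"
    for word in text:
        dist = lm[prev]
        best = max(dist.values())
        total += math.log(dist[word]) - math.log(best)
        prev = word
    return total / len(text)


def perplexity(lm, text):
    """Standard per-word perplexity of the text under the same model."""
    log_sum = 0.0
    prev = "<s>"
    for word in text:
        log_sum += math.log(lm[prev][word])
        prev = word
    return math.exp(-log_sum / len(text))


if __name__ == "__main__":
    for text in (["the", "cat", "sat"], ["the", "dog"]):
        print(text, score_gap(TOY_LM, text), perplexity(TOY_LM, text))
```

Under this reading, a text whose every word is the model's top choice has a gap of zero even though its perplexity stays above 1, which illustrates how the two measures can rank models differently.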

Original language: English
Pages: 1591-1594
Number of pages: 4
Publication status: Published - 1999
Externally published: Yes
Event: 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 - Budapest, Hungary
Duration: 1999 Sept 5 to 1999 Sept 9

Conference

Conference: 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999
Country/Territory: Hungary
City: Budapest
Period: 99/9/5 to 99/9/9

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Linguistics and Language
  • Communication
