Speech recognition based on tree-structured clustering and aspect model in multiple noise environments

Seong Jun Hahm, Yuichi Ohkawa, Motoyuki Suzuki, Masashi Ito, Shozo Makino, Akinori Ito

Research output: Contribution to conference › Paper › peer-review

Abstract

In this paper, we propose a speech recognition method for multiple noise environments that uses cluster-specific aspect models based on tree-structured clustering. The multi-condition hidden Markov model (MC-HMM) is one of the standard methods for speech recognition in noisy environments. Although the MC-HMM is simple, it is known to be robust against various noises, and it is therefore regarded as a "standard" noise-robust acoustic model. However, it is difficult to train a model with a large number of parameters to represent wide variability. We use a tree-structured clustering method to avoid this problem. After the cluster models are trained, cluster-specific aspect models are trained using the results of the tree-structured clustering. Each cluster-specific aspect model can represent the latent characteristics of the specific noisy environments included in a given cluster. The adaptation method is based on the aspect model, which is a "mixture-of-mixture" model. To realize adaptation with an extremely small amount of adaptation data (i.e., a few seconds), we first select the model according to the result of a binary search of the tree structure and then train a small number of mixture models, which can be interpreted as models for "subclusters" of the cluster models. The experimental results showed that adaptation based on the cluster-specific aspect model improved word accuracy in a heavy noise environment.
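
The two adaptation steps described in the abstract (selecting a cluster by descending a binary tree, then fitting a small mixture over "subcluster" models) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that each node is scored by the average log-likelihood of the adaptation data, that one-dimensional Gaussians stand in for the HMM acoustic models, and that adaptation re-estimates only the mixture weights by EM. All names (`Node`, `select_cluster`, `estimate_weights`) and all numbers are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (toy stand-in for an acoustic model score)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def avg_loglik(data, model):
    """Average log-likelihood of the data under a (mean, var) model."""
    mean, var = model
    return sum(log_gauss(x, mean, var) for x in data) / len(data)

class Node:
    """Binary tree node holding a cluster model; leaves also hold subcluster models."""
    def __init__(self, model, left=None, right=None, submodels=None):
        self.model = model
        self.left = left
        self.right = right
        self.submodels = submodels or []

def select_cluster(root, data):
    """Descend the tree, moving to the child whose model fits the data better (assumed criterion)."""
    node = root
    while node.left is not None and node.right is not None:
        if avg_loglik(data, node.left.model) >= avg_loglik(data, node.right.model):
            node = node.left
        else:
            node = node.right
    return node

def estimate_weights(data, submodels, iters=20):
    """EM over mixture weights only; the component models stay fixed.
    A real system would work in the log domain to avoid underflow."""
    k = len(submodels)
    w = [1.0 / k] * k
    for _ in range(iters):
        counts = [0.0] * k
        for x in data:
            joint = [w[j] * math.exp(log_gauss(x, *submodels[j])) for j in range(k)]
            total = sum(joint)
            for j in range(k):
                counts[j] += joint[j] / total  # responsibility of component j for x
        w = [c / len(data) for c in counts]
    return w

# Hypothetical two-leaf tree: each leaf carries two subcluster models (mean, var).
leaf_a = Node((0.0, 1.0), submodels=[(-1.0, 1.0), (1.0, 1.0)])
leaf_b = Node((10.0, 1.0), submodels=[(9.0, 1.0), (11.0, 1.0)])
root = Node((5.0, 26.0), left=leaf_a, right=leaf_b)

adaptation_data = [10.8, 11.2, 10.9, 11.5]  # made-up "few seconds" of features
cluster = select_cluster(root, adaptation_data)
weights = estimate_weights(adaptation_data, cluster.submodels)
```

Because only the mixture weights are estimated, the number of free parameters is the number of subcluster models minus one. This is one plausible reading of why such a scheme could adapt from very little data.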

Original language: English
Pages: 454-457
Number of pages: 4
Publication status: Published - 2010 Dec 1
Event: 2nd Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2010 - Biopolis, Singapore
Duration: 2010 Dec 14 → 2010 Dec 17

Other

Other: 2nd Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2010
Country/Territory: Singapore
City: Biopolis
Period: 10/12/14 → 10/12/17

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
