Feature enhancement by speaker-normalized SPLICE for robust speech recognition

Yusuke Shinohara, Takashi Masuko, Masami Akamine

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

The SPLICE method of feature enhancement is known for its strong performance. It learns a mapping from noisy to clean feature vectors given a set of stereo training data. However, feature vector variation caused by speaker changes conceals noise-induced variation, which is what we want to find in the SPLICE training. In this paper, an improvement of SPLICE by means of speaker normalization is proposed. The training data is first normalized with respect to speaker variation, and a mapping is learned afterward. CMLLR with a GMM as its target is utilized for the speaker normalization, where the GMM representing a standard speaker is learned via a novel variant of speaker adaptive training. The proposed method was evaluated on Aurora2, and achieved a relative word error rate reduction of 38% over the conventional SPLICE.
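
For orientation, the conventional SPLICE mapping that the abstract builds on can be sketched as follows. This is a minimal illustration of the commonly described bias-only formulation (a GMM over noisy features, with one correction vector per mixture component estimated from stereo data), not the authors' implementation. The function names, the diagonal-covariance assumption, and the choice to pass in a pre-trained GMM are all assumptions made for this sketch. The speaker-normalization step (CMLLR toward a standard-speaker GMM) is not implemented here; in the proposed method it would be applied to the features before both training and enhancement.

```python
import numpy as np


def gmm_posteriors(y, weights, means, variances):
    """Return p(k | y_n) for a diagonal-covariance GMM, shape (N, K).

    y: (N, D) noisy features; weights: (K,); means, variances: (K, D).
    """
    diff = y[:, None, :] - means[None, :, :]  # (N, K, D)
    log_lik = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )  # (N, K)
    log_post = np.log(weights) + log_lik
    # Subtract the row-wise maximum for numerical stability before exponentiating.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


def train_splice_biases(noisy, clean, weights, means, variances):
    """Estimate one correction vector per component from stereo (noisy, clean) pairs.

    r_k = sum_n p(k | y_n) (x_n - y_n) / sum_n p(k | y_n)
    """
    post = gmm_posteriors(noisy, weights, means, variances)  # (N, K)
    numerator = post.T @ (clean - noisy)                      # (K, D)
    denominator = post.sum(axis=0)[:, None]                   # (K, 1)
    # Guard against components that receive (almost) no data.
    return numerator / np.maximum(denominator, 1e-10)


def splice_enhance(noisy, biases, weights, means, variances):
    """Enhance noisy features: x_hat_n = y_n + sum_k p(k | y_n) r_k."""
    post = gmm_posteriors(noisy, weights, means, variances)
    return noisy + post @ biases
```

In this sketch, `train_splice_biases` is called once on the stereo training set, and `splice_enhance` is then applied to unseen noisy features using the same GMM parameters.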

Original language: English
Title of host publication: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Pages: 4881-4884
Number of pages: 4
Publication status: Published - 2008
Event: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP - Las Vegas, NV, United States
Duration: 2008 Mar 31 to 2008 Apr 4

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Country/Territory: United States
City: Las Vegas, NV
Period: 08/3/31 to 08/4/4

Keywords

  • Feature enhancement
  • Robust speech recognition
  • Speaker adaptive training
  • Speaker normalization
  • SPLICE
