An HMM-based segment quantizer and its application to low bit rate speech coding

Motoyuki Suzuki, Masashi Adachi, Minoru Kohata, Akinori Ito, Shozo Makino, Fuji Ren

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Several speech coding systems employ a segment quantizer instead of a vector quantizer. One of the most important problems for such systems is how to construct the segment codebook. In this paper, a new speech coder based on ML-BEATS is proposed. ML-BEATS is an HMM-based segment quantizer. It first splits a vector sequence into several sub-sequences, and then clusters these sub-sequences to construct a codebook. Each cluster center is represented by a left-to-right HMM. In the encoding process, input speech is matched against the HMMs in the codebook, and the HMM index and duration information are sent to the decoder. In the decoding process, a decoded sequence is generated from the HMM parameters by applying the HMM-based speech synthesis method. In the experiments, the HMM-based speech coder gave a spectral distortion of 1.13 dB at 5.83 bit/frame. This distortion is 0.11 dB higher than that of the G.729 coder, but the bit rate decreased only 32%. To address the shifting problem of LSP dimensions, we also propose a new codebook construction method. Many training vectors are extracted from the training samples by shifting dimensions, and all of them are used to construct a universal codebook. The universal codebook can deal with any shifted vector because all possibilities are included in the training data. In the experiments, the shifted vector method encoded input speech at a very low bit rate, but it gave higher spectral distortion.
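
The encode/decode loop described in the abstract (match the input against codebook entries, transmit an index and a duration, regenerate the sequence from the selected entry) can be illustrated with a deliberately simplified sketch. This is not ML-BEATS itself. It assumes each codebook entry is reduced to a left-to-right list of state mean vectors, that states are spread uniformly over a segment rather than aligned by an HMM, and that squared error stands in for the HMM likelihood. The function names, `max_dur`, and `seg_penalty` (a stand-in for the per-segment bit cost) are hypothetical and do not come from the paper.

```python
import math

def stretch(states, duration):
    """Spread a left-to-right state sequence uniformly over `duration` frames."""
    n = len(states)
    return [states[t * n // duration] for t in range(duration)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(frames, codebook, max_dur=8, seg_penalty=0.1):
    """Choose segment boundaries and codebook indices by dynamic programming.

    Returns a list of (index, duration) pairs, one per segment.
    """
    T = len(frames)
    best = [0.0] + [math.inf] * T   # best[t]: minimal cost of coding frames[:t]
    back = [None] * (T + 1)         # back[t]: (start, index) of the last segment
    for t in range(1, T + 1):
        for s in range(max(0, t - max_dur), t):
            for k, states in enumerate(codebook):
                model = stretch(states, t - s)
                cost = best[s] + seg_penalty + sum(
                    sq_dist(f, m) for f, m in zip(frames[s:t], model))
                if cost < best[t]:
                    best[t], back[t] = cost, (s, k)
    code, t = [], T
    while t > 0:                    # trace the chosen segments back to frame 0
        s, k = back[t]
        code.append((k, t - s))
        t = s
    return code[::-1]

def decode(code, codebook):
    """Regenerate a frame sequence from (index, duration) pairs."""
    out = []
    for k, dur in code:
        out.extend(stretch(codebook[k], dur))
    return out

def bits_per_frame(code, codebook_size, max_dur):
    """Average rate, assuming fixed-length codes for index and duration."""
    bits_per_segment = (math.ceil(math.log2(codebook_size))
                        + math.ceil(math.log2(max_dur)))
    return bits_per_segment * len(code) / sum(d for _, d in code)

# Toy data: one-dimensional "spectral" frames and a two-entry codebook.
codebook = [[[0.0], [1.0]], [[5.0], [5.0]]]
frames = [[0.0], [0.0], [1.0], [1.0], [5.0], [5.0], [5.0]]
code = encode(frames, codebook)
print(code)                                       # [(0, 4), (1, 3)]
print(decode(code, codebook) == frames)           # True
print(bits_per_frame(code, len(codebook), 8))
```

Because only one index and one duration are sent per segment, the rate per frame falls as segments get longer, which is how a segment quantizer can reach rates of a few bits per frame. A real HMM-based coder would replace the uniform `stretch` with state alignment and parameter generation from the HMM.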

Original language: English
Title of host publication: 20th International Congress on Acoustics 2010, ICA 2010 - Incorporating Proceedings of the 2010 Annual Conference of the Australian Acoustical Society
Pages: 3877-3880
Number of pages: 4
Publication status: Published - 2010
Event: 20th International Congress on Acoustics 2010, ICA 2010 - Incorporating the 2010 Annual Conference of the Australian Acoustical Society - Sydney, NSW, Australia
Duration: 2010 Aug 23 to 2010 Aug 27

Publication series

Name: 20th International Congress on Acoustics 2010, ICA 2010 - Incorporating Proceedings of the 2010 Annual Conference of the Australian Acoustical Society
Volume: 5

Conference

Conference: 20th International Congress on Acoustics 2010, ICA 2010 - Incorporating the 2010 Annual Conference of the Australian Acoustical Society
Country/Territory: Australia
City: Sydney, NSW
Period: 2010 Aug 23 to 2010 Aug 27
