Efficient Implementation of Global Variance Compensation for Parametric Speech Synthesis

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

This paper proposes a simple and efficient technique for variance compensation to improve the perceptual quality of synthetic speech in parametric speech synthesis. First, we analyze the problem of spectral and F0 enhancement with global variance (GV) in HMM-based speech synthesis. In the conventional GV-based parameter generation, the enhancement is achieved by taking account of a GV probability density function with fixed GV model parameters for every output utterance through the speech parameter generation process. We find that the use of fixed GV parameters results in much smaller variations of GVs in synthesized utterances than those in natural speech. In addition, the computational cost is high because of iterative optimization. This paper examines these issues in terms of multiple objective measures such as variance characteristics, GV distortions, and GV correlations. We propose a simple and fast compensation method based on a global affine transformation that provides a GV distribution closer to that of natural speech and improves the correlation of GVs between natural and generated parameter sequences. The experimental results demonstrate that the proposed variance compensation methods outperform the conventional GV-based parameter generation in terms of objective and subjective speech similarity to natural speech while maintaining speech naturalness.
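The idea of compensating variance with an affine transformation can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so the code below is only an assumption-laden illustration: it rescales a single generated parameter track around its own utterance-level mean so that its variance matches a given target GV. The function name `compensate_variance`, the argument `target_gv`, and the sample values are hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import fmean, pvariance


def compensate_variance(track, target_gv):
    """Affine-rescale one parameter track so its variance equals target_gv.

    Illustrative assumption: a per-dimension scale-and-shift around the
    utterance mean. This is not the paper's confirmed formulation.
    """
    mean = fmean(track)
    gv = pvariance(track, mu=mean)  # utterance-level (global) variance
    if gv == 0.0:
        # A constant track has no variation to scale; return it unchanged.
        return list(track)
    scale = sqrt(target_gv / gv)
    # y = scale * (x - mean) + mean keeps the mean and sets the variance.
    return [scale * (x - mean) + mean for x in track]


# Hypothetical over-smoothed track (e.g. one spectral coefficient over time).
smoothed = [1.0, 1.2, 0.9, 1.1, 0.8]
enhanced = compensate_variance(smoothed, target_gv=0.08)
```

Because the transformation is closed-form, it needs a single pass over the track rather than an iterative optimization, which is the kind of cost saving the abstract refers to. How the target GV is chosen for each utterance is left open here.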

Original language: English
Pages (from-to): 1694-1704
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 24
Issue number: 10
Publication status: Published - 2016 Oct

Keywords

  • HMM-based speech synthesis
  • affine transformation
  • global variance
  • over-smoothing problem
  • variance compensation

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
