Parametric speech synthesis based on Gaussian process regression using global variance and hyperparameter optimization

Tomoki Koriyama, Takashi Nose, Takao Kobayashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Citations (Scopus)

Abstract

This paper examines two issues in a statistical speech synthesis approach based on Gaussian process (GP) regression. Although GP-based speech synthesis can outperform HMM-based synthesis in generating spectral parameters, a number of issues remain. In this paper, we incorporate a global variance (GV) feature into parameter generation to overcome the over-smoothing problem. Furthermore, in order to use a kernel function appropriate to the actual data, we propose an EM-based kernel hyperparameter optimization technique. Objective and subjective evaluation results show that using GV and hyperparameter estimation improves performance in spectral feature generation.

Original language: English
Title of host publication: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3834-3838
Number of pages: 5
ISBN (Print): 9781479928927
DOIs
Publication status: Published - 2014
Event: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration: 2014 May 4 to 2014 May 9

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country/Territory: Italy
City: Florence
Period: 14/5/4 to 14/5/9

Keywords

  • Gaussian process
  • global variance
  • kernel hyperparameter
  • statistical parametric speech synthesis
