Analytic Generation of Synthesis Units by Closed Loop Training for Totally Speaker Driven Text to Speech System (TOS Drive TTS)

Masami Akamine, Takehiko Kagoshima

Research output: Contribution to conference › Paper › peer-review

22 Citations (Scopus)

Abstract

This paper presents a new method for automatically generating speech synthesis units. The algorithm, called Closed-Loop Training (CLT), evaluates and reduces the distortion in synthesized speech. It analytically minimizes the distortion introduced by the synthesis process, such as prosodic modification. The distortion is measured as the error between synthesized speech units and natural speech units in a large speech database (corpus). CLT thus generates the synthesis units that most closely resemble natural speech after the synthesis process. In this paper, CLT is applied to a waveform-concatenation synthesizer whose basic unit is a diphone. With CLT, the synthesizer generates clear and smooth synthetic speech even with a relatively small volume of synthesis units.
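
The abstract does not give the formulation, so the following is only a minimal sketch of the closed-loop idea under a stated assumption: each synthesis-time modification (for example a prosodic change) is modeled as a known linear operator A_j acting on a unit u, and the distortion is the summed squared error against natural segments r_j. Under that assumption the minimizing unit has a closed-form solution via the normal equations. All names here (`train_unit`, `synthesize`, `distortion`, the toy `samples`) are hypothetical and are not taken from the paper.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def synthesize(A, u):
    """Apply an assumed linear modification operator A to unit u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def distortion(samples, u):
    """Summed squared error between synthesized and natural segments."""
    total = 0.0
    for A, r in samples:
        y = synthesize(A, u)
        total += sum((ri - yi) ** 2 for ri, yi in zip(r, y))
    return total


def train_unit(samples, n):
    """Closed-form minimizer: (sum A^T A) u = sum A^T r."""
    G = [[0.0] * n for _ in range(n)]
    h = [0.0] * n
    for A, r in samples:
        for row, ri in zip(A, r):
            for i in range(n):
                h[i] += row[i] * ri
                for k in range(n):
                    G[i][k] += row[i] * row[k]
    return solve_linear(G, h)


# Toy data (invented for illustration): two natural segments, each paired
# with the linear modification that would be applied at synthesis time.
samples = [
    ([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0]),  # no modification
    ([[0.5, 0.5]], [2.0]),                   # shortening by averaging
]
open_loop_unit = [1.0, 2.0]            # simply copy the first natural segment
closed_loop_unit = train_unit(samples, 2)
print(distortion(samples, open_loop_unit), distortion(samples, closed_loop_unit))
```

In this toy setting the unit obtained by solving the normal equations yields lower total distortion than a unit copied directly from one natural segment, which is the intuition behind evaluating units after the synthesis process rather than before it.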

Original language: English
Publication status: Published - 1998
Externally published: Yes
Event: 5th International Conference on Spoken Language Processing, ICSLP 1998 - Sydney, Australia
Duration: 1998 Nov 30 to 1998 Dec 4

Conference

Conference: 5th International Conference on Spoken Language Processing, ICSLP 1998
Country/Territory: Australia
City: Sydney
Period: 98/11/30 to 98/12/4

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
