Automatic generation of speech synthesis units based on closed loop training

Takehiko Kagoshima, Masami Akamine

Research output: Contribution to journal, Conference article, peer-review

9 Citations (Scopus)


This paper proposes a new method for automatically generating speech synthesis units. A small set of synthesis units is selected from a large speech database by the proposed Closed-Loop Training (CLT) method. Because CLT evaluates and minimizes the distortion introduced by the synthesis process itself, such as prosodic modification, the selected synthesis units are the ones best suited to the synthesizer. In this paper, CLT is applied to a waveform-concatenation synthesizer whose basic unit is CV/VC (diphone). It is shown that CLT can generate synthesis units efficiently from a labeled speech database with a small amount of computation. Moreover, the synthesized speech is clear and smooth even though the storage size of the waveform dictionary is small.
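As a rough illustration of the closed-loop idea in the abstract, the Python sketch below picks a small set of units from a toy database by measuring the distortion that remains after each candidate has been passed through a stand-in synthesis step. It is not the authors' algorithm. The greedy search, the linear-interpolation `resample` used in place of real prosodic modification, the squared-error `distortion`, and all function names are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def resample(unit: Sequence[float], length: int) -> List[float]:
    """Stretch or shrink a non-empty `unit` to `length` samples.

    Assumption: linear interpolation is a crude stand-in for the
    prosodic (duration) modification a real synthesizer would apply.
    """
    if length == 1 or len(unit) == 1:
        return [unit[0]] * length
    scale = (len(unit) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(unit) - 1)
        frac = pos - lo
        out.append(unit[lo] * (1 - frac) + unit[hi] * frac)
    return out


def distortion(unit: Sequence[float], target: Sequence[float]) -> float:
    """Squared error between the modified unit and the natural target segment."""
    synth = resample(unit, len(target))
    return sum((s - t) ** 2 for s, t in zip(synth, target))


def select_units(segments: List[Sequence[float]], k: int) -> Tuple[List[int], float]:
    """Greedily choose up to `k` segments as synthesis units.

    Each step adds the candidate that most reduces the total distortion
    when every training segment is synthesized from its best chosen unit.
    Assumption: greedy search is used here only for simplicity.
    """
    n = len(segments)
    # cost[c][t]: distortion when segment c is used to synthesize segment t.
    cost = [[distortion(segments[c], segments[t]) for t in range(n)] for c in range(n)]
    chosen: List[int] = []
    best = [float("inf")] * n  # best distortion so far for each training segment
    for _ in range(min(k, n)):
        pick = min(
            (c for c in range(n) if c not in chosen),
            key=lambda c: sum(min(best[t], cost[c][t]) for t in range(n)),
        )
        chosen.append(pick)
        best = [min(best[t], cost[pick][t]) for t in range(n)]
    return chosen, sum(best)


if __name__ == "__main__":
    # Hypothetical toy "database": two triangular shapes and two flat ones.
    database = [[0, 1, 0], [0, 0.5, 1, 0.5, 0], [1, 1, 1], [1, 1, 1, 1]]
    units, total = select_units(database, k=2)
    print("chosen unit indices:", units, "total distortion:", total)
```

The point of the sketch is that the selection criterion is computed on the output of the synthesis step rather than on the raw segments, so a unit is kept only if it still matches the training data after modification.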

Original language: English
Pages (from-to): 963-966
Number of pages: 4
Journal: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Publication status: Published - 1997
Event: Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 5) - Munich, Ger
Duration: 1997 Apr 21 to 1997 Apr 24

