Unit selection speech synthesis using multiple speech units at non-adjacent segments for prosody and waveform generation

Masatsune Tamura, Norbert Braunschweiler, Takehiko Kagoshima, Masami Akamine

Research output: Chapter in book/report/conference proceedings, conference contribution, peer-reviewed

3 citations (Scopus)

Abstract

In this paper, we propose a speech synthesis method that combines natural-waveform-concatenation-based speech synthesis with our baseline plural unit selection and fusion method. The two main features of the proposed method are (i) prosody regeneration from the selected speech units and (ii) the use of multiple speech units at non-adjacent segments. A non-adjacent segment is a segment whose preceding or following speech unit in the optimum speech unit sequence is not adjacent to it in the database. Using the prosody of the selected speech units retains the original prosodic expression and sound of the recorded speech, while using multiple speech units at non-adjacent segments reduces discontinuities. MOS evaluations showed that the proposed method provides a clear improvement over both the conventional unit selection method and our baseline method.
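
The definition of a non-adjacent segment given in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each selected unit is identified by a recorded-utterance ID and its position within that utterance; the names `Unit`, `is_contiguous`, and `non_adjacent_segments` are hypothetical and do not come from the paper, which may represent database adjacency differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Unit:
    # Hypothetical identifiers (not from the paper): the recorded utterance
    # a unit was cut from, and its index within that utterance.
    utterance_id: int
    position: int


def is_contiguous(left: Unit, right: Unit) -> bool:
    """True if `right` directly follows `left` in the recorded database."""
    return (left.utterance_id == right.utterance_id
            and right.position == left.position + 1)


def non_adjacent_segments(sequence: List[Unit]) -> List[int]:
    """Return indices of segments whose previous or following selected unit
    is not adjacent in the database (assumed reading of the abstract)."""
    flagged = []
    for i, unit in enumerate(sequence):
        prev_break = i > 0 and not is_contiguous(sequence[i - 1], unit)
        next_break = (i < len(sequence) - 1
                      and not is_contiguous(unit, sequence[i + 1]))
        if prev_break or next_break:
            flagged.append(i)
    return flagged


# Three consecutive units from utterance 0, then two from utterance 3.
selected = [Unit(0, 5), Unit(0, 6), Unit(0, 7), Unit(3, 2), Unit(3, 3)]
print(non_adjacent_segments(selected))
```

In this sketch only the segments on either side of the join between the two utterances are flagged; under the abstract's description, those are the segments where multiple speech units would be used, while the contiguous runs keep the natural recorded waveform.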

Original language: English
Title of host publication: 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4802-4805
Number of pages: 4
ISBN (Print): 9781424442966
DOI
Publication status: Published - 2010
Event: 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010 - Dallas, TX, United States
Duration: 2010-03-14 to 2010-03-19

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010
Country/Territory: United States
City: Dallas, TX
Period: 2010-03-14 to 2010-03-19
