TY - GEN
T1 - Unit selection speech synthesis using multiple speech units at non-adjacent segments for prosody and waveform generation
AU - Tamura, Masatsune
AU - Braunschweiler, Norbert
AU - Kagoshima, Takehiko
AU - Akamine, Masami
PY - 2010
Y1 - 2010
N2 - In this paper, we propose a speech synthesis method that combines a speech synthesis method based on natural waveform concatenation with our baseline plural unit selection and fusion method. The two main features of the proposed method are (i) prosody regeneration from selected speech units and (ii) the use of multiple speech units at non-adjacent segments. A non-adjacent segment is a segment whose preceding or following speech unit in the optimum speech unit sequence is not adjacent to it in the database. By using the prosody of the selected speech units, the original prosodic expressions and sounds of the recorded speech are retained, while discontinuities are reduced by using multiple speech units at non-adjacent segments. MOS evaluations showed that the proposed method provides a clear improvement over the conventional unit selection method and our baseline method.
AB - In this paper, we propose a speech synthesis method that combines a speech synthesis method based on natural waveform concatenation with our baseline plural unit selection and fusion method. The two main features of the proposed method are (i) prosody regeneration from selected speech units and (ii) the use of multiple speech units at non-adjacent segments. A non-adjacent segment is a segment whose preceding or following speech unit in the optimum speech unit sequence is not adjacent to it in the database. By using the prosody of the selected speech units, the original prosodic expressions and sounds of the recorded speech are retained, while discontinuities are reduced by using multiple speech units at non-adjacent segments. MOS evaluations showed that the proposed method provides a clear improvement over the conventional unit selection method and our baseline method.
KW - Concatenative speech synthesis
KW - Prosody generation
KW - Unit fusion
KW - Unit selection
UR - http://www.scopus.com/inward/record.url?scp=78049384950&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78049384950&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2010.5495151
DO - 10.1109/ICASSP.2010.5495151
M3 - Conference contribution
AN - SCOPUS:78049384950
SN - 9781424442966
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4802
EP - 4805
BT - 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010
Y2 - 14 March 2010 through 19 March 2010
ER -