Abstract
This paper proposes a technique for controlling singing style in HMM-based singing voice synthesis. A style control technique based on the multiple-regression hidden semi-Markov model (MRHSMM), originally proposed for HMM-based expressive speech synthesis, is applied to the conventional singing voice synthesis technique. Pitch adaptive training is introduced into the MRHSMM to improve the modeling accuracy of the fundamental frequency (F0) associated with notes. A robust vibrato modeling technique based on a moving average filter is also proposed to reproduce natural-sounding vibrato even when the vibrato in the original singing voice is unclear. Subjective evaluation results show that users can intuitively control the singing style while preserving the naturalness of the synthetic voice.
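The abstract does not say how the moving average filter is applied, so the following is only an illustrative sketch under one plausible assumption: a moving average over the log-F0 contour gives a slow note-level trend, and the residual is treated as the vibrato component. The function names, frame rate, vibrato rate, depth, and window length are all hypothetical and not taken from the paper.

```python
import math


def moving_average(values, window):
    """Centered moving average; the window shrinks at the sequence edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def split_vibrato(log_f0, window):
    """Assumed decomposition: smooth trend plus residual oscillation."""
    trend = moving_average(log_f0, window)
    vibrato = [x - t for x, t in zip(log_f0, trend)]
    return trend, vibrato


# Synthetic contour: a steady 220 Hz note with a 5 Hz vibrato (hypothetical values).
frame_rate = 200          # frames per second
vibrato_hz = 5.0          # vibrato rate
depth = 0.03              # vibrato depth in log-F0 units
base = math.log(220.0)    # note-level log-F0
log_f0 = [
    base + depth * math.sin(2.0 * math.pi * vibrato_hz * n / frame_rate)
    for n in range(400)
]

# One vibrato period is 40 frames; use the nearest odd window length.
window = 41
trend, vibrato = split_vibrato(log_f0, window)
```

In this sketch the window length is chosen close to one vibrato period so that the oscillation largely averages out of the trend; a real system would have to pick or estimate it from data.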
Original language | English |
---|---|
Pages (from-to) | 378-382 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publication status | Published - 2013 |
Event | 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013, Lyon, France (2013 Aug 25 → 2013 Aug 29) |
Keywords
- HMM-based singing voice synthesis
- Multiple-regression HSMM
- Pitch adaptive training
- Style control
- Vibrato modeling