Abstract
This paper proposes SpSiVC, a voice conversion method that appropriately converts both speech and singing voices with a single model. Because the pitch distributions of speakers differ significantly between speech and singing voices, voice conversion has mainly been evaluated as two separate tasks: speech voice conversion and singing voice conversion. SpSiVC introduces an adaptive F0 loss that enables the model to implicitly switch the shift width of the log F0 according to the type of input voice. We examine the effectiveness of the F0 constraints through objective and subjective evaluations.
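To make the phrase "shift width of the log F0" concrete: adding a constant to log F0 is the same as multiplying F0 in Hz by a fixed ratio. The sketch below only illustrates that operation on a pitch contour. It is not the adaptive F0 loss of SpSiVC, whose definition is not given in this abstract. The function names, the convention that unvoiced frames carry an F0 of 0, and the example values are all assumptions made for illustration.

```python
import math


def mean_log_f0(f0_hz):
    """Mean of log F0 over voiced frames (assumed here to be frames with F0 > 0)."""
    voiced = [math.log(f) for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("contour has no voiced frames")
    return sum(voiced) / len(voiced)


def shift_log_f0(f0_hz, shift):
    """Add `shift` to log F0 on voiced frames; unvoiced frames (0) stay 0."""
    scale = math.exp(shift)  # a log-domain shift is a multiplicative scale in Hz
    return [f * scale if f > 0 else 0.0 for f in f0_hz]


def shift_to_target(f0_hz, target_mean_log_f0):
    """Shift a contour so that its mean log F0 matches a target value.

    Returns the shifted contour and the shift width that was applied.
    """
    shift = target_mean_log_f0 - mean_log_f0(f0_hz)
    return shift_log_f0(f0_hz, shift), shift


# Hypothetical contour in Hz (0 marks an unvoiced frame); not data from the paper.
source_f0 = [0.0, 110.0, 120.0, 0.0, 130.0]
converted_f0, applied_shift = shift_to_target(source_f0, math.log(220.0))
```

In this sketch the shift width is computed from a fixed target. The abstract instead describes a model that chooses the shift width implicitly depending on whether the input is speech or singing.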
Original language | English |
---|---|
Pages (from-to) | 120-123 |
Number of pages | 4 |
Journal | Acoustical Science and Technology |
Volume | 46 |
Issue number | 1 |
Publication status | Published - 2025 Jan |
Keywords
- CycleGAN
- Singing voice conversion (SVC)
- Unified model
- Voice conversion (VC)