TY - GEN
T1 - Voice conversion from arbitrary speakers based on deep neural networks with adversarial learning
AU - Miyamoto, Sou
AU - Nose, Takashi
AU - Ito, Suzunosuke
AU - Koike, Harunori
AU - Chiba, Yuya
AU - Ito, Akinori
AU - Shinozaki, Takahiro
N1 - Funding Information:
Part of this work was supported by JSPS KAKENHI Grant Numbers JP26280055 and JP15H02720.
Publisher Copyright:
© Springer International Publishing AG 2018.
PY - 2018
Y1 - 2018
N2 - In this study, we propose a technique for voice conversion from arbitrary speakers based on deep neural networks, realized by introducing adversarial learning into conventional voice conversion. Adversarial learning is expected to enable more natural voice conversion by using, in addition to a generative model, a discriminative model that classifies input speech as natural or converted. Experiments showed that the proposed method was effective in enhancing the global variance (GV) of the mel-cepstrum, but the naturalness of the converted speech was slightly lower than that of speech obtained with the conventional variance compensation technique.
AB - In this study, we propose a technique for voice conversion from arbitrary speakers based on deep neural networks, realized by introducing adversarial learning into conventional voice conversion. Adversarial learning is expected to enable more natural voice conversion by using, in addition to a generative model, a discriminative model that classifies input speech as natural or converted. Experiments showed that the proposed method was effective in enhancing the global variance (GV) of the mel-cepstrum, but the naturalness of the converted speech was slightly lower than that of speech obtained with the conventional variance compensation technique.
KW - Adversarial learning
KW - DNN-based voice conversion
KW - Model training
KW - Spectral differential filter
UR - http://www.scopus.com/inward/record.url?scp=85026657460&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85026657460&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-63859-1_13
DO - 10.1007/978-3-319-63859-1_13
M3 - Conference contribution
AN - SCOPUS:85026657460
SN - 9783319638584
T3 - Smart Innovation, Systems and Technologies
SP - 97
EP - 103
BT - Advances in Intelligent Information Hiding and Multimedia Signal Processing - Proceedings of the 13th International Conference on Intelligent Information Hiding and Multimedia Signal Processing,
A2 - Watada, Junzo
A2 - Jain, Lakhmi C.
A2 - Pan, Jeng-Shyang
A2 - Tsai, Pei-Wei
PB - Springer Science and Business Media Deutschland GmbH
T2 - 13th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2017
Y2 - 12 August 2017 through 15 August 2017
ER -