Fast end-to-end non-parallel voice conversion based on speaker-adaptive neural vocoder with cycle-consistent learning

Shuhei Imai, Aoi Kanagaki, Takashi Nose, Shogo Fukawa, Akinori Ito

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes a fast end-to-end non-parallel voice conversion (VC) method named Tachylone. In Tachylone, speaker conversion and waveform generation are performed by a single vocoder network. To train Tachylone, a pre-trained universal neural vocoder is used as the initial model, and its parameters are updated end-to-end with cycle-consistent learning on non-parallel data from the source and target speakers. We compare Tachylone with conventional CycleGAN-based VC using objective and subjective measures and discuss the results.
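The cycle-consistent learning mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss used in Tachylone, so everything below is an assumption: the L1 distance follows common CycleGAN practice, and the names `cycle_consistency_loss`, `l1_distance`, and `make_gain` are hypothetical. Scalar gains stand in for the two conversion directions of the vocoder network purely to keep the example runnable.

```python
from typing import Callable, List, Sequence

Waveform = List[float]
Converter = Callable[[Sequence[float]], Waveform]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    x_src: Sequence[float],
    x_tgt: Sequence[float],
    src_to_tgt: Converter,
    tgt_to_src: Converter,
) -> float:
    """Assumed CycleGAN-style loss: convert each utterance to the other
    speaker and back, then penalise the round-trip reconstruction error.
    No parallel (time-aligned) utterance pairs are needed."""
    src_cycle = tgt_to_src(src_to_tgt(x_src))  # source -> target -> source
    tgt_cycle = src_to_tgt(tgt_to_src(x_tgt))  # target -> source -> target
    return l1_distance(x_src, src_cycle) + l1_distance(x_tgt, tgt_cycle)


def make_gain(gain: float) -> Converter:
    """Toy converter (hypothetical): a scalar gain standing in for a network."""
    return lambda x: [gain * s for s in x]


if __name__ == "__main__":
    x_src = [1.0, -1.0]  # toy "source speaker" samples
    x_tgt = [0.5, 0.5]   # toy "target speaker" samples, unrelated content
    # Mutually inverse converters reconstruct both inputs exactly.
    print(cycle_consistency_loss(x_src, x_tgt, make_gain(2.0), make_gain(0.5)))
    # Converters that are not inverses leave a round-trip error.
    print(cycle_consistency_loss(x_src, x_tgt, make_gain(2.0), make_gain(1.0)))
```

In an actual system the two converters would be neural networks and this term would typically be combined with other losses, such as an adversarial loss; the abstract does not specify which terms Tachylone uses.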

Original language: English
Pages (from-to): 116-119
Number of pages: 4
Journal: Acoustical Science and Technology
Volume: 46
Issue number: 1
Publication status: Published - 2025 Jan

Keywords

  • Cycle-consistent learning
  • End-to-end VC
  • Neural vocoder
  • Non-parallel VC
  • Voice conversion (VC)
