Lyrics recognition from a singing voice based on finite state automaton for music information retrieval

Toru Hosoya, Motoyuki Suzuki, Akinori Ito, Shozo Makino

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

29 Citations (Scopus)

Abstract

Recently, several music information retrieval (MIR) systems have been developed that retrieve musical pieces from the user's singing voice. All of these systems use only melody information for retrieval. Although lyrics are also useful for retrieval, there have been few attempts to exploit the lyrics in the user's input. Developing an MIR system that uses both lyrics and melody requires lyrics recognition. Lyrics recognition from a singing voice can be achieved with technology similar to that used for speech recognition. The difference from general speech recognition is that the input lyrics are part of the lyrics of a song in the database. To exploit this linguistic constraint maximally, we described the recognition grammar using a finite state automaton (FSA) that accepts only lyrics in the database. In addition, we carried out "singing voice adaptation" using a speaker adaptation technique. In our experiments, a retrieval accuracy of about 86% was obtained.
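The grammar constraint described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: in their system the FSA serves as the grammar of a speech recognizer working on audio, whereas this toy version operates on already-transcribed words. The names `build_lyrics_fsa` and `accepting_songs` are hypothetical, and the sketch assumes the sung input is a contiguous word sequence that may start at any word of a song.

```python
from collections import defaultdict


def build_lyrics_fsa(songs):
    """Build a word-level automaton from a {song_id: lyrics} mapping.

    Each state is a (song_id, position) pair. The single arc leaving
    (song_id, i) is labelled with the i-th word of that song.
    """
    transitions = {}                  # state -> (word, next_state)
    entry_points = defaultdict(list)  # word -> states whose outgoing arc carries it
    for song_id, lyrics in songs.items():
        for i, word in enumerate(lyrics.lower().split()):
            transitions[(song_id, i)] = (word, (song_id, i + 1))
            entry_points[word].append((song_id, i))
    return transitions, entry_points


def accepting_songs(fsa, query):
    """Return the ids of songs whose lyrics contain the query word sequence."""
    transitions, entry_points = fsa
    words = query.lower().split()
    if not words:
        return set()
    # Assumption: the singer may start at any word, so every state whose
    # outgoing arc matches the first query word is a possible start.
    active = list(entry_points.get(words[0], []))
    for word in words:
        next_active = []
        for state in active:
            arc = transitions.get(state)  # None once a song's lyrics are exhausted
            if arc is not None and arc[0] == word:
                next_active.append(arc[1])
        active = next_active
        if not active:
            return set()  # no lyrics in the database match: input rejected
    return {song_id for song_id, _ in active}
```

A word sequence that does not occur in any stored lyrics is rejected outright, which is the sense in which such a grammar accepts only lyrics in the database.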

Original language: English
Title of host publication: ISMIR 2005 - 6th International Conference on Music Information Retrieval
Pages: 532-535
Number of pages: 4
Publication status: Published - 2005
Event: 6th International Conference on Music Information Retrieval, ISMIR 2005 - London, United Kingdom
Duration: 2005 Sept 11 to 2005 Sept 15

Publication series

Name: ISMIR 2005 - 6th International Conference on Music Information Retrieval

Conference

Conference: 6th International Conference on Music Information Retrieval, ISMIR 2005
Country/Territory: United Kingdom
City: London
Period: 05/9/11 to 05/9/15

Keywords

  • FSA
  • Lyrics recognition
  • MIR
