TY - JOUR
T1 - Automatic evaluation of singing enthusiasm for karaoke
AU - Daido, Ryunosuke
AU - Ito, Masashi
AU - Makino, Shozo
AU - Ito, Akinori
PY - 2014
Y1 - 2014/03//
AB - Evaluation of singing skill is a popular function of karaoke machines. Here, we introduce a different aspect of evaluating the singing voice of an amateur singer: "singing enthusiasm". First, we investigated whether human listeners can evaluate singing enthusiasm consistently and whether the listener's perception matches the singer's intended enthusiasm. We then identified three acoustic features relevant to the perception of singing enthusiasm: A-weighted power, "fall-down", and vibrato extent. Finally, we developed a method for combining the selected three features to estimate the value of singing enthusiasm, and obtained a correlation coefficient of 0.65 between the estimated value and human evaluation.
KW - Karaoke
KW - Perception of singing voice
KW - Singing enthusiasm
KW - Singing voice
UR - http://www.scopus.com/inward/record.url?scp=84890548933&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890548933&partnerID=8YFLogxK
U2 - 10.1016/j.csl.2012.07.007
DO - 10.1016/j.csl.2012.07.007
M3 - Article
AN - SCOPUS:84890548933
SN - 0885-2308
VL - 28
SP - 501
EP - 517
JO - Computer Speech and Language
JF - Computer Speech and Language
IS - 2
ER -