TY - GEN
T1 - Automatic evaluation system of English prosody for Japanese Learner's Speech
AU - Suzuki, Motoyuki
AU - Konno, Tatsuki
AU - Ito, Akinori
AU - Makino, Shozo
PY - 2007
Y1 - 2007
N2 - Prosody plays an important role in speech communication between humans. Several computer-assisted language learning (CALL) systems with utterance evaluation have been developed so far; however, the accuracy of their prosody evaluation is still poor. In this paper, we develop new methods to evaluate the rhythm and intonation of English sentences uttered by Japanese learners. The new points of our work are that (1) new prosodic features are added to traditional features, and (2) word importance factors are introduced in the calculation of the intonation score. The word importance score is automatically estimated using the ordinary least squares method, and optimized based on word clusters generated by a decision tree. The rhythm evaluator uses two acoustic features: the time duration ratio of each word and normalized log-power. In the experiments, the correlation coefficient (±1.0 denotes the best correlation) between the rhythm score given by native speakers and that given by the system was -0.55. On the other hand, a conventional feature (pause insertion error rate) gave only -0.11. The intonation evaluator uses four acoustic features: pitch, normalized log-power, and the first-order regression coefficients of those two features. In the experiments, the correlation coefficient between the intonation score given by native speakers and that given by the system was 0.45.
AB - Prosody plays an important role in speech communication between humans. Several computer-assisted language learning (CALL) systems with utterance evaluation have been developed so far; however, the accuracy of their prosody evaluation is still poor. In this paper, we develop new methods to evaluate the rhythm and intonation of English sentences uttered by Japanese learners. The new points of our work are that (1) new prosodic features are added to traditional features, and (2) word importance factors are introduced in the calculation of the intonation score. The word importance score is automatically estimated using the ordinary least squares method, and optimized based on word clusters generated by a decision tree. The rhythm evaluator uses two acoustic features: the time duration ratio of each word and normalized log-power. In the experiments, the correlation coefficient (±1.0 denotes the best correlation) between the rhythm score given by native speakers and that given by the system was -0.55. On the other hand, a conventional feature (pause insertion error rate) gave only -0.11. The intonation evaluator uses four acoustic features: pitch, normalized log-power, and the first-order regression coefficients of those two features. In the experiments, the correlation coefficient between the intonation score given by native speakers and that given by the system was 0.45.
KW - Computer assisted language learning system
KW - Decision tree
KW - Intonation
KW - Prosody evaluation
KW - Rhythm
UR - http://www.scopus.com/inward/record.url?scp=84896923315&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84896923315&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84896923315
SN - 1934272116
SN - 9781934272114
T3 - IMSCI 2007 - International Multi-Conference on Society, Cybernetics and Informatics, Proceedings
SP - 48
EP - 53
BT - IMSCI 2007 - International Multi-Conference on Society, Cybernetics and Informatics, Proceedings
PB - International Institute of Informatics and Systemics, IIIS
T2 - International Multi-Conference on Society, Cybernetics and Informatics, IMSCI 2007
Y2 - 12 July 2007 through 15 July 2007
ER -