Modeling user's state during dialog turn using HMM for multi-modal spoken dialog system

Yuya Chiba, Akinori Ito, Masashi Ito

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)

Abstract

Conventional spoken dialog systems cannot estimate the user's state while waiting for input, because the estimation process is triggered only by observing the user's utterance. This is a problem when, for some reason, the user cannot produce an utterance in response to the system's prompt. To help such users before they give up, the system should handle the requests that they express unconsciously. Based on this assumption, we have examined a method that uses the user's non-verbal behavior to estimate the user's state before an utterance is made. The present paper proposes an automatic discrimination method that uses time-sequential non-verbal information from the user. In this method, the user's internal state is estimated from multi-modal information such as speech, facial expression and gaze, modeled using a Hidden Markov Model (HMM).
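The abstract does not give the model details, so the following is only a minimal sketch of one common way to discriminate sequences with HMMs: train one HMM per user-state class, score an observed sequence under each model with the forward algorithm, and pick the class with the highest log-likelihood. Everything specific here is an assumption for illustration and not taken from the paper: the class labels `class_a` and `class_b`, the two hidden states per model, the three discrete observation symbols (standing in for quantized multi-modal features), and all probability values.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space for a discrete-observation HMM.

    obs   : non-empty list of observation symbol indices
    start : start[i]    = P(first hidden state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | hidden state i)
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Return (best_label, scores), scoring obs under each class HMM."""
    scores = {
        label: log_likelihood(obs, *params) for label, params in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical, hand-picked parameters: (start, trans, emit) per class.
MODELS = {
    "class_a": (
        [0.8, 0.2],
        [[0.9, 0.1], [0.2, 0.8]],
        [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]],
    ),
    "class_b": (
        [0.5, 0.5],
        [[0.6, 0.4], [0.4, 0.6]],
        [[0.1, 0.2, 0.7], [0.2, 0.6, 0.2]],
    ),
}

if __name__ == "__main__":
    label, scores = classify([0, 0, 1, 0], MODELS)
    print(label, scores)
```

In practice the per-class parameters would be learned from labeled recordings (for example with Baum-Welch), and continuous features would typically use Gaussian or mixture emissions rather than the discrete symbols assumed in this sketch.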

Original language: English
Title of host publication: ACHI 2014 - 7th International Conference on Advances in Computer-Human Interactions
Editors: Leslie Miller, Alma Leora Culen
Publisher: International Academy, Research and Industry Association, IARIA
Pages: 343-346
Number of pages: 4
ISBN (Electronic): 9781612083254
Publication status: Published - 2014
Event: 7th International Conference on Advances in Computer-Human Interactions, ACHI 2014 - Barcelona, Spain
Duration: 2014 Mar 23 to 2014 Mar 27

Publication series

Name: ACHI 2014 - 7th International Conference on Advances in Computer-Human Interactions

Conference

Conference: 7th International Conference on Advances in Computer-Human Interactions, ACHI 2014
Country/Territory: Spain
City: Barcelona
Period: 14/3/23 to 14/3/27

Keywords

  • Multi-modal information processing
  • Spoken dialog system
  • User's state
