TY - GEN
T1 - Comparison of speech recognition performance between Kaldi and Google Cloud Speech API
AU - Kimura, Takashi
AU - Nose, Takashi
AU - Hirooka, Shinji
AU - Chiba, Yuya
AU - Ito, Akinori
N1 - Funding Information:
Part of this work was supported by JSPS KAKENHI Grant Number JP17H00823.
Publisher Copyright:
© Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
AB - In recent years, the number of systems with a speech interface has grown. Such interfaces include a spoken dialogue function, and high performance is required of the spoken dialogue system. A spoken dialogue system includes a speech recognition module. In this study, we focus on the speech recognition module and aim to improve the spoken dialogue system by enhancing speech recognition performance. Among speech recognition systems, Kaldi is widely used in many kinds of research. In addition, several speech recognition services are provided as Web APIs, such as IBM Watson Speech to Text, Microsoft Bing Speech API, and Google Cloud Speech API, which is known for its high performance. This paper compares the speech recognition performance of Kaldi and Google Cloud Speech API in terms of word error rate (WER) and real-time factor (RTF) and confirms the recognition performance of each system.
KW - Google Cloud Speech API
KW - Kaldi
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85057108112&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057108112&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-03748-2_13
DO - 10.1007/978-3-030-03748-2_13
M3 - Conference contribution
AN - SCOPUS:85057108112
SN - 9783030037475
T3 - Smart Innovation, Systems and Technologies
SP - 109
EP - 115
BT - Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing - Proceeding of the Fourteenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing
A2 - Jain, Lakhmi C.
A2 - Tsai, Pei-Wei
A2 - Ito, Akinori
A2 - Pan, Jeng-Shyang
PB - Springer Science and Business Media Deutschland GmbH
T2 - 14th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2018
Y2 - 26 November 2018 through 28 November 2018
ER -