TY - GEN
T1 - Language models as an alternative evaluator of word order hypotheses
T2 - 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
AU - Kuribayashi, Tatsuki
AU - Ito, Takumi
AU - Suzuki, Jun
AU - Inui, Kentaro
N1 - Funding Information:
We would like to offer our gratitude to Kaori Uchiyama for taking the time to discuss our paper and Ana Brassard for her sharp feedback on English. We also would like to show our appreciation to the Tohoku NLP lab members for their valuable advice. We are particularly grateful to Ryohei Sasano for sharing the data for double objects order analyses. This work was supported by JST CREST Grant Number JPMJCR1513, JSPS KAKENHI Grant Number JP19H04162, and Grant-in-Aid for JSPS Fellows Grant Number JP20J22697.
Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
AB - We examine a methodology using neural language models (LMs) for analyzing the word order of language. This LM-based method has the potential to overcome the difficulties existing methods face, such as the propagation of preprocessor errors in count-based methods. In this study, we explore whether the LM-based method is valid for analyzing the word order. As a case study, this study focuses on Japanese due to its complex and flexible word order. To validate the LM-based method, we test (i) parallels between LMs and human word order preference, and (ii) consistency of the results obtained using the LM-based method with previous linguistic studies. Through our experiments, we tentatively conclude that LMs display sufficient word order knowledge for usage as an analysis tool. Finally, using the LM-based method, we demonstrate the relationship between the canonical word order and topicalization, which had yet to be analyzed by large-scale experiments.
UR - http://www.scopus.com/inward/record.url?scp=85117929922&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85117929922&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85117929922
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 488
EP - 504
BT - ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
Y2 - 5 July 2020 through 10 July 2020
ER -