A self-refinement strategy for noise reduction in grammatical error correction

Masato Mita, Shun Kiyono, Masahiro Kaneko, Jun Suzuki, Kentaro Inui

Research output: Conference contribution

3 citations (Scopus)

Abstract

Existing approaches for grammatical error correction (GEC) largely rely on supervised learning with manually created GEC datasets. However, there has been little focus on verifying and ensuring the quality of these datasets, or on how lower-quality data might affect GEC performance. We indeed found a non-negligible amount of “noise” where errors were inappropriately edited or left uncorrected. To address this, we designed a self-refinement method whose key idea is to denoise these datasets by leveraging the prediction consistency of existing models; this method outperformed strong denoising baselines. We further applied task-specific techniques and achieved state-of-the-art performance on the CoNLL-2014, JFLEG, and BEA-2019 benchmarks. Analyzing the effect of the proposed denoising method, we found that our approach leads to improved coverage of corrections and facilitates fluency edits, which is reflected in higher recall and overall performance.
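The core idea of consistency-based denoising described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact algorithm: the `correct` function is a hypothetical stand-in for an existing trained GEC model, and the keep-or-replace rule is an assumed simplification of how prediction consistency might be used to refine noisy targets.

```python
def correct(sentence):
    # Hypothetical stand-in for an existing trained GEC model's prediction.
    fixes = {
        "He go to school .": "He goes to school .",
        "She like apples .": "She likes apples .",
    }
    return fixes.get(sentence, sentence)

def denoise(pairs):
    """Refine a noisy GEC dataset of (source, target) pairs.

    If the model's prediction agrees with the annotated target, the pair
    is considered consistent and kept as-is; otherwise the target is
    replaced with the model's own prediction (self-refinement). This
    keep-or-replace rule is an assumption for illustration.
    """
    refined = []
    for src, tgt in pairs:
        pred = correct(src)
        refined.append((src, tgt if pred == tgt else pred))
    return refined

noisy = [
    ("He go to school .", "He goes to school ."),  # consistent, kept
    ("She like apples .", "She like apples ."),    # left uncorrected: noise
]
print(denoise(noisy))
# → [('He go to school .', 'He goes to school .'),
#    ('She like apples .', 'She likes apples .')]
```

The second pair illustrates the "left uncorrected" noise the abstract mentions: its target still contains the error, so the model's consistent correction replaces it.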

Language: English
Host publication title: Findings of the Association for Computational Linguistics, Findings of ACL
Host publication subtitle: EMNLP 2020
Publisher: Association for Computational Linguistics (ACL)
Pages: 267-280
Number of pages: 14
ISBN (electronic): 9781952148903
Publication status: Published - 2020
Event: Findings of the Association for Computational Linguistics, ACL 2020: EMNLP 2020 - Virtual, Online
Duration: 16 Nov 2020 - 20 Nov 2020

Publication series

Name: Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2020

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
