Abstract
In this paper, we discuss how to annotate coreference and predicate-argument relations in Japanese written text. Previous efforts have built Japanese text corpora annotated with coreference and predicate-argument relations, notably the Kyoto Text Corpus version 4.0 (Kawahara et al., 2002) and the GDA-Tagged Corpus (Hasida, 2005). However, there is still much room for refining their specifications. We therefore discuss issues in annotating these two types of relations and propose a new specification for each. Following these specifications, we built a large-scale annotated corpus and examined its reliability. As a result of this work, we have released an annotated corpus named the NAIST Text Corpus, which is used as the evaluation data set in the coreference and zero-anaphora resolution tasks in Iida et al. (2005) and Iida et al. (2006).
Original language | English |
---|---|
Pages | 132-139 |
Number of pages | 8 |
Publication status | Published - 2007 |
Event | Linguistic Annotation Workshop, LAW 2007, Prague, Czech Republic (2007 Jun 28 → 2007 Jun 29) |
Conference
Conference | Linguistic Annotation Workshop, LAW 2007 |
---|---|
Country/Territory | Czech Republic |
City | Prague |
Period | 2007 Jun 28 → 2007 Jun 29 |