TY - GEN
T1 - Statement map
T2 - 4th Workshop on Analytics for Noisy Unstructured Text Data, AND'10 Co-located with 19th International Conference on Information and Knowledge Management, CIKM'10
AU - Murakami, Koji
AU - Nichols, Eric
AU - Mizuno, Junta
AU - Watanabe, Yotaro
AU - Masuda, Shouko
AU - Goto, Hayato
AU - Ohki, Megumi
AU - Sao, Chitose
AU - Matsuyoshi, Suguru
AU - Inui, Kentaro
AU - Matsumoto, Yuji
PY - 2010
Y1 - 2010
N2 - On the Internet, users often encounter noise in the form of spelling errors or unknown words; however, dishonest, unreliable, or biased information also acts as noise that makes it difficult to find credible sources of information. As people come to rely on the Internet for more and more information, reducing this credibility noise grows ever more urgent. The Statement Map project's goal is to help Internet users evaluate the credibility of information sources by mining the Web for a variety of viewpoints on their topics of interest and presenting them to users together with supporting evidence in a way that makes it clear how they are related. In this paper, we show how a Statement Map system can be constructed by combining Information Retrieval (IR) and Natural Language Processing (NLP) technologies, focusing on the task of organizing statements retrieved from the Web by viewpoints. We frame this as a semantic relation classification task, and identify 4 semantic relations: [Agreement], [Conflict], [Confinement], and [Evidence]. The former two relations are identified by measuring semantic similarity through sentence alignment, while the latter two are identified through sentence-internal discourse processing. As a prelude to end-to-end user evaluation of Statement Map, we present a large-scale evaluation of semantic relation classification between user queries and Internet texts in Japanese and conduct detailed error analysis to identify the remaining areas of improvement.
AB - On the Internet, users often encounter noise in the form of spelling errors or unknown words; however, dishonest, unreliable, or biased information also acts as noise that makes it difficult to find credible sources of information. As people come to rely on the Internet for more and more information, reducing this credibility noise grows ever more urgent. The Statement Map project's goal is to help Internet users evaluate the credibility of information sources by mining the Web for a variety of viewpoints on their topics of interest and presenting them to users together with supporting evidence in a way that makes it clear how they are related. In this paper, we show how a Statement Map system can be constructed by combining Information Retrieval (IR) and Natural Language Processing (NLP) technologies, focusing on the task of organizing statements retrieved from the Web by viewpoints. We frame this as a semantic relation classification task, and identify 4 semantic relations: [Agreement], [Conflict], [Confinement], and [Evidence]. The former two relations are identified by measuring semantic similarity through sentence alignment, while the latter two are identified through sentence-internal discourse processing. As a prelude to end-to-end user evaluation of Statement Map, we present a large-scale evaluation of semantic relation classification between user queries and Internet texts in Japanese and conduct detailed error analysis to identify the remaining areas of improvement.
KW - Credibility analysis
KW - Discourse processing
KW - Opinion classification
KW - Semantic relation classification
KW - Statement map
KW - Structural alignment
UR - http://www.scopus.com/inward/record.url?scp=78651271667&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78651271667&partnerID=8YFLogxK
U2 - 10.1145/1871840.1871850
DO - 10.1145/1871840.1871850
M3 - Conference contribution
AN - SCOPUS:78651271667
SN - 9781450303767
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 59
EP - 66
BT - AND'10 - Proceedings of the 4th Workshop on Analytics for Noisy Unstructured Text Data, Co-located with 19th International Conference on Information and Knowledge Management, CIKM'10
Y2 - 26 October 2010 through 30 October 2010
ER -