Kernels for structured natural language data

Jun Suzuki, Yutaka Sasaki, Eisaku Maeda

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

4 Citations (Scopus)

Abstract

This paper devises a novel kernel function for structured natural language data. In the field of Natural Language Processing, feature extraction consists of the following two steps: (1) syntactically and semantically analyzing raw data, i.e., character strings, then representing the results as discrete structures, such as parse trees and dependency graphs with part-of-speech tags; (2) creating (possibly high-dimensional) numerical feature vectors from the discrete structures. The new kernels, called Hierarchical Directed Acyclic Graph (HDAG) kernels, directly accept DAGs whose nodes can contain DAGs. HDAG data structures are needed to fully reflect the syntactic and semantic structures that natural language data inherently have. In this paper, we define the kernel function and show how it permits efficient calculation. Experiments demonstrate that the proposed kernels are superior to existing kernel functions, e.g., sequence kernels, tree kernels, and bag-of-words kernels.

Original languageEnglish
Title of host publicationAdvances in Neural Information Processing Systems 16 - Proceedings of the 2003 Conference, NIPS 2003
PublisherNeural information processing systems foundation
ISBN (Print)0262201526, 9780262201520
Publication statusPublished - 2004
Event17th Annual Conference on Neural Information Processing Systems, NIPS 2003 - Vancouver, BC, Canada
Duration: 2003 Dec 82003 Dec 13

Publication series

NameAdvances in Neural Information Processing Systems
ISSN (Print)1049-5258

Conference

Conference17th Annual Conference on Neural Information Processing Systems, NIPS 2003
Country/TerritoryCanada
CityVancouver, BC
Period03/12/803/12/13

Fingerprint

Dive into the research topics of 'Kernels for structured natural language data'. Together they form a unique fingerprint.

Cite this