TY - GEN
T1 - Sequence and tree kernels with statistical feature mining
AU - Suzuki, Jun
AU - Isozaki, Hideki
PY - 2005
Y1 - 2005
AB - This paper proposes a new approach to feature selection based on a statistical feature mining technique for sequence and tree kernels. Since natural language data take discrete structures, convolution kernels, such as sequence and tree kernels, are advantageous in both concept and accuracy for many natural language processing tasks. However, experiments have shown that the best results are achieved only when these kernels are restricted to small sub-structures. This paper discusses this issue of convolution kernels and then proposes a statistical feature selection method that enables us to use larger sub-structures effectively. For efficient execution, the proposed method can be embedded into the original kernel calculation process by using sub-structure mining algorithms. Experiments on real NLP tasks confirm the problem with the conventional method and compare its performance with that of the proposed method.
UR - http://www.scopus.com/inward/record.url?scp=70049098154&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70049098154&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:70049098154
SN - 9780262232531
T3 - Advances in Neural Information Processing Systems
SP - 1321
EP - 1328
BT - Advances in Neural Information Processing Systems 18 - Proceedings of the 2005 Conference
T2 - 2005 Annual Conference on Neural Information Processing Systems, NIPS 2005
Y2 - 5 December 2005 through 8 December 2005
ER -