2003 ComparingConvolKernelsAndRecursiveNNs


Subject Headings: Convolution Kernel Function, Recursive Neural Network

Notes

Cited By

Quotes

Abstract

Convolution kernels and recursive neural networks (RNNs) are both suitable approaches for supervised learning when the input portion of an instance is a discrete structure such as a tree or a graph. We report on an empirical comparison between the two architectures in a large-scale preference learning problem related to natural language processing, in which instances are candidate incremental parse trees. We found that kernels never outperform RNNs, even when only a limited number of examples is used for learning. We argue that convolution kernels may lead to feature space representations that are too sparse and too general, because they are not focused on the specific learning task. In this case, the adaptive encoding mechanism of RNNs allows us to obtain better prediction accuracy at a smaller computational cost.
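
A standard example of a convolution kernel over parse trees is the subtree kernel of Collins and Duffy, which implicitly maps a tree to a very high-dimensional, sparse vector counting all of its subtree fragments; this is the kind of fixed, task-independent feature space the abstract argues can be too sparse and too general. The sketch below is only an illustration of that kernel, not code from the paper; the Node class, the decay parameter lam, and the toy trees are assumptions made for this example.

from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    """A parse-tree node: a label plus a (possibly empty) tuple of children."""
    label: str
    children: tuple = ()

    def production(self):
        # The grammar rule rooted at this node, e.g. ('NP', ('DT', 'NN')).
        return (self.label, tuple(c.label for c in self.children))

def nodes(t):
    """All nodes of tree t, root first."""
    yield t
    for c in t.children:
        yield from nodes(c)

def tree_kernel(t1, t2, lam=0.5):
    """Collins-Duffy style subtree kernel: a weighted count of the subtree
    fragments shared by t1 and t2; lam in (0, 1] down-weights large fragments."""
    memo = {}

    def common(n1, n2):
        # Weighted number of common fragments rooted at the node pair (n1, n2).
        key = (id(n1), id(n2))
        if key not in memo:
            if not n1.children or not n2.children:
                memo[key] = 0.0   # terminal words root no fragments
            elif n1.production() != n2.production():
                memo[key] = 0.0   # different rules, no common fragment here
            elif all(not c.children for c in n1.children):
                memo[key] = lam   # matching pre-terminal rule, e.g. DT -> the
            else:
                score = lam
                for c1, c2 in zip(n1.children, n2.children):
                    score *= 1.0 + common(c1, c2)
                memo[key] = score
        return memo[key]

    return sum(common(n1, n2) for n1 in nodes(t1) for n2 in nodes(t2))

if __name__ == "__main__":
    # Toy parse trees for the noun phrases "the dog" and "the cat".
    dog = Node("NP", (Node("DT", (Node("the"),)), Node("NN", (Node("dog"),))))
    cat = Node("NP", (Node("DT", (Node("the"),)), Node("NN", (Node("cat"),))))
    print(tree_kernel(dog, dog))  # 2.125 with lam=0.5: all six fragments match themselves
    print(tree_kernel(dog, cat))  # 1.25: only [DT the], [NP DT NN], [NP [DT the] NN] are shared

By contrast, the recursive neural network compared in the paper does not fix this fragment space in advance: it learns a distributed encoding of each tree bottom-up, so the representation is adapted to the preference-learning task.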




Author(s): Massimiliano Pontil, Sauro Menchetti, Fabrizio Costa, Paolo Frasconi
Title: Comparing Convolution Kernels and Recursive Neural Networks for Learning Preferences on Structured Data
Title URL: http://www.dsi.unifi.it/~menchett/papers/annpr2003.pdf
Year: 2003