2011 LearningRecurrentNeuralNetworks
- (Martens & Sutskever, 2011) ⇒ James Martens, and Ilya Sutskever. (2011). “Learning Recurrent Neural Networks with Hessian-Free Optimization.” In: Proceedings of the 28th International Conference on Machine Learning (ICML 2011).
Subject Headings: Recurrent Neural Network, Hessian-Free Optimization, Sequence Modeling Task, Martens Hessian-Free Optimization, Neural Network Sequence Model, Generalized Gauss-Newton Matrix.
Notes
Cited By
- Google Scholar: 660 Citations.
Quotes
Abstract
In this work we resolve the long-outstanding problem of how to effectively train recurrent neural networks (RNNs) on complex and difficult sequence modeling problems which may contain long-term data dependencies. Utilizing recent advances in the Hessian-free optimization approach (Martens, 2010), together with a novel damping scheme, we successfully train RNNs on two sets of challenging problems. First, a collection of pathological synthetic datasets which are known to be impossible for standard optimization approaches (due to their extremely long-term dependencies), and second, on three natural and highly complex real-world sequence datasets where we find that our method significantly outperforms the previous state-of-the-art method for training neural sequence models: the Long Short-term Memory approach of Hochreiter and Schmidhuber (1997). Additionally, we offer a new interpretation of the generalized Gauss-Newton matrix of Schraudolph (2002) which is used within the HF approach of Martens.
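Editorial illustration (not part of the quoted abstract): the abstract refers to Hessian-free optimization with a generalized Gauss-Newton matrix and a damping scheme. The sketch below shows only the general idea, under loud assumptions: a hypothetical toy model (one tanh unit per input row, squared-error loss), plain Tikhonov damping with a fixed `lam` as a stand-in rather than the novel damping scheme of the paper, and no RNN, line search, or damping adaptation. All function names are invented for this example. The key point it illustrates is that the curvature matrix is never formed; conjugate gradient only needs matrix-vector products.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def outputs(w, xs):
    # Hypothetical toy model: y_i = tanh(w . x_i) for each input row x_i.
    return [math.tanh(dot(w, x)) for x in xs]

def gradient(w, xs, ts):
    # Gradient of the squared-error loss 0.5 * sum_i (y_i - t_i)^2 w.r.t. w.
    g = [0.0] * len(w)
    for x, y, t in zip(xs, outputs(w, xs), ts):
        scale = (y - t) * (1.0 - y * y)  # residual times tanh derivative
        for j, xj in enumerate(x):
            g[j] += scale * xj
    return g

def damped_gn_vec(w, xs, v, lam):
    # Returns (J^T J + lam * I) v without building the matrix, where J is the
    # Jacobian of the outputs w.r.t. w. For squared error, the loss Hessian
    # w.r.t. the outputs is the identity, so J^T J is the Gauss-Newton matrix.
    out = [lam * vj for vj in v]  # Tikhonov damping term (an assumption here)
    for x, y in zip(xs, outputs(w, xs)):
        d = 1.0 - y * y          # tanh derivative for this row
        jv = d * dot(x, v)       # one entry of J v
        for j, xj in enumerate(x):
            out[j] += d * xj * jv  # accumulate J^T (J v)
    return out

def conjugate_gradient(matvec, b, max_iters=50, tol=1e-20):
    # Standard linear CG for a symmetric positive-definite operator.
    x = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rs = dot(r, r)
    for _ in range(max_iters):
        if rs <= tol:
            break
        ap = matvec(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def hf_step(w, xs, ts, lam=1.0):
    # One update: approximately solve (G + lam * I) delta = -gradient with CG.
    g = gradient(w, xs, ts)
    delta = conjugate_gradient(
        lambda v: damped_gn_vec(w, xs, v, lam), [-gi for gi in g]
    )
    return [wi + di for wi, di in zip(w, delta)]
```

A call such as `hf_step([0.1, -0.1], [[1.0, 0.5], [-0.3, 2.0]], [0.5, -0.2])` returns an updated weight list. In a practical setting the matrix-vector product would be computed by automatic differentiation rather than by the hand-written loop used here.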
References
BibTeX
@inproceedings{2011_LearningRecurrentNeuralNetworks,
  author    = {James Martens and Ilya Sutskever},
  editor    = {Lise Getoor and Tobias Scheffer},
  title     = {Learning Recurrent Neural Networks with Hessian-Free Optimization},
  booktitle = {Proceedings of the 28th International Conference on Machine Learning (ICML 2011)},
  pages     = {1033--1040},
  publisher = {Omnipress},
  year      = {2011},
  url       = {https://icml.cc/2011/papers/532\_icmlpaper.pdf}
}
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2011 LearningRecurrentNeuralNetworks | Ilya Sutskever, James Martens | | | Learning Recurrent Neural Networks with Hessian-Free Optimization | | | | | | 2011 |