2011 LearningRecurrentNeuralNetworks
- (Martens & Sutskever, 2011) ⇒ James Martens, and Ilya Sutskever. (2011). “Learning Recurrent Neural Networks with Hessian-Free Optimization.” In: Proceedings of the 28th International Conference on Machine Learning (ICML 2011).
Subject Headings: Recurrent Neural Network, Hessian-Free Optimization, Sequence Modeling Task, Martens Hessian-Free Optimization, Neural Network Sequence Model, Generalized Gauss-Newton Matrix.
Notes
Cited By
- Google Scholar: 660 Citations.
Quotes
Abstract
In this work we resolve the long-outstanding problem of how to effectively train recurrent neural networks (RNNs) on complex and difficult sequence modeling problems which may contain long-term data dependencies. Utilizing recent advances in the Hessian-free optimization approach (Martens, 2010), together with a novel damping scheme, we successfully train RNNs on two sets of challenging problems. First, a collection of pathological synthetic datasets which are known to be impossible for standard optimization approaches (due to their extremely long-term dependencies), and second, on three natural and highly complex real-world sequence datasets where we find that our method significantly outperforms the previous state-of-the-art method for training neural sequence models: the Long Short-term Memory approach of Hochreiter and Schmidhuber (1997). Additionally, we offer a new interpretation of the generalized Gauss-Newton matrix of Schraudolph (2002) which is used within the HF approach of Martens.
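As an illustrative aside (not taken from the paper), the core idea behind the Hessian-free approach named in the abstract can be sketched briefly: the update direction comes from running conjugate gradient on a damped Gauss-Newton system, using only matrix-vector products so that the curvature matrix is never formed explicitly. The sketch below assumes a toy linear least-squares model, for which the Gauss-Newton matrix is J^T J. The function names, the toy data, and the fixed Tikhonov-style damping constant `lam` are assumptions made for illustration; they do not reproduce the paper's damping scheme or its RNN setting.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(J, v):
    """Return J v for a matrix J stored as a list of rows."""
    return [dot(row, v) for row in J]


def rmatvec(J, u):
    """Return J^T u without building the transpose."""
    return [sum(J[i][j] * u[i] for i in range(len(J))) for j in range(len(J[0]))]


def damped_gn_product(J, v, lam):
    """Compute (J^T J + lam * I) v from products only; J^T J is never formed."""
    Gv = rmatvec(J, matvec(J, v))
    return [g + lam * x for g, x in zip(Gv, v)]


def conjugate_gradient(apply_A, b, iters=50, tol=1e-12):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # first search direction
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:     # squared residual norm small enough: stop
            break
        Ap = apply_A(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Toy problem (hypothetical data): minimise 0.5 * ||J w - y||^2 at w = 0.
J = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w = [0.0, 0.0]
lam = 0.1  # assumed fixed damping strength

residual = [p - t for p, t in zip(matvec(J, w), y)]
grad = rmatvec(J, residual)                      # gradient J^T (J w - y)
step = conjugate_gradient(
    lambda v: damped_gn_product(J, v, lam),      # curvature via products only
    [-g for g in grad],                          # right-hand side is -gradient
)
w_new = [wi + si for wi, si in zip(w, step)]
```

The design choice worth noting is that `conjugate_gradient` only ever calls `apply_A`, which is why the same loop could in principle be reused with a curvature-vector product computed by automatic differentiation instead of an explicit Jacobian.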
References
BibTeX
@inproceedings{2011_LearningRecurrentNeuralNetworks,
author = {James Martens and
Ilya Sutskever},
editor = {Lise Getoor and
Tobias Scheffer},
title = {Learning Recurrent Neural Networks with Hessian-Free Optimization},
booktitle = {Proceedings of the 28th International Conference on Machine Learning
(ICML 2011)},
pages = {1033--1040},
publisher = {Omnipress},
year = {2011},
url = {https://icml.cc/2011/papers/532\_icmlpaper.pdf},
}
| | Author | title | year |
|---|---|---|---|
| 2011 LearningRecurrentNeuralNetworks | Ilya Sutskever, James Martens | Learning Recurrent Neural Networks with Hessian-Free Optimization | 2011 |