2011 GeneratingTextwithRecurrentNeur

Subject Headings: Recurrent Neural Network Language Model; Sequence Language Model; Character-level Language Model; Hessian-Free (HF) Optimizer; Hierarchical Nonparametric Sequence Model.

Notes

Cited By

Quotes

Abstract

Recurrent Neural Networks (RNNs) are very powerful sequence models that do not enjoy widespread use because it is extremely difficult to train them properly. Fortunately, recent advances in Hessian-free optimization have been able to overcome the difficulties associated with training RNNs, making it possible to apply them successfully to challenging sequence problems. In this paper we demonstrate the power of RNNs trained with the new Hessian-Free optimizer (HF) by applying them to character-level language modeling tasks. The standard RNN architecture, while effective, is not ideally suited for such tasks, so we introduce a new RNN variant that uses multiplicative (or "gated") connections which allow the current input character to determine the transition matrix from one hidden state vector to the next. After training the multiplicative RNN with the HF optimizer for five days on 8 high-end Graphics Processing Units, we were able to surpass the performance of the best previous single method for character-level language modeling – a hierarchical nonparametric sequence model. To our knowledge this represents the largest recurrent neural network application to date.
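The multiplicative ("gated") connections described in the abstract can be made concrete: each input character selects its own hidden-to-hidden transition matrix, which the paper factors as W_hf · diag(W_fx x_t) · W_fh rather than storing a separate matrix per character. Below is a minimal NumPy sketch of one such multiplicative RNN step; the variable names, dimensions, and initialization scale are illustrative assumptions for this sketch, not the paper's exact configuration.

import numpy as np

# One multiplicative RNN (MRNN) step. Names, sizes, and the 0.01
# initialization scale are illustrative assumptions, not the paper's setup.
rng = np.random.default_rng(0)
V, H, F = 86, 512, 512               # characters, hidden units, factors
W_fx = rng.normal(0, 0.01, (F, V))   # input -> factor gates
W_fh = rng.normal(0, 0.01, (F, H))   # hidden -> factors
W_hf = rng.normal(0, 0.01, (H, F))   # factors -> hidden
W_hx = rng.normal(0, 0.01, (H, V))   # input -> hidden
W_oh = rng.normal(0, 0.01, (V, H))   # hidden -> next-character logits
b_h, b_o = np.zeros(H), np.zeros(V)

def mrnn_step(x_onehot, h_prev):
    # The current character gates the factors, so the effective
    # hidden-to-hidden matrix is W_hf @ diag(W_fx @ x) @ W_fh.
    f = (W_fx @ x_onehot) * (W_fh @ h_prev)
    h = np.tanh(W_hf @ f + W_hx @ x_onehot + b_h)
    return h, W_oh @ h + b_o

# Usage: feed a one-hot character, get the new state and logits.
x = np.zeros(V); x[11] = 1.0         # an arbitrary character index
h, logits = mrnn_step(x, np.zeros(H))

Running such steps over a character sequence and training the weights (in the paper, with the HF optimizer) yields the character-level language model the abstract describes; the factorization keeps the parameter count linear in the number of factors instead of requiring a full H×H transition matrix for every character.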

References

BibTeX

@inproceedings{2011_GeneratingTextwithRecurrentNeur,
  author    = {Ilya Sutskever and
               James Martens and
               Geoffrey E. Hinton},
  editor    = {Lise Getoor and
               Tobias Scheffer},
  title     = {Generating Text with Recurrent Neural Networks},
  booktitle = {Proceedings of the 28th International Conference on Machine Learning
               (ICML 2011)},
  pages     = {1017--1024},
  publisher = {Omnipress},
  year      = {2011},
  url       = {https://icml.cc/2011/papers/524_icmlpaper.pdf},
}


Author: Geoffrey E. Hinton, Ilya Sutskever, James Martens
Title: Generating Text with Recurrent Neural Networks
Year: 2011