LSTM-based Encoder-Decoder Network: Difference between revisions
Jump to navigation
Jump to search
(Created page with "An LSTM-based Encoder-Decoder Network is an RNN-based encoder-decoder model composed of LSTM models (an LSTM encoder and an LSTM decoder). * <B>Context:</B...") |
No edit summary |
||
Line 3: | Line 3: | ||
** It can be trained by a [[LSTM-based Encoder/Decoder RNN Training System]]. | ** It can be trained by a [[LSTM-based Encoder/Decoder RNN Training System]]. | ||
* <B>Example(s):</B> | * <B>Example(s):</B> | ||
** | ** an [[LSTM-based Encoder-Decoder Machine Translation Model]]. | ||
** | ** an [[LSTM-based Encoder-Decoder Text Error Correction Model]], such as an [[LSTM-based Encoder-Decoder WikiText Error Correction Model]]. | ||
** an [[LSTM+Attention-based Encoder-Decoder Model]]. | ** an [[LSTM+Attention-based Encoder-Decoder Model]]. | ||
** ... | ** ... |
Revision as of 04:52, 23 September 2020
An LSTM-based Encoder-Decoder Network is an RNN-based encoder-decoder model composed of LSTM models (an LSTM encoder and an LSTM decoder).
- Context:
- It can be trained by a LSTM-based Encoder/Decoder RNN Training System.
- Example(s):
- Counter-Example(s):
- See: Neural seq2seq, Bidirectional LSTM.
References
2018
- (Brownlee, 2018) ⇒ Jason Brownlee. (2018). “Encoder-Decoder Recurrent Neural Network Models for Neural Machine Translation." Blog Post
- QUOTE: After reading this post, you will know:
- The encoder-decoder recurrent neural network architecture is the core technology inside Google’s translate service.
- The so-called “Sutskever model” for direct end-to-end machine translation.
- The so-called “Cho model” that extends the architecture with GRU units and an attention mechanism.
- QUOTE: After reading this post, you will know:
2017
- (Robertson, 2017) ⇒ Sean Robertson. (2017). “Translation with a Sequence to Sequence Network and Attention.” In: TensorFlow Tutorials
- QUOTE: A basic sequence-to-sequence model, as introduced in Cho et al., 2014 , consists of two recurrent neural networks (RNNs): an encoder that processes the input and a decoder that generates the output. This basic architecture is depicted below.
Each box in the picture above represents a cell of the RNN, most commonly a GRU cell or an LSTM cell (see the RNN Tutorial for an explanation of those). Encoder and decoder can share weights or, as is more common, use a different set of parameters. Multi-layer cells have been successfully used in sequence-to-sequence models too, e.g. for translation Sutskever et al., 2014 .
- QUOTE: A basic sequence-to-sequence model, as introduced in Cho et al., 2014 , consists of two recurrent neural networks (RNNs): an encoder that processes the input and a decoder that generates the output. This basic architecture is depicted below.
2014a
- (Sutskever et al., 2014) ⇒ Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. (2014). “Sequence to Sequence Learning with Neural Networks.” In: Advances in Neural Information Processing Systems.