Long Short-Term Memory (LSTM) Unit

From GM-RKB
Jump to navigation Jump to search

A Long Short-Term Memory (LSTM) Unit is a recurrent neural network unit composed of an memory cell, an input gate, a forget gate, and an output gate.



References

2018a

  • (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/long_short-term_memory Retrieved:2018-3-27.
    • Long short-term memory (LSTM) units (or blocks) are a building unit for layers of a recurrent neural network (RNN). A RNN composed of LSTM units is often called an LSTM network. A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell is responsible for "remembering" values over arbitrary time intervals; hence the word "memory" in LSTM. Each of the three gates can be thought of as a "conventional" artificial neuron, as in a multi-layer (or feedforward) neural network: that is, they compute an activation (using an activation function) of a weighted sum. Intuitively, they can be thought as regulators of the flow of values that goes through the connections of the LSTM; hence the denotation "gate". There are connections between these gates and the cell.

      The expression long short-term refers to the fact that LSTM is a model for the short-term memory which can last for a long period of time. An LSTM is well-suited to classify, process and predict time series given time lags of unknown size and duration between important events. LSTMs were developed to deal with the exploding and vanishing gradient problem when training traditional RNNs. Relative insensitivity to gap length gives an advantage to LSTM over alternative RNNs, hidden Markov models and other sequence learning methods in numerous applications.

2018b

LSTM cell diagram

2018c


2017

2016

2015a

2015b

2014

Figure 1: Illustration of (a) LSTM and (b) gated recurrent units. (a) [math]\displaystyle{ i }[/math], [math]\displaystyle{ f }[/math] and [math]\displaystyle{ o }[/math] are the input, forget and output gates, respectively. [math]\displaystyle{ c }[/math] and [math]\displaystyle{ \tilde{c} }[/math] denote the memory cell and the new memory cell content. (b) [math]\displaystyle{ r }[/math] and [math]\displaystyle{ z }[/math] are the reset and update gates, and [math]\displaystyle{ h }[/math] and [math]\displaystyle{ \tilde{h} }[/math] are the activation and the candidate activation.

2013

2005

2002

Figure 1: LSTM memory block with one cell (rectangle). The so-called CEC maintains the cell state [math]\displaystyle{ s_c }[/math],which may be reset by the forget gate. Input and output gate control read and write access to the CEC; [math]\displaystyle{ g }[/math] squashes the cell input.

1997

Figure 1: Architecture of memory cell [math]\displaystyle{ c_j }[/math] (the box) and its gate units [math]\displaystyle{ in_j }[/math], [math]\displaystyle{ out_j }[/math]. The self-recurrent connection (with weight 1.0) indicates feedback with a delay of 1 time step. It builds the basis of the “constant error carrouselCEC. The gate units open and close access to CEC. See text and appendix A.1 for details.