Content-Based Attention Network

From GM-RKB

A Content-Based Attention Network is an Artificial Neural Network that includes an attention mechanism which scores the content of the encoder hidden states and produces a context vector.



References


2018

[math]\displaystyle{ c_{i}=\sum_{j=1}^{T} \alpha_{i, j} h_{j} }[/math] (6)

 :: The weight $\alpha_{i,j}$ of each $h_j$ is computed by

[math]\displaystyle{ \alpha_{i, j}=\exp \left(e_{i, j}\right) / \sum_{k=1}^{T} \exp \left(e_{i, k}\right) }[/math] (7)

 :: where

[math]\displaystyle{ e_{i, j}=\operatorname{Score}\left(s_{i-1}, h_{j}\right) }[/math] (8)

 :: Here Score is an MLP network which measures how well the inputs around position $j$ and the output at position $i$ match. It is based on the decoder's LSTM hidden state $s_{i-1}$ and the annotation $h_j$ of the input sentence. Specifically, it can be further described by

[math]\displaystyle{ e_{i, j}=\mathbf{w}^{\top} \tanh \left(\mathbf{W} \mathbf{s}_{i-1}+\mathbf{V} \mathbf{h}_{j}+\mathbf{b}\right) }[/math] (9)

 :: where $\mathbf{w}$ and $\mathbf{b}$ are vectors, and $\mathbf{W}$ and $\mathbf{V}$ are matrices.
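The computation in the equations above (the MLP score, the softmax weights, and the context vector) can be sketched in NumPy as follows. The dimensions, the random parameters, and the function name `additive_attention` are illustrative assumptions, not taken from the source; in a trained model the matrices $W$, $V$ and the vectors $w$, $b$ would be learned.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: eq. (7)
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(s_prev, H, W, V, w, b):
    """Content-based (additive) attention sketch.

    s_prev : (d_s,)   previous decoder state s_{i-1}
    H      : (T, d_h) encoder hidden states h_1 .. h_T
    W      : (d_a, d_s), V : (d_a, d_h), w : (d_a,), b : (d_a,)
    Returns the context vector c_i and the weights alpha_{i,j}.
    """
    # Score: e_{i,j} = w^T tanh(W s_{i-1} + V h_j + b)   -- eq. (9)
    e = np.array([w @ np.tanh(W @ s_prev + V @ h + b) for h in H])
    alpha = softmax(e)   # eq. (7): normalize scores over j
    c = alpha @ H        # eq. (6): c_i = sum_j alpha_{i,j} h_j
    return c, alpha

# Toy example with random parameters (illustrative only)
rng = np.random.default_rng(0)
T, d_s, d_h, d_a = 5, 4, 6, 3
s_prev = rng.standard_normal(d_s)
H = rng.standard_normal((T, d_h))
W = rng.standard_normal((d_a, d_s))
V = rng.standard_normal((d_a, d_h))
w = rng.standard_normal(d_a)
b = rng.standard_normal(d_a)
c, alpha = additive_attention(s_prev, H, W, V, w, b)
```

Note that the weights `alpha` are non-negative and sum to one, so the context vector is a convex combination of the encoder states.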
