Neural Network with Self-Attention Mechanism

A Neural Network with Self-Attention Mechanism is a Neural Network with Attention Mechanism that includes a self-attention mechanism, i.e., an attention mechanism that relates different positions of a single input sequence in order to compute a representation of that same sequence.



References

2017

(Lin et al., 2017) ⇒ Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. (2017). "A Structured Self-Attentive Sentence Embedding." In: Proceedings of the 5th International Conference on Learning Representations (ICLR 2017).

$\mathbf{a} = \mathrm{softmax}\left(\mathbf{w_{s2}} \tanh\left(W_{s1}H^T\right)\right) \quad (5)$
Here $W_{s1}$ is a weight matrix with a shape of $d_a$-by-$2u$, and $\mathbf{w_{s2}}$ is a vector of parameters with size $d_a$, where $d_a$ is a hyperparameter we can set arbitrarily. Since $H$ is sized $n$-by-$2u$, the annotation vector $\mathbf{a}$ will have a size $n$. The $\mathrm{softmax}(\cdot)$ ensures all the computed weights sum up to 1. Then we sum up the LSTM hidden states $H$ according to the weights provided by $\mathbf{a}$ to get a vector representation $\mathbf{m}$ of the input sentence.
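
A minimal NumPy sketch of Eq. (5) and the subsequent weighted sum; the sizes $n$, $2u$, and $d_a$ and the random parameter values are hypothetical placeholders chosen purely for illustration, not values from the source:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sizes: n tokens, 2u-dimensional BiLSTM states, d_a attention units.
n, two_u, d_a = 6, 8, 4
rng = np.random.default_rng(0)

H    = rng.standard_normal((n, two_u))    # LSTM hidden states H, n-by-2u
W_s1 = rng.standard_normal((d_a, two_u))  # weight matrix W_s1, d_a-by-2u
w_s2 = rng.standard_normal(d_a)           # parameter vector w_s2, size d_a

# a = softmax(w_s2 . tanh(W_s1 H^T)); the annotation vector a has size n
# and its entries sum to 1.
a = softmax(w_s2 @ np.tanh(W_s1 @ H.T))

# m = sum of the hidden states weighted by a: the sentence representation, size 2u.
m = a @ H

assert np.isclose(a.sum(), 1.0)
```

In practice $W_{s1}$ and $\mathbf{w_{s2}}$ are learned jointly with the LSTM during training; the random values above merely stand in for trained parameters so the shapes and the sum-to-1 property can be checked.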