Exploding Gradient Problem: Difference between revisions

Latest revision as of 01:50, 27 February 2024

An Exploding Gradient Problem is a Neural Network Training Algorithm problem that arises when using gradient descent and backpropagation.

Context:
- It can be solved by using a Gradient Clipping Algorithm and Weight Regularization Algorithm.
- …
Example(s):
- Figure 3 in Grosse (2017).
- …
Counter-Example(s):
- Vanishing Gradient Problem.
See: Recurrent Neural Network, Deep Neural Network Training Algorithm.

References

2021

(DeepAI, 2021) ⇒ https://deepai.org/machine-learning-glossary-and-terms/exploding-gradient-problem Retrieved:2021-01-23.
- QUOTE: In machine learning, the exploding gradient problem is an issue found in training artificial neural networks with gradient-based learning methods and backpropagation. An artificial neural network is a learning algorithm, also called neural network or neural net, that uses a network of functions to understand and translate data input into a specific output. This type of learning algorithm is designed to mimic the way neurons function in the human brain. Exploding gradients are a problem when large error gradients accumulate and result in very large updates to neural network model weights during training. Gradients are used during training to update the network weights, but when the typically this process works best when these updates are small and controlled. When the magnitudes of the gradients accumulate, an unstable network is likely to occur, which can cause poor predicition results or even a model that reports nothing useful what so ever. There are methods to fix exploding gradients, which include gradient clipping and weight regularization, among others.
  (...)
  
  Exploding gradients can cause problems in the training of artificial neural networks. When there are exploding gradients, an unstable network can result and the learning cannot be completed. The values of the weights can also become so large as to overflow and result in something called NaN values. NaN values, which stands for not a number, are values that represent an undefined or unrepresentable values. It is useful to know how to identify exploding gradients in order to correct the training.

2017

(Grosse, 2017) ⇒ Roger Grosse (2017). "Lecture 15: Exploding and Vanishing Gradients". In: CSC321 Course, University of Toronto.
- QUOTE: To make this story more concrete, consider the following RNN, which uses the tanh activation function:

Figure 3 shows the function computed at each time step, as well as the function computed by the network as a whole. From this figure, you can see which regions have exploding or vanishing gradients.

**Figure 3:** (left) The function computed by the RNN at each time step, (right) the function computed by the network.

@@ Line 17: / Line 17: @@
 === 2021 ===
 * (DeepAI, 2021) ⇒ https://deepai.org/machine-learning-glossary-and-terms/exploding-gradient-problem Retrieved:2021-01-23.
-** QUOTE: In [[machine learning]], the [[exploding gradient problem]] is an issue found in [[Neural Network Training System|training artificial neural network]]s with [[gradient-based learning method]]s and [[backpropagation]]. An [[artificial neural network]] is a [[learning algorithm]], also called [[ANN|neural network]] or [[neural net]], that uses a [[network]] of [[function]]s to understand and translate [[data input]] into a specific [[output]]. This type of [[learning algorithm]] is designed to mimic the way [[neuron]]s [[function]] in the [[human brain]]. [[Exploding gradient]]s are a [[problem]] when large [[error gradient]]s accumulate and result in very large [[update]]s to [[neural network model]] weights during [[NN Training System|training]]. [[Gradient]]s are used during [[NN Training System|training]] to [[update]] the [[network weight]]s, but when the typically this process works best when these [[update]]s are small and controlled. When the [[magnitude]]s of the [[gradient]]s accumulate, an [[unstable network]] is likely to occur, which can cause poor [[predicition]] results or even a [[model]] that reports nothing useful what so ever. There are methods to fix exploding [[gradient]]s, which include [[gradient clipping]] and [[weight regularization]], among others.         <P><div style="text-align:center">(...)</div>         <P>        [[Exploding gradient]]s can cause problems in the [[training of artificial neural network]]s. When there are [[exploding gradient]]s, an [[unstable network]] can result and the [[learning]] cannot be completed. The values of the [[NN Weight|weight]]s can also become so large as to overflow and result in something called [[NaN value]]s. [[NaN value]]s, which stands for not a number, are values that represent an undefined or unrepresentable values. It is useful to know how to identify [[exploding gradient]]s in order to correct the [[NN Training System|training]].
+** QUOTE: In [[machine learning]], the [[exploding gradient problem]] is an issue found in [[Neural Network Training System|training artificial neural network]]s with [[gradient-based learning method]]s and [[backpropagation]]. An [[artificial neural network]] is a [[learning algorithm]], also called [[ANN|neural network]] or [[neural net]], that uses a [[network]] of [[function]]s to understand and translate [[data input]] into a specific [[output]]. This type of [[learning algorithm]] is designed to mimic the way [[neuron]]s [[function]] in the [[human brain]]. [[Exploding gradient]]s are a [[problem]] when large [[error gradient]]s accumulate and result in very large [[update]]s to [[neural network model]] weights during [[NN Training System|training]]. [[Gradient]]s are used during [[NN Training System|training]] to [[update]] the [[network weight]]s, but when the typically this process works best when these [[update]]s are small and controlled. When the [[magnitude]]s of the [[gradient]]s accumulate, an [[unstable network]] is likely to occur, which can cause poor [[predicition]] results or even a [[model]] that reports nothing useful what so ever. There are methods to fix exploding [[gradient]]s, which include [[gradient clipping]] and [[weight regularization]], among others.         <P><div style="text-align:center">(...)</div>         <P>          [[Exploding gradient]]s can cause problems in the [[training of artificial neural network]]s. When there are [[exploding gradient]]s, an [[unstable network]] can result and the [[learning]] cannot be completed. The values of the [[NN Weight|weight]]s can also become so large as to overflow and result in something called [[NaN value]]s. [[NaN value]]s, which stands for not a number, are values that represent an undefined or unrepresentable values. It is useful to know how to identify [[exploding gradient]]s in order to correct the [[NN Training System|training]].
 === 2017 ===

Exploding Gradient Problem: Difference between revisions

Latest revision as of 01:50, 27 February 2024

References

2021

2017

Navigation menu

Search