Conjugate Gradient Optimization Algorithm

A [[Conjugate Gradient Optimization Algorithm]] is a [[batch function optimization algorithm]] that ...
* <B>AKA:</B> [[CG]].
* <B>Counter-Example(s):</B>
** [[L-BFGS Algorithm]].
** [[Generalized Iterative Scaling]].
* <B>See:</B> [[Nonlinear Conjugate Gradient Algorithm]], [[Biconjugate Gradient Algorithm]].


----
----
== References ==

=== 2012 ===
* http://en.wikipedia.org/wiki/Conjugate_gradient_method
** QUOTE: In [[mathematics]], the <B>conjugate gradient method</B> is an [[algorithm]] for the [[numerical solution]] of particular [[system of linear equations|systems of linear equations]], namely those whose matrix is [[symmetric matrix|symmetric]] and [[positive-definite matrix|positive-definite]]. The conjugate gradient method is an [[iterative method]], so it can be applied to [[sparse matrix|sparse]] systems that are too large to be handled by direct methods such as the [[Cholesky decomposition]]. Such systems often arise when numerically solving [[partial differential equation]]s. <P> The conjugate gradient method can also be used to solve unconstrained [[Mathematical optimization|optimization]] problems such as [[energy minimization]]. It was developed by [[Magnus Hestenes]] and [[Eduard Stiefel]].<ref>{{cite web|last=Straeter|first=T. A.|title=On the Extension of the Davidon-Broyden Class of Rank One, Quasi-Newton Minimization Methods to an Infinite Dimensional Hilbert Space with Applications to Optimal Control Problems|url=http://hdl.handle.net/2060/19710026200|work=NASA Technical Reports Server|publisher=NASA|accessdate=10 October 2011}}</ref> <P> The [[biconjugate gradient method]] provides a generalization to non-symmetric matrices. Various [[nonlinear conjugate gradient method]]s seek minima of nonlinear equations.
<references/>
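
The linear-system formulation quoted above admits a very compact implementation. The following is a minimal, illustrative sketch (not taken from the cited sources) of the conjugate gradient iteration for a symmetric positive-definite system A x = b, assuming NumPy; the example matrix and right-hand side are hypothetical.

<syntaxhighlight lang="python">
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    max_iter = n if max_iter is None else max_iter
    r = b - A @ x                       # residual of the current iterate
    p = r.copy()                        # first search direction: steepest descent
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)       # exact minimizer along direction p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:       # residual small enough: converged
            break
        p = r + (rs_new / rs_old) * p   # new direction, A-conjugate to the previous ones
        rs_old = rs_new
    return x

# Hypothetical 2x2 symmetric positive-definite example
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))         # ≈ [0.0909, 0.6364]
</syntaxhighlight>

In exact arithmetic the iteration terminates in at most n steps, which is why max_iter defaults to the problem dimension.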
=== 2006 ===
* ([[2006_AcceleratedTrainingofConditiona|Vishwanathan et al., 2006]]) ⇒ [[S. V. N. Vishwanathan]], [[Nicol N. Schraudolph]], [[Mark W. Schmidt]], and [[Kevin P. Murphy]]. ([[2006]]). “[http://people.cs.ubc.ca/~murphyk/Papers/icml06_camera.pdf Accelerated Training of Conditional Random Fields with Stochastic Gradient Methods].” In: [[Proceedings of the 23rd International Conference on Machine learning]] ([[ICML-2006]]). [http://dx.doi.org/10.1145/1143844.1143966 doi:10.1145/1143844.1143966]
** QUOTE: … Current [[training methods for CRFs]] ([[In this paper]], “training” specifically means [[penalized maximum likelihood parameter estimation]]) include [[generalized iterative scaling (GIS)]], [[Conjugate Gradient Optimization Algorithm|conjugate gradient (CG)]], and [[limited-memory BFGS]]. These are all [[batch-only algorithm]]s that do not work well in an online setting, and require many [[Training Data Pass|pass]]es through the [[training data]] to [[converge]]. … [[Stochastic gradient method]]s, on the other hand, are [[online]] and scale [[sub-linear]]ly with the [[amount of training data]], making them very attractive for [[large data set]]s;
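The passage above characterizes conjugate gradient as a batch optimizer: each update uses a gradient computed over the full training set. As an illustration only (the objective, gradient, and starting point below are hypothetical and not from the cited paper), the following sketch minimizes a small convex function with SciPy's nonlinear conjugate gradient routine (method='CG').

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize

# Hypothetical convex objective standing in for a penalized (negative)
# log-likelihood; its gradient is evaluated in full at every iteration,
# i.e. this is a batch method.
def objective(w):
    return 0.5 * w[0]**2 + 2.5 * w[1]**2 + w[0] * w[1]

def gradient(w):
    return np.array([w[0] + w[1], w[0] + 5.0 * w[1]])

result = minimize(objective, x0=np.array([3.0, -4.0]),
                  jac=gradient, method='CG')   # nonlinear conjugate gradient
print(result.x)   # ≈ [0.0, 0.0], the unconstrained minimizer
</syntaxhighlight>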
=== 1994 ===
* ([[1994_TrainingFeedforwardNetworkswith|Hagan & Menhaj, 1994]]) ⇒ [[Martin T. Hagan]], and [[Mohammad B. Menhaj]]. ([[1994]]). “[http://www.das.ufsc.br/~marcelo/pg-ic/Marquardt%20algorithm%20for%20MLP.pdf Training Feedforward Networks with the Marquardt Algorithm].” In: IEEE Transactions on Neural Networks Journal, 5(6). [http://dx.doi.org/10.1109/72.329697 doi:10.1109/72.329697]
** QUOTE: The [[Marquardt algorithm]] for [[nonlinear least square]]s is presented and is incorporated into the [[backpropagation algorithm]] for [[training feedforward neural networks]]. </s> [[The algorithm]] is tested on several [[function approximation problem]]s, and is compared with a [[Conjugate Gradient Optimization Algorithm|conjugate gradient algorithm]] and a [[variable learning rate algorithm]]. </s>


----
__NOTOC__
[[Category:Concept]]
