Coordinate Descent Algorithm

From GM-RKB
(Redirected from coordinate descent)
Jump to: navigation, search

A Coordinate Descent Algorithm is a mathematical optimization algorithm that is a non-derivative algorithm.



References

2018

  • (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/coordinate_descent Retrieved:2018-3-11.
    • Coordinate descent is an optimization algorithm that successively minimizes along coordinate directions to find the minimum of a function. At each iteration, the algorithm determines a coordinate or coordinate block via a coordinate selection rule, then exactly or inexactly minimizes over the corresponding coordinate hyperplane while fixing all other coordinates or coordinate blocks. A line search along the coordinate direction can be performed at the current iterate to determine the appropriate step size. Coordinate descent is applicable in both differentiable and derivative-free contexts.

2018b

  • (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Coordinate_descent#Description Retrieved:2018-3-11.
    • Coordinate descent is based on the idea that the minimization of a multivariable function [math] F(\mathbf{x}) [/math] can be achieved by minimizing it along one direction at a time, i.e., solving univariate (or at least much simpler) optimization problems in a loop.[1] In the simplest case of cyclic coordinate descent, one cyclically iterates through the directions, one at a time, minimizing the objective function with respect to each coordinate direction at a time. That is, starting with initial variable values : [math] \mathbf{x}^0 = (x^0_1, \ldots, x^0_n) [/math] ,

      round [math] k+1 [/math] defines [math] \mathbf{x}^{k+1} [/math] from [math] \mathbf{x}^k [/math] by iteratively solving the single variable optimization problems : [math] x^{k+1}_i = \underset{y\in\mathbb R}{\operatorname{arg\,min}}\; f(x^{k+1}_1, \dots, x^{k+1}_{i-1}, y, x^k_{i+1}, \dots, x^k_n) [/math] [2]

      for each variable [math] x_i [/math] of [math] \mathbf{x} [/math] , for [math] i [/math] from 1 to [math] n [/math] .

      Thus, one begins with an initial guess [math] \mathbf{x}^0 [/math] for a local minimum of [math] F [/math] , and gets a sequence [math] \mathbf{x}^0, \mathbf{x}^1, \mathbf{x}^2, \dots [/math] iteratively.

      By doing line search in each iteration, one automatically has : [math] F(\mathbf{x}^0)\ge F(\mathbf{x}^1)\ge F(\mathbf{x}^2)\ge \dots. [/math] It can be shown that this sequence has similar convergence properties as steepest descent. No improvement after one cycle of line search along coordinate directions implies a stationary point is reached.

      This process is illustrated below.

2017a


  1. Cite error: Invalid <ref> tag; no text was provided for refs named wright
  2. https://www.cs.cmu.edu/~ggordon/10725-F12/slides/25-coord-desc.pdf