Block-Coordinate Descent Algorithm

From GM-RKB
Jump to navigation Jump to search

A Block-Coordinate Descent Algorithm is a coordinate descent algorithm that ...



References

2017

  • http://www.math.ucla.edu/~wotaoyin/papers/bcu/
    • QUOTE: The block-coordinate update (BCU) method is a generalization to the following classic methods:
      • alternating minimization (of a function in the form of f(x_1,x_2)),
      • alternating projection (to find a point in the intersection of two convex sets mathcal{C}_1 and mathcal{C}_2 by alternatively projecting to mathcal{C}_1 and mathcal{C}_2),
      • (block) coordinate minimization (of a function in the form of f(x_1,x_2,ldots,x_s)),
      • (block) coordinate descent (of a function in the form of f(x_1,x_2,ldots,x_s)).
    • BCU solves the problem in the form of [math]\displaystyle{ mathop{mathrm{minimize}}_{mathbf{x}_1,ldots,mathbf{x}_s} F(mathbf{x}_1,ldots,mathbf{x}_s)~mbox{subject to}~(mathbf{x}_1,ldots,mathbf{x}_s)inmathcal{X} }[/math] by updating just one or a few blocks of variables at a time, rather than updating all the blocks together (the batch update). The order of update can be deterministic or stochastic. The deterministic orders can be eithr cyclic or greedy according to a certain rank.
    • The main advantage is that updating one or just a few blocks of variables are computationally much cheaper than the batch update. On the other hand, convergence requires more stringent conditions and typically takes more iterations.
    • The update applied to each block can be exact minimization over the block or take different forms of inexact updates such as
  one or a few gradient descent steps,
   one or a few projected gradient descent steps,
   one or a few (preconditioned) CG steps,
   prox-linear update,

There is a tradeoff between the per-update complexity and the progress of overall minimization.

2015

2001

  • (Tseng, 2001) ⇒ Paul Tseng. (2001). “Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization.” In: Journal of optimization theory and applications 109, no. 3