Block-Coordinate Descent Algorithm

A Block-Coordinate Descent Algorithm is a coordinate descent algorithm that ...

See: Parallelized Block Coordinate Descent Algorithm.

References

2017

http://www.math.ucla.edu/~wotaoyin/papers/bcu/
- QUOTE: The block-coordinate update (BCU) method is a generalization to the following classic methods:
  - alternating minimization (of a function in the form of f(x_1,x_2)),
  - alternating projection (to find a point in the intersection of two convex sets mathcal{C}_1 and mathcal{C}_2 by alternatively projecting to mathcal{C}_1 and mathcal{C}_2),
  - (block) coordinate minimization (of a function in the form of f(x_1,x_2,ldots,x_s)),
  - (block) coordinate descent (of a function in the form of f(x_1,x_2,ldots,x_s)).
- BCU solves the problem in the form of [math]\displaystyle{ mathop{mathrm{minimize}}_{mathbf{x}_1,ldots,mathbf{x}_s} F(mathbf{x}_1,ldots,mathbf{x}_s)~mbox{subject to}~(mathbf{x}_1,ldots,mathbf{x}_s)inmathcal{X} }[/math] by updating just one or a few blocks of variables at a time, rather than updating all the blocks together (the batch update). The order of update can be deterministic or stochastic. The deterministic orders can be eithr cyclic or greedy according to a certain rank.
- The main advantage is that updating one or just a few blocks of variables are computationally much cheaper than the batch update. On the other hand, convergence requires more stringent conditions and typically takes more iterations.
- The update applied to each block can be exact minimization over the block or take different forms of inexact updates such as

  one or a few gradient descent steps,
   one or a few projected gradient descent steps,
   one or a few (preconditioned) CG steps,
   prox-linear update,

There is a tradeoff between the per-update complexity and the progress of overall minimization.

2015

(Kim et al., 2015) ⇒ Hannah Kim, Jaegul Choo, Jingu Kim, Chandan K. Reddy, and Haesun Park. (2015). “Simultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization.” In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2015). ISBN:978-1-4503-3664-2 doi:10.1145/2783258.2783338
- QUTOE: To address such needs, this paper presents a novel topic modeling method based on joint nonnegative matrix factorization, which simultaneously discovers common as well as discriminative topics given multiple document sets. Our approach is based on a block-coordinate descent framework and is capable of utilizing only the most representative, thus meaningful, keywords in each topic through a novel pseudo-deflation approach.

2001

(Tseng, 2001) ⇒ Paul Tseng. (2001). “Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization.” In: Journal of optimization theory and applications 109, no. 3

Block-Coordinate Descent Algorithm

References

2017

2015

2001

Navigation menu

Search