Proximal Gradient Methods


Convex optimization is a subfield of optimization that can produce reliable solutions and can be solved exactly. Many signal processing problems can be formulated as convex optimization problems of the form

\min_{x \in \mathbb{R}^N} \; f_1(x) + f_2(x) + \cdots + f_{n-1}(x) + f_n(x)

where f_1, f_2, \ldots, f_n are convex functions f_i : \mathbb{R}^N \rightarrow \mathbb{R}, some of which may be non-differentiable. This rules out conventional smooth optimization techniques such as the steepest descent method and the conjugate gradient method, but there is a specific class of algorithms that can solve the above optimization problem. These methods proceed by splitting, in that the functions f_1, \ldots, f_n are used individually so as to yield an easily implementable algorithm. They are called proximal because each nonsmooth function among f_1, \ldots, f_n is involved via its proximity operator. The iterative shrinkage-thresholding algorithm, projected Landweber, projected gradient, alternating projections, alternating-direction method of multipliers, and alternating split Bregman are special instances of proximal algorithms. Details of proximal methods are discussed in Combettes et al.[1] For the theory of proximal gradient methods from the perspective of and with applications to statistical learning theory, see proximal gradient methods for learning.
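As a concrete illustration, the iterative shrinkage-thresholding algorithm applies this splitting idea to a problem of the form f_1(x) + f_2(x), with f_1(x) = \frac{1}{2}\|Ax - b\|_2^2 smooth and f_2(x) = \lambda\|x\|_1 nonsmooth: a gradient step on f_1 is followed by the proximity operator of f_2 (soft thresholding). The following NumPy sketch is illustrative only; the matrix A, vector b, regularization weight, and iteration count are arbitrary placeholders, not part of the original article.

    import numpy as np

    def soft_threshold(v, t):
        # Proximity operator of t*||.||_1 (componentwise soft thresholding).
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def ista(A, b, lam, n_iter=200):
        # Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by forward-backward splitting.
        step = 1.0 / np.linalg.norm(A, 2) ** 2                 # 1/L, with L the Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            grad = A.T @ (A @ x - b)                           # gradient (forward) step on the smooth part
            x = soft_threshold(x - step * grad, step * lam)    # proximal (backward) step on the l1 part
        return x

    # Toy data, chosen only for illustration.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 50))
    x_true = np.zeros(50)
    x_true[:3] = [2.0, -1.5, 1.0]
    b = A @ x_true
    print(ista(A, b, lam=0.1)[:5])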

Notations and Terminology

Let \mathbb{R}^N, the N-dimensional Euclidean space, be the domain of the function f : \mathbb{R}^N \rightarrow [-\infty, +\infty]. Suppose C is a non-empty convex subset of \mathbb{R}^N. Then the indicator function of C is defined as

i_C : x \mapsto \begin{cases} 0 & \text{if } x \in C \\ +\infty & \text{if } x \notin C \end{cases}

The p-norm \|\cdot\|_p is defined as

\|x\|_p = \left( |x_1|^p + |x_2|^p + \dots + |x_N|^p \right)^{1/p}
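For example, for x = (3, -4) \in \mathbb{R}^2,

\|x\|_1 = |3| + |-4| = 7, \qquad \|x\|_2 = \sqrt{3^2 + 4^2} = 5.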

The distance from x \in \mathbb{R}^N to C is defined as

D_C(x) = \min_{y \in C} \|x - y\|

If C is closed and convex, the projection of x \in \mathbb{R}^N onto C is the unique point P_C x \in C such that D_C(x) = \|x - P_C x\|_2.
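For instance, when C is the closed Euclidean ball of radius r centred at the origin, the projection has the closed form P_C x = x if \|x\|_2 \leq r and P_C x = r x / \|x\|_2 otherwise. A minimal NumPy sketch (the radius and test point are arbitrary choices):

    import numpy as np

    def project_ball(x, r=1.0):
        # Projection of x onto the closed Euclidean ball of radius r centred at 0.
        norm = np.linalg.norm(x)
        return x if norm <= r else (r / norm) * x

    x = np.array([3.0, 4.0])
    p = project_ball(x, r=1.0)
    print(p)                      # [0.6 0.8]
    print(np.linalg.norm(x - p))  # the distance D_C(x), here 4.0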

The subdifferential of f is given by

\partial f : \mathbb{R}^N \rightarrow 2^{\mathbb{R}^N} : x \mapsto \left\{ u \in \mathbb{R}^N \;\middle|\; \forall y \in \mathbb{R}^N,\ (y - x)^{\mathrm{T}} u + f(x) \leq f(y) \right\}
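For example, for the absolute-value function f(x) = |x| on \mathbb{R}, the subdifferential is

\partial f(x) = \begin{cases} \{-1\} & \text{if } x < 0 \\ [-1, 1] & \text{if } x = 0 \\ \{+1\} & \text{if } x > 0, \end{cases}

which reduces to the singleton \{f'(x)\} wherever f is differentiable.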

Projection onto Convex Sets (POCS)

One of the most widely used convex optimization algorithms is POCS (projection onto convex sets). This algorithm is employed to recover or synthesize a signal that simultaneously satisfies several convex constraints. Let f_i be the indicator function of a non-empty closed convex set C_i modeling a constraint. The problem then reduces to a convex feasibility problem, which requires us to find a solution lying in the intersection of all the convex sets C_i. In the POCS method each set C_i is incorporated via its projection operator P_{C_i}, so in each iteration x is updated as

x_{k+1} = P_{C_1} P_{C_2} \cdots P_{C_n} x_k
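As a minimal sketch of this iteration with two constraint sets (a box C_1 and a halfspace C_2, both arbitrary choices made only for illustration):

    import numpy as np

    def project_box(x, lo=-1.0, hi=1.0):
        # Projection onto the box C1 = {x : lo <= x_i <= hi}.
        return np.clip(x, lo, hi)

    def project_halfspace(x, a, beta):
        # Projection onto the halfspace C2 = {x : a^T x <= beta}.
        viol = a @ x - beta
        return x if viol <= 0 else x - (viol / (a @ a)) * a

    a, beta = np.array([1.0, 1.0]), 1.0
    x = np.array([5.0, -3.0])
    for _ in range(50):                                   # x_{k+1} = P_{C1} P_{C2} x_k
        x = project_box(project_halfspace(x, a, beta))
    print(x)  # a point in (or near) the intersection of C1 and C2, when it is non-empty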

However, beyond such feasibility problems projection operators are no longer appropriate, and more general operators are required. Among the various generalizations of the notion of a convex projection operator, proximity operators are best suited for this purpose.

Definition

The proximity operator of a function f at a point x is defined as follows.

For every x \in \mathbb{R}^N, the minimization problem

\min_{y \in \mathbb{R}^N} \; f(y) + \frac{1}{2} \left\| x - y \right\|_2^2

admits a unique solution, which is denoted by \operatorname{prox}_f(x). The proximity operator of f is thus the mapping

\operatorname{prox}_f : \mathbb{R}^N \rightarrow \mathbb{R}^N

The proximity operator of f is characterized by the inclusion

p = \operatorname{prox}_f(x) \quad \Leftrightarrow \quad x - p \in \partial f(p) \qquad (\forall (x, p) \in \mathbb{R}^N \times \mathbb{R}^N)

If f is differentiable, then the above inclusion reduces to the equation

p = \operatorname{prox}_f(x) \quad \Leftrightarrow \quad x - p = \nabla f(p) \qquad (\forall (x, p) \in \mathbb{R}^N \times \mathbb{R}^N)
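For example, the proximity operator of f = \lambda\|\cdot\|_1 is componentwise soft thresholding, and the proximity operator of the indicator function i_C is exactly the projection P_C. A small NumPy sketch checking the inclusion above for the \ell_1 case (\lambda and the test vector are arbitrary choices):

    import numpy as np

    def prox_l1(x, lam):
        # prox of lam*||.||_1: componentwise soft thresholding.
        return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

    lam = 0.5
    x = np.array([2.0, -0.3, 0.7])
    p = prox_l1(x, lam)
    print(p)              # [ 1.5 -0.   0.2]
    # Characterization: x - p must lie in lam * (subdifferential of ||.||_1 at p),
    # i.e. each component of (x - p)/lam lies in [-1, 1] and equals sign(p_i) where p_i != 0.
    print((x - p) / lam)  # [ 1.  -0.6  1. ]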

Examples

Special instances of proximal gradient methods include

  • Projected Landweber
  • Projected gradient method
  • Iterative shrinkage-thresholding algorithm (ISTA)
  • Alternating projections
  • Alternating-direction method of multipliers (ADMM)
  • Alternating split Bregman

See also

  • Proximal gradient methods for learning

References

  • Rockafellar, R. T. (1970). Convex analysis. Princeton: Princeton University Press. 
  • Combettes, Patrick L.; Pesquet, Jean-Christophe (2011). "Proximal Splitting Methods in Signal Processing". In Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Springer Optimization and Its Applications, vol. 49. pp. 185–212.

Notes

  1. Combettes, Patrick L.; Pesquet, Jean-Christophe (2009). "Proximal Splitting Methods in Signal Processing". pp. 11–23.
  2. "Beck, A; Teboulle, M (2009). "A fast iterative shrinkage-thresholding algorithm for linear inverse problems". SIAM J. Imaging Science 2. pp. 183–202. 
