Danskin's theorem

In convex analysis, Danskin's theorem describes the derivatives of a function of the form

f(x) = \max_{z \in Z} \phi(x,z).

The theorem has applications in optimization, where it is sometimes used to solve minimax problems. The original theorem, given by J. M. Danskin in his 1967 monograph "The Theory of Max-Min and its Applications to Weapons Allocation Problems" (Springer, NY), provides a formula for the directional derivative of the maximum of a (not necessarily convex) directionally differentiable function. When adapted to the case of a convex function, this formula yields the theorem stated below, which is given in somewhat more general form as Proposition A.22 in the 1971 Ph.D. thesis by D. P. Bertsekas, "Control of Uncertain Systems with a Set-Membership Description of the Uncertainty". A proof of the version stated below can be found in the 1999 book "Nonlinear Programming" by Bertsekas (Section B.5).
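
As a rough illustration of the minimax use mentioned above, the following Python sketch minimizes f(x) = \max_z \phi(x,z) with a subgradient method, using the gradient of \phi at one maximizing z as the descent direction (the derivative formula given in the Statement section). The function \phi(x,z) = (x - z)^2, the grid standing in for Z = [-1, 2], the starting point, and the step-size rule are all assumptions chosen for illustration; none of them come from the cited references.

```python
import numpy as np

# Hypothetical setup: phi(x, z) = (x - z)^2, convex in x for every z.
# Z = [-1, 2] is approximated by a finite grid (an assumption of this sketch).
Z = np.linspace(-1.0, 2.0, 301)


def phi(x, z):
    return (x - z) ** 2


def dphi_dx(x, z):
    # Partial derivative of phi with respect to x.
    return 2.0 * (x - z)


def f(x):
    # f(x) = max over z in Z of phi(x, z).
    return np.max(phi(x, Z))


def danskin_subgradient(x):
    # Pick one maximizing z and return the gradient of phi(., z) at x.
    # When the maximizer is not unique this is only a subgradient of f.
    z_star = Z[np.argmax(phi(x, Z))]
    return dphi_dx(x, z_star)


# Subgradient method with diminishing steps 0.5 / k (an illustrative choice).
x = 3.0
for k in range(1, 2001):
    x = x - (0.5 / k) * danskin_subgradient(x)

print("approximate minimizer:", x)
print("approximate minimax value:", f(x))
```

For this particular \phi, the worst-case z is always the endpoint of Z farthest from x, so the iterates should settle near the midpoint x = 0.5, where f(x) = 2.25.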

Statement

The theorem applies to the following situation. Suppose \phi(x,z) is a continuous function of two arguments,

\phi: {\mathbb R}^n \times Z \rightarrow {\mathbb R}

where Z \subset {\mathbb R}^m is a compact set. Further assume that \phi(x,z) is convex in x for every z \in Z.

Under these conditions, Danskin's theorem provides conclusions regarding the differentiability of the function

f(x) = \max_{z \in Z} \phi(x,z).

To state these results, we define the set of maximizing points Z_0(x) as

Z_0(x) = \left\{ \overline{z} : \phi(x,\overline{z}) = \max_{z \in Z} \phi(x,z)\right\}.

Danskin's theorem then provides the following results.

Convexity
f(x) is convex.
Directional derivatives
The directional derivative of f(x) in the direction y, denoted D_y f(x), is given by
D_y f(x) = \max_{z \in Z_0(x)} \phi'(x,z;y),
where \phi'(x,z;y) is the directional derivative of the function \phi(\cdot,z) at x in the direction y.
Derivative
f(x) is differentiable at x if Z_0(x) consists of a single element \overline{z} and \phi(\cdot,\overline{z}) is differentiable at x. In this case, the derivative of f(x) (or the gradient of f(x) if x is a vector) is given by
\frac{\partial f}{\partial x} = \frac{\partial \phi(x,\overline{z})}{\partial x}.
Subdifferential
If \phi(x,z) is differentiable with respect to x for all z \in Z, and if \partial \phi/\partial x is continuous with respect to z for all x, then the subdifferential of f(x) is given by
\partial f(x) = \mathrm{conv} \left\{ \frac{\partial \phi(x,z)}{\partial x} : z \in Z_0(x) \right\}
where \mathrm{conv} indicates the convex hull operation.
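
The results above can be checked numerically on a small case. The sketch below assumes a finite (hence compact) set Z of three indices and affine functions \phi(x,z_i) = a_i^T x + b_i; the matrix A, the vector b, the test points, and the helper names are hypothetical choices made for illustration only. It compares the directional-derivative formula with a one-sided finite difference at a point where two maximizers tie, compares the gradient formula with central differences at a point with a unique maximizer, and tests the subgradient inequality for one convex combination of the active gradients.

```python
import numpy as np

# Hypothetical data: phi(x, z_i) = A[i] @ x + b[i], affine (so convex) in x.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)


def f(x):
    return np.max(A @ x + b)


def maximizers(x, tol=1e-9):
    # Indices of Z_0(x): the z_i attaining the maximum (up to a tolerance).
    vals = A @ x + b
    return np.flatnonzero(vals >= vals.max() - tol)


def danskin_directional(x, y):
    # max over z in Z_0(x) of phi'(x, z; y); here phi'(x, z_i; y) = A[i] @ y.
    return np.max(A[maximizers(x)] @ y)


def one_sided_difference(x, y, t=1e-6):
    return (f(x + t * y) - f(x)) / t


# Directional derivative at a point where z_0 and z_1 tie.
x_tie = np.array([1.0, 1.0])
y = np.array([1.0, -0.5])
dd_formula = danskin_directional(x_tie, y)
dd_numeric = one_sided_difference(x_tie, y)

# Gradient at a point with a unique maximizer.
x_unique = np.array([2.0, 0.0])
active = maximizers(x_unique)
grad_formula = A[active[0]]
h = 1e-6
grad_numeric = np.array(
    [(f(x_unique + h * e) - f(x_unique - h * e)) / (2 * h) for e in np.eye(2)]
)

# Subdifferential: a convex combination of the active gradients at x_tie
# should satisfy f(w) >= f(x_tie) + g @ (w - x_tie) for every w.
g = 0.3 * A[0] + 0.7 * A[1]
rng = np.random.default_rng(0)
W = rng.normal(size=(100, 2))
inequality_holds = all(f(w) >= f(x_tie) + g @ (w - x_tie) - 1e-12 for w in W)

print(dd_formula, dd_numeric)
print(grad_formula, grad_numeric)
print(inequality_holds)
```

A finite Z keeps the maximization exact; for a continuous Z the inner maximization would itself have to be solved or approximated, which this sketch does not address.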
Extension

The 1971 Ph.D. thesis by Bertsekas [1] (Proposition A.22) proves a more general result, which does not require \phi(\cdot,z) to be differentiable. Instead, it assumes that \phi(\cdot,z) is an extended real-valued closed proper convex function for each z in the compact set Z, that int(dom(f)), the interior of the effective domain of f, is nonempty, and that \phi is continuous on the set int(dom(f))\times Z. Then for all x in int(dom(f)), the subdifferential of f at x is given by

\partial f(x) = \mathrm{conv} \left\{ \partial \phi(x,z) : z \in Z_0(x) \right\}

where \partial \phi(x,z) is the subdifferential of \phi(\cdot,z) at x for any z in Z.
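
A small one-dimensional case may make the extended formula concrete. The sketch below assumes \phi(x,z) = |x - z| with Z = \{-1, 1\}, so that f(x) = |x| + 1; this example and the helper names are hypothetical and are not taken from the thesis. In one dimension each subdifferential is a closed interval, and the convex hull of a union of intervals is the interval from the smallest lower endpoint to the largest upper endpoint.

```python
import numpy as np

# Hypothetical example: phi(x, z) = |x - z|, which is convex but not
# differentiable in x at x = z. Z is a two-point (compact) set.
Z = np.array([-1.0, 1.0])


def phi(x, z):
    return np.abs(x - z)


def f(x):
    return np.max(phi(x, Z))


def subdiff_phi(x, z):
    # Subdifferential of |. - z| at x, returned as an interval (lo, hi).
    if x > z:
        return (1.0, 1.0)
    if x < z:
        return (-1.0, -1.0)
    return (-1.0, 1.0)


def subdiff_f(x, tol=1e-12):
    # conv of the union of subdiff_phi(x, z) over the maximizing points z.
    vals = phi(x, Z)
    active = Z[vals >= vals.max() - tol]
    intervals = [subdiff_phi(x, z) for z in active]
    return (min(lo for lo, _ in intervals), max(hi for _, hi in intervals))


def is_subgradient(g, x, grid=np.linspace(-3.0, 3.0, 121)):
    # Checks the subgradient inequality f(w) >= f(x) + g * (w - x) on a grid.
    return all(f(w) >= f(x) + g * (w - x) - 1e-12 for w in grid)


print(subdiff_f(0.0))   # both points of Z are maximizers at x = 0
print(subdiff_f(0.5))   # only z = -1 is a maximizer at x = 0.5
print(is_subgradient(0.4, 0.0), is_subgradient(1.2, 0.0))
```

At x = 0 both points of Z are maximizers, with \partial \phi(0,-1) = \{1\} and \partial \phi(0,1) = \{-1\}, so the formula gives the interval [-1, 1], which agrees with the subdifferential of |x| + 1 at the origin.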

References