Descent direction

In optimization, a descent direction is a vector \mathbf{p}\in\mathbb R^n that, in the sense below, moves us closer towards a local minimum \mathbf{x}^* of our objective function f:\mathbb R^n\to\mathbb R.

Suppose we are computing \mathbf{x}^* by an iterative method, such as line search. We define a descent direction \mathbf{p}_k\in\mathbb R^n at the kth iterate to be any \mathbf{p}_k such that \langle\mathbf{p}_k,\nabla f(\mathbf{x}_k)\rangle < 0, where  \langle , \rangle denotes the inner product. The motivation for such an approach is that small steps along \mathbf{p}_k guarantee that \displaystyle f is reduced, by Taylor's theorem.

Using this definition, the negative of a non-zero gradient is always a descent direction, as  \langle -\nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle = -\langle \nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle < 0 .

Numerous methods exist to compute descent directions, all with differing merits. For example, one could use gradient descent or the conjugate gradient method.

More generally, if P is a positive definite matrix, then d = -P \nabla f(x) is a descent direction [1] at x. This generality is used in preconditioned gradient descent methods.

  1. J. M. Ortega and W. C. Rheinbold (1970). Iterative Solution of Nonlinear Equations in Several Variables. p. 243. doi:10.1137/1.9780898719468.