Functional derivative

In mathematics and theoretical physics, the functional derivative is a generalization of the directional derivative. The difference is that the latter differentiates in the direction of a vector, while the former differentiates in the direction of a function. Both of these can be viewed as extensions of the usual calculus derivative.

Two possible, restricted definitions suitable for certain computations are given here. There are more general definitions of functional derivatives.

For a functional $F$ acting on functions $\phi$ (continuous, smooth, satisfying certain boundary conditions, etc.) from a manifold $M$ to $\mathbb{R}$ or $\mathbb{C}$, the functional derivative of $F$, denoted $\delta F / \delta\phi(x)$, is a distribution $\delta F[\phi]$ such that for all test functions $f$,

$\left\langle \delta F[\phi], f \right\rangle = \left.\frac{d}{d\epsilon}F[\phi+\epsilon f]\right|_{\epsilon=0}.$
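As a rough numerical illustration of this definition, the sketch below checks it on a grid. The functional $F[\phi] = \int_0^1 \phi(x)^2\,dx$, the choices of $\phi$ and $f$, and all function names are assumptions made for the example; for this $F$ the distribution $\delta F[\phi]$ is the ordinary function $2\phi$.

```python
import math

N = 2000                      # number of grid intervals on [0, 1]
h = 1.0 / N
xs = [i * h for i in range(N + 1)]

def integrate(values):
    """Trapezoidal rule on the uniform grid xs."""
    return h * (math.fsum(values) - 0.5 * (values[0] + values[-1]))

def F(phi):
    """Illustrative functional F[phi] = integral of phi(x)^2 over [0, 1]."""
    return integrate([p * p for p in phi])

def directional_derivative(functional, phi, f, eps=1e-6):
    """Central-difference estimate of d/d(eps) F[phi + eps*f] at eps = 0."""
    plus = [p + eps * g for p, g in zip(phi, f)]
    minus = [p - eps * g for p, g in zip(phi, f)]
    return (functional(plus) - functional(minus)) / (2 * eps)

phi = [math.sin(math.pi * x) for x in xs]
f = [x * (1.0 - x) for x in xs]          # test function, zero at both ends

lhs = directional_derivative(F, phi, f)                # right-hand side of the definition
rhs = integrate([2 * p * g for p, g in zip(phi, f)])   # pairing <2*phi, f>
print(lhs, rhs)
```

The two numbers should agree up to rounding, and both approximate $8/\pi^3$, the exact value of $\int_0^1 2\sin(\pi x)\,x(1-x)\,dx$.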

Sometimes physicists write the definition in terms of a limit and the Dirac delta function, δ:

$\frac{\delta F[\phi(x)]}{\delta \phi(y)}=\lim_{\varepsilon\to 0}\frac{F[\phi(x)+\varepsilon\delta(x-y)]-F[\phi(x)]}{\varepsilon}.$

However, the right-hand side is not mathematically rigorous, since $F$ is in general not defined on distributions such as the delta function.
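A grid version of the limit formula can make both its use and its limitation concrete. In the sketch below, which again assumes the illustrative functional $F[\phi] = \int_0^1 \phi(x)^2\,dx$ (not taken from the text), the term $\varepsilon\,\delta(x-y)$ is replaced by a spike of height $\varepsilon/h$ at a single grid node, where $h$ is the grid spacing.

```python
import math

N = 400
h = 1.0 / N
xs = [i * h for i in range(N + 1)]
phi = [math.sin(math.pi * x) for x in xs]

def F(values):
    """Riemann-sum version of the illustrative functional F[phi] = integral of phi^2."""
    return h * math.fsum(v * v for v in values)

def pointwise_derivative(functional, values, j, eps=1e-7):
    """Difference quotient with eps * delta(x - x_j) replaced by a grid spike."""
    bumped = list(values)
    bumped[j] += eps / h      # spike of height eps/h, so its grid integral is eps
    return (functional(bumped) - functional(values)) / eps

j = N // 4                    # grid node at x = 0.25
approx = pointwise_derivative(F, phi, j)
exact = 2 * phi[j]            # delta F / delta phi(y) = 2 phi(y) for this F
print(approx, exact)
```

In exact arithmetic the quotient differs from $2\phi(y)$ by $\varepsilon/h$, which is small only while $\varepsilon$ is much smaller than the grid spacing. At fixed $\varepsilon$ the error grows without bound as $h \to 0$, a grid-level picture of why this particular $F$ cannot be evaluated on a true delta function.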

Formal description

The definition of a functional derivative can be made mathematically precise by specifying the space of functions more carefully. For example, when the space of functions is a Banach space, the functional derivative becomes known as the Fréchet derivative, while one uses the Gâteaux derivative on more general locally convex spaces. Note that every Hilbert space is a Banach space. The more formal treatment allows many theorems from ordinary calculus and analysis to be generalized to corresponding theorems in functional analysis, as well as numerous new theorems to be stated.

Relationship between the mathematical and physical definitions

The mathematicians' definition and the physicists' definition of the functional derivative mean slightly different things. The first describes how the entire functional $F$ changes as a result of a small change in the function $\phi(x)$; the functional derivative is itself still a functional. A physicist, however, often wants to know how one quantity, say the charge density at position 1, $n(y_1)$, is affected by changing another quantity, say the value of the electric potential at position 2, $U(y_2)$. (If the system contains many interacting charges, changing the potential at position 2 moves those charges, which changes the potential and the charge density at every other point in space, including position 1.) The density is a functional of the function $U(y)$ that describes the potential at each point in space. But when this functional is evaluated for a specific form of the potential, the density becomes a function, since it has a different value at each point in space.

In this case the physicist is interested in the “functional derivative” $\delta n(y_1) / \delta U(y_2)$. To get this quantity, take the mathematician's functional derivative of the functional (the density) for the special case in which the variation of the function (the potential) is non-zero only at position 2, namely a delta function: $f(y) = \delta(y - y_2)$. The resulting functional derivative is still a functional of the potential function. But when the actual potential function is substituted for the argument of this functional, the result is the function $\delta n(y) / \delta U(y_2)$, which describes the change in the charge density as a function of position. Finally, this function can be evaluated at position 1 to give $\delta n(y_1) / \delta U(y_2)$, the change in density at position 1 due to changing the potential only at position 2.

Examples

We give a formula for the functional derivative of a common class of functionals, those that can be written as the integral of a function and its derivatives (a generalization of the Euler-Lagrange equation), and apply this formula to three examples taken from physics. Another example in physics is the derivation of the Lagrange equation of the second kind from the principle of least action in Lagrangian mechanics.

Formula for the integral of a function and its derivatives

Given a functional of the form

$F[\rho(\mathbf{r})] = \int f( \mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}) )\, d^3r,$

with $\rho$ vanishing at the boundaries of the integration domain, the functional derivative can be written

$\begin{matrix} \left\langle \delta F[\rho], \phi \right\rangle & = & \frac{d}{d\epsilon} \left. \int f( \mathbf{r}, \rho + \epsilon \phi, \nabla\rho+\epsilon\nabla\phi )\, d^3r \right|_{\epsilon=0} \\ & = & \int \left( \frac{\partial f}{\partial\rho} \phi + \frac{\partial f}{\partial\nabla\rho} \cdot \nabla\phi \right) d^3r \\ & = & \int \left[ \frac{\partial f}{\partial\rho} \phi + \nabla \cdot \left( \frac{\partial f}{\partial\nabla\rho} \phi \right) - \left( \nabla \cdot \frac{\partial f}{\partial\nabla\rho} \right) \phi \right] d^3r \\ & = & \int \left[ \frac{\partial f}{\partial\rho} \phi - \left( \nabla \cdot \frac{\partial f}{\partial\nabla\rho} \right) \phi \right] d^3r \\ & = & \left\langle \frac{\partial f}{\partial\rho} - \nabla \cdot \frac{\partial f}{\partial\nabla\rho}\,, \phi \right\rangle, \end{matrix}$

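where, in passing from the third line to the fourth, the divergence term has been converted to a surface integral by the divergence theorem; it vanishes because $\phi = 0$ is assumed at the integration boundaries. Thus,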

$\delta F[\rho] = \frac{\partial f}{\partial\rho} - \nabla \cdot \frac{\partial f}{\partial\nabla\rho}$

or, writing the expression more explicitly,

$\frac{\delta F[\rho(\mathbf{r})]}{\delta\rho(\mathbf{r})} = \frac{\partial}{\partial\rho(\mathbf{r})}f(\mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r})) - \nabla \cdot \frac{\partial}{\partial\nabla\rho(\mathbf{r})}f(\mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}))$
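The formula can be checked numerically in one dimension. The sketch below is only an illustration under assumed choices: the integrand $f = \frac{1}{2}\rho'^2 + \frac{1}{2}x^2\rho^2$, the Gaussian $\rho$, the test function $\phi$ and the grid are all hypothetical. It compares the directional derivative of $F$ with the pairing of $\partial f/\partial\rho - \frac{d}{dx}\,\partial f/\partial\rho' = x^2\rho - \rho''$ against $\phi$.

```python
import math

L, N = 5.0, 1000              # domain [-L, L]; rho and phi are negligible at the ends
h = 2 * L / N
xs = [-L + i * h for i in range(N + 1)]

def integrate(values):
    """Trapezoidal rule on the uniform grid xs."""
    return h * (math.fsum(values) - 0.5 * (values[0] + values[-1]))

def gradient(u):
    """Central differences in the interior, one-sided at the end points."""
    inner = [(u[i + 1] - u[i - 1]) / (2 * h) for i in range(1, N)]
    return [(u[1] - u[0]) / h] + inner + [(u[N] - u[N - 1]) / h]

def F(rho):
    """Hypothetical functional with f = rho'^2 / 2 + x^2 rho^2 / 2."""
    d = gradient(rho)
    return integrate([0.5 * g * g + 0.5 * x * x * r * r
                      for x, r, g in zip(xs, rho, d)])

rho = [math.exp(-x * x) for x in xs]
phi = [(1.0 + x) * math.exp(-x * x) for x in xs]

eps = 1e-5
lhs = (F([r + eps * p for r, p in zip(rho, phi)])
       - F([r - eps * p for r, p in zip(rho, phi)])) / (2 * eps)

# df/drho - d/dx(df/drho') = x^2 rho - rho'', with rho'' = (4 x^2 - 2) rho
euler = [x * x * r - (4 * x * x - 2) * r for x, r in zip(xs, rho)]
rhs = integrate([e * p for e, p in zip(euler, phi)])
print(lhs, rhs)
```

Both values approximate $\frac{5}{4}\sqrt{\pi/2}$; the small remaining gap comes from the finite-difference gradient, not from the formula.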

The above example is specific to the particular case that the functional depends on the function $\rho(\mathbf{r})$ and its gradient $\nabla\rho(\mathbf{r})$ only. In the more general case that the functional depends on higher order derivatives, i.e.

$F[\rho(\mathbf{r})] = \int f( \mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}), \nabla^2\rho(\mathbf{r}), \dots, \nabla^N\rho(\mathbf{r}))\, d^3r,$

where $\nabla^i$ is a tensor whose $n^i$ components $(\mathbf{r} \in \mathbb{R}^n)$ are all partial derivative operators of order $i$, i.e. $\partial^i/(\partial r^{i_1}_1 \partial r^{i_2}_2 \dots \partial r^{i_n}_n)$ with $i_1+i_2+\dots+i_n = i$, an analogous application of the definition yields

$\frac{\delta F[\rho]}{\delta \rho} = \frac{\partial f}{\partial\rho} - \nabla \cdot \frac{\partial f}{\partial(\nabla\rho)} + \nabla^2 \cdot \frac{\partial f}{\partial\left(\nabla^2\rho\right)} - \dots + (-1)^N \nabla^N \cdot \frac{\partial f}{\partial\left(\nabla^N\rho\right)} = \sum_{i=0}^N (-1)^{i}\nabla^i \cdot \frac{\partial f}{\partial\left(\nabla^i\rho\right)}.$
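As a small worked instance of this formula (a one-dimensional illustration chosen here, not one of the physical examples), take $n = 1$, $N = 2$ and $f = \frac{1}{2}(\rho'')^2$, where primes denote derivatives with respect to $x$. Only the $i = 2$ term of the sum survives:

$F[\rho] = \frac{1}{2}\int \left(\frac{d^2\rho}{dx^2}\right)^2 dx \quad\Longrightarrow\quad \frac{\delta F[\rho]}{\delta \rho} = (-1)^2 \frac{d^2}{dx^2}\frac{\partial f}{\partial \rho''} = \frac{d^4\rho}{dx^4}.$

Two integrations by parts applied directly to the definition give the same result, provided the variation and its first derivative vanish at the boundary.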

Thomas-Fermi kinetic energy functional

In 1927 Thomas and Fermi used a kinetic energy functional for a noninteracting uniform electron gas in a first attempt at a density-functional theory of electronic structure:

$T_\mathrm{TF}[\rho] = C_\mathrm{F} \int \rho^{5/3}(\mathbf{r}) \, d^3r.$

$T_\mathrm{TF}[\rho]$ depends only on the charge density $\rho$ and does not depend on its gradient, Laplacian, or other higher-order derivatives. Therefore,

$\frac{\delta T_\mathrm{TF}[\rho]}{\delta \rho} = C_\mathrm{F} \frac{\partial \rho^{5/3}(\mathbf{r})}{\partial \rho} = \frac{5}{3} C_\mathrm{F} \rho^{2/3}(\mathbf{r}).$
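A quick grid check of this result can be sketched as follows. It uses a one-dimensional analogue of the three-dimensional integral, a Gaussian density, and the atomic-units value $C_\mathrm{F} = \frac{3}{10}(3\pi^2)^{2/3}$; all of these are assumptions of the example, and the value of the constant does not affect the comparison. On a grid with spacing $h$, the functional derivative at a node is the ordinary partial derivative with respect to that node's value, divided by $h$.

```python
import math

C_F = 0.3 * (3 * math.pi ** 2) ** (2 / 3)   # atomic-units value, about 2.871 (assumed)

N, L = 200, 4.0
h = 2 * L / N
xs = [-L + i * h for i in range(N + 1)]
rho = [math.exp(-x * x) for x in xs]        # illustrative density

def T_TF(density):
    """One-dimensional grid analogue of C_F * integral of rho^(5/3)."""
    return C_F * h * math.fsum(d ** (5 / 3) for d in density)

def grid_derivative(functional, density, j, eps=1e-6):
    """(1/h) * dT/d(rho_j) by central differences: the grid functional derivative."""
    up = list(density)
    up[j] += eps
    down = list(density)
    down[j] -= eps
    return (functional(up) - functional(down)) / (2 * eps * h)

j = N // 2 + 20                             # node at x = 0.8
approx = grid_derivative(T_TF, rho, j)
exact = (5 / 3) * C_F * rho[j] ** (2 / 3)   # formula from the text
print(approx, exact)
```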

Coulomb potential energy functional

For the classical part of the potential, Thomas and Fermi employed the Coulomb potential energy functional

$J[\rho] = \frac{1}{2}\int\int \frac{\rho(\mathbf{r}) \rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r d^3r' = \int \left(\frac{1}{2}\int \frac{\rho(\mathbf{r}) \rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert} d^3r'\right) d^3r = \int j[\mathbf{r},\rho(\mathbf{r})]\, d^3r.$

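Again, $J[\rho]$ depends only on the charge density $\rho$ and does not depend on its gradient, Laplacian, or other higher-order derivatives. The integrand $j$, however, depends on $\rho$ non-locally through the inner integral, so both occurrences of $\rho$ must be varied. Applying the definition directly and using the symmetry of the kernel under $\mathbf{r} \leftrightarrow \mathbf{r}'$, which cancels the factor $\frac{1}{2}$,

$\left\langle \delta J[\rho], \phi \right\rangle = \left.\frac{d}{d\epsilon} J[\rho+\epsilon\phi]\right|_{\epsilon=0} = \frac{1}{2}\int\int \frac{\phi(\mathbf{r})\rho(\mathbf{r}') + \rho(\mathbf{r})\phi(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r\, d^3r' = \int \phi(\mathbf{r}) \left( \int \frac{\rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r' \right) d^3r,$

so that

$\frac{\delta J[\rho]}{\delta \rho(\mathbf{r})} = \int \frac{\rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r'.$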

The second functional derivative of the Coulomb potential energy functional is

$\frac{\delta^2 J[\rho]}{\delta \rho(\mathbf{r}')\,\delta \rho(\mathbf{r})} = \frac{\delta}{\delta \rho(\mathbf{r}')}\int \frac{\rho(\mathbf{r}'')}{\vert \mathbf{r}-\mathbf{r}'' \vert}\, d^3r'' = \frac{1}{\vert \mathbf{r}-\mathbf{r}' \vert}.$
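Both derivatives can be checked on a grid. The sketch below is a one-dimensional analogue under stated assumptions: the Coulomb kernel is replaced by a softened kernel $1/\sqrt{(x-x')^2+1}$ to avoid the singularity at coincident points, and the density and grid are illustrative. It confirms that the first derivative is the potential without the factor $\frac{1}{2}$, and that the second derivative is the kernel itself.

```python
import math

N, L = 60, 3.0
h = 2 * L / N
xs = [-L + i * h for i in range(N + 1)]
rho = [math.exp(-x * x) for x in xs]

def kernel(x, y):
    """Softened stand-in for 1/|r - r'| (an assumption; avoids the singularity)."""
    return 1.0 / math.sqrt((x - y) ** 2 + 1.0)

def J(density):
    """Grid version of (1/2) * double integral of rho(x) K(x, x') rho(x')."""
    return 0.5 * h * h * math.fsum(
        di * kernel(xi, xj) * dj
        for xi, di in zip(xs, density)
        for xj, dj in zip(xs, density)
    )

def shifted(density, shifts):
    """Copy of density with the given {index: amount} changes added."""
    d = list(density)
    for index, amount in shifts:
        d[index] += amount
    return d

def first_derivative(density, k, eps=1e-4):
    """Grid functional derivative (1/h) dJ/d(rho_k)."""
    return (J(shifted(density, [(k, eps)]))
            - J(shifted(density, [(k, -eps)]))) / (2 * eps * h)

def second_derivative(density, k, l, eps=1e-2):
    """Grid second functional derivative (1/h^2) d^2 J / d(rho_k) d(rho_l)."""
    def at(sk, sl):
        return J(shifted(density, [(k, sk * eps), (l, sl * eps)]))
    return (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4 * eps * eps * h * h)

def potential(density, k):
    """Grid version of the integral of K(x_k, x') rho(x') dx'."""
    return h * math.fsum(kernel(xs[k], xj) * dj for xj, dj in zip(xs, density))

k, l = 25, 40
print(first_derivative(rho, k), potential(rho, k))
print(second_derivative(rho, k, l), kernel(xs[k], xs[l]))
```

Because this $J$ is quadratic in the density, the central differences are exact up to rounding, so any mismatch would point to a wrong factor, not to a discretization effect.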

Weizsäcker kinetic energy functional

In 1935 Weizsäcker proposed a gradient correction to the Thomas-Fermi kinetic energy functional to make it better suited to a molecular electron cloud:

$T_\mathrm{W}[\rho] = \frac{1}{8} \int \frac{\nabla\rho(\mathbf{r}) \cdot \nabla\rho(\mathbf{r})}{ \rho(\mathbf{r}) }\, d^3r = \frac{1}{8} \int \frac{(\nabla\rho(\mathbf{r}))^2}{\rho(\mathbf{r})}\, d^3r = \int t[\rho(\mathbf{r}),\nabla\rho(\mathbf{r})]\, d^3r.$

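Now $T_\mathrm{W}[\rho]$ depends on the charge density $\rho$ and its gradient. Therefore, expanding the divergence with the quotient rule in the third step,

$\frac{\delta T_\mathrm{W}[\rho]}{\delta \rho} = \frac{\partial t}{\partial \rho} - \nabla\cdot\frac{\partial t}{\partial (\nabla \rho)} = -\frac{1}{8} \frac{(\nabla\rho(\mathbf{r}))^2}{\rho(\mathbf{r})^2} - \nabla\cdot\left(\frac{1}{4} \frac{\nabla\rho(\mathbf{r})}{\rho(\mathbf{r})}\right) = -\frac{1}{8} \frac{(\nabla\rho(\mathbf{r}))^2}{\rho(\mathbf{r})^2} - \frac{1}{4} \frac{\nabla^2\rho(\mathbf{r})}{\rho(\mathbf{r})} + \frac{1}{4} \frac{(\nabla\rho(\mathbf{r}))^2}{\rho(\mathbf{r})^2} = \frac{1}{8} \frac{(\nabla\rho(\mathbf{r}))^2}{\rho^2(\mathbf{r})} - \frac{1}{4} \frac{\nabla^2\rho(\mathbf{r})}{\rho(\mathbf{r})}.$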
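The sign bookkeeping in this result is easy to get wrong, so a numerical check may be useful. The sketch below works in one dimension with an assumed Gaussian density $\rho = e^{-x^2}$ and an assumed test function $\phi = e^{-(x-1/2)^2}$, both illustrative choices. It compares the directional derivative of the functional with the pairing of $\frac{1}{8}\rho'^2/\rho^2 - \frac{1}{4}\rho''/\rho$ against $\phi$.

```python
import math

L, N = 5.0, 1000
h = 2 * L / N
xs = [-L + i * h for i in range(N + 1)]

def integrate(values):
    """Trapezoidal rule on the uniform grid xs."""
    return h * (math.fsum(values) - 0.5 * (values[0] + values[-1]))

def gradient(u):
    """Central differences in the interior, one-sided at the end points."""
    inner = [(u[i + 1] - u[i - 1]) / (2 * h) for i in range(1, N)]
    return [(u[1] - u[0]) / h] + inner + [(u[N] - u[N - 1]) / h]

def T_W(rho):
    """One-dimensional grid analogue of (1/8) * integral of rho'^2 / rho."""
    d = gradient(rho)
    return integrate([g * g / (8.0 * r) for r, g in zip(rho, d)])

rho = [math.exp(-x * x) for x in xs]
phi = [math.exp(-(x - 0.5) ** 2) for x in xs]

eps = 1e-4
lhs = (T_W([r + eps * p for r, p in zip(rho, phi)])
       - T_W([r - eps * p for r, p in zip(rho, phi)])) / (2 * eps)

def weizsacker_derivative(x):
    """(1/8) rho'^2/rho^2 - (1/4) rho''/rho for rho = exp(-x^2)."""
    r = math.exp(-x * x)
    d1 = -2.0 * x * r                 # rho'
    d2 = (4.0 * x * x - 2.0) * r      # rho''
    return d1 * d1 / (8.0 * r * r) - d2 / (4.0 * r)

rhs = integrate([weizsacker_derivative(x) * p for x, p in zip(xs, phi)])
print(lhs, rhs)
```

For this density the derivative simplifies to $(1 - x^2)/2$, and both numbers approximate $\sqrt{\pi}/8$.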

Writing a function as a functional

Finally, note that any function can be written in terms of a functional. For example,

$\rho(\mathbf{r}) = \int \rho(\mathbf{r}') \delta(\mathbf{r}-\mathbf{r}')\, d^3r'.$

Therefore,

$\frac{\delta \rho(\mathbf{r})}{\delta\rho(\mathbf{r}')}=\frac{\delta \int \rho(\mathbf{r}') \delta(\mathbf{r}-\mathbf{r}')\, d^3r'}{\delta \rho(\mathbf{r}')} = \frac{\partial \left(\rho(\mathbf{r}') \delta(\mathbf{r}-\mathbf{r}')\right)}{\partial \rho} = \delta(\mathbf{r}-\mathbf{r}').$
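On a grid this identity reduces to a Kronecker delta divided by the grid spacing, which is the usual discrete stand-in for the delta function. The minimal sketch below uses hypothetical helper names and arbitrary grid values: it differentiates the evaluation functional $\rho \mapsto \rho(x_i)$ with respect to each node value.

```python
N = 50
h = 0.1                                   # grid spacing (illustrative)
rho = [float(i % 7) for i in range(N)]    # arbitrary grid values

def evaluate_at(i):
    """Return the functional rho -> rho(x_i)."""
    return lambda values: values[i]

def grid_derivative(functional, values, j, eps=1e-6):
    """Difference quotient with eps * delta(x - x_j) replaced by a grid spike."""
    bumped = list(values)
    bumped[j] += eps / h                  # spike whose grid integral is eps
    return (functional(bumped) - functional(values)) / eps

i = 10
row = [grid_derivative(evaluate_at(i), rho, j) for j in range(N)]
print(row[i], h * sum(row))
```

The only non-zero entry sits at $j = i$ and equals $1/h$, so the grid integral $h\sum_j$ of the row is 1, mirroring $\int \delta(\mathbf{r}-\mathbf{r}')\,d^3r' = 1$.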