Functional derivative

In mathematics and theoretical physics, the functional derivative is a generalization of the directional derivative. The difference is that the latter differentiates in the direction of a vector, while the former differentiates in the direction of a function. Both of these can be viewed as extensions of the usual calculus derivative.

Two possible, restricted definitions suitable for certain computations are given here. There are more general definitions of functional derivatives.

Given a manifold M of functions φ (continuous, smooth, satisfying certain boundary conditions, etc.) and a functional F defined as

F\colon M \rightarrow \mathbb{R} \quad \mbox{or} \quad F\colon M \rightarrow \mathbb{C} ,

the functional derivative of F, denoted δF / δφ(x), is a distribution δF[φ] such that for all test functions f,

\left\langle \delta F[\phi], f \right\rangle = \left.\frac{d}{d\epsilon}F[\phi+\epsilon f]\right|_{\epsilon=0}.
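As a numerical illustration of this definition, the following sketch discretizes a simple functional on a uniform grid and compares both sides of the equation. The functional F[φ] = ∫ φ(x)² dx on [0, 1], the grid, and all helper names are assumptions chosen for illustration; for this F the derivative is δF[φ] = 2φ.

```python
import math

N = 1000                                  # number of grid cells (illustrative)
h = 1.0 / N                               # grid spacing on [0, 1]
xs = [(i + 0.5) * h for i in range(N)]    # cell midpoints

def integrate(values):
    """Midpoint-rule approximation of the integral over [0, 1]."""
    return h * sum(values)

def F(phi):
    """Example functional F[phi] = integral of phi(x)^2 (an assumption)."""
    return integrate([p * p for p in phi])

phi = [math.sin(math.pi * x) for x in xs]   # function at which we differentiate
f = [x * (1.0 - x) for x in xs]             # test function (the direction)

# Right-hand side: d/d(eps) F[phi + eps f] at eps = 0, by central difference.
eps = 1e-6
plus = F([p + eps * g for p, g in zip(phi, f)])
minus = F([p - eps * g for p, g in zip(phi, f)])
directional = (plus - minus) / (2.0 * eps)

# Left-hand side: the pairing <dF, f> with the analytic derivative dF = 2 phi.
pairing = integrate([2.0 * p * g for p, g in zip(phi, f)])

print(directional, pairing)
```

Both numbers approximate the exact pairing ∫ 2 sin(πx) x(1 − x) dx = 8/π³.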

Sometimes physicists write the definition in terms of a limit and the Dirac delta function, δ:

\frac{\delta F[\phi(x)]}{\delta \phi(y)}=\lim_{\varepsilon\to 0}\frac{F[\phi(x)+\varepsilon\delta(x-y)]-F[\phi(x)]}{\varepsilon}.
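On a grid, the delta function δ(x − y) can be approximated by a spike of height 1/h in the cell containing y, so that it still integrates to one. The sketch below uses this to evaluate the limit numerically. The functional F[φ] = ∫ φ(x)² dx and the helper `functional_derivative_at` are hypothetical choices made here; for this F the expected value is 2φ(y).

```python
import math

N = 200
h = 1.0 / N
xs = [(i + 0.5) * h for i in range(N)]    # cell midpoints on [0, 1]

def F(phi):
    """Example functional F[phi] = integral of phi(x)^2 (an assumption)."""
    return h * sum(p * p for p in phi)

def functional_derivative_at(F, phi, j, eps=1e-6):
    """Approximate dF/dphi(x_j) by perturbing phi with a discrete delta spike.

    The spike has height 1/h in cell j, so it integrates to one.
    """
    bumped = list(phi)
    bumped[j] += eps / h
    return (F(bumped) - F(phi)) / eps

phi = [math.cos(x) for x in xs]
j = 50
approx = functional_derivative_at(F, phi, j)
exact = 2.0 * phi[j]
print(approx, exact)
```

For this quadratic functional the finite-difference quotient differs from 2φ(y) by exactly ε/h (up to rounding), so ε has to be sent to zero before the grid is refined. This is a small reminder that the delta function is not an ordinary small perturbation.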

Formal description

The definition of a functional derivative may be made much more mathematically precise and formal by defining the space of functions more carefully. For example, when the space of functions is a Banach space, the functional derivative becomes known as the Fréchet derivative, while one uses the Gâteaux derivative on more general locally convex spaces. Note that the well-known Hilbert spaces are special cases of Banach spaces. The more formal treatment allows many theorems from ordinary calculus and analysis to be generalized to corresponding theorems in functional analysis, as well as numerous new theorems to be stated.

Relationship between the mathematical and physical definitions

The mathematicians' definition and the physicists' definition of the functional derivative differ only in the physical interpretation. Since the mathematical definition is based on a relationship that holds for all test functions f, it should also hold when f is chosen to be a specific function. The only difficulty, usually glossed over, is that the specific function chosen is a delta function, which is not a valid test function.

In the mathematical definition, the functional derivative describes how the entire functional, F[\varphi(x)], changes as a result of a small change in the function \varphi(x). Observe that the particular form of the change in \varphi(x) is not specified. The physics definition, by contrast, employs a particular form of the perturbation, namely the delta function, and the 'meaning' is that we are varying \varphi(x) only in an arbitrarily small neighborhood of y. Outside of this neighborhood, there is no variation in \varphi(x).

Often, a physicist wants to know how one quantity, say the electric potential at position r1, is affected by changing another quantity, say the density of electric charge at position r2. The potential at a given position is a functional of the density. That is, given a particular density function and a point in space, one can compute a number which represents the potential at that point in space due to the specified density function. Since we are interested in how this number varies across all points in space, we treat the potential as a function of r. To wit,

F[\rho(r')] := V(r) =  \frac{1}{4\pi\epsilon_0} \int \frac{\rho(r')}{|r-r'|} \mathrm{d}^3r'.

That is, for each r, the potential V(r) is a functional of ρ(r'). We can apply either definition; here we apply the mathematical definition:


\begin{align}
\left\langle \delta F[\rho(r')], \varphi(r') \right\rangle
& {} = \frac{d}{d\varepsilon} \left. \frac{1}{4\pi\epsilon_0}  \int \frac{\rho(r') + \varepsilon \varphi(r')}{|r-r'|} \mathrm{d}^3r' \right|_{\varepsilon=0} \\
& {} = \frac{1}{4\pi\epsilon_0}  \int \frac{\varphi(r')}{|r-r'|} \mathrm{d}^3r' \\
& {} = \left\langle \frac{1}{4\pi\epsilon_0} \frac{1}{|r-r'|}, \varphi(r') \right\rangle.
\end{align}

So,


\frac{\delta V(r)}{\delta \rho(r')} = \frac{1}{4\pi\epsilon_0}\frac{1}{|r-r'|}.

Now, we can evaluate the functional derivative at r = r1 and r' = r2 to see how the potential at r1 is changed due to a small variation in the density at r2. In practice, the unevaluated form is probably more useful.
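For comparison, the physicists' definition can be applied to the same functional. In the sketch below, r'' is a label introduced here for the point at which the density is perturbed, to keep it distinct from the integration variable r'.

```latex
\begin{align}
\frac{\delta V(r)}{\delta \rho(r'')}
& {} = \lim_{\varepsilon\to 0} \frac{1}{\varepsilon} \left[ \frac{1}{4\pi\epsilon_0} \int \frac{\rho(r') + \varepsilon\delta(r'-r'')}{|r-r'|} \mathrm{d}^3r' - \frac{1}{4\pi\epsilon_0} \int \frac{\rho(r')}{|r-r'|} \mathrm{d}^3r' \right] \\
& {} = \frac{1}{4\pi\epsilon_0} \int \frac{\delta(r'-r'')}{|r-r'|} \mathrm{d}^3r' \\
& {} = \frac{1}{4\pi\epsilon_0} \frac{1}{|r-r''|}.
\end{align}
```

Renaming r'' to r' gives the same kernel as the test-function calculation.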

Examples

We give a formula for the functional derivative of a common class of functionals, namely those that can be written as the integral of a function and its derivatives (a generalization of the Euler–Lagrange equation), and apply this formula to three examples taken from physics. Another example in physics is the derivation of the Lagrange equation of the second kind from the principle of least action in Lagrangian mechanics.

Formula for the integral of a function and its derivatives

Given a functional of the form

F[\rho(\mathbf{r})] = \int f( \mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}) )\, d^3r,

with ρ vanishing at the boundaries of the integration domain, the functional derivative can be written


\begin{align}
\left\langle \delta F[\rho], \phi \right\rangle 
& {} = \frac{d}{d\varepsilon} \left. \int f( \mathbf{r}, \rho + \varepsilon \phi, \nabla\rho+\varepsilon\nabla\phi )\, d^3r \right|_{\varepsilon=0} \\
& {} = \int \left( \frac{\partial f}{\partial\rho} \phi + \frac{\partial f}{\partial\nabla\rho} \cdot \nabla\phi \right) d^3r \\
& {} = \int \left[ \frac{\partial f}{\partial\rho} \phi + \nabla \cdot \left( \frac{\partial f}{\partial\nabla\rho} \phi \right) - \left( \nabla \cdot \frac{\partial f}{\partial\nabla\rho} \right) \phi \right] d^3r \\
& {} = \int \left[ \frac{\partial f}{\partial\rho} \phi - \left( \nabla \cdot \frac{\partial f}{\partial\nabla\rho} \right) \phi \right] d^3r \\
& {} = \left\langle \frac{\partial f}{\partial\rho} - \nabla \cdot \frac{\partial f}{\partial\nabla\rho}\,, \phi \right\rangle,
\end{align}

where the total-divergence term in the third line integrates to zero by the divergence theorem, because φ = 0 is assumed at the integration boundaries. Thus,


\delta F[\rho] = \frac{\partial f}{\partial\rho} - \nabla \cdot \frac{\partial f}{\partial\nabla\rho}

or, writing the expression more explicitly,


\frac{\delta F[\rho(\mathbf{r})]}{\delta\rho(\mathbf{r})} = \frac{\partial}{\partial\rho(\mathbf{r})}f(\mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r})) - \nabla \cdot \frac{\partial}{\partial\nabla\rho(\mathbf{r})}f(\mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}))
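The formula can be checked numerically in one dimension. The sketch below assumes the illustrative integrand f = ρ'(x)²/2 + ρ(x)³ on [0, 1] (not taken from the text), for which the formula gives δF/δρ = 3ρ² − ρ''. It compares the directional derivative of the discretized functional with the pairing of that expression against a test function that vanishes at the boundaries.

```python
import math

N = 2000
h = 1.0 / N
xs = [i * h for i in range(N + 1)]        # nodes on [0, 1], endpoints included

def F(rho):
    """F[rho] = integral of rho'(x)^2 / 2 + rho(x)^3 on [0, 1] (an assumption)."""
    total = 0.0
    for i in range(N):
        slope = (rho[i + 1] - rho[i]) / h               # rho' on cell i
        cube = 0.5 * (rho[i] ** 3 + rho[i + 1] ** 3)    # trapezoid value of rho^3
        total += h * (0.5 * slope * slope + cube)
    return total

rho = [math.sin(math.pi * x) for x in xs]   # vanishes at both boundaries
phi = [x * (1.0 - x) for x in xs]           # test function, zero at both ends

# Directional derivative d/d(eps) F[rho + eps phi] at eps = 0.
eps = 1e-5
plus = F([r + eps * p for r, p in zip(rho, phi)])
minus = F([r - eps * p for r, p in zip(rho, phi)])
directional = (plus - minus) / (2.0 * eps)

# Formula: dF/drho = df/drho - d/dx (df/drho') = 3 rho^2 - rho''.
# For rho = sin(pi x), rho'' = -pi^2 sin(pi x).
derivative = [3.0 * r * r + math.pi ** 2 * r for r in rho]
pairing = h * sum(d * p for d, p in zip(derivative, phi))   # phi is 0 at the ends

print(directional, pairing)
```

The two values agree up to the O(h²) discretization error of the grid.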

The above example is specific to the particular case that the functional depends on the function \rho(\mathbf{r}) and its gradient \nabla\rho(\mathbf{r}) only. In the more general case that the functional depends on higher order derivatives, i.e.


F[\rho(\mathbf{r})] = \int f( \mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}), \nabla^2\rho(\mathbf{r}), \dots, \nabla^N\rho(\mathbf{r}))\, d^3r,

where \nabla^i is a vector whose n^i components (\mathbf{r} \in \mathbb{R}^n) are all partial derivative operators of order i, that is, \partial^i/(\partial r^{i_1}_1\, \partial r^{i_2}_2 \cdots \partial r^{i_n}_n) with i_1+i_2+\cdots+i_n = i, an analogous application of the definition yields


\begin{align}
\frac{\delta F[\rho]}{\delta \rho} = \frac{\partial f}{\partial\rho} - \nabla \cdot \frac{\partial f}{\partial(\nabla\rho)} + \nabla^2 \cdot \frac{\partial f}{\partial\left(\nabla^2\rho\right)} - \cdots \\
\cdots + (-1)^N \nabla^N \cdot \frac{\partial f}{\partial\left(\nabla^N\rho\right)} = \sum_{i=0}^N (-1)^{i}\nabla^i \cdot \frac{\partial f}{\partial\left(\nabla^i\rho\right)}.
\end{align}
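As a small worked instance of the higher-order formula, consider a one-dimensional functional with N = 2. The integrand below is an assumption chosen for illustration, and ρ and its first derivative are taken to vanish at the boundaries so that all boundary terms drop out.

```latex
\begin{align}
F[\rho] & {} = \int \frac{1}{2} \left( \rho''(x) \right)^2 dx, \\
\frac{\delta F[\rho]}{\delta \rho}
& {} = \frac{\partial f}{\partial\rho} - \frac{d}{dx} \frac{\partial f}{\partial\rho'} + \frac{d^2}{dx^2} \frac{\partial f}{\partial\rho''} \\
& {} = 0 - 0 + \frac{d^2}{dx^2} \rho''(x) = \rho''''(x).
\end{align}
```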

Thomas-Fermi kinetic energy functional

In 1927 Thomas and Fermi used a kinetic energy functional for a noninteracting uniform electron gas in a first attempt at a density-functional theory of electronic structure:

T_\mathrm{TF}[\rho] = C_\mathrm{F} \int \rho^{5/3}(\mathbf{r}) \, d^3r.

T_\mathrm{TF}[\rho] depends only on the charge density \rho(\mathbf{r}) and does not depend on its gradient, Laplacian, or other higher-order derivatives. Therefore,

\frac{\delta T_\mathrm{TF}[\rho]}{\delta \rho} = C_\mathrm{F} \frac{\partial \rho^{5/3}(\mathbf{r})}{\partial \rho} = \frac{5}{3} C_\mathrm{F} \rho^{2/3}(\mathbf{r}).
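A quick numerical check of this result is sketched below on a one-dimensional grid, used here only as a stand-in for the three-dimensional integral. The value C_F = (3/10)(3π²)^{2/3} assumes Hartree atomic units, and the Gaussian density and the perturbation are hypothetical test inputs.

```python
import math

C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)   # assumed: Hartree atomic units

N = 400
a, b = -4.0, 4.0
h = (b - a) / N
xs = [a + (i + 0.5) * h for i in range(N)]        # cell midpoints

def T_TF(rho):
    """Discretized Thomas-Fermi functional C_F * integral of rho^(5/3)."""
    return C_F * h * sum(r ** (5.0 / 3.0) for r in rho)

rho = [math.exp(-x * x) for x in xs]                  # positive test density
phi = [(1.0 + x) * math.exp(-x * x) for x in xs]      # perturbation direction

# Directional derivative d/d(eps) T_TF[rho + eps phi] at eps = 0.
eps = 1e-5
plus = T_TF([r + eps * p for r, p in zip(rho, phi)])
minus = T_TF([r - eps * p for r, p in zip(rho, phi)])
directional = (plus - minus) / (2.0 * eps)

# Pairing of the analytic derivative (5/3) C_F rho^(2/3) with phi.
pairing = h * sum((5.0 / 3.0) * C_F * r ** (2.0 / 3.0) * p
                  for r, p in zip(rho, phi))

print(directional, pairing)
```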

Coulomb potential energy functional

For the classical part of the potential, Thomas and Fermi employed the Coulomb potential energy functional

J[\rho] = \frac{1}{2}\int\int \frac{\rho(\mathbf{r}) \rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r d^3r' = \int \left(\frac{1}{2}\int \frac{\rho(\mathbf{r}) \rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert} d^3r'\right) d^3r = \int j[\mathbf{r},\rho(\mathbf{r})]\, d^3r.

Again, J[ρ] depends only on the charge density ρ and does not depend on its gradient, Laplacian, or other higher-order derivatives. Therefore,

\frac{\delta J[\rho]}{\delta \rho} = \frac{\partial j}{\partial \rho} = \frac{1}{2}\int \frac{\partial}{\partial \rho}\frac{\rho(\mathbf{r}) \rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r'  = \int \frac{\rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r'
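The factor 1/2 disappears because ρ enters the integrand twice, and the symmetry of the kernel under exchange of \mathbf{r} and \mathbf{r}' makes the two contributions equal. The following sketch makes this explicit by applying the test-function definition directly, with φ denoting a test function:

```latex
\begin{align}
\left\langle \delta J[\rho], \phi \right\rangle
& {} = \frac{d}{d\varepsilon} \left. \frac{1}{2}\int\int \frac{[\rho(\mathbf{r}) + \varepsilon\phi(\mathbf{r})] [\rho(\mathbf{r}') + \varepsilon\phi(\mathbf{r}')]}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r d^3r' \right|_{\varepsilon=0} \\
& {} = \frac{1}{2}\int\int \frac{\phi(\mathbf{r}) \rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r d^3r' + \frac{1}{2}\int\int \frac{\rho(\mathbf{r}) \phi(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r d^3r' \\
& {} = \int \left( \int \frac{\rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r' \right) \phi(\mathbf{r})\, d^3r.
\end{align}
```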

The second functional derivative of the Coulomb potential energy functional is

\frac{\delta^2 J[\rho]}{\delta \rho^2} = \frac{\delta}{\delta \rho}\int \frac{\rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert}\, d^3r' = \frac{\partial}{\partial \rho} \frac{\rho(\mathbf{r}')}{\vert \mathbf{r}-\mathbf{r}' \vert} = \frac{1}{\vert \mathbf{r}-\mathbf{r}' \vert}

Weizsäcker kinetic energy functional

In 1935 Weizsäcker proposed a gradient correction to the Thomas-Fermi kinetic energy functional to make it better suited to a molecular electron cloud:

T_\mathrm{W}[\rho] = \frac{1}{8} \int \frac{\nabla\rho(\mathbf{r}) \cdot \nabla\rho(\mathbf{r})}{ \rho(\mathbf{r}) }\, d^3r = \frac{1}{8} \int \frac{(\nabla\rho(\mathbf{r}))^2}{\rho(\mathbf{r})}\, d^3r = \int t[\rho(\mathbf{r}),\nabla\rho(\mathbf{r})]\, d^3r.

Now T_\mathrm{W}[\rho] depends on the charge density ρ and its gradient; therefore,

\frac{\delta T_\mathrm{W}[\rho]}{\delta \rho} = \frac{\partial t}{\partial \rho} - \nabla\cdot\frac{\partial t}{\partial (\nabla \rho)} = -\frac{1}{8} \frac{(\nabla\rho(\mathbf{r}))^2}{\rho(\mathbf{r})^2} - \nabla\cdot\left(\frac{1}{4} \frac{\nabla\rho(\mathbf{r})}{\rho(\mathbf{r})}\right) = \frac{1}{8} \frac{(\nabla\rho(\mathbf{r}))^2}{\rho^2(\mathbf{r})} - \frac{1}{4} \frac{\nabla^2\rho(\mathbf{r})}{\rho(\mathbf{r})}.
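The last equality uses the product rule for the divergence. Written out, with the argument \mathbf{r} suppressed for brevity, the intermediate step is:

```latex
\begin{align}
\nabla\cdot\left(\frac{1}{4} \frac{\nabla\rho}{\rho}\right)
& {} = \frac{1}{4} \frac{\nabla^2\rho}{\rho} - \frac{1}{4} \frac{(\nabla\rho)^2}{\rho^2}, \\
-\frac{1}{8} \frac{(\nabla\rho)^2}{\rho^2} - \nabla\cdot\left(\frac{1}{4} \frac{\nabla\rho}{\rho}\right)
& {} = \left( -\frac{1}{8} + \frac{1}{4} \right) \frac{(\nabla\rho)^2}{\rho^2} - \frac{1}{4} \frac{\nabla^2\rho}{\rho}
= \frac{1}{8} \frac{(\nabla\rho)^2}{\rho^2} - \frac{1}{4} \frac{\nabla^2\rho}{\rho}.
\end{align}
```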

Writing a function as a functional

Finally, note that any function can be written in terms of a functional. For example,

\rho(\mathbf{r}) = \int \rho(\mathbf{r}') \delta(\mathbf{r}-\mathbf{r}')\, d^3r'.

The integrand of this functional depends on ρ only, and thus the functional is of the same form as the above examples. Therefore,

\frac{\delta \rho(\mathbf{r})}{\delta\rho(\mathbf{r}')}=\frac{\delta \int \rho(\mathbf{r}') \delta(\mathbf{r}-\mathbf{r}')\, d^3r'}{\delta \rho(\mathbf{r}')} = \frac{\partial \left(\rho(\mathbf{r}') \delta(\mathbf{r}-\mathbf{r}')\right)}{\partial \rho} = \delta(\mathbf{r}-\mathbf{r}').
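On a grid this identity becomes a Kronecker delta divided by the cell size. The sketch below assumes a one-dimensional grid with spacing h and uses hypothetical helper names. It perturbs the function by a discrete delta spike in each cell j and records how the evaluation functional ρ ↦ ρ(x_i) responds.

```python
N = 10
h = 0.1                                   # grid spacing (illustrative)

def evaluate_at(i):
    """Return the functional rho -> rho(x_i) on the grid."""
    return lambda rho: rho[i]

def functional_derivative_at(F, rho, j, eps=1e-6):
    """Approximate dF/drho(x_j) using a spike of height 1/h in cell j."""
    bumped = list(rho)
    bumped[j] += eps / h
    return (F(bumped) - F(rho)) / eps

rho = [float(k) for k in range(N)]        # arbitrary grid values
i = 3
row = [functional_derivative_at(evaluate_at(i), rho, j) for j in range(N)]
print(row)
```

The only nonzero entry sits at j = i and is approximately 1/h, so h times the sum over j is approximately one, mirroring the normalization of the delta function.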

Entropy

The entropy of a discrete random variable is a functional of the probability mass function.

H[p(x)] = -\sum_x p(x) \log_2 p(x).

Thus,


\begin{align}
\left\langle \delta H, \varphi \right\rangle 
& {} = \sum_x \delta H \, \varphi(x) \\
& {} = \frac{d}{d\varepsilon} \left. H[p(x) + \varepsilon\varphi(x)] \right|_{\varepsilon=0}\\
& {} = -\frac{d}{d\varepsilon} \left. \sum_x [p(x) + \varepsilon\varphi(x)] \log_2 [p(x) + \varepsilon\varphi(x)] \right|_{\varepsilon=0} \\
& {} = \displaystyle -\sum_x \left[\frac{1}{\ln 2}+\log_2 p(x)\right]\varphi(x)\\
& {} = \left\langle -\left[\frac{1}{\ln 2}+\log_2 p(x)\right], \varphi \right\rangle.
\end{align}

Here the constant 1/\ln 2 arises because the derivative of \log_2 p with respect to p is 1/(p \ln 2). Thus,


\frac{\delta H}{\delta p} = -\left[\frac{1}{\ln 2}+\log_2 p(x)\right].
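Because the sum over x is finite, this derivative can be checked directly with ordinary finite differences. In the sketch below the probability values and the direction are made-up test inputs. The direction is deliberately not required to sum to zero, since the definition uses arbitrary test functions. The analytic gradient is obtained by differentiating each term −p log_2 p with respect to p, which gives −(log_2 p + 1/ln 2).

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability mass function given as a list."""
    return -sum(q * math.log2(q) for q in p if q > 0.0)

p = [0.5, 0.3, 0.2]            # example probability mass function
phi = [0.1, -0.3, 0.5]         # arbitrary direction (illustrative values)

# Directional derivative d/d(eps) H[p + eps phi] at eps = 0.
eps = 1e-6
plus = entropy([q + eps * f for q, f in zip(p, phi)])
minus = entropy([q - eps * f for q, f in zip(p, phi)])
directional = (plus - minus) / (2.0 * eps)

# Term-by-term derivative of -q log2(q) with respect to q.
gradient = [-(math.log2(q) + 1.0 / math.log(2.0)) for q in p]
pairing = sum(g * f for g, f in zip(gradient, phi))

print(directional, pairing)
```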
