First class constraint

Not to be confused with Primary constraint.

In a constrained Hamiltonian system, a dynamical quantity is called a first class constraint if its Poisson bracket with all the other constraints vanishes on the constraint surface (the surface implicitly defined by the simultaneous vanishing of all the constraints). A second class constraint is one that is not first class.

First and second class constraints were introduced by Dirac (1950, p.136, 1964, p.17) as a way of quantizing mechanical systems such as gauge theories where the symplectic form is degenerate.^[1] ^[2]

The terminology of first and second class constraints is confusingly similar to that of primary and secondary constraints. These divisions are independent: both first and second class constraints can be either primary or secondary, so this gives altogether four different classes of constraints.

Poisson brackets

In Hamiltonian mechanics, consider a symplectic manifold M with a smooth Hamiltonian over it (for field theories, M would be infinite-dimensional).

Suppose we have some constraints

f_i(x)=0,

for n smooth functions

\{ f_i \}_{i= 1}^n

These will only be defined chartwise in general. Suppose that everywhere on the constrained set, the n derivatives of the n functions are all linearly independent and also that the Poisson brackets

\{f_i,f_j\}

and

\{f_i,H\}

all vanish on the constrained subspace. This means we can write

\{f_i,f_j\}=\sum_k c_{ij}^k f_k

for some smooth functions

c_{ij}^k

(there is a theorem showing this) and

\{f_i,H\}=\sum_j v_i^j f_j

for some smooth functions

v_i^j

This can be done globally, using a partition of unity. Then, we say we have an irreducible first-class constraint (irreducible here is in a different sense from that used in representation theory).
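As a rough illustration of the closure relation above, the following sketch checks numerically that three hypothetical constraints on a six-dimensional phase space satisfy $\{f_1,f_2\}=-f_3$, with all other brackets zero. The constraints `f1`, `f2`, `f3`, the sample points, and the finite-difference helper `poisson_bracket` are assumptions made for this example and are not taken from the article; the sign convention assumed is $\{q,p\}=+1$.

```python
def poisson_bracket(A, B, q, p, h=1e-5):
    """Central-difference estimate of {A, B} at the phase-space point (q, p).

    Convention assumed here: {A, B} = sum_i dA/dq_i dB/dp_i - dA/dp_i dB/dq_i.
    """
    def d_dq(F, i):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        return (F(qp, p) - F(qm, p)) / (2 * h)

    def d_dp(F, i):
        pp, pm = list(p), list(p)
        pp[i] += h
        pm[i] -= h
        return (F(q, pp) - F(q, pm)) / (2 * h)

    return sum(d_dq(A, i) * d_dp(B, i) - d_dp(A, i) * d_dq(B, i)
               for i in range(len(q)))


# Hypothetical constraints with coordinates q = (x, y, z), p = (px, py, pz).
def f1(q, p):
    return p[0]                 # px

def f2(q, p):
    return p[1] + q[0] * p[2]   # py + x * pz

def f3(q, p):
    return p[2]                 # pz


# A generic point off the constraint surface, and a point on it (all p = 0).
q_off, p_off = [0.3, -1.1, 0.7], [0.5, 0.2, -0.9]
q_on, p_on = [0.3, -1.1, 0.7], [0.0, 0.0, 0.0]

# Closure: {f1, f2} + f3 should vanish everywhere, i.e. c_12^3 = -1.
closure_residual = poisson_bracket(f1, f2, q_off, p_off) + f3(q_off, p_off)
b13_off = poisson_bracket(f1, f3, q_off, p_off)
b23_off = poisson_bracket(f2, f3, q_off, p_off)

# On the constraint surface the bracket itself vanishes.
b12_on = poisson_bracket(f1, f2, q_on, p_on)

print(closure_residual, b13_off, b23_off, b12_on)
```

Because the differentials of these three functions are linearly independent everywhere, the example is irreducible in the sense used above; taking H = 0 (or any H whose brackets with the f's are combinations of the f's) completes the first-class conditions.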

Geometric theory

For a more elegant formulation, suppose we are given a vector bundle over M with n-dimensional fiber V. Equip this vector bundle with a connection, and suppose too that we have a smooth section f of this bundle.

Then the covariant derivative of f with respect to the connection is a smooth linear map Δf from the tangent bundle TM to V, which preserves the base point. Assume this linear map is right invertible (i.e. there exists a linear map g such that (Δf)g is the identity map) for all the fibers at the zeros of f. Then, according to the implicit function theorem, the subspace of zeros of f is a submanifold.

The ordinary Poisson bracket is only defined over $C^{\infty}(M)$, the space of smooth functions over M. However, using the connection, we can extend it to the space of smooth sections of f if we work with the algebra bundle with the graded algebra of V-tensors as fibers. Assume also that under this Poisson bracket,

{ f, f } = 0

(note that it's not true that

{ g, g } = 0

in general for this "extended Poisson bracket" anymore) and

{ f, H } = 0

on the submanifold of zeros of f (if these brackets also happen to be zero everywhere, then we say the constraints close off shell). It turns out that the right invertibility condition and the commutativity-of-flows conditions are independent of the choice of connection. So, we can drop the connection provided we are working solely with the restricted subspace.

Intuitive meaning

What does it all mean intuitively? It means the Hamiltonian and constraint flows all commute with each other on the constrained subspace; or alternatively, that if we start on a point on the constrained subspace, then the Hamiltonian and constraint flows all bring the point to another point on the constrained subspace.
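The second statement can be sketched in one line using the closure relations from the Poisson bracket section. Here the symbol $\approx$ (an assumed notation, borrowed from Dirac's "weak equality") means equality on the constrained subspace, and the flow parameter $s$ of the flow generated by $f_j$ is introduced only for this sketch; overall signs depend on the bracket convention.

```latex
\frac{d f_i}{dt} = \{f_i, H\} = \sum_j v_i^j f_j \approx 0,
\qquad
\frac{d f_i}{ds}\Big|_{\text{flow of } f_j} = \{f_i, f_j\} = \sum_k c_{ij}^k f_k \approx 0 .
```

Each constraint function therefore has vanishing rate of change along every one of these flows at points where all the $f_k$ vanish, which is why a point starting on the constrained subspace stays on it.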

Since we wish to restrict ourselves to the constrained subspace only, this suggests that the Hamiltonian, or any other physical observable, should only be defined on that subspace. Equivalently, we can look at the equivalence class of smooth functions over the symplectic manifold, which agree on the constrained subspace (the quotient algebra by the ideal generated by the f's, in other words).

The catch is, the Hamiltonian flows on the constrained subspace depend on the gradient of the Hamiltonian there, not its value. But there's an easy way out of this.

Look at the orbits of the constrained subspace under the action of the symplectic flows generated by the f's. This gives a local foliation of the subspace because it satisfies integrability conditions (Frobenius theorem). It turns out that if we start with two different points on the same orbit on the constrained subspace and evolve them under two different Hamiltonians, respectively, which agree on the constrained subspace, then the time evolution of both points under their respective Hamiltonian flows will always lie in the same orbit at equal times. It also turns out that if we have two smooth functions A₁ and B₁ which are constant over orbits, at least on the constrained subspace (i.e. physical observables, satisfying {A₁,f}={B₁,f}=0 over the constrained subspace), and another two, A₂ and B₂, which are also constant over orbits and such that A₁ and B₁ agree with A₂ and B₂ respectively over the constrained subspace, then their Poisson brackets {A₁, B₁} and {A₂, B₂} are also constant over orbits and agree over the constrained subspace.
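The integrability claim can be made slightly more explicit. Writing $X_g$ for the Hamiltonian vector field of a function $g$ (a notation assumed here, not used elsewhere in the article), the commutator of two such fields is again Hamiltonian, with an overall sign that depends on convention:

```latex
[X_{f_i}, X_{f_j}] = \pm X_{\{f_i, f_j\}},
\qquad
X_{\{f_i, f_j\}} = \sum_k \left( c_{ij}^k \, X_{f_k} + f_k \, X_{c_{ij}^k} \right)
\approx \sum_k c_{ij}^k \, X_{f_k} .
```

On the constrained subspace the second group of terms drops out because $f_k = 0$ there, so the commutator of two constraint flows is a combination of constraint flows. This is the involutivity hypothesis of the Frobenius theorem.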

In general, one cannot rule out "ergodic" flows (which basically means that an orbit is dense in some open set), or "subergodic" flows (in which an orbit is dense in some submanifold of dimension greater than the orbit's dimension). We can't have self-intersecting orbits.

For most "practical" applications of first-class constraints, we do not see such complications: the quotient space of the restricted subspace by the f-flows (in other words, the orbit space) is well behaved enough to act as a differentiable manifold, which can be turned into a symplectic manifold by projecting the symplectic form of M onto it (this can be shown to be well defined). In light of the observation about physical observables mentioned earlier, we can work with this more "physical" smaller symplectic manifold, but with 2n fewer dimensions.
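The dimension count can be written out as follows, assuming n irreducible first-class constraints and a finite-dimensional M. The relativistic particle used as the illustration lives in a spacetime whose dimension is called d here, a label introduced for this sketch.

```latex
\dim(\text{constrained subspace}) = \dim M - n, \qquad
\dim(\text{orbit}) = n, \qquad
\dim(\text{quotient}) = \dim M - 2n .
% Illustration: a reparametrization-invariant point particle in d spacetime dimensions
% has \dim M = 2d and a single first-class constraint (n = 1), leaving 2d - 2 = 2(d-1).
```

Each first-class constraint thus removes two dimensions: one by restricting to the constrained subspace and one by identifying points along its flow.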

In general, the quotient space is a bit "nasty" to work with when doing concrete calculations (not to mention nonlocal when working with diffeomorphism constraints), so what is usually done instead is something similar. Note that the restricted submanifold is a bundle (but not a fiber bundle in general) over the quotient manifold. So, instead of working with the quotient manifold, we can work with a section of the bundle instead. This is called gauge fixing.

The major problem is that this bundle might not have a global section in general. This is where the "problem" of global anomalies comes in, for example. See Gribov ambiguity. This is a flaw in the quantization of gauge theories that many physicists overlooked.

What have been described are irreducible first-class constraints. Another complication is that Δf might not be right invertible on subspaces of the restricted submanifold of codimension 1 or greater (which violates the stronger assumption stated earlier in this article). This happens, for example in the cotetrad formulation of general relativity, at the subspace of configurations where the cotetrad field and the connection form happen to be zero over some open subset of space. Here, the constraints are the diffeomorphism constraints.

One way around this is as follows. For reducible constraints, we relax the condition on the right invertibility of Δf to this one: any smooth function that vanishes at the zeros of f is the fiberwise contraction of f with a (non-unique) smooth section of a $\bar{V}$-vector bundle, where $\bar{V}$ is the dual vector space to the constraint vector space V. This is called the regularity condition.

Constrained Hamiltonian dynamics from a Lagrangian gauge theory

First of all, we will assume the action is the integral of a local Lagrangian that depends on the fields and at most their first derivatives. The analysis of more general cases, while possible, is more complicated. When going over to the Hamiltonian formalism, we find there are constraints. Recall that in the action formalism, there are on shell and off shell configurations. The constraints that hold off shell are called primary constraints while those that only hold on shell are called secondary constraints.

Examples

Look at the dynamics of a single point particle of mass m with no internal degrees of freedom moving in a pseudo-Riemannian spacetime manifold S with metric g. Assume also that the parameter τ describing the trajectory of the particle is arbitrary (i.e. we insist upon reparametrization invariance). Then, its symplectic space is the cotangent bundle T*S with the canonical symplectic form ω. If we coordinatize T*S by the position x in the base manifold S and the position p within the cotangent space, then we have a constraint

f = m² −g(x)⁻¹(p,p) = 0.

The Hamiltonian H is, surprisingly enough, H = 0. In light of the observation that the Hamiltonian is only defined up to the equivalence class of smooth functions agreeing on the constrained subspace, we can use a new Hamiltonian H'=f instead. Then, we have the interesting case where the Hamiltonian is the same as a constraint! See Hamiltonian constraint for more details.
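To see what evolution under H' = f looks like, here is a sketch for the special case of flat spacetime, where g is the constant Minkowski metric η (an assumption made only to keep the brackets short). The convention $\{x^\mu, p_\nu\} = \delta^\mu_\nu$ is also assumed.

```latex
\frac{dx^\mu}{d\tau} = \{x^\mu, f\} = -2\,\eta^{\mu\nu} p_\nu ,
\qquad
\frac{dp_\mu}{d\tau} = \{p_\mu, f\} = 0 ,
\qquad
\frac{df}{d\tau} = \{f, f\} = 0 .
```

The momentum is constant, the velocity is proportional to it, and the constraint is automatically preserved. Replacing H' = f by any multiple of f changes only the rate at which the same straight worldline is traversed, which is the reparametrization freedom in τ.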

Consider now the case of a Yang–Mills theory for a real simple Lie algebra L (with a negative definite Killing form η) minimally coupled to a real scalar field σ, which transforms as an orthogonal representation ρ with the underlying vector space V under L in (d − 1) + 1 Minkowski spacetime. For l in L, we write

ρ(l)[σ]

as

l[σ]

for simplicity. Let A be the L-valued connection form of the theory. Note that the A here differs from the A used by physicists by a factor of i and "g". This agrees with the mathematician's convention. The action S is given by

S[\bold{A},\sigma]=\int d^dx \frac{1}{4g^2}\eta((\bold{g}^{-1}\otimes \bold{g}^{-1})(\bold{F},\bold{F}))+\frac{1}{2}\alpha(\bold{g}^{-1}(D\sigma,D\sigma))

where g is the Minkowski metric, F is the curvature form

d\bold{A}+\bold{A}\wedge\bold{A}

(no factors of i or g!), where the second term is a formal shorthand for pretending the Lie bracket is a commutator, D is the covariant derivative

Dσ = dσ − A[σ]

and α is the orthogonal form for ρ.

The signs and numerical factors above are not guaranteed to be correct.

What is the Hamiltonian version of this model? First, we have to split A noncovariantly into a time component φ and a spatial part $\vec{A}$. Then, the resulting symplectic space has the conjugate variables σ, π_σ (taking values in the underlying vector space of $\bar{\rho}$, the dual representation of ρ), $\vec{A}$, $\vec{\pi}_A$, φ and π_φ. For each spatial point, we have the constraints π_φ=0 and the Gaussian constraint

\vec{D}\cdot\vec{\pi}_A-\rho'(\pi_\sigma,\sigma)=0

where since ρ is an intertwiner

\rho:L\otimes V\rightarrow V

ρ' is the dualized intertwiner

\rho':\bar{V}\otimes V\rightarrow L

(L is self-dual via η). The Hamiltonian,

H_f=\int d^{d-1}x \frac{1}{2}\alpha^{-1}(\pi_\sigma,\pi_\sigma)+\frac{1}{2}\alpha(\vec{D}\sigma\cdot\vec{D}\sigma)-\frac{g^2}{2}\eta(\vec{\pi}_A,\vec{\pi}_A)-\frac{1}{2g^2}\eta(\bold{B}\cdot \bold{B})-\eta(\pi_\phi,f)-\langle\pi_\sigma,\phi[\sigma]\rangle-\eta(\phi,\vec{D}\cdot\vec{\pi}_A).

The last two terms are a linear combination of the Gaussian constraints, and we have a whole family of (gauge equivalent) Hamiltonians parametrized by f. In fact, since the last three terms vanish for the constrained states, we can drop them.

Second class constraints

In a constrained Hamiltonian system, a dynamical quantity is second class if its Poisson bracket with at least one constraint is nonvanishing. A constraint that has a nonzero Poisson bracket with at least one other constraint, then, is a second class constraint.

See first class constraints or Dirac bracket for the preliminaries.

An example: a particle confined to a sphere

Before going on to the general theory, consider a specific example step by step to motivate the general analysis.

Start with the action describing a Newtonian particle of mass m constrained to a sphere of radius R within a uniform gravitational field g. When one works in Lagrangian mechanics, there are several ways to implement a constraint: one can switch to generalized coordinates that manifestly solve the constraint, or one can use a Lagrange multiplier while retaining the redundant coordinates so constrained.

In this case, the particle is constrained to a sphere, therefore the natural solution would be to use angular coordinates to describe the position of the particle instead of Cartesian and solve (automatically eliminate) the constraint in that way (the first choice). For pedagogical reasons, instead, consider the problem in Cartesian coordinates, with a Lagrange multiplier term enforcing the constraint.

The action is given by

S=\int dt L=\int dt \left[\frac{m}{2}(\dot{x}^2+\dot{y}^2+\dot{z}^2)-mgz+\frac{\lambda}{2}(x^2+y^2+z^2-R^2)\right]

where the last term is the Lagrange multiplier term enforcing the constraint.

Of course, as indicated, we could have just used different coordinates and written it as $S=\int dt \left[\frac{mR^2}{2}(\dot{\theta}^2+\sin^2(\theta)\dot{\phi}^2)+mgR\cos(\theta)\right]$ instead, without extra constraints, but we look at the former coordinatization to illustrate constraints.

The conjugate momenta are given by

p_x=m\dot{x}

p_y=m\dot{y}

p_z=m\dot{z}

p_\lambda=0

Note that we can't determine $\dot{\lambda}$ from the momenta.

The Hamiltonian is given by

H=\vec{p}\cdot\dot{\vec{r}}+p_\lambda \dot{\lambda}-L=\frac{p^2}{2m}+p_\lambda \dot{\lambda}+mgz-\frac{\lambda}{2}(r^2-R^2)

We cannot eliminate $\dot{\lambda}$ at this stage. Here we are treating $\dot{\lambda}$ as a shorthand for a function on the symplectic space which we have yet to determine, not as an independent variable. For notational consistency, define $u_1=\dot{\lambda}$ from now on. The above Hamiltonian with the $p_\lambda$ term is the "naive Hamiltonian". Note that since, on-shell, the constraint must be satisfied, one cannot distinguish, on-shell, between the naive Hamiltonian and the above Hamiltonian with the undetermined coefficient, $\dot{\lambda}=u_1$.

We have the primary constraint

p_\lambda=0

We require, on the grounds of consistency, that the Poisson bracket of all the constraints with the Hamiltonian vanish at the constrained subspace. In other words, the constraints must not evolve in time if they are going to be identically zero along the equations of motion.

From this consistency condition, we immediately get the secondary constraint

r^2-R^2=0
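The step from the primary to the secondary constraint can be written out explicitly. The sketch below uses the naive Hamiltonian given above and assumes the convention $\{q, p\}_{PB} = +1$, with $\approx$ denoting equality on the constrained subspace; the term $p_\lambda \dot{\lambda}$ contributes only something proportional to $p_\lambda$, which vanishes there.

```latex
\{p_\lambda,\, H\}_{PB} \approx -\frac{\partial H}{\partial \lambda}
= \frac{1}{2}\left(r^2 - R^2\right) .
```

Demanding that this vanish is exactly the condition $r^2 - R^2 = 0$.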

By the same reasoning, this constraint should be added into the Hamiltonian with an undetermined (not necessarily constant) coefficient $u_2$. At this point, the Hamiltonian is

H = \frac{p^2}{2m} + mgz - \frac{\lambda}{2}(r^2-R^2) + u_1 p_\lambda + u_2 (r^2-R^2)

And from the secondary constraint, we get the tertiary constraint, $\vec{p}\cdot\vec{r}=0$, by demanding, for consistency, that $\{r^2-R^2,\, H\}_{PB} = 0$ on-shell. Again, one should add this constraint into the Hamiltonian, since on-shell no one can tell the difference. Therefore, so far, the Hamiltonian looks like

H = \frac{p^2}{2m} + mgz - \frac{\lambda}{2}(r^2-R^2) + u_1 p_\lambda + u_2 (r^2-R^2) + u_3 \vec{p}\cdot\vec{r},

where $u_1$, $u_2$, and $u_3$ are still completely undetermined. Note that frequently all constraints that are found from consistency conditions are referred to as "secondary constraints" and secondary, tertiary, quaternary, etc. constraints are not distinguished.
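For reference, the bracket that produced the tertiary constraint can be evaluated term by term. This sketch uses the Hamiltonian containing $u_1$ and $u_2$ (before the $u_3$ term is added), assumes $\{r_i, p_j\}_{PB} = \delta_{ij}$, and writes $\approx$ for equality on the constrained subspace.

```latex
\left\{ r^2 - R^2,\ \frac{p^2}{2m} \right\}_{PB} = \frac{2}{m}\,\vec{r}\cdot\vec{p},
\qquad
\left\{ r^2 - R^2,\ mgz - \frac{\lambda}{2}(r^2 - R^2) \right\}_{PB} = 0,
\qquad
\left\{ r^2 - R^2,\ u_1 p_\lambda + u_2 (r^2 - R^2) \right\}_{PB} \approx 0 ,
```

so that $\{r^2 - R^2,\, H\}_{PB} \approx \frac{2}{m}\,\vec{r}\cdot\vec{p}$. The last bracket vanishes on the constrained subspace because every surviving term is multiplied by $p_\lambda$ or by $r^2 - R^2$.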

The tertiary constraint's consistency condition yields

\{\vec{p}\cdot\vec{r},\, H\}_{PB} = \frac{p^2}{m} - mgz+ \lambda r^2 -2 u_2 r^2 = 0.

This is not a quaternary constraint, but a condition which fixes one of the undetermined coefficients. In particular, it fixes

u_2 = \frac{\lambda}{2} + \frac{1}{r^2}\left(\frac{p^2}{2m}-\frac{1}{2}mgz \right).
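The four terms in the consistency condition above can be traced back to individual pieces of the Hamiltonian. The following term-by-term evaluation is a sketch under the convention $\{r_i, p_j\}_{PB} = \delta_{ij}$, with $\approx$ denoting equality on the constrained subspace.

```latex
\left\{ \vec{p}\cdot\vec{r},\ \frac{p^2}{2m} \right\}_{PB} = \frac{p^2}{m},
\qquad
\left\{ \vec{p}\cdot\vec{r},\ mgz \right\}_{PB} = -mgz,
\qquad
\left\{ \vec{p}\cdot\vec{r},\ -\frac{\lambda}{2}(r^2 - R^2) \right\}_{PB} = \lambda r^2,

\left\{ \vec{p}\cdot\vec{r},\ u_2 (r^2 - R^2) \right\}_{PB} \approx -2 u_2 r^2,
\qquad
\left\{ \vec{p}\cdot\vec{r},\ u_1 p_\lambda + u_3\, \vec{p}\cdot\vec{r} \right\}_{PB} \approx 0 .
```

Summing these gives $\frac{p^2}{m} - mgz + \lambda r^2 - 2u_2 r^2$, and solving for $u_2$ reproduces the expression above.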

Now that there are new terms in the Hamiltonian, one should go back and check the consistency conditions for the primary and secondary constraints. The secondary constraint's consistency condition gives

\frac{2}{m}\vec{r}\cdot\vec{p} + 2 u_3 r^2 = 0.

Again, this is not a new constraint; it only determines that

u_3 = -\frac{\vec{r}\cdot\vec{p}}{m r^2}.

At this point there are no more constraints or consistency conditions to check.

Putting it all together,

H=\left(2-\frac{R^2}{r^2}\right)\frac{p^2}{2m} + \frac{1}{2}\left(1+\frac{R^2}{r^2}\right)mgz - \frac{(\vec{r}\cdot\vec{p})^2}{mr^2} + u_1 p_\lambda
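As a sanity check on this expression, the sketch below evaluates the brackets of the three constraints with the final Hamiltonian at a sample point on the constrained subspace, using finite differences. The numerical values of m, g, R and $u_1$, the sample point, and the helper `poisson_bracket` are all assumptions chosen for illustration; $u_1$ is taken to be a constant, which is only one possible choice.

```python
def poisson_bracket(A, B, q, p, h=1e-5):
    """Central-difference estimate of {A, B}, with the convention {q_i, p_j} = delta_ij."""
    def d_dq(F, i):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        return (F(qp, p) - F(qm, p)) / (2 * h)

    def d_dp(F, i):
        pp, pm = list(p), list(p)
        pp[i] += h
        pm[i] -= h
        return (F(q, pp) - F(q, pm)) / (2 * h)

    return sum(d_dq(A, i) * d_dp(B, i) - d_dp(A, i) * d_dq(B, i)
               for i in range(len(q)))


# Illustrative parameter values (assumptions, not from the article).
M, G, R, U1 = 1.5, 9.8, 2.0, 0.3

def pieces(q, p):
    """Return r^2, p^2 and r.p for q = (x, y, z, lam), p = (px, py, pz, plam)."""
    r2 = q[0] ** 2 + q[1] ** 2 + q[2] ** 2
    p2 = p[0] ** 2 + p[1] ** 2 + p[2] ** 2
    rp = q[0] * p[0] + q[1] * p[1] + q[2] * p[2]
    return r2, p2, rp

def H_final(q, p):
    """The Hamiltonian obtained after fixing u_2 and u_3."""
    r2, p2, rp = pieces(q, p)
    return ((2 - R ** 2 / r2) * p2 / (2 * M)
            + 0.5 * (1 + R ** 2 / r2) * M * G * q[2]
            - rp ** 2 / (M * r2)
            + U1 * p[3])

def H_naive(q, p):
    """The naive Hamiltonian without the p_lambda term (which vanishes on-shell)."""
    r2, p2, _ = pieces(q, p)
    return p2 / (2 * M) + M * G * q[2] - 0.5 * q[3] * (r2 - R ** 2)

def phi1(q, p):
    return p[3]                     # p_lambda

def phi2(q, p):
    return pieces(q, p)[0] - R ** 2  # r^2 - R^2

def phi3(q, p):
    return pieces(q, p)[2]           # p . r


# A point on the constrained subspace: |r| = R, r.p = 0, p_lambda = 0.
q_on = [1.2, 0.0, 1.6, 0.4]
p_on = [0.8, 0.7, -0.6, 0.0]

drift = [poisson_bracket(phi, H_final, q_on, p_on) for phi in (phi1, phi2, phi3)]
energy_gap = H_final(q_on, p_on) - H_naive(q_on, p_on)
print(drift, energy_gap)
```

All three brackets come out zero to within the finite-difference error, so none of the constraints drifts under this Hamiltonian, and its value agrees with the naive Hamiltonian on the constrained subspace.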

When finding the equations of motion, one should use the above Hamiltonian, and as long as one is careful to never use constraints before taking derivatives in the Poisson bracket then one gets the correct equations of motion. That is, the equations of motion are given by

\dot{\vec{r}} = \{\vec{r}, \, H\}_{PB}, \quad \dot{\vec{p}} = \{ \vec{p},\, H\}_{PB}, \quad \dot{\lambda} = \{ \lambda,\, H\}_{PB}, \quad \dot{p}_\lambda = \{ p_\lambda, H\}_{PB}.
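Carrying out these brackets with the Hamiltonian above and only afterwards imposing the constraints gives, as a sketch (with $\hat{z}$ the unit vector along the z axis, $\approx$ denoting equality on the constrained subspace, and the convention $\{r_i, p_j\}_{PB} = \delta_{ij}$ assumed):

```latex
\dot{\vec{r}} \approx \frac{\vec{p}}{m},
\qquad
\dot{\vec{p}} \approx -mg\,\hat{z} - \frac{1}{R^2}\left( \frac{p^2}{m} - mgz \right)\vec{r},
\qquad
\dot{\lambda} \approx u_1,
\qquad
\dot{p}_\lambda \approx 0 .
```

The term proportional to $\vec{r}$ is the constraint force: it supplies the centripetal acceleration and cancels the radial component of gravity. One can check that $\frac{d}{dt}(\vec{r}\cdot\vec{p}) \approx \frac{p^2}{m} + \vec{r}\cdot\dot{\vec{p}} = 0$, as consistency requires.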

Before analyzing the Hamiltonian, consider the three constraints:

\phi_1 = p_\lambda, \quad \phi_2 = r^2-R^2, \quad \phi_3 = \vec{p}\cdot\vec{r}.

Notice the nontrivial Poisson bracket structure of the constraints. In particular,

\{\phi_2, \phi_3\} = 2 r^2 \neq 0.

The above Poisson bracket does not just fail to vanish off-shell, which might be anticipated, but even on-shell it is nonzero. Therefore, $\phi_2$ and $\phi_3$ are second class constraints while $\phi_1$ is a first class constraint. Note that these constraints satisfy the regularity condition.
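For completeness, all three pairwise brackets can be evaluated directly from the definitions, assuming the convention $\{r_i, p_j\}_{PB} = \delta_{ij}$ and $\{\lambda, p_\lambda\}_{PB} = 1$:

```latex
\{\phi_1, \phi_2\} = -\frac{\partial (r^2 - R^2)}{\partial \lambda} = 0,
\qquad
\{\phi_1, \phi_3\} = -\frac{\partial (\vec{p}\cdot\vec{r})}{\partial \lambda} = 0,
\qquad
\{\phi_2, \phi_3\} = \sum_i \frac{\partial (r^2)}{\partial r_i}\,\frac{\partial (\vec{p}\cdot\vec{r})}{\partial p_i} = 2 r^2 .
```

Since neither $\phi_2$ nor $\phi_3$ depends on $\lambda$ or $p_\lambda$, the constraint $\phi_1$ has vanishing bracket with both, while $\{\phi_2,\phi_3\}$ equals $2R^2$ on the constrained subspace.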

Here, we have a symplectic space where the Poisson bracket does not have "nice properties" on the constrained subspace. But Dirac noticed that we can turn the underlying differential manifold of the symplectic space into a Poisson manifold using a different bracket, called the Dirac bracket, such that the Dirac bracket of any (smooth) function with any of the second class constraints always vanishes and a couple of other nice properties.
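A numerical sketch of this construction for the sphere example is given below. It assumes the usual form of the Dirac bracket, $\{A,B\}_D = \{A,B\} - \sum_{a,b}\{A,\phi_a\}\,(C^{-1})^{ab}\,\{\phi_b,B\}$ with $C_{ab}=\{\phi_a,\phi_b\}$ taken over the second class constraints only, so $\lambda$ and $p_\lambda$ are left out. The radius, the sample point, the test function `A` and the helper names are assumptions made for illustration.

```python
def poisson_bracket(A, B, q, p, h=1e-5):
    """Central-difference estimate of {A, B}, with the convention {q_i, p_j} = delta_ij."""
    def d_dq(F, i):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        return (F(qp, p) - F(qm, p)) / (2 * h)

    def d_dp(F, i):
        pp, pm = list(p), list(p)
        pp[i] += h
        pm[i] -= h
        return (F(q, pp) - F(q, pm)) / (2 * h)

    return sum(d_dq(A, i) * d_dp(B, i) - d_dp(A, i) * d_dq(B, i)
               for i in range(len(q)))


R = 2.0  # illustrative radius (assumption)

def phi2(q, p):
    return q[0] ** 2 + q[1] ** 2 + q[2] ** 2 - R ** 2   # r^2 - R^2

def phi3(q, p):
    return q[0] * p[0] + q[1] * p[1] + q[2] * p[2]      # p . r


def dirac_bracket(A, B, q, p):
    """Dirac bracket for the two second class constraints phi2, phi3.

    With C = [[0, c], [-c, 0]] and c = {phi2, phi3}, the inverse is
    [[0, -1/c], [1/c, 0]], which gives the closed form used below.
    """
    c = poisson_bracket(phi2, phi3, q, p)
    a2 = poisson_bracket(A, phi2, q, p)
    a3 = poisson_bracket(A, phi3, q, p)
    b2 = poisson_bracket(phi2, B, q, p)
    b3 = poisson_bracket(phi3, B, q, p)
    return poisson_bracket(A, B, q, p) + (a2 * b3 - a3 * b2) / c


def x(q, p):
    return q[0]

def px(q, p):
    return p[0]

def A(q, p):
    return q[0] * p[2] + q[2] ** 2   # an arbitrary test function


q_on, p_on = [1.2, 0.0, 1.6], [0.8, 0.7, -0.6]   # |r| = R and r.p = 0

c23 = poisson_bracket(phi2, phi3, q_on, p_on)    # should equal 2 r^2
x_px = dirac_bracket(x, px, q_on, p_on)          # should equal 1 - x^2 / r^2
a_phi2 = dirac_bracket(A, phi2, q_on, p_on)      # should vanish
a_phi3 = dirac_bracket(A, phi3, q_on, p_on)      # should vanish
print(c23, x_px, a_phi2, a_phi3)
```

The Dirac bracket of the test function with either second class constraint vanishes, while the canonical pair picks up a position-dependent correction.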

If one wants to canonically quantize this system, then one needs to promote the canonical Dirac brackets,^[3] not the canonical Poisson brackets, to commutation relations.
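For the sphere example, the canonical Dirac brackets can be worked out in closed form. The following is a sketch assuming the standard Dirac bracket built from $\phi_2$ and $\phi_3$, the bracket $\{\phi_2,\phi_3\}=2r^2$ quoted above, and the convention $\{r_i,p_j\}_{PB}=\delta_{ij}$:

```latex
\{r_i, r_j\}_{D} = 0,
\qquad
\{r_i, p_j\}_{D} = \delta_{ij} - \frac{r_i r_j}{r^2},
\qquad
\{p_i, p_j\}_{D} = \frac{p_i r_j - r_i p_j}{r^2} .
```

The middle bracket is the projector onto directions tangent to the sphere, so only the tangential components of the momentum remain conjugate to the position.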

Examination of the above Hamiltonian shows a number of interesting things happening. One thing to note is that on-shell when the constraints are satisfied the extended Hamiltonian is identical to the naive Hamiltonian, as required. Also, note that $\lambda$ dropped out of the extended Hamiltonian. Since $\phi_1$ is a first class primary constraint it should be interpreted as a generator of a gauge transformation. The gauge freedom is the freedom to choose $\lambda$ which has ceased to have any effect on the particle's dynamics. Therefore, that $\lambda$ dropped out of the Hamiltonian, that $u_1$ is undetermined, and that $\phi_1 = p_\lambda$ is first class, are all closely interrelated.

Note that it would be more natural not to start with a Lagrangian with a Lagrange multiplier, but instead to take $r^2-R^2$ as a primary constraint and proceed through the formalism. The result would be the elimination of the extraneous dynamical quantity $\lambda$. Perhaps the example is more edifying in its current form.