Covariance and contravariance

This page does not deal with the statistical concept of covariance of random variables, nor with the computer science concept of covariance and contravariance (computer science).

In mathematics and theoretical physics, covariance and contravariance are concepts used in many areas, generalizing in a sense invariance, i.e., the property of being unchanged under some transformation. In mathematical terms, they occur in a foundational way in linear algebra and multilinear algebra, differential geometry and other branches of geometry, category theory and algebraic topology. In physics they are important to the treatment of vectors and other quantities, such as tensors, that have physical meaning but are not scalars. Both special relativity (Lorentz covariance) and general relativity (general covariance) use covariant basis vectors.

In very general terms, duality interchanges covariance and contravariance, which is why these concepts occur together. For purposes of practical computation using matrices, the transpose relates two aspects (for example two sets of simultaneous equations). The case of a square matrix for which the transpose is also the inverse matrix, that is, an orthogonal matrix, is one in which covariance and contravariance can typically be treated on the same footing. This is of basic importance in the practical application of tensors.

A major potential cause of confusion is that this covariance/contravariance duality intervenes every time a vector or tensor quantity is represented by its components. As a result, the mathematics and physics literature often appear to use opposite conventions. It is not the convention that differs, but whether an intrinsic or a component-wise description is taken as the primary way of thinking about quantities. As the names suggest, covariant quantities are thought of as moving or transforming forwards, while contravariant quantities transform backwards; which is which depends on whether one refers quantities to a fixed background, a choice that switches the point of view.

Informal usage

In common physics usage, the adjective covariant may sometimes be used informally as a synonym for invariant (or equivariant, in mathematicians' terms). For example, the Schrödinger equation does not keep its written form under the coordinate transformations of special relativity; thus one might say that the Schrödinger equation is not covariant. By contrast, the Klein–Gordon equation and the Dirac equation take the same form in any coordinate frame of special relativity: thus, one might say that these equations are covariant. More properly, one should say that the Klein–Gordon and Dirac equations are invariant, and that the Schrödinger equation is not, but this is not the dominant usage. Note also that neither the Klein–Gordon equation nor the Dirac equation is invariant under the transformations of general relativity (nor covariant in any sense), so proper usage should indicate with respect to which transformations the invariance holds.

Similar informal usage is sometimes seen with respect to quantities like mass and time in general relativity: mass is technically a component of the four-momentum or the energy-momentum tensor, but one might occasionally see language referring to the covariant mass, meaning the length of the momentum four-vector.

Rules of covariant and contravariant transformation

In the tensor representation, a vector \mathbf{A} can be expanded in two ways, as the sum of its components times the corresponding basis vectors (repeated indices are summed according to the Einstein summation convention):

\mathbf{A}=a^i \mathbf{e}_i=a_i \mathbf{e}^i

where a^i are called the contravariant components of \mathbf{A}, a_i are called the covariant components of \mathbf{A}, \mathbf{e}_i are covariant basis vectors, and \mathbf{e}^i are contravariant basis vectors if and only if these transform from coordinates x'^i to coordinates x^i (where the x^i are differentiable functions of the x'^i, and vice versa) according to the rules:

a^i=a'^j {\partial x^i \over \partial x'^j },
a_i=a'_j {\partial x'^j \over \partial x^i },
\mathbf{e}^i=\mathbf{e'}^j {\partial x^i \over \partial x'^j },
\mathbf{e}_i=\mathbf{e'}_j {\partial x'^j \over \partial x^i },

where the primed components and basis vectors represent \mathbf{A} in the coordinates x'^i:

\mathbf{A}=a'^i \mathbf{e'}_i=a'_i \mathbf{e'}^i.
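
These rules can be checked numerically. The sketch below is not part of the article's exposition; all matrix and vector values in it are arbitrary illustrative assumptions. It takes a linear change of coordinates in 2D, so that the Jacobian \partial x^i / \partial x'^j is a constant matrix, and verifies that components and basis vectors transforming by the rules above reproduce the same vector \mathbf{A}:

```python
import numpy as np

# Hedged sketch: a LINEAR change of coordinates x = F x', so the Jacobian
# dx^i/dx'^j = F[i, j] is constant. All numerical values are arbitrary.
F = np.array([[2.0, 1.0],
              [0.5, 3.0]])            # F[i, j] = dx^i/dx'^j
B = np.linalg.inv(F)                  # B[j, i] = dx'^j/dx^i

# Primed covariant basis vectors as rows (non-orthogonal, non-unit):
e_p = np.array([[1.0, 0.2],           # e'_1
                [0.1, 1.5]])          # e'_2
E_p = np.linalg.inv(e_p).T            # rows are the dual basis e'^1, e'^2

A = np.array([1.0, 2.0])              # a fixed vector (Cartesian picture)
a_con_p = E_p @ A                     # a'^i = A . e'^i
a_cov_p = e_p @ A                     # a'_i = A . e'_i

# The four transformation rules, written as matrix products:
a_con = F @ a_con_p                   # a^i = a'^j dx^i/dx'^j
a_cov = B.T @ a_cov_p                 # a_i = a'_j dx'^j/dx^i
e_new = B.T @ e_p                     # e_i = e'_j dx'^j/dx^i
E_new = F @ E_p                       # e^i = e'^j dx^i/dx'^j

# Invariance: both expansions still reproduce the same vector A.
assert np.allclose(e_new.T @ a_con, A)
assert np.allclose(E_new.T @ a_cov, A)
```

The two assertions at the end anticipate the invariance property discussed below: components and basis vectors transform oppositely, so their contraction is unchanged.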

We could also compute the inverse relations:

a'^i=a^j {\partial x'^i \over \partial x^j },
a'_i=a_j {\partial x^j \over \partial x'^i },
\mathbf{e'}^i=\mathbf{e}^j {\partial x'^i \over \partial x^j },
\mathbf{e'}_i=\mathbf{e}_j {\partial x^j \over \partial x'^i },

which is only possible if the determinants of the matrices formed by the components \partial x^i / \partial x'^j and \partial x'^j / \partial x^i are non-zero. The determinant of the matrix formed by \partial x^i / \partial x'^j is called the Jacobian J of the transformation, which must be non-zero for the set of transformation laws to be complete.

Note that the matrices formed by all of the above partial derivative transformations can be generated as the inverse, the transpose, and the transpose of the inverse of the matrix formed by the components \partial x^i / \partial x'^j. The key property of the tensor representation is the preservation of invariance, in the sense that vector components which transform in a covariant manner (or contravariant manner) are paired with basis vectors that transform in a contravariant manner (or covariant manner, respectively), and these operations are inverse to one another according to the transformation rules. Substituting the transformation rules into the definition of \mathbf{A} gives:

\mathbf{A}=a^i \mathbf{e}_i=\left(a'^j {\partial x^i \over \partial x'^j }\right)\left(\mathbf{e'}_k {\partial x'^k \over \partial x^i }\right)=a'^j \delta^k_j \mathbf{e'}_k=a'^j\mathbf{e'}_j

where the partial derivative factors contract to the Kronecker delta \delta^k_j, since the two Jacobian matrices are inverse to one another. This illustrates what is meant by invariance. A similar relation holds for all vectors (and higher-order tensors), allowing them to be written in the manner described above. The transformation rules can also be used to show that \mathbf{e}^i\cdot\mathbf{e}_j =\delta^i_j, where \delta^i_j is 1 if i = j and 0 otherwise.
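
The duality relation at the end of the last paragraph is easy to verify in a concrete system. A minimal sketch, assuming an arbitrary non-orthogonal 2D basis chosen purely for illustration:

```python
import numpy as np

# Minimal sketch: duality of the two bases in one fixed, non-orthogonal 2D system.
e = np.array([[1.0, 0.0],     # e_1
              [1.0, 1.0]])    # e_2 (not orthogonal to e_1, not of unit length)
E = np.linalg.inv(e).T        # rows are the dual basis e^1, e^2

print(E)              # e^1 = (1, -1): neither unit length nor parallel to e_1
print(E @ e.T)        # identity matrix, i.e. e^i . e_j = delta^i_j
```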

Note that in this kind of system the basis vectors are not generally of unit length, nor are covariant basis vectors necessarily parallel to their contravariant basis vectors (if the coordinates are non-orthogonal).

 Illustration of the contravariant and covariant representation of vectors in a 2D curvilinear, non-orthogonal grid

The above figure illustrates how the contravariant and covariant representations would be plotted in terms of components on a 2D curvilinear non-orthogonal grid for a generic vector \mathbf{A}. Note that either pair of component vectors sums to the same vector \mathbf{A}. Also note that the covariant basis vectors are parallel to their respective coordinate lines, while the contravariant basis vectors are orthogonal to the directions of the other coordinate lines.

There are many other useful properties of the tensor representation. If we take the dot product of \mathbf{A}=a^i \mathbf{e}_i=a_k \mathbf{e}^k and \mathbf{e}_j then we obtain:

a_j=a^i g_{ij}=a^i(\mathbf{e}_i\cdot \mathbf{e}_j)

where g_{ij} is the covariant metric tensor. The dot product of \mathbf{A}=a^k \mathbf{e}_k=a_j \mathbf{e}^j and \mathbf{e}^i likewise gives:

a^i=a_j g^{ij}=a_j(\mathbf{e}^i\cdot \mathbf{e}^j)

where g^{ij} is the contravariant metric tensor. This gives two useful results: 1) the covariant (or contravariant) components of a vector can be recovered by taking the dot product of that vector with the covariant (or contravariant) basis vectors, and 2) the covariant and contravariant components are related by the metric tensor. We note in passing that the covariant and contravariant basis vectors are also related to one another by the metric tensor, and that the above relations require that g_{ij} and g^{ij} are inverse to one another.
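
Both results can be illustrated with a short numerical sketch (the basis and the vector are arbitrary assumptions, not from the text):

```python
import numpy as np

# Sketch: metric tensor of a non-orthogonal 2D basis and index raising/lowering.
e = np.array([[1.0, 0.0],        # e_1
              [1.0, 1.0]])       # e_2
g_cov = e @ e.T                  # g_ij = e_i . e_j   (covariant metric)
g_con = np.linalg.inv(g_cov)     # g^ij: inverse of g_ij, as required above

E = np.linalg.inv(e).T           # dual basis rows e^1, e^2
assert np.allclose(g_con, E @ E.T)   # g^ij = e^i . e^j

A = np.array([2.0, 0.5])         # an arbitrary vector (Cartesian picture)
a_con = E @ A                    # a^i = A . e^i
a_cov = e @ A                    # a_i = A . e_i

assert np.allclose(a_cov, g_cov @ a_con)   # a_j = g_ij a^i  (lowering)
assert np.allclose(a_con, g_con @ a_cov)   # a^i = g^ij a_j  (raising)
```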

We note that the tensor representation is not restricted to vectors, but can be used on higher-order tensors where each covariant or contravariant component transforms individually according to the rules described above. For example, we could transform a so-called mixed tensor of the form:

b^i_j = {b'}^k_l {\partial x^i \over \partial x'^k} {\partial x'^l \over \partial x^j}

by successively applying the transformation rules to each index according to whether it is covariant (lowered) or contravariant (raised).
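
A compact sketch of this mixed-tensor rule for a rank-(1,1) tensor in 2D follows; the Jacobian and the tensor components are arbitrary illustrative values:

```python
import numpy as np

# Sketch: b^i_j = b'^k_l (dx^i/dx'^k)(dx'^l/dx^j), one Jacobian factor per index.
F = np.array([[2.0, 1.0],
              [0.5, 3.0]])       # F[i, k] = dx^i/dx'^k  (contravariant index)
B = np.linalg.inv(F)             # B[l, j] = dx'^l/dx^j  (covariant index)

b_p = np.array([[1.0, 2.0],
                [3.0, 4.0]])     # b'^k_l, components in the primed system

b = np.einsum('ik,kl,lj->ij', F, b_p, B)
assert np.allclose(b, F @ b_p @ B)   # same contraction as matrix products
```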

Example: contravariant and covariant basis vectors in Euclidean R^3

If \mathbf{e}^1, \mathbf{e}^2, \mathbf{e}^3 are contravariant basis vectors of R^3 (not necessarily orthogonal nor of unit norm) then the covariant basis vectors of their reciprocal system are:

\mathbf{e}_1 = \frac{\mathbf{e}^2 \times \mathbf{e}^3}{\mathbf{e}^1 \cdot (\mathbf{e}^2 \times \mathbf{e}^3)} ; \qquad \mathbf{e}_2 = \frac{\mathbf{e}^3 \times \mathbf{e}^1}{\mathbf{e}^2 \cdot (\mathbf{e}^3 \times \mathbf{e}^1)}; \qquad \mathbf{e}_3 = \frac{\mathbf{e}^1 \times \mathbf{e}^2}{\mathbf{e}^3 \cdot (\mathbf{e}^1 \times \mathbf{e}^2)}.

Note that even if the \mathbf{e}^i and \mathbf{e}_i are not orthonormal, the two sets are by this definition still mutually dual, in the sense that:

\mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j.

Then the contravariant coordinates of any vector v can be obtained by the dot product of v with the contravariant basis vectors:

q^1 = \mathbf{v \cdot e^1}; \qquad q^2 = \mathbf{v \cdot e^2}; \qquad q^3 = \mathbf{v \cdot e^3}.

Likewise, the covariant components of v can be obtained from the dot product of v with covariant basis vectors, viz.

q_1 = \mathbf{v \cdot e_1}; \qquad q_2 = \mathbf{v \cdot e_2}; \qquad q_3 = \mathbf{v \cdot e_3}.

Then v can be expressed in two (reciprocal) ways, viz.

\mathbf{v} = q_i \mathbf{e}^i = q_1 \mathbf{e}^1 + q_2 \mathbf{e}^2 + q_3 \mathbf{e}^3

or

\mathbf{v} = q^i \mathbf{e}_i = q^1 \mathbf{e}_1 + q^2 \mathbf{e}_2 + q^3 \mathbf{e}_3.
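
The whole example can be verified numerically. In the sketch below the contravariant basis vectors are arbitrary illustrative values; any non-coplanar triple would do:

```python
import numpy as np

# Sketch of the reciprocal-system construction in R^3.
E = np.array([[1.0, 0.0, 0.0],            # e^1
              [1.0, 1.0, 0.0],            # e^2
              [0.0, 1.0, 1.0]])           # e^3 (rows; neither orthogonal nor unit)

vol = E[0] @ np.cross(E[1], E[2])         # e^1 . (e^2 x e^3); the three
e = np.array([np.cross(E[1], E[2]),       # denominators above are all equal
              np.cross(E[2], E[0]),       # to this scalar triple product
              np.cross(E[0], E[1])]) / vol

assert np.allclose(E @ e.T, np.eye(3))    # e^i . e_j = delta^i_j

v = np.array([2.0, -1.0, 0.5])            # any vector
q_con = E @ v                             # q^i = v . e^i
q_cov = e @ v                             # q_i = v . e_i

assert np.allclose(e.T @ q_con, v)        # v = q^i e_i
assert np.allclose(E.T @ q_cov, v)        # v = q_i e^i
```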

The indices of covariant coordinates, vectors, and tensors are subscripts. If the contravariant basis vectors are orthonormal then they coincide with the covariant basis vectors, so there is no need to distinguish between covariant and contravariant coordinates, and all indices can be written as subscripts.

What 'contravariant' means

Contravariant is a mathematical term with a precise definition in tensor analysis. It specifies the method (the direction of projection) used to derive the components of a tensor quantity by projecting it onto the coordinate system being used as the basis of the tensor.

A different method is used to derive covariant tensor components. When performing tensor transformations, it is critical to track which method was used to map onto the coordinate systems in use, so that operations may be applied correctly and yield accurate, meaningful results.

In two dimensions, for an oblique rectilinear coordinate system, the contravariant coordinates of a directed line segment (in two dimensions this is termed a vector) can be established by placing the origin of the coordinate axes at the tail of the vector. Lines are then drawn through the head of the vector parallel to each axis. The intersection of the line parallel to the x^1 axis with the x^2 axis provides the x^2 coordinate; similarly, the intersection of the line parallel to the x^2 axis with the x^1 axis provides the x^1 coordinate.

 contravariant coordinates

By definition, the oblique, rectilinear, contravariant coordinates of the point P above are summarized as: x^i = (x^1, x^2)

Notice the superscript; this is a standard nomenclature convention for contravariant tensor components and should not be confused with the subscript, which is used to designate covariant tensor components.

Is there a fundamental difference in the way contravariant and covariant components can be used, or could one simply interchange them everywhere? The answer is that in curved spaces, or in curved coordinate systems in flat space (e.g. cylindrical coordinates in Euclidean space), the quantity dx^i is a perfect differential that can be immediately integrated to yield x^i, whilst the covariant components of the same differential, dx_i, are not in general perfect differentials; the integrated change depends on the path. In the example of cylindrical coordinates, the radial and z components are the same in covariant and contravariant form, but the covariant component of the differential of the angle \varphi round the z axis is r^2\,d\varphi, and its integral depends on the path.
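
A numerical sketch of this path dependence (the endpoints and the two paths are arbitrary choices for illustration): both paths run from (r = 1, φ = 0) to (r = 2, φ = π/2), and the integral of dφ agrees on the two paths while the integral of the covariant component r²dφ does not:

```python
import numpy as np

# Sketch: in polar coordinates, d(phi) integrates path-independently,
# while the covariant component r^2 d(phi) does not.
t = np.linspace(0.0, 1.0, 200001)

def line_integrals(r, phi):
    """Return (integral of d(phi), integral of r^2 d(phi)) along a sampled path."""
    dphi = np.diff(phi)
    r_mid = 0.5 * (r[:-1] + r[1:])          # midpoint rule
    return dphi.sum(), (r_mid**2 * dphi).sum()

# Path A: sweep the angle at r = 1, then move radially outward.
rA   = np.concatenate([np.ones_like(t), 1.0 + t])
phiA = np.concatenate([t * np.pi / 2, np.full_like(t, np.pi / 2)])
# Path B: move radially outward at phi = 0, then sweep the angle at r = 2.
rB   = np.concatenate([1.0 + t, np.full_like(t, 2.0)])
phiB = np.concatenate([np.zeros_like(t), t * np.pi / 2])

print(line_integrals(rA, phiA))   # approx (pi/2, pi/2):  angle swept at r = 1
print(line_integrals(rB, phiB))   # approx (pi/2, 2*pi):  angle swept at r = 2
```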

Using the definition above, the contravariant components of a position vector v^i, where i ∈ {1, 2}, can be defined as the differences between the coordinates (or position vectors) of the head and tail along the same coordinate axis. Stated another way, each vector component is the projection onto one axis along the direction parallel to the other axis.

So, since we have placed our origin at the tail of the vector,

v^i = ( (x^1 − 0), (x^2 − 0) )
v^i = (x^1, x^2)
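
This construction is easy to reproduce numerically. In the sketch below, the oblique axis directions and the vector are arbitrary illustrative choices; parallel projection gives the contravariant coordinates, and perpendicular projection gives the covariant components for contrast:

```python
import numpy as np

# Sketch: contravariant coordinates in an oblique rectilinear 2D system are
# obtained by PARALLEL projection, i.e. by solving v = x^1 e_1 + x^2 e_2.
e1 = np.array([1.0, 0.0])                              # direction of the x^1 axis
e2 = np.array([np.cos(np.pi / 3), np.sin(np.pi / 3)])  # x^2 axis at 60 degrees

v = np.array([2.0, 1.0])        # head of the vector, tail at the origin

# The "line through the head parallel to the other axis" construction is
# exactly the solution of this linear system:
x_con = np.linalg.solve(np.column_stack([e1, e2]), v)  # (x^1, x^2)

# For contrast, the covariant components are perpendicular projections:
x_cov = np.array([v @ e1, v @ e2])                     # (x_1, x_2)

print(x_con, x_cov)   # they differ whenever the axes are oblique
```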

This result generalizes to n dimensions. Contravariance is a fundamental concept or property within tensor theory and applies to tensors of all ranks over all manifolds. Since whether tensor components are contravariant or covariant, how they are mixed, and the order of operations all affect the results, it is imperative to track these distinctions for correct application of the methods.

In more modern terms, the transformation properties of the covariant indices of a tensor are given by a pullback; by contrast, the transformation of the contravariant indices is given by a pushforward (differential).

Use in tensor analysis

In tensor analysis, a covariant vector varies more or less reciprocally to a corresponding contravariant vector. Expressions for lengths, areas and volumes of objects in the vector space can then be given in terms of tensors with covariant and contravariant indices. Under simple expansions and contractions of the coordinates, the reciprocity is exact; under affine transformations the components of a vector intermingle on going between covariant and contravariant expression.

On a manifold, a tensor field will typically have multiple indices, of two sorts. By a widely followed convention (including Wikipedia), covariant indices are written as lower indices, whereas contravariant indices are upper indices. When the manifold is equipped with a metric, covariant and contravariant indices become very closely related to one another. Contravariant indices can be turned into covariant indices by contracting with the metric tensor, and covariant indices can be turned back into contravariant indices by contracting with the (matrix) inverse of the metric tensor. Note that, in general, no such relation exists in spaces not endowed with a metric tensor. Furthermore, from a more abstract standpoint, a tensor is simply "there" and its components of either kind are only calculational artifacts whose values depend on the chosen coordinates.

The explanation in geometric terms is that a general tensor will have contravariant indices as well as covariant indices, because it has parts that live in the tangent bundle as well as the cotangent bundle.

A contravariant vector is one which transforms like \frac{dx^{\mu}}{d\tau}, where x^{\mu} are the coordinates of a particle at its proper time \tau. A covariant vector is one which transforms like \frac{\partial \phi}{\partial x^{\mu}}, where \phi is a scalar field.
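
Both defining behaviours can be checked symbolically. In the following sketch, the particle path, the scalar field, and the use of an ordinary parameter t in place of the proper time τ are all illustrative assumptions:

```python
import sympy as sp

# Sketch: polar (primed) -> Cartesian (unprimed) in 2D.
r, th, t = sp.symbols('r theta t', positive=True)
x = r * sp.cos(th)
y = r * sp.sin(th)
J = sp.Matrix([[sp.diff(x, r), sp.diff(x, th)],
               [sp.diff(y, r), sp.diff(y, th)]])   # J[i, j] = dx^i/dx'^j

# Contravariant: a particle path given in polar coordinates (t stands in
# for the proper time tau of the text).
r_t, th_t = 1 + t, t
v_polar = sp.Matrix([sp.diff(r_t, t), sp.diff(th_t, t)])      # v'^j
v_cart = sp.simplify(J.subs({r: r_t, th: th_t}) * v_polar)    # v^i = (dx^i/dx'^j) v'^j

# Covariant: the gradient of a scalar field transforms with the
# inverse-transpose Jacobian, i.e. a_i = a'_j dx'^j/dx^i.
phi = x**2 + y                                                # phi in polar variables
grad_polar = sp.Matrix([sp.diff(phi, r), sp.diff(phi, th)])   # a'_j
grad_cart = sp.simplify(J.inv().T * grad_polar)               # a_i

print(v_cart.T)      # matches d/dt of (x(t), y(t)) computed directly
print(grad_cart.T)   # simplifies to (2*x, 1), expressed in polar variables
```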

Algebra and geometry

In category theory, there are covariant functors and contravariant functors. The dual space of a vector space is a standard example of a contravariant functor. Some constructions of multilinear algebra are of 'mixed' variance, which prevents them from being functors. The distinction between homology theory and cohomology theory in topology is that homology is a covariant functor, while cohomology is a contravariant functor (the book of Hilton & Wylie suggested that contrahomology was therefore a better term for cohomology, but the suggestion did not catch on). Homology theory is covariant because (as is very clear in singular homology) its basic construction is to take a topological space X and map things into it (in that case, simplices). Given a continuous mapping from X to another space Y, one simply composes to carry those things on into Y. Cohomology goes the 'other way'; it is adapted to studying mappings out of X, for example the sections of a vector bundle.

In geometry, the same map in/map out distinction is helpful in assessing the variance of constructions. A tangent vector to a smooth manifold M is, to begin with, a curve mapping smoothly into M and passing through a given point P. It is therefore covariant, with respect to smooth mappings of M. A cotangent vector, or 1-form, is in the same way constructed from a smooth mapping from M to the real line, near P. It is in the cotangent bundle, built up from the dual spaces of the tangent spaces. Its components with respect to a local basis of one-forms dx^i will be covariant; but one-forms and differential forms in general are contravariant, in the sense that they pull back under smooth mappings. This is crucial to how they are applied; for example, a differential form can be restricted to any submanifold, while this does not make the same sense for a field of tangent vectors.

Covariant and contravariant components transform in different ways under coordinate transformations. By considering a coordinate transformation on a manifold as a map from the manifold to itself, the transformation of the covariant indices of a tensor is given by a pullback, and the transformation of the contravariant indices is given by a pushforward.
