Introduction to mathematics of general relativity

From Wikipedia, the free encyclopedia

An understanding of calculus and differential equations is necessary for understanding nonrelativistic physics. To understand special relativity, one also needs tensor calculus. To understand the general theory of relativity, one needs a basic introduction to the mathematics of curved spacetime, including curvilinear coordinates, nontensors, curved space, parallel transport, Christoffel symbols, geodesics, covariant differentiation, the curvature tensor, the Bianchi identity, and the Ricci tensor. This article follows the basic treatment in the lecture series on the topic, intended for advanced undergraduates, given by Paul Dirac at Florida State University (Dirac 1996).

All the mathematics discussed in this article was known before Einstein's general theory of relativity.

For an introduction based on the specific physical example of particles orbiting a large mass in circular orbits, see Newtonian motivations for general relativity for a nonrelativistic treatment and Theoretical motivation for general relativity for a fully relativistic treatment.

Mathematics of special relativity

Main article: Special relativity

Vectors

Main article: Four-vector

Interval between two points

Spacetime physics requires four coordinates for the description of a point in spacetime:

  ct=x^0 \quad x=x^1 \quad y=x^2 \quad z=x^3

where c is the speed of light and x, y, and z are spatial coordinates.

A point very close to our original point is

  x^{\mu} + dx^{\mu} \quad \mu \in \{ 0,1,2,3\} .

The square of the distance, or interval, between the two points is

  ds^2 = -\left ( dx^{0}\right ) ^2 + \left ( dx^{1}\right ) ^2 + \left ( dx^{2}\right ) ^2+ \left ( dx^{3}\right ) ^2

and is invariant under coordinate transformations. Here we are using the Minkowski metric.
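As a quick numerical check of this invariance, the following Python sketch applies one particular coordinate transformation, a Lorentz boost along x^1, to a small displacement and compares the interval before and after. The function names `interval` and `boost_x`, the sample displacement, and the boost speed are illustrative choices, not part of the original treatment.

```python
import math

def interval(dx):
    """ds^2 = -(dx^0)^2 + (dx^1)^2 + (dx^2)^2 + (dx^3)^2, with dx^0 = c dt."""
    return -dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + dx[3] ** 2

def boost_x(dx, beta):
    """Lorentz boost along x^1 with speed v = beta * c (assumes |beta| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma * (dx[0] - beta * dx[1]),
            gamma * (dx[1] - beta * dx[0]),
            dx[2],
            dx[3])

dx = (2.0, 1.0, 0.5, -0.3)        # sample displacement (assumed values)
dx_boosted = boost_x(dx, 0.6)     # same displacement seen from a moving frame

# The two numbers agree up to floating-point rounding.
print(interval(dx), interval(dx_boosted))
```

The individual components change under the boost, but the combination ds^2 does not.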

Coordinate transformations

Transformation of dx

If one defines a new coordinate system x^{\mu'} such that

  x^{\mu'} = x^{\mu'}\left ( x^{\mu} \right )

then

  dx^{\mu'} =   { \partial x^{\mu'} \over  \partial x^{\nu} }  dx^{\nu} \ \stackrel{\mathrm{def}}{=}\   {x^{\mu'}}_{,\nu} \; dx^{\nu}

where repeated indices are summed according to the Einstein summation convention.

The comma in the subscript of the last term indicates differentiation.
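The transformation rule for dx can be sketched numerically. The snippet below approximates the matrix {x^{\mu'}}_{,\nu} by central differences and then performs the sum over the repeated index \nu explicitly. The helper names and the two-dimensional `sample_transform` are hypothetical and chosen only for illustration.

```python
def jacobian(transform, x, h=1e-6):
    """Approximate J[mu][nu] = d x^{mu'} / d x^{nu} by central differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for nu in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[nu] += h
        x_minus[nu] -= h
        f_plus, f_minus = transform(x_plus), transform(x_minus)
        for mu in range(n):
            J[mu][nu] = (f_plus[mu] - f_minus[mu]) / (2.0 * h)
    return J

def transform_dx(J, dx):
    """dx^{mu'} = x^{mu'}_{,nu} dx^{nu}, with the repeated index nu summed."""
    return [sum(J[mu][nu] * dx[nu] for nu in range(len(dx)))
            for mu in range(len(J))]

def sample_transform(x):
    """Hypothetical nonlinear change of coordinates in two dimensions."""
    return [x[0] + x[1], x[0] * x[1]]

J = jacobian(sample_transform, [2.0, 3.0])   # close to [[1, 1], [3, 2]]
dx_new = transform_dx(J, [0.1, -0.2])        # close to [-0.1, -0.1]
print(dx_new)
```

Writing the sum as an explicit loop over `nu` is exactly what the summation convention abbreviates.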

Transformation of a scalar

A scalar f(x^{\mu}) keeps its value under a change of coordinates, while its gradient transforms as

  {\partial f \left ( x^{\mu} \right ) \over \partial x^{\mu'} } =  {\partial   x^{\nu}  \over \partial x^{\mu'} } {\partial f \left ( x^{\mu} \right ) \over \partial x^{\nu} } =  {x^{\nu}}_{, \mu'} {\partial f \left ( x^{\mu} \right ) \over \partial x^{\nu} } .

Contravariant vectors

Quantities A^{\mu} that transform in the same way as dx^{\mu} under a change of coordinates,

  A^{\mu'} = {x^{\mu'}}_{,\nu} \; A^{\nu}  ,

form a contravariant vector. The squared length of the vector is the invariant quantity

  (A,A) \ \stackrel{\mathrm{def}}{=}\  -\left ( A^{0} \right ) ^2 + \left ( A^{1} \right ) ^2 + \left ( A^{2} \right ) ^2+ \left ( A^{3}\right ) ^2    .

The term on the left is the notation for the inner product of A with itself.

Covariant vectors

The covariant vector is defined as

  A_0\ \stackrel{\mathrm{def}}{=}\  -A^0 \quad A_1\ \stackrel{\mathrm{def}}{=}\  A^1 \quad A_2\ \stackrel{\mathrm{def}}{=}\  A^2 \quad A_3\ \stackrel{\mathrm{def}}{=}\  A^3   .

It transforms as

  A_{\mu'} = {x^{\nu}}_{,\mu'} A_{\nu} .

Inner product

The inner product of two vectors is written

  (A,B) = A_{\mu}B^{\mu} = B_{\mu}A^{\mu} .

This quantity is also invariant under coordinate transformations.
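A minimal Python sketch of lowering an index with the Minkowski metric and forming the inner product is given below. It also checks invariance under one sample transformation, a Lorentz boost along x^1. The names `ETA`, `lower`, `inner`, and `boost_x` and the sample vectors are assumptions made for this illustration.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric used above

def lower(A):
    """A_mu from A^mu: only the time component changes sign."""
    return tuple(e * a for e, a in zip(ETA, A))

def inner(A, B):
    """(A, B) = A_mu B^mu."""
    return sum(a_low * b for a_low, b in zip(lower(A), B))

def boost_x(A, beta):
    """Lorentz boost of a contravariant vector along x^1 (assumes |beta| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma * (A[0] - beta * A[1]),
            gamma * (A[1] - beta * A[0]),
            A[2],
            A[3])

A = (3.0, 1.0, 2.0, 0.0)   # sample contravariant components (assumed values)
B = (1.0, -1.0, 0.0, 2.0)

before = inner(A, B)
after = inner(boost_x(A, 0.5), boost_x(B, 0.5))
print(before, after)       # equal up to floating-point rounding
```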

Tensors

Main article: Tensor

Definition

A rank-2 contravariant tensor can be constructed from the outer product of vectors as

  T^{\mu \nu} \ \stackrel{\mathrm{def}}{=}\  A^{\mu}B^{\nu} + C^{\mu}D^{\nu} + E^{\mu}F^{\nu}  + \cdots .

Contravariant tensor

The components of a rank-2 contravariant tensor transform in the same way as the quantities A^{\mu}B^{\nu},

  T^{\mu' \nu'}={x^{\mu'}}_{,\mu} \; {x^{\nu'}}_{,\nu} \; T^{\mu \nu}

Covariant and mixed tensors

Higher-rank tensors are constructed similarly, as are covariant and mixed tensors. For a rank-2 covariant tensor, the transformation is

  T_{\mu'\nu'}={x^{\mu}}_{, \mu'} \; {x^{\nu}}_{, \nu'} \; T_{\mu \nu}  .

Oblique axes

Main article: Metric tensor

The interval and the metric tensor

An oblique coordinate system is one in which the axes are not necessarily orthogonal to each other. For oblique axes, the interval is

   ds^2 = g_{\mu \nu}dx^{\mu}dx^{\nu} = g_{\nu \mu}dx^{\mu}dx^{\nu}

where the coefficients g_{\mu \nu}, called the metric tensor, depend on the system of oblique axes.

Determinant of the metric tensor

The determinant of g_{\mu \nu} is denoted g and is always negative for any real coordinate axes.

Inner product

The inner product of any two vectors

   (A,B) = g_{\mu \nu}A^{\mu}B^{\nu}

is invariant.

Relation between covariant and contravariant tensors

Covariant tensors can be converted to and from contravariant tensors by

   A_{\mu} = g_{\mu \nu}A^{\nu}

and

   A^{\mu} = g^{\mu \nu}A_{\nu}

where g^{\mu \nu} is the cofactor of the corresponding g_{\mu \nu} in the determinant g, divided by g,

and

    g_{\mu \nu}  g^{\nu \rho} =  {g_{\mu}}^{\rho} \ \stackrel{\mathrm{def}}{=}\  \begin{cases} 1, & \mbox{if }\mu=\rho \\ 0, & \mbox{if }\mu \ne \rho \end{cases} .
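The relation between the metric and its inverse can be illustrated with a small example. The Python sketch below builds g^{\mu \nu} from cofactors divided by the determinant for a 2x2 metric, then checks that lowering and raising an index are inverse operations. The oblique metric used here (one time and one space dimension) and all function names are hypothetical choices for the illustration.

```python
def inverse_metric_2d(g):
    """Return (g^{mu nu}, det g) for a symmetric 2x2 metric g_{mu nu}.

    Each g^{mu nu} is the cofactor of g_{mu nu} divided by the determinant.
    Because g is symmetric, the cofactor matrix needs no transpose.
    """
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    cofactors = [[g[1][1], -g[1][0]],
                 [-g[0][1], g[0][0]]]
    g_inv = [[cofactors[m][n] / det for n in range(2)] for m in range(2)]
    return g_inv, det

def lower(g, A):
    """A_mu = g_{mu nu} A^nu."""
    return [sum(g[m][n] * A[n] for n in range(2)) for m in range(2)]

def raise_index(g_inv, A_low):
    """A^mu = g^{mu nu} A_nu."""
    return [sum(g_inv[m][n] * A_low[n] for n in range(2)) for m in range(2)]

g = [[-1.0, 0.5],
     [0.5, 1.0]]                      # assumed oblique axes; det g is negative
g_inv, det = inverse_metric_2d(g)

# g_{mu nu} g^{nu rho} should reproduce the Kronecker delta.
delta = [[sum(g[m][n] * g_inv[n][r] for n in range(2)) for r in range(2)]
         for m in range(2)]

A = [2.0, 1.0]
A_back = raise_index(g_inv, lower(g, A))   # should return the original A
print(det, delta, A_back)
```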

Nontensors

See also: Pseudotensor

A nontensor is a tensor-like quantity N^{\mu} that behaves like a tensor in the raising and lowering of indices,

   N_{\mu} = g_{\mu \nu}N^{\nu}

and

   N^{\mu} = g^{\mu \nu}N_{\nu} ,

but that does not transform like a tensor under a coordinate transformation.

Mathematics of general relativity

Curvilinear coordinates and curved spacetime

Curvilinear coordinates are coordinates in which the angles between axes can change from point to point. In other words, the metric tensor g_{\mu \nu} in curvilinear coordinates is no longer constant but depends on the spacetime location. It is therefore a field quantity.

Just as the surface of a ball is embedded in three-dimensional space, we can imagine four-dimensional spacetime as embedded in a flat space of higher dimension. The coordinates on the surface of the ball are curvilinear, while the coordinates in three-dimensional space can be rectilinear. Likewise, the coordinates of four-dimensional curved spacetime are curvilinear, while the higher-dimensional flat space in which it is embedded admits rectilinear coordinates.

Parallel transport

Main article: Parallel transport

Example: Parallel displacement along a circle embedded in two dimensions. The circle of radius r is embedded in a two-dimensional space characterized by the coordinates z^1 and z^2. Points on the circle have coordinates y^1 and y^2 in the two-dimensional space. The circle itself is one-dimensional and can be characterized by its arc length x. The coordinates y are related to the coordinate x through the relations y^1 = r \sin(x/r) and y^2 = r \cos(x/r). This gives \partial y^1 / \partial x = \cos(x/r) and \partial y^2 / \partial x = -\sin(x/r). In this case the metric is a scalar and is given by g = \cos^2(x/r) + \sin^2(x/r) = 1. The interval is then ds^2 = g \, dx^2 = dx^2. The interval is just equal to the arc length, as expected.
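The circle example can be checked numerically. The following Python sketch differentiates the embedding y(x) by central differences and sums the squares of the derivatives, which is the induced metric for a Euclidean embedding space. The function names, the step size, and the sample values of x and r are assumptions made for illustration.

```python
import math

def embed(x, r):
    """Point on the circle of radius r at arc length x: (y^1, y^2)."""
    return (r * math.sin(x / r), r * math.cos(x / r))

def induced_metric(x, r, h=1e-6):
    """g = (dy^1/dx)^2 + (dy^2/dx)^2, derivatives taken by central differences."""
    y_plus = embed(x + h, r)
    y_minus = embed(x - h, r)
    dy_dx = [(p - m) / (2.0 * h) for p, m in zip(y_plus, y_minus)]
    return sum(d * d for d in dy_dx)

g = induced_metric(0.7, 2.0)
print(g)  # close to 1 for any x and r, so ds^2 = dx^2
```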

The interval in a high dimensional space

Imagine our four dimensional curved spacetime is embedded in a larger N dimensional flat space. Any true physical vector lies entirely in the curved physical space. In other words, the vector is tangent to the curved physical spacetime. It has no component normal to the four dimensional curved spacetime.

In the N dimensional flat space with coordinates   z^n (n=1,2,3,\cdots , N ) the interval between neighboring points is

   ds^2 = \eta_{nm} dz^n dz^m

where \eta_{nm} is the metric for the flat space. We do not assume the coordinates are orthogonal, only rectilinear.

The relation between neighboring contravariant vectors: Christoffel symbols

Main article: Christoffel symbol

The difference in y^n for two neighboring points in the surface differing by dx^{\mu} is

d y^{n} = {y^n}_{,\mu} d x^{\mu}

where

{y^n}_{,\mu} = {\partial y^n(x) \over \partial x^{\mu} }.

The interval between two neighboring points in physical spacetime becomes

ds^2 = \eta_{nm} dy^n dy^m = \eta_{nm} {y^n}_{,\mu} {y^m}_{,\nu} dx^{\mu} dx^{\nu} = g_{\mu \nu} d x^{\mu} d x^{\nu}

where

g_{\mu \nu} = \eta_{nm} {y^n}_{,\mu} {y^m}_{,\nu}.

A contravariant vector at a point x in physical spacetime is related to the same contravariant vector at the same point y(x) in N-dimensional space by the relation

A^{n} = {y^n}_{,\mu} A^{\mu}.

The vector lies in the surface of physical spacetime.

Now shift the vector A^{n} to the point y^n(x+dx), keeping it parallel to itself. In other words, we hold the components of the vector constant during the shift. The vector no longer lies in the surface because of the curvature of the surface.

The shifted vector can be split into two parts, one tangent to the surface and one normal to the surface, as

   A^{n} = A^{n}_{\mathrm{tan}} + A^{n}_{\mathrm{nor}}   .

Let K^{\mu} be the components of A^{n}_{\mathrm{tan}} in the x coordinate system. This transformation is given by:

   A^{n}_{\mathrm{tan}} =  K^{\mu} {y^n}_{,\mu} (x+dx)  .

The normal vector A^{n}_{\mathrm{nor}} is normal to every vector in the surface, including the tangent vectors {y^n}_{,\mu} along the x^{\mu} coordinate directions. Therefore

   A^{n}_{\mathrm{nor}} \; \;  y_{n,\mu} (x+dx)  = 0.

This allows us to write

   A^{n}  \;  y_{n,\mu} (x+dx)  = K^{\nu} g_{\mu \nu}(x+dx)

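The right-hand side is K_{\mu}(x+dx). Expanding y_{n,\mu}(x+dx) = y_{n,\mu}(x) + y_{n,\mu,\sigma} dx^{\sigma} to first order, using A^{n} y_{n,\mu}(x) = A_{\mu}, and relabeling indices gives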

 K_{\nu} -  A_{\nu} \ \stackrel{\mathrm{def}}{=}\  dA_{\nu} = A^{\mu}  \;  {y^n}_{,\mu} y_{n,\nu, \sigma}  dx^{\sigma} \ \stackrel{\mathrm{def}}{=}\  A^{\mu}  \;  \Gamma_{\mu \nu \sigma}  dx^{\sigma}

where

  \Gamma_{\mu \nu \sigma} \ \stackrel{\mathrm{def}}{=}\  {y^n}_{,\mu} y_{n,\nu, \sigma}

is a nontensor called the Christoffel symbol of the first kind. It can be shown to be related to the metric tensor through the relation

  \Gamma_{\mu \nu \sigma} = {1 \over 2} \left ( g_{\mu \nu , \sigma} + g_{\mu \sigma , \nu} - g_{\nu \sigma , \mu} \right ) .

Since the Christoffel symbol can be written entirely in terms of the metric in physical spacetime, all reference to the N-dimensional space has disappeared.
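Because only the metric is needed, the Christoffel symbol of the first kind can be evaluated for any given g_{\mu \nu}. The Python sketch below applies the formula above, with the index order \Gamma_{\mu \nu \sigma} used in this article, to the surface of a sphere with coordinates (\theta, \phi). The sphere metric, the finite-difference step, and all function names are assumptions introduced for this example.

```python
import math

def sphere_metric(x, r=1.0):
    """Metric of a 2-sphere of radius r in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[r * r, 0.0],
            [0.0, (r * math.sin(theta)) ** 2]]

def metric_derivatives(metric, x, h=1e-5):
    """dg[m][n][s] approximates g_{mn,s} by central differences."""
    n = len(x)
    dg = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for s in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[s] += h
        x_minus[s] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        for a in range(n):
            for b in range(n):
                dg[a][b][s] = (g_plus[a][b] - g_minus[a][b]) / (2.0 * h)
    return dg

def christoffel_first(metric, x):
    """Gamma[m][nu][s] = (g_{m nu,s} + g_{m s,nu} - g_{nu s,m}) / 2."""
    n = len(x)
    dg = metric_derivatives(metric, x)
    return [[[0.5 * (dg[m][nu][s] + dg[m][s][nu] - dg[nu][s][m])
              for s in range(n)]
             for nu in range(n)]
            for m in range(n)]

theta = math.pi / 3
gamma = christoffel_first(sphere_metric, [theta, 0.0])
# For the unit sphere, Gamma_{theta phi phi} = -sin(theta) cos(theta)
# and Gamma_{phi theta phi} = +sin(theta) cos(theta).
print(gamma[0][1][1], gamma[1][0][1])
```

No embedding coordinates appear anywhere in the computation, in line with the remark above.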

Christoffel symbol of the second kind

The Christoffel symbol of the second kind is defined as

 \Gamma^{\mu}_{ \nu \sigma}  \ \stackrel{\mathrm{def}}{=}\  g^{\mu \lambda} \Gamma_{\lambda \nu \sigma}  .

Raising an index with the metric in this way is allowed for nontensors.

This allows us to write

  dA_{\nu} = A_{\mu} \Gamma^{\mu}_{ \nu \sigma}  dx^{\sigma}

and

  dA^{\nu} = -A^{\mu} \Gamma^{\nu}_{ \mu \sigma}  dx^{\sigma} .

The minus sign in the second expression can be seen from the invariance of an inner product of two vectors

  d\left ( A^{\nu} B_{\nu} \right ) = 0  .

The constancy of the length of the parallel displaced vector

From Dirac:

The constancy of the length of the vector follows from geometrical arguments. When we split up the vector into tangential and normal parts ... the normal part is infinitesimal and is orthogonal to the tangential part. It follows that, to the first order, the length of the whole vector equals that of its tangential part.
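The constancy of length can also be observed numerically. The Python sketch below integrates dA^{\nu} = -A^{\mu} \Gamma^{\nu}_{\mu \sigma} dx^{\sigma} around a circle of latitude on the unit sphere, using the standard sphere values \Gamma^{\theta}_{\phi \phi} = -\sin\theta \cos\theta and \Gamma^{\phi}_{\theta \phi} = \cot\theta. The choice of surface, the fourth-order Runge-Kutta integrator, and all names are assumptions made for this illustration.

```python
import math

def transport_along_latitude(A, theta0, phi_total, steps=2000):
    """Parallel transport A = (A^theta, A^phi) along phi at fixed theta0."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(a):
        # dA^theta/dphi = -A^phi Gamma^theta_{phi phi} = sin cos A^phi
        # dA^phi/dphi   = -A^theta Gamma^phi_{theta phi} = -cot A^theta
        return (s * c * a[1], -(c / s) * a[0])

    h = phi_total / steps
    a = (A[0], A[1])
    for _ in range(steps):  # classical fourth-order Runge-Kutta steps
        k1 = rhs(a)
        k2 = rhs((a[0] + 0.5 * h * k1[0], a[1] + 0.5 * h * k1[1]))
        k3 = rhs((a[0] + 0.5 * h * k2[0], a[1] + 0.5 * h * k2[1]))
        k4 = rhs((a[0] + h * k3[0], a[1] + h * k3[1]))
        a = (a[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             a[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return a

def length_squared(A, theta0):
    """g_{mu nu} A^mu A^nu on the unit sphere."""
    return A[0] ** 2 + (math.sin(theta0) * A[1]) ** 2

theta0 = math.pi / 3
A_start = (1.0, 0.0)
A_end = transport_along_latitude(A_start, theta0, 2.0 * math.pi)

# The length is unchanged, although the components are not: after one full
# loop at this latitude the vector has turned through 2*pi*cos(theta0) = pi.
print(length_squared(A_start, theta0), length_squared(A_end, theta0), A_end)
```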

The covariant derivative

Main article: Covariant derivative

The covariant derivative of a vector with respect to a spacetime coordinate is composed of two parts: the ordinary partial derivative minus the change in the vector due to parallel transport,

 A_{\mu ; \nu} =  A_{\mu , \nu}  - A_{\alpha} \Gamma^{\alpha}_{ \mu \nu}.

It is relatively easy to prove that the metric tensor g_{\mu \nu} is covariantly constant, i.e. g_{\mu \nu ; \sigma} = 0 for any choice of \mu, \nu, \sigma.

The covariant derivative of a product is

\left(A B\right)_{;\sigma}=\left(A_{;\sigma}\right )B + A\left(B_{;\sigma}\right)

that is, the covariant derivative satisfies the product rule (due to Gottfried Leibniz).
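The covariant constancy of the metric can be sketched in a few lines. The derivation below assumes the usual extension of the covariant derivative to a rank-2 covariant tensor, with one Christoffel term per index, as obtained by applying the product rule to A_{\mu} B_{\nu}. It then uses the expression for \Gamma_{\mu \nu \sigma} in terms of the metric given earlier.

```latex
\begin{align}
g_{\mu\nu;\sigma}
  &= g_{\mu\nu,\sigma}
     - g_{\alpha\nu}\Gamma^{\alpha}_{\mu\sigma}
     - g_{\mu\alpha}\Gamma^{\alpha}_{\nu\sigma} \\
  &= g_{\mu\nu,\sigma} - \Gamma_{\nu\mu\sigma} - \Gamma_{\mu\nu\sigma} \\
  &= g_{\mu\nu,\sigma}
     - \tfrac{1}{2}\left( g_{\nu\mu,\sigma} + g_{\nu\sigma,\mu} - g_{\mu\sigma,\nu} \right)
     - \tfrac{1}{2}\left( g_{\mu\nu,\sigma} + g_{\mu\sigma,\nu} - g_{\nu\sigma,\mu} \right) \\
  &= 0 .
\end{align}
```

The last step uses only the symmetry g_{\mu\nu} = g_{\nu\mu}; the four remaining derivative terms cancel in pairs.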

Geodesics

Main article: Geodesic

Suppose we have a point z^{\mu} that moves along a track in physical spacetime. Suppose the track is parameterized with the quantity τ. The "velocity" vector that points in the direction of motion in spacetime is

 u^{\mu} = { dz^{\mu} \over d\tau }.

The variation of the velocity upon parallel displacement along the track is then

{ d u^{\nu} \over d \tau} + \Gamma^{\nu}_{\mu \sigma} u^{\mu} u^{\sigma}.

If there are no "forces" acting on the point, then the velocity is unchanged along the track and we have

 { d u^{\nu} \over d \tau} + \Gamma^{\nu}_{\mu \sigma} u^{\mu} u^{\sigma} \quad = \quad { d^2 z^{\nu} \over d \tau^2} + \Gamma^{\nu}_{\mu \sigma} { d z^{\mu} \over d \tau}   { d z^{\sigma} \over d \tau} \quad = \quad 0,

which is called the geodesic equation.
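As an illustration, the geodesic equation can be integrated numerically on a simple curved surface. The Python sketch below uses the unit sphere with coordinates (\theta, \phi) and the standard values \Gamma^{\theta}_{\phi \phi} = -\sin\theta \cos\theta and \Gamma^{\phi}_{\theta \phi} = \cot\theta, and compares the result with the great circle expected for a sphere. The surface, the Runge-Kutta integrator, the initial data, and all names are assumptions chosen for this example.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of the geodesic equation on the unit sphere.

    state = (theta, phi, u^theta, u^phi), with
    du^nu/dtau = -Gamma^nu_{mu sigma} u^mu u^sigma.
    """
    theta, phi, u_theta, u_phi = state
    du_theta = math.sin(theta) * math.cos(theta) * u_phi ** 2
    du_phi = -2.0 * (math.cos(theta) / math.sin(theta)) * u_theta * u_phi
    return (u_theta, u_phi, du_theta, du_phi)

def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_geodesic(state, tau_total, steps=4000):
    h = tau_total / steps
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
    return state

def embed(theta, phi):
    """Position in the surrounding three-dimensional flat space."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

# Start on the equator at phi = 0 with unit speed, heading partly northward.
start = (math.pi / 2, 0.0, -0.6, 0.8)
tau = 2.0
end = integrate_geodesic(start, tau)
point = embed(end[0], end[1])

# Great circle through the same point with the same initial direction:
# cos(tau) * (1, 0, 0) + sin(tau) * (0, 0.8, 0.6).
expected = (math.cos(tau), 0.8 * math.sin(tau), 0.6 * math.sin(tau))
print(point, expected)
```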

Curvature tensor

Definition

The curvature K of a surface is the angle through which a vector is turned, per unit area enclosed, as we take it around an infinitesimal closed path. For a two-dimensional surface we have

 \delta \theta = \mbox{(Area enclosed)} \cdot K  .

For a triangle on a spherical surface the angle is the excess (over 180 degrees) of the sum of the angles of the triangle. For a spherical surface of radius r, the curvature is

  K = {1 \over r^2} .

The definition of curvature {R^{\beta}}_{\nu \rho \sigma} generalizes to

 \delta^2 A^{\beta} =  {R^{\beta}}_{\nu \rho \sigma} A^{\nu} \Delta ^{\rho \sigma}

where A^{\beta} is an arbitrary vector transported around a closed loop of area \Delta^{\rho \sigma} along the x^{\rho} and x^{\sigma} directions. Here,

 \Delta^{\rho \sigma} \ \stackrel{\mathrm{def}}{=}\  dx^{\rho} dx^{\sigma} .

This expression can be reduced to the commutation relation

 A_{\nu ; \rho ; \sigma } - A_{\nu ; \sigma ; \rho }  \ \stackrel{\mathrm{def}}{=}\  A_{\beta} {R^{\beta}}_{\nu \rho \sigma}

where

  {R^{\beta}}_{\nu \rho \sigma} \ \stackrel{\mathrm{def}}{=}\  \Gamma^{\beta}_{\nu \sigma , \rho} -  \Gamma^{\beta}_{\nu \rho , \sigma} + \Gamma^{\alpha}_{\nu \sigma } \Gamma^{\beta}_{\alpha \rho} - \Gamma^{\alpha}_{\nu \rho } \Gamma^{\beta}_{\alpha \sigma}.

In flat spacetime, the derivatives commute and the curvature is zero.
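The expression for {R^{\beta}}_{\nu \rho \sigma} in terms of Christoffel symbols can be evaluated directly for a simple curved surface. The Python sketch below does this for the unit sphere, taking the Christoffel symbols of the second kind as known and differentiating them by central differences. The surface, the step size, and the function names are assumptions made for this example.

```python
import math
from itertools import product

def gamma2(x):
    """Gamma^b_{n s} for the unit sphere, x = (theta, phi), indexed [b][n][s]."""
    theta = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)            # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)  # Gamma^phi_{theta phi}
    return G

def d_gamma2(x, rho, h=1e-5):
    """Central-difference derivative of every Gamma^b_{n s} with respect to x^rho."""
    x_plus, x_minus = list(x), list(x)
    x_plus[rho] += h
    x_minus[rho] -= h
    G_plus, G_minus = gamma2(x_plus), gamma2(x_minus)
    return [[[(G_plus[b][n][s] - G_minus[b][n][s]) / (2.0 * h)
              for s in range(2)] for n in range(2)] for b in range(2)]

def riemann(x):
    """R[b][n][r][s] = R^b_{n r s}, following the definition above term by term."""
    G = gamma2(x)
    dG = [d_gamma2(x, rho) for rho in range(2)]  # dG[rho][b][n][s] = Gamma^b_{n s, rho}
    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for b, n, r, s in product(range(2), repeat=4):
        R[b][n][r][s] = (dG[r][b][n][s] - dG[s][b][n][r]
                         + sum(G[a][n][s] * G[b][a][r] - G[a][n][r] * G[b][a][s]
                               for a in range(2)))
    return R

theta = math.pi / 3
R = riemann((theta, 0.0))
# For the unit sphere R^theta_{phi theta phi} = sin^2(theta), which is 0.75 here.
print(R[0][1][0][1])
```

Lowering the first index with g_{\theta \theta} = 1 gives R_{\theta \phi \theta \phi} = \sin^2\theta, which equals K times the determinant of the sphere metric with K = 1/r^2 = 1, in agreement with the curvature of a sphere quoted above.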

Symmetries of the curvature tensor

The curvature tensor is antisymmetric in the last two indices

{R^{\beta}}_{\nu\rho\sigma}=-{R^{\beta}}_{\nu\sigma\rho}.

Also

 {R^{\beta}}_{\nu \rho \sigma} + {R^{\beta}}_{\rho \sigma \nu} + {R^{\beta}}_{\sigma \nu \rho } = 0 ,

 R_{\mu \nu \rho \sigma} = -R_{ \nu \mu \rho \sigma}

and

  R_{\mu \nu \rho \sigma} = R_{ \rho \sigma \mu \nu } = R_{ \sigma \rho \nu \mu  }  .

A consequence of the symmetries is that the curvature tensor has only 20 independent components.
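The count of 20 can be reproduced with a short calculation. The Python sketch below uses a standard counting argument: the tensor is antisymmetric within each index pair, symmetric under exchange of the two pairs, and the cyclic identity removes one further combination for each set of four distinct indices. The function name is hypothetical.

```python
from math import comb

def riemann_independent_components(n):
    """Number of independent components of R_{mu nu rho sigma} in n dimensions."""
    pairs = comb(n, 2)                             # antisymmetric pairs [mu nu]
    symmetric_in_pairs = pairs * (pairs + 1) // 2  # symmetric under pair exchange
    cyclic_constraints = comb(n, 4)                # extra conditions from the cyclic identity
    return symmetric_in_pairs - cyclic_constraints

for n in (2, 3, 4):
    print(n, riemann_independent_components(n))    # 1, 6, and 20 components
```

The result agrees with the closed form n^2 (n^2 - 1) / 12, which gives 20 for four-dimensional spacetime.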

Bianchi identity

The curvature tensor satisfies the following differential relation, known as the Bianchi identity:

  {R^{\nu}}_{\mu \rho \sigma ; \tau} + {R^{\nu}}_{\mu  \sigma \tau ; \rho } + {R^{\nu  }}_{\mu  \tau \rho ;  \sigma } = 0 .

Ricci tensor and scalar curvature

The Ricci tensor is defined as the contraction

  R_{\nu \rho} \ \stackrel{\mathrm{def}}{=}\ {R^{\mu}}_{\nu\mu \rho} .

A second contraction yields the scalar curvature

 R \ \stackrel{\mathrm{def}}{=}\  g^{\nu \rho} R_{\nu \rho} = {R^{\nu}}_{\nu} .

It can be shown that a consequence of the Bianchi identity is

  2{R^{\alpha}}_{\sigma ; \alpha} - R_{;\sigma} = 0  .
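One way to obtain this relation is sketched below. The steps assume the Bianchi identity and the symmetries of the curvature tensor in the form given above, the definition of the Ricci tensor as the contraction {R^{\mu}}_{\nu \mu \rho}, and the covariant constancy of the metric, which allows g^{\mu \sigma} to be moved through covariant derivatives.

```latex
\begin{align}
% Contract \nu with \rho in the Bianchi identity, using
% {R^{\nu}}_{\mu\tau\nu} = -{R^{\nu}}_{\mu\nu\tau} = -R_{\mu\tau}:
R_{\mu\sigma;\tau} - R_{\mu\tau;\sigma} + {R^{\nu}}_{\mu\sigma\tau;\nu} &= 0 \\
% Multiply by g^{\mu\sigma}, using
% g^{\mu\sigma}{R^{\nu}}_{\mu\sigma\tau} = -{R^{\nu}}_{\tau},
% which follows from R_{\lambda\mu\sigma\tau} = -R_{\mu\lambda\sigma\tau}:
R_{;\tau} - {R^{\sigma}}_{\tau;\sigma} - {R^{\nu}}_{\tau;\nu} &= 0 \\
% Rename the summed indices:
2{R^{\alpha}}_{\tau;\alpha} - R_{;\tau} &= 0 .
\end{align}
```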

References

  • Dirac, P. A. M. (1996). General Theory of Relativity. Princeton University Press. ISBN 0-691-01146-X.
  • Misner, Charles; Thorne, Kip S. & Wheeler, John Archibald (1973). Gravitation. San Francisco: W. H. Freeman. ISBN 0-7167-0344-0.
  • Landau, L. D. & Lifshitz, E. M. (1975). Classical Theory of Fields (Fourth Revised English Edition). Oxford: Pergamon. ISBN 0-08-018176-7.
  • Feynman, R. P.; Morinigo, F. B. & Wagner, W. G. (1995). Feynman Lectures on Gravitation. Addison-Wesley. ISBN 0-201-62734-5.
  • Einstein, A. (1961). Relativity: The Special and General Theory. New York: Crown. ISBN 0-517-02961-8.