Norm (mathematics)

From Wikipedia, the free encyclopedia

In linear algebra, functional analysis and related areas of mathematics, a norm is a function that assigns a strictly positive length or size to every vector in a vector space other than the zero vector. A seminorm (or pseudonorm), on the other hand, is allowed to assign zero length to some non-zero vectors.

A simple example is the 2-dimensional Euclidean space R^2 equipped with the Euclidean norm. Elements of this vector space (e.g., (3, 7)) are usually drawn as arrows in a 2-dimensional Cartesian coordinate system starting at the origin (0, 0). The Euclidean norm assigns to each vector the length of its arrow.

A vector space with a norm is called a normed vector space. Similarly, a vector space with a seminorm is called a seminormed vector space.

Definition

Given a vector space V over a subfield F of the complex numbers (such as the complex numbers themselves, the real numbers or the rational numbers), a seminorm on V is a function p : V → R, x ↦ p(x), with the following properties:

For all a in F and all u and v in V,

  1. p(a v) = |a| p(v), (positive homogeneity or positive scalability)
  2. p(u + v) ≤ p(u) + p(v) (triangle inequality or subadditivity).

A simple consequence of these two axioms, positive homogeneity and the triangle inequality, is p(0) = 0 and thus

p(v) ≥ 0 (positivity).
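
The step from the two axioms to positivity can be spelled out as follows. This is a short derivation that uses only homogeneity (with a = 0 and a = −1) and the triangle inequality (with u = v and the second summand −v):

```latex
\begin{align*}
p(0) &= p(0 \cdot v) = |0|\, p(v) = 0, \\
0 = p(0) = p\bigl(v + (-v)\bigr) &\le p(v) + p(-v) = p(v) + |-1|\, p(v) = 2\, p(v).
\end{align*}
```

Dividing the last inequality by 2 gives p(v) ≥ 0 for every v in V.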

A norm is a seminorm with the additional property

p(v) = 0 if and only if v is the zero vector (positive definiteness).

A norm is usually denoted ||v||, and sometimes |v|, instead of p(v).

Every vector space admits a seminorm (e.g., the trivial seminorm in the Examples section below), but a given seminorm need not be a norm. Any vector space V with seminorm p can be made into a normed space by forming the quotient space V/W, where W is the subspace of V consisting of all vectors v in V with p(v) = 0. The induced norm on V/W is given by ||W + v|| = p(v). It is well-defined: if v − v' lies in W, then p(v) ≤ p(v') + p(v − v') = p(v'), and by symmetry p(v') ≤ p(v), so p(v) = p(v').

A topological vector space is called normable (seminormable) if the topology of the space can be induced by a norm (seminorm).

Examples

  • All norms are seminorms.
  • The trivial seminorm, with p(x) = 0 for all x in V.
  • The absolute value is a norm on the real numbers.
  • Every linear form f on a vector space defines a seminorm by x→|f(x)|.
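
The last example also shows how a seminorm can fail to be a norm. The following is a minimal Python sketch; the helper name `linear_form_seminorm` and the particular linear form f(x) = x_1 − x_2 on R^2 are illustrative assumptions, not part of the article.

```python
def linear_form_seminorm(coeffs):
    """Return the seminorm x -> |f(x)| for the linear form f(x) = sum(c_i * x_i)."""
    def p(x):
        return abs(sum(c * xi for c, xi in zip(coeffs, x)))
    return p

# Hypothetical example: f(x) = x_1 - x_2 on R^2.
p = linear_form_seminorm([1.0, -1.0])

print(p([3.0, 7.0]))  # |3 - 7| = 4.0
print(p([1.0, 1.0]))  # 0.0, although (1, 1) is not the zero vector
```

Because p vanishes on the non-zero vector (1, 1), it is a seminorm but not a norm.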

Euclidean norm

On R^n, the intuitive notion of length of the vector x = [x_1, x_2, ..., x_n] is captured by the formula

\|\mathbf{x}\| := \sqrt{x_1^2 + \cdots + x_n^2}.

This gives the ordinary distance from the origin to the point x, a consequence of the Pythagorean theorem. The Euclidean norm is by far the most commonly used norm on R^n, but there are other norms on this vector space, as shown below.

On C^n the most common norm is

\|\mathbf{z}\| := \sqrt{|z_1|^2 + \cdots + |z_n|^2},

which coincides with the Euclidean norm on R^{2n} when C^n is identified with R^{2n}.

In each case the norm can also be expressed as the square root of the inner product of the vector with itself. The Euclidean norm is also called the l2 norm; see Lp space.

The set of vectors whose Euclidean norm is a given constant forms the surface of a sphere.
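
As a rough illustration of the two formulas in this section, the sketch below computes the Euclidean norm of a real vector and of a complex vector. The function name `euclidean_norm` and the sample vectors are assumptions chosen for the example.

```python
import math

def euclidean_norm(x):
    """Euclidean norm of a real or complex vector given as a sequence of numbers."""
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

print(euclidean_norm([3, 4]))  # 5.0

# A vector in C^2 and the corresponding vector in R^4
# (real and imaginary parts listed separately) have the same norm.
z = [1 + 2j, 3 - 4j]
z_as_real = [1, 2, 3, -4]
print(math.isclose(euclidean_norm(z), euclidean_norm(z_as_real)))  # True
```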

Taxicab norm or Manhattan norm

Main article: Taxicab geometry

\|x\|_1 := \sum_{i=1}^{n} |x_i|.

The name relates to the distance a taxi has to drive in a rectangular street grid to get from the origin to the point x.

The set of vectors whose 1-norm is a given constant forms the surface of a cross polytope.

p-norm

Let p≥1 be a real number.

\|x\|_p := \left( \sum_{i=1}^n |x_i|^p \right)^\frac{1}{p}

Note that for p = 1 we get the taxicab norm and for p = 2 we get the Euclidean norm. See also Lp space.
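
A direct transcription of the p-norm formula into Python might look as follows. This is a sketch only; the name `p_norm` and the choice to reject p < 1 with an exception are assumptions made for the example.

```python
def p_norm(x, p):
    """p-norm of a sequence of real or complex numbers, for a real p >= 1."""
    if p < 1:
        raise ValueError("p must be at least 1 for the p-norm to be a norm")
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [3, -4]
print(p_norm(x, 1))  # 7.0, the taxicab norm
print(p_norm(x, 2))  # 5.0, the Euclidean norm
```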

Infinity norm or maximum norm

Main article: Maximum norm

\|x\|_\infty := \max \left(|x_1|, \ldots ,|x_n| \right).

The set of vectors whose ∞-norm is a given constant forms the surface of a hypercube.
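
The name "infinity norm" reflects a standard fact that the text above does not state explicitly: for a fixed vector, the p-norm approaches the maximum norm as p grows. The sketch below illustrates this numerically; the function names and the sample vector are assumptions for the example.

```python
def max_norm(x):
    """Maximum (infinity) norm: the largest absolute value of a component."""
    return max(abs(xi) for xi in x)

def p_norm(x, p):
    """p-norm for a real p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [1.0, -2.0, 3.0]
for p in (1, 2, 10, 100):
    # The printed values decrease towards the maximum norm as p increases.
    print(p, p_norm(x, p))
print("max", max_norm(x))  # max 3.0
```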

Zero norm

In the machine learning and optimization literature, one often finds references to the zero norm. The zero norm of x is defined as \lim_{p\rightarrow 0} \|x\|_p^p, where \|x\|_p is given by the p-norm formula above. With the convention 0^0 \ \stackrel{\mathrm{def}}{=}\  0, the zero norm can be written as \sum_{i=1}^n |x_i|^0. It follows that the zero norm of x is simply the number of non-zero elements of x. Despite its name, the zero norm is not a true norm; in particular, it is not positively homogeneous, since scaling x by a non-zero factor leaves the count unchanged. (The other examples in this article assume a vector space over the real or complex numbers; if, on the other hand, a vector space over the field \mathbb{Z}_2 is used, then the zero norm is a norm.)
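
The failure of homogeneity is easy to see in a small example. The sketch below counts non-zero entries; the name `zero_norm` and the sample vector are assumptions for illustration.

```python
def zero_norm(x):
    """Number of non-zero entries of x (the so-called zero norm)."""
    return sum(1 for xi in x if xi != 0)

x = [0.0, 2.5, 0.0, -1.0]
print(zero_norm(x))                     # 2
print(zero_norm([2 * xi for xi in x]))  # 2, whereas homogeneity would require 2 * 2 = 4
```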

Other norms

Other norms on R^n can be constructed by combining the above; for example

\|x\| := 2|x_1| + \sqrt{3|x_2|^2 + \max(|x_3|,2|x_4|)^2}

is a norm on R^4.

For any norm and any bijective linear transformation A we can define a new norm of x, equal to

\|Ax\|.

In 2D, with A a rotation by 45° combined with a suitable scaling, this changes the taxicab norm into the maximum norm. In 2D, each A applied to the taxicab norm gives, up to inversion and interchanging of axes, a different unit ball: a parallelogram of a particular shape, size and orientation. In 3D the situation is similar, except that the 1-norm and the maximum norm lead to different families of unit balls: octahedra for the 1-norm and parallelepipeds (prisms with a parallelogram base) for the maximum norm.
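
The 2D statement can be checked numerically. In the sketch below the scaling factor is taken to be 1/√2, which is one choice that works; the function names and test vectors are assumptions for the example.

```python
def taxicab_norm(x):
    return abs(x[0]) + abs(x[1])

def max_norm(x):
    return max(abs(x[0]), abs(x[1]))

def apply_A(x):
    """Rotate x by 45 degrees and scale by 1/sqrt(2).

    The rotation matrix is (1/sqrt(2)) * [[1, -1], [1, 1]], so the combined
    map is A = 0.5 * [[1, -1], [1, 1]].
    """
    return (0.5 * (x[0] - x[1]), 0.5 * (x[0] + x[1]))

for x in [(3.0, 7.0), (-2.0, 0.5), (1.0, -1.0)]:
    # The taxicab norm of Ax equals the maximum norm of x.
    print(taxicab_norm(apply_A(x)), max_norm(x))
```

The identity behind this is |a − b| + |a + b| = 2 max(|a|, |b|) for real numbers a and b.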

All the above formulas also yield norms on C^n without modification.

Infinite-dimensional case

The generalization of the above norms to an infinite number of components leads to the Lp spaces, with norms

\|x\|_p = \left(\sum_{i\in\mathbb N}|x_i|^p\right)^{\frac1p}

for complex-valued sequences x, and

\|f\|_{p,X} = \left(\int_X|f(x)|^p\,\mathrm dx\right)^{\frac1p}

for complex-valued functions f defined on X\subset\mathbb R. These can be generalized further (see Haar measure).

Any inner product induces in a natural way the norm \|x\| := \sqrt{\langle x,x\rangle}.

Other examples of infinite dimensional normed vector spaces can be found in the Banach space article.

Properties

Illustrations of unit circles in different norms.

The concept of unit circle (the set of all vectors of norm 1) is different in different norms: for the 1-norm the unit circle in R^2 is a square rotated by 45° (a rhombus), for the 2-norm (Euclidean norm) it is the well-known unit circle, while for the infinity norm it is an axis-aligned square. See the accompanying illustration.

In terms of the vector space, the seminorm defines a topology on the space, and this is a Hausdorff topology precisely when the seminorm can distinguish between distinct vectors, which is again equivalent to the seminorm being a norm.

Two norms ||·||_α and ||·||_β on a vector space V are called equivalent if there exist positive real numbers C and D such that

C\|x\|_\alpha\leq\|x\|_\beta\leq D\|x\|_\alpha

for all x in V. On a finite-dimensional vector space over the real or complex numbers all norms are equivalent. For instance, the l1, l2, and l_\infty norms are all equivalent on \mathbb{R}^n:

\|x\|_2\le\|x\|_1\le\sqrt{n}\|x\|_2
\|x\|_\infty\le\|x\|_2\le\sqrt{n}\|x\|_\infty
\|x\|_\infty\le\|x\|_1\le n\|x\|_\infty
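
These three chains of inequalities can be spot-checked on random vectors. The sketch below is only a numerical sanity check, not a proof; the function names, the dimension 5, the sample size and the floating-point tolerance `tol` are all assumptions for the example.

```python
import math
import random

def norm_1(x):
    return sum(abs(xi) for xi in x)

def norm_2(x):
    return math.sqrt(sum(xi * xi for xi in x))

def norm_inf(x):
    return max(abs(xi) for xi in x)

def satisfies_bounds(x, tol=1e-9):
    """Check the three chains of inequalities for a real vector x."""
    n = len(x)
    one, two, inf = norm_1(x), norm_2(x), norm_inf(x)
    return (
        two <= one + tol and one <= math.sqrt(n) * two + tol
        and inf <= two + tol and two <= math.sqrt(n) * inf + tol
        and inf <= one + tol and one <= n * inf + tol
    )

random.seed(0)  # fixed seed so the run is reproducible
samples = [[random.uniform(-10, 10) for _ in range(5)] for _ in range(1000)]
print(all(satisfies_bounds(x) for x in samples))  # True
```

The upper bounds are attained by vectors whose components all have the same absolute value, such as (1, 1, ..., 1), and the lower bounds by vectors with a single non-zero component.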

Equivalent norms define the same notions of continuity and convergence and do not need to be distinguished for most purposes. More precisely, equivalent norms induce the same uniform structure on the vector space.

Every (semi)norm is a sublinear function, which implies that every (semi)norm is a convex function. As a result, finding a global minimum of a norm-based objective function is often tractable.

Given a finite family of seminorms p_0, ..., p_n on a vector space, the sum

p(x):=\sum_{i=0}^n p_i(x)

is again a seminorm.

For any norm p on a vector space V, we have that for all u and v in V:

p(u ± v) ≥ | p(u) − p(v) |
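
This inequality follows from the triangle inequality and homogeneity alone. A short derivation, written out here as a sketch:

```latex
\begin{align*}
p(u) = p\bigl((u - v) + v\bigr) &\le p(u - v) + p(v)
  &&\Rightarrow\quad p(u) - p(v) \le p(u - v), \\
p(v) = p\bigl((v - u) + u\bigr) &\le p(v - u) + p(u)
  &&\Rightarrow\quad p(v) - p(u) \le p(v - u) = p(u - v).
\end{align*}
```

Together these give |p(u) − p(v)| ≤ p(u − v). Replacing v by −v and using p(−v) = p(v) gives the version with u + v.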

For the lp norms, we have Hölder's inequality

|x^\top y|\le\| x\|_p\|y\|_q\qquad \text{where } \frac{1}{p}+\frac{1}{q}=1. [1]

A special case of the above property is the Cauchy-Schwarz inequality:

|x^\top y|\le\|x\|_2\|y\|_2. [1]
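
As a numerical illustration of the inequality for conjugate exponents p and q, the sketch below evaluates both sides for a few values of p. The helper names `p_norm` and `holder_gap` and the sample vectors are assumptions for the example; p = 2 gives the Cauchy-Schwarz case.

```python
def p_norm(x, p):
    """p-norm for a real p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def holder_gap(x, y, p):
    """Return ||x||_p * ||y||_q - |x^T y| for the conjugate exponent q of p > 1."""
    q = p / (p - 1.0)  # solves 1/p + 1/q = 1
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return p_norm(x, p) * p_norm(y, q) - abs(dot)

x = [1.0, -2.0, 3.0]
y = [4.0, 0.5, 1.0]
for p in (1.5, 2.0, 3.0):
    print(p, holder_gap(x, y, p) >= 0)  # True for each p
```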

Absolutely convex and absorbing sets

Seminorms are closely related to absolutely convex and absorbing sets. Let p be a seminorm on a vector space V. Then for any real number α > 0 the sets {x : p(x) < α} and {x : p(x) ≤ α} are absorbing and absolutely convex. In a normed vector space the set {x : p(x) ≤ 1} is called the closed unit ball.

Conversely, to each absorbing and absolutely convex subset A of V there corresponds a seminorm p, called the gauge of A, defined as

p(x) := inf{α : α > 0, x ∈ α A}

with the property that

{x : p(x) < 1} ⊆ A ⊆ {x : p(x) ≤ 1}.
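
The gauge can be approximated numerically when A is given by a membership test. The sketch below uses bisection, which works because for an absolutely convex set the admissible values of α form an upper ray. The function `gauge`, the predicate argument `contains`, the tolerance, and the two example sets in R^2 are all assumptions made for illustration.

```python
def gauge(contains, x, tol=1e-9):
    """Approximate inf{alpha > 0 : x in alpha * A} by bisection.

    `contains(y)` is a hypothetical membership test for an absorbing,
    absolutely convex set A; x lies in alpha * A exactly when x / alpha is in A.
    """
    def scaled(alpha):
        return [xi / alpha for xi in x]

    hi = 1.0
    while not contains(scaled(hi)):  # terminates because A is absorbing
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contains(scaled(mid)):
            hi = mid
        else:
            lo = mid
    return hi

def in_closed_square(y):
    """A = closed unit ball of the maximum norm."""
    return max(abs(yi) for yi in y) <= 1.0

def in_open_diamond(y):
    """A = open unit ball of the taxicab norm."""
    return sum(abs(yi) for yi in y) < 1.0

x = [3.0, -4.0]
print(round(gauge(in_closed_square, x), 6))  # 4.0, the maximum norm of x
print(round(gauge(in_open_diamond, x), 6))   # 7.0, the taxicab norm of x
```

Both the open and the closed unit ball of a norm recover that norm as their gauge, in line with the inclusions above.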

A locally convex topological vector space has a local basis consisting of absolutely convex and absorbing sets. A common method to construct such a basis is to use a family of seminorms. Typically this family is infinite, and there are enough seminorms to distinguish between elements of the vector space, creating a Hausdorff space.

References

  1. Golub, Gene; Van Loan, Charles F. (1996). Matrix Computations (3rd ed.). Baltimore: The Johns Hopkins University Press. p. 53. ISBN 0-8018-5413-X.

See also