Newton's identities
From Wikipedia, the free encyclopedia
In mathematics, Newton's identities relate two different ways of describing the roots of a polynomial. They were found by Isaac Newton in about 1666, apparently in ignorance of earlier work (1629) by Albert Girard. These useful identities have immediate applications in many areas of mathematics, including Galois theory, invariant theory, group theory, combinatorics, as well as further applications outside mathematics, including general relativity. They can be considered as applications of ideas in computational algebraic geometry, particularly Gröbner bases.
Contents |
[edit] Mathematical statement
Consider the polynomial
where the xα are the roots and the aj are the coefficients. Frequently, this polynomial is regarded as the characteristic polynomial of a linear operator or matrix; then the roots are called eigenvalues.
Define the power sums
If we regard the roots as eigenvalues of a matrix A, then these quantities are the traces of the powers of the matrix
- .
Then the original form of Newton's identities is the recurrence
where the pattern is obvious. For an elementary proof see the book by Tignol cited below.
From these formulae we can readily obtain more useful formulae, expressing the power sums in terms of the coefficients:
Unfortunately, these formulae have the disadvantage that the pattern is no longer obvious.
Finally, we can solve to obtain expressions giving the coefficients in terms of the power sums:
and so forth, where we can see a partial pattern (factorials in the denominator, first term and last term), but again the general pattern is probably not obvious. But if you know about the theory of finite groups, take a hard second look! (Spoiler in a subsequent section.)
Newton seems to have left it there, and hence missed some lovely discoveries.
[edit] Relation with invariant theory
Invariant theory is concerned with polynomial invariants of various algebraic or geometric objects in mathematics, including polynomial invariants of quadratic forms, and more generally, polynomial invariants of tensors. From the latter, we obtain connections with the representation theory of groups.
A fundamental topic in invariant theory concerns the symmetric polynomials, which arise when we express the coefficients of a polynomial in terms of its roots. That is, multiplying out
we have
and so forth. In particular, we can use such expressions to obtain the characteristic polynomial of a linear operator if we know its eigenvalues.
Now the point for the theory of invariants is that if we consider the to be polynomials in the roots, then for a given n they form a basis for the space of symmetric polynomial functions of the roots. That is, every polynomial function of the roots which is invariant under any permutation of the roots, such as exchanging x1,x2, is given by a specific linear combination of the αj. For this reason, are called the elementary symmetric polynomials.
The point is that the expressions above give a different basis for the space of symmetric polynomials, namely
and so forth, where τj is of course simply the sum of the j-th powers of the roots. The fact that we can obtain in this way two distinct ways of representing all symmetric functions of the roots of a polynomial, without knowing the roots themselves, is of fundamental importance for Galois theory.
We can obtain "finer" decompositions by writing general symmetric polynomials as sums of homogeneous polynomials; that is, a symmetric polynomial in which all the terms have the same degree. A convenient basis for the homogeneous symmetric polynomials is given by the Schur polynomials, which correspond to the partitions of an integer, which can be enumerated by Ferrers diagrams (this is the "unlabeled" combinatorial concept corresponding to Young diagrams). For example, corresponding to the partition 4+2+1 = 7, we have the Schur polynomial
which is a homogeneous symmetric polynomial of degree seven in three variables. Amazingly enough, the determinant in the denominator cancels out when everything is fully expanded:
There are three other partitions of seven into three parts, so the space of homogeneous polynomials of degree seven in three variables has dimension four, with each polynomial uniquely expressible as a linear combination of four Schur polynomials. Each of these Schur polynomials can expressed in turn as an algebraic combination of (a polynomial function of) the three elementary symmetric polynomials in three variables, .
[edit] Relation with symmetric groups
The alert reader with some knowledge of the theory of finite groups, particularly the Pólya enumeration formula, will have noticed two striking facts in the discussion above:
- the Schur polynomials are defined in terms of integer partitions (or Ferrers diagrams), which correspond to the conjugacy classes of the symmetric group,
- up to alternating signs, the expressions found above giving the coefficients in terms of the power sums are exactly the cycle index polynomials of the symmetric groups.
This is no coincidence. Joseph Lagrange (and, with a little prior explanation, Isaac Newton) would have immediately understood the statement of the following theorem, which was found by George Pólya in the 1930s: define the characteristic function
where the xα are symbols which we think of as formal "roots" of the characteristic function, which we think of as a generating function for the coefficients aj. Define also the alternating power sums
(The alternating signs arise from the fact that transpositions or two-cycles are odd permutations, three-cycles are even permutations, and so forth.) Then the characteristic function is given by
If you expand the right hand side as the first few terms of a Taylor series in the variable u (you need only write out the first few terms in the exponent), you will obtain the cycle index polynomials of the first few symmetric groups! And, up to alternating signs, these agree with the expressions we found earlier giving the coefficients of the characteristic polynomial of a linear operator in terms of the traces of the powers of the operator. Taking sufficiently many terms in the power series, one can efficiently obtain the cycle index of any symmetric group from Pólya's formula.
[edit] Relation with enumerative combinatorics
Enumerative combinatorics concerns counting things, usually things defined by some kind of recursive construction. Examples include:
- the number of ways of partitioning the natural number n as a sum of natural numbers,
- the number of n-node binary forests,
- the number of sparse digraphs, which are digraphs with n-nodes, having either one or two edges coming into each node.
In each of these examples, the answer to the counting problem is a particular function of the natural number n, taking values in the natural numbers. Almost invariably, the most elegant way of solving such problems is to use the technique of generating functions which was introduced by the indefatigable Leonhard Euler.
In recent years, with the rise of category theory, a highly abstract way of thinking about generating functions has been introduced by André Joyal, in which a generalization of Pólya's cycle index polynomials has been introduced. These Joyal index functions are defined in terms of structors (also known as combinatorial species). These are certain functors which elegantly and precisely express the combinatorial notion of a recursive construction. The Joyal index functions really are a natural generalization of the Pólya index function to an oligomorphic group, a type of tractable infinite permutation group. This in turn leads to a beautiful connection with first-order logic.
Often, connected with a counting problem like those already mentioned, we have a closely related problem, counting
- the number of n-node binary trees (connected forests),
- the number of connected sparse digraphs.
Also, we may want to solve more precise counting problems, such as counting
- the number of ways of partitioning the natural number n as a sum of k natural numbers,
- the number of n-node binary forests containing k trees.
It turns out that such problems are related to the original problems by formulas resembling the Pólya formula given in the previous section. Maximal information is obtained when we can find the Joyal index of the connected structor obtained from a more complicated structure, or when we can obtain an attribute index function enumerating in terms of some integer valued attribute.
[edit] Relation with computational algebraic geometry
Algebraic geometry is primarily concerned with the algebraic problem of finding the roots of systems of polynomials in many variables and the geometric interpretation of this problem in terms of algebraic varieties. A fundamental computational technique for algebraic geometry, which has far-reaching implications in many other fields of mathematics, including differential equations, is the notion of a Gröbner basis. For our purposes we don't need to know precisely what these are, we need only know that Buchberger's algorithm for obtaining a Gröbner basis generalizes two of the most fundamental algorithms in mathematics:
- Gauss's algorithm for obtaining the row reduction of a matrix,
- division algorithm for dividing a polynomial in one variable by another such polynomial.
The resulting Gröbner basis of an ideal in a ring of multivariable polynomials is analogous to a vector basis for a subspace of vector space, and is ideally suited for computations involving ideals in polynomial rings, which is the basis concept of algebraic geometry. Indeed, the famous Nullstellensatz of David Hilbert establishes a perfect correspondence between a fundamental algebraic concept, the ideals of a nice kind of ring, and an equally fundamental geometric concept, varieties in an affine space, or even in a projective space.
Gröbner bases are defined with respect to choice of a certain term order (which specifies "priority" among monomials in carrying out the generalized division algorithm), and one of the most useful choices for algebraic geometry is an elimination order. In particular, using an elimination order we can solve a system of equations such as the identities giving the power sums in terms of the coefficients, to obtain equations giving the coefficients in terms of the power sums. Needless to say, we obtain the expressions found above.
[edit] See also
- elementary symmetric polynomial
- Schur polynomial
- fluid solutions, an article giving an application of Newton's identities to computing the characteristic polynomial of the Einstein tensor in the case of a perfect fluid, and similar articles on other types of exact solutions in general relativity.
[edit] References
- Tignol, Jean-Pierre (2001). Galois's theory of algebraic equations. Singapore: World Scientific. ISBN 978-981-02-4541-2.
— A readable, historically oriented introduction to Galois theory.
- Cameron, Peter J. (1999). Permutation Groups. Cambridge: Cambridge University Press. ISBN 978-0-521-65378-7.
— A fascinating introduction to permutation groups, including the Pólya cycle index, oligomorphic permutation groups, and the connection with mathematical logic.
- Bergeron, F.; Labelle, G.; and Leroux, P. (1998). Combinatorial species and tree-like structures. Cambridge: Cambridge University Press. ISBN 978-0-521-57323-8.
— A fairly readable monograph, which may leave the incorrect impression that this seemingly abstract approach has no practical advantages over more conventional approaches to generating function techniques.
- Sturmfels, Bernd (1992). Algorithms in Invariant Theory. New York: Springer-Verlag. ISBN 978-0-387-82445-1.
— A delightful monograph explaining the application of Gröbner bases to the invariant theory of finite groups, including Schur polynomials and an approach to Galois theory using invariants.
- Cox, David; Little, John, and O'Shea, Donal (1992). Ideals, Varieties, and Algorithms. New York: Springer-Verlag. ISBN 978-0-387-97847-5.
— A beautiful introduction to algebraic geometry using Gröbner bases, including a chapter on invariants of finite groups.
- Tucker, Alan (1980). Applied Combinatorics, 5/e, New York: Wiley. ISBN 978-0-471-73507-6.
— One of the most elementary and readable textbooks discussing Pólya's enumeration formula and cycle index polynomials.