Proofs of Fermat's theorem on sums of two squares

From Wikipedia, the free encyclopedia

Fermat's theorem on sums of two squares states that an odd prime p can be expressed as p = x2 + y2 with x and y integers if and only if p \equiv 1 \pmod{4}. It was originally announced by Fermat in 1640, but he gave no proof.

The only if clause is trivial: the squares modulo 4 are 0 and 1, so x2 + y2 is congruent to 0, 1, or 2 modulo 4. Since p is assumed to be odd, this means that it must be congruent to 1 modulo 4.

Contents

[edit] Euler's proof by infinite descent

Euler succeeded in proving Fermat's theorem on sums of two squares in 1747, when he was forty years old. He communicated this in a letter to Goldbach dated 6 May 1747. The proof relies on infinite descent, and proceeds in five steps; the fifth step below is from another letter to Goldbach written in 1749, as the first letter was vague on that final step:

1. The product of two numbers, each of which is a sum of two squares, is itself a sum of two squares.

This is simply a restatement of the Brahmagupta-Fibonacci identity.

2. If a number which is a sum of two squares is divisible by a prime which is a sum of two squares, then the quotient is a sum of two squares.

Indeed, suppose for example that a2 + b2 is divisible by p2 + q2 and that this latter is a prime. Then p2 + q2 divides
(pbaq)(pb + aq) = p2b2a2q2 = p2(a2 + b2) − a2(p2 + q2).
Since p2 + q2 is a prime, it divides one of the two factors. Suppose that it divides pbaq. Since
(a^2+b^2)(p^2+q^2) = (ap+bq)^2 + (aq-bp)^2\,
(Brahmagupta-Fibonacci identity) it follows that p2 + q2 must divide (ap + bq)2. So the equation can be divided by the square of p2 + q2. Dividing the expression by (p2 + q2)2 yields:
\frac{a^2+b^2}{p^2+q^2} = \left(\frac{ap+bq}{p^2+q^2}\right)^2 + \left(\frac{aq-bp}{p^2+q^2}\right)^2
and thus expresses the quotient as a sum of two squares, as claimed.
If p2 + q2 divides pb + aq, a similar argument holds by using
(a^2+b^2)(q^2+p^2) = (aq+bp)^2 + (ap-bq)^2\,
(Brahmagupta-Fibonacci identity).

3. If a number which can be written as a sum of two squares is divisible by a number which is not a sum of two squares, then the quotient has a factor which is not a sum of two squares.

Suppose x divides a2 + b2 and that the quotient, factored into its prime factors is p_1p_2\cdots p_n. Then a^2+b^2 = x p_1p_2\cdots p_n. If all factors pi can be written as sums of two squares, then we can divide a2 + b2 successively by p1, p2, etc., and applying the previous step we deduce that each quotient is a sum of two squares. This until we get to x, concluding that x would have to be the sum of two squares. So, by contrapositive, if x is not the sum of two squares, then at least one of the primes pi is not the sum of two squares.

4. If a and b are relatively prime then every factor of a2 + b2 is a sum of two squares.

This is the step that uses infinite descent. Let x be a factor of a2 + b2. We can write
a = mx \pm c,\qquad b = nx \pm d
where c and d are at most half of x in absolute value. This gives:
a^2 + b^2 = m^2x^2\pm 2mxc + c^2 + n^2x^2 \pm 2nxd + d^2 = Ax + (c^2+d^2).
Therefore, c2 + d2must be divisible by x, say c2 + d2 = yx. If c and d are not relatively prime, then their gcd cannot divide x (it would then divide a and b which we assume are relatively prime). Thus the square of the gcd divides y (as it divides c2 + d2), giving us an expression of the form e2 + f2 = zx for relatively prime e and f, and with z no more than half of x, since
zx = e^2 + f^2 \leq c^2+d^2 \leq \left(\frac{x}{2}\right)^2 + \left(\frac{x}{2}\right)^2 = \frac{1}{2}x^2.
If c and d are relatively prime, then we can use them directly instead of switching to e and f.
If x is not the sum of two squares, then by the third step there must be a factor of z which is not the sum of two squares; call it w. This gives an infinite descent, going from x to a smaller number w, both not the sums of two squares but dividing a sum of two squares. Since an infinite descent is impossible, we conclude that x must be expressible as a sum of two squares, as claimed.

5. Every prime of the form 4n + 1 is a sum of two squares.

If p = 4n + 1, then by Fermat's Little Theorem each of the numbers 1, 2^{4n}, 3^{4n},\dots, (4n)^{4n} is congruent to one modulo p. The differences 2^{4n}-1, 3^{4n}-2^{4n},\dots,(4n)^{4n}-(4n-1)^{4n} are therefore all divisible by p. Each of these differences can be factored as
a^{4n}-b^{4n} = \left(a^{2n}+b^{2n}\right)\left(a^{2n}-b^{2n}\right).
Since p is prime, it must divide one of the two factors. If in any of the 4n − 1 cases it divides the first factor, then by the previous step we conclude that p is itself a sum of two squares (since a and b differ by 1, they are relatively prime). So it is enough to show that p cannot always divide the second factor. If it divides all 4n − 1 differences 2^{2n}-1, 3^{2n}-2^{2n},\dots,(4n)^{2n}-(4n-1)^{2n}, then it would divide all 4n − 2 differences of successive terms, all 4n − 3 differences of the differences, and so forth. Since the kth differences of the sequence 1^k, 2^k, 3^k,\dots are all equal to k!, the 2nth differences would all be constant and equal to (2n)!, which is certainly not divisible by p. Therefore, p cannot divide all the second factors which proves that p is indeed the sum of two squares.

[edit] Lagrange's proof through quadratic forms

Lagrange gave a proof in 1770 based on his general theory of integral quadratic forms. The following is a slight simplification of his argument, due to Gauss, which appears in article 182 of the Disquisitiones Arithmeticae.

A (binary) quadratic form will be taken to be an expression of the form ax2 + 2bxy + cy2 with a,b,c integers. A number n is said to be represented by the form if there exist integers x,y such that n = ax2 + 2bxy + cy2. Fermat's theorem on sums of two squares is then equivalent to the statement that a prime p is represented by the form x2 + y2 (i.e., a = c = 1, b = 0) exactly when p is congruent to 1 modulo 4.

The discriminant of the quadratic form is defined to be b2ac (this is the definition due to Gauss; Lagrange did not require the xy term to have even coefficient, and defined the discriminant as b2 − 4ac). The discriminant of x2 + y2 is then equal to − 1.

Two forms ax2 + 2bxy + cy2 and rx'2 + 2sx'y' + ty'2 are equivalent if and only if there exist substitutions with integer coefficients

x = αx' + βy'
y = γx' + δy'

with \alpha\delta - \beta\gamma = \pm 1 such that, when substituted into the first form, yield the second. Equivalent forms are readily seen to have the same discriminant. Moreover, it is clear that equivalent forms will represent exactly the same integers.

Lagrange proved that all forms of discriminant − 1 are equivalent. Thus, to prove Fermat's theorem it is enough to find any form of discriminant − 1 that represents p. To do this, it suffices to find an integer m such that p divides m2 + 1. For, finding such an integer, we can consider the form

px^2 + 2mxy + \left(\frac{m^2+1}{p}\right)y^2,

which has discriminant − 1 and represents p by setting x = 1 and y = 0.

Suppose then that p = 4n + 1. Again we invoke Fermat's Little Theorem: for any z relatively prime to p, we know that p divides zp − 1 − 1 = z4n − 1 = (z2n − 1)(z2n + 1). Moreover, by a theorem of Lagrange, the number of solutions modulo p to a congruence of degree q modulo p is at most q (this follows since the integers modulo p form a field, and a polynomial of degree q has at most q roots). So the congruence z^{2n} - 1 \equiv 0 \pmod{p} has at most 2n solutions among the numbers 1, 2, \dots, p-1=4n. Therefore, there exists some positive integer z strictly smaller than p (and so relatively prime to p) such that p does not divide z2n − 1. Since p divides z4n − 1 = (z2n − 1)(z2n + 1), p must divide z2n + 1. Setting m = zn completes the proof.

[edit] Dedekind's proofs with Gaussian integers

Dedekind gave at least two proofs of Fermat's theorem on sums of two squares, both using the arithmetical properties of the Gaussian integers. One appears in section 27 of his exposition of ideals published in 1877; the second appeared in Supplement XI to Dirichlet's Lectures on Number Theory, and was published in 1894.

1. First proof. If p is a positive odd rational prime, then we have i^p = (-1)^{\frac{p-1}{2}}i, where i is the imaginary square root of − 1. Consequently, if ω = x + iy is any Gaussian integer, then

\omega^p = (x+yi)^p \equiv x + (-1)^{\frac{p-1}{2}}yi \pmod{p}.

If p is congruent to 1 modulo 4, then each Gaussian integer ω satisfies the congruence

\omega^p \equiv \omega \pmod{p},

and therefore the ideal (p) is the product of two different prime ideals of degree 1 in Z[i]. Since Z[i] is a principal ideal domain, (in fact, a Euclidean domain), every ideal is principal, generated by an element of minimal norm. So if α is a generator of one of the ideal factors of (p), then we must have p = N(α) = N(a + bi) = a2 + b2 which gives Fermat's theorem.

2. Second proof. This proof builds on Lagrange's result that if p = 4n + 1, then there must be a number of the form m2 + 1 which is a multiple of p. Since p does not divide the Gaussian factors m + i nor mi (the quotients are clearly equal to \begin{matrix}\frac{m}{p}\pm \frac{i}{p}\end{matrix}, which are not Gaussian integers), despite dividing the product, it follows that p cannot be a prime in Z[i]. Since it is not a prime, we must have a factorization in the Gaussian integers of the form p = (x + yi)(xyi) for some integers x and y, which immediately yields that p = x2 + y2.

[edit] References

  • Richard Dedekind, The theory of algebraic integers.
  • Harold M. Edwards, Fermat's Last Theorem. A genetic introduction to algebraic number theory. Graduate Texts in Mathematics no. 50, Springer-Verlag, NY, 1977.
  • C. F. Gauss, Disquisitiones Arithmeticae (English Edition). transl. by Arthur A. Clarke. Springer-Verlag, 1986.
  • John Stillwell, Introduction to Theory of Algebraic Integers by Richard Dedekind. Cambridge Mathematical Library, Cambridge University Press, 1996.

[edit] External links