Fermat's factorization method
Fermat's factorization method is based on the representation of an odd integer as the difference of two squares:
- N = a^2 − b^2.
That difference is algebraically factorable as (a + b)(a − b); if neither factor equals one, it is a proper factorization of N. Put another way, we are looking for a, b such that a^2 ≡ b^2 (mod N), called a congruence of squares.
Each odd number has such a representation. Indeed, if N = cd is a factorization of N, then
- N = [(c + d) / 2]^2 − [(c − d) / 2]^2.
Since N is odd, c and d are also odd, so those halves are integers. (A multiple of four is also a difference of squares: let c and d be even.)
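As a small check of the identity above, the following Python sketch converts a factor pair (c, d) into the pair (a, b) with cd = a^2 − b^2. The helper name `squares_from_factors` is hypothetical and not part of the source.

```python
def squares_from_factors(c, d):
    """Return (a, b) with c * d == a * a - b * b, for c and d of equal parity."""
    if (c - d) % 2 != 0:
        raise ValueError("c and d must have the same parity")
    a = (c + d) // 2
    b = abs(c - d) // 2
    return a, b


a, b = squares_from_factors(59, 101)
assert a * a - b * b == 59 * 101
print(a, b)  # 80 21
```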
In its simplest form, Fermat's method can be even slower than trial division in the worst case. Nonetheless, a combination of trial division and Fermat's method is more effective than either alone.
The basic method
One tries various values of a, hoping that a^2 − N = b^2 is a square.
- FermatFactor(N): // N should be odd
  - A ← ceil(sqrt(N))
  - Bsq ← A*A - N
  - while Bsq isn't a square:
    - A ← A + 1
    - Bsq ← A*A - N // equivalently: Bsq ← Bsq + 2*A - 1
  - endwhile
  - return A - sqrt(Bsq) // or A + sqrt(Bsq)
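The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, assuming Python 3.8 or later for `math.isqrt`; the function name `fermat_factor` and the choice to return both factors are assumptions made here, not part of the source.

```python
from math import isqrt


def fermat_factor(n):
    """Return (p, q) with p * q == n and p <= q, for a positive odd n."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a = isqrt(n)
    if a * a < n:
        a += 1                  # a = ceil(sqrt(n))
    b_sq = a * a - n
    b = isqrt(b_sq)
    while b * b != b_sq:        # loop until b_sq is a perfect square
        b_sq += 2 * a + 1       # (a + 1)^2 - n = (a^2 - n) + 2a + 1
        a += 1
        b = isqrt(b_sq)
    return a - b, a + b


print(fermat_factor(5959))  # (59, 101)
```

Using exact integer square roots avoids the rounding problems that a floating-point `sqrt` would introduce for large N.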
For example, to factor N = 5959, one computes
A: | 78 | 79 | 80 |
Bsq: | 125 | 282 | 441 |
The third try produces a square. A = 80, B = 21, and the factors are A − B = 59, and A + B = 101.
Suppose N has more than one factorization. The procedure first finds the factorization with the least values of a and b. That is, a + b is the smallest factor ≥ the square root of N, and so a − b = N / (a + b) is the largest factor ≤ the square root of N. If the procedure finds N = 1 · N, that shows that N is prime.
For N = cd, let c be the largest subroot factor. Then a = (c + d) / 2, and the search starts near √N, so the number of steps is approximately (c + d) / 2 − √N = (√d − √c)^2 / 2 = (√N − c)^2 / (2c).
If N is prime (so that c = 1), one needs O(N) steps. This is a bad way to prove primality. But if N has a factor close to its square root, the method works quickly. More precisely, if c differs from √N by less than (4N)^(1/4), the method requires only one step. Note that this is independent of the size of N.
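To make the step counts concrete, here is a small Python sketch that counts how many values of a are tested before a^2 − N becomes a square. The helper name `fermat_tries` is hypothetical; the three inputs are chosen here for illustration (the example N = 5959 from above, a product of two nearly equal factors, and a small prime).

```python
from math import isqrt


def fermat_tries(n):
    """Count the values of a tested until a * a - n is a perfect square (n odd)."""
    a = isqrt(n)
    if a * a < n:
        a += 1                      # start at ceil(sqrt(n))
    tries = 1
    while isqrt(a * a - n) ** 2 != a * a - n:
        a += 1
        tries += 1
    return tries


print(fermat_tries(5959))            # 3  (a = 78, 79, 80)
print(fermat_tries(10007 * 10009))   # 1  (factors straddle the square root)
print(fermat_tries(101))             # 41 (prime: a runs from 11 up to 51)
```

For a prime N the loop only stops at a = (N + 1) / 2, which is why the count grows linearly with N.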
Fermat's and trial division
Let's try to factor the prime number N = 2345678917, computing B and A − B along the way.
A: | 48433 | 48434 | 48435 | 48436 |
Bsq: | 76572 | 173439 | 270308 | 367179 |
B: | 276.7 | 416.5 | 519.9 | 605.9 |
A-B: | 48156.3 | 48017.5 | 47915.1 | 47830.1 |
In practice, one wouldn't bother with that last row, until B is an integer. But observe that if N had a subroot factor above A − B = 47830.1, Fermat's method would have found it already.
Trial division would normally try up to 48432; but after only four Fermat steps, we need only divide up to 47830, to find a factor or prove primality.
In this regard, Fermat's gives diminishing returns. One would probably stop long before this point:
A: | 60001 | 60002 |
Bsq: | 1254441084 | 1254561087 |
B: | 35418.1 | 35419.8 |
A-B: | 24582.9 | 24582.2 |
This all suggests a combined factoring method. Choose some bound c; use trial division to find factors below c, and Fermat's for factors above c. That is, do Fermat until a − b < c, or a > (c + N / c) / 2. The best choice of c depends on N, and on the computing environment.
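The combined method described above might be sketched as follows. This is only an illustration under stated assumptions: the function name `combined_factor` and its parameter `bound` (playing the role of c) are hypothetical, N is assumed odd, and no attempt is made to choose the bound well.

```python
from math import isqrt


def combined_factor(n, bound):
    """Factor odd n: trial division below `bound`, Fermat's method above it.

    Returns (p, q) with p * q == n; the result (1, n) means n is prime.
    """
    # Trial division by odd candidates strictly below the bound.
    for p in range(3, bound, 2):
        if p * p > n:
            return 1, n
        if n % p == 0:
            return p, n // p
    # Fermat steps. A factor c >= bound corresponds to
    # a = (c + n / c) / 2 <= (bound + n / bound) / 2, so stop beyond that.
    a = isqrt(n)
    if a * a < n:
        a += 1
    a_max = (bound * bound + n) // (2 * bound)
    while a <= a_max:
        b_sq = a * a - n
        b = isqrt(b_sq)
        if b * b == b_sq:
            return a - b, a + b
        a += 1
    return 1, n


print(combined_factor(5959, 10))  # (59, 101)
```

With `bound = 1` the trial-division loop is empty and the function reduces to plain Fermat; a very large bound makes it plain trial division.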
It also depends on the algorithm: there are ways to speed up the basic method.
Sieve improvement
One needn't compute all the square roots of a^2 − N, nor even examine all the values for a. Examine the tableau for N = 2345678917:
A: | 48433 | 48434 | 48435 | 48436 |
Bsq: | 76572 | 173439 | 270308 | 367179 |
B: | 276.7 | 416.5 | 519.9 | 605.9 |
One can quickly tell that none of these values of Bsq are squares. Squares are always congruent to 0, 1, 4, 5, 9, or 16 modulo 20, and the values repeat with each increase of a by 10. In this example N is 17 modulo 20, so a^2 − N is 3, 4, 7, 8, 12, or 19 modulo 20 for those values of a^2. Only the 4 in this list can be a square. Thus a^2 must be 1 mod 20, which means that a is 1 or 9 mod 10; it will produce a Bsq which is 4 mod 20, and if Bsq is a square, b will be 2 or 8 mod 10.
This can be performed with any modulus. Using the same N = 2345678917,
modulo 16: | Squares are | 0, 1, 4, or 9 |
           | N mod 16 is | 5 |
           | so a^2 can only be | 9 |
           | and a must be | 3 or 5 modulo 8 |
modulo 9:  | Squares are | 0, 1, 4, or 7 |
           | N mod 9 is | 7 |
           | so a^2 can only be | 7 |
           | and a must be | 4 or 5 modulo 9 |
One generally chooses a power of a different prime for each modulus.
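The residue arguments above (modulo 20, 16, and 9) can be reproduced mechanically. The following Python sketch does so by brute force over all residues; the helper names `square_residues` and `allowed_a_residues` are hypothetical.

```python
def square_residues(m):
    """Set of values x * x % m over all residues x."""
    return {x * x % m for x in range(m)}


def allowed_a_residues(n, m):
    """Residues a mod m for which a * a - n is a square modulo m."""
    squares = square_residues(m)
    return sorted(a for a in range(m) if (a * a - n) % m in squares)


N = 2345678917
print(sorted(square_residues(20)))  # [0, 1, 4, 5, 9, 16]
print(allowed_a_residues(N, 20))    # [1, 9, 11, 19]  i.e. 1 or 9 mod 10
print(allowed_a_residues(N, 16))    # [3, 5, 11, 13]  i.e. 3 or 5 mod 8
print(allowed_a_residues(N, 9))     # [4, 5]
```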
Given a sequence of a-values (start, end, and step) and a modulus, one can proceed thus:
- FermatSieve(N, Astart, Aend, Astep, Modulus)
  - A ← Astart
  - do Modulus times:
    - Bsq ← A*A - N
    - if Bsq is a square, modulo Modulus:
      - FermatSieve(N, A, Aend, Astep * Modulus, NextModulus)
    - endif
    - A ← A + Astep
  - enddo
But one stops the recursion, when few a-values remain; that is, when (Aend-Astart)/Astep is small. Also, because a's step-size is constant, one can compute successive Bsq's with additions.
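As an illustration of the sieving idea, here is a simplified Python sketch. It is a flat variant, not the recursive FermatSieve procedure above: it steps through every a but only takes a square root when a passes the residue test for each modulus. The function name `fermat_sieve`, the default moduli (powers of distinct primes, chosen arbitrarily here), and the returned counter are all assumptions.

```python
from math import isqrt


def fermat_sieve(n, moduli=(16, 9, 5, 7)):
    """Fermat's method for odd n, skipping a-values ruled out modulo each modulus.

    Returns (p, q, roots_taken), where roots_taken counts the square roots computed.
    """
    # For each modulus m, flag residues r where r * r - n can be a square mod m.
    tables = []
    for m in moduli:
        squares = {x * x % m for x in range(m)}
        tables.append((m, [(r * r - n) % m in squares for r in range(m)]))
    a = isqrt(n)
    if a * a < n:
        a += 1
    roots_taken = 0
    while True:
        if all(ok[a % m] for m, ok in tables):
            roots_taken += 1
            b_sq = a * a - n
            b = isqrt(b_sq)
            if b * b == b_sq:
                return a - b, a + b, roots_taken
        a += 1


p, q, roots = fermat_sieve(5959)
print(p, q)  # 59 101
```

The loop always terminates, because a = (n + 1) / 2 yields a true square, and a true square passes every residue test. The recursive formulation in the pseudocode goes further by never visiting the excluded a-values at all.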
Multiplier improvement
Fermat's method works best when there's a factor near the square-root of N. Perhaps one can arrange for that to happen.
If one knew the approximate ratio of two factors (d / c), then one could pick a rational number v / u near that value. Nuv = cv * du, and the factors are roughly equal: Fermat's, applied to Nuv, will find them quickly. Then gcd(N,cv) = c and gcd(N,du) = d. (Unless c divides u or d divides v.)
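A small worked example of the multiplier idea, as a hedged Python sketch. The function name `fermat_with_multiplier` is hypothetical, the product Nuv is assumed to be odd, and the example N = 101 · 307 (ratio d / c close to 3, so v / u = 3 / 1) is chosen here for illustration.

```python
from math import gcd, isqrt


def fermat_with_multiplier(n, u, v):
    """Run Fermat's method on n * u * v (assumed odd), then map back with gcd.

    Returns (gcd(n, a - b), gcd(n, a + b), tries).
    """
    m = n * u * v
    if m % 2 == 0:
        raise ValueError("n * u * v must be odd for this sketch")
    a = isqrt(m)
    if a * a < m:
        a += 1
    tries = 1
    while isqrt(a * a - m) ** 2 != a * a - m:
        a += 1
        tries += 1
    b = isqrt(a * a - m)
    return gcd(n, a - b), gcd(n, a + b), tries


# N = 101 * 307 = 31007; with u = 1, v = 3 the product is 303 * 307.
print(fermat_with_multiplier(31007, 1, 3))  # (101, 307, 1)
print(fermat_with_multiplier(31007, 1, 1))  # (101, 307, 28)
```

With the multiplier, the factors 303 and 307 of Nuv are nearly equal and the first value of a succeeds; without it, 28 values of a are needed.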
Generally, one doesn't know the ratio, but one can try various u / v values, and try to factor each resulting Nuv. R. Lehman devised a systematic way to do this, so that Fermat's plus trial division can factor N in O(N^(1/3)) time. See R. Lehman, "Factoring Large Integers", Mathematics of Computation, 28:637-646, 1974.
Other improvements
The fundamental ideas of Fermat's factorization method are the basis of the quadratic sieve and general number field sieve, the best-known algorithms for factoring "worst-case" large semiprimes. The primary improvement that the quadratic sieve makes over Fermat's factorization method is that instead of simply finding a square in the sequence of a^2 − N, it finds a subset of elements of this sequence whose product is a square, and it does this in a highly efficient manner. The end result is the same: a difference of squares mod N that, if nontrivial, can be used to factor N.
See also J. McKee, "Speeding Fermat's factoring method", Mathematics of Computation, 68:1729-1737, 1999.