Dixon's factorization method

From Wikipedia, the free encyclopedia

In number theory, Dixon's factorization method (also Dixon's algorithm) is a general-purpose integer factorization algorithm. The quadratic sieve is a modification of the basic idea used in Dixon's method.

Basic idea

Dixon's method is based on finding a congruence of squares. The naive way to find one, a randomized analogue of Fermat's factorization method, is to select random or pseudo-random x values and hope that one of them satisfies, for some y, the congruence:

x^2\equiv y^2\quad(\hbox{mod }n),\qquad x\not\equiv\pm y\quad(\hbox{mod }n),

where n is the integer to be factorized. Given such a pair, gcd(x − y, n) is a non-trivial factor of n. In practice, selecting random x values takes an impractically long time to produce a congruence of squares. Dixon's method instead satisfies a much weaker condition many times over, and then combines the resulting values into a congruence of squares.
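As a small illustration of why a congruence of squares is useful, the following Python sketch extracts a factor from a known pair (x, y). The function name `factor_from_congruence` and the sample numbers are chosen here for illustration and are not part of the method's description.

```python
from math import gcd

def factor_from_congruence(x, y, n):
    """Return a non-trivial factor of n from x^2 = y^2 (mod n), or None.

    None is returned when x = +/- y (mod n), in which case the gcd is
    trivial (1 or n) and the congruence reveals nothing.
    """
    if (x * x - y * y) % n != 0:
        raise ValueError("x^2 and y^2 are not congruent modulo n")
    d = gcd(x - y, n)
    return d if 1 < d < n else None

# Illustrative values: 20712^2 and 16800^2 are congruent modulo 84923.
print(factor_from_congruence(20712, 16800, 84923))  # 163
```

Here 84923 = 163 · 521, and the gcd picks out the factor 163.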

Method

Firstly, a set of primes less than some bound B is chosen. This set of primes is called the factor base. Then, using the polynomial

p(x)=x^2 \bmod n

many values of x are tested to see if p(x) factors completely over the factor base. If it does, the pair (x,p(x)) is stored. Such a pair is called a relation. Then, once the number of relations collected exceeds the size of the factor base, we can enter the next stage.
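The relation-collection stage can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names (`primes_below`, `factor_over_base`, `collect_relations`) are hypothetical, candidate x values are passed in by the caller rather than drawn at random, and each relation is stored together with its exponents over the factor base instead of p(x) itself.

```python
from math import isqrt

def primes_below(bound):
    """Return all primes strictly less than bound (simple sieve)."""
    if bound <= 2:
        return []
    sieve = [True] * bound
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(bound) + 1):
        if sieve[i]:
            for j in range(i * i, bound, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def factor_over_base(value, factor_base):
    """Return the exponents of value over factor_base, or None if
    value does not factor completely over it."""
    if value <= 0:
        return None
    exponents = []
    for p in factor_base:
        e = 0
        while value % p == 0:
            value //= p
            e += 1
        exponents.append(e)
    return exponents if value == 1 else None

def collect_relations(n, factor_base, candidates):
    """Keep each x whose p(x) = x^2 mod n is smooth over factor_base."""
    relations = []
    for x in candidates:
        exponents = factor_over_base(x * x % n, factor_base)
        if exponents is not None:
            relations.append((x, exponents))
    return relations

base = primes_below(8)  # [2, 3, 5, 7]
print(collect_relations(84923, base, [513, 514, 537]))
# [(513, [4, 1, 2, 1]), (537, [6, 1, 2, 1])]
```

With n = 84923, the candidate 514 is rejected because 514^2 mod n = 9427 has no prime factor in the base.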

The p(x) values are factorized (this is easy since we are certain they factorize completely over the factor base) and the exponents of the prime factors are converted into an exponent vector mod 2. For example, if the factor base is {2, 3, 5, 7} and the p(x) value is 30870, we have:

30870=2^1\cdot3^2\cdot 5^1\cdot 7^3

This gives an exponent vector of:

\mathbf{v}_i=\begin{bmatrix}1 \\ 2 \\ 1 \\ 3\end{bmatrix}=\begin{bmatrix}1 \\ 0 \\ 1 \\ 1\end{bmatrix}\hbox{ mod 2}
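The reduction to an exponent vector mod 2 can be sketched in a few lines of Python. The function names are illustrative only. The second helper adds two reduced vectors, which corresponds to multiplying the underlying p(x) values; a zero result means the product is a perfect square.

```python
def reduce_mod2(exponents):
    """Reduce each exponent to its parity (0 for even, 1 for odd)."""
    return [e % 2 for e in exponents]

def add_mod2(u, v):
    """Add two exponent vectors component-wise modulo 2."""
    return [(a + b) % 2 for a, b in zip(u, v)]

# 30870 = 2^1 * 3^2 * 5^1 * 7^3 over the factor base {2, 3, 5, 7}
print(reduce_mod2([1, 2, 1, 3]))  # [1, 0, 1, 1]

# 8400 = 2^4 * 3 * 5^2 * 7 and 33600 = 2^6 * 3 * 5^2 * 7:
# their product is the perfect square 16800^2.
print(add_mod2(reduce_mod2([4, 1, 2, 1]), reduce_mod2([6, 1, 2, 1])))  # [0, 0, 0, 0]
```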

If we can find some way to add these exponent vectors together (equivalent to multiplying the corresponding relations together) to produce the zero vector (mod 2), then we can get a congruence of squares. Thus we can put the exponent vectors together into a matrix, and formulate an equation:

c_1\mathbf{v}_1+c_2\mathbf{v}_2+\cdots+c_m\mathbf{v}_m=\mathbf{0}\hbox{ (mod 2)}

where m is the number of relations collected and each c_j is 0 or 1. Writing v_{ij} for the i-th entry of \mathbf{v}_j and r for the number of primes in the factor base, this can be converted into a matrix equation:

\begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1m}\\ v_{21} & v_{22} & \cdots & v_{2m}\\ \vdots & \vdots & \ddots & \vdots\\ v_{r1} & v_{r2} & \cdots & v_{rm} \end{bmatrix}\begin{bmatrix}c_1 \\ c_2 \\ \vdots \\ c_m\end{bmatrix}=\begin{bmatrix}0\\0\\ \vdots\\0\end{bmatrix}\hbox{ (mod 2)}

This matrix equation is then solved (using, for example, Gaussian elimination) to find the vector c. Then:

\prod_{k}x_k^2\equiv\prod_{k}p(x_k)\pmod{n}

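where the products are taken over all k for which c_k = 1. At least one of the c_k must be 1, that is, c must be a nonzero solution; one exists because there are more relations than primes in the factor base. Because of the way we have solved for c, every prime on the right-hand side appears to an even power, so the right-hand side is a perfect square y^2. Setting x to the product of the x_k then gives a congruence of squares x^2 ≡ y^2 (mod n). If x ≢ ±y (mod n), then gcd(x − y, n) is a non-trivial factor of n; otherwise a different solution c is tried.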
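The linear-algebra stage and the final combination can be sketched as follows. This is one possible illustration, not a reference implementation: it assumes relations are given as (x, exponents) pairs, stores each exponent vector mod 2 as an integer bitmask, and uses plain Gaussian elimination over GF(2). The names `find_dependency` and `factor_from_relations` are hypothetical.

```python
from math import gcd

def find_dependency(vectors):
    """Return indices of a non-empty subset of vectors whose sum is
    zero modulo 2, or None if the vectors are linearly independent."""
    pivots = {}  # leading bit -> (reduced row, bitmask of vectors used)
    for i, vec in enumerate(vectors):
        row = sum((e & 1) << j for j, e in enumerate(vec))
        combo = 1 << i
        while row:
            col = row.bit_length() - 1
            if col not in pivots:
                pivots[col] = (row, combo)
                break
            pivot_row, pivot_combo = pivots[col]
            row ^= pivot_row        # eliminate the leading bit
            combo ^= pivot_combo    # track which vectors were added
        else:
            # row reduced to zero: combo encodes a dependency
            return [k for k in range(len(vectors)) if (combo >> k) & 1]
    return None

def factor_from_relations(n, factor_base, relations):
    """Combine relations (x, exponents) into a congruence of squares
    and return a non-trivial factor of n, or None on failure."""
    subset = find_dependency([exps for _, exps in relations])
    if subset is None:
        return None
    x = 1
    totals = [0] * len(factor_base)
    for k in subset:
        xk, exps = relations[k]
        x = x * xk % n
        totals = [t + e for t, e in zip(totals, exps)]
    y = 1
    for p, t in zip(factor_base, totals):
        y = y * pow(p, t // 2, n) % n  # every total exponent is even
    d = gcd(x - y, n)
    return d if 1 < d < n else None

relations = [(513, [4, 1, 2, 1]), (537, [6, 1, 2, 1])]
print(factor_from_relations(84923, [2, 3, 5, 7], relations))  # 163
```

In this example the two relations already form a dependency, giving x = 20712 and y = 16800. A fuller implementation would retry with another dependency when the gcd turns out to be trivial; this sketch simply returns None in that case.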

Optimizations

The quadratic sieve is an optimization of Dixon's method. It solves a quadratic congruence to find suitable x values much faster than simply by random selection.

Other ways to optimize Dixon's method include using a better algorithm to solve the matrix equation. In practice, the Lanczos algorithm is often used. Also, the size of the factor base must be chosen carefully. If it is too small, it will be difficult to find numbers that factorize completely over it. If it is too large, more relations will have to be collected.

The optimal complexity of Dixon's method is O(exp(2·sqrt(2)·sqrt(ln n · ln ln n))). [1]
