Floyd's cycle-finding algorithm

From Wikipedia, the free encyclopedia

Floyd's cycle-finding algorithm is an algorithm which can detect cycles in arbitrary sequences, whether in data structures or generated on the fly (notably including those in graphs and pseudo-random sequences) in O(1) space. The algorithm is named for Robert W. Floyd, who invented it in 1967^[1]. It is sometimes called the "tortoise and the hare"-algorithm.

It should not be confused with the Floyd-Warshall algorithm for shortest paths.

1 The algorithm
- 1.1 Visualizing the algorithm
2 Performance
3 Variants
4 References
5 External links

[edit] The algorithm

The following discussion is couched in terms of pseudo-random sequences in particular, of great importance in analyzing pseudo-random number generators and in applications to factoring algorithms such as Pollard's rho algorithm.

Let

$f\colon S\mapsto S$

be a pseudo-random function, with S a finite set of cardinality n. Define a sequence a_i as:

a i + 1 = f (a i)

Clearly such a sequence must cycle after at most n iterations of the pseudo-random function, because there are only n possible values for an element. The naïve way to find the length of the cycle is to store each element of the sequence and, at each iteration, look among all the stored elements for duplicates. This means that the storage requirement is O(μ + λ), where μ is the length of the cycle and λ is the length of the tail (i.e. the part of the sequence that does not cycle).

Note that if we find two elements of the sequence such that

a i = a j

then |i − j| must be a multiple of the cycle length, because of the definition of a cycling sequence:

a λ + m = a λ + m + k μ .

The difference of the two indices that hold equal elements is kμ, a multiple of the cycle length. Floyd's cycle-finding algorithm finds such an equality by running two instances of the sequence in parallel, one twice as "fast" as the other; i.e. one instance undergoes two iterations while the other undergoes one. Then, when

a m = a 2 m

then any divisor of 2m − m = m might be the length of the cycle. (The tail length, λ, at this point can be any value > m - μ and <= m.)

Once an equality is found, λ can be determined by single-stepping forward from $a m$ and from the beginning of the sequence in parallel, and checking for equality after each step, as in the first stage of the algorithm. Since $a m$ is m - λ steps past the beginning of the cycle, advancing forward λ steps from there leads to a point m steps past the beginning of the cycle, which, since m is a multiple of μ, is the beginning of the cycle. So after λ steps, both instances will arrive at the beginning of the cycle, and the number of steps required for that to happen is λ.

Once λ is known, μ can be found by repeating the first stage of the algorithm, starting at the beginning of the cycle. Since the new tail length is zero, the new m will be guaranteed equal to the cycle length rather than just a multiple of it.

[edit] Visualizing the algorithm

The best way to visualize this algorithm is to make a diagram of the sequence. It looks like the Greek letter ρ. The sequence starts at the bottom of the "tail", and moves upward and clockwise around the loop. Following the algorithm, the two instances of the sequence will meet at a₆ after six iterations. If the algorithm keeps running, the sequences will meet again, after six more iterations, at the same element. Since the cycle length is in fact six, the same result will keep occurring.

[edit] Performance

The first stage of the algorithm requires at least max( λ, μ ), and at most λ + μ comparisons to find an equality. Finding the beginning of the cycle requires another λ comparisons. Finding the exact cycle length can be done with μ additional comparisons, though it can be done with fewer by taking advantage of the fact that m must be a multiple of μ.

The algorithm uses O(1) storage.

[edit] Variants

Perhaps the most widely known variant of Floyd's cycle-finding algorithm is Pollard's rho algorithm, an integer factorization algorithm that uses pseudo-random sequences to factor integers. There is also an algorithm for calculating discrete logarithms based on Floyd's cycle-finding algorithm.