Collatz conjecture

From Wikipedia, the free encyclopedia

Directed graph showing the orbits of small numbers under the Collatz map. The Collatz conjecture is equivalent to the statement that all paths eventually lead to 1.

The Collatz conjecture is an unsolved conjecture in mathematics. It is named after Lothar Collatz, who first proposed it in 1937. The conjecture is also known as the 3n + 1 conjecture, as the Ulam conjecture (after Stanislaw Ulam), as the Syracuse problem, as the hailstone sequence or hailstone numbers, or as wondrous numbers per Gödel, Escher, Bach. It asks whether a certain kind of number sequence always ends in the same way, regardless of the starting number.

Paul Erdős said about the Collatz conjecture, "Mathematics is not yet ready for such problems." He offered $500 for its solution. (Lagarias 1985)

Statement of the problem

Consider the following operation on an arbitrary positive integer:

  • If the number is even, divide it by two.
  • If the number is odd, triple it and add one.

For example, if this operation is performed on 3, the result is 10; if it is performed on 28, the result is 14.

In modular arithmetic notation, define the function f as follows:

 f(n) = \begin{cases} n/2 &\mbox{if } n \equiv 0 \pmod{2}\\ 3n+1 & \mbox{if } n\equiv 1 \pmod{2}.\end{cases}
Numbers from 2 to 9999 and their corresponding total stopping time.

Now, form a sequence by performing this operation repeatedly, beginning with any positive integer, and taking the result at each step as the input at the next.

In notation:

 a_i = \begin{cases}n & \mbox{for } i = 0 \\ f(a_{i-1}) & \mbox{for } i > 0. \end{cases}

The Collatz conjecture is: This process will eventually reach the number 1, regardless of which positive integer is chosen initially.

The smallest i such that a_i = 1 is called the total stopping time of n. The conjecture asserts that every n has a well-defined total stopping time. If, for some n, no such i exists, we say that n has infinite total stopping time, and the conjecture is false.

If the conjecture is false, it can only be because there is some starting number which gives rise to a sequence which does not contain 1. Such a sequence might enter a repeating cycle that excludes 1, or increase without bound. No such sequence has been found.

Examples

For instance, starting with n = 6, one gets the sequence 6, 3, 10, 5, 16, 8, 4, 2, 1.

Starting with n = 11, the sequence takes longer to reach 1: 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.

If the starting value n = 27 is chosen, the sequence takes 111 steps, climbing above 9,000 before descending to 1.

{ 27, 82, 41, 124, 62, 31, 94, 47, 142, 71, 214, 107, 322, 161, 484, 242, 121, 364, 182, 91, 274, 137, 412, 206, 103, 310, 155, 466, 233, 700, 350, 175, 526, 263, 790, 395, 1186, 593, 1780, 890, 445, 1336, 668, 334, 167, 502, 251, 754, 377, 1132, 566, 283, 850, 425, 1276, 638, 319, 958, 479, 1438, 719, 2158, 1079, 3238, 1619, 4858, 2429, 7288, 3644, 1822, 911, 2734, 1367, 4102, 2051, 6154, 3077, 9232, 4616, 2308, 1154, 577, 1732, 866, 433, 1300, 650, 325, 976, 488, 244, 122, 61, 184, 92, 46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, 5, 16, 8, 4, 2, 1 }

Program to calculate Collatz sequences

A specific Collatz sequence can be easily computed, as is shown by this pseudocode example:

function collatz(n)
  while n > 1
    show n
    if n is odd
      set n to 3n + 1
    else
      set n to n / 2
  show n

This program halts when the sequence reaches 1, in order to avoid printing an endless cycle of 4, 2, 1. If the Collatz conjecture is true, the program will always halt no matter what positive starting integer is given to it. (See Halting problem for a discussion of the relationship between open-ended computer programs and unsolved mathematics problems.)
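
The pseudocode above translates almost line for line into Python. The following is a minimal sketch rather than a reference implementation; the names collatz_sequence and total_stopping_time are chosen here for illustration. It returns the sequence as a list instead of printing each term, so that the examples given earlier can be checked.

```python
def collatz_sequence(n):
    """Return the Collatz sequence starting at n and ending at the first 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    sequence = [n]
    while n > 1:
        if n % 2 == 1:
            n = 3 * n + 1      # odd: triple and add one
        else:
            n = n // 2         # even: halve
        sequence.append(n)
    return sequence


def total_stopping_time(n):
    """Number of steps needed to reach 1 (loops forever if 1 is never reached)."""
    return len(collatz_sequence(n)) - 1


print(collatz_sequence(6))
print(total_stopping_time(27), max(collatz_sequence(27)))
```

As with the pseudocode, termination for every positive input is exactly what the conjecture asserts, so the loop carries no guarantee of halting.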

Supporting arguments

Although the conjecture has not been proven, most mathematicians who have looked into the problem believe intuitively that the conjecture is true. Here are two reasons for expecting this.

Experimental evidence

The conjecture has been checked by computer for all start values up to 10 × 2^58 ≈ 2.88 × 10^18 [1]. While impressive, such computer bounds are of very limited evidential value. More than one important conjecture has been found to have only exceptionally large counterexamples (see, for example, the Pólya conjecture, the Mertens conjecture, and Skewes' number).
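
A check of this kind can exploit a simple observation: if every start value below n is already known to reach 1, it suffices to iterate from n only until the value drops below n. The sketch below illustrates the idea on a small range. The function name verify_up_to and the max_steps safety cap are assumptions made for this example; it does not describe how the computation cited above was actually carried out.

```python
def verify_up_to(limit, max_steps=10_000):
    """Check that every start value 2..limit eventually drops below itself.

    By induction this implies that every start value up to limit reaches 1.
    Returns False (inconclusive) if some orbit exceeds max_steps steps.
    """
    for start in range(2, limit + 1):
        n = start
        steps = 0
        while n >= start:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
            if steps > max_steps:   # hypothetical cap, only to guarantee termination
                return False
    return True


print(verify_up_to(10_000))
```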

Probabilistic evidence

If one considers only the odd numbers in the sequence generated by the Collatz process, then one can argue that on average (specifically, the geometric mean of the ratios) the next odd number should be about ¾ of the previous one [2], which suggests that they should decrease in the long run (although this is not evidence against cycles, only against divergence).
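
The ¾ figure can be illustrated numerically. The sketch below computes the geometric mean of the ratio (next odd number)/(current odd number), averaged over all odd start values below a bound rather than along a single orbit; that choice of sample, and the names next_odd and geometric_mean_ratio, are assumptions of this example.

```python
import math


def next_odd(n):
    """For odd n, return the next odd number in the Collatz sequence."""
    n = 3 * n + 1
    while n % 2 == 0:
        n //= 2
    return n


def geometric_mean_ratio(limit):
    """Geometric mean of next_odd(n) / n over all odd n below limit."""
    odd_numbers = range(1, limit, 2)
    log_sum = sum(math.log(next_odd(n) / n) for n in odd_numbers)
    return math.exp(log_sum / len(odd_numbers))


print(geometric_mean_ratio(2 ** 16))   # close to 3/4
```

The value comes out near 3/4 because, among odd n, 3n + 1 is divisible by 2 exactly once half of the time, exactly twice a quarter of the time, and so on, giving an average of two halvings per tripling.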

Other ways of approaching the problem

In reverse

Another approach to proving the conjecture is the bottom-up method of growing the so-called Collatz graph. The Collatz graph is a graph defined by the inverse relation

 R(n) = \begin{cases} 2n & \mbox{if } n\equiv 0,1,2,3,5 \\ 2n, (n-1)/3 & \mbox{if } n\equiv 4 \end{cases} \pmod{6}.

So, instead of proving that all natural numbers eventually lead to 1, we can prove that 1 leads to all natural numbers. For any integer n, 3n + 1 ≡ 4 (mod 6) if and only if n ≡ 1 (mod 2), that is, n ≡ 1, 3, or 5 (mod 6). Also, the inverse relation forms a tree except for the 1-2-4 loop (the inverse of the 1-4-2 loop of the unaltered function f defined in the statement of the problem above). When the relation 3n + 1 of the function f(n) is replaced by the common substitute "shortcut" relation (3n + 1)/2 (see Optimizations below), the Collatz graph is defined by the inverse relation

 R(n) = \begin{cases} 2n & \mbox{if } n\equiv 0,1 \\ 2n, (2n-1)/3 & \mbox{if } n\equiv 2 \end{cases} \pmod{3}.

This inverse relation forms a tree except for a 1-2 loop (the inverse of the 1-2 loop of the function f(n) revised as indicated above).
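
The bottom-up construction can be sketched in a few lines, using the first inverse relation above (the one taken modulo 6, for the unaltered function f). The names predecessors and grow_tree are illustrative assumptions. Starting from 1, each level adds every number that maps to an already reached number in one step.

```python
def predecessors(n):
    """All m with f(m) == n under the unaltered map (n/2 or 3n+1)."""
    result = [2 * n]                  # 2n always maps to n
    if n % 6 == 4:
        result.append((n - 1) // 3)   # an odd m with 3m + 1 == n
    return result


def grow_tree(levels):
    """Grow the Collatz graph breadth-first from 1 for the given number of levels."""
    reached = {1}
    frontier = [1]
    for _ in range(levels):
        next_frontier = []
        for n in frontier:
            for m in predecessors(n):
                if m not in reached:  # skips the 1-2-4 loop closing on itself
                    reached.add(m)
                    next_frontier.append(m)
        frontier = next_frontier
    return reached


print(sorted(grow_tree(5)))
```

A number first appears at the level equal to its total stopping time, so the conjecture is equivalent to saying that every positive integer appears at some level.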

As rational numbers

The natural numbers can be converted to rational numbers in a certain way. To get the rational version, find the highest power of two less than or equal to the number, use it as the denominator, and subtract it from the original number for the numerator (527 → 15/512). To get the natural version, add the numerator and denominator (255/256 → 511).

The Collatz conjecture then says that the numerator will eventually equal zero. The Collatz function changes to:

 f(n, d) = \begin{cases} (3n + d + 1)/2d  & \mbox{if } 3n + d + 1 < 2d \\
(3n - d + 1)/4d & \mbox{if } 3n + d + 1 \ge 2d \end{cases} (n = numerator; d = denominator).

This works because 3x + 1 = 3(d + n) + 1 = (2d) + (3n + d + 1) = (4d) + (3n - d + 1). The rational must be reduced to lowest terms before every operation so that x is odd.
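
The conversion and the modified function can be tried out with the following sketch. The function names to_rational, to_natural and rational_step are assumptions of this example; fractions are kept as (numerator, denominator) pairs so that the reduction step stays explicit.

```python
import math


def to_rational(x):
    """Split x into (numerator, denominator), the denominator being the highest power of two <= x."""
    d = 1 << (x.bit_length() - 1)
    return x - d, d


def to_natural(n, d):
    """Inverse of to_rational: add numerator and denominator."""
    return n + d


def rational_step(n, d):
    """Apply 3x + 1 to x = n + d (x odd), then reduce to lowest terms."""
    if 3 * n + d + 1 < 2 * d:
        n, d = 3 * n + d + 1, 2 * d
    else:
        n, d = 3 * n - d + 1, 4 * d
    g = math.gcd(n, d)            # reducing removes the trailing factors of two
    return n // g, d // g


print(to_rational(527), to_natural(255, 256))
print(rational_step(*to_rational(7)))
```

For example, 7 corresponds to 3/4; one step gives 6/16, which reduces to 3/8, corresponding to 11, the next odd number after 7 in the Collatz sequence.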

As an abstract machine ...

Repeated applications of the Collatz function can be represented as an abstract machine that handles strings of bits. The machine will perform the following two steps on any odd number until only one "1" remains:

  1. Append a "1" to the end of the string and add the result to the original string (interpreting both as binary integers), i.e. 3n + 1 = (2n + 1) + n
  2. Remove all trailing "0"s.

...which is equivalent to Base Two arithmetic

Another way to examine the 3n+1 conjecture is through the base two system. An example would go as follows:

Ex. We will use the number 7, so in base two it is written thus: 111

                                                    111
                                                   1111
                                                  10110 
                                                 10111
                                                 100010
                                                100011
                                                110100 
                                               11011
                                              101000
                                             1011
                                            10000

The method used to iterate any number in base two is to write the initial number, then beneath it write the same number with an additional 1 on the right side, and then add these two numbers. Any zeros that result on the right side can be crossed out, and the process is repeated until the number iterates to 1.

Comparison of the Abstract Machine to equivalent Base 2 arithmetic

 #
 # Python 3
 #
 import re     # regular expressions
 import gmpy   # multiple-precision arithmetic library (provides scan1 and digits)
 def abstract_machine(s):
   # define Truth Tables for the Full Adder
   sum_tt   = {'000':'0','001':'1','010':'1','011':'0','100':'1','101':'0','110':'0','111':'1'}
   carry_tt = {'000':'0','001':'0','010':'0','011':'1','100':'0','101':'1','110':'1','111':'1'}
   print(s)
   while s != '1':
     if s[-1]=='1':                                  # it's odd
       s  = '00' + s                                 # operands must be same length, so prepend with MS 0
       ss = '0' + s + '0'                            # shift left (append LS 0) and prepend (MS 0) to allow carry
       t  = "".join(reversed(s))                     # iterating is L->R, so temporarily reverse
       tt = "".join(reversed(ss))
       carry = '1'                                   # preset carry (the '1' of '3n+1')
       answer = ""                                   # initialize answer
       for i,j in enumerate(t):                      # walk through operands one char at a time
         the_input = carry + j + tt[i]               # assemble input from previous carry & two operands
         the_sum = sum_tt[the_input]                 # look up sum out in sum Truth Table
         carry   = carry_tt[the_input]               # look up carry out in carry Truth Table
         answer = answer + the_sum                   # append sum to answer (carry used on next iteration)
       final_answer = "".join(reversed(answer))      # un-reverse answer
       if final_answer[0]=='0':                      # if the last pad character didn't receive a carry,
         final_answer = final_answer[1:]             # ...get rid of it
       print(final_answer)                           # show result before stripping LS 0's
     else:                                           # it's even
       final_answer = s
     m = re.search('(.*1)(0*$)',final_answer)        # remove all LS 0's in one fell swoop
     s = "".join(m.groups()[0])                      # reassemble answer to do next iteration
     print(s)
 def base_2(n):
   while n>1:
     f = gmpy.scan1(n,0)                             # find position of LS 1-bit
     if f>0:                                         # it's even
       print(gmpy.digits(n,2))                       # print n in base 2
       n = n >> f                                    # remove all LS 0's in one fell swoop
     else:                                           # it's odd
       print(gmpy.digits(n,2))                       # print n in base 2
       n = (n << 1) + n + 1                          # multiply by 3 and add 1
   print(gmpy.digits(n,2))                           # print n in base 2
 # main
 print('test of abstract machine:')
 print()
 abstract_machine('111')
 print()
 print()
 print('test of base 2:')
 print()
 base_2(7)
 ##  test of abstract machine:
 ##
 ##  111
 ##  10110
 ##  1011
 ##  100010
 ##  10001
 ##  110100
 ##  1101
 ##  101000
 ##  101
 ##  10000
 ##  1
 ##
 ##
 ##  test of base 2:
 ##
 ##  111
 ##  10110
 ##  1011
 ##  100010
 ##  10001
 ##  110100
 ##  1101
 ##  101000
 ##  101
 ##  10000
 ##  1
 ##

As a parity sequence

For this section, consider the Collatz function in the slightly modified form

 f(n) = \begin{cases} n/2 &\mbox{if } n \equiv 0 \\ (3n +1)/2 & \mbox{if } n \equiv 1 \end{cases} \pmod{2}.

This can be done because when n is odd, 3n + 1 is always even.

If P(…) is the parity of a number, that is P(2n) = 0 and P(2n + 1) = 1, then we can define the Collatz parity sequence for a number n as p_i = P(a_i), where a_0 = n and a_{i+1} = f(a_i).

Using this form for f(n), it can be shown that the parity sequences for two numbers m and n will agree in the first k terms if and only if m and n are equivalent modulo 2^k. This implies that every number is uniquely identified by its parity sequence, and moreover that if there are multiple Collatz cycles, then their corresponding parity cycles must be different.

The proof is simple: it is easy to verify by hand that applying the f function k times to the number a·2^k + b will give the result a·3^c + d, where d is the result of applying the f function k times to b, and c is how many odd numbers were encountered during that sequence. So the parity of the first k numbers is determined purely by b, and the parity of the (k+1)th number will change if the least significant bit of a is changed.

The Collatz Conjecture can be rephrased as stating that the Collatz parity sequence for every number eventually enters the cycle 0 → 1 → 0.
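
The congruence property can be checked by brute force for small values. In the sketch below, f is the modified ("shortcut") function of this section, and parity_prefix is a name introduced here for illustration.

```python
def f(n):
    """The modified Collatz function: n/2 for even n, (3n + 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def parity_prefix(n, k):
    """First k terms of the Collatz parity sequence of n."""
    bits = []
    for _ in range(k):
        bits.append(n % 2)
        n = f(n)
    return bits


k = 4
for m in range(1, 65):
    for n in range(1, 65):
        same_prefix = parity_prefix(m, k) == parity_prefix(n, k)
        assert same_prefix == (m % 2 ** k == n % 2 ** k)

print(parity_prefix(7, 5))
```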

As a tag system

For the Collatz function in the form

 f(n) = \begin{cases} n/2 &\mbox{if } n \equiv 0 \\ (3n +1)/2 & \mbox{if } n \equiv 1 \end{cases} \pmod{2}

the Collatz sequences are computed by the extremely simple 2-tag system whose production rules are

 \begin{cases} a \rightarrow bc \\ b \rightarrow a \\ c \rightarrow aaa \end{cases}

and in which a positive integer n is represented by a string of n a's, with iteration of the tag operation halting on any word of length less than 2. (Adapted from De Mol.)

The Collatz conjecture can be rephrased as stating that this tag system, with an arbitrary finite string of a's as the initial word, eventually halts. See the linked article for a worked example.
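
A small simulation makes the correspondence concrete. In a 2-tag system each step reads the first symbol, deletes the first two symbols, and appends the production of the symbol that was read. The sketch below (the name tag_system_orbit is an assumption of this example) records the length of the word every time it consists solely of a's; these lengths are the successive values of the Collatz function in the form given above.

```python
from collections import deque

PRODUCTIONS = {"a": "bc", "b": "a", "c": "aaa"}


def tag_system_orbit(n):
    """Run the 2-tag system on a^n and return the lengths of the all-a words seen."""
    word = deque("a" * n)
    orbit = [n]
    while len(word) >= 2:                 # halt on any word of length less than 2
        head = word[0]
        word.popleft()                    # delete the first two symbols...
        word.popleft()
        word.extend(PRODUCTIONS[head])    # ...and append the production of the first
        if all(symbol == "a" for symbol in word):
            orbit.append(len(word))
    return orbit


print(tag_system_orbit(3))
```

Like the other programs in this article, the loop halts for every input only if the conjecture is true.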

Extensions to larger domains

Iterating on all integers

For any integer n, rather than just positive integers, we map it to the integer f(n), where

f(n) = 3n + 1  if n is odd;
f(n) = n/2     if n is even.

Interestingly, there are in this case a total of 5 known cycles, which all integers seem to eventually fall into under iteration of f. These cycles are listed here, starting with the well-known cycle for positive n.

To save steps, we list only the odd numbers of each cycle (except for the trivial cycle {0}). Each odd number n, when f is applied repeatedly, will next reach an odd number at (3n+1) / (the largest power of 2 that divides 3n+1); each cycle is listed with its member of least absolute value first. We follow each cycle with the size of the full cycle (in parentheses): the number of members, odd or even, belonging to a cycle, counted without repetition.

a)    1  →  1   (size 3)

b)    0  →  0   (size 1)

c)    -1  →  -1  (size 2)

d)    -5  →  -7  →  -5   (size 5)

e)    -17  →  -25  →  -37  →  -55  →  -41  →  -61  →  -91  →  -17   (size 18)

We may define the Generalized Collatz Conjecture as the assertion that every integer, under iteration by f, eventually falls into one of these five cycles a), b), c), d), or e).
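
The five cycles and their sizes can be reproduced with a short search. The sketch below uses the function f of this section on all integers; cycle_reached is a name introduced for this example, and the search range is arbitrary.

```python
def f(n):
    """3n + 1 for odd n, n/2 for even n, on all integers."""
    return 3 * n + 1 if n % 2 else n // 2


def cycle_reached(n):
    """Iterate f from n until a value repeats, then return the cycle as a list."""
    seen = set()
    while n not in seen:
        seen.add(n)
        n = f(n)
    cycle = [n]                 # n is the first repeated value, so it lies on the cycle
    m = f(n)
    while m != n:
        cycle.append(m)
        m = f(m)
    return cycle


# Identify each cycle by its member of least absolute value.
leaders = {min(cycle_reached(n), key=abs) for n in range(-200, 201)}
print(sorted(leaders))
print({leader: len(cycle_reached(leader)) for leader in leaders})
```

In Python, n % 2 equals 1 for negative odd n and n // 2 is exact for even n, so the same code handles both signs.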

Iterating on rational numbers with odd denominators

The standard Collatz map can be extended to (positive or negative) rational numbers which have odd denominators when written in lowest terms. The number is taken to be odd or even according to whether its numerator is odd or even.

The parity sequences as defined above are no longer unique for fractions. However, it can be shown that any possible parity cycle is the parity sequence for exactly one fraction: if a cycle has length n and includes odd numbers exactly m times at indices k_0, …, k_{m-1}, then the unique fraction which generates that parity cycle is

\frac{3^{m-1} 2^{k_0} + ... + 3^0 2^{k_{m-1}}}{2^n - 3^m}.

For example, the parity cycle (1 0 1 1 0 0 1) has length 7 and has 4 odd numbers at indices 0, 2, 3, and 6. The unique fraction which generates that parity cycle is

\frac{3^3 2^0 + 3^2 2^2 + 3^1 2^3 + 3^0 2^6}{2^7 - 3^4} = \frac{151}{47}.

The complete cycle being: 151/47 → 250/47 → 125/47 → 211/47 → 340/47 → 170/47 → 85/47 → 151/47

Each cyclic permutation of the original parity sequence yields a different fraction, but not a different cycle: each permutation's fraction is the next number in the same loop:

(0 1 1 0 0 1 1) → \frac{3^3 2^1 + 3^2 2^2 + 3^1 2^5 + 3^0 2^6}{2^7 - 3^4} = \frac{250}{47}


(1 1 0 0 1 1 0) → \frac{3^3 2^0 + 3^2 2^1 + 3^1 2^4 + 3^0 2^5}{2^7 - 3^4} = \frac{125}{47}


(1 0 0 1 1 0 1) → \frac{3^3 2^0 + 3^2 2^3 + 3^1 2^4 + 3^0 2^6}{2^7 - 3^4} = \frac{211}{47}


(0 0 1 1 0 1 1) → \frac{3^3 2^2 + 3^2 2^3 + 3^1 2^5 + 3^0 2^6}{2^7 - 3^4} = \frac{340}{47}


(0 1 1 0 1 1 0) → \frac{3^3 2^1 + 3^2 2^2 + 3^1 2^4 + 3^0 2^5}{2^7 - 3^4} = \frac{170}{47}


(1 1 0 1 1 0 0) → \frac{3^3 2^0 + 3^2 2^1 + 3^1 2^3 + 3^0 2^4}{2^7 - 3^4} = \frac{85}{47}

Also, for uniqueness, the parity sequence should be "prime", i.e., not partitionable into identical sub-sequences. For example, parity sequence (1 1 0 0 1 1 0 0) can be partitioned into two identical sub-sequences (1 1 0 0)(1 1 0 0). Calculating the 8-element sequence fraction gives

(1 1 0 0 1 1 0 0) → \frac{3^3 2^0 + 3^2 2^1 + 3^1 2^4 + 3^0 2^5}{2^8 - 3^4} = \frac{125}{175}

But when reduced to lowest terms, 5/7, it is the same as that of the 4-element sub-sequence

(1 1 0 0) → \frac{3^1 2^0 + 3^0 2^1}{2^4 - 3^2} = \frac{5}{7}

And this is because the 8-element parity sequence actually represents two circuits of the loop cycle defined by the 4-element parity sequence.

In this context, the Collatz conjecture is equivalent to saying that (0 1) is the only cycle which is generated by positive whole numbers (i.e. 1 and 2).
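
The formula for the generating fraction is easy to evaluate with exact rational arithmetic. The following sketch uses Python's fractions module; the names fraction_from_parity_cycle and step are assumptions of this example, and step is the "shortcut" form of the map, (3x + 1)/2 for odd numerators, which is the form the cycles listed above follow.

```python
from fractions import Fraction


def fraction_from_parity_cycle(parity):
    """Return the unique fraction whose parity sequence repeats the given cycle."""
    n = len(parity)
    odd_indices = [i for i, p in enumerate(parity) if p == 1]
    m = len(odd_indices)
    numerator = sum(3 ** (m - 1 - j) * 2 ** k for j, k in enumerate(odd_indices))
    return Fraction(numerator, 2 ** n - 3 ** m)


def step(x):
    """Shortcut Collatz map on a fraction with odd denominator."""
    if x.numerator % 2 == 0:
        return x / 2
    return (3 * x + 1) / 2


x = fraction_from_parity_cycle((1, 0, 1, 1, 0, 0, 1))
orbit = [x]
for _ in range(6):
    orbit.append(step(orbit[-1]))
print(orbit)
print(fraction_from_parity_cycle((1, 1, 0, 0, 1, 1, 0, 0)))
```

Because Fraction always reduces to lowest terms, the non-"prime" 8-element sequence automatically yields 5/7 rather than 125/175.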

Iterating on real or complex numbers

Cobweb plot of the orbit 10-5-8-4-2-1-2-1-2-1-etc. in the real extension of the Collatz map (optimized by replacing "3n + 1" with "(3n + 1)/2" )

The Collatz map can be viewed as the restriction to the integers of the smooth real and complex map

f(z):=\frac 1 2 z \cos^2\left(\frac \pi 2 z\right)+(3z+1)\sin^2\left(\frac \pi 2 z\right),

which simplifies to \frac{1}{4}(2 + 7z - (2 + 5z)\cos(\pi z)).

If the standard Collatz map defined above is optimized by replacing "3n + 1" with "(3n + 1)/2" (see Optimizations below), it can be viewed as the restriction to the integers of the smooth real and complex map

f(z):=\frac 1 2 z \cos^2\left(\frac \pi 2 z\right)+\frac 1 2 (3z+1)\sin^2\left(\frac \pi 2 z\right),

which simplifies to \frac{1}{4}(1 + 4z - (1 + 2z)\cos(\pi z)).

Iterating the above optimized map in the complex plane produces the Collatz fractal.
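
That the first smooth map above agrees with the standard Collatz map on the integers, and that its simplified form is the same function, can be checked numerically. The sketch below uses Python's cmath module so that complex arguments are accepted; the function names are assumptions of this example, and equality holds only up to floating-point rounding.

```python
import cmath


def smooth_collatz(z):
    """(z/2) cos^2(pi z / 2) + (3z + 1) sin^2(pi z / 2)."""
    c = cmath.cos(cmath.pi * z / 2) ** 2
    s = cmath.sin(cmath.pi * z / 2) ** 2
    return 0.5 * z * c + (3 * z + 1) * s


def smooth_collatz_simplified(z):
    """(2 + 7z - (2 + 5z) cos(pi z)) / 4."""
    return 0.25 * (2 + 7 * z - (2 + 5 * z) * cmath.cos(cmath.pi * z))


print(abs(smooth_collatz(3) - 10), abs(smooth_collatz(28) - 14))
z = 0.3 + 0.7j
print(abs(smooth_collatz(z) - smooth_collatz_simplified(z)))
```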

Collatz map fractal in a neighbourhood of the real line

Optimizations

The "parity" section above gives a way to speed up simulation of the sequence. To jump ahead k steps on each iteration (using the f function from that section), break up the current number into two parts, b (the k least significant bits, interpreted as an integer), and a (the rest of the bits as an integer). The result of jumping ahead k steps (equivalent to k + c[b] steps of the original 3n + 1 map) can be found as:

f^k(a·2^k + b) = a·3^{c[b]} + d[b]

The c and d arrays are precalculated for all possible k-bit numbers b, where d[b] is the result of applying the f function k times to b, and c[b] is the number of odd numbers encountered on the way. For example, if k = 5, you can jump ahead 5 steps on each iteration by separating out the 5 least significant bits of a number and using:

c [0...31] = {0,3,2,2,2,2,2,4,1,4,1,3,2,2,3,4,1,2,3,3,1,1,3,3,2,3,2,4,3,3,4,5}
d [0...31] = {0,2,1,1,2,2,2,20,1,26,1,10,4,4,13,40,2,5,17,17,2,2,20,20,8,22,8,71,26,26,80,242}
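
The tables can be generated rather than typed in. The sketch below builds c and d for any k and uses them to jump ahead; build_tables and jump are names chosen for this example, and f is the modified function from the parity section.

```python
def f(n):
    """Modified Collatz function: n/2 for even n, (3n + 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def build_tables(k):
    """Precompute c[b] (odd numbers met) and d[b] (result) for k steps from each k-bit b."""
    c, d = [], []
    for b in range(1 << k):
        odd_count, x = 0, b
        for _ in range(k):
            odd_count += x & 1
            x = f(x)
        c.append(odd_count)
        d.append(x)
    return c, d


def jump(n, k, c, d):
    """Apply f to n exactly k times using the precomputed tables."""
    a, b = n >> k, n & ((1 << k) - 1)
    return a * 3 ** c[b] + d[b]


c, d = build_tables(5)
print(c)
print(d)
print(jump(27, 5, c, d))
```

With k = 5 the generated arrays match the values listed above. The table size grows as 2^k, so k trades memory for speed.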

Syracuse function

If k is an odd integer, then 3k + 1 is even, so we can write 3k + 1 = 2^a k′, with k′ odd and a ≥ 1. We define a function f from the set I of odd integers into itself, called the Syracuse function, by taking f(k) = k′ (sequence A075677 in OEIS).

Some properties of the Syracuse function are:

  • f (4k + 1) = f (k) for all k in I.
  • For all p ≥ 2 and h odd, f^{p-1}(2^p h - 1) = 2·3^{p-1} h - 1 (where f^{p-1} denotes f iterated p - 1 times).
  • For all odd h, f (2h - 1) ≤ (3h - 1)/2
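
The three properties can be spot-checked on small positive values with a direct implementation. The names syracuse and iterate below are assumptions of this sketch, and the ranges tested are arbitrary.

```python
def syracuse(k):
    """For odd k, return the odd part of 3k + 1."""
    n = 3 * k + 1
    while n % 2 == 0:
        n //= 2
    return n


def iterate(func, times, value):
    """Apply func to value the given number of times."""
    for _ in range(times):
        value = func(value)
    return value


for k in range(1, 200, 2):
    assert syracuse(4 * k + 1) == syracuse(k)

for p in range(2, 8):
    for h in range(1, 40, 2):
        assert iterate(syracuse, p - 1, 2 ** p * h - 1) == 2 * 3 ** (p - 1) * h - 1

for h in range(1, 200, 2):
    assert syracuse(2 * h - 1) <= (3 * h - 1) // 2

print(syracuse(7), syracuse(11), syracuse(17))
```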

The Syracuse conjecture is that for all k in I, there exists an integer n ≥ 1 such that f^n(k) = 1. Equivalently, let E be the set of odd integers k for which there exists an integer n ≥ 1 such that f^n(k) = 1. The problem is to show that E = I. The following is the beginning of an attempt at a proof by induction:

1, 3, 5, 7, and 9 are known to be in E. Let k be an odd integer greater than 9. Suppose that the odd numbers up to and including k - 2 are in E and let us try to prove that k is in E. As k is odd, k + 1 is even, so we can write k + 1 = 2^p h for p ≥ 1, h odd, and k = 2^p h - 1. Now we have:

  • If p = 1, then k = 2h - 1. It is easy to check that f(k) < k, so f(k) ∈ E; hence k ∈ E.
  • If p ≥ 2 and h is a multiple of 3, we can write h = 3h′. Let k′ = 2^{p+1} h′ - 1; we have f(k′) = k, and as k′ < k, k′ is in E; therefore k = f(k′) ∈ E.
  • If p ≥ 2 and h is not a multiple of 3 but h ≡ (-1)^p mod 4, we can still show that k ∈ E. (Cf.)

The problematic case is that where p ≥ 2, h is not a multiple of 3, and h ≡ (-1)^{p+1} mod 4. Here, if we manage to show that for every odd integer k′ with 1 ≤ k′ ≤ k - 2, 3k′ ∈ E, we are done. (Cf.).

References and external links

76 (15), 2006, 1625-1630.