Cobham's thesis

Cobham's thesis asserts that computational problems can be feasibly computed on some computational device only if they can be computed in polynomial time.[1]

Formally, to say that a problem can be solved in polynomial time is to say that there exists an algorithm that, given an n-bit instance of the problem as input, can produce a solution in time O(n^c), where c is a constant that depends on the problem but not the particular instance of the problem.
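
For a concrete (if deliberately simple) illustration, the following Python sketch decides whether a list contains a duplicate by comparing every pair of items; on an input of n items it performs O(n^2) comparisons, a polynomial bound, so the problem it solves lies in P.

```python
def has_duplicate(items):
    """Return True if any value appears more than once in the list.

    The nested loops compare every pair of items, so for an input of n items
    the function performs at most n*(n-1)/2 comparisons -- O(n^2) steps,
    a polynomial bound in the size of the input.
    """
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False
```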

Alan Cobham's 1965 paper entitled “The intrinsic computational difficulty of functions”[2] is one of the earliest mentions of the concept of the complexity class P, consisting of problems decidable in polynomial time. Cobham theorized that this complexity class was a good way to describe the set of feasibly computable problems. Any problem not contained in P is infeasible, but if a real-world problem can be solved by an algorithm in P, generally such an algorithm will eventually be discovered.

The class P is a useful object of study because it is not sensitive to the details of the model of computation: for example, a change from a single-tape Turing machine to a multi-tape machine can lead to a quadratic speedup, but any algorithm that runs in polynomial time under one model also does so on the other.

Reasons behind Cobham's thesis

The thesis is widely considered to be a good rule of thumb for real-life problems. Typical input lengths that users and programmers are interested in are approximately between 100 and 1,000,000. Consider an input length of n = 100 and a polynomial algorithm whose running time is n^2. This is a typical running time for a polynomial algorithm. (See the "Objections" section for a discussion of atypical running times.) The number of steps that it will require, for n = 100, is 100^2 = 10,000. A typical CPU can perform about 10^9 operations per second, so this algorithm will finish in (10,000 ÷ 10^9) = 0.00001 seconds. A running time of 0.00001 seconds is reasonable, which is why this is called a practical algorithm. The same algorithm with an input length of 1,000,000 will take about 17 minutes, which is also a reasonable time.

Meanwhile, an exponential algorithm typically has a running time of 2^n. The number of operations it will require, for n = 100, is 2^100. At 10^9 operations per second it will take (2^100 ÷ 10^9) ≈ 1.3×10^21 seconds, or about 4×10^13 years. Since it would take roughly 40 trillion years to complete for a typical input size, this algorithm is considered impractical.
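
These estimates are easy to reproduce. The short Python sketch below redoes the arithmetic of the two paragraphs above, assuming the same figure of 10^9 elementary operations per second:

```python
# Back-of-the-envelope running-time estimates, assuming a machine that
# performs 10**9 elementary operations per second.
OPS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 31_556_926

def poly_seconds(n):
    """Estimated wall-clock time for an algorithm taking n**2 steps."""
    return n**2 / OPS_PER_SECOND

def exp_seconds(n):
    """Estimated wall-clock time for an algorithm taking 2**n steps."""
    return 2**n / OPS_PER_SECOND

print(poly_seconds(100))                    # 1e-05 seconds
print(poly_seconds(1_000_000) / 60)         # ~16.7 minutes
print(exp_seconds(100))                     # ~1.27e+21 seconds
print(exp_seconds(100) / SECONDS_PER_YEAR)  # ~4.0e+13 years (about 40 trillion)
```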

Objections

There are many lines of objection to Cobham's thesis. The thesis essentially states that "P" means "easy, fast, and practical," while "not in P" means "hard, slow, and impractical." But this is not always true, for the following reasons:

  • It ignores constant factors. An algorithm that takes time 10^100·n is in P (it is linear time), but is completely impractical. An algorithm that takes time 10^-1000·2^n is not in P, but is practical for n up into the low thousands (see the sketch following this list). However, this objection is rarely relevant for real-life problems. Throughout the history of computer science, many algorithms have been invented to solve real-life problems (so-called "natural problems"). Almost none of these algorithms have a running time with a large coefficient like 10^100 or a small coefficient like 10^-1000. In other words, a running time of 10^100·n is not a "typical" running time. It is indeed possible to devise a problem whose best algorithm would take 10^100·n steps, but the problem would be highly artificial. (For example: "Print out 10^100 copies of the input.") Algorithms for problems that occur in practice typically have coefficients between 1/100 and 100.
  • It ignores the size of the exponents and the bases. A problem with time n^100 is in P, yet impractical even for n = 2. A problem with time 1.004^n is not in P, yet is practical for n up into the low thousands (also illustrated in the sketch following this list). But again, when one looks at the running times of all the useful polynomial algorithms that have been invented, almost all of them have a small exponent (usually less than 15). Therefore a running time of n^100 is not typical. Problems have been proven to exist in P that require arbitrarily large exponents (see time hierarchy theorem), but they fall into the realm of abstract theoretical mathematics and do not relate to real-life problems. Similarly, when we look at the running times of exponential algorithms that solve useful problems, almost all of them have a base of 2 or more. An exponential running time with a base close to 1, like 1.004^n, has never been encountered in a "natural problem".
  • It ignores the size of the input. If one has an exponential algorithm with time 2^n, it is reasonable to use it for a very small input size, e.g., n < 10. Meanwhile, a polynomial algorithm with time n^2 is impractical for an input size like n = 10^50. But an input size of 10^50 is rarely encountered in a real-life situation. On the other hand, exponential algorithms are occasionally used with small inputs to solve real-life problems.
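
The comparisons in the first two objections can be made concrete by working with base-10 logarithms, which sidesteps the astronomically large and small constants involved. The Python sketch below (an illustrative calculation, with the running-time expressions taken directly from the bullets above) prints the approximate order of magnitude of each hypothetical step count:

```python
import math

def log10_linear_huge_constant(n):
    """log10 of 10**100 * n: linear time with an enormous constant factor."""
    return 100 + math.log10(n)

def log10_exp_tiny_constant(n):
    """log10 of 10**-1000 * 2**n: exponential time with a minuscule constant factor."""
    return -1000 + n * math.log10(2)

def log10_poly_huge_exponent(n):
    """log10 of n**100: polynomial time with an enormous exponent."""
    return 100 * math.log10(n)

def log10_exp_base_near_one(n):
    """log10 of 1.004**n: exponential time with a base very close to 1."""
    return n * math.log10(1.004)

for n in (10, 1_000, 3_000):
    print(n,
          round(log10_linear_huge_constant(n), 1),
          round(log10_exp_tiny_constant(n), 1),
          round(log10_poly_huge_exponent(n), 1),
          round(log10_exp_base_near_one(n), 1))

# At n = 3,000: 10**100 * n is about 10**103.5 steps and n**100 about 10**347.7,
# while 10**-1000 * 2**n amounts to less than a single step and 1.004**n to only
# about 10**5.2 steps -- here the algorithms outside P are by far the practical ones.
```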

In the first three points above, of course, the terms "typical," "natural," "useful," and "real-life" are subjective. They have no formal definition in this context. Informally, an algorithm is "useful" if it could be used outside the realm of theoretical computer science, e.g., in engineering, and if the problem occurs in practice, not only in theory.

Other objections to the thesis are:

  • It only considers worst-case times. Consider a problem that can be solved in time n most of the time, but takes time 2^n on very rare occasions. This problem has an average running time that is polynomial but a worst-case running time that is exponential, so the algorithm does not place the problem in P. But even though it is not a polynomial-time algorithm, it would be very practical to use. The simplex algorithm is an example of a practical algorithm that is exponential in the worst case.
  • It only considers deterministic solutions. Consider a problem that can be solved quickly if a tiny error probability is acceptable: for example, the algorithm always outputs an answer in polynomial time, and there is a 1 in 1 billion chance that the answer is wrong, while solving the same problem exactly, with zero chance of error, would take much longer. The problem would not belong to P, even though in practice it can be solved quickly. This is a common approach to attacking problems in NP not known to be in P (see RP, BPP). A classic example of this style of algorithm, a probabilistic primality test, is sketched after this list (primality testing is now known to be in P, but the randomized test remains far faster in practice). Even if P = BPP, as many researchers believe, it is often considerably easier to find probabilistic algorithms.
  • It ignores the speed of the hardware that the algorithms are run on. Currently, a typical real-life computer can perform somewhere between 10^9 and 10^15 operations per second. But if, at some point in the distant future, a CPU is invented that can perform 10^(10^100) operations per second, then exponential-time algorithms with long inputs could be run on that CPU very fast. In fact, if computer memory sizes grow much more slowly than CPU speeds, a CPU might exist that runs at 10^(10^100) Hz while, at the same time, the largest memory module in the world holds 10^50 bits. In this situation, exponential algorithms would run fast for even the largest possible inputs. Note that, given the current most widely accepted theories of physics, a 10^(10^100) Hz CPU is not possible.[1] However, some physics theories that were once widely accepted were later shown to be wrong, and our current theories may be proven wrong as well. At the other extreme, if a CPU is built that takes 50 years to do one operation, then even a typical polynomial algorithm would be unacceptably slow on that CPU. Of course, polynomial algorithms are better for any processor speed, because they suffer less from a processor slowdown and benefit more from a speedup. But an extremely slow processor can still make a polynomial algorithm suffer greatly, and an extremely fast processor can make an exponential algorithm benefit greatly.
  • New models of computation such as quantum computers may be able to quickly solve some problems not known to be in P. The class P is defined in terms of classical computers (Turing machines), while BQP is the class of problems that can be solved in polynomial time, with bounded error, on a quantum computer. It is believed that there are problems that are feasible on a quantum computer but not feasible on a classical computer. For example, it is known that factorization can be done in polynomial time on a quantum computer, and it is believed (but not proven) that factorization cannot be done in polynomial time on a classical computer. If this is true, then there are feasible problems which lie outside of P, and that would refute Cobham's thesis. Note that no large quantum computer has actually been built yet, so the quantum algorithms are currently feasible in theory but not in practice. Also, factorization and problems that reduce to it are essentially the only natural problems known to be in BQP but not known to be in P; quantum algorithms have not to date solved any NP-hard problem in polynomial time. Therefore this particular objection to Cobham's thesis is currently only relevant to a small subset of the problems not known to be in P.
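
As a concrete instance of the probabilistic approach described above, the following Python sketch implements the Miller–Rabin primality test, the textbook example of a randomized polynomial-time algorithm whose error probability can be driven far below 1 in 10^9 simply by repeating the test; it is included purely as an illustration of the RP/BPP style of algorithm.

```python
import random

def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test.

    Returns False if n is certainly composite and True if n is probably prime.
    Each round cuts the error probability by at least a factor of 4, so 40
    rounds leave an error probability below 4**-40 (far under 1 in 10**9),
    while the running time stays polynomial in the number of bits of n.
    """
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2**r * d with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random base to test against
        x = pow(a, d, n)                 # modular exponentiation, polynomial time
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a is a witness that n is composite
    return True

print(is_probable_prime(2**127 - 1))     # True: the Mersenne prime 2**127 - 1
print(is_probable_prime(1763))           # False (composite): 1763 = 41 * 43
```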

References

  • Steven Homer and Alan L. Selman (1992). "Complexity Theory". In Alan Kent and James G. Williams (eds.), Encyclopedia of Computer Science and Technology, vol. 26. CRC Press.
  • Alan Cobham (1965). "The intrinsic computational difficulty of functions". In Proc. Logic, Methodology, and Philosophy of Science II. North Holland.
  • Seth Lloyd. "Computational capacity of the universe". Accessed March 31, 2008.