Talk:P = NP problem

From Wikipedia, the free encyclopedia

This article is within the scope of WikiProject Computer science, which aims to create a comprehensive computer science reference for Wikipedia. Visit the project page for more information and to join in on related discussions.
Start rated as start-Class on the assessment scale
Top rated as top-importance on the assessment scale
WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.
Mathematics rating: A-BB+ Class High Priority  Field: Discrete mathematics
One of the 500 most frequently viewed mathematics articles.

Contents

[edit] Solved by a Genetic Algorithm??

Just saw this blog: solved in April of 2007 by a http://www-illigal.ge.uiuc.edu/system/index.php?option=com_jd-wp&Itemid=78&p=815 From the Illinois Genetic Algorithms Laboratory


—The preceding unsigned comment was added by 72.71.211.76 (talk) 23:38, 30 April 2007 (UTC).

Look at the release date — It's an April Fools' joke you fool! :) -- intgr 23:42, 30 April 2007 (UTC)

[edit] Old discussion

I'm reluctant to contribute to the Wikipedia and hack someone else's text, but I'd like to point out that prime factorization and NP-completeness don't have anything to do with each other (or at least, nobody to my knowledge has shown that they do). The article gives the misleading impression that if someone were to prove P=NP, cryto-algorithms that rely on the hardness of factorizing prime numbers would be in jeapordy. It would be better, IMO, to stick to propositional SAT to illustrate the difference between P and NP. (as per the article on NP-completeness, which, it seems to me could well be subsumed in the discussion of the two complexity classes.

BTW, I rather disagree that a proof that P = NP would have no practical consequences, but that's another discussion.

- Andre Vellino (vellino@sympatico.ca)

The article mentions factorization only as a simple and hopefully intuitive example of a problem in NP; it also says explicitly that it is unknown whether the problem is in P. The concept of NP-completeness is not introduced until several paragraphs later. So the article does not suggest a connection between factorization and NP-completeness. AxelBoldt

At the risk of reviving old discussion, I'd like to set the record straight here: If someone proved P=NP then those crypto algorithms WOULD be in jeopardy. Factorization is in NP, even if it's not proven to be NP-Complete, if P=NP then everything in NP, NP-Complete or not, is poly-time solvable. A poly-time algorithm to factor would destroy the assumption that factoring is effectively a one-way problem that is made by public key cryptosystems. 74.12.143.44 (talk) 08:14, 26 November 2007 (UTC)


This question was asked on http://www.slashdot.org a little while ago, and it seems relevant here: What, if any, would be the effects of a proof that P=NP, or that P≠NP, or indeed that the question is undecidable?
- Stuart

The latter two wouldn't have any practical consequences, but it would be a big deal in the world of theoretical computer science (and the last also one in logic and mathematics). If somebody proves P=NP, then the practical consequences depend on the way they prove it. If they explicitly exhibit a polynomial algorithm for an NP complete problem, then we would know polynomial algorithms for all NP problems, and if the degrees of the involved polynomials are reasonable, that would be huge news in many fields, but is considered extremely unlikely. It's much more likely that either the polynomial has a ridiculously high degree, or that the proof wouldn't be constructive and just generally concludes P=NP without exhibiting any algorithm altogether. In those two cases, there wouldn't be any immediate practical consequences either. --AxelBoldt

I agree with your conclusion. However, it isn't really possible for there to be a nonconstructive proof of P=NP with no known algorithm. That's because the construction has already been done. I could easily give you an algorithm for, say, Travelling Salesman, whose correctness is obvious, but whose asymptotic running time is unknown. It's easy to prove that if P=NP, then my algorithm is polytime. If anyone ever publishes a "nonconstructive" proof of P=NP, the construction will already exist. This is all academic, of course. The huge multiplicative constant means none of this would have any practical value. --LC

That's very interesting. How does the construction work? --AxelBoldt

Assume the goal is a program that, given a TSP instance, returns the certificate (if "yes") or fails to halt (if "no"). Assume you already have a program to check certificates. Let X=1. Consider the list of all possible programs, sorted first by size, then lexicographically. Run the first X programs for X steps each, with the TSP instance as an input. If any of them halts and returns a certificate, check it. If it's right, halt and return it. If not, continue. If none of them succeed, then increment X, and repeat. Clearly, this algorithm is correct, and its theta running time will be the optimal running time, plus the time to check a certificate. The constant hidden by the theta is exponential in the length of the first program that solves the problem. Totally impractical. I don't remember offhand which books cover this, though I could look it up if needed. It's been a while since I've visited Wikipedia; it's nice to see that it's still going strong. --LC

The description is clear, I don't think we need a reference, but I think it's nice enough to put it in the main article. --AxelBoldt

I've added it, in a form that's hopefully accessible to a wider audience. --LC

Can you also concoct an algorithm which always gives the correct answer in polynomial time? --AxelBoldt

Yes, if you can tell me the running time of the first algorithm (or give a polynomial upper bound for it). Otherwise, I doubt it's possible now. It's too bad that we can construct this algorithm for NP-complete problems, but not for co-NP-complete problems. --LC

But then my original statement stands: if we find a non-constructive proof of P=NP, then we still wouldn't have a polynomial algorithm for NP problems, since a polynomial algorithm has to stop after p(n) steps. --AxelBoldt

I should have said "accept" rather than "solve". Sorry for the confusion. If we could construct a particular algorithm, and prove that it is a polytime algorithm accepting a particular language in NP-complete, then we would have proved P=NP. That would be a constructive proof of P=NP. If we could prove P=NP without knowing any polytime algorithm accepting an NP-complete language, then that would be a nonconstructive proof. The latter is impossible. As soon as someone proves P=NP, we'll immediately know a polytime algorithm that accepts an NP-complete language. It's the algorithm I gave. Note that by the standard definition, an algorithm is a polytime algorithm accepting a language if it returns "YES" in polytime on strings in the language, and never halts on strings outside the language. I'll change the page to say it "accepts the language SUBSET-SUM", rather than "solves the problem SUBSET-SUM". Actually, I think I'll add a second algorithm too. This algorithm (under stronger conditions) yields a stronger result. If there are any algorithms that can be proved to solve (not accept) an NP-complete problem, then this is one of them. I think that's an interesting result too, so I'll write it up later tonight or tomorrow. --LC


From the article:

The integer factorization problem is this: given two natural numbers x and y, with n digits each, decide whether x can be evenly divided by
an integer between 1 and y.

I think the previous example of simply deciding whether a given number is prime was simpler and more intuitive. The main point here is to get the point across that "solving is harder than checking". Was there a reason for changing it? (Also, not both x and y have to have n digits.) AxelBoldt


The problem of finding the best move in Chess or Go (on an n by n board) is EXPTIME-Complete

Is this right? Go has a strictly higher complexity class than Chess. Isn't Go PSPACE-complete and chess is EXPTIME-complete? - anonymous

Go with Ko (Japanese rules) is EXPTIME-complete. Go with other rules might also be EXPTIME-complete, but is only known to be PSPACE-hard. See this, which was also in the references at the bottom of the article. --LC 18:04 Sep 13, 2002 (UTC)

I've also removed this:

  • It ignores problems that can be well-approximated. There are some NP-complete problems where good approximations can be found quickly.

P and NP and NP-complete only contain decision problems, and you can't approximate a decision problem. Approximations are only possible for problems in classes such as NP-equivalent. --LC 18:11 Sep 13, 2002 (UTC)


What is non-P? This web site says non-P is different from NP. -Jeff

Non-P consists of all those decision problems that cannot be solved in polynomial time. That is indeed very different from NP. AxelBoldt 00:42 Jan 24, 2003 (UTC)


I'm a bit confused regarding NP-complete...is it purely a matter of probability? If P intersected with NPC yields the empty set (for certain) but it is uncertain whether P=NP, either NPC=empty set or P does not equal NP. Am i misunderstanding this?

I'm not sure whether you misunderstand or not. NPC is not the empty set (satisifiability and many other problems are known to be in NPC). So if the intersection of P and NPC is empty, problems in NPC are in NP but not in P. This would imply P!=NP. However, the great unsolved question is whether the intersection of P and NPC is empty or not. If you can prove one way or the other, you'd have solved a [Millennium Prize]] challenge and thus be up for a $1 million USD prize, in due course receive a Turing award and maybe even a Fields medal, get tenure at any university computer science department in the world, possibly have the National Security Agency rather eager to talk to you, and be batting off Hollywood starlets of the appropriate gender wherever you travelled. Well, maybe not the Hollywood starlet part. --Robert Merkel 04:42, 19 Feb 2005 (UTC)

I think the source of the confusion is the following pair of sentences: Theoretical computer scientists currently believe that the relationship among the classes P, NP, and NPC is as shown in the picture. The intersection of P and NPC equals the empty set.

The second sentence comes off as an assertion, when I think the intention was that most comp. scientists believe that. I suggest rewording, something like: Theoretical computer scientists currently believe that the relationship among the classes P, NP, and NPC is as shown in the picture, and that P and NPC are disjoint.

[edit] Quantum computers and NP-hard problems

This statement has recently been added regarding quantum computers: ...none of the problems they are known to be able to solve are NP-hard.

Is this true?

  • I thought factoring was NP-hard.
  • Shor's algorithm enables factoring in polynomial time on a quantum computer.
  • It's even been implemented on a system of (I think) 7 qubits, where it successfully factored the number 15.

So wouldn't that mean that quantum computers are known to be able to solve an NP-hard problem?

RSpeer 22:52, May 6, 2005 (UTC)

Factoring probably isn't NP hard. It's in NP intersect co-NP, which is strictly outside NP-hard provided that NP ≠ coNP, which most people believe is true. Deco 23:58, 6 May 2005 (UTC)

[edit] Small EXPTIME-complete accuracy fix

Before the article stated that EXPTIME-complete problems require exponential time. This is as incorrect as saying that NP-complete problems require nondeterminism to be solved in polynomial time; while this is believed to be true, it need not be (for example, it may be the case that all problems in EXPTIME can actually be solved in O(nlog n) time). The reason we know that EXPTIME-complete has no intersection with P is because P is not equal to EXP, which requires a proof by diagonalization and is not at all obvious. Deco 00:32, 8 Jun 2005 (UTC)

[edit] Page of incorrect proofs

I followed the link to the P-versus-NP page, and though its content is fascinating, I'm worried by the author's wording.

Deco is right that this link needs at least a warning that the proofs are incorrect, because the author never bothers to say it. In fact, he says things like "In February 2005, Raju Renjit Grover proved that P is not equal to NP", and on the same page says that other people proved P=NP. Since, last I checked, true is not false and I am not the Queen of England, this must be an unconventional use of the word "prove".

What was the author's motivation for writing like that? Is he being tongue-in-cheek? Is he trying to trip up readers who aren't observant enough to notice that the "proofs" are contradictory? Is he hoping that every reader comes to the conclusion on their own that all the proofs are bunk?

I know the page can't be entirely serious, and I recognize that reading those papers can be a great test of your own skill at finding holes in arguments. But it still alarms me that Wikipedia is linking to false information. A freshman in CS might click quickly through a few links from Wikipedia, not noticing the entirety of the content on that external page, and then go around saying "But I read on Wikipedia that P=NP". That reflects badly on Wikipedia. Or someone might use this link as Wikipedia-endorsed evidence that mathematical logic is flawed.

I don't think stating next to the link that the proofs are incorrect is enough. Is the author of the page contactable? What I'd like to see is a version of the page that keeps all the links to papers, but says more explicitly that the arguments are unlikely to be sound, and does not falsely state that they are proofs.

RSpeer 05:09, Jun 15, 2005 (UTC)

Honestly I found this a bit alarming too. Maybe he's describing them from the author's perspective, or maybe he is being a little playful. Simply creating another version of the page and stealing the list though would be rather dishonest. Maybe we could ask him to make another version of the page for the naive. If you think the link is contentious, though, maybe it's best to remove it. Deco 07:52, 15 Jun 2005 (UTC)
I followed up on this. The page's author told me in e-mail:
I try to formulate all my statements along the statements in the corresponding papers. Whenever I write something like "purportedly proves" or "claims to prove", the authors start pestering me and complaining by email that their proof is correct and not only purportedly correct etc.
He went on to say that it should be evident that they're incorrect, which it pretty much is. Deco 17:54, 30 July 2005 (UTC)

[edit] Oracle discussion

I added a brief discussion of the oracle result regarding the problem. I tried to keep it simple, but if anyone thinks it can be made easier to understand, I welcome your changes. Deco 07:52, 15 Jun 2005 (UTC)

[edit] All three classes are equal

The Caption for the pic at the top of the page says:"Diagram of complexity classes provided that P ≠ NP. If P = NP, then all three classes are equal." which is incorrect. If P=NP there will still be problems which are not NP-complete.--SurrealWarrior 02:19, 19 Jun 2005 (UTC)

No, that's not true. NP-completeness is defined with respect to polynomial-time many-one reductions. Every problem in P is trivially complete under this sort of reduction (simply solve the source problem and produce a yes/no instance of the target problem). Consequently, if P=NP, every problem in P=NP is NP-complete. Deco 08:26, 19 Jun 2005 (UTC)
This discussion is over a year inactive, but I want to point out that the caption IS wrong because a decision problem where the answer is always "no" or always "yes" cannot possibly be NP-hard, yet is obviously in P. Eric119 06:01, 23 August 2006 (UTC)
You're right. These two problems (and only these two) would fail to be NP-complete. So NP-complete is equal to P=NP minus two problems. I'll revise the caption. Deco 16:37, 23 August 2006 (UTC)
Is there a source for your conclusion, Eric and Deco? It doesn't sound right. Always accepting and always rejecting aren't in a special class that acts differently from all other problems. rspeer / ɹəədsɹ 23:25, 23 August 2006 (UTC)
Hmmm. At first I thought these two problems were special, since they don't have both yes and no instances for many-one reductions to target (so it would seem that they could only be reduced to themselves). But Papadimitriou asserts: "All languages in L are L-complete." (Computational Complexity, section 16.1, pg. 398). If this is true it seems like the analogous property must hold for P and polynomial-time reductions. These two problems are a special case in Rice's theorem - I'm really unsure about this. Deco 03:43, 24 August 2006 (UTC)
Did some more research. In chapter 14 Papadimitriou asserts "For all we know P could be equal to NP [...] Everything in NP would be both polynomial-time computable and NP-complete." Now that I think more about this, I think this has to do with vacuousness in the definition of a many-one reduction. If there is no yes/no instance to target, there is no need for the reduction to translate such instances. I will add back the statement and add an appropriate citation. Deco 03:48, 24 August 2006 (UTC)

Again, it is obviously impossible to reduce a non-empty language to the empty language. Proof: If a language L is non-empty there exists x in L, and in order for a function f to be a reduction from L to the empty language, we must have f(x) in the empty language, which by definition is impossible. QED. So Papadimitriou's statement is not quite true, as there exist non-empty languages in P. Eric119 05:17, 24 August 2006 (UTC)

Your math sounds right, and this document supports you explicitly. The only possible explanation I can find is either that Papadimitriou glossed over these special cases, or he's using a special definition of reduction/completeness (yet his original explanation of reductions is informal and does not reveal such a special case). If he really did miss these cases, then the diagram on page 330, showing P=NP=NP-complete as an explicit scenario, is an even more explicit error. Deco 08:46, 24 August 2006 (UTC)
I think it depends on the definition of reduction. Specifically, if you use Turing reduction, a non-empty language can be reduced to an empty one by not using the oracle. But it's a long time ago that I was taught theoretical CS. -- Jitse Niesen (talk) 09:44, 24 August 2006 (UTC)

I realize this is over a year old, but this whole discussion is plain WRONG. NP-Complete is defined as everything within a poly-time bound of the hardest problem in NP. If P=NP then everything in NP is solved in poly-time, which means everything is trivially in a poly-time bound of the hardest problem. The poly-time reduction to 'always return false' and 'always return true' is also pretty easy when the problem can be solved in poly-time.74.12.143.44 (talk) 08:10, 26 November 2007 (UTC)

[edit] About the algorithm that accepts the NP-complete language SUBSET-SUM.

In the section "Polynomial-time algorithms", it writes:

"No one knows whether polynomial-time algorithms exist for NP-complete languages. But if such algorithms do exist, we already know some of them! For example, the following algorithm correctly accepts an NP-complete language, but no one knows how long it takes in general. This is a polynomial-time algorithm if and only if P = NP."

Then it gives a algorithm that accepts the SUBSET-SUM and claims that the algorithm is a polynomial-time algorithm if and only if P=NP.

But I think this conclusion is not very clear since the number of the program P which produces the suitable subset of S may depend on the size of S.

Furthermore, if we change the IF statement in the algorithm with

          IF the program outputs a complete math proof
              AND each step of the proof is legal
              AND the conclusion is that S does (or does not) have a subset summing to 0
          THEN
              OUTPUT "YES" (or "NO" if that was proved) and HALT

there is another problem: the length of the proof and the number of the program which produces the proof may also depend on the size of S.


ok, I have understood. The algorithm is correct.

If there is a polynomial time program to solve the decision problme SUBSET-SUM, let P_0 be the number of such program. Note that P_0 is independent of the size of input for the SUBSET-SUM problem.

We can use the program numbered P_0 to construct a polynomial time program which produce a subset of S which adds up to 0. Let P_1 be the number of this new program. Note that P_1 is also independent of the size of input.

Assume the input S does has a subset which adds up to 0. Let N_1 be the number of steps need for program numbered P_1 to produce the correct subset of S. Since program numbered P_1 is a polynomial-time program, N_1 is a polynomial of the size of S. Therefore the algorithm will halt when N = max(N_1, P_1), P = P_1, and clearly it runs in polynomial time.

The second algorithm is similar.

---

For the above algorithm, we really do two things: 1) Search for P_0, and 2) Use P_0 on an input string. We say the overall algorithm runs in polynomial time with respect to the length of the input because task 1 is a fixed cost, and we assume P_0 is polynomial. That said, it is infeasible to search for P_0, so the algorithm is useless to us.

It is interesting to note that for short inputs, the algorithm will likely find a machine that just outputs the answer rather than solving the problem. But because such machines are themselves polynomial on the length of the solution, for large problems, we will eventually find the correct algorithm before we encounter any of these degenerate programs.

[edit] Better example for a problem in NP

I think the example definitely shouldn't be PRIMES. It is quite confusing there. How hard would it be refer to the SAT problem there? --Exa 00:05, 23 January 2006 (UTC)

Quite difficult. Really, just try to explain SAT in 50 words or less to someone who's never heard of it - perhaps never even heard of logical propositions. I like the composite problem because it's very simple, very obviously in NP, and familiar even to people with only a middle-school math education (although an NP-complete problem would be nicer). Deco 01:27, 23 January 2006 (UTC)
How about TSP or Hamiltonian cycle? The idea of finding the shortest circuit which travels through a bunch of towns before arriving back at the starting point shouldn't be too hard Keithdunwoody 03:27, 18 February 2006 (UTC)
subset sum problem. --Robert Merkel 05:02, 29 March 2006 (UTC)

[edit] Stupid Joke about P = NP

Commonly stated, this is P = NP, which is an assignment statement in most programming languages. The result of the condition is true if NP is true. So:

 if (P = NP)
   print TRUE
 else
   print FALSe

prints true if NP is true.

This passes for humour on your planet? --Surturz 04:08, 6 June 2007 (UTC)

[edit] "Polynomial-time algorithms"

The article describes an algorithm that takes an enumeration of all programs, then runs programs 1..N for N steps each, for N = 1 to infinity. Then it says:

This is a polynomial-time algorithm if and only if P = NP... If P = NP, then this is a polynomial-time algorithm accepting an NP-Complete language. "Accepting" means it gives "YES" answers in polynomial time, but is allowed to run forever when the answer is "NO".

Perhaps I am mistaken, but I think this is complete nonsense. The "algorithm" is not an algorithm at all, since it is not guaranteed to halt, and in fact won't halt if there is no solution for the problem it is trying to solve.

I suggest that the entire section be deleted, and unless there is a serious objection in a few days, that is what I am going to do. -- Dominus 20:22, 5 April 2006 (UTC)

An algorithm does not need to halt in this context. -- Jitse Niesen (talk) 04:49, 6 April 2006 (UTC)

[edit] trivia

on the margins of 'concrete mathematics' by knuth, there is a 'proof' that P=NP; it says simply N=1 => P=NP!  :)--Aryah 18:02, 16 April 2006 (UTC)

[edit] The second algorithm IS NOT CORRECT!!!!

         IF the program outputs a complete math proof
              AND each step of the proof is legal
              AND the conclusion is that S does (or does not) have a subset summing to 0
          THEN
              OUTPUT "YES" (or "NO" if that was proved) and HALT


There is no Turing Mashine, that can understand a "complete math proof".

Proof: If that there is the TM, we can solve Halting problem. We've got a TM, and we want to know if it stops. Let's search for complete math proofs and check them. Any TM either stops or not, so eventually we'll find a proof that TM stops or proof that TM doesn't stop. Hence we can solve the halting problem. But it is impossible.

I propose to delete this algorithm. —The preceding unsigned comment was added by 194.85.83.201 (talk • contribs) 17:07, 8 May 2006 (UTC)

Not only do Turing Machines that can check a "complete math proof" exist, such algorithms have even been implemented. One of the programs that check proofs is Coq. Your argument assumes that there either exists a proof showing that the TM halts, or there exists a proof showing that it does not halt. I think the conclusion of your argument is that there are Turing Machines for which such a proof does not exist. -- Jitse Niesen (talk) 01:15, 9 May 2006 (UTC)

"Every 'function which would naturally be regarded as computable' can be computed by a Turing machine."(Church–Turing thesis). So if some program can check proofs, then there is a TM can doing it. It's very difficult to understand / can exist TM, for which there is no such proofs, but if they do exist, why every word for every language has to have a proof, that it is not in the language(or proof, that it is in the language)? I mean why the second algoritm hopes that it will find a proof?

I've deleted the algorithm.194.85.82.254 05:04, 20 May 2006 (UTC)

Why? The Clay Math Institute Official Problem Description (pdf), listed in the External links section, says that "formal proofs can easily be recognized in polynomial time". So, I reinstated the algorithm.
If your reason for removal is supposed to be explained in the fragment "It's very difficult to understand … find a proof?" above, then could you please reformulate that? I have no idea what it means, and I couldn't even tell whether it was supporting your argument or not. By the way, it's useful if you could make an account or at least sign with the same name every time, since your IP address changes, which leaves me wondering whether you're the same person or not. -- Jitse Niesen (talk) 05:59, 20 May 2006 (UTC)

[edit] Another question about the second algorithm

The algorithm that decides subset-sum seems to rely on an unstated assumption: if P=NP, then there exists an algorithm that decides subset-sum and outputs a complete mathematical proof in polynomial time. This wasn't at all obvious to me, and it was fairly non-trivial to prove. Is it worth including a proof, or at least a pointer to one? -- Nate Biggs 21:07, 6 June 2006 (UTC)

Pointer to one would be good. I wouldn't put a proof here personally, but that's just because I don't really like that section. Deco 23:11, 6 June 2006 (UTC)
Any idea where one might find a pointer to one? Personally, my preference would be to delete the first algorithm entirely (who cares if there's a program that accepts subset-sum? That has nothing to do with P=NP) and include a short sketch of the proof that the second algorithm is correct. -- Nate Biggs 20:23, 8 June 2006 (UTC)

[edit] Article needs easily-understandable practical example

This article should have a more easily-understandable pratical example for the less tech/math-savvy ones, such as this one.

I can't write one myself, if that's what you're thinking. --Procrastinator 17:11, 31 July 2006 (UTC)


[edit] Maze theory

Isn't this the same as saying "given a randomly generated maze, is there a general algorithm for finding a route between entrance exit which does not involve taking every path" ?

Suppose I generate a random maze which has exactly one route between entrance and exit. If I then draw a line on it, you can easily verify if the line connects the entrance with the exit.

Now imagine I come up with an algorithm that finds that path without taking every possible route. So far so good. But what if I now move the exit so it lies on one of the unexplored routes. Now I must continue running the algorithm until the new route is found. At this point I again move the exit to another unexplored route. I continue doing this until all of the maze has been covered.

Thus there can be no general algorithm for finding the path between entrance and exit. QED.

Is this correct, or have I made a mistake somewhere ?

-- salsaman@xs4all.nl 29 Aug 2006.

What exactly are you trying to prove? It's not clear to me. The statement, "Thus there can be no general algorithm for finding the path between entrance and exit.", is false. You can always use breadth-first search to find the shortest path or discover that there is no path. Eric119 01:40, 30 August 2006 (UTC)
That's true. What I meant to say is "there can be no general algorithm that guarantees finding the single route solution without trying every route". But perhaps I am simply misunderstanding the problem. -- salsaman@xs4all.nl 1 Sep 2006.
There are several problems with this approach. One is that, because the maze itself is part of the input, even a search of the entire maze requires only linear time, which is polynomial (assuming a reasonable encoding). Another problem is that for any particular instance, the goal node is fixed; you have not established the existence of a single goal node that forces the search to visit every location. Yet another problem is that you have not established that this problem is NP-complete (and it's not) - there are many problems for which finding and verifying solutions can both be done in polynomial time, but none of them are known to be NP-complete. Deco 03:05, 2 September 2006 (UTC)

Thankyou. Your answers are helping me to understand the problem better. Reading it all through again it makes more sense. However there seems to be a contradiction on the page, it says:

'If P = NP, then this is a polynomial-time algorithm accepting an NP-Complete language. "Accepting" means it gives "YES" answers in polynomial time, but is allowed to run forever when the answer is "NO".'

I don't understand this. If there is an algorithm which finds a solution in polynomial-time, then surely it should just terminate after the maximum number of steps, and say that the answer is "NO". Why would it need to run forever if the answer is "NO" ? If it can't say "NO" in a finite time, then that means there could still be a solution left to find.

-- salsaman@xs4all.nl 02 Sep 2006.


More importantly, I guess what I meant by the maze analogy, and moving the exits is, what about Godel's theorem ? What I mean by this is, consider the subset-sum problem. I conjecture that for any proposed solving algorithm you give me, I can construct an input set which you can't solve in polynomial time. For example, suppose the algorithm starts with the lowest numbers first. I can simply construct a set with a solution 1,000,000 and -1,000,000, and fill all the rest with small numbers. If the algorithm starts with biggest numbers first, I can construct a set with 1 and -1 and fill the rest with larger numbers. If the algorithm starts by considering pairs, then triplets, I construct a set where all the numbers must be summed to reach 0.

All I need do is look at the algorithm, and figure out a way to "break" that algorithm, i.e. construct a set where the solution will be the last possible combination tested.

-- salsaman@xs4all.nl 03 Sep 2006.

In short, yes, it's easy to come up with counterexamples that indicate why a particular heuristic strategy is not polynomial-time in the general case. The difficulty is finding such counterexamples for all possible solutions, of which there are an infinite number, most of them very complex. Don't forget that the algorithm can alter its behavior depending on the specific input, which includes the location of the goal. I'm also unclear on the "allowed to run forever on NO" thing - complexity (as opposed to computability) generally only considers algorithms that always terminate. Deco 10:21, 3 September 2006 (UTC)
But isn't that precisely what Godel's theorem says: for any consistent formal theory (algorithm) that proves basic arithmetical truths (solves the subset-sum problem), it is possible to construct an arithmetical statement (input set) that is true (has a solution) but not provable in the theory (solvable in polynomial time). Secondly, wouldn't finding an "algorithm breaking" input, itself be an NP-complete problem ? Thus if P==NP, I could find such an input in polynomial time, and then feed it into the first algorithm ! Finally, regarding the quote, I cut and pasted it from the article page. It's also mentioned above, "polynomial-time algorithms". Look for yourself ! My point is, the program should always be able to halt ! It just needs to compute the maximum time it would take to find a solution - if it can find it in polynomial time, then this should be just a function of the input length. Then if no solution is found in this time, it outputs "NO" -- salsaman@xs4all.nl 03 Sep 2006.
Answering my own first question, I suppose that if Godel's theorem were true for NP-complete solvers, then I would just construct a third algorithm, which finds a "second algorithm breaker". Then that second algorithm breaker would be my chosen solution for the subset-sum problem. But this leads to a contradiction - which implies that either an NP-complete solver would not be a consistent formal theory, or that P!=NP. I don't know what to make of this. I am probably stating something obvious, but I've not seen things expressed in exactly this way elsewhere.-- salsaman@xs4all.nl 3 Sep 2006.
Answering my own question again. I suppose a set with an infinite number of elements could be solved in polynomial time, but that time could also be infinite. So this would be the application of Godel's theorem... And given such an infinite set, it could take an infinite amount of time to produce a "NO" answer, so that also addresses my other point. -- salsaman@xs4all.nl Sep 03 2006

[edit] P=NP?

I noticed THIS on arxiv.org: [1] and [2] 70.49.174.170 01:23, 5 September 2006 (UTC)

Well, all seem more complicate. I don't know if author of this papers is right or wrong. Have a look, people, if you feel competent.70.49.174.170 01:46, 5 September 2006 (UTC)
For the purposes of wikipedia it doesn't matter yet: it's dated sept 2 2006: any serious claim of such a proof would go through scrutiny from the scientific community and then either a flaw would be found or it would become gennerally accecpted. Unless if something gets into a peer reviewed scientific journal I don't think it would be worth mentioning. GromXXVII 02:04, 5 September 2006 (UTC)
There are lots of preprints floating around that claim to prove or disprove P=NP. Unless and until this one gets more scrutiny we'll have to reserve judgement. This one doesn't contain any signs of obvious crackpottery in the first two pages, which is an improvement on at most such preprints. --Robert Merkel 02:19, 5 September 2006 (UTC)
OK, the paper is mentioned here. Diaby has been working on this for a while, and it's been discussed extensively on the comp.theory newsgroup. They don't accept it, and he hasn't been able to interest any of the academic theory community in his result. --Robert Merkel 02:33, 5 September 2006 (UTC)
I have read this article and it seems odd for me. Author reduces TSP to LIP (Linear Integer Programming is known to be NP-hard, so TSP can be reduced to LIP - nothing new), then he construct non-integer version claiming that solution to this non-integer version allow to answer TSP problem. As we know non-integer linear programming is most often polynomially bounded (in terms of time used), but no one (before this paper) proved that integer programming may be reduced to non-integer version. Well... incredible ;-).
There are several methods for proving theorems – proof via symbol over-usage, proof by hands weaving, proof by authority (Simon says...), proof by discrete change of assumptions, proof by negation of assumptions, proof by assumption that thesis is true, proof by mumbling and so on :-). This proof is hard to understand because of symbol over-usage, but in my opinion possible to be found false.
Meanwhile - I have developed my own very formal proof that P < NP - it is being under review by Discrete Applied Mathematics and hopefully will be available next year :-). See My HomePage. --Hcsradek 9:55, 7 September 2006 (GMT+2)
You forgot one: proof by removing your article from a website :-) -- anonymous.
I have an excellent proof of P ≠ NP, but it is too large to fit in the margin of this talk page. Deco 10:50, 10 September 2006 (UTC)
Proof by removeing is cool :)... but it was not main idea - I have prepared short summary, and it is avaliable here. --Hcsradek 11:39, 18 September 2006 (GMT+2)
It took longer then I expected but now I can formally prove that article [3] is incorrect. Counter example for this model is avaliable in mine article here --Hcsradek 14:54, 20 October 2006 (GMT+2)

Returning to my "maze theory" above - somebody was asking for an example for non-mathematicians. Somebody suggested a breadth-first algorithm for finding the exit. A breadth-first search is an example of an NP-complete algorithm. My point, before I distracted myself (!), was this: Example for non-mathematicians: you are standing at the entrance to a maze. You know for a fact there is exactly one other exit from the maze. The challenge is this: can you walk to that other exit without trying each possible path through the maze, and without just stumbling on it ? Most people would say no, but can you prove it one way or the other ?

-- salsaman@xs4all.nl 10 Sep 2006.

Salsaman, we've already added an easy to understand example of an NP-complete problem at the top of the article: subset-sum. It's about as simple an example as you can get. Adding woofly analogies to ill-defined problems is not likely to improve comprehension any further. --Robert Merkel 02:25, 11 September 2006 (UTC)

[edit] History

Please could everyone look at the reverts in this article, I have been trying to revert some blatant opinion expressing but I'm having trouble with user:82.250.60.102, please check and make your opinion on the matter as I have little knowledge of the subject Ryanpostlethwaite 17:47, 23 November 2006 (UTC)

If you have little knowledge of the subject, please move on... 82.250.60.102

Please check over the history for a few edits ago. We have come to an agreement now and in my opinion his new propsal of the title (see edits) is no longer biased. But I am still erring on caution so please check and ammend appropriatly Ryanpostlethwaite 18:00, 23 November 2006 (UTC)

I'm sorry but this line IS an opinion which I've just removed as other people have different views on P=NP; 'However, holding such belief for such reasons is tantamount to believing that Higgs Boson does not exist' Ryanpostlethwaite 22:06, 23 November 2006 (UTC)

I'd agree with you there Ryan. I'm not an expert in the subject but 82.250.60.102's edits read like soapboxing to me. However, Wikipedia has a three revert rule, so if it's a big issue you need to look into Wikipedia:Resolving disputes. — jammycakes 22:23, 23 November 2006 (UTC)

If you are not an expert then do not even think about anything as being soapboxing or not being soapboxing. Ryan could not be aware of three revert rule, he just arrived in town. The only big issue here is people like Ryan or Jammycakes expressing their opinion through reverts, supposedly in order to prevent violation of non-opinion policy. Ryan is the violator. user:82.250.60.102

I don't think we will need to go through Wikipedia:Resolving disputes. I'm confident that anybody who knows a bit about P=NP will agree that "Some computer scientists may err in professing that PNP" is not a neutral formulation and that 82.250.60.102's edits should be reverted. -- Jitse Niesen (talk) 02:56, 24 November 2006 (UTC)

This is your opinion about which you are confident. By the way, I know quite a bit about P=NP. 82.250.60.102

I think now that this is highly out of order, you have 3 neutral people saying that the items you added are biassed yet you still persist to revert the edits back to yours. By all means lets discuss this on the talk page BUT User:82.250.60.102 needs to reflect on what everyone else is saying Ryanpostlethwaite 11:16, 24 November 2006 (UTC)

Neutrality is by no means a way of deciding what is the right thing for this Wikipedian entry. If you were talking about knowledgeable or competent people, then I guess one could listen to your "all means". 82.248.114.187

(copied from User talk:82.250.60.102, addressed to him/her): You have given no valid reason at all for why you want to change the paragraph in the article. You said in this edit summary that the original paragraph was expressing an opinion. However, the original paragraph said "Most computer scientists believe that PNP", which you want to replace with "Some computer scientists may err in professing that PNP". Surely, your version is more opinionated in that it implies that in fact P=NP. You also said in this edit summary "please discuss in the talk page". Well, it is rather hard to discuss it without knowing why you want to change it. By all means, tell your arguments and I'm interested to discuss it. However, editing the article without giving any arguments when requested will not do. -- Jitse Niesen (talk) 11:29, 24 November 2006 (UTC)

The original paragraph did say that "Most computer scientists believe that PNP". This is as much an opinion as saying "Some computer scientists may err in professing that PNP". What source do you have to backup this majority statement? How do you know "most" computer scientists believe that? Would any of us see this as a "belief"? ... Well, that's maybe the only point where this statement comes close to truth... P!=NP became a belief for some colleagues. You ask for "arguments". Did anyone bring up evidence that "most computer scientists" think so? If you knew just a bit about the subject, you would think twice before deleting the analogy with the existence of Higgs boson. 82.248.114.187

The only reasonable way by which you could change your paragraph (if you still believe that Wikipedia should defend this "original" opinion) would be to say: "Computer scientists of the Bill Gasarch type believe that PNP". He (and SIGACT) is your only source. Wake up. 82.248.114.187

The point that I am trying to make is that although you may think it is absurbed that some people believe that PNP, there are some people who think that this is correct. The analogy with the existence of Higgs boson suggests that the people that believe PNP are stupid and completely incorrect, this is an opionion that you are expressing as I am sure the people that believe PNP would have a far different opinion. The paragraph in question is showing why people believe that PNP and it is not trying to show that this is correct. Therefore the changes that you made are unhelpful and need reverting to keep the articles neutrality Ryanpostlethwaite 21:41, 25 November 2006 (UTC)

May I make one suggestion, that we change the title of the paragraph from 'most computer scientists believe that PNP' to 'some computer scientists believe that PNP'. That may even improve the overall neutrality of the paragraph, but all other edits on this paragrph were opinionated and therefore should not be kept Ryanpostlethwaite 21:59, 25 November 2006 (UTC)
Earlier in the article, it says (I long ago added this):
In a poll of 100 researchers, 61 believed the answer is no, 9 believed the answer is yes, 22 were unsure, and 8 believed the question may be independent of the currently accepted axioms, and so impossible to prove or disprove.
So, uh, yes, I interpret that as "most computer scientists believe P ≠ NP", or more accurately, most believe there is strong evidence that P ≠ NP. In any case I think I provided ample counterpoints in that section already in the quotations, and the claim that Higgs Boson is even related to the pure mathematical result of whether P ≠ NP seems silly to me and I wouldn't believe it without a verifiable source. Even then, the wording "some may err" is strongly implying that those computer scientists believing P ≠ NP are wrong. Please leave it like it is and if you can find a source for this Higgs Boson stuff, talk about it somewhere else in the article. Deco 07:05, 26 November 2006 (UTC)

As already said... This poll just reflects what Bill Gasarch made it to reflect. Ryan's change for the title seems sound. As for relating Higgs Boson to this pure computer science result, Deco is clearly unaware of what's silly and what's not. "Some may err" is as strong as saying in the title "Why computer scientists think..." (which implies an absolute opinion of all computer scientists). I will propose a new version beyond Deco's dictatorship and my own stubbornness. 82.253.212.117

You haven't offered any support for this claim that Gasarch in any way forced the result of the poll. Really I'm surprised that as many as 9 out of 100 would think that P=NP, which I think is pretty obviously extremely unlikely. I strikes me as quite plausible that most computer scientists think P!=NP, the much more intuitively reasonable outcome, especially given the coherent picture of the polynomial-time hierarchy that has emerged.
However if you have some countervailing evidence to Gasarch's poll, then please present it. Otherwise I personally will oppose any change in the direction you suggest (including Ryan's proposed change, because we have a prima facie case that "most" computer scientists in fact believe P!=NP, and I see no reason to water it down). --Trovatore

Go for "many", though the many may be wrong (but that's obviously not the problem of Wikipedia). 82.253.212.117

I think now that we have finally gained consesus! If all are agreed we will leave the heading 'most computer scientists believe that P≠NP' and the subsequent paragraph intact and revert (as already done so) all changes made to it. Many thanks for everyones contributions Ryanpostlethwaite 23:37, 26 November 2006 (UTC)

[edit] Questions

Hi all, not an expert but find the subject very interesting. The debate over this question has a lot of interesting elements at its core. Basically, I think this is one of the most philosophically unique debates in science. First, its resolution has important consequences for several fields. I also think the nature of the dispute is interesting, in that (at least, as the dispute is represented in this article,) the people who say P!=NP are making an empirical argument based on current failure despite lots of effort, while the proponents of P=NP are holding out for a theoretical proof. Thanks for all of your energetic and thoughtful development of the article.

Due to the nature of the dispute, I find myself jumping a bit into the future, and I have a couple of questions for people who really think about this stuff. Any discussion would be great, and if you could post your reply on my talk page so that I would see it I would be very appreciative. Again, these are just some questions, I profess very little previous exposure to the subject.

Questions: Given the differing nature that seems to be emerging among the two "camps" on this question, and given the way that the question is formulated, how would a theoretical proof that P<NP even look or work? It seems that the "proof" that P<NP would simply be a sufficiently long list of examples of things like subset-sum, at least if current approaches to the question continue. I know this question seems silly, in that we don't know what the proof will look like until we find it, but I'm asking more generally how a question like this could even be resolved. I don't understand fully the interactions between the question and challenging examples, such as subset-sum.

IF P=NP, does that mean necessarily that every question like subset-sum must have an expedient answer and we just haven't found it yet? Therefore, does it mean that in EVERY persisting example where seemingly P<NP, all parties have overlooked a stroke-of-brilliance way to answer the example question that is as quick as it is to verify it? This seems exceedingly unlikely, doesn't it? Another way to ask this question is to say: does P=NP have to hold for everything to be considered true? Thus, if P=NP, does that mean that it is IMPOSSIBLE to dream up a question which is long to answer but quick to verify?

It seems to me that the philosophy of P=NP, the underlying claim, is that there is a fundamental identity of both answering a question and verifying the answer, that they must by definition have such a close brotherhood that, when the issue is fully explored and properly understood, their computational requirements converge. Something like subset-sum is currently easier to verify than to prove. Let's say, for sake of argument, that we figured out THE fastest way to verify a subset-sum answer, and it's x amount of time. Does P=NP, if true, imply that we have also found the amount of time that the perfect method would require to answer a subset-sum question, even if we have no idea what that method is?

Thanks for your help! Gcolive 20:58, 13 January 2007 (UTC)

Great questions, Gcolive. I'll try to directly answer some of your specific points. This doesn't have much to do with improving the article, but what the hell.
"the people who say P!=NP are making an empirical argument based on current failure despite lots of effort [...] IF P=NP, does that mean necessarily that every question like subset-sum must have an expedient answer and we just haven't found it yet? Therefore, does it mean that in EVERY persisting example where seemingly P<NP, all parties have overlooked a stroke-of-brilliance way to answer the example question that is as quick as it is to verify it? This seems exceedingly unlikely, doesn't it?"
You have described quite well the reason why many people believe P ≠ NP. On the other hand, some of the deepest results in complexity theory were the proofs of NL=coNL (Immerman-Szelepcsényi Theorem) and L=SL. At one time a logspace hierarchy analogous to the polynomial hierarchy was defined, but turned out to be trivial. And it wasn't that people hadn't tried to find efficient algorithms for NL-complete problems, although they certainly haven't received the heaps of attention NP-complete problems get.
how would a theoretical proof that P<NP even look or work?
That's just it - we have no idea. We know that P ≠ EXP by the Time hierarchy theorem, a basic diagonalization argument, but oracle results show that this won't work for P ≠ NP. Likewise, we know that AC0NC, because it does not contain the parity function, which was probably the most important separation result of recent times. But the theory of natural proofs tells us that similar ideas won't prove P ≠ NP. In short, pretty much everything that ever worked, won't work - we need something new.
Let's say, for sake of argument, that we figured out THE fastest way to verify a subset-sum answer, and it's x amount of time. Does P=NP, if true, imply that we have also found the amount of time that the perfect method would require to answer a subset-sum question, even if we have no idea what that method is?
Sort of. It tells us that solving it is at worst polynomially slower than verifying it. For example, verifying it might require O(n2) time asymptotically, whereas solving it requires O(n5) time. A great example of this is primality certificates, which are short proofs of a number's primality that (today) can be verified more quickly than their primality can be established, even though this problem is in P (ref AKS primality test). Nevertheless, yes, P=NP would prove (perhaps nonconstructively) the existence of polynomial time solutions to any problem for which we have a polytime verifier.
Thus, if P=NP, does that mean that it is IMPOSSIBLE to dream up a question which is long to answer but quick to verify?
Roughly speaking, yes. It can still be polynomially slower to solve than to verify, and it may still be the case that a separation exists at higher levels of complexity, somewhere above NEXPTIME.
the proponents of P=NP are holding out for a theoretical proof
I'm not sure I'd go quite that far - there aren't many "proponents" of P=NP at all, and most people who believe that seem to believe it in analogy to other results like the ones I mentioned above. The people who "aren't sure" though are waiting for a proof and more importantly for the tools that would be needed to construct such a proof.
Hope this helps. Deco 08:23, 14 January 2007 (UTC)

[edit] Confusion

Hello all, I'm trying to understand the "Polynomial-time algorithms" section of the article. I'm not convinced that if P=NP, that in fact an algorithm for an NP problem must be polytime. I was under the impression that a proof that P=NP would imply that there existed an equivalent polytime algorithm, not that a particular algorithm must be polytime. Can someone explain this to me? Arkleseizure 21:48, 27 January 2007 (UTC)

Your question is a bit vague. Which part of the argument do you not follow? Do you not believe the given algorithm is correct, or do you not believe it runs in polynomial time (given P=NP)? --Mellum 22:02, 27 January 2007 (UTC)
I think Arkleseizure is making the point that the article claims that "if P=NP then any algorithm for an NP problem runs in polynomial time", which is not true. However, I don't see where the article makes this claim. -- Jitse Niesen (talk) 07:55, 28 January 2007 (UTC)

[edit] Good job

Hello, I just read the entire article and I want to say good job to the people who wrote it. This is a very complex subject but you people have found a way to explain it in a simple, clear, easy to understand way. Good Job. 70.49.201.164 03:12, 4 February 2007 (UTC)

[edit] Formal Definition

Why is (Note that the abbreviation NP stands for "Non-deterministic Polynomial" and not for "Non-Polynomial".) at the end of the Formal Definition section? It should be readily available in the introduction to this entry.


[edit] Independent of axioms

The article vividly explains the evidence, theoretical arguments, potential consequences, and history surrounding the statements P≠NP and P=NP. But Gasarch's poll found that some researches believe that the statements are independent of the accepted axioms. I feel that this article should give that viewpoint equal exposition as it gives the other two. Otherwise, terrific article. Thanks! --131.193.178.113 20:53, 13 February 2007 (UTC)

A good point. We really ought to discuss this viewpoint in more detail, but frankly it involves a deeper background in logic, and I don't think I could explain it myself. I might be able to find some references or hopefully some contributors who know better than I. Deco 21:42, 13 February 2007 (UTC)

Independent of which accepted set of axioms? Peano arithmetic? ZFC? Something in between, or something stronger? It makes a difference -- "independent" is not a truth value, contrary to some people's vague notion of the idea. Those who responded in that way to the poll may have had viewpoints sufficiently disparate among themselves that no single coherent exposition will really be possible, and in any case we don't have a reference for just what it was that those who responded thus were thinking. (Or I assume we have no such reference -- if anyone does have it, of course please share.) --Trovatore 22:07, 13 February 2007 (UTC)

Usually ZFC. We have a reference for some of them at least, that are quoted in the poll paper. Deco 02:56, 14 February 2007 (UTC)
Hm, I took a look at the poll paper, and the author guesses they were talking about ZFC, but we have nothing from the respondents. Except that three other respondents said it was not independent of PRA! So I don't think we have much to go on here. Independence from ZFC, as opposed to other stronger or weaker theories, is nothing special, and no particular reason to single it out. --Trovatore 05:15, 14 February 2007 (UTC)

[edit] New discussion

All,

Ok, I've asked nicely for my edits here and in Wikipedia:Love, Wikipedia:Life to be left in-situ, but there seems to be a persistent desire to remove them. Since the addition logically invalidates the rest of the article here and articles elsewhere I guess the revertees are protecting sunk investment. Ho Hum.

I will not put the additions back on Wikipedia again.

Your choice.

-matt dspart 16h10, 26 February 2007 (GMT).

So what exactly do love & life have to do with P and NP? Why did you vandalize this article because of reverts on those articles? Why are you complaining about those reverts here instead of on those 2 talk pages? It's pretty clear your "additions" to those 2 articles are nonsense. Blokhead 22:36, 26 February 2007 (UTC)

Blokhead (apt, perhaps? Seriously, are you cube-shaped? Have N of M components in structured groups? Have an ultimate "Queen" for your drones? Need Data? I might have some...),

Re. Love, etc. Triangulated place-holder in, well lets call it a tuple-space for now.

Re. "nonsense": E=mc^2, and you *fully* understand that, right? Understand it a little? Heard about it? Got a t-shirt with it on with an old guy sticking his tongue out, maybe.

So, "pretty clear" then, eh? I think not. However, instead of dismissing it out of hand, I'm will to listen to arguments which show any fault with the logic here, or that this is incorrect.

To put my money where my mouth is (and, yes there is some), I bet you *or anyone, anywhere* publicly, every single penny that I have - or a choccy bar, Kip Thorn - that P=NP . In *one* line. For it is Extraordinary Proof/HowdyDoody Time, LOL.

Also, there's a much better solution to that Fermat's Last (4 lines), but I can't be arsed.

Think consequences. Think Hiroshima.

Anyhow, better things to do than talk to Blokhead[s]. Mornington Crescent to you, again, BTW.

-matt dspart 23h00, 26 February 2007 (GMT).


I'm going to agree with dspart, but only because I'm drunk. --131.193.179.146 04:10, 1 March 2007 (UTC)

[edit] Are there really hard problems?

I published a proof that there is no proof that any decidable decision function in not in O(n). It can even be extended to O(1).

Any comments? Uri Even-Chen 19:57, 21 April 2007 (UTC)

I haven't looked at this, but it clearly is possible to show a decision problem cannot be solved in O(1) (e.g. does a string of bits contain a 1).
Also note that the best execution time depends on the model of computation. A Turing machine doesn't support random access so many things require worse time complexity on a Turing machine. Eric119 20:09, 21 April 2007 (UTC)
Okay, I looked at this, and it is very confusing. You seem to be using the same letters to denote different things. The error (so far as I can tell), is exemplified in the statement, "Calculating each bit has been proved to be a hard problem." The problem as a whole has a time complexity, but when you specify a specific input the answer for that specific input can always be done immediately, just by returning whatever the answer is. This says nothing about the increase in time required as the problem size increases.
The fact that a subset m can be easily solved does not imply anything about the difficulty of the larger set. Eric119 20:30, 21 April 2007 (UTC)

I will try to express this in simple words:

1. every specific question with yes or no answer (1 or 0) can be computed in O(1) (there is no input, the input is included in the question itself). The question must be defined in a deterministic way.

2. for every n, in order to prove that the set of possible inputs of n bits (2^n possible inputs) cannot be calculated in less than c*n steps (for any constant) by any algorithm, at least one specific example of such an input needs to exist.

3. we come up with a sequence of inputs of size n (from one to infinity) for which it is known that it cannot be calculated in less than c*n steps (for any constant) by any algorithm.

4. for every n, we create a new algorithm that will check the input, if it is equal to this specific hard input then we can display the known answer, otherwise we can use another algorithm - whether efficient or not.

5. the result is a contradiction.

Which means, a deterministic proof doesn't exist.

But, a statistical or probabilistic approach might show that the chances to find such an algorithm who computes the function in O(n) are very low. It doesn't mean such an algorithm doesn't exist - maybe it does. We can't prove it doesn't exist, the only thing we can say is that we don't know it.

It's easier to understand when the size of input is specific. For example, if the size of the input is up to 50,000,000 bits - we need to prove that for every algorithm of size up to 50,000,000 bits, calculating the function in O(50,000,000) steps is not feasible. Can we check 2^50,000,000 algorithms? Probably not. But there are problems for which we can estimate that the chances to find such an algorithm are low.

An example of a specific problem is factoring numbers. If we receive a number of n bits, and we need to calculate its prime factors which can be represented together in m bits, then each bit of m is a decision problem, and the total calculation will not take more than O(m*n). If each bit is calculated in parallel, or if there is even a more efficient algorithm, O(n) can probably reached too. This can be easily verified because if we already know the prime factors for any specific input, calculating the function is very easy. It is a hard problem since a solution is not known to us, but we cannot prove that such a solution doesn't exist.

Uri Even-Chen 22:24, 25 April 2007 (UTC)

Wikipedia talk pages are for discussing the article, not for discussing or promoting your research (and blogs are generally not considered reliable sources, so that shouldn't be cited in the article); this behavior, and especially your mass posting of this is canvassing and spamming. I have removed the link from your comment. -- intgr 21:00, 21 April 2007 (UTC)
The short refutation to your "proof" is Time hierarchy. The long one is that your proof goes wrong in step 2. If a language is not decidable in O(n) time, then for all O(n)-time algorithms A, for infinitely many input lengths n, there is an n-bit input (possibly different for each algorithm) on which A gives the wrong answer. Of course it's ridiculous to think that there is one input that every algorithm gets wrong. All you showed was that the latter was impossible. Hopefully you will stop persisting along these lines. Blokhead 23:39, 25 April 2007 (UTC)

[edit] Simplification of introductory paragraph

I have taken a lot of the technical details out of the introductory paragraph to make it more accessible to people who aren't theoretical computer scientists. All the intro needs to do is introduce the flavour; the technical details are presented quite adequately in the body of the article. --Robert Merkel 12:52, 5 May 2007 (UTC)


[edit] Suggested addition to "Is P really practical?"

I would like to propose adding another reason to the list of reasons in this section of the article. I would like to add something like this:

  • It ignores the speed of the hardware that the algorithms are run on. Currently, a typical real-life CPU (or a collection of real-life CPUs working together) can perform somewhere between 109 and 1015 operations per second. So it's true that a polynomial-time algorithm runs fast and an exponential-time algorithm runs slowly on currently existing hardware. But if, at some point in the distant future, a CPU is invented that can perform 1010100 operations per second, then exponential-time algorithms with long inputs could be run on that CPU very fast. For example, an algorithm with a running time of 2n, with an input of length n=10000, would finish in less than a second. Note that, given the current most widely-accepted theories of physics, such a CPU is not possible; see: .[4]. However, also note that some theories that were once widely accepted were later shown to be wrong, and our current theories may be proven wrong as well. At the other extreme, if a CPU is built that takes one day (24 hours) to do one operation, then even a polynomial-time algorithm requiring 1 million steps would be unacceptably slow on that CPU.

Would it be OK to add that? Is it too long? I'm not sure if I should have a citation in there. I don't have a source for this; it just seems to be somewhat self-evident to me. Also, I'm not an expert in complexity theory or physics, so perhaps someone who knows more can edit what I wrote to make it more accurate. Navigatr85 23:44, 3 July 2007 (UTC)

This is sort of like the first, second, and last points already in this section (i.e. practicality is actually a function of how fast you can execute the number of steps involved in solving the range of inputs you're actually interested in). Nothing in this section is referenced. Rather than add more unreferenced material, I'd suggest finding some references about the topic of practicality and rewriting this section to summarize what they say. Oh, and BTW, I don't think surmising a CPU that can perform 1010100 operations per second is reasonable in a section on practicality. -- Rick Block (talk) 17:54, 4 July 2007 (UTC)
I would say this actually covers a completely different topic than the first, second, and last points in this section. The first and second points talk about the running time of the algorithm in terms of the number of steps that the algorithm requires; they don't talk about how fast each of those steps will be executed. I think you're referring to the fact that using a new CPU and running an existing algorithm on the new CPU provides a speedup by a constant factor. This is true. But that's different than inventing a new algorithm that is faster, by a constant factor, than the old algorithm. For example, the first point is suggesting that we compare the following two things (or something along the same lines):
  1. an algorithm in which the number of steps is 101000n, run on a particular CPU
  2. another algorithm in which the number of steps is n, run on that same CPU.
But I'm suggesting that we compare something like the following two things:
  1. an algorithm where the number of steps is 101000n, run on a CPU with a clock speed of 1 Hz
  2. the same algorithm, where the number of steps is 101000n, run on a CPU with a clock speed of 101000 Hz.
Meanwhile, the last point discusses new models of computing, not equivalent to a Turing machine. But I'm actually talking about something equivalent to a Turing machine that just runs very, very fast.
I agree that the ideal thing to do would be to find published books that dicuss practicality and summarize parts of them. Regarding "surmising," I thought that this section already does some surmising. For example, the first point suggests an algorithm that takes 101000n steps, but I don't think any algorithm has been discovered that solves a useful real-life problem and takes 101000n steps. Similarly, I don't think a useful algorithm has been found that takes O(n1000) steps, or O((1+10-10000)n) steps, or O(2n/1000) steps. (Please correct me if I'm wrong. Of course, I admit that the terms "useful" and "real-life" are subjective.) So the first and second points seem to be asking questions like "What if a useful O(n1000) algorithm were found?", and then answering those questions. Similarly, I'm asking, "What if a 1010100 Hz CPU were built?", and I'm attempting to answer that. Yes, the section is about practicality, but more specifically, it's about exceptions to the rules of thumb that we usually use to determine practicality.
Navigatr85 01:12, 6 July 2007 (UTC)
It's a valid point but much too long. I would prefer to say simply that "Advances in CPU technology may make exponential-time algorithms practical for practical ranges of problem sizes." Also, if you ask me, the whole discussion in that section is a lot more relevant to asymptotic complexity in general and ought to be moved out of this article. Dcoetzee 01:18, 6 July 2007 (UTC)

This point argues against itself: any improvement in processor speeds gives more benefit to polynomial time algorithms than it does to exponential time algorithms. If you have a computer that goes 10^1000 times as fast, then you can solve an instance of a O(n^2) problem that's 10^500 times bigger, but an instance of a O(10^n) problem that's only 1000 times bigger. Similarly for reductions in speed: polynomial algorithms suffer less than exponential ones.

In any case, the point is already made in the article; there's no point in saying in ten different ways that complexity classes are all about asymptotic behaviour but in practice we have only finite resources. Gdr 15:14, 6 July 2007 (UTC)

That's a good point, Gdr.....But it depends on how fast memory sizes grow in comparison to CPU speeds. If you build a CPU that runs at 1010100 Hz, but if at the same time, the largest hard drive in the world can only store 1050 bits, then both exponential algorithms and polynomial algorithms would be fast for the largest possible input size. (I'm assuming fast hard drive read/write speeds.) I don't think the point is already made. This section doesn't seem to contain anything yet about exactly how large our real-life finite resources are. --Navigatr85 17:04, 10 August 2007 (UTC)

Since no one has raised any further objections, I'm going to add this to the article. However, I have a feeling that people WILL raise further objections immediately after I add this to the article. :) Feel free to post your further thoughts; I think it's a good discussion. --Navigatr85 19:20, 20 November 2007 (UTC) —Preceding unsigned comment added by Navigatr85 (talkcontribs)

The objections I already stated were not addressed. I've replaced the point with the abbreviated wording that I suggested. I really think this whole section ought to be somewhere else, but that can be addressed another time. Also, avoid citing talk pages (or any page on Wikipedia) - it's not considered a reliable source. Dcoetzee 20:20, 20 November 2007 (UTC)
I apologize, Dcoetzee. I thought I had addressed your first objection. You said my original proposal was much too long, and I thought you were just saying that it should be about the same length as the first point and the second point in the list. The version of my paragraph that I added to the article wasn't the same as my original proposal. It did have about the same number of words as the first and second points. Are you saying that your version is superior to my version, in and of itself? Or are you just saying that, in general, a shorter version is better? I can understand that, because, as you said, this discussion is somewhat off-topic from the P=NP problem, so I guess you might be saying that you just want to have less off-topic text in this article. I saw that CRGreatHouse added a link to "Cobham's thesis." What do you think about moving this whole section out of the P=NP article and into the Cobham's thesis article? And I think that if it gets moved, then, after we move it, it should have a discussion of hardware, including some examples with numbers. I think this is the reason for Nicolaennio's confusion in his posts on this talk page. See my response to him below. Also, I wasn't citing the talk page. I was just trying to put a link similar to the links that already exist in wikipedia that say things like "where?--see talk page". For example, see the first paragraph of this article: Tying (commerce). Sorry for the confusion. --Navigatr85 20:56, 2 December 2007 (UTC)

[edit] Suggested move

Considering that we already have articles on P and NP, this article appears at first glance to be redundant. Only after reading it is it clear that this article discusses the P = NP problem. I'd suggest we rename it to suggest this; maybe P=NP problem or something like that. What do you think? Dcoetzee 22:24, 24 July 2007 (UTC)

Its title at http://www.claymath.org/millennium/ is "P vs NP Problem" - if we're renaming (and I'm not at this point advocating a rename), using this name (or, following Wikipedia capitalization guidelines, "P vs NP problem") seems like a reasonable idea. -- Rick Block (talk) 01:10, 25 July 2007 (UTC)
I disagree with "vs" - it's not a contest! I do agree with an article move though, I also found it confusing coming here specifically to look up the equality problem and the title of this doesn't suggest it very well. Remy B 08:01, 25 July 2007 (UTC)
I think here "vs" is intended in the sense of "as opposed to", as in, is there a difference? Ideally I'd prefer "P≟NP problem", where "≟" is the equals with a question-mark over it, but that might not have wide browser support. Although it looks odd, I'm currently leaning towards "P=NP problem", as that immediately evokes the topic in question. Dcoetzee 00:17, 26 July 2007 (UTC)
I dont really think P≟NP problem would help people trying to search for the article, considering the character isnt on a standard keyboard. P=NP problem looks good to me. Remy B 08:06, 26 July 2007 (UTC)

[edit] Another suggested move

Writing P=NP with no spaces around "=" is generally regarded as incorrect according to Wikipedia conventions; it should by P = NP instead. I'd move the article to P = NP problem (now a redirect page) instantly if fixing all the links were not going to take a while. Michael Hardy 03:31, 15 September 2007 (UTC)

Good point. I can handle automatically fixing redirects, provided everyone agrees this is a good idea. Dcoetzee 07:52, 17 September 2007 (UTC)

OK, I've moved it and fixed the resulting double redirects. The next problem is fixing all the links. Michael Hardy 16:17, 21 September 2007 (UTC)

[edit] Delete "Is P really practical?"

What's all this stuff about P is impratical, it is better a low exponential, we have few resources and so on? I think that when you speak about P is impratical you make an hypothesys on the length of input, but this hypothesys is not justified!!!!! Why thinking that input is short? What if our x^(10^100) takes as input whole atoms in universe? Is it worse than e^(-10000*x)? I do not think so, not only, but all this stuff denotes an ignorance about the meaning of big O notation. If I say that e^(-10000*x) is not in O(x^(10^100)) I mean there is a constant after that ≈e^(-10000*x) is greater than x^(10^100), so on inputs that bigger we still prefer exponential? So not only this section is WRONG, but it also creates CONFUSION. I agree with Dcoetzee.

--Nicolaennio 12:31, 21 September 2007 (UTC)

I don't think you do agree with me - I think it's out of place in this article since it doesn't involve in particular the P=NP problem but rather the characterization of P as a practical class and more generally, whether the definitions of complexity classes capture the separation between tractable and intractable problems. I do believe the section is accurate and raises (albeit verbosely) some important points. Dcoetzee 23:17, 21 September 2007 (UTC)
Thank you for your prompt response. I was wrong, I do not agree with you.As said before I do not think it should be placed here, but I also do not think it is accurate: It lacks a definition of practicity, 'It ignores constant factors' has not been justified, 'It ignores the size of the exponents' does not have a meaning without the definition of practicity, 'It only considers worst-case times' can be a good point, but it is a common point for all complesity classes (for NP too), 'It only considers deterministic solutions' in general problems do have a deterministic solution, a subset of them consider also approximation, but we want to be general. 'quantum computers' should not be placed here. My suggestion is that if there is not a widely accepted and hardware-free definition of practicity, this section can only create confusion and misunderstanding of the problem --Nicolaennio 09:16, 25 September 2007 (UTC)
A also agree with Dcoetzee. It is important to keep these things stated somewhere, but probably not in this article. P as a synonym for "practical" is a great oversimplification, but it's also fairly standard in complexity theory. It's worth pointing out (1) why it's a useful and meaningful simplification for complexity theorists, and (2) where exactly it simplifies things and doesn't reflect our intuition about "practical". Anyway, should this section be moved to P (complexity)? Blokhead 14:53, 25 September 2007 (UTC)
If a definition of practicity is given somewhere (Blokhead, you said is is fairly standard, can you give me a link), it would be a good idea to include it in P. Unfortunately it would go for every other class, infact also L can be very impractical, and also EXPTIME can be very practical. It seems that the suggested notion of practicity in the article is very tranversal, infact is something like 'choose a reasonable amount of time, the best processor physics allow us to build (not quantum), expect a certain length of input i_length' and now define PRACTICAL the set of complexity classes for which it exists a Turing machine which ends, in the worst case, in the reasonable amount of time. I think that PRACTICAL intersects, but not contains, every complexity class. --Nicolaennio 08:03, 26 September 2007 (UTC)
That is the entire problem. There is no definition of what is practical. It is not a complexity class. It is an imprecise and intuitive notion, which is why no complexity class completely characterizes it perfectly. Practicality depends on hardware, model of computation, input sizes. Like you say, there are (intuitively) practical and impractical problems in every complexity class, which is one of the main points of this section I think. But now I am confused -- are you still arguing that this section is confusing and misleading? It sounds like you are mostly agreeing with it. Still, it is not worth giving up talking about the practicality or problems. That is what ultimately drives complexity theory. It's still a useful and meaningful first-order approximation/simplification to think "P = practical". For a reference, the Arora-Barak textbook has a section (1.4) discussing this topic, and motivates the utility of P while addressing common criticisms. Blokhead 14:08, 26 September 2007 (UTC)

Another point should worth mentioning, by considering practicity as defined in my previous answer, P=NP problem can be seen as a practicity statement, that is, is it possible that every problem (also difficult) turn out to be practical? However in the explaination of the voice it seems that EVEN IF we demonstrate P = NP we did not gain anything because P can be also a 'bad' class (As you can see it does not intersect so much with PRACTICAL). I think this is misleading in the comprehension of the statement, and possibly does not allow the reader to make an own idea on its possible answer --Nicolaennio 08:03, 26 September 2007 (UTC)

I think P=NP is more of a philosophical statement. If P=NP then there is some amazingly clever way to avoid exhaustive search of witnesses on all problems. This clever way may not yet be practical, but it would be significantly faster -- it would only have time to check a tiny fraction of possible witnesses, which would be quite revolutionary from a philosophical point of view. So I wouldn't agree that P=NP would not gain anything. Anyway, I'm a cryptographer, so I probably lose my job if P=NP ;) Blokhead 14:08, 26 September 2007 (UTC)

After all I think that to understand complexity classes it should be important to state, somewhere, that we do not consider hardware stuff, because with enough hardware we can break down any constant or lower exponent, and that any statement is input-lenght - free, that is we prefer a polynomial w.r.t. an exponential because after a certain length it is best. I do not think P=NP is the right place because it is an advanced statement and this is discussion about practicity is at an entry level of complexity theory, so it can only be deceptive for the reader to guess if P is 'good'. So I think that there should be a brief explanation of tape compression and linear speed up theorem in computational complexity theory that motivates O notation (actual motivation is incomplete) and effectively a discussion in P about the fact Dcoetzee and Blokhead say, that is it is usually considered a 'good' class not because it contains fast algorithms, but because it is better than other ones (P is strictly contained in EXP). --Nicolaennio 09:37, 4 October 2007 (UTC)

I think a discussion of the relationship between P and the (fuzzy and subjective) set of practical problems belongs somewhere - either in the article on P itself, or some higher-level article on complexity theory. It's misleading to simply say "problems in P are considered to be practical" - and I don't buy the argument either that it's "better than other classes" - for example, the simplex algorithm for linear programming is of exponential time complexity, but frequently used because of its good practical performance on typical instances, despite the fact that polynomial-time algorithms exist. Dcoetzee 22:47, 4 October 2007 (UTC)

Thanks for producing such a great page. As a complete novice in this area I found the level fairly accessible. I would like some discussion may be of concrete examples of common mistakes. Here is an idea you could maybe shoot down:

I am also interested in the concept of information loss on either side of equations xy = z If given z, xy is much harder to compute than the other way around one expects. The equation is not 'information symetrical', therefore one has to undertake a finding process to uncover that extra information, each side cannot be as easy as each other to compute. Let us say it is almost polynomial then go to wxy = z an increase in inbalance and computional time difference and so forth, eventually we must reach p does not equal NP??? 81.158.153.174 13:26, 20 October 2007 (UTC)

First, factoring actually becomes easier if there are more prime factors. Second, the concept of "uncovering information" probably cannot be formalized in any meaningful way to deal with NP, since NP is about decision problems, where the output is always just a single bit. --Mellum 14:45, 20 October 2007 (UTC)

Sorry about that. I am constantly amazed by my ability to be stupid whilst feeling insightful. I really meant to say if we add more primes of an incrementally growing size to one side of the equation it looks as though we can create a growing complexity on one side in comparison to the other so at some stage we must reach P does not equal NP. As you rightly point out there is a problem with formalizing this difficulty which crashes the proof unless it could be successfully argued that a number does not reveal its factors without testing and that testing is inherently slow as it can only eliminate fractions of the total possible number of divisors. please contact me at piersnewberry at hotmail.com if you want to discuss this further. I eventually followed the QEDen link at the bottom of the page which provide further help for people at my level. Thanks again. —Preceding unsigned comment added by 81.158.153.174 (talk) 19:02, 20 October 2007 (UTC)


Nicolaennio, I think the reason you're confused is because there's no example with actual numbers. Consider this: typical input lengths that users are interested in are between 100 and 100,000, approximately. Let's pick an input length of 100, i.e., n=100. Now, let's say we have a polynomial algorithm whose running time is n2. This is a typical polynomial algorithm. Also, let's say we have an exponential algorithm whose running time is 2n. This is a typical exponential algorithm. The number of steps that the first algorithm will require, for n=100, is 1002=10000. The number of steps that the second algorithm will require is 2100. Now, a typical CPU will be able to do 109 operations per second. To get the number of seconds that the algorithm will take, we take the number of steps and divide it by the number of steps per second. So the first algorithm will finish in (10000 ÷ 109) = .00001 seconds. A running time of .00001 seconds is reasonable, and that's why we call that a practical algorithm. The second algorithm will take (2100 ÷ 109) ≈ 1.3×1021 seconds. Now we can divide by the number of seconds in a year. The second algorithm will take about (1.3×1021 ÷ 31556926) ≈ 4.1×1013 years. Since it'll take 41 trillion years to complete, that's why we call the second algorithm impractical. --Navigatr85 21:20, 2 December 2007 (UTC)

What would you prefer, a polynomial algorithm with running time n1024, or an exponential one with time 2n/16? Which is more "practical"?  --Lambiam 22:57, 2 December 2007 (UTC)
I take as reference an input size that tends to infinity. By no reason this way of thinking is practical. However, to answer to Lambiam, it depends on the size of n.--Nicolaennio (talk) 12:35, 9 January 2008 (UTC)
My question was directed towards Navigatr85, who appeared to be arguing that picking n=100 settles the issue in favour of polynomial algorithms.  --Lambiam 23:07, 9 January 2008 (UTC)
Hi Lambiam. You're right, it's true that an n1024 algorithm is worse than a 2n/16 algorithm. But an n1024 algorithm is not a "typical" polynomial algorithm. When I say "typical", what I mean is this: Throughout the history of computer science, there have been many situations in which a new algorithm was devised to solve a real-life problem. In almost all of these situations, if the new algorithm was polynomial, it had a small exponent. For example, Insertion sort is one algorithm that solves the problem of sorting, and it runs in time O(n2). Similarly, we can look at all the problems that have real-life applications, but are only solved by an exponential algorithm as their best known algorithm. When we look at these, none of them have an expression like "n/16" or "n/1000" in their exponent. For example, the best known algorithm for the Subset sum problem runs in time O(2n/2). (It might not be clear that subset-sum has real-life applications, but it's related to other problems that are useful in real life.)
It is indeed possible to devise a problem whose best algorithm would be O(n1024). (It's possible for any polynomial with any exponent.) The time hierarchy theorem shows that. However, those problems fall into the realm of abstract, theoretical mathematics, and they aren't useful for solving any real-life problems. I'm pretty sure no one has ever found a useful real-life problem whose best known algorithm is O(n1024). The same is true for O(2n/1000) algorithms. In my discussion here, remember that the terms "useful" and "real-life" are subjective, and not well-defined. Also, remember that all this polynomial and exponential stuff is just rules of thumb, not strict rules. Incidentally, if someone devised an algorithm that solved the subset-sum problem, and ran in O(2n/1000), then many industrial engineers would be very happy, because it would be practical to use that algorithm for input values up to n=9000 or so. On the other hand, most theoretical computer scientists wouldn't be very happy upon hearing about that same algorithm, because it wouldn't solve the P=NP problem. :) --Navigatr85 01:43, 17 January 2008 (UTC)

[edit] Formal Definition ?

The formal definitions in this Article are not formal at all. Take a look at the Official Problem Description of the Clay Mathematics Institute to see how P and NP are defined. ( The Clay Mathematics Institute Official Problem Description (pdf))

--CBKAtTopsails 19:06, 24 October 2007 (UTC)

I fail to see the problem with the formal definitions. Can you be more specific? --Mellum 21:48, 24 October 2007 (UTC)


First, I believe that any formal definition should be short and brief and should not be as wordy as the one given in this Article. But the major problem really is a lot of things used in this Definition remain undefined and we cannot build a formal definition based on somethings which are not formally defined.

To name just a few:

1. The word algorithm is mentioned several times in the Definition but there is no formal definition for algorithm. At one point, it says "an algorithm A(w,C) which takes two arguments". At another point, it says "A produces a YES/NO answer". Now, you may argue that we can look at algorithm as a Turing machine in this situation. However, if you go back to the basic definition of a Turing machine, you will recall that both inputs and outputs to a Turing machine are strings. Therefore, you will need to have some explanation as to how this two arguments can be converted into strings that can then be input to the Turing machine. You also need to explain how the output string from this Turing machine can then be converted to a YES/NO answer.

2. The Definition contains the statement "w is a YES instance of the decision problem". This statement needs to be formally defined in terms of languages.

3. Finally, there is an important property about the Certificate that is missing from this Definition. That is, the size of the Certificate is polynomially bounded by the size of w.

--CBKAtTopsails 16:54, 25 October 2007 (UTC)

So please be bold and fix it.  --Lambiam 08:46, 26 October 2007 (UTC)

Note: "There exists" must be changed back to "There exist" because 2 things must exist before conditions (i) and (ii). That is, the binary relation R and the positive integer k.

--CBKAtTopsails 15:46, 9 November 2007 (UTC)

Do we really need formal definitions of P, NP, and completeness? Isn't that what P (complexity), NP (complexity) and Karp reduction are for? I would prefer this article to contain just a brief intuitive description of P and NP (yes, even one that glosses over the finer points), then discussion on why the P vs NP problem is important, why it is hard, possible approaches, and consequences. In fact, I don't think there should be anything here as formal as the stuff you have recently been adding. I don't think we need to go all the way down to first principles for each article, or teach complexity theory from the bottom up each time we want to talk about it. It's wikipedia, so let's only summarize the background info here and just re-use good definitions from elsewhere. If someone is interested in the real definitions, there could be a link to the more formal pages mentioned above. Blokhead (talk) 05:21, 20 November 2007 (UTC)
Yeah, the point of this article was always just to give an overview of the problem, and leave the formal definitions of the classes involved to their own articles. A brief summary with appropriate links might not hurt, but nothing too verbose. Dcoetzee 08:09, 20 November 2007 (UTC)

[edit] Some Additional Comments

Although someone pointed out that the total number of divisions to go through to check if a given x is composite is [x1/2]-1 instead of x-2 where [x1/2] denotes the greatest integer less that or equal to x1/2, in the theory of computational complexity, two methods are considered the same if they both run in exponential time. Even if you can cut down the number of divisions to x1/2, the time is still exponential.

To see why, let's assume that x contains k digits.

We can then write

x=ak-110k-1+ak-210k-2+......a110+a0 ( For example, 323=(3)(102)+(2)(10)+3 )

Therefore, x1/2={ak-110k-1+ak-210k-2+......a110+a0}1/2 which clearly is exponential in k, the size of the input.

Although each division is polynomial-time, the total number of divisions is exponential. Therefore, the total time is exponential.

The question of P vs NP basically asks: "Is there a way to reduce the computation time from exponential to polynomial for all the NP problems?". Just reducing it from one exponential form to another is not good enough.

--CBKAtTopsails 19:45, 9 November 2007 (UTC)

[edit] sic

One of the quotes has a "[sic]" the way it's written:
"The resolution of Fermat's Last Theorem also shows that very simply [sic] questions may be settled only by very deep theories."

I don't know the quote at all, but it could have been:
"The resolution of Fermat's Last Theorem also shows that very simply—questions may be settled only by very deep theories."

Depending on if the "that" refers to the "space of algorithms" or "questions." 71.177.240.147 18:39, 10 November 2007 (UTC)

In either case there's an error - the [sic] discourages people from spuriously correcting the accurate quote. Dcoetzee 23:14, 12 November 2007 (UTC)

[edit] Some more comments on formal definition

First of everything, I beleive that formal definitions are needed because people want to use them to establish theorems using rigorous mathematical arguments. Formal definitions therefore must be precise and cover all applicable situations.

Since this article is rated to be within the scope of WikiProject Mathematics and since you cannot talk about mathematics without talking about formal definitions, I see no reason why formal definitions have to be left out from this article.

I have checked out the articles on P (complexity), NP (complexity) and Karp reduction. Here are what I found:

(i) The articles on P (complexity) and Karp reduction do not themselves contain formal definitions for P and NP-complete. you have to navigate through many other articles spending a lot of time to sort out what information is needed and what information is not needed. People who do not want to spend a whole week or even a whole month to understand all these other articles in order to get to their original very simple objective (i.e. to find out how P and NP-complete are defined) would simply give up.

(ii) The article on NP (complexity) has a different way of defining NP. This different definition, although mathematically equivalent, originated from the concept of nondeterministic machines and is no longer the popular definition. Most importantly, it is not in tune with the concepts and and ideas talked about in the article of "P = NP Problem". For example, the concepts of checking relations, verifiers and certificates.

I believe that there are many people out there who became aware of the "P vs NP Problem" because of the $million prize offered by the Clay Mathematics Institute and I believe that what Clay Mathematics Institute is looking for is a settlement of this problem based on some kind of rigorous mathematical argument.

In summary, including a formal definition in this article will serve two important objetcives:

(a) Provide the convenience of one-stop shopping without having to navigate to many other articles.

(b) Provide additional flavors in the article for different people who come to search Wikipedia on the subject of "P vs NP" with different perspectives.

--CBKAtTopsails (talk) 16:47, 21 November 2007 (UTC)

Responding: NP (complexity) describes both definitions and summarizes a proof of their equivalence. We use the verification definition here because it's more intuitive and approachable without background about nondeterministic machines. It is true that these articles don't currently contain a detailed formal definition, based on the formal sets in the definition of the Turing machine - they rely on background about other concepts in language such as "P is the complexity class containing decision problems which can be solved by a deterministic Turing machine using a polynomial amount of computation time". Although it would be somewhat pedantic, I think there's reasonable justification for adding more detailed formal definitions to the articles on specific complexity classes - at best, I think this article would benefit primarily from a brief comparison of the formal definitions of P and NP, with reference to their articles for details. Dcoetzee 23:02, 21 November 2007 (UTC)

So, are you suggesting all formal definitions in this article should be moved to the articles of P (complexity), NP (complexity) and Karp reduction? If so, why was there a section called formal definition from the beginning? Why didn't people see it a problem until someone puts up a real formal definition in it? Again, I want to stress the point that formal definitions have to be very rigorous and there is no such a thing as less detailed or more detailed formal definition. Section 2.1 of the article NP (complexity) talks about verified-based definition using the subset-sum problem as an example. This section is far from being called formal definition because formal definition cannot be built based on one specific example.

--CBKAtTopsails (talk) 00:19, 25 November 2007 (UTC)

I'm not suggesting that - just suggesting that formal definitions could be useful in the articles on specific classes, and once they're added it'd be redundant to provide complete formal definitions of every class here; it'd suffice to describe relevant differences. I recognise that the "formal definition" in this section wasn't formal, it was just more detailed. And no, I'm certainly not referring to section 2.1 of NP (complexity) - the definition is in the introduction: it's sufficient to say that it's the set of languages recognizable in polynomial time by a nondeterministic Turing machine, or equivalently the set of languages verifiable by a deterministic Turing machine in polynomial time, given a valid proof string. This is not a detailed formal definition, but it is rigorous and correct. Also keep in mind that formal definitions of related classes are very similar (e.g. NP and NEXP) and I would prefer to centralize this type of information in articles like NTIME. Dcoetzee 14:06, 25 November 2007 (UTC)

The NTIME article is strictly concerned with nondeterministic TM. The so-called verified-based definition (i.e. the one included in this article) is defined in terms of deterministic TM. Therefore, I don't think it should belong to that article. The definition that you refer to in the "introduction" of the article NP (complexity) namely, \mbox{NP} = \bigcup_{k\in\mathbb{N}} \mbox{NTIME}(n^k), as I pointed out earlier, is not a popular definition. In fact, in many text books (including Sipser's), it is used as a theorem rather than a definition. Also, many theorems in time-complexity theory were proved using the verified-based definition. I don't know what objective you have in mind for this article. I do feel that there is one important subject that has not been discussed adequately in this article. That is, what kind of approaches people might take to tackle the P vs NP problem. Don't you think you need to lay down the fundamental definitions of P and NP (preferably the verified-based one) to get into the subject?

--CBKAtTopsails (talk) 05:38, 26 November 2007 (UTC)

The articles should discuss both definitions, and their equivalence. Many sources use one as the definition and prove the other - that's always the way with equivalent definitions. I'm not saying we shouldn't define them, just that it's redundant to completely formally define the classes both here and in the more detailed articles on the classes themselves. It's important to avoid duplicating content. This is an appropriate place for comparing/contrasting formal definitions of P and NP. Dcoetzee 23:11, 27 November 2007 (UTC)
I just took a look at the additions to the article and I'm okay with them, at least after deleting the fluffy "intuitive explanation" paragraph that didn't add anything and contained lots of errors. The section was a good idea. :-) Dcoetzee 23:55, 27 November 2007 (UTC)

[edit] About the practicality of the polynomial time definition

There have been some talks about the practicality of the polynomial time definition on this page. The section "Is P really practical?" in this article states the following:


"All of the above discussion has assumed that P means "easy" and "not in P" means "hard". While this is a common and reasonably accurate assumption in complexity theory, it is not always true in practice, for several reasons:

  • It ignores constant factors. A problem that takes time 10100n is in P (it is linear time), but is completely impractical. A problem that takes time (1+10-100)n is not in P (it is exponential time), but for n a googol takes time essentially e = 2.718… and so is very practical for all feasible values of n.
  • It ignores the size of the exponents. A problem with time n100 is in P, yet impractical. Problems have been proven to exist in P that require arbitrarily large exponents (see time hierarchy theorem). A problem with time 2n/100 is not in P, yet is practical for n up into the low thousands.".


According to the opinion of at least one expert in this field, all the known polynomial-time algorithms for solving any practical problems in the real world run in time O(n5) or less with reasonable constants. The following is a direct quote from Section 1.5.1 of Chapter 1 in the the textbook, "Computational Complexity: A Modern Approach" written by Sanjeev Arora and Boaz Barak of Princeton University (See http://www.cs.princeton.edu/theory/complexity/modelchap.pdf)

"1.5.1 On the philosophical importance of P

The class P is felt to capture the notion of decision problems with “feasible” decision procedures. Of course, one may argue whether DTIME(n100) really represents “feasible” computation in the real world. However, in practice, whenever we show that a problem is in P, we usually find an n3 or n5 time algorithm (with reasonable constants), and not an n100 algorithm. (It has also happened a few times that the first polynomial-time algorithm for a problem had high complexity, say n20, but soon somebody simplified it to say an n5 algorithm.)"


In all fairness, I think the burden of proof lies on those who made the criticism to come up with a specific example to support their argument. In other words, you need to show that there is a practical problem which requires an n100 time algorithm to solve it and otherwise cannot be solved in any other more efficient way.

If you don't have a specific example, all you talk about is still speculation. It is not about practicality.

--CBKAtTopsails 20:37, 4 December 2007 (UTC)

There are many fixed-parameter tractable algorithms and approximation algorithms which can have arbitrarily large polynomial exponent (based on the parameter values) and which are potentially useful in practice. Dcoetzee 00:07, 8 December 2007 (UTC)

Can you provide more details on this? Keep in mind that even if you can find some other problems that can be solved easily but are not in P, you have not proved any thing wrong about P. The reason is that the intention of P is to say that everthing in P is easy-to-solve but it never guarantees that every easy-to-solve problem in the world must be in P.

If you can find any fixed-parameter tractable algorithms and approximation algorthms which can be used in practice, it's nice but it doesn't prove anything wrong about P.

The fact of the matter is there is no reason why we cannot have more than one class of problems which we can consider easy.

To sum up, you must find a problem that is in P but not easy to solve in order to ridicule P.

--CBKAtTopsails (talk) 04:53, 8 December 2007 (UTC)

See AKS primality test for an n6+ε primality testing algorithm, the best yet. --Vaughan Pratt (talk) 05:00, 13 March 2008 (UTC)
Just as a note, "fixed-parameter tractable" does not mean "polynomial-time for any fixed parameter value k"; it also demands that the exponent is a constant independent of k. --Mellum (talk) 11:40, 8 December 2007 (UTC)
See for example Parameterized Intractability of Distinguishing Substring Selection, which describes a problem that has a PTAS but has no EPTAS, under reasonable assumptions; that is, if you wish to solve it within a small ratio, the exponent of the resulting algorithm will necessarily be large. Dcoetzee 06:28, 12 December 2007 (UTC)

[edit] It is "the" verifier rather than "a" verrifier

"a Turing machine" and "a verifier" must be changed back to "the Turing machine" and "the verifier". The difference is the following:

When you use "a", you are not sure about the existence of such a Turing machine. Condition (ii) actually guarantees such existence.

--CBKAtTopsails (talk) 20:48, 7 December 2007 (UTC)

(1) Why is this an error, or even a logical error? If condition (ii) guarantees the existence, it is superfluous to also indicate the necessary existence grammatically, since it is already implied logically. (2) The definite article "the" implies, in normal English, not only the existence of a Turing machine with these properties, but also that there there is one Turing machine having these properties. This is false; there are many, some of which will not run in polynomial time even if another one does. Thus, the use here of the definite article "the" actually makes the sentence erroneous, unlike the article "a".
Compare for example the first sentence of our article Divisor:
In mathematics, a divisor of an integer n, also called a factor of n, is an integer which evenly divides n without leaving a remainder.
We are sure about the existence of such a divisor; in particular, the number 1 is a divisor of every integer. So is the above a logical error that should be changed? We would then get this:
In mathematics, the divisor of an integer n, also called the factor of n, is the integer which evenly divides n without leaving a remainder.
This is most certainly not an improvement. --Lambiam 22:33, 9 December 2007 (UTC)


I think your argument is a little out of context here.

We need to get everything back on the table in order to straighten this mess.

First, you are given a binary relation R from which you construct LR. This LR is not an arbitrary language but one that specifically ties to certain given information.

Quote: "(1) Why is this an error, or even a logical error? If condition (ii) guarantees the existence, it is superfluous to also indicate the necessary existence grammatically, since it is already implied logically. (2) The definite article "the" implies, in normal English, not only the existence of a Turing machine with these properties, but also that there there is one Turing machine having these properties. This is false; there are many, some of which will not run in polynomial time even if another one does. Thus, the use here of the definite article "the" actually makes the sentence erroneous, unlike the article "a".".

Response: The definition of decidability guarantees the existence of at least one Turing machine that decides LR. Yes, sometimes, you can have more than one TM to decide LR for any given LR but sometimes you don't for another specific given LR. Since formal definitions have to deal with all situations, we cannot talk about those TM's that sometimes exist and sometimes don't. "The" Turing machine refers to the only one that is guaranteed to exist in all situations.

Quote: "In mathematics, a divisor of an integer n, also called a factor of n, is an integer which evenly divides n without leaving a remainder.

We are sure about the existence of such a divisor; in particular, the number 1 is a divisor of every integer. So is the above a logical error that should be changed? We would then get this:
In mathematics, the divisor of an integer n, also called the factor of n, is the integer which evenly divides n without leaving a remainder.
This is most certainly not an improvement."

Response: This example simply is not a good analogy to our issue. The integer n is too arbitrary and too general unlike LR which is specifically tied to certain given information.

How does it sound to you if I say, "In mathematics, the divisor, 1 of the integer 1, also called the factor of 1, is the integer which evenly divides 1 without leaving a remainder."?

And, How does it sound to you if I say, "In mathematics, a divisor, 1 of an integer 1, also called a factor of 1, is an integer which evenly divides 1 without leaving a remainder."?

--CBKAtTopsails (talk) 19:23, 11 December 2007 (UTC)

I am sorry, but I cannot follow your argument. Given a Turing machine for deciding language LR that runs in time O(T(n)), I can construct a different Turing machine deciding the same language LR that runs in time O(T(n)2) – which preserves polynomial-time-hood – or even a TM that takes time O(2T(n)) – no longer polynomial. So if there is one verifier, there are in fact always several (even infinitely many) verifiers. Which of these is then "the" verifier?
As to the definition of divisor with "1" substituted for "n", to call 1 "the factor of 1" is definitely weird, even though it is the only factor (in positive integers). On the other hand, stating that 1 is "a factor of 1" is just fine. So I think this example actually works against your position.  --Lambiam 23:31, 11 December 2007 (UTC)
Generally, the word "the" is thought to imply uniqueness as well as existence, which certainly does not hold in this case; nor does "a" imply non-existence (although you can make vacuously true statements such as "a composite prime is equal to 4"). In any case I favour the original wording. Dcoetzee 06:04, 12 December 2007 (UTC)
Which of the various versions is "the original wording"?  --Lambiam 10:50, 12 December 2007 (UTC)


Quote: ":I am sorry, but I cannot follow your argument. Given a Turing machine for deciding language LR that runs in time O(T(n)), I can construct a different Turing machine deciding the same language LR that runs in time O(T(n)2) – which preserves polynomial-time-hood – or even a TM that takes time O(2T(n)) – no longer polynomial. So if there is one verifier, there are in fact always several (even infinitely many) verifiers. Which of these is then "the" verifier?"

Response: In this case, you would have to take the most efficient one and throw the others into the garbage can. Theoretically, if you give me a Turing machine, M, which is a decider, I can always construct another Turing machine which moves its head back and forth 100 million times before it takes whatever steps M takes to get to the same destination and therefore this Turing machine I construct is still a decider although it takes 100 million more steps to do what M can do. The point is : Does this machine make any sense to you or does it serve any practical purpose in terms of developing an academic theory? Every academic theory is developed with an objective in mind and computational complexity theory is no exception. As I pointed out before, which may sound a little weird to you, the language R = \{(x,y)\in N\times N: 1<y< \sqrt x\; ; \;y\; \text{divides}\; x\} is the considered the same as the language R' = \{(x,y)\in N\times N: 1<y< x\; ; \;y\; \text{divides}\; x\} in the theory of computational complexity.


Quote: "As to the definition of divisor with "1" substituted for "n", to call 1 "the factor of 1" is definitely weird, even though it is the only factor (in positive integers). On the other hand, stating that 1 is "a factor of 1" is just fine. So I think this example actually works against your position."

Response: I don't see it that way. The reason I raised this example is to show that your example is out of context with the issue in which LR is something specific whereas n is very general. If you replace n with something more specific (which doesn't have to be '1'. You can make it something else), "the" sounds much better than "a".

--CBKAtTopsails (talk) 15:39, 12 December 2007 (UTC)


Quote: "Generally, the word "the" is thought to imply uniqueness as well as existence, which certainly does not hold in this case; nor does "a" imply non-existence (although you can make vacuously true statements such as "a composite prime is equal to 4"). In any case I favour the original wording."

Response: "The" integers which are divisible by 2 are also called "the" even integers. "The" doesn't have to mean uniqueness. It means something specific. I never said "a" implies "non-existence". I said "not sure about existence".

--CBKAtTopsails (talk) 15:52, 12 December 2007 (UTC)

Your definition in the article does not mention anywhere that you "have to take the most efficient one and throw the others into the garbage can". Should the reader read this between the lines then? And why is the implicit assumption warranted that the various Turing machines deciding the problem can be ordered according to efficiency? Perhaps they are incomparable.
In giving a precise and formal definition, you cannot discount counterexample to a proposed definition by statements like "does it serve any practical purpose"? The counterexample serves the very practical purpose of exhibiting that the definition in its current form is broken and must be replaced by something else.
I contest your claim that the languages R and R' are generally considered the same; as decision problems they are in the same complexity class, at least in the sense that each is polynomially reducible to the other, which makes them equivalent for the classes that are usually considered of interest, including P and NP. If some people truly consider them the same rather than in some sense equivalent, I find that sloppy rather than weird. In any case, the article is not written only for people who are aware of and familiar with some abuse of language traditional in some field, but also for people who haven't even heard of computational complexity theory before.
How can you say "LR is something specific whereas n is very general"? Both are not fixed and can be instantiated in many ways, and I really don't see why one should be considered more specific than the other.
There are various uses of the word "the" in English, but if you say "If Y has property P(X,Y), then Y is called the frobnitz of X", this implies that given X, the proposition P(X,Y) has at most one solution in Y. If there may be several solutions in Y but you mean the least one, you have to include that in the definition; you must not assume the reader will understand that as being implied. You must then also be able to define what the ordering is.
Lack of uniqueness makes a referent not definite. There is only one set that is equal to {n ∈ Z | even(n)}, which is why we can call this "the even integers".  --Lambiam 18:02, 12 December 2007 (UTC)


Quote: "Your definition in the article does not mention anywhere that you "have to take the most efficient one and throw the others into the garbage can". Should the reader read this between the lines then?"

Response: First of all, this is not my definition. I am just reflecting what is being used in this field. Make no mistakes about this. I have never said things are now perfect in computational complexity theory. Computational complexity theory and even the entire theoretical computer science is a relatively new theory. Despite of all the significant discoveries in this field, some day, some one will have to write up expositions to unify the definitions and simpify or even correct eroneous arguments in proving theorems. This situation is not unique for computational complexity theory. It has happened before in the other branches of mathematics. I personally have had a lot of frustration experience in reading texts in theoretical computer science because I believe that most authors do not have the attitute to treat the subject using a very rigorous mathematical approach. Sometimes, you can find inconsistencies from definition to definition. Sometimes, you can find ambiguous definitions and sometimes, you can even find incomplete or false arguments in proving theorems. Nevertheless, let's get back to the subject. I think there can be a number of ways to fix a problem. In this case, I think it's ridiculous to use "a" to replace "the" simply for the purpose of including those useless TM's you constructed. One approach is not to use the concept of Verifier at all. This Verifier thing is just a matter of name calling and it is not needed to define NP. See for example how Stephen Cook defines NP in the Official Problem Description of P vs NP (page 2). He completely avoids the names of Verifier and Certificate by simply saying LR\inP. Another approach (if you believe that the concept of Verifier is still worth keeping because you can use it for some other purpose) is to define Verifier as the most efficient TM among the class of TM's which decide the same language. This would make good common sense to me because in reality, given a set of algorithms that all can solve the same natural problem, why is it wrong to use just the best one and ignore the remaining. Why would you want to wast time to construct a (T(M))2-time TM while you have a T(M)-time TM that can solve your problem with the same amount of resource given?


Quote: "And why is the implicit assumption warranted that the various Turing machines deciding the problem can be ordered according to efficiency? Perhaps they are incomparable."

Response: Your last statement does hit a valid point. We cannot rule out the possibility that there might be TM's which are not comparable just as in reality, there is the possibility of having two algorithms which cannot be ranked in terms of efficiency. My point is that if you do have a set of TM's (such as the one you constructed) in which one is clearly better than the others, then you should go by the best one in the set. In the event that there exist more than one TM's and they are not comparable, both of them can be called Verifier. However, as I pointed out before, the definition of decidabilty guarantees the existence of one TM in all situations. This the one that we should refer to.


Quote: "In giving a precise and formal definition, you cannot discount counterexample to a proposed definition by statements like "does it serve any practical purpose"? The counterexample serves the very practical purpose of exhibiting that the definition in its current form is broken and must be replaced by something else."

Response: The statement "does it serve any practical purpose?" is not put in there to discount the counterexample. It is put there to point out that the artificial TM's you constructed are totally counter-productive. In fact, I did have an answer to your counter-example. That is, you go by the most efficient one and ignore the others. I just have to ask you this question again, Why would any one want to wast time to construct a (T(M))2-time TM while there is clearly a T(M)-time TM that can solve the same problem with the same amount of resource given?


Quote: "I contest your claim that the languages R and R' are generally considered the same; as decision problems they are in the same complexity class, at least in the sense that each is polynomially reducible to the other, which makes them equivalent for the classes that are usually considered of interest, including P and NP. If some people truly consider them the same rather than in some sense equivalent, I find that sloppy rather than weird. In any case, the article is not written only for people who are aware of and familiar with some abuse of language traditional in some field, but also for people who haven't even heard of computational complexity theory before."

Response: This is not an abuse of language. It is a fact. Two languages are considered the same in terms of solving computational complexity problems if they belong to the same complexity class. Based on the way you make your argument on this issue, I think any reasonable person who understands Computational Complexity Theory has to come to the conclusion that either you don't fully understand the theory (including its objective) or you understand it but are twisting the facts to make an argument just for the purpose of making an argument. If you think you can put up a technical article of P vs NP on Wikipedia (with a B+ mathematical rating) without confining it to the context of Computational Complexity Theory, but only go by what you think or what other people (not familiar with Computational Complexity Theory) would think, I rest my case.


Quote: "Lack of uniqueness makes a referent not definite. There is only one set that is equal to {n ∈ Z | even(n)}, which is why we can call this "the even integers"."

Response: This is strictly a double-standard talk. In your previous example, you want to treat every divisor of n as different divisor and you therefore use "a". Here, you want to group all the integers which are divisible by 2 into one set and treat it as a single entity.


Quote: "How can you say "LR is something specific whereas n is very general"? Both are not fixed and can be instantiated in many ways, and I really don't see why one should be considered more specific than the other.

Response: I don't know how many times I have to go through this. But I'll do it one more time. LR is specific because it is uniquely tied to the binary relation R which I have no control of. In fact, if the existing R changes in another situation to R', I would have no choice but to come up with LR'. R is the first thing that comes into the picture and is not restricted by any other condition, therefore, I can use "a". LR comes out of R and therefore I use "the". In your example, n is the very first thing that comes into the picture. It is totally arbitrary just like R. Therefore, you can use "a". At any subsequent point, if you talk about something that comes out of n, then you can use "the".

--CBKAtTopsails (talk) 02:28, 13 December 2007 (UTC)

Listen. You do not own this article. I have phrased reasonable objections against how the definition is presented. To counter these you give exceedingly long replies that, in my opinion, use unreasonable arguments. Whether a definition is good or not does not depend on whether someone applying the definition to a situation wants to waste time or not. (Why would anyone want to waste time applying a proposed definition of prime number to a counter-example number that has been constructed to be composite? Using this insight, we can really simplify definitions!) The n in the definition of divisor is also bound to a number we may have no control over: someone may decide to apply the definition to the number 123. We get then: a divisor of the integer 123 is an integer that evenly divides 123 without leaving a remainder.
If the lack of rigour of some authors has bothered you, accept my sympathies, but attempting to be rigorous is not a good excuse for presenting incorrect and incomprehensible definitions.  --Lambiam 07:22, 13 December 2007 (UTC)
I've read this over and I believe CBKAtTopsails's argument has no merit, as well. Changing "a" to "the" here makes the description less clear, and it has no basis in the way articles are used in English. So I've reverted that change again. rspeer / ɹəədsɹ 16:43, 13 December 2007 (UTC)


Quote: "Listen. You do not own this article."

Response: "Who owns this article?'


Quote: "I've read this over and I believe CBKAtTopsails's argument has no merit, as well. Changing "a" to "the" here makes the description less clear, and it has no basis in the way articles are used in English. So I've reverted that change again."

Response: Can you introduce yourself and explain what authority you have over this Wikipedia article? Why do you use "a" Verifier but still keep "the" Certificate? Do you understand these two things are tied to each other?

--CBKAtTopsails (talk) 17:33, 13 December 2007 (UTC)

Once something specific has been introduced and posited to exist, it is afterwards correct to refer to it using "the", which makes it clear that it is referring to the specific aforementioned subject rather than any of them. However, when positing it in the first place, you always say "there is" or "exists a" rather than "the". This is just typical usage in mathematical language. Dcoetzee 20:49, 13 December 2007 (UTC)


Quote: "Once something specific has been introduced and posited to exist, it is afterwards correct to refer to it using "the", which makes it clear that it is referring to the specific aforementioned subject rather than any of them. However, when positing it in the first place, you always say "there is" or "exists a" rather than "the". This is just typical usage in mathematical language."

Response: You are making my point. "The" Turing machine in the last sentence refers to the polynomial time Turing machine mentioned in condition (ii). "A" Turing machine would be appropriate only if it is made a stand-alone (or separate) definition.

--CBKAtTopsails (talk) 21:43, 13 December 2007 (UTC)


Quote: "I've read this over and I believe CBKAtTopsails's argument has no merit, as well. Changing "a" to "the" here makes the description less clear, and it has no basis in the way articles are used in English. So I've reverted that change again. "

Response: The owner of this statement has not demonstrated he has any specific authority over this article. Due to a malicious change that has been made to the article, I have no choice but to reverse it.

--CBKAtTopsails (talk) 05:48, 14 December 2007 (UTC)

There is something you don't understand. You are not the owner of this passage. Please take some time to read Ownership of articles, an official Wikipedia policy. Editors do not have to demonstrate any "specific authority" over the articles they edit. By the same token we can ask: who do you think you are? However, we won't. But know that it does not behoove you to label another editor's motivated change as "malicious". Please also take some time to read Assume good faith, an important guideline for editors. Finally, study the following policy diligently: Consensus. Wikipedia works by building consensus. You should not revert against consensus. Clearly, you are in the minority with your predilection for "the": Dcoetzee, rspeer, and the undersigned, all have argued against its use here.  --Lambiam 14:28, 14 December 2007 (UTC)


I have read the articles on Assume good faith and Consensus. Here's what happened.

rspeer suddenly jumped in acting like a big boss to shut down a debate and then moved on to make the change without offering his analysis of the issue. Since he does not have any specific authority over this article, his view does not have to be final and there is no reason why some one else cannot reverse his change. This kind of attitute cannot be considered as good faith.

I posted a polite question to him and waited for a reasonable amount of time for his response. This is good faith. He ignored the question and acted like his view should be final. This again is not a good faith.

With this bad faith of rspeer, I had no choice but came to the conclusion that his change is malicious.

As to the issue of Consensus, I quote this from the article:

"So in summary, wikipedia decision making is not based on formal vote counting ("Wikipedia is not a majoritarian democracy"). This means that polling alone is not considered a means of decision-making, and it is certainly not a binding vote, and you do not need to abide by polls per se. Polling is generally discouraged, except in specialized processes such as AFD."

Until you show some good faith to resolve this issue, I am going to discontinue this dialogue.

However, I will continue to make any edit so long as it is necessary to defend the truth.

--CBKAtTopsails (talk) 16:41, 14 December 2007 (UTC)

"Defend the truth"? Did you really just say that? Are we still arguing about one silly little word in a mathematical definition? Going back to the original entry in this section:
When you use "a", you are not sure about the existence of such a Turing machine.
This is completely contrary to mathematical convention. If you hear "every complete graph has a Hamiltonian cycle", you interpret that as a tentative statement? that the writer is not sure if there really is a Hamiltonian cycle? If so, I'm afraid you would be alone in that interpretation.
Your changes do the definitions imply many confusing things: First, that there is a unique verifier for each language (or that one verifier is somehow the "chosen one"). Furthermore, since the verifier is unique (or there is one chosen one), there is a unique (or chosen) encoding for certificates. This is very misleading. So which encoding is the "best" one for SAT, one which lists the truth assignments to variables in order x_1, x_2, ... or one which lists them in order x_n, x_{n-1}, ... ? The corresponding verifiers will have the same running time but induce different witness relations L_R. So if you think the definition only makes sense with "the", then which of these verifiers is "the" verifier for SAT?
Also, since you seem to really enjoy appealing to "technical" issues, I should mention that when dealing with Turing machines, there is no unique "most efficient verifier", thanks to the Linear speedup theorem. So please don't tell us that the definition for NP involves only considering "the most efficient verifier".
In summary, I completely oppose your proposed changes. So far, several people have weighed in here, and you seem to be the only person supporting your proposed changes. The consensus is clearly against it. Blokhead (talk) 21:27, 14 December 2007 (UTC)
After a review, I'm starting to feel more ambivalent. Taking a look at the sentence in context:
(ii) The language LR=\{ x\# y:(x,y)\in R\} over \Sigma\cup\{\#\} is decidable by a polynomial-time Turing machine.
The/A Turing machine that decides LR is called the/a verifier for L, and y is called the certificate of membership of x in L.
There are two interpretations here that are both valid:
  1. "A Turing machine that decides LR is called a verifier for L". This is a freestanding and valid definition.
  2. "The Turing machine that decides LR is called the verifier for L". Here "The Turing machine" and "the verifier" refers to the aforementioned polynomial-time Turing machine in (ii). The language here mirrors usage like this: "Suppose we have a graph. Suppose the graph is complete." This is consistent with the second part of the sentence, which also uses "the" to refer to variables defined above.
In short, I think either way is just fine, although they are interpreted differently; it's a matter of style. The only problem here is the User:CBKAtTopsails's inaccurate argument and edit warring, and I hope we don't have to fight anymore about this. Dcoetzee 22:41, 14 December 2007 (UTC)
I suppose I can see the case for saying "The Turing machine" in the above -- it could refer to the previously established TM for deciding L_R. I would concede that as a stylistic choice (whether the TM in the 2nd line refers to the one mentioned in the first line, or if it's a separate thought). But there are many verifiers for a language (even fixing a witness relation L_R) and potentially many certificates for each element of L (even fixing a verifier). Defining something as "the foo of x" implies that there is a unique foo given x. So "the verifier of L" and "the certificate of x in L" are still misleading. So given that these should use "a" instead of "the", I think it reads much better when it's "A TM" instead of "The TM", so that the second line starts a new thought (i.e., any TM that decides L_R is a verifier). Blokhead (talk) 00:48, 15 December 2007 (UTC)
Okay, would CBKAtTopsail be happy with a compromise in which it is "The Turing machine" but "a verifier"? Dcoetzee 01:12, 15 December 2007 (UTC)
That is truly half-baked. Compare the following:
A composite number is a positive integer that has a positive divisor other than 1 or itself. The divisor of a composite number n is called a factor of n.
 --Lambiam 02:27, 15 December 2007 (UTC)

Quote "This is completely contrary to mathematical convention. If you hear "every complete graph has a Hamiltonian cycle", you interpret that as a tentative statement? that the writer is not sure if there really is a Hamiltonian cycle? If so, I'm afraid you would be alone in that interpretation."

Response "Your example is taken out of context. In the original discussion, what was being argued about was the language LR. You should know that a language is not a complete graph and it is not always decidable. When it is not decidable, the Turing machine does not exist. Condition (ii) guarantees the existence of such Turing machine, therefore, we should talk about the one that is guaranteed to exist.

Quote: "Your changes do the definitions imply many confusing things: First, that there is a unique verifier for each language (or that one verifier is somehow the "chosen one"). "

Response: This is your implication and it is wrong. Go back to read again what I said. I argued against "the" meaning uniqueness

Quote: "Furthermore, since the verifier is unique (or there is one chosen one), there is a unique (or chosen) encoding for certificates. This is very misleading."

Response: This is an erroneous statement. For any given string x, a certificate of x (which may not exist) is just another string y such that (x, y)\in R where R is a given checking relation. It does not depend on any verifier.

Quote: "The corresponding verifiers will have the same running time but induce different witness relations L_R."

Response: This is again another erroneous statement. LR is constructed from the given checking relation R. It is not induced by any verifier.

Quote: "Also, since you seem to really enjoy appealing to "technical" issues, I should mention that when dealing with Turing machines, there is no unique "most efficient verifier", thanks to the Linear speedup theorem. So please don't tell us that the definition for NP involves only considering "the most efficient verifier"."

Response: This is a distortion of what I said. Some one created an artificial set of Turing machines which were clearly inferior to the one he was given and asked me which one I thought should be the verifier. I said he should go by the one that is the best in that set. The definition does not tell you to look for the most efficient verifier.

--CBKAtTopsails (talk) 17:00, 15 December 2007 (UTC)

"The" can mean uniqueness or it can mean referring to a previously established object. "The turing machine" in context of your original definition is fine. I can accept that it might refer to the previously established TM. You have not addressed the fact that defining a term as "the foo of x" implies that there is a unique foo. Do you not agree?
As to the other stuff you quote, you must have misunderstood my example. The relation R is not unique given an NP language. Different Rs induce different witness sets (for each x, y | R(x,y) = 1) and thus different verifiers for these sets. I gave an example of two different relations R for SAT, in which the certificates (satisfying assignments) were respectively encoded in opposite order. You insist on defining something as "the verifier" for the language, implying that there is a unique or somehow distinguished verifier. Blokhead (talk) 23:41, 15 December 2007 (UTC)


Quote: "As to the other stuff you quote, you must have misunderstood my example. The relation R is not unique given an NP language. Different Rs induce different witness sets (for each x, y | R(x,y) = 1) and thus different verifiers for these sets. I gave an example of two different relations R for SAT, in which the certificates (satisfying assignments) were respectively encoded in opposite order. You insist on defining something as "the verifier" for the language, implying that there is a unique or somehow distinguished verifier."

Response: I understand your example but they are erroneous in the context of what we are debating. Let me explain why.

First, let me first give a simple illustrative example to lay down a platform for resolving this issue.

In pure mathematics, a fundamental technique used for proving the universal truth of a statement concerning a set of given objects is first to make a general representation of these objects. For example, if I want to prove that a certain statement is true for all even integers, what I would do as the first step is to let x represent an even interger. Once I have done this, I can treat x as just one even integer without having to worry about there are in fact an infinite number of even integers in the world. Then I go through a series of logical arguments to come to the conclusion that the statement is true for x (That is, T(x) is true). Now, if in every step of my arguemnts, I did not make any reference to any particular even number such as 2, 4, 2002, etc... and if the only information I used is the property that x is an even integer, then at the end of my argument, I can safely conclude that the statement is also true for all integers because everything I've said about x is also true for any even integer. Since I am fixing my sight on one interger x, with the exception of the first introductory statement in which I used "a", everywhere in my proof process, I can say "the" integer x or "the" "anything" which came out of x.

Now, let's get back to our NP definition. Here's what you have to go through in your mental process:

Step 1. You are given a language L.

Step 2. You look for a binary relation. Let's say you found one and we call it R. You might have found more than one but it doesn't matter because R is a general representation of all such relations.

Step 3. You construct LR. Again, you use a general representation of the language that came out of R. You can use "the" because you fix your sight at one such R and hence one such LR.

Step 4. Here's a little more tricky. You have to find a TM that decides LR. Condition (ii) guarantees that that LR is decidable which means that you can find at least one such TM's in all situations. The second might exist but it is not guaranteed. Since you have to maintain your generality, you can only focus on the only one that is guaranteed to exist in all situations. "The" TM that you found is unique in that sense.

Step 5. You call this TM that you found "the" verifier.

Step 6. You have a formal definition for NP.

Throughout the entire process, you fix your sight on one L, one R, one LR, one TM but you did it without loss of generality. Therefore, you can safely conclude that the definition you establish can be used on all situations.

--CBKAtTopsails (talk) 18:57, 16 December 2007 (UTC)

With this kind of argument the following is correct English:
A positive integer n is composite if there exists a positive integer d, other than 1 or n itself, which evenly divides n without leaving a remainder. The number d is called the divisor of n.
Here is the argument. Step 1. You are given a number n. Step 2. You look for a decomposition into two or more factors greater than 1. Let's say you found one and we call this multiset of factors F. Step 3. You take the smallest factor of F. Step 4. You take a deep breath. Step 5. You call this factor that you found "the" divisor. Step 6. You have a formal definition for composite.
But, curiously enough, it is nevertheless not good English in this situation. It would imply that 2 is the unique divisor of 6. You can counter this with the argument why should one want to pick a larger factor like 3 when 2 will do to establish that 6 is composite. Well, why not.  --Lambiam 21:58, 16 December 2007 (UTC)

Quote: "It would imply that 2 is the unique divisor of 6."

Response: Are you making a joke here?

I got a better one for you.

There is an argument that given a TM that runs in time T(n), one can always contruct another TM that runs in time (T(n))2 and even one in time 2T(n). On the basis of this theory, one can contruct the following algorithms and argue that they be treated legitimate and not be thrown into the garbage can:

Algorithm A

Step 1. You are given a number n.

Step 2. You look for a decomposition into two or more factors greater than 1. Let's say you found one and we call this multiset of factors F.

Step 3. You take the smallest factor of F.

Step 4. For n = 1 to 20

Step 4b. You take a deep breath.

Step 4c. Next n

Step 5. You call this factor that you found "the" divisor.

Step 6. You have a formal definition for composite.


Algorithm B

Step 1. You are given a number n.

Step 2. You look for a decomposition into two or more factors greater than 1. Let's say you found one and we call this multiset of factors F.

Step 3. You take the smallest factor of F.

Step 4. For n = 1 to 220

Step 4b. You take a deep breath.

Step 4c. Next n

Step 5. You call this factor that you found "the" divisor.

Step 6. You have a formal definition for composite.


Of course, a counter argument to this is after taking 220 deep breaths, it is highly questionable whether one can still move on to step 5.

--CBKAtTopsails (talk) 00:22, 18 December 2007 (UTC)

[edit] Suggestions for alternatives to the definition

In my opinion, the whole definition should be rewritten. It looks messy and is hard to understand. The concept of a verifier is logically independent of any notion of complexity. Shouldn't we first define the concept of verifier, and then next define NP-ness as the existence of a fast verifier? Sketch:

  1. We have a language (decision problem) L, and want to define what it means that some TM is a verifier for L.
  2. For that, just one ingredient is needed:
    • A language LR, consisting of pairs x#y, with the following property:
      (x ∈ L) ⇔ (∃ y : x#y ∈ LR)
  3. A TM is then called a verifier for L if there exists an LR as above, and TM decides that language. Given an x ∈ L, a y such that x#y ∈ LR is called a certificate.
  4. For NP-ness we need a second ingredient:
    • A function c such that
      (x ∈ L) ⇒ (x#c(x) ∈ LR).
  5. L is then in NP if there is a verifier for it, based on some LR as before but now with a function c as above, such that if we only look at the computations on inputs of the form x#c(x), the time is polynomially bounded in the size of x.

I wonder, why is this not actually dealt with in the article NP (complexity)? That seems a better spot for a precise definition; then we can refer to the definition there and give a perhaps less precise but more intuitive description here.  --Lambiam 02:27, 15 December 2007 (UTC)


This sketch looks pretty messy to me so far and it has taken 18 lines already. Why don't you write it up formally to see how it looks and let's have a civilized debate on this. I think you also need a formal definition for verifier before you get into the definition of NP according to your sketch (although I think it is unnecessary).

--CBKAtTopsails (talk) 04:33, 15 December 2007 (UTC)

That sketch seems fine except where on earth does this c(x) function come from? Again, the impression is that there is a unique certificate for each x in L (that would be UP) or that one particular certificate is special. Furthermore, one could easily read that and assume that c(x) is computable in polynomial time, which could not be true for languages in NP \ P. There would have to be a big disclaimer about that. Since we're suggesting things, how about:

A language L is in NP if it is decided by a non-determinstic Turing machine that runs in polynomial time.
Equivalently, a language L is in NP if it can be expressed in the following form: L = \{ x | (\exists y, |y| < p(|x|))\ R(x,y) = 1\}, where p is a polynomial and R is a binary relation that is decidable in polynomial time. A polynomial-time machine that decides R is called a verifier for L (note that there may be many very different verifiers, corresponding to different relations R). Given x \in L, a value y such that R(x,y)=1 is called a certificate for x (or a witness for x).

Plus a more English-y interpretation of what this actually means. Perhaps:

Essentially, if x \in L, then x has a short (at most polynomially longer than x) "proof of membership" in L, which can be verified in polynomial time. These "proofs" are sound in the sense that when  x \not\in L, there is no proof of membership that could convince a verifier. Note that the definition does not say anything about the complexity of finding a valid proof given x, only of verifying a given proof.

In my opinion, this is as far as this article needs to go in terms of formality. Right now the article re-defines polynomial time, running time of a TM, language of a TM. Going all the way down to these first principles is really unnecessary for this article. The goal should be to let the reader understand the ideas, not drown them in notation. If they care enough to see the dirty details, we should lead them to more suitable complexity theory articles. Blokhead (talk) 15:16, 15 December 2007 (UTC)

To answer the question where this c(x) function is coming from: from the same neighbourhood in Platonic heaven as the language LR. A language L is in NP iff there exist a language LR and a function c such that ... . A malicious person might try to demonstrate that the verifier is slow by choosing hard-to-verify (or, equally disastrous, hard-to-fail) certificates from among possibly infinitely many (think of proofs for theorems). But as long as an angel can pick easily verifiable certificates (which is embodied in function c), all is fine. Instead of using a function c, you could define Tc(x) as infy∈C(x) T(x#y), where T(x#y) is the time taken by the verifier on input x#y, and C(x) is the set of certificates for x if it has any, and otherwise the unconstrained universe of words. Thay does not look simpler to me.  --Lambiam 20:05, 15 December 2007 (UTC)

Quote: "Equivalently, a language L is in NP if it can be expressed in the following form: L = \{ x | (\exists y, |y| < p(|x|))\ R(x,y) = 1\}, where p is a polynomial and R is a binary relation that is decidable in polynomial time. A polynomial-time machine that decides R is called a verifier for L (note that there may be many very different verifiers, corresponding to different relations R). Given x \in L, a value y such that R(x,y)=1 is called a certificate for x (or a witness for x)."

Response: I have some serious problems about this proposal. First, the statement R(x, y) = 1 is erroneous. A relation is a set of ordered pairs and therefore cannot be equal to 1. Second, a relation is not a language and therefore cannot be decided by a Turing machine. Third, there is nothing said about where R is coming from.

In summary, there is not enough rigor and details in this proposal to be called formal definition. It might be OK if it is called something else in the article. You cannot assume that there is no one who would come to search Wikipedia to look for some real math stuff on this subject. Putting out a sloppy formal definition is just irresponsible.

--CBKAtTopsails (talk) 17:31, 15 December 2007 (UTC)

Someone looking for "real math stuff" can go to a more specific article, like NP (complexity). If you think my use of relations is sloppy, replace it in the definition with M(x,y)=1 for a poly-time machine M. Does that make anything better?
I still strongly believe that there is no need to waste space in this article re-defining tedious things like:
  • running time of a TM
  • language accepted by a TM
  • encoding pairs/tuples of strings into strings
Might as well redefine a TM from first principles. As to "nothing said about where R is coming from", I don't understand this. A language is in NP if there exists such a suitable R, that's it. Where it "comes from" is irrelevant. Blokhead (talk) 23:56, 15 December 2007 (UTC)


Quote: "As to "nothing said about where R is coming from", I don't understand this. A language is in NP if there exists such a suitable R, that's it. Where it "comes from" is irrelevant."

Response: This is exactly the point and is a very important point in mathematics. In theorem proving in mathematics, one has to distinguish between "for all" and "for some" all the time. In your original proposal, you did not make this point clear enough. Keep in mind that R depends on L. Given another L', there might exist another R' which is different.

Quote: "Someone looking for "real math stuff" can go to a more specific article, like NP (complexity)."

Response: If you want to suggest to move the entire section on formal definition to another place, I don't have a problem with that but I suggest you write up your idea in formal language and let's have a discussion on it.

--CBKAtTopsails (talk) 16:29, 16 December 2007 (UTC)

In standard mathematical discourse you can say something like:
An integer n is said to be composite if it can be written in the form n = pq, where p and q are integers such that |p| > 1 and |q| > 1.
Every working mathematician will understand that this means: "if there exist integers p and q such that n = pq, where |p| > 1 and |q| > 1". You don't have to say anything about where p and q are coming from, and there is no risk of confusion with a universally quantified statement such as "n = pq for all integers p and q such that |p| > 1 and |q| > 1". For examples of definitions in this style, see Compact operator and Newman-Shanks-Williams prime.  --Lambiam 21:30, 16 December 2007 (UTC)

You are contradicting yourself. In your first statement, you have already said a lot about p and q. First, you said, "n = pq". Then, you said, "|p| > 1 and |q| > 1". These things basically tell people where they are coming from. So, please don't tell me that you don't have to say any thing about where p and q are coming from. In the original proposal of Blokhead, where did he say anything at all to tell R comes out of L.

--CBKAtTopsails (talk) 23:44, 17 December 2007 (UTC)


On the "the" vs. "a" thing, it seems to be a draw. How about saving two characters and going for "a"? --Vaughan Pratt (talk) 04:15, 13 March 2008 (UTC)

Only slightly more seriously, I would vote in favor of "a" independently of the length as it reads more naturally to me. As my credentials I cite (a) my willingness to expose my real name to ridicule in the unfortunate event that I'm wrong at the top of my voice, (b) my 1969 master's thesis was in computational linguistics, translating English into logic, so I know at least a little about English grammar, and (c) AFAIK my paper showing that the primes are in NP was the first publication to use the word "certificate" in connection with NP, so I have a certain vested interest in seeing to it that the language surrounding "certificate" reads naturally in English. Is CBKAtTopsails a native English speaker, and if so did he come top of his English class in high school? If not I suggest he listen to how others think the sentence sounds each way. --Vaughan Pratt (talk) 04:33, 13 March 2008 (UTC)

[edit] Alternate Preview for the Article

The relationship between the complexity classes P and NP is an unsolved question in theoretical computer science. It is considered to be the most important problem in the field - the Clay Mathematics Institute has offered a $1 million US prize for the first correct proof.

To understand the complexity and the importance of the problem, consider the problem of breaking a four digit number lock. Common sense dictates that without knowing the key, one must try at least 5,000 numbers before one can succeed in opening the lock, with probability at least 0.5. In fact, it can be rigorously proven that if the lock's secret key is picked at random, then one must try 5,000 combinations to succeed with probability at least 0.5. In general, to open an n-digit lock one needs to try exponential in n number of combinations to open it with probability 0.5. Although, in this case the complexity of the problem of unlocking is clear and provable, consider the case where the lock is transparent, and one can see all the inner workings of the lock. Now, all bets are off!

There are thousands of problems in Computer Science where the complexity of solving the problem is not that clear cut. In fact, they all seem to exhibit this exponential search quality. On the other hand, no one has ever managed to prove, for any of these problems, that it must require an exponential search. There is however, one quality common to all these problems, which is different from the breaking of the opaque number lock problem. There is a description of an "easy" method or program available, which can verify if a given potential solution solves the problem. By "easy" we mean the program runs in polynomial time. Such problems are said to be in NP. Note that in the opaque number lock problem, one does not have access to a program or method which verifies if a given key works -- we only have access to the lock as a black box.

Thus, the quintessential question of Computer Science is whether all problems in NP can be solved in polynomial time, or not. In short, NP ?= P.

As a more concrete example, consider the Hamiltonian path problem. Given a graph with n vertices, and m edges, the Hamiltonian path problem is to determine if there is a path in the graph which visits each vertex exactly once. The problem is in NP, because one can give an easy method to check if a given potential path is indeed Hamiltonian -- just check that it goes to each node exactly once. As for the complexity of the problem, intuitively it seems that one must try all paths (an exponential number) in the graph to determine if there is a Hamiltonian path. However, there is no proof known of such a claim. On the other hand, one might use the fact that this problem is in NP, and hence has additional properties, namely that there is a description of a polynomial time verification algorithm available, to solve it in polynomial time. There is no such method known either.

One major advance that has been made is that many of these problems have been shown complete for the complexity class NP. In ther words, if any of these complete problems can be shown to be in P, then all of NP is in P, and similarly if one can show any complete problem in NP to be not in P, then none of the complete problems can be in P. —Preceding unsigned comment added by Ustadny (talkcontribs) 17:44, 13 December 2007 (UTC)

I sort of like the lock analogy, and if presented more concisely it might help reinforce the concepts (although I would refrain from referring to randomness here). It's an interesting and relevant point that if the problem is treated as a "black box" that there is a (trivial) lower bound. I believe a "real" example is also important however, and subset-sum is approachable for anyone with a basic math background. I'm not fond of the Hamiltonian path problem as an introduction, because many people lack a graph theory background. Dcoetzee 20:46, 13 December 2007 (UTC)
It is much easier to consider the worst case for the lock problem: Given any algorithm that is guaranteed to eventually open each lock, there is a lock for which the algorithm will go through all combinations before the lock opens. What I don't like, though, is that the example crucially depends on incomplete information, while for our decision problems everything is out in the open.
The subset-sum problem can be presented in a (for non-mathematicians) more attractive and intuitive way by asking whether a collection of items (say diamonds) having known positive integral values can be evenly split into two collections, each having exactly the same total value. For example, if the items' values are [30, 36, 38, 40, 61, 64, 65] for a total value of 334, this can be split into [30, 36, 40, 61] and [38, 64, 65], each adding up to 167. I think that already for this simple problem it will not be immediately obvious how to make an even split, without trial and error.
It may actually help the reader if we give several examples of problems known to be in NP but not known to be in P, and I'd support having a separate Examples section with two or three such examples. While the example of the Hamiltonian path problem is more complicated than subset-sum, it can be illustrated with an image of a "road map" with cities to be visited each once, making it easier to understand. As the Euclidean Hamiltonian path problem is also known to be NP-complete (Christos H. Papadimitriou. "The Euclidean travelling salesman problem is NP-complete". TCS 4:237–244, 1977), we can use a planar graph without introducing a gross distortion.  --Lambiam 22:58, 13 December 2007 (UTC)


I agree with the idea of giving more examples but disagree with the presumption that every one who comes to search Wiki does not have a math ground to understand or not interested in looking at the subject from a math perspective. Does Wiki have any statistics to support this notion?

--CBKAtTopsails (talk) 18:29, 15 December 2007 (UTC)

I notice that the section "Is P really practical" contains some of the real important points. Unfortunately, those points are written rather tangentially, when they are of extreme significance. For example, if it is proven that NP = RP, the P=NP debate is over... no one is going to harp over the issue that it really hasn't proven P=NP. Similarly, proving P/=NP is not enough, and one must prove NP /= AvgP, for the result to be of any significance...this clearly brings in randomness in choosing the instances. Of course, I have cryptography in mind, which is the real motive for proving P /=NP.

So, what I am suggesting is that in this extremely important page "P vs. NP", there be less formalism, and more stress on the issues. The formal points should be relegated to the NP page itself.

So here is what I consider is important (in decreasing order of importance): (i) The notion of difficulty or ease of solving computer science problems...some idea of polynomial time vs super-polynomial or exponential time (ii) Are there examples in Computer Science where this can be proven (for both ease and difficulty) (iii) Pointing out that there are gazillion problems in Computer Science where it is not proven either way..so the field is in a limbo...a bad state of affairs since we do not understand the fundamentally important problem... it has consequences for Robotics, vision, for how our own brain works, many Wall Street optimization problems etc etc. A subsection on some examples.(iv) Do these problems share something in common? YES! (v) Non-determinism (vi) Completeness of many problems (vii) More detail on the notion of "difficultly" or complexity of problems ...(a) probabilistic polynomial time (b) Average Polynomial time. (viii) Recent Advances .. viz. PCP. i.e. hardness of approximating (ix) Incompleteness of certain proof techniques... relativizing proof techniques are not going to work for proving either way... rules out most common easy proof techniques which are commonly employed for proving such results. For proving NP /= P even more techqniues, like Natural Proofs (cite Razborov-Rudich), are not going to work. Ustad NY (talk) 15:32, 16 December 2007 (UTC)

While all this may be true, we are writing an encyclopedia article here, not an article in a science magazine. It is not our task to point out such things as that "we" do not understand the fundamentally important problem. What we should do is report on the positions taken by others. If you have a reliable and academically reputable source venturing such opinions, we may consider adding them in a form like "Karp himself, however, took the position that work on optimizing compilers was of much greater practical importance than attempting to solve the outstanding problems in computational complexity theory (R. M. Karp, How fast is, like, really fast, man? Turing Award lecture, 1985)." Otherwise we venture into the twilight zone of unpublished analysis or synthesis, from which few have returned. To the extent that we can give an encyclopedic review of the field like that, a better place may be in Computational complexity theory.  --Lambiam 22:35, 16 December 2007 (UTC)

Hey I solved this problem and P=NP. There are too many patterns and underlying relationships that it could take exponential time (looking at every possible combination). Owell maybe one day this post here on wikipedia will be history :) I will explain it in detail once I get it notarized etc. —Preceding unsigned comment added by 74.33.52.72 (talk) 21:50, 18 December 2007 (UTC)

[edit] Request for comments: "the" verifier or "a" verifier?

{{RFCsci | section=Request for comments: "the" verifier or "a" verifier? !! reason=Should the definition of NP-complete use "the" verifier or "a" verifier? !! time=08:38, 22 December 2007 (UTC)}}

The discussion on the following burning question has stalled: should the verifier-based definition of NP-complete in the section Formal definitions for P and NP use the definite article (the verifier) or the indefinite article (a verifier)? For that discussion, see the section #It is "the" verifier rather than "a" verrifier on this talk page.

Please only respond if you understand the notion of non-deterministic polynomial time.


Use the definite article "the", as in "is called the verifier":


Use the indefinite article "a", as in "is called a verifier":

  • Yes, use "a" here.  --Lambiam 08:38, 22 December 2007 (UTC)
  • Yes. rspeer / ɹəədsɹ 09:59, 22 December 2007 (UTC)
  • The indefinite article should be used. Given one Turing machine that decides membership in a language in polynomial time, one can construct a distinct Turing machine that decides the same language in polynomial time (for example, by replacing the halting state with a "do nothing for one step, then halt" machine). The use of the definite article here would suggest, incorrectly, that "the verifier" must be unique. The existence of an efficient verifier is what matters. Michael Slone (talk) 10:14, 22 December 2007 (UTC)
  • As Michael Slone points out, there is no uniqueness here; the article should directly point this out, which should help resolve any potential confusion. — Carl (CBM · talk) 14:14, 22 December 2007 (UTC)
  • Use "a". That is the standard and the one that makes sense. The definition of NP is always phrased to say that there be at least one TM satisfying the conditions, and it is not at all unusual that two drastically different algorithms of the same complexity demonstrate a problem is in NP. It's hard to believe it got this far. I commend Lambiam for his patience. --C S (talk) 14:21, 22 December 2007 (UTC)
    • I concur completely with CS's assessment. The indefinite "a" is clearly more appropriate. --Cheeser1 (talk) 06:34, 25 December 2007 (UTC)
  • I agree with all the foregoing; if the language needs more precision, it should say, not Turing Machine, but Turing Machine program. Septentrionalis PMAnderson 06:43, 25 December 2007 (UTC)

Comment/further discussion:

  • As explained in my comment above, I think both wordings are valid, although they are interpreted differently. One is a free-standing valid statement in its own right ("a"), while the other refers to a previously-defined object ("the"). As such I have no stake in the outcome of this discussion. Dcoetzee 07:11, 25 December 2007 (UTC)
  • The definition looks fine as it is (as of 03:32, 30 December 2007 (UTC)); except that "y is called the certificate of membership of x in L" should also use the indefinite article, since there may be multiple certificates for a given verifier and a given x. Also, the current definition does not clearly indicate what it has to do with verifiers; this should be fixed by putting the definition of verifier first. Ben Standeven (talk)
    Fixed.  --Lambiam 18:19, 30 December 2007 (UTC)
  • I agree: "a verifier". The problem is in NP if any verifier V is polynomial-time. We can always find an alternative, less efficient verifier (even if it's "do V then delay for en seconds") but it doesn't stop the problem being in NP. Certes (talk) 00:02, 13 January 2008 (UTC)

[edit] DUH

There's a section that refers to a trivial problem DUH but does not provide any description of it. Can someone look into it? Thanks. --Uchchwhash (talk) 14:23, 14 February 2008 (UTC)

It is actually described: given a description of a Turing machine M guaranteed to halt in polynomial time, does there exists a polynomial-size input that M will accept? However, I find this very unclear and I don't see the relevance of mentioning this contrived problem; a reference to the list of NP-complete problems, most of which are much more natural, would seem to serve the purpose better of conveying the fact of the existence of NP-complete problems.  --Lambiam 04:57, 15 February 2008 (UTC)
The importance of this problem from a pedogogical perspective is that it's straightforward to prove that the problem is in NP and NP-hard. I revised to briefen this paragraph, and remove the name DUH, which I think is too informal. Dcoetzee 08:02, 15 February 2008 (UTC)

[edit] Moving "Is P really practical"

I had started a discussion above, titled "Suggested addition to "Is P really practical"". I'd like to continue the discussion that discussion here, because it seems like people pay more attention to sections at the bottom of the talk page than sections in the middle of it. First, let me repost my last post from above:

I apologize, Dcoetzee. I thought I had addressed your first objection. You said my original proposal was much too long, and I thought you were just saying that it should be about the same length as the first point and the second point in the list. The version of my paragraph that I added to the article wasn't the same as my original proposal. It did have about the same number of words as the first and second points. Are you saying that your version is superior to my version, in and of itself? Or are you just saying that, in general, a shorter version is better? I can understand that, because, as you said, this discussion is somewhat off-topic from the P=NP problem, so I guess you might be saying that you just want to have less off-topic text in this article. I saw that CRGreatHouse added a link to "Cobham's thesis." What do you think about moving this whole section out of the P=NP article and into the Cobham's thesis article? And I think that if it gets moved, then, after we move it, it should have a discussion of hardware, including some examples with numbers. I think this is the reason for Nicolaennio's confusion in his posts on this talk page. See my response to him below. Also, I wasn't citing the talk page. I was just trying to put a link similar to the links that already exist in wikipedia that say things like "where?--see talk page". For example, see the first paragraph of this article: Tying (commerce). Sorry for the confusion. --Navigatr85 20:56, 2 December 2007 (UTC)

So, Dcoetzee and everyone, please let me know what you think about moving that section out of this article and into the "Cobham's Thesis" article. Navigatr85 23:01, 6 March 2008 (UTC)

I think moving this out is a great idea and Cobham's thesis is a great place for it to go. It can still be briefly summarized here and linked using {{main}}, which I think is a good idea. I have a lot more to write about Cobham's thesis too that would go quite astray of this topic. :-) Regarding your original contributions, I don't honestly remember them, and I will evaluate them when you add them in that article. Dcoetzee 19:53, 10 March 2008 (UTC)

[edit] Wording in "Polynomial-time algorithms" section

On January 12, Brianjd changed the wording in the second half of this section from

"It is not known whether an algorithm exists that can provably do this in polynomial time."

to

"it is unknown whether an algorithm exists that can do this in polynomial time."

Removing the word 'provably' makes the sentence more readable, but I'm worried it might also make it incorrect. Can someone prove that the algorithm works with the current wording, or was this just an inadvertent mistake? Nate Biggs (talk) 15:00, 10 March 2008 (UTC)

It should not affect correctness. If it had said "No known algorithm can do this in polynomial time", that would be subtlely incorrect, because there are known algorithms that are polynomial-time provided that P=NP, and we don't know whether or not this is the case yet (in this sense, to say "No known algorithm can do this in polynomial time" is equivalent to saying "P ≠ NP"). The current wording is okay though, because it just expresses that we don't have a proof one way or the other about the existence of such an algorithm. Dcoetzee 19:59, 10 March 2008 (UTC)
Sorry, I think I should have spelled out my issue more clearly in the original comment. If an algorithm X *provably* solves subset-sum in polytime, then there is a polytime algorithm Y that outputs:
- A proof that X solves subset-sum
- A proof that X returns whatever answer it returns when run on input S
Overall, this produces a complete proof that S contains (or does not contain) a subset that sums to 0, as required by the given algorithm. If you omit the word 'provably', this falls apart, since Y can't start with a proof that X solves subset-sum. So is there some other argument why this algorithm is correct that relies only on the fact that X solves subset-sum, not that it *provably* does so? Nate Biggs (talk) 22:20, 10 March 2008 (UTC)
No, it's clear to me that the word "provably" in the original text referred to "do this" - in other words, has someone demonstrated an algorithm, accompanied by a proof that the algorithm is correct and runs in polynomial time? The answer to that is no, and the word "provably" doesn't have anything to do with emitting a proof string or certificate for verification. Dcoetzee 00:04, 11 March 2008 (UTC)
Sorry, I think I'm still not being clear. The question is not whether the sentence "it is not known whether an algorithm exists that can do this in polynomial time" is correct -- clearly it is. The question is whether the entire paragraph is correct. In other words, the article currently asserts that if an algorithm exists that solves subset-sum in polynomial time, then the specific algorithm it supplies also solves subset-sum in polynomial time. Is that true? I can't prove it to myself. I'm only able to prove a slightly weaker result: if an algorithm exists that *provably* solves subset-sum in polynomial time, then the supplied algorithm also does. Nate Biggs (talk) 00:24, 11 March 2008 (UTC)
The paragraph is correct, perhaps more easily seen for the subset sum example. You run all programs and see if any outputs a witness for the instance (finding a witness can be done in poly time if and only if the decision problem is poly time). We know how to verify subset-sum witbesses. If some machine outputs subset-sum witnesses in poly time, then clearly so will the one on the page. Of course, we don't know how big these polynomial are, so we can't a priori code the algorithm to "time out" after so long and return NO. There is hope though, since if P=NP, then co-subset-sum is in NP and we can also look for witnesses for this language. However, we don't know what such a verifier (for co-subset-sum) would look like (proving one correct would imply NP=coNP). So we can kinda cheat and ask for a logical proof instead.. For example, the "proof" could contain a verification algorithm + proof that it is a correct verifier that runs in poly time + a witness. Cheating? Kinda, but this is all very theoretical anyway.
I'm not sure that this is an interesting/noteable section for the article. It is overly theoretical, has no practical use, is confusing to everyone the first time they see it, and in my experience has no interesting theoretical use apart from being an annoying problem assigned in algorithms classes across the world just to frustrate the students... ;) Blokhead (talk) 12:38, 11 March 2008 (UTC)
Thanks for the explanation, Blokhead. There's one part of your argument that still doesn't quite make sense to me. You say that "co-subset-sum is in NP" implies "there exists an algorithm that can *provably* verify co-subset-sum in polytime." It's not immediately clear to me why this is true. Nate Biggs (talk) 13:39, 11 March 2008 (UTC)

This seems to have fallen on the floor, and there's still a gap in Blokhead's argument above. Can anybody give a complete proof that the algorithm is correct? If not, I'll add the word 'provably' back in the next couple days. Nate Biggs (talk) 01:15, 14 March 2008 (UTC)

This thing of asking for a proof of correctness along with the witness may be kinda cheating (I haven't thought it through 100%), although it seems unlikely that there would be co-subset-sum NP verifoers, but none would admit a correctness proof. Another alternative would be for all k, let A_k be the algorithm that runs the algorithm in the section (the one that may run forever), and responds NO if it has run for more than k+n^k steps. Now, if P=NP, then some A_k is a poly-time algorithm for subset-sum. This is essentially what the section claims now, but stated more explicitly in terms of the relationship between the "always give an answer in poly time" definition vs. the "say yes in poly time or run forever". This statement is not as strong -- it says that if P=NP, then one of these algorithms is a strict P algorithm for subset-sum. Blokhead (talk) 12:24, 14 March 2008 (UTC)
Adding "provably" in front of "do" does not alter the meaning of the sentence from which it was removed. However, it does affect the following sentence, in particular the meaning of "such". Making this more explicit, contrast:
(1) It is unknown whether there is an A for which PT(A) holds. If there is an A for which PT(A) holds, then PT(B).
(2) It is unknown whether there is an A for which PT(A) holds. If there is an A for which PT(A) provably holds, then PT(B).
In this case we do not know if (1) is correct, while (2) is provably correct. The "action at a distance" of the word "provably" in the first sentence ("It is unknown ...") is perhaps a bit too subtle, and it may be better to also make this more explicit in the article, for example thus:
As of 2008, it is unknown whether an algorithm exists that can do this in polynomial time. But if such algorithms do exist that are also provably correct, then we already know some of them; ...
There is still another subtlety. What axiomatizable logic will the proof checker be based on? ZFC? There may be provable statements that are independent of ZFC, and PT(A) might be one of those. Note that (by Gödel) PT(B) is not provably correct in the same logic used for the proof checker in B, but independent of it. To make the statement fully defensible, replace "provably correct" by:
provably correct in some sound logic L
and replace the if clause in the algorithm by:
IF the program outputs a complete proof in logic L
 --Lambiam 07:44, 15 March 2008 (UTC)

[edit] Should this be Cited?

Is this known to be trivial or should it be cited in this Article?

(* complete dictionary of words, one per row, number of rows is (alpha^word), using words of length "word" and an alphabet of "alpha" number of characters *)

alpha=4;word=3;dict=.;

dict=Partition[Flatten[Tuples[Reverse[IdentityMatrix[alpha]],word]],(alpha*word)]

PseudoInverse[dict]==((Transpose[dict])*((alpha)^-(word -1)))-((word - 1)/(alpha^word)/word)

True

There is nothing special for alpha=4;word=3 - The method works for all cases where word < alpha, but that is a more verbose loop, and the Mathematica PseudoInverse function takes too long to calculate for values of alpha and word that are large, whereas the transpose side of the equation is fast.

100TWdoug (talk) 15:11, 27 March 2008 (UTC)

It can only be cited if you have a reference for this, published by a reliable source. However, it is not clear what this has to do with the problem whether P = NP.  --Lambiam 19:43, 27 March 2008 (UTC)

[edit] Moved "Is P really practical"

As discussed above, I moved most of the information out of this section and into the Cobham's thesis article. This was done because Dcoetzee and I agreed that the discussion was not very relevant to the P=NP problem. However, I ended up deleting some of the discussion of quantum computers, not moving it, because it was irrelevant to Cobham's thesis. Maybe that information could be added back into the P=NP article. I would consider adding something like this:

  • There is a misconception that quantum computers could help with the P=NP problem. However, quantum algorithms have not to date solved any NP-hard problem in polynomial time. Also, the definition of P and NP are in terms of classical computing models like Turing machines. Therefore, even if a quantum computer algorithm were discovered to efficiently solve an NP-hard problem, we would only have a way of physically solving difficult problems quickly, not a proof that the mathematical classes P and NP are equal.

But then, If we added that, I think we might need a citation for the first sentence of that paragraph. On the other hand, maybe we don't need any discussion of quantum computers in this article. What does everyone think? Navigatr85 04:16, 1 April 2008 (UTC)

I believe it is relevent as motivation for why the question is important. At least a summary should go here, so I did this. LouScheffer (talk) 04:54, 1 April 2008 (UTC)

[edit] The truth of a statement.

One of the sections says:

The problem of deciding the truth of a statement in Presburger arithmetic requires even more time. ... [deciding] the truth of Presburger statements has a runtime of at least 2^{2^{cn}} ... the problem ... [needs] more than exponential run time. Even more difficult are the undecidable problems, such as the halting problem.

That bound is interesting, but it should definitely be noted in the article that the halting problem is not more difficult than the problem of deciding the truth of a statement. That problem is also undecidable. Unfortunately I'm new to all this so I don't want to make the change myself for fear of messing it up worse! Can someone take care of it? .froth. (talk) 02:39, 6 June 2008 (UTC)

Deciding the truth of a statement in arithmetic, where you can talk about addition and multiplication, is in fact strictly harder than the halting problem. But in Presburger arithmetic, you're not allowed to talk about multiplication. That restriction turns the problem into a decidable one. --Trovatore (talk) 02:43, 6 June 2008 (UTC)
Is this an answer to Froth? His suggestion is to mention Tarski's theorem, which, I agree, offers a more striking contrast to the result of Fischer and Rabin than Turing's result. However, I think the whole paragraph does not belong here; the content could be moved to Complexity class, which could bear being expanded a bit.  --Lambiam 16:55, 6 June 2008 (UTC)
It's of tangential interest, and I think the current brief summary is appropriate, because it emphasizes that the NP-complete problems are not the hardest problems to solve, and that P=NP would not imply an efficient solution to everything. This article is targeted toward people with little or no complexity background, so this can't be assumed to be known.
Froth's point was that "the halting problem is not more difficult than deciding the truth of a statement" - but as Lambiam noted, he missed "in Presburger arithmetic," which makes it decidable. As for replacing the halting problem with Tarski's theorem, I think the halting problem is, if not the most fundamental undecidable problem, at least the most widely-known, and so a brief summary such as this does better in citing it than Tarski. Dcoetzee 17:12, 6 June 2008 (UTC)

So it's possible I misunderstood Froth's comment -- I thought he was claiming an actual error in the current text, and I was explaining why it wasn't an error. After reading Lambiam's comment I'm not sure that's what Froth meant. As to mentioning the decision problem for arithmetical truth here, I'm not sure that makes too much sense -- the halting problem is not just the most widely known undecidable problem, but it's also sort of an almost "minimal" one (not literally; there are nonzero unsolvability degrees below 0', but they're complicated to describe). Arithmetical truth is much more undecidable than the halting problem. --Trovatore (talk) 20:26, 6 June 2008 (UTC)