Proof that Boolean satisfiability problem is NP-complete
From Wikipedia, the free encyclopedia
The boolean satisfiability problem, commonly abbreviated as SAT, is the question whether a given boolean expression φ, consisting of n variables v1, ..., vn with parentheses and operators between them, is satisfiable, i.e. whether there exists an assignment of truth values to the variables that makes the boolean expression true.
This decision problem is NP-complete.
Boolean satisfiability is the first problem to be shown to be NP-complete, so its proof does not rely on the NP-completeness of other problems. Many other problems are shown to be NP-complete by reducing boolean satisfiability to them.
[edit] Proof
First of all, we need to show that boolean satisfiability is in NP. This is rather easy: Given a boolean expression and a particular assignment of truth values (the "certificate"), we can check in polynomial time whether this assignment of truth values does actually make the boolean expression evaluate to true.
To prove that it is NP-complete, we need to show that every other decision problem in NP can be reduced to it.
A decision problem is basically the same thing as a language L (a set of strings). An algorithm for the decision problem would be an algorithm that determines whether a given string x is in that set L, and accepts or rejects x accordingly. Saying that the decision problem is in NP means that a non-deterministic Turing machine (with alphabet Σ, a blank symbol blank ∈ Σ, a set of states K, a designated start state s ∈ K, and a designated accepting state acc ∈ K) can perform this in polynomial time, say nk, where n is the length of the input x and k is a constant. Now, if we can construct an algorithm that converts an x to a boolean expression that is satisfiable if and only if this non-deterministic Turing machine accepts x, then boolean satisfiability is NP-complete.
To construct our boolean expression, we will need a whole load of boolean variables:
-
- Si,q specifies that at the ith step, the machine is in state q.
-
- Ti,j,σ specifies that at the ith step, the symbol in the jth position on the tape is σ.
-
- Hi,j specifies that at the ith step, the head is in position j.
The total number of these variables is | K | nk + | Σ | n2k + n2k.
Next, we will construct a number of boolean expressions. All of them must be true, so at the end we will stick them all together with "and"s in between. First, there are some basic truths about Turing machines that we have to encode:
-
- meaning: The head is never in two positions at the same time.
-
- meaning: The machine is never in two states at once.
-
- meaning: Each tape cell contains only one symbol.
Second, the basic operation of a Turing machine needs to be encoded:
-
- meaning: At the first step of the computation, the machine is in state s (the start state), and the head is pointing at the first symbol on the tape.
-
- meaning: At the first step, the tape contains the input x on the left, and blank symbols everywhere on the right.
-
- meaning: If at any stage during the computation, the head is pointing at the jth symbol on the tape, then all other symbols on the tape will still be the same in the next step. (In other words: Only the symbol under the head may change.)
Now we encode the actual Turing machine program itself. We think of the Turing machine program as a function δ that takes a pair (q, σ) that specifies the current state and symbol, and returns a set of triples (q', σ', d) which specifies the new state, new symbol, and the direction to move to, where d is either -1, 0, or +1. Notice that it returns a set of these triples, not just one such triple; this is where the non-determinism comes in. We express these semantics as a boolean expression:
Lastly, and this is what gives the whole thing the final kick, we want at least one of those computations we have encoded to reach an accepting state:
Now, finally, we glue all of these expressions together with "and"s, and we get a gigantic boolean expression that is satisfiable if and only if one of the computations reaches an accepting state.
This shows that if we could determine the satisfiability of a boolean expression in polynomial time, then we can solve any other NP problem in polynomial time, and so boolean satisfiability is NP-complete.