Independent set problem

From Wikipedia, the free encyclopedia

In mathematics, the independent set problem (IS) is a well-known problem in graph theory and combinatorics. The independent set problem is known to be NP-complete.

Description

Given a graph G, an independent set is a subset of its vertices no two of which are adjacent. In other words, the subgraph induced by these vertices has no edges, only isolated vertices. The independent set problem asks: given a graph G and an integer k, does G have an independent set of size at least k?
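The definition can be checked directly. The following is a minimal sketch, assuming the graph is stored as a dictionary mapping each vertex to the set of its neighbours; that representation and the name is_independent_set are illustrative choices, not something the article prescribes.

```python
from itertools import combinations

def is_independent_set(adj, vertices):
    """Return True if no two of the given vertices are adjacent.

    adj maps each vertex to the set of its neighbours (an assumed
    representation chosen for this sketch).
    """
    return all(v not in adj[u] for u, v in combinations(vertices, 2))

# Hypothetical example graph: the path a - b - c.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(is_independent_set(path, {"a", "c"}))  # True
print(is_independent_set(path, {"a", "b"}))  # False
```

The check looks at every pair of chosen vertices once, so it runs in time quadratic in the size of the subset.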

The corresponding optimization problem is the maximum independent set problem, which asks for a largest independent set in a graph. Given an algorithm for the decision problem, binary search over k finds the size of a maximum independent set with O(log |V|) invocations of that algorithm. The optimization problem is known to have no polynomial-time constant-factor approximation algorithm unless P = NP.
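The binary search can be sketched as follows. The decision procedure is passed in as a parameter; the brute-force has_independent_set below is only a stand-in so that the example runs, and both function names are assumptions made for illustration. The search is valid because the answer is monotone in k: any subset of an independent set is itself independent.

```python
from itertools import combinations

def has_independent_set(adj, k):
    """Stand-in decision procedure (brute force), used only for the demo."""
    return any(
        all(v not in adj[u] for u, v in combinations(subset, 2))
        for subset in combinations(list(adj), k)
    )

def max_independent_set_size(adj, decide=has_independent_set):
    """Binary search for the largest k for which decide(adj, k) is true."""
    lo, hi = 0, len(adj)          # k = 0 always succeeds (the empty set)
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the interval always shrinks
        if decide(adj, mid):
            lo = mid              # a set of size mid exists; try larger
        else:
            hi = mid - 1          # none exists; try smaller
    return lo

# Hypothetical example: a cycle on four vertices 0 - 1 - 2 - 3 - 0.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(max_independent_set_size(cycle4))  # 2
```

Note that this yields only the size of a largest independent set, not the set itself.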

Algorithms

The simplest brute force algorithm for independent set examines every vertex subset of size at least k and checks whether it is an independent set. This takes polynomial time if k equals the number of vertices, or is smaller than that by a constant, but not if k is, say, half the number of vertices.
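A minimal sketch of the brute force search follows, again assuming an adjacency-set dictionary. It differs from the description above in one labelled respect: it examines only subsets of size exactly k, which suffices because every subset of an independent set is independent. The name find_independent_set is hypothetical.

```python
from itertools import combinations

def find_independent_set(adj, k):
    """Return some independent set of size k, or None if there is none."""
    for subset in combinations(list(adj), k):
        if all(v not in adj[u] for u, v in combinations(subset, 2)):
            return set(subset)
    return None

# Hypothetical example: a cycle on five vertices 0 - 1 - 2 - 3 - 4 - 0.
cycle5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
print(find_independent_set(cycle5, 2))  # {0, 2}
print(find_independent_set(cycle5, 3))  # None
```

The outer loop visits up to C(|V|, k) subsets, which is what makes the running time exponential when k is a constant fraction of |V|.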

A much easier problem to solve is that of finding a maximal independent set, which is an independent set not contained in any other independent set. We begin with a single vertex. We find a vertex not adjacent to it and add it, then find a vertex adjacent to neither of these and add it, and so on until we can find no more vertices to add. At this point the set is a maximal independent set. A maximal independent set need not be a maximum one.
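The greedy procedure just described might look like this in code. This is a sketch under the same adjacency-set assumption; the optional order parameter is an addition for illustration, letting the caller choose which vertices are tried first.

```python
def greedy_maximal_independent_set(adj, order=None):
    """Scan the vertices once, keeping each one that has no chosen neighbour."""
    chosen = set()
    for v in (order if order is not None else adj):
        if chosen.isdisjoint(adj[v]):   # v is adjacent to nothing chosen so far
            chosen.add(v)
    return chosen

# Hypothetical example: a star with centre c and leaves x, y, z.
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
print(sorted(greedy_maximal_independent_set(star)))                        # ['c']
print(sorted(greedy_maximal_independent_set(star, ["x", "y", "z", "c"])))  # ['x', 'y', 'z']
```

Both results are maximal, since nothing can be added to either, but only the second is a largest independent set of the star. The outcome of the greedy scan depends on the order in which vertices are considered.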


Proof of NP-completeness

It's easy to see that the problem is in NP, since if we have a subset of vertices, we can check to make sure there are no edges between any two of them in polynomial time. To show the problem is NP-hard, we will use a reduction from another NP-complete problem.

Assume we already know Cook's result that the boolean satisfiability problem is NP-complete. One can efficiently transform any boolean formula into a formula in conjunctive normal form (CNF) that is satisfiable exactly when the original is. In conjunctive normal form:

  • The formula is a conjunction (and) of clauses.
  • Each clause is a disjunction (or) of literals.
  • Each literal is either a variable or its negation.

For example, the following formula is in CNF, where ~ denotes negation:

(x1 or ~x2 or ~x3) and (x1 or x2 or x4)

Such a formula is satisfiable if we can assign true/false values to the variables in such a way that at least one literal in every clause is true. For example, any assignment with x2 false and x4 true satisfies the above formula. The problem of determining whether a formula in CNF is satisfiable is also NP-complete and is called CNF-SAT.
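To make the satisfaction condition concrete, here is a small sketch that evaluates the example formula. It assumes an encoding chosen for illustration: each literal is a nonzero integer, with i standing for xi and -i for ~xi, and a formula is a list of clauses. The name satisfies is hypothetical.

```python
from itertools import product

# (x1 or ~x2 or ~x3) and (x1 or x2 or x4)
formula = [[1, -2, -3], [1, 2, 4]]

def satisfies(formula, assignment):
    """Return True if every clause contains at least one true literal.

    assignment maps each variable number to True or False.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

# Every assignment with x2 false and x4 true satisfies the formula.
print(all(
    satisfies(formula, {1: x1, 2: False, 3: x3, 4: True})
    for x1, x3 in product([False, True], repeat=2)
))  # True

# An assignment that falsifies the first clause.
print(satisfies(formula, {1: False, 2: True, 3: True, 4: False}))  # False
```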

The graph constructed by this reduction for the example formula above

Now, we describe a polynomial-time many-one reduction from CNF-SAT to the independent set problem. First, create a vertex for every literal in the formula; include duplicate vertices for those occurring more than once. Put an edge between:

  1. Any two literals which are negations of one another.
  2. Any two literals which are in the same clause.

In the example formula above, x2 would be adjacent to ~x2 (rule 1), the first x1 would be adjacent to ~x2 (rule 2), and the second x1 would be adjacent to x4 (rule 2).
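The construction can be sketched in a few lines. The encoding is an assumption made for illustration: literals are nonzero integers (i for xi, -i for ~xi), and each vertex is a pair (clause index, position in clause), so a literal that occurs twice gets two distinct vertices. The name cnf_to_graph is hypothetical.

```python
from itertools import combinations

def cnf_to_graph(formula):
    """Build the reduction graph; return (vertices, edges, k)."""
    vertices = [(i, j) for i, clause in enumerate(formula)
                for j in range(len(clause))]

    def literal(vertex):
        i, j = vertex
        return formula[i][j]

    edges = set()
    for u, v in combinations(vertices, 2):
        negations = literal(u) == -literal(v)   # rule 1
        same_clause = u[0] == v[0]              # rule 2
        if negations or same_clause:
            edges.add((u, v))
    return vertices, edges, len(formula)        # k is the number of clauses

# (x1 or ~x2 or ~x3) and (x1 or x2 or x4)
formula = [[1, -2, -3], [1, 2, 4]]
vertices, edges, k = cnf_to_graph(formula)
print(len(vertices), len(edges), k)   # 6 7 2
print(((0, 1), (1, 1)) in edges)      # True: ~x2 and x2
print(((0, 0), (1, 0)) in edges)      # False: the two copies of x1
```

The graph has one vertex per literal occurrence and the loop inspects each pair of vertices once, so the construction takes time polynomial in the length of the formula.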

We argue now that this graph has an independent set of size at least k, where k is the number of clauses, if and only if the original formula was satisfiable.

Suppose we have an assignment satisfying the original formula. Then we can choose one literal from each clause which is made true by this assignment. This set is independent, because it only includes one literal from each clause (no edges of type 2), and because no assignment makes both a literal and its negation true (no edges of type 1).
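This direction of the argument can be illustrated with a short sketch, assuming literals are encoded as nonzero integers (i for xi, -i for ~xi) and vertices as (clause index, position) pairs. The function name is hypothetical, and the code assumes the assignment really does satisfy the formula; otherwise next raises StopIteration.

```python
def independent_set_from_assignment(formula, assignment):
    """Pick, from each clause, the first literal the assignment makes true."""
    chosen = []
    for i, clause in enumerate(formula):
        j = next(j for j, lit in enumerate(clause)
                 if assignment[abs(lit)] == (lit > 0))
        chosen.append((i, j))
    return chosen

# (x1 or ~x2 or ~x3) and (x1 or x2 or x4), with x2 false and x4 true.
formula = [[1, -2, -3], [1, 2, 4]]
assignment = {1: False, 2: False, 3: True, 4: True}
chosen = independent_set_from_assignment(formula, assignment)
print(chosen)  # [(0, 1), (1, 2)], that is, ~x2 and x4

# No two chosen vertices share a clause, and none are negations of each other.
literals = [formula[i][j] for i, j in chosen]
print(len({i for i, _ in chosen}) == len(chosen))       # True
print(all(a != -b for a in literals for b in literals))  # True
```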

On the other hand, suppose we have an independent set of size k or greater. It can't contain any two literals in the same clause, since these are pairwise adjacent. But then, since there are at least k vertices and k clauses, we must have at least one in each clause (in fact exactly one). It also can't contain both a literal and its negation, because there are edges between these. That means we can choose an assignment that makes these k literals all true: set each chosen literal to true and give any remaining variables arbitrary values. This assignment satisfies the original formula, because every clause contains one of the chosen literals.
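Reading an assignment off an independent set can be sketched as follows, assuming literals are encoded as nonzero integers (i for xi, -i for ~xi) and vertices as (clause index, position) pairs. The function name is hypothetical, and defaulting the unconstrained variables to false is an arbitrary choice.

```python
def assignment_from_independent_set(formula, chosen):
    """Make every chosen literal true; other variables default to False."""
    variables = {abs(lit) for clause in formula for lit in clause}
    assignment = {var: False for var in variables}   # arbitrary default
    for i, j in chosen:
        lit = formula[i][j]
        # Independence rules out a literal and its negation both being
        # chosen, so these settings never conflict.
        assignment[abs(lit)] = lit > 0
    return assignment

# (x1 or ~x2 or ~x3) and (x1 or x2 or x4)
formula = [[1, -2, -3], [1, 2, 4]]

# Independent set made of the two copies of x1, one per clause.
result = assignment_from_independent_set(formula, [(0, 0), (1, 0)])
print(result == {1: True, 2: False, 3: False, 4: False})  # True

# The resulting assignment makes at least one literal true in every clause.
print(all(any(result[abs(l)] == (l > 0) for l in clause)
          for clause in formula))  # True
```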

What makes this reduction to independent set so simple is the capacity of the edges in the graph to express constraints, such as the necessity of never choosing both a literal and its negation. The graph coloring problem also benefits from this useful property.

References

  • Richard Karp. Reducibility Among Combinatorial Problems. Proceedings of a Symposium on the Complexity of Computer Computations. 1972.
  • Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman. ISBN 0-7167-1045-5. A1.2: GT20, pg.194.