Tarjan's strongly connected components algorithm

From Wikipedia, the free encyclopedia

Tarjan's Algorithm (named for its discoverer, Robert Tarjan) is a graph theory algorithm for finding the strongly connected components of a graph.

1 Idea
2 The Root Property
3 The Algorithm in pseudocode
4 Remarks
5 Literature
6 Links

[edit] Idea

The basic idea of the algorithm is this: a depth-first search begins from a start node. The strongly connected components form the subtrees of the search tree, the roots of which are the roots of the strongly connected components.

The nodes are placed on a stack in the order in which they are visited. When the search returns from a subtree, the nodes are taken from the stack and it is determined whether each node is the root of a strongly connected component. If a node is the root of a strongly connected component, then it and all of the nodes taken off before it form that strongly connected component.

[edit] The Root Property

It is obviously crucial for the algorithm that when returning from a subtree it is determined for each node whether that node is in the root of a strongly connected component. For that purpose, each node is given a depth search index v.dfs, which numbers the nodes consecutively in the order in which they are "discovered" by the depth first search. In addition, a value v.lowlink is assigned, such that v.lowlink := min {v'.dfs: v' is reachable from v}. Therefore: v is the root of a strongly connected component if and only if v.lowlink = v.dfs. v.lowlink can be computed during the depth first search in such a way that the value is known at the time of the query.

[edit] The Algorithm in pseudocode

Input: Graph G = (V, E), Start node v0

max_dfs := 0  // Counter for dfs
U := V        // Collection of unvisited nodes
S := {}       // An initially empty stack
tarjan(v0)    // Call the function with the start node

procedure tarjan(v)
v.dfs := max_dfs;          // Set the depth index
v.lowlink := max_dfs;      // v.lowlink <= v.dfs
max_dfs := max_dfs + 1;    // Increment the counter
S.push(v);                 // Place v on the stack
U := U \ {v};              // Separate v from U
forall (v, v') in E do     // Consider the neighboring nodes
  if (v' in U)
    tarjan(v');            // recursive call
    v.lowlink := min(v.lowlink, v'.lowlink);
  // Ask whether v' is on the stack 
  // by a clever linear time method
  // (for example, setting a flag on the node when it is pushed or popped) 
  elseif (v' in S)
    v.lowlink := min(v.lowlink, v'.dfs);
  end if
end for
if (v.lowlink = v.dfs)     // the root of a strongly connected component
  print "SZK:";
  repeat
    v' := S.pop;
    print v';
  until (v' = v);
end if

[edit] Remarks

Complexity: The tarjan procedure is called once for each node; the forall statement considers each edge at most twice. The algorithm's running time is therefore linear in the number of edges in G.
Naturally, the algorithm can only find those strongly connected components that are reachable from the start node. Otherwise, the algorithm must be started several times from different start nodes.