Longest increasing subsequence problem

The longest increasing subsequence problem is to find the longest increasing subsequence of a given sequence. Note that the subsequence we are searching for is not necessarily contiguous. Finding the longest contiguous increasing subsequence would be a simple matter of scanning the sequence once, noting the points where the values decrease, and taking the longest run between two consecutive such points.

The longest increasing subsequence problem has long been known to be solvable in time O(n log n) (Schensted 1961).

Relations to other algorithmic problems

The longest increasing subsequence problem is closely related to the longest common subsequence problem, which has a quadratic time dynamic programming solution: the longest increasing subsequence of a sequence S is the longest common subsequence of S and T, where T is the result of sorting S. However, for the special case in which the input is a permutation of the integers 1, 2, ..., n, this approach can be made much more efficient, leading to time bounds of the form O(n log log n) (Hunt and Szymanski 1977).
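The reduction to longest common subsequence can be sketched as follows (a quadratic-time illustration, not the efficient algorithm described later; it assumes distinct elements, since with duplicates the LCS of S and sorted S yields the longest nondecreasing subsequence instead):

```python
def lis_via_lcs(s):
    """Longest increasing subsequence of s (distinct elements),
    computed as the LCS of s and its sorted copy via quadratic DP."""
    t = sorted(s)
    n, m = len(s), len(t)
    # lcs[i][j] = length of an LCS of s[:i] and t[:j]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s[i - 1] == t[j - 1]:
                lcs[i][j] = lcs[i - 1][j - 1] + 1
            else:
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])
    # Recover one LCS by walking back through the table.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if s[i - 1] == t[j - 1]:
            out.append(s[i - 1])
            i, j = i - 1, j - 1
        elif lcs[i - 1][j] >= lcs[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]
```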

The longest increasing subsequence problem can also be related to finding the longest path in a directed acyclic graph derived from the input sequence: we construct a vertex per sequence value and connect values x and y by an edge when x is both smaller than y and occurs earlier than it in the sequence. Longest paths in DAGs can be found in time linear in the size of the DAG, but in this case the size can be quadratic in the sequence length, so this algorithm is not asymptotically faster than the dynamic programming one.
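A minimal sketch of this view: the DAG's topological order is simply left-to-right index order, so the longest-path computation collapses into the familiar quadratic dynamic program, without building the edge set explicitly:

```python
def lis_length_dag(s):
    """Length of the longest increasing subsequence, viewed as the
    longest path (counted in vertices) in the DAG with an edge
    x -> y whenever x occurs before y in s and x < y."""
    n = len(s)
    best = [1] * n  # best[i] = longest increasing subsequence ending at s[i]
    for i in range(n):
        for j in range(i):      # every DAG edge j -> i has j < i
            if s[j] < s[i]:
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)
```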

Efficient algorithms

We describe an efficient algorithm for the problem that needs no data structures, only arrays and binary searching. The algorithm processes the sequence elements in order, maintaining the longest increasing subsequence found so far. Denote the sequence values as X[0], X[1], etc. Then, after processing X[i], the algorithm will have stored values in two arrays, described below:

M[j] — stores the position k of the smallest value X[k] such that k ≤ i and there is an increasing subsequence of length j ending at X[k]
P[k] — stores the position of the predecessor of X[k] in the longest increasing subsequence ending at X[k]

In addition the algorithm stores a variable L representing the length of the longest increasing subsequence found so far.

Note that, at any point in the algorithm, the sequence

X[M[1]], X[M[2]], ..., X[M[L]]

is increasing. For, if there is an increasing subsequence of length j ending at X[M[j]], then there is also a subsequence of length j−1 ending at a smaller value: namely, the one ending at X[P[M[j]]]; since X[M[j−1]] is the smallest value ending such a subsequence, X[M[j−1]] ≤ X[P[M[j]]] < X[M[j]]. Thus, we may do binary searches in this sequence in logarithmic time.

The algorithm, then, proceeds as follows.

   L = 0
   for i = 0, 1, ..., n-1:
       binary search for the largest j ≤ L such that X[M[j]] < X[i] (or set j = 0 if no such value exists)
       P[i] = M[j]
       if j == L or X[i] < X[M[j+1]]:
           M[j+1] = i
           L = max(L, j+1)

At the end of the algorithm, L contains the length of the longest increasing subsequence. The subsequence itself can be recovered by backtracking through the P array:

..., X[P[P[M[L]]]], X[P[M[L]]], X[M[L]]
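The pseudocode above can be made concrete as follows (a sketch in Python, using 0-based sequence indices and keeping M 1-indexed via an unused slot 0, as in the pseudocode):

```python
def longest_increasing_subsequence(x):
    """O(n log n) longest increasing subsequence, following the
    M/P arrays described above."""
    n = len(x)
    p = [0] * n        # p[k]: predecessor position of x[k] in the LIS ending there
    m = [0] * (n + 1)  # m[j]: position of smallest value ending a length-j subsequence
    length = 0         # the variable L of the pseudocode
    for i in range(n):
        # Binary search for the largest j <= length with x[m[j]] < x[i].
        lo, hi = 1, length
        while lo <= hi:
            mid = (lo + hi) // 2
            if x[m[mid]] < x[i]:
                lo = mid + 1
            else:
                hi = mid - 1
        j = lo - 1           # j = 0 when no such value exists
        p[i] = m[j]          # dummy when j == 0; never followed in that case
        if j == length or x[i] < x[m[j + 1]]:
            m[j + 1] = i
            length = max(length, j + 1)
    # Backtrack from m[length] through p to recover the subsequence.
    out = []
    k = m[length]
    for _ in range(length):
        out.append(x[k])
        k = p[k]
    return out[::-1]
```

Note that P[i] is set unconditionally on every iteration, even when M is not updated: X[i] might still begin or extend a subsequence that later becomes the longest one.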

References

  • Hunt, J.; Szymanski, T. (1977). "A fast algorithm for computing longest common subsequences". Communications of the ACM 20 (5): 350–353.
  • Schensted, C. (1961). "Longest increasing and decreasing subsequences". Canadian Journal of Mathematics 13: 179–191.
