Self-balancing binary search tree

From Wikipedia, the free encyclopedia

In computer science, a self-balancing binary search tree or height-balanced binary search tree is a binary search tree that attempts to keep its height, or the number of levels of nodes beneath the root, as small as possible at all times, automatically. It is one of the most efficient ways of implementing associative arrays, sets, and other data structures.

[edit] Overview

Most operations on a binary search tree take time directly proportional to the tree's height, so it is desirable to keep the height small. Ordinary binary search trees have the primary disadvantage that they can attain very large heights in rather ordinary situations, such as when the keys are inserted in order. The result is a data structure similar to a linked list, making all operations on the tree expensive. If we know all the data ahead of time, we can keep the height small on average by adding values in a random order, but we don't always have this luxury, particularly in online algorithms.

Self-balancing binary trees solve this problem by performing transformations on the tree (such as tree rotations) at key times, in order to reduce the height. Although a certain overhead is involved, it is justified in the long run by drastically decreasing the time of later operations.

The height must always be at least the ceiling of log n, since there are at most 2k nodes on the kth level; a complete or full binary tree has exactly this many levels. Balanced BSTs are not always so precisely balanced, since it can be expensive to keep a tree at minimum height at all times; instead, they keep the height within a constant factor of this lower bound.

Times for various operations in terms of number of nodes in the tree n:

Operation Big-O time
Lookup O(log n)
Insertion O(log n)
Removal O(log n)
In-order iteration over all elements O(n)

For some implementations these times are worst-case, while for others they are amortized.


[edit] Implementations

Popular data structures implementing this type of tree include:

[edit] Applications

Self-balancing binary search trees can be used in a natural way to construct associative arrays; key-value pairs are simply inserted with an ordering based on the key alone. In this capacity, self-balancing BSTs have a number of advantages and disadvantages over their main competitor, hash tables. Lookup is somewhat complicated in the case where the same key can be used multiple times.

Many algorithms can exploit self-balancing BSTs to achieve good worst-case bounds with very little effort. For example, if binary tree sort is done with a BST, we have a very simple-to-describe yet asymptotically optimal O(n log n) sorting algorithm (although such an algorithm has practical disadvantages due to bad cache behavior). Similarly, many algorithms in computational geometry exploit variations on self-balancing BSTs to solve problems such as the line segment intersection problem and the point location problem efficiently.

Self-balancing BSTs are a flexible data structure, in that it's easy to extend them to efficiently record additional information or perform new operations. For example, one can record the number of nodes in each subtree having a certain property, allowing one to count the number of nodes in a certain key range with that property in O(log n) time. These extensions can be used, for example, to optimize database queries or other list-processing algorithms.

[edit] See also

[edit] External links

[edit] References

  • Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0. Section 6.2.3: Balanced Trees, pp.458–481.