R* tree

From Wikipedia, the free encyclopedia

The R* tree is a variant of the R tree for indexing spatial information. It supports point and spatial data at the same time, at a slightly higher cost than other R-trees. It was proposed by Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, and Bernhard Seeger in 1990.

Difference between R* trees and R trees

Minimization of both coverage and overlap is crucial to the performance of R-trees. The R*-tree attempts to reduce both, using a combination of a revised node split algorithm and the concept of forced reinsertion at node overflow. This is based on the observation that R-tree structures are highly sensitive to the order in which their entries are inserted, so an insertion-built (rather than bulk-loaded) structure is likely to be sub-optimal. Deleting and reinserting entries allows them to "find" a place in the tree that may be more appropriate than their original location.
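The two quantities can be made concrete with a minimal sketch. It assumes two-dimensional, axis-aligned rectangles stored as (xmin, ymin, xmax, ymax) tuples, takes coverage to mean the area of a node's bounding rectangle, and takes overlap to mean the area shared by two bounding rectangles. The names area, overlap_area and mbr are illustrative and not taken from any particular library.

```python
from typing import Iterable, Tuple

# Assumed representation: (xmin, ymin, xmax, ymax), axis-aligned, 2-D.
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle; used here as a node's coverage."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def overlap_area(a: Rect, b: Rect) -> float:
    """Area shared by two rectangles, or 0.0 if they do not intersect."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def mbr(rects: Iterable[Rect]) -> Rect:
    """Minimum bounding rectangle enclosing all given rectangles."""
    rs = list(rects)
    return (
        min(r[0] for r in rs),
        min(r[1] for r in rs),
        max(r[2] for r in rs),
        max(r[3] for r in rs),
    )
```

For example, a node holding the entries (0, 0, 1, 1) and (2, 2, 3, 3) has the bounding rectangle (0, 0, 3, 3). Its coverage is 9 although the entries themselves occupy an area of only 2, which is the kind of dead space that a better grouping of entries would reduce.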

When a node overflows, a portion of its entries is removed from the node and reinserted into the tree. (To avoid an indefinite cascade of reinsertions caused by subsequent node overflows, the reinsertion routine may be called only once at each level of the tree during the insertion of any one new entry.) This produces better-clustered groups of entries in nodes, reducing node coverage. Furthermore, actual node splits are often postponed, causing average node occupancy to rise.
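The overflow handling described above can be sketched as follows. This is an illustration under stated assumptions rather than a full implementation. The text does not say which entries are removed, so the sketch assumes the entries whose centers lie farthest from the center of the node's bounding rectangle are chosen, and that the removed fraction is 30%. The names pick_reinsert and overflow_treatment, and the fraction parameter, are hypothetical.

```python
import math
from typing import List, Set, Tuple

# Assumed representation: (xmin, ymin, xmax, ymax), axis-aligned, 2-D.
Rect = Tuple[float, float, float, float]


def center(r: Rect) -> Tuple[float, float]:
    """Center point of a rectangle."""
    return ((r[0] + r[2]) / 2.0, (r[1] + r[3]) / 2.0)


def bounding_box(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


def pick_reinsert(entries: List[Rect],
                  fraction: float = 0.3) -> Tuple[List[Rect], List[Rect]]:
    """Split an overflowing node's entries into (kept, to_reinsert).

    Assumption: the entries farthest from the node center are removed,
    and `fraction` of the entries (at least one) is taken.
    """
    node_center = center(bounding_box(entries))
    ranked = sorted(entries,
                    key=lambda r: math.dist(center(r), node_center),
                    reverse=True)  # farthest first
    p = max(1, int(len(entries) * fraction))
    return ranked[p:], ranked[:p]


def overflow_treatment(level: int, reinserted_levels: Set[int]) -> str:
    """Decide how to handle an overflow at `level`.

    `reinserted_levels` is reset to an empty set for each new insertion,
    so reinsertion happens at most once per level per inserted entry.
    """
    if level not in reinserted_levels:
        reinserted_levels.add(level)
        return "reinsert"
    return "split"
```

A caller would create an empty set when it starts inserting a new entry and pass the same set to every overflow_treatment call triggered by that insertion. The entries returned in to_reinsert are then inserted again from the top of the tree, and a second overflow at the same level falls back to a regular node split.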

Performance

The R* tree is likely to offer a significant improvement over other R tree variants, but the reinsertion method adds overhead.
It efficiently supports point and spatial data at the same time.