Linear hashing

Linear hashing is a dynamic hash table algorithm invented by Witold Litwin (1980),[1] and later popularized by Paul Larson. Linear hashing allows for the expansion of the hash table one slot at a time. The frequent single slot expansion can very effectively control the length of the collision chain. The cost of hash table expansion is spread out across each hash table insertion operation, as opposed to being incurred all at once.[2] Linear hashing is therefore well suited for interactive applications.

Algorithm Details


First the initial hash table is set up with some arbitrary initial number of buckets. The following values need to be kept track of:

Bucket collisions can be handled in a variety of ways but it is typical to have space for two items in each bucket and to add more buckets whenever a bucket overflows. More than two items can be used once the implementation is debugged. Addresses are calculated in the following way:

To add a bucket:

The effect of all of this is that the table is split into three sections; the section before S, the section from S to N \times 2^L, and the section after N \times 2^L. The first and last sections are stored using H \bmod N \times 2^{L+1} and the middle section is stored using H \bmod N \times 2^L. Each time S reaches N \times 2^L the table has doubled in size.

Points to ponder over

Algorithm for inserting ‘k’ and checking overflow condition

Searching in the hash table for ‘k’

Adoption in language systems

Griswold and Townsend [3] discussed the adoption of linear hashing in the Icon language. They discussed the implementation alternatives of dynamic array algorithm used in linear hashing, and presented performance comparisons using a list of Icon benchmark applications.

Adoption in database systems

Linear hashing is used in the BDB Berkeley database system, which in turn is used by many software systems such as OpenLDAP, using a C implementation derived from the CACM article and first published on the Usenet in 1988 by Esmond Pitt.

References

  1. Litwin, Witold (1980), "Linear hashing: A new tool for file and table addressing" (PDF), Proc. 6th Conference on Very Large Databases: 212–223
  2. Larson, Per-Åke (April 1988), "Dynamic Hash Tables", Communications of the ACM 31 (4): 446–457, doi:10.1145/42404.42410
  3. Griswold, William G.; Townsend, Gregg M. (April 1993), "The Design and Implementation of Dynamic Hashing for Sets and Tables in Icon", Software - Practice and Experience 23 (4): 351–367

External links

See also