Double hashing

From Wikipedia, the free encyclopedia

Double hashing is a computer programming technique used in hashing to resolve hash collisions, cases when two different values to be searched for produce the same hash key. It is a popular collision-resolution technique in open-addressed hash tables. Like linear probing, it uses one hash value as a starting point and then repeatedly steps forward an interval to another address until the desired value is located, an empty location is reached, or the entire table has been searched. Double hashing avoids the problem of clustering, an unfortunately high probability of repeated collisions, by using varied step sizes.

This comes at the price of not being able to delete entries properly after a collision occurred for a certain slot. Instead, a delete marker known as a tombstone will be put in its place, filling up a slot and slightly degrading search performance. While tombstones can be reused on a later insertion, it may be prudent to do a little garbage collection on tables which have many tombstones especially if the table is being used to perform frequent lookup. Like all other forms of open addressing, double hashing becomes linear as the hash table approaches maximum capacity. The only solution to this is to rehash to a larger size.

Double hashing can also be employed in Bloom filters to compute the requisite hash values associated with each element, but enhanced double hashing is superior in terms of accuracy (false positive probability).

Example :

Given h₁(k)and h₂(k) are auxiliary hash functions,

         h(k,i) = (h₁(k) + i h₂(k)) mod m

[edit] See also

This computer science-related article is a stub. You can help Wikipedia by expanding it.

Retrieved from "http://en.wikipedia.org../../../d/o/u/Double_hashing.html"

Categories: Computer science stubs | Data structures | Search algorithms

Double hashing

From Wikipedia, the free encyclopedia

[edit] See also

Views

Navigation

Search

In other languages