Rendezvous hashing
Rendezvous or Highest Random Weight (HRW) hashing[1][2] is an algorithm that allows clients to achieve distributed agreement on the site (or proxy) at which a given object is to be placed. It accomplishes goals similar to those of consistent hashing, using an entirely different method.
History
Rendezvous or Highest Random Weight (HRW) hashing[1][2] was invented in 1996 by David Thaler and Chinya Ravishankar at the University of Michigan. Consistent hashing appears to have been devised independently and simultaneously at MIT. One of the first applications of rendezvous hashing was to enable multicast clients on the Internet (in contexts such as the MBONE) to identify multicast rendezvous points in a distributed fashion.[3][4] It was used in 1998 by Microsoft's Cache Array Routing Protocol (CARP) for distributed cache coordination and routing.[5][6] It has also been used in applications such as mobile caching,[7] router design,[8] and secure key establishment.[9]
The HRW Algorithm For Rendezvous Hashing
Rendezvous hashing solves the following problem: How can a set of clients, given an object O, agree on where in a set of sites (servers, say) to place O? Each client is to select a site independently, but all clients must end up picking the same site.
Let the sites be represented as {S1, S2, ..., Sn}. HRW solves this problem by choosing a hash function h(), and having each client compute the n hash weights w1 = h(S1, O), w2 = h(S2, O), ..., wn = h(Sn, O). Each client then places O in the site SO that yields the highest weight wO = max{w1, w2, ..., wn}. Since all clients perform identical hash value computations, they will all independently pick the same site SO.
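For illustration, the whole algorithm fits in a few lines. The following is a minimal sketch in Python; SHA-1 and the site-object separator are stand-in assumptions here, not the hash function recommended in the original papers.[1][2]

```python
import hashlib

def hrw_site(obj: str, sites: list[str]) -> str:
    """Return the site with the highest hash weight h(site, obj)."""
    def weight(site: str) -> int:
        # Any well-mixed hash works; SHA-1 is illustrative only.
        digest = hashlib.sha1(f"{site}|{obj}".encode()).digest()
        return int.from_bytes(digest, "big")
    return max(sites, key=weight)

# Every client computes the same weights, so all pick the same site.
print(hrw_site("object-42", ["S1", "S2", "S3", "S4"]))
```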
Properties
It might at first appear sufficient to treat the n sites as buckets in a hash table and hash the object name O into this table. However, if any of the sites fails or is unreachable, the hash table size changes, forcing all objects to be remapped. This massive disruption makes such direct hashing unworkable. Under rendezvous hashing, by contrast, clients handle site failures by picking the site that yields the next largest weight. Remapping is required only for objects currently mapped to the failed site, and as proved in the original papers,[1][2] this disruption is minimal. Rendezvous hashing has the following properties.
- Low overhead: The hash function used is efficient, so overhead at the clients is very low.
- Load balancing: Since the hash function is randomizing, each of the n sites is equally likely to receive the object O. Loads are uniform across the sites.
- High hit rate: Since all clients agree on placing an object O into the same site SO, each fetch or placement of O into SO yields the maximum utility in terms of hit rate. The object O will always be found unless it is evicted by some replacement algorithm at SO.
- Minimal disruption: When a site fails, only the objects mapped to that site need to be remapped. Disruption is at the minimal possible level, as proved in the original papers.[1][2]
- Distributed k-agreement: Clients can reach distributed agreement on k sites simply by selecting the top k sites in the ordering,[9] as in the sketch below.
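The failover and k-agreement properties both fall out of the same per-object ranking of sites. A hedged sketch under the same stand-in hash assumptions as above:

```python
import hashlib

def ranked_sites(obj: str, sites: list[str]) -> list[str]:
    """All sites, ordered by descending hash weight for this object."""
    def weight(site: str) -> int:
        digest = hashlib.sha1(f"{site}|{obj}".encode()).digest()
        return int.from_bytes(digest, "big")
    return sorted(sites, key=weight, reverse=True)

ranking = ranked_sites("object-42", ["S1", "S2", "S3", "S4"])

# Failure handling: skip unreachable sites and take the next-highest weight.
alive = {"S1", "S2", "S4"}                      # suppose S3 has failed
primary = next(s for s in ranking if s in alive)

# Distributed k-agreement: every client takes the top k of the same ranking.
top_2 = ranking[:2]
```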
Comparison With Consistent Hashing
Consistent hashing operates by mapping sites uniformly and randomly to points on a unit circle. The object O is likewise mapped to a point pO on the unit circle, and SO is selected as the site closest to pO, proceeding clockwise around the circle. If a site Sk fails, all objects mapping to it are remapped to Sk+1, the next site clockwise from Sk on the circle. Consequently, Sk+1 must now handle Sk's workload in addition to its own, and its performance is likely to be significantly worse than that of the other sites. The variance in workload across sites also increases, and worsens as more sites fail.
In practice, this problem is handled by randomly distributing a large number (100-200, say) of virtual replicas of the sites on the circle. Each virtual replica of a site becomes clockwise adjacent to different sites at different parts of the circle. When a site fails, its objects are remapped to different servers at different parts of the circle, reducing the congestion alluded to above, but at the cost of managing these virtual replicas.
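For contrast, a minimal sketch of such a ring with virtual replicas; the replica count, the 32-bit ring, and SHA-1 are illustrative assumptions, not prescribed by any source cited here:

```python
import bisect
import hashlib

def ring_point(name: str) -> int:
    """Map a name to a point on the circle (a 32-bit integer ring here)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest()[:4], "big")

def build_ring(sites: list[str], replicas: int = 100) -> list[tuple[int, str]]:
    """Place `replicas` virtual copies of each site on the circle."""
    return sorted((ring_point(f"{site}#{i}"), site)
                  for site in sites for i in range(replicas))

def lookup(ring: list[tuple[int, str]], obj: str) -> str:
    """Return the first site at or clockwise from the object's point."""
    p = ring_point(obj)
    idx = bisect.bisect_right(ring, (p, ""))  # first ring point at/past p
    return ring[idx % len(ring)][1]           # wrap around the circle

ring = build_ring(["S1", "S2", "S3", "S4"])
print(lookup(ring, "object-42"))
```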
Rendezvous hashing does not have this problem. Since HRW hashing incorporates both the site name and object name, the weight-based rank ordering of sites differs for each object. Objects mapping to the failed site Sk are now remapped uniformly across all the remaining sites, resulting in balanced loads.
Rendezvous hashing also appears to be more flexible and powerful, and can provide simple solutions to problems such as k-agreement. The technique has also been used with weights assigned to sites.
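The cited sources do not fix a particular weighting scheme. One common choice, sketched below as an assumption, scores each site as -w / ln(u), where w is the site's weight and u is its hash normalized into (0, 1); under this scheme a site receives objects roughly in proportion to its weight:

```python
import hashlib
import math

def weighted_hrw_site(obj: str, site_weights: dict[str, float]) -> str:
    """Weighted rendezvous: score = -w / ln(u), u in (0, 1).
    This scoring rule is an assumption, not from the cited papers."""
    def score(site: str) -> float:
        digest = hashlib.sha1(f"{site}|{obj}".encode()).digest()
        u = (int.from_bytes(digest[:8], "big") + 1) / (2**64 + 1)  # u in (0,1)
        return -site_weights[site] / math.log(u)
    return max(site_weights, key=score)

# S2 has twice the capacity, so it should win about twice as often.
print(weighted_hrw_site("object-42", {"S1": 1.0, "S2": 2.0, "S3": 1.0}))
```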
Implementation
Implementation is straightforward once a hash function h() is chosen (the original work on the HRW method[1][2] makes a hash function recommendation). Each client only needs to compute a hash value for each of the n sites, and then pick the largest. While the basic HRW algorithm thus runs in O(n) time, O(log n) running time is achievable: a virtual hierarchical structure (called a skeleton) can be created, and HRW applied at each level as one descends the hierarchy.[10][11]

To see this, choose some constant m and organize the n sites into c = ceiling(n/m) clusters C1 = {S1, S2, ..., Sm}, C2 = {Sm+1, Sm+2, ..., S2m}, ... Now build a virtual hierarchy as follows. Choose a constant d and imagine these c clusters placed at the leaves of a tree T of virtual nodes, each with fanout d. Clearly, T has height h = O(log c) = O(log n), since m and d are both constants. To process an object request O, descend T from the root, using rendezvous hashing to select one of the d branches at each level. It is not necessary to construct T: it suffices to assign the names 1, 2, ..., d to the virtual nodes at each level, and to repeat rendezvous hashing h = O(log n) times to simulate descending the tree. The work done at each level is O(1), since d is a constant.
At this point, we will have effectively arrived at a cluster. We now select a site within this cluster by applying rendezvous hashing to its m sites. If we chose virtual nodes v1, v2, ..., vh as we descended T, the index of this leaf-level cluster has the radix-d representation v1v2...vh. For any given object O, it is clear that the method chooses each of the clusters, and hence each of the n sites, with equal probability. If the site finally selected is unavailable, rendezvous hashing leads us to a different site within its cluster. Thus, the load corresponding to any inactive site is shared equally among the active nodes within its cluster. If all nodes in the cluster are unavailable, we go up one level in the virtual hierarchy and select a sibling cluster, in the usual manner.
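A compact sketch of the skeleton descent, under the same stand-in hash as the earlier examples; the function names are illustrative, within-cluster failover is omitted, and the modulo step is a simplification for cluster counts that are not a power of d:

```python
import hashlib
import math

def hrw_pick(obj: str, choices: list[str]) -> str:
    """Plain HRW over a list of names."""
    def weight(name: str) -> int:
        digest = hashlib.sha1(f"{name}|{obj}".encode()).digest()
        return int.from_bytes(digest, "big")
    return max(choices, key=weight)

def skeleton_site(obj: str, sites: list[str], m: int = 4, d: int = 2) -> str:
    """Descend a virtual tree of fanout d whose leaves are clusters of
    m sites, doing O(1) HRW work per level, hence O(log n) overall."""
    c = math.ceil(len(sites) / m)  # number of leaf clusters
    h = 1
    while d ** h < c:              # smallest height with d**h >= c
        h += 1
    digits = []
    for _ in range(h):
        # Virtual nodes at each level are just named 0..d-1; folding the
        # path so far into the key makes each level's choice independent.
        level_key = f"{obj}|{''.join(digits)}"
        digits.append(hrw_pick(level_key, [str(i) for i in range(d)]))
    # The chosen path, read as a radix-d number, indexes a leaf cluster.
    # (The modulo is a simplification when c is not a power of d.)
    cluster_idx = int("".join(digits), d) % c
    cluster = sites[cluster_idx * m:(cluster_idx + 1) * m]
    return hrw_pick(obj, cluster)  # plain HRW within the chosen cluster

print(skeleton_site("object-42", [f"S{i}" for i in range(1, 17)]))
```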
The value of m can be chosen based upon such factors as the anticipated failure rate and the degree of load balancing desired. A higher value of m leads to less load skew in the event of failure, at the cost of somewhat higher search overhead. The choice m = n is equivalent to non-hierarchical rendezvous hashing. In practice, the hash h() is very cheap, so m = n can work quite well unless n is very large.
References
1. Thaler, David; Ravishankar, Chinya (1996). "A Name-Based Mapping Scheme for Rendezvous". University of Michigan Technical Report CSE-TR-316-96. Retrieved September 15, 2013.
2. Thaler, David; Ravishankar, Chinya (February 1998). "Using Name-Based Mapping Schemes to Increase Hit Rates". IEEE/ACM Transactions on Networking 6 (1): 1–14. doi:10.1109/90.663936.
3. Blazevic, Ljubica. "Distributed Core Multicast (DCM): a routing protocol for many small groups with application to mobile IP telephony". IETF Draft. IETF. Retrieved September 17, 2013.
4. Fenner, B. "Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised)". IETF RFC. IETF. Retrieved September 17, 2013.
5. Valloppillil, Vinod; Ross, Kenneth. "Cache Array Routing Protocol v1.0". Internet Draft. Retrieved September 15, 2013.
6. "Cache Array Routing Protocol and Microsoft Proxy Server 2.0". Microsoft. Retrieved September 15, 2013.
7. Mayank, Anup; Ravishankar, Chinya (2006). "Supporting mobile device communications in the presence of broadcast servers". International Journal of Sensor Networks 2 (1/2): 9–16.
8. Guo, Danhua; Bhuyan, Laxmi; Liu, Bin (October 2012). "An efficient parallelized L7-filter design for multicore servers". IEEE/ACM Transactions on Networking 20 (5): 1426–1439.
9. Wang, Peng; Ravishankar, Chinya (2015). "Key Foisting and Key Stealing Attacks in Sensor Networks". International Journal of Sensor Networks.
10. Wang, Wei; Ravishankar, Chinya (January 2009). "Hash-Based Virtual Hierarchies for Scalable Location Service in Mobile Ad-hoc Networks". Mobile Networks and Applications 14 (5): 625–637. doi:10.1007/s11036-008-0144-3.
11. Mayank, Anup; Phatak, Trivikram; Ravishankar, Chinya (2006). "Decentralized Hash-Based Coordination of Distributed Multimedia Caches". Proc. 5th IEEE International Conference on Networking (ICN'06). Mauritius: IEEE.