Judy array

In computer science and software engineering, a Judy array is a data structure that implements an associative array with high performance and low memory usage. Unlike normal arrays, Judy arrays may be sparse; that is, they may have large ranges of unassigned indices. They can be used for storing and looking up values using integer or string keys. The key benefits of using a Judy array are its scalability, high performance, memory efficiency and ease of use.[1]

Judy arrays are both speed- and memory-efficient, and can therefore sometimes replace common in-memory dictionary implementations such as red-black trees or hash tables.

Roughly speaking, Judy arrays are highly optimized 256-ary radix trees.[2] Judy arrays use over 20 different compression techniques on trie nodes to reduce memory usage.
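
The radix-tree idea can be illustrated with a minimal sketch in Python. It assumes fixed-width unsigned integer keys that are split into bytes, with each byte selecting one of up to 256 children at its level. The class name RadixTree256 and its methods are hypothetical, and plain dictionaries stand in for nodes; none of the node compression techniques of a real Judy array are reproduced here.

```python
class RadixTree256:
    """Uncompressed 256-ary radix tree over fixed-width unsigned integer keys.

    Illustrative only: each node is a plain dict mapping a key byte (0-255)
    to a child node. A real Judy array uses compact, specialized node layouts.
    """

    def __init__(self, key_bytes=4):
        self.key_bytes = key_bytes  # 4 bytes -> keys in the range 0 .. 2**32 - 1
        self.root = {}

    def _digits(self, key):
        # Big-endian, so the most significant byte selects the first branch.
        return key.to_bytes(self.key_bytes, "big")

    def insert(self, key, value):
        node = self.root
        for b in self._digits(key):
            node = node.setdefault(b, {})  # create the child only if missing
        node["value"] = value  # string key cannot collide with integer byte keys

    def get(self, key, default=None):
        node = self.root
        for b in self._digits(key):
            node = node.get(b)
            if node is None:  # no branch for this byte: the key is absent
                return default
        return node.get("value", default)


tree = RadixTree256()
tree.insert(7, "seven")
tree.insert(3_000_000_000, "three billion")  # far-apart keys, no dense array needed
print(tree.get(7))   # seven
print(tree.get(8))   # None
```

A lookup in this sketch visits at most one node per key byte, regardless of how many keys are stored.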

The Judy array was invented by Douglas Baskins and named after his sister.[3]

Benefits

Memory allocation

Judy arrays are dynamic and can grow or shrink as elements are added to, or removed from, the array. The memory used by a Judy array is nearly proportional to the number of elements it stores, not to the range of its indices.
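
Growing and shrinking can be sketched with a simplified byte-wise trie in Python, in which deleting a key also frees any nodes left empty. The function names insert, delete and count_nodes are hypothetical, and the node count is only a rough stand-in for memory use; a real Judy array manages memory in a far more compact way.

```python
def insert(root, key, value, key_bytes=4):
    """Store value under an unsigned integer key, creating nodes as needed."""
    node = root
    for b in key.to_bytes(key_bytes, "big"):
        node = node.setdefault(b, {})
    node["value"] = value


def delete(root, key, key_bytes=4):
    """Remove key and prune nodes that become empty. Returns True if removed."""
    path = []
    node = root
    for b in key.to_bytes(key_bytes, "big"):
        if b not in node:
            return False
        path.append((node, b))
        node = node[b]
    if "value" not in node:
        return False
    del node["value"]
    # Walk back toward the root, freeing each node that is now empty.
    for parent, b in reversed(path):
        if parent[b]:  # child still holds other keys: stop pruning
            break
        del parent[b]
    return True


def count_nodes(node):
    """Count this node plus all descendants (a rough proxy for memory use)."""
    return 1 + sum(count_nodes(c) for k, c in node.items() if k != "value")


root = {}
for k in (5, 6, 2**31):
    insert(root, k, str(k))
print(count_nodes(root))   # 10: root, 5 nodes for keys 5 and 6, 4 for 2**31

delete(root, 2**31)
print(count_nodes(root))   # 6: the branch for 2**31 has been freed
```

In this sketch the node count follows the keys actually stored: the gap of roughly two billion unused indices between 6 and 2**31 costs nothing.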

Speed

Judy arrays are designed to keep the number of processor cache-line fills as low as possible, and the algorithm is internally complex in an attempt to satisfy this goal as often as possible. Due to these cache optimizations, Judy arrays are fast, especially for very large data sets. On data sets that are sequential or nearly sequential, Judy arrays can even outperform hash tables.[4] Lastly, because a Judy array is a tree, it is possible to do an ordered sequential traversal of keys, which hash tables do not support without separately sorting the keys.
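
The ordered traversal mentioned above can be sketched in Python on a simplified byte-wise trie. Visiting the children of every node in ascending byte order yields the keys in ascending numeric order, provided keys are split most significant byte first. The function names insert and items_in_order are hypothetical, and the per-node sort is an artifact of using plain dictionaries as nodes; it is not a description of how a Judy array stores its children.

```python
def insert(root, key, value, key_bytes=4):
    """Store value under an unsigned integer key in a byte-wise trie."""
    node = root
    for b in key.to_bytes(key_bytes, "big"):  # most significant byte first
        node = node.setdefault(b, {})
    node["value"] = value


def items_in_order(node, prefix=b""):
    """Yield (key, value) pairs in ascending key order."""
    if "value" in node:
        yield int.from_bytes(prefix, "big"), node["value"]
    # Visit children by ascending byte; "value" is the only non-integer entry.
    for b in sorted(k for k in node if k != "value"):
        yield from items_in_order(node[b], prefix + bytes([b]))


root = {}
for k in (300, 2, 70000, 255):   # inserted in arbitrary order
    insert(root, k, f"v{k}")

print([k for k, _ in items_in_order(root)])   # [2, 255, 300, 70000]
```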

Drawbacks

Judy arrays are extremely complicated. The smallest implementations are thousands of lines of code.[3] In addition, Judy arrays are optimized for machines with 64-byte cache lines, making them essentially unportable without a significant rewrite.[4]

References

This article is issued from Wikipedia, version of Friday, January 22, 2016. The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply for the media files.