Binary heap
From Wikipedia, the free encyclopedia
Binary heaps are a particularly simple kind of heap data structure created using a binary tree. It can be seen as a binary tree with two additional constraints:
- The shape property: the tree is either a perfectly balanced binary tree (all leaves are at the same level), or, if the last level of the tree is not complete, the nodes are filled from left to right.
- The heap property: each node is greater than or equal to each of its children according to some comparison predicate which is fixed for the entire data structure.
"Greater than" means according to whatever comparison function is chosen to sort the heap, not necessarily "greater than" in the mathematical sense (since the quantities are not even always numerical). Heaps where the comparison function is mathematical "greater than" are called max-heaps; those where the comparison function is mathematical "less than" are called "min-heaps". Conventionally, min-heaps are used, since they are readily applicable for use in priority queues.
Note that the ordering of siblings in a heap is not specified by the heap property, so the two children of a parent can be freely interchanged (as long as this does not violate the shape property, compare with treap).
Contents |
[edit] Heap operations
[edit] Adding to the heap
If we have a heap, and we add an element, we can perform an operation known as up-heap, bubble-up, percolate-up, or sift-up in order to restore the heap property. We can do this in O(log n) time, using a binary heap, by following this algorithm:
- Add the element on the bottom level of the heap.
- Compare the added element with its parent; if they are in the correct order, stop.
- If not, swap the element with its parent and return to the previous step.
We do this at maximum for each level in the tree — the height of the tree, which is O(log n). However, since approximately 50% of the elements are leaves and 75% are in the bottom two levels, it is likely that the new element to be inserted will only move a few levels upwards to maintain the heap. Thus, binary heaps support insertion in average constant time, O(1).
Say we have a max-heap
and we want to add the number 15 to the heap. We first place the 15 in the position marked by the X. However the heap property is violated since 15 is greater than 8, so we need to swap the 15 and the 8. So, we have the heap looking as follows after the first swap:
However the heap property is still violated since 15 is greater than 11, so we need to swap again:
which is a valid max-heap.
[edit] Deleting the root from the heap
The procedure for deleting the root from the heap — effectively extracting the maximum element in a max-heap or the minimum element in a min-heap — is similar to up-heap. To do this we remove the root then replace it with the last element on the last level. So, if we have the same max-heap as before, we remove the 11 and replace it with the 4.
Now the heap property is violated since 8 is greater than 4. If we swap these two elements, we have restored the heap property (Note: 4 is swapped with its larger child, if this were a min-heap it would have been swapped with its smaller child) and we need not swap elements further:
[edit] Heap implementation
It is perfectly possible to use a traditional binary tree data structure to implement a binary heap. There is an issue with finding the adjacent element on the last level on the binary heap when adding an element which can be resolved algorithmically or by adding extra data to the nodes, called "threading" the tree — that is, instead of merely storing references to the children, we store the inorder successor of the node as well.
However, a more common approach, and an approach aligned with the theory behind heaps, is to store the heap in an array. Any binary tree can be stored in an array, but because a heap is always an almost complete binary tree, it can be stored compactly. No space is required for pointers; instead, for each index i, element a[i] is the parent of two children a[2i+1] and a[2i+2], as shown in the figure. (Note that with an implementation starting at 0 for the root, the parent of a[i] is a[floor((i−1)/2)]. ) This is a simple example of an implicit data structure.
This approach is particularly useful in the heapsort algorithm, where it allows the space in the input array to be reused to store the heap.
The upheap/downheap operations can be stated then in terms of an array as follows: suppose that the heap property holds for the indices b, b+1, ..., e. The sift-down function extends the heap property to b-1, b, b+1, ..., e. Only index i = b-1 can violate the heap property. Let j be the index of the largest child of a[i] (for a max-heap, or the smallest child for a min-heap) within the range b, ..., e. (If no such index exists because 2i > e then the heap property holds for the newly extended range and nothing needs to be done.) By swapping the values a[i] and a[j] the heap property for position i is established. At this point, the only problem is that the heap property might not hold for index j. The sift-down function is applied tail-recursively to index j until the heap property is established for all elements.
The sift-down function is fast. In each step it only needs two comparisons and one swap. The index value where it is working doubles in each iteration, so that at most log2 e steps are required.
The merge operation in a binary heap takes Θ(n) for equal-sized heaps. When merging is a common task, a different heap implementation is recommended, such as a binomial heap.