One of the most neglected data structures in the CS repertoire is

the heap. Unfortunately, the jargon is overloaded, so “heap” can mean

two different things – but there is a connection, which I’ll explain

in a little while.

In many programming languages, when you can dynamically create

objects in memory, the space from which you allocate the memory to

create them is called the heap. That is *not* what I’m talking about.

What I am talking about is an extremely simple but powerful structure

that is used for maintaining a collection of values from which you want

to be able to quickly remove the largest object. This may sound

trivial, but this turns out to be an incredibly useful structure. You can use this basic structure to implement a very fast sorting algorithm; to maintain a prioritized set of values/tasks; to manage chunks of memory; etc. (The

prioritized set is the most common case in my experience, but that’s probably because I’ve spent so much of my career working on distributed systems.)

When you’re talking about data structures, one of the most important things you need to do in order to choose the best one is to understand exactly what properties each structure provides: which operations need to be fast; which operations you need to have, but don’t care about the speed of; how fast things need to be; how much memory you need to create one; and so on.

In a heap, what we want is to be able to add an element

to the heap quickly, and to be able to remove the largest element quickly. We’re *not* going to remove arbitrary elements of the heap: we only remove the largest. We can’t query the heap – that is, we can’t ask “Is X in the heap?” – all we can do is remove the largest value. And while providing these properties, we want to use as little memory as possible.

Now, I haven’t yet defined what I mean by “quickly” in the above definition. There’s a good reason for that: there are a set of tradeoffs

that produce different solutions. For now, we’ll settle for one

version of “quickly”: O(lg n) – that is, in the worst case, the amount of time it takes to either remove the largest value, or to add a new value, is

proportional to the logarithm of the number of values in the heap. We can

provide that kind of performance with a kind of heap called a *binary heap*.

A binary heap is basically a binary tree with three properties, called

the *heap properties* of the tree:

- For any node in the tree, the node is larger than its children.
- Every level of the tree except for the leaves is full.
- The leaf level of the tree is full from the leftmost leaf to the last leaf. (In other words, the last level is filled left to right.)
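These properties lend themselves to a quick sanity check in code. Here’s a hedged sketch – it jumps ahead to the standard array layout, where the children of node *i* live at indices 2i+1 and 2i+2 (a representation choice I haven’t introduced yet). In that layout the two shape properties hold by construction, so only the ordering property needs testing:

```python
def is_max_heap(a):
    """Check the heap-order property on a list in the standard array
    layout: the children of a[i] live at a[2*i+1] and a[2*i+2].
    The levels are full and filled left-to-right by construction,
    so only parent >= child needs checking."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))

# A valid max-heap, listed level by level:
assert is_max_heap([11, 10, 7, 9, 5, 6, 4, 8, 2, 3, 1])
# A sorted-ascending list is not a max-heap:
assert not is_max_heap([1, 2, 3])
```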

So, for example, this figure is a binary heap. It’s got four levels. The first three levels are full; the fourth level is full on its left side. The next insert point would be the next leftmost open slot, the left child of 6.

We can write a heap in a textual form using parens: we write each node

as a triple: (x, y, z), where x is the value of the parent, and y and z are the parenthesized heaps that are the children of x. So the same heap that I showed above can be written in text form as:

(11,(10,(9,(8,,),(2,,)),(5,(3,,),(1,,))),(7,(6,,),(4,,))).

To make it easier to read, we can stretch it across a couple of lines:

(11, (10, (9, (8,,), (2,,)), (5, (3,,), (1,,))), (7, (6,,), (4,,)))
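This text form is mechanical enough to generate in code. As a sketch (again assuming the array layout, where node *i*’s children are at 2i+1 and 2i+2 – not something the text has established yet), producing the paren notation is a short recursion:

```python
def to_parens(a, i=0):
    """Render an array-layout heap in the (x, left, right) text form,
    where an absent child is written as an empty slot."""
    if i >= len(a):
        return ""
    left = to_parens(a, 2 * i + 1)
    right = to_parens(a, 2 * i + 2)
    return f"({a[i]},{left},{right})"

# The same heap as above, listed level by level:
print(to_parens([11, 10, 7, 9, 5, 6, 4, 8, 2, 3, 1]))
# → (11,(10,(9,(8,,),(2,,)),(5,(3,,),(1,,))),(7,(6,,),(4,,)))
```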

For the moment, we’ll ignore just how this tree is represented. We’ll get

to that later.

It should be obvious how to find the largest element – it’s the root of

the binary heap. But once we remove it, then we don’t have a heap: there’s a hole at the top – so part of the operation of removing the maximum element

has to be re-validating the heap properties.

The way that we do this is called *bubble-down*. What we do is take the last element in the heap – the rightmost leaf – and insert it as the root. Now there’s no hole, but the heap isn’t valid. So we look at the two children of the root, and take the *larger* one, and swap it with the root. Now the root is larger than both of its children, but the child-heap that just got a new root may be invalid. So we check it, and if it’s not valid, we repeat the bubble-down operation on that child-heap.
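The bubble-down steps can be sketched in code. This is a hedged sketch, not the implementation I’m deferring until later – it assumes the standard array layout, where node *i*’s children sit at indices 2i+1 and 2i+2:

```python
def remove_max(a):
    """Remove and return the largest value from a non-empty max-heap
    stored in array layout: move the last leaf to the root, then
    bubble it down, swapping with the larger child until the heap
    property holds again."""
    top = a[0]
    last = a.pop()          # the rightmost leaf
    if a:
        a[0] = last         # fill the hole at the root
        i = 0
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < len(a) and a[left] > a[largest]:
                largest = left
            if right < len(a) and a[right] > a[largest]:
                largest = right
            if largest == i:
                break       # both children are smaller: valid again
            a[i], a[largest] = a[largest], a[i]
            i = largest
    return top

h = [11, 10, 7, 9, 5, 6, 4, 8, 2, 3, 1]  # the example heap, level by level
print(remove_max(h))  # → 11
```

Running this on the example heap reproduces the worked example below: it returns 11, and three swaps later the array holds 10, 9, 7, 8, 5, 6, 4, 1, 2, 3 – the final valid heap, level by level.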

Let’s look at that with an example. Take the heap from the

figure above. If we remove the largest element (11), and then

replace it with the last element of the heap, we get the following

invalid heap:

(1, (10, (9, (8,,), (2,,)), (5, (3,,),)), (7, (6,,), (4,,)))

Now, we look at the two children, 10 and 7. 10 is larger, so we swap

the 1 and the 10, which gives us:

(10, (1, (9, (8,,), (2,,)), (5, (3,,),)), (7, (6,,), (4,,)))

Now, we look at the modified subheap – the left child heap. Its two

children are 9 and 5, both larger than 1. 9 is larger, so we swap:

(10, (9, (1, (8,,), (2,,)), (5, (3,,),)), (7, (6,,), (4,,)))

Now we’ve modified the left-left child subheap, and it’s invalid, so we

again swap for the larger child:

(10, (9, (8, (1,,), (2,,)), (5, (3,,),)), (7, (6,,), (4,,)))

Now our heap is valid again.

So how long did that take? We removed the root, and replaced it with the last leaf. That’s a couple of steps – not the dominant cost of the operation. Then we did the bubble-down – that’s the main cost. So how many steps are there in a bubble-down? In the worst case – like the example above – you’ll need to do a number of swaps equal to the depth of the tree. The depth of the tree is lg(n), where n is the number of values in the tree. So we do at most lg(n) swaps – so the cost of removing the largest element is O(lg n).

What about adding an element? It’s basically the inverse of removing the root; it’s a *bubble-up*. We insert the value into the position

*after* the last leaf. Then we compare the leaf to its parent. If it’s

larger than its parent, we swap the node with its parent. And we repeat going

up the tree.

Let’s look at a quick example of insert. Say we wanted to insert 12

into our heap. We’d start by adding 12 as a leaf:

(10, (9, (8, (1,,), (2,,)), (5, (3,,), (12,,))), (7, (6,,), (4,,)))

It’s larger than its parent, 5, so we swap:

(10, (9, (8, (1,,), (2,,)), (12, (3,,), (5,,))), (7, (6,,), (4,,)))

And so on – we’ll swap two more times, giving us a new heap:

(12, (10, (8, (1,,), (2,,)), (9, (3,,), (5,,))), (7, (6,,), (4,,)))

So again, we’re looking at a maximum of lg(n) swaps to

do an insert. So we’re O(lg n) for insert.
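Bubble-up is even shorter to sketch than bubble-down. Again, this is a hedged sketch assuming the standard array layout (node *i*’s parent is at index (i−1)//2), not the implementation I’m deferring:

```python
def insert(a, value):
    """Add a value to a max-heap in array layout: place it in the slot
    after the last leaf, then bubble it up, swapping with its parent
    while it is the larger of the two."""
    a.append(value)
    i = len(a) - 1
    while i > 0 and a[(i - 1) // 2] < a[i]:
        a[i], a[(i - 1) // 2] = a[(i - 1) // 2], a[i]
        i = (i - 1) // 2

h = [10, 9, 7, 8, 5, 6, 4, 1, 2, 3]  # the heap after removing 11, level by level
insert(h, 12)  # three swaps later, 12 is at the root
print(h[0])    # → 12
```

Tracing this on the example above performs exactly the three swaps from the worked insert: 12 passes 5, then 9, then 10, and ends up as the new root.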

I’m running out of writing time, so I’ll have to leave a

description of how to represent a heap, and an example

implementation of a heap for this evening. But I’ll leave you with one

quick example of how a heap can be used. Suppose I want to sort

a list of integers. We know that the best worst-case time for

sorting is O(n lg n). We can implement a very simple sort by inserting all

of the values into a heap, and then removing them in order. That’s n inserts, each with maximum cost lg(n); and n removes, maximum cost lg(n). So it’s

O(n lg n) to sort the list – and no O(n^2) worst case like quicksort!
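That sorting idea fits in one self-contained sketch – again assuming the standard array layout, with bubble-up inserts and bubble-down removals inlined:

```python
def heap_sort(values):
    """Sort by inserting every value into a max-heap (bubble-up), then
    repeatedly removing the largest (bubble-down): n O(lg n) inserts
    plus n O(lg n) removals, O(n lg n) overall."""
    heap = []
    for v in values:                    # bubble-up insert
        heap.append(v)
        i = len(heap) - 1
        while i > 0 and heap[(i - 1) // 2] < heap[i]:
            heap[i], heap[(i - 1) // 2] = heap[(i - 1) // 2], heap[i]
            i = (i - 1) // 2
    out = []
    while heap:                         # bubble-down removal
        out.append(heap[0])
        last = heap.pop()
        if heap:
            heap[0] = last
            i = 0
            while True:
                l, r, big = 2 * i + 1, 2 * i + 2, i
                if l < len(heap) and heap[l] > heap[big]:
                    big = l
                if r < len(heap) and heap[r] > heap[big]:
                    big = r
                if big == i:
                    break
                heap[i], heap[big] = heap[big], heap[i]
                i = big
    out.reverse()                       # largest-first -> ascending
    return out

print(heap_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # → [1, 1, 2, 3, 4, 5, 6, 9]
```

(Python’s standard library ships this machinery as the `heapq` module, built on a min-heap rather than a max-heap.)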