purists.org

Institute of Applied Iconoclasm
Last updated July 14, 2001

 

Binary Heap Priority Queue

Introduction and ANSI C Reference Implementation

 

 

What's This Page About?

This page features an introduction to the priority queue data structure and a Public Domain ANSI C reference implementation thereof. The implementation internally uses a flat array representation of a binary heap for storing the portions of data enqueued. The implementation is original in so far as it, despite using a plain array, it does not impose any intrinsic limit regarding the number of items the queue can accommodate; if you fill up the array, the insertion function will automatically allocate a bigger array for you.

  

The Data Structure Explained

What Is a Priority Queue?

A basic queue is a one-dimensional last in-last out container structure. The term one-dimensional alludes to the fact that queues basically look like lists or an arrays of portions of data and that each queue therefore has two ends. Items are added to one end of the queue and are removed from its other end; this means that when you take an item from the queue you always get the "oldest" item it contains -- the item that has, of all items currently enqueued, been in the queue for the longest time. A priority queue is different from a regular queue in that the items it contains are not arranged "chronologically" (in the order of their respective times of enqueuement) but by priority. When you remove an item from a priority queue you get the item that has, of all items, the highest priority.

Of course you could implement a priority queue as a simple array or linked list, in such way that

With this obvious approach, item insertion would take but constant time; however, item removal would take linear time. There are several implementation strategies that yield a far better overall performance, one of them being the binary heap priority queue.

 

What Is a Binary Heap Priority Queue?

This definition requires us to take three steps; namely, binary tree, binary heap, and binary heap priority queue. First, we need to define the term binary tree. A binary tree is a set of interlinked records (nodes, vertices) such that

The nodes to which links from a are leading are called the subnodes, children, or child nodes of a. Accordingly, a is called the parent node of its children. It is easily seen that any nonempty tree must have exactly one node to which no links are leading and from which all other nodes can be reached by travelling along one or more links. This distinguished node is referred to as the root node. (You may want to check out the binary search tree page for a different approach to binary trees.)

A binary heap basically is a binary tree for which the following two statements hold:

  1. Each node is equipped with a scalar key value.
  2. No node in the three has children whose key is higher than its own key. In other words, the keys of the children of any given node a must be less than or equal to the key of a itself.

Binary heaps have two interesting properties. Firstly, the record bearing the highest key value is always the root record. Secondly, insertion or removal of records (in such way that the binary heap still is a binary heap afterwards) takes O(log n) time, where n is the number of items in the heap.) Outline of proof:

Finally, a binary heap priority queue simply is a priority queue internally using a binary heap to store its items.

 

Array Implementation of Binary Heaps

One does not need pointers and dynamic memory allocation to implement a binary heap. A plain flat one-dimensional array will do fine. Assume that you are dealing with a binary heap holding n elements. The elements may be stored in an array with n slots in which

Clearly, the root node sits in slot 1, its children sit in slots 2 and 3, respectively, their children in turn are occupying slot 4, 5, 6, and 7, respectively, and so on. There is a straightforward one-to-one correspondence between binary heaps and flattened-out array representations of binary heaps. It is important to note that, since the link relationship (or lack thereof) between any two nodes is obvious directly from their respective slot indices, we do not need to explicitly store any links, thus saving substantial amounts of time and space.

 

How Do I Enqueue an Item?

Let A be the priority queue's underlying item array, and let n be the number of items this array currently contains. At any time, the queue's n items will be in A[1], A[2], ..., A[n]. Thus, the index of the first available slot will be n+1, and there will not be an available slot between any two items. Now, let a be a portion of data to be enqueued. To insert a, proceed as follows:

  1. Let m = n+1.
  2. Put a into A[m]; that is, put a into the first available slot.
  3. Check whether m=1. If yes, you are done; the algorithm terminates.
  4. Compare the key of A[m] to that of A[m/2].

 

How Do I Remove an Item?

Let A be the prioriy queue's underlying item array, and let n be the number of the portions of data the priority queue currently contains. To remove an item from the queue, proceed as outlined below:

  1. Remove the item in slot A[1]. Let n = n-1 to reflect the fact that the number of items in the queue has decreased by one.
  2. Check whether n=0. If yes, the queue is now empty, and the algorithm terminates.
  3. Move the item in A[n+1] to slot A[1]; i. e., let the last item in the queue take the removed item's place. From now on, we shall refer to this item as item a.
  4. Let u = 1 and v = 1.
  5. Check whether 2u <= n, that is, whether the queue currently contains at least 2u items.
  6. Check whether 2u+1 <= n; that is, whether the queue contains at least 2u+1 elements.
  7. Unless u = v, swap A[u] and A[v].
  8. Let u = v to adjust u to the new slot index of item a. Then go back to step (5).

 

Properties

Inserting and removing an element both takes O(log n) time, where n is the number of portions of data in the queue. Just accessing the top-priority item without removing it takes constant time, naturally. There isn't too much else to say about the binary heap priority queue data structure per se, as there does exist a wide variety of implementation approaches, each behaving differently. (You may want to check out the links section.)

  

The Sample Source Code Explained

How to Use It

To obtain and run the reference implementation, proceed as follows:

Sample macro definitions:

    struct somestruct { int prio; /* additional variables go here */ };
    #define PQDATUM struct somestruct *
    #define PQPRIO(p) (p->prio) 

The queue's item order is defined like this:

 

Properties of this Implementation

The implementation uses an array rather than a linked list for storing the queue items. However, as mentioned above, the array can grow dynamically as needed; the number of items you can store in the queue therefore depends only on the amount of memory available. The number of different priority values available is not an issue, as the implementation can handle situations in which two or more portions of data share the same priority.

 

The Priority Queue Data Structure

The queue structure contains a pointer to the array in which the queue's items are being stored. Since the array's size is subject to change at run time, additional variables are needed to keep track of

The source code:

    /*
     *  Priority queue structure
     */
    struct pqueue 
    {
            int size, avail, step;
            PQDATUM *d;
    }; 

 

Initializing the Queue

Prior to use, the queue structure needs to be initialized. The item array must be allocated and its companion variables must be set up to reflect the array's size and state. The function takes a pointer to a queue structure as its first formal parameter; if you hand it a NULL pointer, the function will attempt to allocate the queue for you, returning NULL if it fails to do so due to insufficient memory. The second parameter specifies how many items the array should be able to hold initially. At the same time, it specifies by how many items the array should be grown in case it gets filled up later on.

The source code:

    /*
     *  pqinit: initialize the queue.
     *
     *  Parameters:
     *
     *    q           Pointer to a priority queue, or NULL if the user
     *                wishes to leave it to pqinit to allocate the queue.
     *
     *    n           Number of queue items for which memory should be
     *                preallocated, that is, the initial size of the
     *                item array the queue uses. If you insert more than
     *                n items to the queue, another n items will
     *                be allocated automatically.
     *
     *  Return values:
     *
     *   non-NULL     Priority queue has been initialized.
     *
     *   NULL         Insufficient memory.
     */
    struct pqueue *
    pqinit(struct pqueue *q, int n)
    {
            struct pqueue *tmp = q;
    
            if (!q && !(q = malloc(sizeof(struct pqueue)))) {
                    return NULL;
            }
            if (!(q->d = malloc(sizeof(PQDATUM) * n))) {
                    if (!tmp) free(q);
                    return NULL;
            }
            q->avail = q->step = n;
            q->size = 1;
            return q;
    } 

 

Enqueueing Items

The pqinsert function inserts an item into the queue. If the queue already contains as many items as its array can hold, the array is grown first. The actual enqueueing operation is done according to the customary heap insertion algorithm described above.

The source code:

    /*                  
     *  pqinsert: insert an item into the queue.
     *
     *  Parameters:
     *
     *    q           Pointer to a priority queue.
     *
     *    d           Datum to be inserted.
     *
     *  Return values:
     *
     *    1           The item has been inserted.
     *
     *    0           The item could not be appended. Either the queue 
     *                pointer provided was NULL, or the function was unable 
     *                to allocate the amount of memory needed for 
     *                the new item.
     */
    int
    pqinsert(struct pqueue *q, PQDATUM d)
    {
            PQDATUM *tmp;
            int i, newsize;
    
            if (!q) return 0;
            
            /* allocate more memory if necessary */
            if (q->size >= q->avail) {
                    newsize = q->size + q->step;
                    if (!(tmp = realloc(q->d, sizeof(PQDATUM) * newsize))) {
                            return 0;
                    };
                    q->d = tmp;
                    q->avail = newsize;                
            }

            /* insert item */
            i = q->size++;
            while (i > 1 && PQPRIO(q->d[i / 2]) < PQPRIO(d)) {
                    q->d[i] = q->d[i / 2];
                    i /= 2;
            }
            q->d[i] = d;
            return 1;        
    } 

 

Removing Items

The pqremove function returns the datum attached to the queue's top-priority item and removes the item from the queue. Just like pqinsert, this function uses a standard, no-special-effects heap-related algorithm.

The source code:

    /*
     *  pqremove: remove the highest-ranking item from the queue.
     *
     *  Parameters:
     *
     *    p           Pointer to a priority queue.
     *
     *    d           Pointer to the PQDATUM variable that will hold the 
     *                datum corresponding to the queue item removed.                
     *
     *  Return values:
     *
     *    non-NULL    An item has been removed. The variable that d points
     *                to now contains the datum associated with the item
     *                in question.
     *
     *    NULL        No item could be removed. Either the queue pointer
     *                provided was NULL, or the queue was empty. The chunk
     *                of memory that d points to has not been modified.
     */
    PQDATUM *
    pqremove(struct pqueue *q, PQDATUM *d)
    {        
            PQDATUM tmp;
            int i = 1, j;
    
            if (!q || q->size == 1) return NULL;
            *d = q->d[1];
            tmp = q->d[--q->size];
            while (i <= q->size / 2) {
                    j = 2 * i;
                    if (j < q->size &&
                            PQPRIO(q->d[j]) < PQPRIO(q->d[j + 1])) {
                            j++;
                    }
                    if (PQPRIO(q->d[j]) <= PQPRIO(tmp)) {
                            break;
                    }
                    q->d[i] = q->d[j];
                    i = j;
            }
            q->d[i] = tmp;
            return d;        
    } 

 

Accessing Items without Removing Them

The pqpeek function returns the datum attached to the queue's top-priority item without removing the item from the queue. This operation is fairly trivial; since heaps keep their highest-ranking element in their top slot at all times, an elementary, constant-time array lookup is all this takes.

The source code:

    /*
     *  pqpeek: access highest-ranking item without removing it.
     *
     *  Parameters:
     *
     *    q           Pointer to a priority queue.
     *
     *    d           Pointer to the PQDATUM variable that will hold the
     *                datum corresponding to the highest-ranking item.
     *                
     *  Return values:
     *
     *    non-NULL   Success. The variable that d points to now contains
     *               the datum associated with the highest-ranking item.
     *
     *    NULL       Failure. Either the queue pointer provided was NULL,
     *               or the queue was empty. The chunk of memory that d
     *               points to has not been modified.
     */
    PQDATUM *
    pqpeek(struct pqueue *q, PQDATUM *d)
    {
            if (!q || q->size == 1) return NULL;
            *d = q->d[1];
            return d;
    } 

 

Link to Related Sites

Other Web sites dealing with heaps and priority queues: