| Institute of Applied Iconoclasm
|
This page features an introduction to the priority queue data structure and a Public Domain ANSI C reference implementation thereof. The implementation internally uses a flat array representation of a binary heap for storing the portions of data enqueued. The implementation is original in so far as it, despite using a plain array, it does not impose any intrinsic limit regarding the number of items the queue can accommodate; if you fill up the array, the insertion function will automatically allocate a bigger array for you.
A basic queue is a one-dimensional last in-last out container structure. The term one-dimensional alludes to the fact that queues basically look like lists or an arrays of portions of data and that each queue therefore has two ends. Items are added to one end of the queue and are removed from its other end; this means that when you take an item from the queue you always get the "oldest" item it contains -- the item that has, of all items currently enqueued, been in the queue for the longest time. A priority queue is different from a regular queue in that the items it contains are not arranged "chronologically" (in the order of their respective times of enqueuement) but by priority. When you remove an item from a priority queue you get the item that has, of all items, the highest priority.
Of course you could implement a priority queue as a simple array
or linked list, in such way that
With this obvious approach, item insertion would take but constant time; however, item removal would take linear time. There are several implementation strategies that yield a far better overall performance, one of them being the binary heap priority queue.
This definition requires us to take three steps; namely, binary tree,
binary heap, and binary heap priority queue.
First, we need to define the term binary tree. A
binary tree is a set of interlinked records (nodes, vertices) such
that
The nodes to which links from a are leading are called the subnodes, children, or child nodes of a. Accordingly, a is called the parent node of its children. It is easily seen that any nonempty tree must have exactly one node to which no links are leading and from which all other nodes can be reached by travelling along one or more links. This distinguished node is referred to as the root node. (You may want to check out the binary search tree page for a different approach to binary trees.)
A binary heap basically is a binary tree for which the
following two statements hold:
Binary heaps have two interesting properties. Firstly,
the record bearing the highest key value is always the root record.
Secondly, insertion or removal of records (in such way that the binary heap
still is a binary heap afterwards) takes O(log n) time,
where n is the number of items in the heap.) Outline
of proof:
Finally, a binary heap priority queue simply is a priority queue internally using a binary heap to store its items.
One does not need pointers and dynamic memory allocation to implement
a binary heap. A plain flat one-dimensional array will do fine. Assume
that you are dealing with a binary heap holding n elements.
The elements may be stored in an array with n slots in which
Clearly, the root node sits in slot 1, its children sit in slots 2 and 3, respectively, their children in turn are occupying slot 4, 5, 6, and 7, respectively, and so on. There is a straightforward one-to-one correspondence between binary heaps and flattened-out array representations of binary heaps. It is important to note that, since the link relationship (or lack thereof) between any two nodes is obvious directly from their respective slot indices, we do not need to explicitly store any links, thus saving substantial amounts of time and space.
Let A be the priority queue's underlying item array, and
let n be the number of items this array currently contains.
At any time, the queue's n items will be in
A[1], A[2], ..., A[n]. Thus, the index of the first available slot
will be n+1, and there will not be an available slot between
any two items.
Now, let a be a portion of data to be enqueued. To insert a,
proceed as follows:
Let A be the prioriy queue's underlying item array, and let
n be the number of the portions of data the priority queue
currently contains. To remove an item from the queue,
proceed as outlined below:
Inserting and removing an element both takes O(log n) time, where n is the number of portions of data in the queue. Just accessing the top-priority item without removing it takes constant time, naturally. There isn't too much else to say about the binary heap priority queue data structure per se, as there does exist a wide variety of implementation approaches, each behaving differently. (You may want to check out the links section.)
To obtain and run the reference implementation, proceed as follows:
PQDATUM and PQPRIO macros:
#define PQDATUM directive specifies what kind
of data you plan to store in the queue. You may use either
primitive data types or pointers to structs. In the
latter case, you must allocate and deallocate the actual
structures yourself; the queue code will handle
the pointers as if the were, for example, ints.
#define PQPRIO directive defines a macro that
calculates a PQDATUM's priority, i. e., a scalar value
used to determine the order in which the items currently enqueued are
going to be retrieved. It is assumed that the PQPRIO macro
returns an int. If you are using the queue to store
strings or other atomic data types, your PQPRIO macro
may consist of a call to some kind of
hash function.
If you are using the queue to store structs, the
struct will probably
contain some kind of priority variable, and the PQPRIO
macro will simply dereference the struct pointer and
return the priority it finds.
Sample macro definitions:
struct somestruct { int prio; /* additional variables go here */ };
#define PQDATUM struct somestruct *
#define PQPRIO(p) (p->prio)
The queue's item order is defined like this:
PQPRIO(r) > PQPRIO(s), then r
will be retrieved before s, no matter
which of these two items has been inserted first.
PQPRIO(r) = PQPRIO(s), then the item inserted
first will be retrieved first. In other words, if all items queued
have the same priority attached to them, the priority queue will act
as a regular queue.
The implementation uses an array rather than a linked list for storing the queue items. However, as mentioned above, the array can grow dynamically as needed; the number of items you can store in the queue therefore depends only on the amount of memory available. The number of different priority values available is not an issue, as the implementation can handle situations in which two or more portions of data share the same priority.
The queue structure contains a pointer to the array in which the
queue's items are being stored. Since the array's size is subject to
change at run time, additional variables are needed to keep track of
avail),
size), and
step).
The source code:
/*
* Priority queue structure
*/
struct pqueue
{
int size, avail, step;
PQDATUM *d;
};
Prior to use, the queue structure needs to be initialized. The item array must be allocated and its companion variables must be set up to reflect the array's size and state. The function takes a pointer to a queue structure as its first formal parameter; if you hand it a NULL pointer, the function will attempt to allocate the queue for you, returning NULL if it fails to do so due to insufficient memory. The second parameter specifies how many items the array should be able to hold initially. At the same time, it specifies by how many items the array should be grown in case it gets filled up later on.
The source code:
/*
* pqinit: initialize the queue.
*
* Parameters:
*
* q Pointer to a priority queue, or NULL if the user
* wishes to leave it to pqinit to allocate the queue.
*
* n Number of queue items for which memory should be
* preallocated, that is, the initial size of the
* item array the queue uses. If you insert more than
* n items to the queue, another n items will
* be allocated automatically.
*
* Return values:
*
* non-NULL Priority queue has been initialized.
*
* NULL Insufficient memory.
*/
struct pqueue *
pqinit(struct pqueue *q, int n)
{
struct pqueue *tmp = q;
if (!q && !(q = malloc(sizeof(struct pqueue)))) {
return NULL;
}
if (!(q->d = malloc(sizeof(PQDATUM) * n))) {
if (!tmp) free(q);
return NULL;
}
q->avail = q->step = n;
q->size = 1;
return q;
}
The pqinsert function inserts an item into the queue.
If the queue already contains as many items as its array can hold, the
array is grown first. The actual enqueueing operation is done according to the
customary heap insertion algorithm described
above.
The source code:
/*
* pqinsert: insert an item into the queue.
*
* Parameters:
*
* q Pointer to a priority queue.
*
* d Datum to be inserted.
*
* Return values:
*
* 1 The item has been inserted.
*
* 0 The item could not be appended. Either the queue
* pointer provided was NULL, or the function was unable
* to allocate the amount of memory needed for
* the new item.
*/
int
pqinsert(struct pqueue *q, PQDATUM d)
{
PQDATUM *tmp;
int i, newsize;
if (!q) return 0;
/* allocate more memory if necessary */
if (q->size >= q->avail) {
newsize = q->size + q->step;
if (!(tmp = realloc(q->d, sizeof(PQDATUM) * newsize))) {
return 0;
};
q->d = tmp;
q->avail = newsize;
}
/* insert item */
i = q->size++;
while (i > 1 && PQPRIO(q->d[i / 2]) < PQPRIO(d)) {
q->d[i] = q->d[i / 2];
i /= 2;
}
q->d[i] = d;
return 1;
}
The pqremove function returns the datum attached to the
queue's top-priority item and removes the item from the queue. Just like
pqinsert, this function uses a standard,
no-special-effects heap-related algorithm.
The source code:
/*
* pqremove: remove the highest-ranking item from the queue.
*
* Parameters:
*
* p Pointer to a priority queue.
*
* d Pointer to the PQDATUM variable that will hold the
* datum corresponding to the queue item removed.
*
* Return values:
*
* non-NULL An item has been removed. The variable that d points
* to now contains the datum associated with the item
* in question.
*
* NULL No item could be removed. Either the queue pointer
* provided was NULL, or the queue was empty. The chunk
* of memory that d points to has not been modified.
*/
PQDATUM *
pqremove(struct pqueue *q, PQDATUM *d)
{
PQDATUM tmp;
int i = 1, j;
if (!q || q->size == 1) return NULL;
*d = q->d[1];
tmp = q->d[--q->size];
while (i <= q->size / 2) {
j = 2 * i;
if (j < q->size &&
PQPRIO(q->d[j]) < PQPRIO(q->d[j + 1])) {
j++;
}
if (PQPRIO(q->d[j]) <= PQPRIO(tmp)) {
break;
}
q->d[i] = q->d[j];
i = j;
}
q->d[i] = tmp;
return d;
}
The pqpeek function returns the datum attached to the
queue's top-priority item without removing the item from the queue.
This operation is fairly trivial;
since heaps keep their highest-ranking element in their top slot at all times,
an elementary, constant-time array lookup is all this takes.
The source code:
/*
* pqpeek: access highest-ranking item without removing it.
*
* Parameters:
*
* q Pointer to a priority queue.
*
* d Pointer to the PQDATUM variable that will hold the
* datum corresponding to the highest-ranking item.
*
* Return values:
*
* non-NULL Success. The variable that d points to now contains
* the datum associated with the highest-ranking item.
*
* NULL Failure. Either the queue pointer provided was NULL,
* or the queue was empty. The chunk of memory that d
* points to has not been modified.
*/
PQDATUM *
pqpeek(struct pqueue *q, PQDATUM *d)
{
if (!q || q->size == 1) return NULL;
*d = q->d[1];
return d;
}
Other Web sites dealing with heaps and priority queues:
Document URL: http://purists.org/pqueue/