CSC 378: Data Structures and Algorithm Analysis

Optimal Lists

Optimal Lists

For non-uniform query distributions, how can we organize a linked list for minimum expected access time? We'll assume that the item sought is always present.

Problem: What ordering of items minimizes index1.gif ?

Trying examples is a good problem solving technique:

Two-item example

Consider two items, a and b, for which P(a) = 0.7 and P(b) = 0.3. There are two ways to order these items in the list:

List ab : index2.gif .

List ba : index3.gif .

Three-item example

Now consider three items, a, b, and c, for which P(a) = 0.3, P(b) = 0.5, and P(c) = 0.2. There are six ways to order these items:

List
Expected Search Time
abc
acb
bac
bca
cab
cba
(.3 * 1) + (.5 * 2) + (.2 * 3) = 1.9
(.3 * 1) + (.2 * 2) + (.5 * 3) = 2.2
(.5 * 1) + (.3 * 2) + (.2 * 3) = 1.7
(.5 * 1) + (.2 * 2) + (.3 * 3) = 1.8
(.2 * 1) + (.3 * 2) + (.5 * 3) = 2.3
(.2 * 1) + (.5 * 2) + (.3 * 3) = 2.1

Look at the pattern:

Idea: Order the list by decreasing probability of access!

Optimal List Theorem

Theorem

A list of elements ordered by decreasing probability of access has minimum expected access time.

Proof Sketch

In list index4.gif , let element index5.gif be in the index6.gif position in the list:

index7.gif

Let's try a proof by contradiction: Assume that index8.gif has minimum index9.gif and that there is some pair of adjacent elements that are not ordered by decreasing probability of access:

index10.gif

If we can show that index11.gif does not have minimum cost, then the assumption is contradicted and

index12.gif

in the minimum-cost list.

The expected access time in index13.gif is

index14.gif

Create another list, index15.gif , in which elements index16.gif and index17.gif are swapped. Let index18.gif be the access time in this list.

You might guess that this list has lower expected access time, since index19.gif , which has higher probability of access, is now closer to the front of the list. Let's see:

index20.gif

If the new list has lower expected access time, then index21.gif . In other words, index22.gif .

index23.gif

But our assumption was that index24.gif . This means that

index25.gif

and the new list does have lower expected access time!

Thus, the assumption is contradicted and we must conclude that, if index26.gif has minimum expected access time,

index27.gif




Back to the lecture index

Copyright 1998 James Stewart