11.1 Direct-address tables

11.1-1

Suppose that a dynamic set $S$ is represented by a direct-address table $T$ of length $m$. Describe a procedure that finds the maximum element of $S$. What is the worst-case performance of your procedure?

As the dynamic set $S$ is represented by the direct-address table $T$, for each key $k$ in $S$, there is a slot $k$ in $T$ points to it. If no element with key $k$ in $S$, then $T[k] = \text{NIL}$. Using this property, we can find the maximum element of $S$ by traversing down from the highest slot to seek the first non-$\text{NIL}$ one.

1
2
MAXIMUM(S)
    return TABLE-MAXIMUM(T, m - 1)
1
2
3
4
5
6
TABLE-MAXIMUM(T, l)
    if l < 0
        return NIL
    else if DIRECT-ADDRESS-SEARCH(T, l) != NIL
        return l
    else return TABLE-MAXIMUM(T, l - 1)

The $\text{TABLE-MAXIMUM}$ procedure gest down and checks $1$ sloc at a time, linearly approaches the solution. In the worst case where $S$ is empty, $\text{TABLE-MAXIMUM}$ examines $m$ slots. Therefore, the worst-case performance of $\text{MAXIMUM}$ is $O(m)$, where $m$ is the length of the direct-address table $T$.

11.1-2

A bit vector is simply an array of bits ($0$s and $1$s). A bit vector of length $m$ takes much less space than an array of $m$ pointers. Describe how to use a bit vector to represent a dynamic set of distinct elements with no satellite data. Dictionary operations should run in $O(1)$ time.

Using the bit vector data structure, we can represent keys less than $m$ by a string of $m$ bits, denoted by $V[0..m - 1]$, in which each position that occupied by the bit $1$, corresponds to a key in the set $S$. If the set contains no element with key $k$, then $V[k] = 0$. For instance, we can store the set $\{2, 4, 6, 10, 16\}$ in a bit vector of length $20$:

$$001010100010000010000$$

1
2
3
4
BITMAP-SEARCH(V, k)
    if V[k] != 0
        return k
    else return NIL
1
2
BITMAP-INSERT(V, x)
    V[x] = 1
1
2
BITMAP-DELETE(V, x)
    V[x] = 0

Each of these operations takes only $O(1)$ time.

11.1-3

Suggest how to implement a direct-address table in which the keys of stored elements do not need to be distinct and the elements can have satellite data. All three dictionary operations ($\text{INSERT}$, $\text{DELETE}$, and $\text{SEARCH}$) should run in $O(1)$ time. (Don't forget that $\text{DELETE}$ takes as an argument a pointer to an object to be deleted, not a key.)

Assuming that fetching an element should return the satellite data of all the stored elements, we can have each key map to a doubly linked list.

  • $\text{INSERT}$: appends the element to the list in constant time
  • $\text{DELETE}$: removes the element from the linked list in constant time (the element contains pointers to the previous and next element)
  • $\text{SEARCH}$: returns the first element, which is a node in a linked list, in constant time

11.1-4 $\star$

We wish to implement a dictionary by using direct addressing on a huge array. At the start, the array entries may contain garbage, and initializing the entire array is impractical because of its size. Describe a scheme for implementing a direct-address dictionary on a huge array. Each stored object should use $O(1)$ space; the operations $\text{SEARCH}$, $\text{INSERT}$, and $\text{DELETE}$ should take $O(1)$ time each; and initializing the data structure should take $O(1)$ time. ($\textit{Hint:}$ Use an additional array, treated somewhat like a stack whose size is the number of keys actually stored in the dictionary, to help determine whether a given entry in the huge array is valid or not.)

The additional data structure will be a stack $S$.

Initially, set $S$ to be empty, and do nothing to initialize the huge array. Each object stored in the huge array will have two parts: the key value, and a pointer to an element of $S$, which contains a pointer back to the object in the huge array.

  • To insert $x$, push an element $y$ to the stack which contains a pointer to position $x$ in the huge array. Update position $A[x]$ in the huge array $A$ to contain a pointer to $y$ in $S$.

  • To search for $x$, go to position $x$ of $A$ and go to the location stored there. If that location is an element of $S$ which contains a pointer to $A[x]$, then we know $x$ is in $A$. Otherwise, $x \notin A$.

  • To delete $x$, invalidate the element of $S$ which is pointed to by $A[x]$. Because there may be "holes" in $S$ now, we need to pop an item from $S$, move it to the position of the "hole", and update the pointer in $A$ accordingly. Each of these takes $O(1)$ time and there are at most as many elements in $S$ as there are valid elements in $A$.