# 21.2 Linked-list representation of disjoint sets

## 21.2-1

Write pseudocode for $\text{MAKE-SET}$, $\text{FIND-SET}$, and $\text{UNION}$ using the linked-list representation and the weighted-union heuristic. Make sure to specify the attributes that you assume for set objects and list objects.

The three algorithms follow the english description and are provided here. There are alternate versions using the weighted union heuristic, suffixed with $\text{WU}$.

 1 2 3 4 5 6 7 MAKE-SET(x) let o be an object with three fields, next, value, and set let L be a linked list object with head = tail = o o.next = NIL o.set = L o.value = x return L 
 1 2 FIND-SET(x) return o.set.head.value 
 1 2 3 4 5 6 7 8 9 UNION(x, y) L1 = x.set L2 = y.set L1.tail.next = L2.head z = L2.head while z.next != NIL z.set = L1 L1.tail = L2.tail return L1 

## 21.2-2

Show the data structure that results and the answers returned by the $\text{FIND-SET}$ operations in the following program. Use the linked-list representation with the weighted-union heuristic.

  1 2 3 4 5 6 7 8 9 10 11  for i = 1 to 16 MAKE-SET(x[i]) for i = 1 to 15 by 2 UNION(x[i], x[i + 1]) for i = 1 to 13 by 4 UNION(x[i], x[i + 2]) UNION(x[1], x[5]) UNION(x[11], x[13]) UNION(x[1], x[10]) FIND-SET(x[2]) FIND-SET(x[9]) 

Assume that if the sets containing $x_i$ and $x_j$ have the same size, then the operation $\text{UNION}(x_i, x_j)$ appends $x_j$'s list onto $x_i$'s list.

Originally we have $16$ sets, each containing $x_i$. In the following, we'll replace $x_i$ by $i$. After the for loop in line 3 we have:

$$\{1,2\}, \{3, 4\}, \{5, 6\}, \{7, 8\}, \{9, 10\}, \{11, 12\}, \{13, 14\}, \{15, 16\}.$$

After the for loop on line 5 we have

$$\{1, 2, 3, 4\}, \{5, 6, 7, 8\}, \{9, 10, 11, 12\}, \{13, 14, 15, 16\}.$$

Line 7 results in:

$$\{1, 2, 3, 4, 5, 6, 7, 8\}, \{9, 10, 11, 12\}, \{13, 14, 15, 16\}.$$

Line 8 results in:

$$\{1, 2, 3, 4, 5, 6, 7, 8\}, \{9, 10, 11, 12, 13, 14, 15, 16\}.$$

Line 9 results in:

$$\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16\}.$$

$\text{FIND-SET}(x_2)$ and $\text{FIND-SET}(x_9)$ each return pointers to $x_1$.

 1 2 3 4 MAKE-SET-WU(x) L = MAKE-SET(x) L.size = 1 return L 
 1 2 3 4 5 6 7 8 UNION-WU(x, y) L1 = x.set L2 = y.set if L1.size ≥ L2.size L = UNION(x, y) else L = UNION(y, x) L.size = L1.size + L2.size return L 

## 21.2-3

Adapt the aggregate proof of Theorem 21.1 to obtain amortized time bounds of $O(1)$ for $\text{MAKE-SET}$ and $\text{FIND-SET}$ and $O(\lg n)$ for $\text{UNION}$ using the linked-list representation and the weighted-union heuristic.

We want to show that we can assign $O(1)$ charges to $\text{MAKE-SET}$ and $\text{FIND-SET}$ and an $O(\lg n)$ charge to $\text{UNION}$ such that the charges for a sequence of these operations are enough to cover the cost of the sequence—$O(m + n\lg n)$, according to the theorem. When talking about the charge for each kind of operation, it is helpful to also be able to talk about the number of each kind of operation.

Consider the usual sequence of $m$ $\text{MAKE-SET}$, $\text{UNION}$, and $\text{FIND-SET}$ operations, $n$ of which are $\text{MAKE-SET}$ operations, and let $l < n$ be the number of $\text{UNION}$ operations. (Recall the discussion in Section 21.1 about there being at most $n - 1$ $\text{UNION}$ operations.) Then there are $n$ $\text{MAKE-SET}$ operations, $l$ $\text{UNION}$ operations, and $m - n - l$ $\text{FIND-SET}$ operations.

The theorem didn't separately name the number $l$ of $\text{UNION}$s; rather, it bounded the number by $n$. If you go through the proof of the theorem with $l$ $\text{UNION}$s, you get the time bound $O(m - l + l\lg l) = O(m + l\lg l)$ for the sequence of operations. That is, the actual time taken by the sequence of operations is at most $c(m + l\lg l)$, for some constant $c$.

Thus, we want to assign operation charges such that

$$\begin{array}{ll} \text{(MAKE-SET charge)} & \cdot \quad n \\ + \text{(FIND-SET charge)} & \cdot \quad (m - n - l) \\ + \text{(UNION charge)} & \cdot \quad l \\ \hline \ge c(m + l\lg l), \end{array}$$

so that the amortized costs give an upper bound on the actual costs.

The following assignments work, where $c'$ is some constant $\ge c$:

• $\text{MAKE-SET}$: $c'$
• $\text{FIND-SET}$: $c'$
• $\text{UNION}$: $c'(\lg n + 1)$

Substituting into the above sum, we get

\begin{aligned} c'n + c'(m - n - l) + c'(\lg n + 1)l & = c'm + c'l\lg n \\ & = c'(m + l\lg n) \\ & > c(m + l\lg l). \end{aligned}

## 21.2-4

Give a tight asymptotic bound on the running time of the sequence of operations in Figure 21.3 assuming the linked-list representation and the weighted-union heuristic.

We call $\text{MAKE-SET}$ $n$ times, which contributes $\Theta(n)$. In each union, the smaller set is of size $1$, so each of these takes $\Theta(1)$ time. Since we union $n - 1$ times, the runtime is $\Theta(n)$.

## 21.2-5

Professor Gompers suspects that it might be possible to keep just one pointer in each set object, rather than two ($head$ and $tail$), while keeping the number of pointers in each list element at two. Show that the professor's suspicion is well founded by describing how to represent each set by a linked list such that each operation has the same running time as the operations described in this section. Describe also how the operations work. Your scheme should allow for the weighted-union heuristic, with the same effect as described in this section. ($\textit{Hint:}$ Use the tail of a linked list as its set's representative.)

As the hint suggests, make the representative of each set be the tail of its linked list. Except for the tail element, each element's representative pointer points to the tail. The tail's representative pointer points to the head. An element is the tail if its next pointer is $\text{NIL}$. Now we can get to the tail in $O(1)$ time: if $x.next == \text{NIL}$, then $tail = x$, else $tail = x.rep$. We can get to the head in $O(1)$ time as well: if $x.next == \text{NIL}$, then $head = x.rep$, else $head = x.rep.rep$. The set object needs only to store a pointer to the tail, though a pointer to any list element would suffice.

## 21.2-6

Suggest a simple change to the $\text{UNION}$ procedure for the linked-list representation that removes the need to keep the $tail$ pointer to the last object in each list. Whether or not the weighted-union heuristic is used, your change should not change the asymptotic running time of the $\text{UNION}$ procedure. ($\textit{Hint:}$ Rather than appending one list to another, splice them together.)

Let's call the two lists $A$ and $B$, and suppose that the representative of the new list will be the representative of $A$. Rather than appending $B$ to the end of $A$, instead splice $B$ into $A$ right after the first element of $A$. We have to traverse $B$ to update pointers to the set object anyway, so we can just make the last element of $B$ point to the second element of $A$.