# 21.3 Disjoint-set forests

## 21.3-1

Redo Exercise 21.2-2 using a disjoint-set forest with union by rank and path compression.

## 21.3-2

Write a nonrecursive version of $\text{FIND-SET}$ with path compression.

To implement $\text{FIND-SET}$ nonrecursively, let $x$ be the element we call the function on. Create a linked list $A$ which contains a pointer to $x$. Each time we most one element up the tree, insert a pointer to that element into $A$. Once the root $r$ has been found, use the linked list to find each node on the path from the root to $x$ and update its parent to $r$.

## 21.3-3

Give a sequence of $m$ $\text{MAKE-SET}$, $\text{UNION}$, and $\text{FIND-SET}$ operations, $n$ of which are $\text{MAKE-SET}$ operations, that takes $\Omega(m\lg n)$ time when we use union by rank only.

You need to find a sequence of $m$ operations on $n$ elements that takes $\Omega(m\lg n)$ time. Start with $n$ $\text{MAKE-SET}$s to create singleton sets $\{x_1\}, \{x_2\}, \ldots, \{x_n\}$. Next perform the $n - 1$ $\text{UNION}$ operations shown below to create a single set whose tree has depth $\lg n$.

$$\begin{array}{ll} \hline \text{UNION(x_1, x_2)} & \text{n / 2 of these} \\ \text{UNION(x_3, x_4)} & \\ \text{UNION(x_5, x_6)} & \\ \vdots \\ \text{UNION(x_{n - 1}, x_n)} & \\ \hline \text{UNION(x_2, x_4)} & \text{n / 4 of these} \\ \text{UNION(x_6, x_8)} & \\ \text{UNION(x_{10}, x_{12})} & \\ \vdots \\ \text{UNION(x_{n - 2}, x_n)} & \\ \hline \text{UNION(x_4, x_8)} & \text{n / 8 of these} \\ \text{UNION(x_{12}, x_{16})} & \\ \text{UNION(x_{20}, x_{24})} & \\ \vdots \\ \text{UNION(x_{n - 4}, x_n)} & \\ \hline \vdots \\ \hline \text{UNION(x_{n / 2}, x_n)} & \text{1 of these} \\ \hline \end{array}$$

Finally, perform $m - 2n + 1$ $\text{FIND-SET}$ operations on the deepest element in the tree. Each of these $\text{FIND-SET}$ operations takes $\Omega(\lg n)$ time. Letting $m \ge 3n$, we have more than $m / 3$ $\text{FIND-SET}$ operations, so that the total cost is $\Omega(m\lg n)$.

## 21.3-4

Suppose that we wish to add the operation $\text{PRINT-SET}(x)$, which is given a node $x$ and prints all the members of $x$'s set, in any order. Show how we can add just a single attribute to each node in a disjoint-set forest so that $\text{PRINT-SET}(x)$ takes time linear in the number of members of $x$'s set and the asymptotic running times of the other operations are unchanged. Assume that we can print each member of the set in $O(1)$ time.

Maintain a circular, singly linked list of the nodes of each set. To print, just follow the list until you get back to node $x$, printing each member of the list. The only other operations that change are $\text{FIND-SET}$, which sets $x.next = x$, and $\text{LINK}$, which exchanges the pointers $x.next$ and $y.next$.

## 21.3-5 $\star$

Show that any sequence of $m$ $\text{MAKE-SET}$, $\text{FIND-SET}$, and $\text{LINK}$ operations, where all the $\text{LINK}$ operations appear before any of the $\text{FIND-SET}$ operations, takes only $O(m)$ time if we use both path compression and union by rank. What happens in the same situation if we use only the path-compression heuristic?

With the path-compression heuristic, the sequence of $m$ $\text{MAKE-SET}$, $\text{FIND-SET}$, and $\text{LINK}$ operations, where all the $\text{LINK}$ operations take place before any of the $\text{FIND-SET}$ operations, runs in $O(m)$ time. The key observation is that once a node $x$ appears on a find path, $x$ will be either a root or a child of a root at all times thereafter.

We use the accounting method to obtain the $O(m)$ time bound. We charge a $\text{MAKE-SET}$ operation two dollars. One dollar pays for the $\text{MAKE-SET}$, and one dollar remains on the node $x$ that is created. The latter pays for the first time that $x$ appears on a find path and is turned into a child of a root.

We charge one dollar for a $\text{LINK}$ operation. This dollar pays for the actual linking of one node to another.

We charge one dollar for a $\text{FIND-SET}$. This dollar pays for visiting the root and its child, and for the path compression of these two nodes, during the $\text{FIND-SET}$. All other nodes on the find path use their stored dollar to pay for their visitation and path compression. As mentioned, after the $\text{FIND-SET}$, all nodes on the find path become children of a root (except for the root itself), and so whenever they are visited during a subsequent $\text{FIND-SET}$, the $\text{FIND-SET}$ operation itself will pay for them.

Since we charge each operation either one or two dollars, a sequence of $m$ operations is charged at most $2m$ dollars, and so the total time is $O(m)$.

Observe that nothing in the above argument requires union by rank. Therefore, we get an $O(m)$ time bound regardless of whether we use union by rank.