Skip to content

21.4 Analysis of union by rank with path compression


Prove Lemma 21.4.

The lemma states:

For all nodes $x$, we have $x.rank \le x.p.rank$, with strict inequality if $x \ne x.p$. The value of $x.rank$ is initially $0$ and increases through time until $x \ne x.p$; from then on, $x.rank$ does not change. The value of $x.p.rank$ monotonically increases over time.

The initial value of $x.rank$ is $0$, as it is initialized in line 2 of the $\text{MAKE-SET}(x)$ procedure. When we run $\text{LINK}(x, y)$, whichever one has the larger rank is placed as the parent of the other, and if there is a tie, the parent's rank is incremented. This means that after any $\text{LINK}(y, x)$, the two nodes being linked satisfy this strict inequality of ranks.

Also, if we have that $x \ne x.p$, then, we have that $x$ is not its own set representative, so, any linking together of sets that would occur would not involve $x$, but that's the only way for ranks to increase, so, we have that $x.rank$ must remain constant after that point.


Prove that every node has rank at most $\lfloor \lg n \rfloor$.

We'll prove the claim by strong induction on the number of nodes. If $n = 1$, then that node has rank equal to $0 = \lfloor \lg 1 \rfloor$. Now suppose that the claim holds for $1, 2, \ldots, n$ nodes.

Given $n + 1$ nodes, suppose we perform a $\text{UNION}$ operation on two disjoint sets with $a$ and $b$ nodes respectively, where $a, b \le n$. Then the root of the first set has rank at most $\lfloor \lg a \rfloor$ and the root of the second set has rank at most $\lfloor \lg b\rfloor$.

If the ranks are unequal, then the $\text{UNION}$ operation preserves rank and we are done, so suppose the ranks are equal. Then the rank of the union increases by $1$, and the resulting set has rank $\lfloor\lg a\rfloor + 1 \le\lfloor\lg(n + 1) / 2\rfloor + 1 = \lfloor\lg(n + 1)\rfloor$.


In light of Exercise 21.4-2, how many bits are necessary to store $x.rank$ for each node $x$?

Since their value is at most $\lfloor \lg n \rfloor$, we can represent them using $\Theta(\lg(\lg(n)))$ bits, and may need to use that many bits to represent a number that can take that many values.


Using Exercise 21.4-2, give a simple proof that operations on a disjoint-set forest with union by rank but without path compression run in $O(m\lg n)$ time.



Professor Dante reasons that because node ranks increase strictly along a simple path to the root, node levels must monotonically increase along the path. In other words, if $x.rank > 0$ and $x.p$ is not a root, then $\text{level}(x) \le \text{level}(x.p)$. Is the professor correct?


21.4-6 $\star$

Consider the function $\alpha'(n) = \min \{k: A_k(1) \ge \lg(n + 1)\}$. Show that $\alpha'(n) \le 3$ for all practical values of $n$ and, using Exercise 21.4-2, show how to modify the potential-function argument to prove that we can perform a sequence of $m$ $\text{MAKE-SET}$, $\text{UNION}$, and $\text{FIND-SET}$ operations, $n$ of which are $\text{MAKE-SET}$ operations, on a disjoint-set forest with union by rank and path compression in worst-case time $O(m \alpha'(n))$.