21-2 Depth determination

In the depth-determination problem, we maintain a forest $\mathcal F = \{T_i\}$ of rooted trees under three operations:

$\text{MAKE-TREE}(v)$ creates a tree whose only node is $v$.

$\text{FIND-DEPTH}(v)$ returns the depth of node $v$ within its tree.

$\text{GRAFT}(r, v)$ makes node $r$, which is assumed to be the root of a tree, become the child of node $v$, which is assumed to be in a different tree than $r$ but may or may not itself be a root.

a. Suppose that we use a tree representation similar to a disjoint-set forest: $v.p$ is the parent of node $v$, except that $v.p = v$ if $v$ is a root. Suppose further that we implement $\text{GRAFT}(r, v)$ by setting $r.p = v$ and $\text{FIND-DEPTH}(v)$ by following the find path up to the root, returning a count of all nodes other than $v$ encountered. Show that the worst-case running time of a sequence of $m$ $\text{MAKE-TREE}$, $\text{FIND-DEPTH}$, and $\text{GRAFT}$ operations is $\Theta(m^2)$.

By using the union-by-rank and path-compression heuristics, we can reduce the worst-case running time. We use the disjoint-set forest $\mathcal S = \{S_i\}$, where each set $S_i$ (which is itself a tree) corresponds to a tree $T_i$ in the forest $\mathcal F$. The tree structure within a set $S_i$, however, does not necessarily correspond to that of $T_i$. In fact, the implementation of $S_i$ does not record the exact parent-child relationships but nevertheless allows us to determine any node's depth in $T_i$.

The key idea is to maintain in each node $v$ a "pseudodistance" $v.d$, which is defined so that the sum of the pseudodistances along the simple path from $v$ to the root of its set $S_i$ equals the depth of $v$ in $T_i$. That is, if the simple path from $v$ to its root in $S_i$ is $v_0, v_1, \ldots, v_k$, where $v_0 = v$ and $v_k$ is $S_i$'s root, then the depth of $v$ in $T_i$ is $\sum_{j = 0}^k v_j.d$.

b. Give an implementation of $\text{MAKE-TREE}$.

c. Show how to modify $\text{FIND-SET}$ to implement $\text{FIND-DEPTH}$. Your implementation should perform path compression, and its running time should be linear in the length of the find path. Make sure that your implementation updates pseudodistances correctly.

d. Show how to implement $\text{GRAFT}(r, v)$, which combines the sets containing $r$ and $v$, by modifying the $\text{UNION}$ and $\text{LINK}$ procedures. Make sure that your implementation updates pseudodistances correctly. Note that the root of a set $S_i$ is not necessarily the root of the corresponding tree $T_i$.

e. Give a tight bound on the worst-case running time of a sequence of $m$ $\text{MAKE-TREE}$, $\text{FIND-DEPTH}$, and $\text{GRAFT}$ operations, $n$ of which are $\text{MAKE-TREE}$ operations.

a. $\text{MAKE-TREE}$ and $\text{GRAFT}$ are both constant time operations. $\text{FINDDEPTH}$ is linear in the depth of the node. In a sequence of $m$ operations the maximal depth which can be achieved is $m/2$, so $\text{FIND-DEPTH}$ takes at most $O(m)$. Thus, $m$ operations take at most $O(m^2)$. This is achieved as follows: Create $m / 3$ new trees. Graft them together into a chain using $m / 3$ calls to $\text{GRAFT}$. Now call $\text{FIND-DEPTH}$ on the deepest node $m / 3$ times. Each call takes time at least $m / 3$, so the total runtime is $\Omega((m / 3)^2) = \Omega(m^2)$. Thus the worst-case runtime of the $m$ operations is $\Theta(m^2)$.

b. Since the new set will contain only a single node, its depth must be zero and its parent is itself. In this case, the set and its corresponding tree are indistinguishable.

MAKE-TREE(v)
    v = ALLOCATE-NODE()
    v.d = 0
    v.p = v
    return v

c. In addition to returning the set object, modify $\text{FIND-SET}$ to also return the depth of the parent node. Update the pseudodistance of the current node $v$ to be $v.d$ plus the returned pseudodistance. Since this is done recursively, the running time is unchanged. It is still linear in the length of the find path. To implement $\text{FIND-DEPTH}$, simply recurse up the tree containing $v$, keeping a running total of pseudodistances.

FIND-SET(v)
    if v != v.p
        (v.p, d) = FIND-SET(v.p)
        v.d = v.d + d
        return (v.p, v.d)
    return (v, 0)

d. To implement $\text{GRAFT}$ we need to find $v$'s actual depth and add it to the pseudodistance of the root of the tree $S_i$ which contains $r$.

GRAFT(r, v)
    (x, d_1) = FIND-SET(r)
    (y, d_2) = FIND-SET(v)
    if x.rank > y.rank
        y.p = x
        x.d = x.d + d_2 + y.d
    else
        x.p = y
        x.d = x.d + d_2
        if x.rank == y.rank
            y.rank = y.rank + 1

e. The three implemented operations have the same asymptotic running time as $\text{MAKE}$, $\text{FIND}$, and $\text{UNION}$ for disjoint sets, so the worst-case runtime of $m$ such operations, $n$ of which are $\text{MAKE-TREE}$ operations, is $O(m\alpha(n))$.