12-1 Binary search trees with equal keys

Equal keys pose a problem for the implementation of binary search trees.

a. What is the asymptotic performance of $\text{TREE-INSERT}$ when used to insert $n$ items with identical keys into an initially empty binary search tree?

We propose to improve $\text{TREE-INSERT}$ by testing before line 5 to determine whether $z.key = x.key$ and by testing before line 11 to determine whether $z.key = y.key$.

If equality holds, we implement one of the following strategies. For each strategy, find the asymptotic performance of inserting $n$ items with identical keys into an initially empty binary search tree. (The strategies are described for line 5, in which we compare the keys of $z$ and $x$. Substitute $y$ for $x$ to arrive at the strategies for line 11.)

b. Keep a boolean flag $x.b$ at node $x$, and set $x$ to either $x.left$ or $x.right$ based on the value of $x.b$, which alternates between $\text{FALSE}$ and $\text{TRUE}$ each time we visit $x$ while inserting a node with the same key as $x$.

c. Keep a list of nodes with equal keys at $x$, and insert $z$ into the list.

d. Randomly set $x$ to either $x.left$ or $x.right$. (Give the worst-case performance and informally derive the expected running time.)

a. Each insertion will add the element to the right of the rightmost leaf because the inequality on line 11 will always evaluate to false. This will result in the runtime being $\sum_{i = 1}^n i \in \Theta(n^2)$.

b. This strategy will result in each of the two children subtrees having a difference in size at most one. This means that the height will be $\Theta(\lg n)$. So, the total runtime will be $\sum_{i = 1}^n \lg n \in \Theta(n\lg n)$.

c. This will only take linear time since the tree itself will be height $0$, and a single insertion into a list can be done in constant time.

d.

  • Worst-case: every random choice is to the right (or all to the left) this will result in the same behavior as in the first part of this problem, $\Theta(n^2)$.
  • Expected running time: notice that when randomly choosing, we will pick left roughly half the time, so, the tree will be roughly balanced, so, we have that the depth is roughly $\lg(n)$, $\Theta(n\lg n)$.