12.4 Randomly built binary search trees

12.4-1

Prove equation $\text{(12.3)}$.

Consider all the possible positions of the largest element of the subset of $n + 3$ of size $4$. Suppose it were in position $i + 4$ for some $i \le n − 1$. Then, we have that there are $i + 3$ positions from which we can select the remaining three elements of the subset. Since every subset with different largest element is different, we get the total by just adding them all up (inclusion exclusion principle).

12.4-2

Describe a binary search tree on n nodes such that the average depth of a node in the tree is $\Theta(\lg n)$ but the height of the tree is $\omega(\lg n)$. Give an asymptotic upper bound on the height of an $n$-node binary search tree in which the average depth of a node is $\Theta(\lg n)$.

To keep the average depth low but maximize height, the desired tree will be a complete binary search tree, but with a chain of length $c(n)$ hanging down from one of the leaf nodes. Let $k = \lg(n − c(n))$ be the height of the complete binary search tree. Then the average height is approximately given by

$$\frac{1}{n} \left[\sum_{i = 1}^{n - c(n)} \lg i + (k + 1) + (k + 2) + \cdots + (k + c(n))\right] \approx \lg(n - c(n)) + \frac{c(n)^2}{2n}.$$

The upper bound is given by the largest $c(n)$ such that $\lg(n − c(n))+ \frac{c(n)^2}{2n} = \Theta(\lg n)$ and $c(n) = \omega(\lg n)$. One function which works is $\sqrt n$.

12.4-3

Show that the notion of a randomly chosen binary search tree on $n$ keys, where each binary search tree of $n$ keys is equally likely to be chosen, is different from the notion of a randomly built binary search tree given in this section. ($\textit{Hint:}$ List the possibilities when $n = 3$.)

Suppose we have the elements $\{1, 2, 3\}$. Then, if we construct a tree by a random ordering, then, we get trees which appear with probabilities some multiple of $\frac{1}{6}$. However, if we consider all the valid binary search trees on the key set of $\{1, 2, 3\}$. Then, we will have only five different possibilities. So, each will occur with probability $\frac{1}{5}$, which is a different probability distribution.

12.4-4

Show that the function $f(x) = 2^x$ is convex.

The second derivative is $2^x\ln^2 2$ which is always positive, so the function is convex

12.4-5 $\star$

Consider $\text{RANDOMIZED-QUICKSORT}$ operating on a sequence of $n$ distinct input numbers. Prove that for any constant $k > 0$, all but $O(1 / n^k)$ of the $n!$ input permutations yield an $O(n\lg n)$ running time.

Let $A(n)$ denote the probability that when quicksorting a list of length $n$, some pivot is selected to not be in the middle $n^{1 - k / 2}$ of the numberes. This doesn't happen with probability $\frac{1}{n^{k / 2}}$. Then, we have that the two subproblems are of size $n_1, n_2$ with $n_1 + n_2 = n - 1$, then

$$A(n) \le \frac{1}{n^{k / 2}} + T(n_1)+T(n_2).$$

Since we bounded the depth by $O(1 / \lg n)$ let $\{a_{i, j}\}_i$ be all the subproblem sizes left at depth $j$,

$$A(n) \le \frac{1}{n^{k / 2}} \sum_j\sum_i \frac{1}{a}.$$