7-5 Median-of-3 partition

One way to improve the $\text{RANDOMIZED-QUICKSORT}$ procedure is to partition around a pivot that is chosen more carefully than by picking a random element from the subarray. One common approach is the median-of-3 method: choose the pivot as the median (middle element) of a set of 3 elements randomly selected from the subarray. (See exercise 7.4-6.) For this problem, let us assume that the elements of the input array $A[1..n]$ are distinct and that $n \ge 3$. We denote the sorted output array by $A'[1..n]$. Using the median-of-3 method to choose the pivot element $x$, define $p_i = \Pr\{x = A'[i]\}$.

a. Give an exact formula for $p_i$ as a function of $n$ and $i$ for $i = 2, 3, \ldots, n - 1$. (Note that $p_1 = p_n = 0$.)

b. By what amount have we increased the likelihood of choosing the pivot as $x = A'[\lfloor (n + 1) / 2 \rfloor]$, the median of $A[1..n]$, compared with the ordinary implementation? Assume that $n \to \infty$, and give the limiting ratio of these probabilities.

c. If we define a "good" split to mean choosing the pivot as $x = A'[i]$, where $n / 3 \le i \le 2n / 3$, by what amount have we increased the likelihood of getting a good split compared with the ordinary implementation? ($\textit{Hint:}$ Approximate the sum by an integral.)

d. Argue that in the $\Omega(n\lg n)$ running time of quicksort, the median-of-3 method affects only the constant factor.

a. $p_i$ is the probability that a randomly selected subset of size three has the $A'[i]$ as it's middle element. There are 6 possible orderings of the three elements selected. So, suppose that $S'$ is the set of three elements selected.

We will compute the probability that the second element of $S'$ is $A'[i]$ among all possible $3$-sets we can pick, since there are exactly six ordered $3$-sets corresponding to each $3$-set, these probabilities will be equal. We will compute the probability that $S'[2] = A[i]$. For any such $S'$, we would need to select the first element from $[i - 1]$ and the third from ${i + 1, \ldots , n}$. So, there are $(i - 1)(n - i)$ such $3$-sets. The total number of $3$-sets is $\binom{n}{3} = \frac{n(n - 1)(n - 2)}{6}$. So,

$$p_i = \frac{6(n - i)(i - 1)}{n(n - 1)(n - 2)}.$$

b. If we let $i = \lfloor \frac{n + 1}{2} \rfloor$, the previous result gets us an increase of

$$\frac{6(\lfloor\frac{n - 1}{2}\rfloor)(n - \lfloor\frac{n + 1}{2}\rfloor)}{n(n - 1)(n - 2)} - \frac{1}{n}$$

in the limit $n$ going to infinity, we get

$$\lim_{n \to \infty} \frac{\frac{6(\lfloor \frac{n - 1}{2} \rfloor)(n - \lfloor \frac{n + 1}{2} \rfloor)}{n(n - 1)(n - 2)}}{\frac{1}{n}} = \frac{3}{2}.$$

c. To save the messiness, suppose $n$ is a multiple of $3$. We will approximate the sum as an integral, so,

$$ \begin{aligned} \sum_{i = n / 3}^{2n / 3} & \approx \int_{n / 3}^{2n / 3} \frac{6(-x^2 + nx + x - n)}{n(n - 1)(n - 2)}dx \\ & = \frac{6(-7n^3 / 81 + 3n^3 / 18 + 3n^2 / 18 - n^2 / 3)}{n(n - 1)(n - 2)}, \end{aligned} $$

which, in the limit $n$ goes to infinity, is $\frac{13}{27}$ which is a constant that $>\frac{1}{3}$ as it was in the original randomized quicksort implementation.

d. Since the new algorithm always has a "bad" choice that is within a constant factor of the original quicksort, it will still have a reasonable probability that the randomness leads us into a bad situation, so, it will still be $n\lg n$.