# 5.3 Randomized algorithms

## 5.3-1

Professor Marceau objects to the loop invariant used in the proof of Lemma 5.5. He questions whether it is true prior to the first iteration. He reasons that we could just as easily declare that an empty subarray contains no $0$-permutations. Therefore, the probability that an empty subarray contains a $0$-permutation should be $0$, thus invalidating the loop invariant prior to the first iteration. Rewrite the procedure $\text{RANDOMIZE-IN-PLACE}$ so that its associated loop invariant applies to a nonempty subarray prior to the first iteration, and modify the proof of Lemma 5.5 for your procedure.

Here's the rewritten procedure:

 1 2 3 4 5 RANDOMIZE-IN-PLACE(A) n = A.length exchange A[1] with A[RANDOM(1, n)] for i = 2 to n exchange A[i] with A[RANDOM(i, n)] 

The loop invariant becomes

Loop invariant: Just prior to the iteration of the for loop for each value of $i = 2, \ldots, n$, for each possible $(i - 1)$-permutation, the subarray $A[1..i - 1]$ contains this $(i - 1)$-permutation with probability $(n - i + 1)! / n!$.

The maintenance and termination parts remain the same. The initialization part is for the subarray $A[1..1]$, which contains any $1$-permutation with probability $(n - 1)! / n \ne 1 / n$.

## 5.3-2

Professor Kelp decides to write a procedure that produces at random any permutation besides the identity permutation. He proposes the following procedure:

 1 2 3 4 PERMUTE-WITHOUT-IDENTITY(A) n = A.length for i = 1 to n - 1 swap A[i] with A[RANDOM(i + 1, n)] 

Does this code do what Professor Kelp intends?

Although $\text{PERMUTE-WITHOUT-IDENTITY}$ will not produce the identity permutation, there are other permutations that it fails to produce. For example, consider its operation when $n = 3$, when it should be able to produce the $n! - 1 = 5$ nonidentity permutations. The for loop iterates for $i = 1$ and $i = 2$. When $i = 1$, the call to $\text{RANDOM}$ returns one of two possible values (either $2$ or $3$), and when $i = 2$, the call to $\text{RANDOM}$ returns just one value $(3)$. Thus, $\text{PERMUTE-WITHOUT-IDENTITY}$ can produce only $2 \cdot 1 = 2$ possible permutations, rather than the $5$ that are required.

## 5.3-3

Suppose that instead of swapping element $A[i]$ with a random element from the subarray $A[i..n]$, we swapped it with a random element from anywhere in the array:

 1 2 3 4 PERMUTE-WITH-ALL(A) n = A.length for i = 1 to n swap A[i] with A[RANDOM(1, n)] 

Does this code produce a uniform random permutation? Why or why not?

The $\text{PERMUTE-WITH-ALL}$ procedure does not produce a uniform random permutation. Consider the permutations it produces when $n = 3$. The procedure makes 3 calls to $\text{RANDOM}$, each of which returns one of 3 values, and so calling $\text{PERMUTE-WITH-ALL}$ has 27 possible outcomes. Since there are $3! = 6$ permutations, if $\text{PERMUTE-WITH-ALL}$ did produce a uniform random permutation, then each permutation would occur $1 / 6$ of the time. That would mean that each permutation would have to occur an integer number $m$ times, where $m / 27 = 1 / 6$. No integer $m$ satisfies this condition.

In fact, if we were to work out the possible permutations of $\langle 1, 2, 3 \rangle$ and how often they occur with $\text{PERMUTE-WITH-ALL}$, we would get the following probabilities:

$$\begin{array}{cc} \text{permutation} & \text{probability} \\ \hline \langle 1, 2, 3 \rangle & 4 / 27 \\ \langle 1, 3, 2 \rangle & 5 / 27 \\ \langle 2, 1, 3 \rangle & 5 / 27 \\ \langle 2, 3, 1 \rangle & 5 / 27 \\ \langle 3, 1, 2 \rangle & 4 / 27 \\ \langle 3, 2, 1 \rangle & 4 / 27 \end{array}$$

Although these probabilities sum to $1$, none are equal to $1 / 6$.

## 5.3-4

Professor Armstrong suggests the following procedure for generating a uniform random permutation:

  1 2 3 4 5 6 7 8 9 10 PERMUTE-BY-CYCLIC(A) n = A.length let B[1..n] be a new array offset = RANDOM(1, n) for i = 1 to n dest = i + offset if dest > n dest = dest - n B[dest] = A[i] return B 

Show that each element $A[i]$ has a $1 / n$ probability of winding up in any particular position in $B$. Then show that Professor Armstrong is mistaken by showing that the resulting permutation is not uniformly random.

$\text{PERMUTE-BY-CYCLE}$ chooses offset as a random integer in the range $1 \le offset \le n$, and then it performs a cyclic rotation of the array. That is, $B[((i + offset - 1)\mod n) + 1] = A[i]$. (The subtraction and addition of $1$ in the index calculation is due to the $1$-origin indexing. If we had used $0$-origin indexing instead, the index calculation would have simplied to $B[(i + offset)\mod n] = A[i]$ for $i = 0, 1, \ldots, n - 1$.)

Thus, once offset is determined, so is the entire permutation. Since each value of offset occurs with probability $1 / n$, each element $A[i]$ has a probability of ending up in position $B[j]$ with probability $1 / n$.

This procedure does not produce a uniform random permutation, however, since it can produce only $n$ different permutations. Thus, $n$ permutations occur with probability $1 / n$, and the remaining $n! - n$ permutations occur with probability $0$.

## 5.3-5 $\star$

Prove that in the array $P$ in procedure $\text{PERMUTE-BY-SORTING}$, the probability that all elements are unique is at least $1 - 1 / n$.

Let $\Pr\{j\}$ be the probability that the element with index $j$ is unique. If there are $n^3$ elements, then the $\Pr\{j\} = 1 - \frac{j - 1}{n^3}$.

\begin{aligned} \Pr\{1 \cap 2 \cap 3 \cap \ldots\} & = \Pr\{1\} \cdot \Pr\{2 \mid 1\} \cdot \Pr\{3 \mid 1 \cap 2\} \cdots \\ & = 1 (1 - \frac{1}{n^3})(1 - \frac{2}{n^3})(1 - \frac{3}{n^3}) \cdots \\ & \ge 1 (1 - \frac{n}{n^3}) (1 - \frac{n}{n^3})(1 - \frac{n}{n^3}) \cdots \\ & \ge (1 - \frac{1}{n^2})^n \\ & \ge 1 - \frac{1}{n}, \\ \end{aligned}

where the last step holds for $(1 - x)^n \ge 1 - nx$.

## 5.3-6

Explain how to implement the algorithm $\text{PERMUTE-BY-SORTING}$ to handle the case in which two or more priorities are identical. That is, your algorithm should produce a uniform random permutation, even if two or more priorities are identical.

 1 2 3 4 5 6 PERMUTE-BY-SORTING(A) let P[1..n] be a new array for i = 1 to n P[i] = i for i = 1 to n swap P[i] with P[RANDOM(i, n)] 

## 5.3-7

Suppose we want to create a random sample of the set $\{1, 2, 3, \ldots, n\}$, that is, an $m$-element subset $S$, where $0 \le m \le n$, such that each $m$-subset is equally likely to be created. One way would be to set $A[i] = i$ for $i = 1, 2, 3, \ldots, n$, call $\text{RANDOMIZE-IN-PLACE}(A)$, and then take just the first $m$ array elements. This method would make $n$ calls to the $\text{RANDOM}$ procedure. If $n$ is much larger than $m$, we can create a random sample with fewer calls to $\text{RANDOM}$. Show that the following recursive procedure returns a random $m$-subset $S$ of $\{1, 2, 3, \ldots, n\}$, in which each $m$-subset is equally likely, while making only $m$ calls to $\text{RANDOM}$:

 1 2 3 4 5 6 7 8 9 RANDOM-SAMPLE(m, n) if m == 0 return Ø else S = RANDOM-SAMPLE(m - 1, n - 1) i = RANDOM(1, n) if i ∈ S S = S ∪ {n} else S = S ∪ {i} return S 

Since each recursive call reduces $m$ by $1$ and makes only one call to $\text{RANDOM}$, it's easy to see that there are a total of $m$ calls to $\text{RANDOM}$. Moreover, since each recursive call adds exactly one element to the set, it's easy to see that the resulting set $S$ contains exactly $m$ elements.

Because the elements of set $S$ are chosen independently of each other, it suffices to show that each of the $n$ values appears in $S$ with probability $m / n$. We use an inductive proof. The inductive hypothesis is that a call to $\text{RANDOM-SUBSET}(m, n)$ returns a set $S$ of $m$ elements, each appearing with probability $m / n$. The base cases are for $m = 0$ and $m = 1$. When $m = 0$, the returned set is empty, and so it contains each element with probability $0$. When $m = 1$, the returned set has one element, and it is equally likely to be any number in $\{1, 2, 3, \ldots, n\}$

For the inductive step, we assume that the call $\text{RANDOM-SUBSET}(m - 1, n - 1)$ returns a set $S'$ of $m - 1$ elements in which each value in ${1, 2, 3, \ldots, n - 1}$ occurs with probability $(m - 1) / (n - 1)$. After the line $i = \text{RANDOM}(1, n)$, $i$ is equally likely to be any value in ${1, 2, 3, \ldots, n}$. We consider separately the probabilities that $S$ contains $j < n$ and that $S$ contains $n$. Let $R_j$ be the event that the call $\text{RANDOM}(1, n)$ returns $j$ , so that $\Pr\{R_j\} = 1 / n$.

For $j < n$, the event that $j \in S$ is the union of two disjoint events:

• $j \in S'$, and
• $j \notin S'$ and $R_j$ (these events are independent),

Thus

\begin{aligned} \Pr\{j \in S\} & = \Pr\{j \in S'\} + \Pr\{j \notin S' \text{ and } R_j\} & \text{ (the events are disjoint)} \\ & = \frac{m - 1}{n - 1} + \Big(1 - \frac{m - 1}{n - 1}\Big) \cdot \frac{1}{n} & \text{ (by the inductive hypothesis)} \\ & = \frac{m - 1}{n - 1} + \Big(\frac{n - 1}{n - 1} - \frac{m - 1}{n - 1}\Big) \cdot \frac{1}{n} \\ & = \frac{m - 1}{n - 1} \cdot \frac{n}{n} + \frac{n - m}{n - 1} \cdot \frac{1}{n} \\ & = \frac{(m - 1)n + (n - m)}{(n - 1)n} \\ & = \frac{mn - n + n - m}{(n - 1)n} \\ & = \frac{m(n - 1)}{(n - 1)n} \\ & = \frac{m}{n}. \end{aligned}

The event that $n \in S$ is also the union of two disjoint events:

• $R_n$, and
• $R_j$ and $j \in S'$ for some $j < n$ (these events are independent).

\begin{aligned} \Pr\{n \in S\} & = \Pr\{R_n\} + \Pr\{R_j \text{ and } j\in S' \text{ for some } j < n\} & \text{ (the events are disjoint)} \\ & = \frac{1}{n} + \frac{n - 1}{n} \cdot \frac{m - 1}{n - 1} & \text{ (by the inductive hypothesis)} \\ & = \frac{1}{n} \cdot \frac{n - 1}{n - 1} + \frac{n - 1}{n} \cdot \frac{m - 1}{n - 1} \\ & = \frac{n - 1+nm-n - m+1}{n(n - 1)} \\ & = \frac{nm-m}{n(n - 1)} \\ & = \frac{m(n - 1)}{n(n - 1)} \\ & = \frac{m}{n}. \end{aligned}