2-2 Correctness of bubblesort

Bubblesort is a popular, but inefficient, sorting algorithm. It works by repeatedly swapping adjacent elements that are out of order.

1
2
3
4
5
BUBBLESORT(A)
    for i = 1 to A.length - 1
        for j = A.length downto i + 1
            if A[j] < A[j - 1]
                exchange A[j] with A[j - 1]

a. Let $A'$ denote the output of $\text{BUBBLESORT}(A)$ To prove that $\text{BUBBLESORT}$ is correct, we need to prove that it terminates and that

$$A'[1] \le A'[2] \le \cdots \le A'[n], \tag{2.3}$$

where $n = A.length$. In order to show that BUBBLESORT actually sorts, what else do we need to prove?

The next two parts will prove inequality $\text{(2.3)}$.

b. State precisely a loop invariant for the for loop in lines 2–4, and prove that this loop invariant holds. Your proof should use the structure of the loop invariant proof presented in this chapter.

c. Using the termination condition of the loop invariant proved in part (b), state a loop invariant for the for loop in lines 1–4 that will allow you to prove inequality $\text{(2.3)}$. Your proof should use the structure of the loop invariant proof presented in this chapter.

d. What is the worst-case running time of bubblesort? How does it compare to the running time of insertion sort?

a. We need to show that the elements of $A'$ form a permutation of the elements of $A$.

b. Loop invariant: At the start of each iteration of the for loop of lines 2–4, $A[j] = \min\{A[k]: j \le k \le n\}$ and the subarray $A[j..n]$ is a permutation of the values that were in $A[j..n]$ at the time that the loop started.

Initialization: Initially, $j = n$, and the subarray $A[j..n]$ consists of single element $A[n]$. The loop invariant trivially holds.

Maintenance: Consider an iteration for a given value of $j$. By the loop invariant, $A[j]$ is the smallest value in $A[j..n]$. Lines 3–4 exchange $A[j]$ and $A[j - 1]$ if $A[j]$ is less than $A[j - 1]$, and so $A[j - 1]$ will be the smallest value in $A[j - 1..n]$ afterward. Since the only change to the subarray $A[j - 1..n]$ is this possible exchange, and the subarray $A[j..n]$ is a permutation of the values that were in $A[j..n]$ at the time that the loop started, we see that $A[j - 1..n]$ is a permutation of the values that were in $A[j - 1..n]$ at the time that the loop started. Decrementing $j$ for the next iteration maintains the invariant.

Termination: The loop terminates when $j$ reaches $i$. By the statement of the loop invariant, $A[i] = \min\{A[k]: i \le k \le n\}$ and $A[i..n]$ is a permutation of the values that were in $A[i..n]$ at the time that the loop started.

c. Loop invariant: At the start of each iteration of the for loop of lines 1–4, the subarray $A[1..i - 1]$ consists of the $i - 1$ smallest values originally in $A[1..n]$, in sorted order, and $A[i..n]$ consists of the $n - i + 1$ remaining values originally in $A[1..n]$.

Initialization: Before the first iteration of the loop, $i = 1$. The subarray $A[1..i - 1]$ is empty, and so the loop invariant vacuously holds.

Maintenance: Consider an iteration for a given value of $i$. By the loop invariant, $A[1..i - 1]$ consists of the $i$ smallest values in $A[1..n]$, in sorted order. Part (b) showed that after executing the for loop of lines 2–4, $A[i]$ is the smallest value in $A[i..n]$, and so $A[1..i]$ is now the $i$ smallest values originally in $A[1..n]$, in sorted order. Moreover, since the for loop of lines 2–4 permutes $A[i..n]$, the subarray $A[i + 1..n]$ consists of the $n - i$ remaining values originally in $A[1..n]$.

Termination: The for loop of lines 1–4 terminates when $i = n$, so that $i - 1 = n - 1$. By the statement of the loop invariant, $A[1..i - 1]$ is the subarray $A[1..n - 1]$, and it consists of the $n - 1$ smallest values originally in $A[1..n]$, in sorted order. The remaining element must be the largest value in $A[1..n]$, and it is in $A[n]$. Therefore, the entire array $A[1..n]$ is sorted.

Note: Tn the second edition, the for loop of lines 1–4 had an upper bound of $A.length$. The last iteration of the outer for loop would then result in no iterations of the inner for loop of lines 1–4, but the termination argument would simplify: $A[1..i - 1]$ would be the entire array $A[1..n]$, which, by the loop invariant, is sorted.

d. The running time depends on the number of iterations of the for loop of lines 2–4. For a given value of $i$, this loop makes $n - i$ iterations, and $i$ takes on the values $1, 2, \ldots, n - 1$. The total number of iterations, therefore, is

$$ \begin{aligned} \sum_{i = 1}^{n - 1} (n - i) & = \sum_{i = 1}^{n - 1} n - \sum_{i = 1}^{n - 1} i \\ & = n(n - 1) - \frac{n(n - 1)}{2} \\ & = \frac{n(n - 1)}{2} \\ & = \frac{n^2}{2} - \frac{n}{2}. \end{aligned} $$

Thus, the running time of bubblesort is $\Theta(n^2)$ in all cases. The worst-case running time is the same as that of insertion sort.