# 15.4 Longest common subsequence

## 15.4-1

Determine an $\text{LCS}$ of $\langle 1, 0, 0, 1, 0, 1, 0, 1 \rangle$ and $\langle 0, 1, 0, 1, 1, 0, 1, 1, 0 \rangle$.

$\langle 1, 0, 0, 1, 1, 0 \rangle$.

## 15.4-2

Give pseudocode to reconstruct an $\text{LCS}$ from the completed $c$ table and the original sequences $X = \langle x_1, x_2, \ldots, x_m \rangle$ and $Y = \langle y_1, y_2, \ldots, y_n \rangle$ in $O(m + n)$ time, without using the $b$ table.

1 2 3 4 5 6 7 8 9 10 | PRINT-LCS(c, X, Y, i, j) if c[i, j] == 0 return if X[i] == Y[j] PRINT-LCS(c, X, Y, i - 1, j - 1) print X[i] else if c[i - 1, j] > c[i, j - 1] PRINT-LCS(c, X, Y, i - 1, j) else PRINT-LCS(c, X, Y, i, j - 1) |

## 15.4-3

Give a memoized version of $\text{LCS-LENGTH}$ that runs in $O(mn)$ time.

1 2 3 4 5 6 7 8 | MEMOIZED-LCS-LENGTH(X, Y, i, j) if c[i, j] > -1 return c[i, j] if i == 0 or j == 0 return c[i, j] = 0 if x[i] == y[j] return c[i, j] = LCS-LENGTH(X, Y, i - 1, j - 1) + 1 return c[i, j] = max(LCS-LENGTH(X, Y, i - 1, j), LCS-LENGTH(X, Y, i, j - 1)) |

## 15.4-4

Show how to compute the length of an $\text{LCS}$ using only $2 \cdot \min(m, n)$ entries in the $c$ table plus $O(1)$ additional space. Then show how to do the same thing, but using $\min(m, n)$ entries plus $O(1)$ additional space.

When computing a particular row of the $c$ table, no rows before the previous row are needed. Thus only two rows—$2·length[Y]$ entries—need to be kept in memory at a time. (Note: Each row of $c$ actually has $length[Y] + 1$ entries, but we don't need to store the column of $0$'s—instead we can make the program "know" that those entries are $0$.) With this idea, we need only $2 \cdot \min(m, n)$ entries if we always call $\text{LCS-LENGTH}$ with the shorter sequence as the $Y$ argument.

We can thus do away with the $c$ table as follows:

- Use two arrays of length $\min(m, n)$, $previous\text-row$ and $current\text-row$, to hold the appropriate rows of $c$.
- Initialize $previous\text-row$ to all $0$ and compute $current\text-row$ from left to right.
- When $current\text-row$ is filled, if there are still more rows to compute, copy $current\text-row$ into $previous\text-row$ and compute the new $current\text-row$.

Actually only a little more than one row's worth of $c$ entries—$\min(m, n) + 1$ entries—are needed during the computation. The only entries needed in the table when it is time to compute $c[i, j]$ are $c[i, k]$ for $k \le j - 1$ (i.e., earlier entries in the current row, which will be needed to compute the next row); and $c[i - 1, k]$ for $k \ge j - 1$ (i.e., entries in the previous row that are still needed to compute the rest of the current row). This is one entry for each $k$ from $1$ to $\min(m, n)$ except that there are two entries with $k = j - 1$, hence the additional entry needed besides the one row's worth of entries.

We can thus do away with the $c$ table as follows:

- Use an array a of length $\min(m, n) + 1$ to hold the appropriate entries of $c$. At the time $c[i, j]$ is to be computed, $a$ will hold the following entries:
- $a[k] = c[i, k]$ for $1 \le k < j - 1$ (i.e., earlier entries in the current "row"),
- $a[k] = c[i - 1, k]$ for $k \ge j - 1$ (i.e., entries in the previous "row"),
- $a[0] = c[i, j - 1]$ (i.e., the previous entry computed, which couldn't be put into the "right" place in a without erasing the still-needed $c[i - 1, j - 1]$).

- Initialize a to all $0$ and compute the entries from left to right.
- Note that the 3 values needed to compute $c[i, j]$ for $j > 1$ are in $a[0] = c[i, j - 1], a[ j - 1] = c[i - 1, j - 1]$, and $a[ j] = c[i - 1, j]$.
- When $c[i, j]$ has been computed, move $a[0](c[i, j - 1])$ to its "correct" place, $a[j - 1]$, and put $c[i, j]$ in $a[0]$.

## 15.4-5

Give an $O(n^2)$-time algorithm to find the longest monotonically increasing subsequence of a sequence of $n$ numbers.

Given a list of numbers $L$, make a copy of $L$ called $L'$ and then sort $L'$.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | PRINT-LCS(c, X, Y) n = c[X.length, Y.length] let s[1..n] be a new array i = X.length j = Y.length while i > 0 and j > 0 if x[i] == y[j] s[n] = x[i] n = n - 1 i = i - 1 j = j - 1 else if c[i - 1, j] ≥ c[i, j - 1] i = i - 1 else j = j - 1 for i = 1 to s.length print s[i] |

1 2 3 4 5 6 7 8 9 10 11 12 13 14 | MEMO-LCS-LENGTH-AUX(X, Y, c, b) m = |X| n = |Y| if c[m, n] != 0 or m == 0 or n == 0 return if x[m] == y[n] b[m, n] = ↖ c[m, n] = MEMO-LCS-LENGTH-AUX(X[1..m - 1], Y[1..n - 1], c, b) + 1 else if MEMO-LCS-LENGTH-AUX(X[1..m - 1], Y, c, b) ≥ MEMO-LCS-LENGTH-AUX(X, Y[1..n - 1], c, b) b[m, n] = ↑ c[m, n] = MEMO-LCS-LENGTH-AUX(X[1..m - 1], Y, c, b) else b[m, n] = ← c[m, n] = MEMO-LCS-LENGTH-AUX(X, Y[1..n - 1], c, b) |

1 2 3 4 | MEMO-LCS-LENGTH(X, Y) let c[1..|X|, 1..|Y|] and b[1..|X|, 1..|Y|] be new tables MEMO-LCS-LENGTH-AUX(X, Y, c, b) return c and b |

Then, just run the $\text{LCS}$ algorithm on these two lists. The longest common subsequence must be monotone increasing because it is a subsequence of $L'$ which is sorted. It is also the longest monotone increasing subsequence because being a subsequence of $L'$ only adds the restriction that the subsequence must be monotone increasing. Since $|L| = |L'| = n$, and sorting $L$ can be done in $o(n^2)$ time, the final running time will be $O(|L||L'|) = O(n^2)$.

## 15.4-6 $\star$

Give an $O(n\lg n)$-time algorithm to find the longest monotonically increasing subsequence of a sequence of $n$ numbers. ($\textit{Hint:}$ Observe that the last element of a candidate subsequence of length $i$ is at least as large as the last element of a candidate subsequence of length $i - 1$. Maintain candidate subsequences by linking them through the input sequence.)

The algorithm $\text{LONG-MONOTONIC}(S)$ returns the longest monotonically increasing subsequence of $S$, where $S$ has length $n$.

The algorithm works as follows: a new array B will be created such that $B[i]$ contains the last value of a longest monotonically increasing subsequence of length $i$. A new array $C$ will be such that $C[i]$ contains the monotonically increasing subsequence of length $i$ with smallest last element seen so far.

To analyze the runtime, observe that the entries of $B$ are in sorted order, so we can execute line 9 in $O(\lg n)$ time. Since every other line in the for-loop takes constant time, the total run-time is $O(n\lg n)$.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | LONG-MONOTONIC(S) let B[1..n] be a new array where every value = ∞ let C[1..n] be a new array L = 1 for i = 1 to n if A[i] < B[1] B[1] = A[i] C[1].head.key = A[i] else let j be the largest index of B such that B[j] < A[i] B[j + 1] = A[i] C[j + 1] = C[j] INSERT(C[j + 1], A[i]) if j + 1 > L L = L + 1 print C[L] |