# 16.2 Elements of the greedy strategy

## 16.2-1

Prove that the fractional knapsack problem has the greedy-choice property.

Let $I$ be the following instance of the knapsack problem: Let $n$ be the number of items, let $v_i$ be the value of the $i$th item, let $w_i$ be the weight of the $i$th item and let $W$ be the capacity. Assume the items have been ordered in increasing order by $v_i / w_i$ and that $W \ge w_n$.

Let $s = (s_1, s_2, \ldots, s_n)$ be a solution. The greedy algorithm works by assigning $s_n = \min(w_n, W)$, and then continuing by solving the subproblem

$$I' = (n - 1, \{v_1, v_2, \ldots, v_{n - 1}\}, \{w_1, w_2, \ldots, w_{n - 1}\}, W - w_n)$$

until it either reaches the state $W = 0$ or $n = 0$.

We need to show that this strategy always gives an optimal solution. We prove this by contradiction. Suppose the optimal solution to $I$ is $s_1, s_2, \ldots, s_n$, where $s_n < \min(w_n, W)$. Let $i$ be the smallest number such that $s_i > 0$. By decreasing $s_i$ to $\max(0, W - w_n)$ and increasing $s_n$ by the same amount, we get a better solution. Since this a contradiction the assumption must be false. Hence the problem has the greedy-choice property.

## 16.2-2

Give a dynamic-programming solution to the $0$-$1$ knapsack problem that runs in $O(nW)$ time, where $n$ is the number of items and $W$ is the maximum weight of items that the thief can put in his knapsack.

The solution is based on the optimal-substructure observation in the text: Let $i$ be the highest-numbered item in an optimal solution $S$ for $W$ pounds and items $1, \ldots, n$. Then $S' = S - \{i\}$ must be an optimal solution for $W - w_i$ pounds and items $1, \ldots, i - 1$, and the value of the solution $S$ is $v_i$ plus the value of the subproblem solution $S'$.

We can express this relationship in the following formula: Define $c[i, w]$ to be the value of the solution for items $1, \ldots, i$ and maximum weight $w$. Then

$$ c[i,w] = \begin{cases} 0 & \text{ if } i = 0 \text{ or } w = 0, \\ c[i - 1, w] & \text{ if } w_i > w, \\ \max(v_i + c[i - 1, w - w_i], c[i - 1, w]) & \text{ if } i > 0 \text{ and } w \ge w_i. \end{cases} $$

The last case says that the value of a solution for i items either includes item $i$, in which case it is $v_i$ plus a subproblem solution for $i - 1$ items and the weight excluding $w_i$, or doesn't include item $i$, in which case it is a subproblem solution for $i - 1$ items and the same weight. That is, if the thief picks item $i$, he takes $v_i$ value, and he can choose from items $1, \ldots, i - 1$ up to the weight limit $w - w_i$ , and get $c[i - 1, w - w_i]$ additional value. On the other hand, if he decides not to take item $i$, he can choose from items $1, \ldots, i - 1$ up to the weight limit $w$, and get $c[i - 1, w]$ value. The better of these two choices should be made.

The algorithm takes as inputs the maximum weight $W$, the number of items $n$, and the two sequences $v = \langle v_1, v_2, \ldots, v_n \rangle$ and $w = \langle w_1, w_2, \ldots, w_n \rangle$. It stores the $c[i, j]$ values in a table $c[0..n, 0..W]$ whose entries are computed in row-major order. (That is, the first row of $c$ is filled in from left to right, then the second row, and so on.) At the end of the computation, $c[n, W]$ contains the maximum value the thief can take.

1 2 3 4 5 6 7 8 9 10 11 12 | DYNAMIC-0-1-KNAPSACK(v, w, n, W) let c[0..n, 0..W] be a new array for w = 0 to W c[0, w] = 0 for i = 1 to n c[i, 0] = 0 for w = 1 to W if w[i] ≤ w if v[i] + c[i - 1, w - w[i]] > c[i - 1, w] c[i, w] = v[i] + c[i - 1, w - w[i]] else c[i, w] = c[i - 1, w] else c[i, w] = c[i - 1, w] |

The set of items to take can be deduced from the $c$ table by starting at $c[n, W]$ and tracing where the optimal values came from. If $c[i, w] = c[i - 1, w]$, then item $i$ is not part of the solution, and we continue tracing with $c[i - 1, w]$. Otherwise item $i$ is part of the solution, and we continue tracing with $c[i - 1, w - w_i]$.

The above algorithm takes $\Theta(nW)$ time total:

- $\Theta(nW)$ to fill in the $c$ table: $(n + 1) \cdot (W + 1)$ entries, each requiring $\Theta(1)$ time to compute.
- $O(n)$ time to trace the solution (since it starts in row $n$ of the table and moves up one row at each step).

## 16.2-3

Suppose that in a $0$-$1$ knapsack problem, the order of the items when sorted by increasing weight is the same as their order when sorted by decreasing value. Give an efficient algorithm to find an optimal solution to this variant of the knapsack problem, and argue that your algorithm is correct.

Suppose in an optimal solution we take an item with $v_1$, $w_1$, and drop an item with $v_2$, $w_2$, and $w_1 > w_2$, $v_1 < v_2$, we can substitude $1$ with $2$ and get a better solution. Therefore we should always choose the items with the greatest values.

## 16.2-4

Professor Gekko has always dreamed of inline skating across North Dakota. He plans to cross the state on highway U.S. 2, which runs from Grand Forks, on the eastern border with Minnesota, to Williston, near the western border with Montana. The professor can carry two liters of water, and he can skate $m$ miles before running out of water. (Because North Dakota is relatively flat, the professor does not have to worry about drinking water at a greater rate on uphill sections than on flat or downhill sections.) The professor will start in Grand Forks with two full liters of water. His official North Dakota state map shows all the places along U.S. 2 at which he can refill his water and the distances between these locations.

The professor's goal is to minimize the number of water stops along his route across the state. Give an efficient method by which he can determine which water stops he should make. Prove that your strategy yields an optimal solution, and give its running time.

The optimal strategy is the obvious greedy one. Starting will both bottles, Professor Gekko should go to the westernmost place that he can refill his bottles within $m$ miles of Grand Forks. Fill up there. Then go to the westernmost refilling location he can get to within $m$ miles of where he filled up, and fill up there, and so on.

Looked at another way, at each refilling location, Professor Gekko should check whether he can make it to the next refilling location without stopping at this one. If he can, skip this one. If he cannot, then fill up. Professor Gekko doesn't need to know how much water he has or how fat the next refilling location is to implement this approach, since at each fillup, he can determine which is the next location at which he'll need to stop.

This problem has optimal substructure. Suppose there are $n$ possible refilling locations. Consider an optimal solution with $s$ refilling locations and whose first stop is at the $k$th refilling location. Then the rest of the optimal solution must be an optimal solution to the subproblem of the remaining $n - k$ locations. Otherwise, if there were a better solution to the subproblem, i.e., one with fewer than $s - 1$ stops, we could use it to come up with a solution with fewer than $s$ stops for the full problem, contradicting our supposition of optimality.

This problem also has the greedy-choice property. Suppose there are $k$ refilling locations beyond the start that are within $m$ miles of the start. The greedy solution chooses the $k$th location as its first stop. No location beyond the $k$th works as a first stop, since Professor Gekko runs out of gas first. If a solution chooses a location $j < k$ as its first stop, then Professor Gekko could choose the $k$th location instead, having at least as much water when he leaves the $k$th location as if he'd chosen the $j$th location. Therefore, he would get at least as far without filling up again if he had chosen the $k$th location.

If there are $n$ refilling locations on the map, Professor Gekko needs to inspect each one just once. The running time is $O(n)$.

## 16.2-5

Describe an efficient algorithm that, given a set $\{x_1, x_2, \ldots, x_n\}$ of points on the real line, determines the smallest set of unit-length closed intervals that contains all of the given points. Argue that your algorithm is correct.

Consider the leftmost interval. It will do no good if it extends any further left than the leftmost point, however, we know that it must contain the leftmost point. So, we know that it's left hand side is exactly the leftmost point.

So, we just remove any point that is within a unit distance of the left most point since they are contained in this single interval. Then, we just repeat until all points are covered. Since at each step there is a clearly optimal choice for where to put the leftmost interval, this final solution is optimal.

## 16.2-6 $\star$

Show how to solve the fractional knapsack problem in $O(n)$ time.

Use a linear-time median algorithm to calculate the median m of the $v_i / w_i$ ratios. Next, partition the items into three sets: $G = \{i : v_i / w_i > m\}$, $E = \{i : v_i / w_i = m\}$, and $L = \{i : v_i / w_i < m\}$; this step takes linear time. Compute $W_G = \sum_{i \in G} w_i$ and $W_E = \sum_{i \in E} w_i$, the total weight of the items in sets $G$ and $E$, respectively.

- If $W_G > W$, then do not yet take any items in set $G$, and instead recurse on the set of items $G$ and knapsack capacity $W$.
- Otherwise $(W_G \le W)$, take all items in set $G$, and take as much of the items in set $E$ as will fit in the remaining capacity $W - W_G$.
- If $W_G + W_E \ge W$ (i.e., there is no capacity left after taking all the items in set $G$ and all the items in set E that fit in the remaining capacity $W - W_G$), then we are done.
- Otherwise $(W_G + W_E < W)$, then after taking all the items in sets $G$ and $E$, recurse on the set of items $L$ and knapsack capacity $W - W_G - W_E$.

To analyze this algorithm, note that each recursive call takes linear time, exclusive of the time for a recursive call that it may make. When there is a recursive call, there is just one, and it's for a problem of at most half the size. Thus, the running time is given by the recurrence $T(n) \le T (n / 2) + \Theta(n)$, whose solution is $T(n) = O(n)$.

## 16.2-7

Suppose you are given two sets $A$ and $B$, each containing $n$ positive integers. You can choose to reorder each set however you like. After reordering, let $a_i$ be the $i$th element of set $A$, and let $b_i$ be the $i$th element of set $B$. You then receive a payoff of $\prod_{i = 1}^n a_i^{b_i}$. Give an algorithm that will maximize your payoff. Prove that your algorithm maximizes the payoff, and state its running time.

Sort $A$ and $B$ into monotonically decreasing order.

Here's a proof that this method yields an optimal solution. Consider any indices $i$ and $j$ such that $i < j$, and consider the terms $a_i^{b_i}$ and $a_j^{b_j}$. We want to show that it is no worse to include these terms in the payoff than to include $a_i^{b_j}$ and $a_j^{b_i}$, i.e., that $a_i^{b_i} a_j^{b_j} \ge a_i^{b_j} a_j^{b_i}$. Since $A$ and $B$ are sorted into monotonically decreasing order and $i < j$, we have $a_i \ge a_j$ and $b_i \ge b_j$. Since $a_i$ and $a_j$ are positive and $b_i - b_j$ is nonnegative, we have $a_i^{b_i - b_j} \ge a_j^{b_i - b_j}$. Multiplying both sides by $a_i^{b_j} a_j^{b_j}$ yields $a_i^{b_i} a_j^{b_j} \ge a_i^{b_j} a_j^{b_i}$.

Since the order of multiplication doesn't matter, sorting $A$ and $B$ into monotonically increasing order works as well.