16.1 An activity-selection problem

16.1-1

Give a dynamic-programming algorithm for the activity-selection problem, based on recurrence $\text{(16.2)}$. Have your algorithm compute the sizes $c[i, j]$ as defined above and also produce the maximum-size subset of mutually compatible activities.

Assume that the inputs have been sorted as in equation $\text{(16.1)}$. Compare the running time of your solution to the running time of $\text{GREEDY-ACTIVITY-SELECTOR}$.

DYNAMIC-ACTIVITY-SELECTOR(s, f, n)
    let c[0..n + 1, 0..n + 1] and act[0..n + 1, 0..n + 1] be new tables
    for i = 0 to n
        c[i, i] = 0
        c[i, i + 1] = 0
    c[n + 1, n + 1] = 0
    for l = 2 to n + 1
        for i = 0 to n - l + 1
            j = i + l
            c[i, j] = 0
            k = j - 1
            while f[i] < f[k]
                if f[i] ≤ s[k] and f[k] ≤ s[j] and c[i, k] + c[k, j] + 1 > c[i, j]
                    c[i, j] = c[i, k] + c[k, j] + 1
                    act[i, j] = k
                k = k - 1
    print "A maximum size set of mutually compatible activities has size" c[0, n + 1]
    print "The set contains"
    PRINT-ACTIVITIES(c, act, 0, n + 1)

PRINT-ACTIVITIES(c, act, i, j)
    if c[i, j] > 0
        k = act[i, j]
        print k
        PRINT-ACTIVITIES(c, act, i, k)
        PRINT-ACTIVITIES(c, act, k, j)

$\text{GREEDY-ACTIVITY-SELECTOR}$ runs in $\Theta(n)$ time and
$\text{DYNAMIC-ACTIVITY-SELECTOR}$ runs in $O(n^3)$ time.

16.1-2

Suppose that instead of always selecting the first activity to finish, we instead select the last activity to start that is compatible with all previously selected activities. Describe how this approach is a greedy algorithm, and prove that it yields an optimal solution.

This becomes exactly the same as the original problem if we imagine time running in reverse, so it produces an optimal solution for essentially the same reasons. It is greedy because we make the best looking choice at each step.

16.1-3

Not just any greedy approach to the activity-selection problem produces a maximum-size set of mutually compatible activities. Give an example to show that the approach of selecting the activity of least duration from among those that are compatible with previously selected activities does not work. Do the same for the approaches of always selecting the compatible activity that overlaps the fewest other remaining activities and always selecting the compatible remaining activity with the earliest start time.

As a counterexample to the optimality of greedily selecting the shortest, suppose our activity times are $\{(1, 9), (8, 11), (10, 20)\}$ then, picking the shortest first, we have to eliminate the other two, where if we picked the other two instead, we would have two tasks not one.

As a counterexample to the optimality of greedily selecting the task that conflicts with the fewest remaining activities, suppose the activity times are $\{(−1, 1), (2, 5), (0, 3), (0, 3), (0, 3), (4, 7), (6, 9), (8, 11), (8, 11), (8, 11), (10, 12)\}$. Then, by this greedy strategy, we would first pick $(4, 7)$ since it only has a two conflicts. However, doing so would mean that we would not be able to pick the only optimal solution of $(−1, 1)$, $(2, 5)$, $(6, 9)$, $(10, 12)$.

As a counterexample to the optimality of greedily selecting the earliest start times, suppose our activity times are $\{(1, 10), (2, 3), (4, 5)\}$. If we pick the earliest start time, we will only have a single activity, $(1, 10)$, whereas the optimal solution would be to pick the two other activities.

16.1-4

Suppose that we have a set of activities to schedule among a large number of lecture halls, where any activity can take place in any lecture hall. We wish to schedule all the activities using as few lecture halls as possible. Give an efficient greedy algorithm to determine which activity should use which lecture hall.

(This problem is also known as the interval-graph coloring problem. We can create an interval graph whose vertices are the given activities and whose edges connect incompatible activities. The smallest number of colors required to color every vertex so that no two adjacent vertices have the same color corresponds to finding the fewest lecture halls needed to schedule all of the given activities.)

Maintain a set of free (but already used) lecture halls $F$ and currently busy lecture halls $B$. Sort the classes by start time. For each new start time which you encounter, remove a lecture hall from $F$, schedule the class in that room, and add the lecture hall to $B$. If $F$ is empty, add a new, unused lecture hall to $F$. When a class finishes, remove its lecture hall from $B$ and add it to $F$. This is optimal for following reason, suppose we have just started using the mth lecture hall for the first time. This only happens when ever classroom ever used before is in $B$. But this means that there are $m$ classes occurring simultaneously, so it is necessary to have $m$ distinct lecture halls in use.

16.1-5

Consider a modification to the activity-selection problem in which each activity $a_i$ has, in addition to a start and finish time, a value $v_i$. The objective is no longer to maximize the number of activities scheduled, but instead to maximize the total value of the activities scheduled. That is, we wish to choose a set $A$ of compatible activities such that $\sum_{a_k \in A} v_k$ is maximized. Give a polynomial-time algorithm for this problem.

Easy and straightforward solution is to run a dynamic programming solution based on the equation $\text{(16.2)}$ where the second case has "1" replaced with "$v_k$". Since the subproblems are still indexed by a pair of activities, and each calculation requires taking the minimum over some set of size $\le |S_{ij}| \in O(n)$. The total runtime is bounded by $O(n^3)$.

However, if we are cunning a little, we can be more efficient and give the algorithm which runs in $O(n\log n)$.

INPUT: $n$ activities with values.

IDEA OF ALGORITHM:

Sort input vector of activities according their finish times in ascending order. Let us denote the activities in this sorted vector by $(a_0, a_1, \dots, a_{n - 1})$.
For each $0 \le i < n$ construct partial solution $S_i$. By a partial solution $S_i$, we mean a solution to the problem but considering only activities with indexes lower or equal to $i$. Remember value of each partial solution.
Clearly $S_0 = \{a_0\}$.
We can construct $S_{i + 1}$ as follows. Possible values of $S_{i + 1}$ is either $S_i$ or the solution obtained by joining the activity $a_{i + 1}$ with partial solution $S_j$ where $j < i + 1$ is the index of activity such that $a_j$ is compatible with $a_{i + 1}$ but $a_{j + 1}$ is not compatible with $a_{i + 1}$. Pick the one of these two possible solutions, which has greater value. Ties can be resolved arbitrarily.
Therefore we can construct partial solutions in order $S_0, S_1, \dots, S_{n - 1}$ using (3) for $S_0$ and (4) for all the others.
Give $S_{n - 1}$ as the solution for problem.

ANALYSIS OF TIME COMPLEXITY:

Sorting of activities can be done in $O(n\log n)$ time.
Finding the value of $S_0$ is in $O(1)$.
Any $S_{i + 1}$ can be found in $O(\log n)$. It is thanks to the fact that we have properly sorted activities. Therefore we can for each $i + 1$ find the proper $j$ in $O(\log n)$ using the binary search. If we have the proper $j$, the rest can be done in $O(1)$.
Therefore, we have $O(n\log n)$ time for constructing of all $S_i$'s.

IMPLEMENTATION DETAILS:

Use the dynamic programming.
It is important not to remember too much for each $S_i$. Do not construct $S_i$'s directly (you can end up in $\Omega(n^2)$ time if you do so). For each $S_i$ it is sufficient to remember:
- it's value
- whether or not it includes the activity $a_i$
- the value of $j$ (from (4)).
Using these information obtained by the run of described algorithm you can reconstruct the solution in $O(n)$ time, which does not violate final time complexity.

PROOF OF CORRECTNESS: (sketched)

Clearly, $S_0 = \{a_0\}.$
For $S_{i + 1}$ we argue by (4). Partial solution $S_{i + 1}$ either includes the activity $a_{i + 1}$ or doesn't include it, there is no third way.
- If it does not include $a_{i + 1}$, then clearly $S_{i + 1} = S_i$.
- If it includes $a_{i + 1}$, then $S_{i + 1}$ consists of $a_{i + 1}$ and partial solution which uses all activities compatible with $a_{i + 1}$ with indexes lower than $i + 1$. Since activities are sorted according their finish times, activities with indexes $j$ and lower are compatible and activities with index $j + 1$ and higher up to $i + 1$ are not compatible. We do not consider all the other activities for $S_{i + 1}$. Therefore setting $S_{i + 1} = \{a_{i + 1}\} \cup S_j$ gives correct answer in this case. The fact that we need $S_j$ and not some other solution for activities with indexes up to $j$ can be easily shown by the standard cut-and-paste argument.
Since for $S_{n - 1}$ we consider all of the activities, it is actually the solution of the problem.