26-3 Algorithmic consulting

Professor Gore wants to open up an algorithmic consulting company. He has identified n important subareas of algorithms (roughly corresponding to different portions of this textbook), which he represents by the set $A = \{A_1, A_2, \ldots, A_n\}$. In each subarea $A_k$, he can hire an expert in that area for $c_k$ dollars. The consulting company has lined up a set $J = \{J_1, J_2, \ldots, J_m\}$ of potential jobs. In order to perform job $J_i$, the company needs to have hired experts in a subset $R_i \subseteq A$ of subareas. Each expert can work on multiple jobs simultaneously. If the company chooses to accept job $J_i$, it must have hired experts in all subareas in $R_i$, and it will take in revenue of $p_i$ dollars.

Professor Gore's job is to determine which subareas to hire experts in and which jobs to accept in order to maximize the net revenue, which is the total income from jobs accepted minus the total cost of employing the experts.

Consider the following flow network $G$. It contains a source vertex $s$, vertices $A_1, A_2, \ldots, A_n$, vertices $J_1, J_2, \ldots, J_m$, and a sink vertex $t$. For $k = 1, 2, \ldots, n$, the flow network contains an edge $(s, A_k)$ with capacity $c(s, A_k) = c_k$, and for $i = 1, 2, \ldots, m$, the flow network contains an edge $(J_i, t)$ with capacity $c(J_i, t) = p_i$. For $k = 1, 2, \ldots, n$ and $i = 1, 2, \ldots, m$, if $A_k \in R_i$, then $G$ contains an edge $(A_k, J_i)$ with capacity $c(A_k, J_i) = \infty$.

a. Show that if $J_i \in T$ for a finite-capacity cut $(S, T)$ of $G$, then $A_k \in T$ for each $A_k \in R_i$.

b. Show how to determine the maximum net revenue from the capacity of a minimum cut of $G$ and the given $p_i$ values.

c. Give an efficient algorithm to determine which jobs to accept and which experts to hire. Analyze the running time of your algorithm in terms of $m$, $n$, and $r = \sum_{i = 1}^m |R_i|$.

a. Assume for the sake of contradiction that $A_k \notin T$ for some $A_k \in R_i$. Since $A_k \notin T$, we must have $A_k \in S$. On the other hand, we have $J_i \in T$. Thus, the edge $(A_k, J_i)$ crosses the cut $(S, T)$. But $c(A_k, J_i) = \infty$ by construction, which contradicts the assumption that $(S, T)$ is a finite-capacity cut.

b. Let us define a project-plan as a set of jobs to accept and experts to hire. Let $P$ be a project-plan. We assume that $P$ has two attributes. The attribute $P.J$ denotes the set of accepted jobs, and $P.A$ denotes the set of hired experts.

A valid project-plan is one in which we have hired all experts that are required by the accepted jobs. Specifically, let $P$ be a valid project plan. If $J_i \in P.J$, then $A_k \in P.A$ for each $A_k \in R_i$. Note that Professor Gore might decide to hire more experts than those that are actually required.

We define the revenue of a project-plan as the total profit from the accepted jobs minus the total cost of the hired experts. The problem asks us to find a valid project plan with maximum revenue.

We start by proving the following lemma, which establishes the relationship between the capacity of a cut in flow network $G$ and the revenue of a valid project-plan.

Lemma (Min-cut max-revenue)

There exists a finite-capacity cut $(S, T)$ of $G$ with capacity $c(S, T)$ if and only if there exists a valid project-plan with net revenue $(\sum_{J_i \in J} p_i) - c(S, T)$.

Proof

Let $(S, T)$ be a finite-capacity cut of $G$ with capacity $c(S, T)$. We prove one direction of the lemma by constructing the required project-plan.

Construct the project-plan $P$ by including $J_i$ in $P.J$ if and only if $J_i \in T$ and including $A_k$ in $P.A$ if and only if $A_k \in T$. From part (a), $P$ is a valid project-plan, since, for every $J_i \in P.J$, we have $A_k \in P.A$ for each $A_k \in R_i$.

Since the capacity of the cut is finite, there cannot be any edges of the form $(A_k, J_i)$ crossing the cut, where $A_k \in S$ and $J_i \in T$. All edges going from a vertex in $S$ to a vertex in $T$ must be either of the form $(s, A_k)$ or of the form $(J_i, t)$. Let $E_A$ be the set of edges of the form $(s, A_k)$ that cross the cut, and let $E_J$ be the set of edges of the form $(J_i, t)$ that cross the cut, so that

$$c(S, T) = \sum_{(s, A_k) \in E_A} c(s, A_k) + \sum_{(J_i, j) \in E_J} c(J_i, t).$$

Consider edges of the form $(s, A_k)$. We have

$$ \begin{aligned} (s, A_k) \in E_A & \text{ if and only if $A_k \in T$} \\ & \text{ if and only if $A_k \in P.A$}. \end{aligned} $$

By construction, $c(s, A_k) = c_k$. Taking summations over $E_A$ and over $P.A$, we obtain

$$\sum_{(s, A_k) \in E_A} c(s, A_k) = \sum_{A_k \in P.A} c_k.$$

Similarly, consider edges of the form $(J_i, t)$. We have

$$ \begin{aligned} (J_i, t) \in E_J & \text{ if and only if $J_i \in S$} \\ & \text{ if and only if $J_i \notin T$} \\ & \text{ if and only if $J_i \notin P.J$}. \end{aligned} $$

By construction, $c(J_i, t) = p_i$. Taking summations over $E_J$ and over $P.J$, we obtain

$$\sum_{(J_i, t) \in E_J} c(J_i, t) = \sum_{J_i \notin P.J} p_i.$$

Let $v$ be the net revenue of $P$. Then, we have

$$ \begin{aligned} v & = \sum_{J_i \in P.J} p_i - \sum_{A_k \in P.A} c_k \\ & = \Bigg( \sum_{J_i \in J} p_i - \sum_{J_i \notin P.J} p_i \Bigg) - \sum_{A_k \in P.A} c_k \\ & = \sum_{J_i \in J} p_i - \Bigg( \sum_{J_i \notin P.J} p_i + \sum_{A_k \in P.A} c_k \Bigg) \\ & = \sum_{J_i \in J} p_i - \Bigg( \sum_{(J_i, t) \in E_J} c(J_i, t) + \sum_{(s, A_k) \in E_A} c(s, A_k) \Bigg) \\ & = \Bigg( \sum_{J_i \in J} p_i \Bigg) - c(S, T). \end{aligned} $$

Now, we prove the other direction of the lemma by constructing the required cut from a valid project-plan.

Construct the cut $(S, T)$ as follows. For every $J_i \in P.J$, let $J_i \in T$. For every $A_k \in P.A$, let $A_k \in T$.

First, we prove that the cut $(S, T)$ is a finite-capacity cut. Since edges of the form $(A_k, J_i)$ are the only infinite-capacity edges, it suffices to prove that there are no edges $(A_k, J_i)$ such that $A_k \in S$ and $J_i \in T$.

For the purpose of contradiction, assume there is an edge $(A_k, J_i)$ such that $A_k \in S$ and $J_i \in T$. By our constuction, we must have $J_i \in P.J$ and $A_k \notin P.A$. But since the edge $(A_k, J_i)$ exists, we have $A_k \in R_i$. Since $P$ is a valid project-plan, we derive the contradiction that $A_k$ must have been in $P.A$.

From here on, the analysis is the same as the previous direction. In particular, the last equation from the previous analysis holds: the net revenue $v$ equals $(\sum_{J_i \in J} p_i) - c(S, T)$.

We conclude that the problem of finding a maximum-revenue project-plan reduces to the problem of finding a minimum cut in $G$. Let $(S, T)$ be a minimum cut. From the lemma, the maximum net revenue is given by

$$\Bigg( \sum_{j_i \in J} p_i \Bigg) - c(S, T).$$

c. Construct the flow network $G$ as shown in the problem statement. Obtain a minimum cut $(S, T)$ by running any of the maximum-flow algorithms (say, Edmonds-Karp). Construct the project plan $P$ as follows: add $J_i$ to $P.J$ if and only if $J_i \in T$. Add $A_k$ to $P.A$ if and only if $A_k \in T$.

First, we note that the number of vertices in $G$ is $|V| = m + n + 2$, and the number of edges in $G$ is $|E| = r + m + n$. Constructing $G$ and recovering the project-plan from the minimum cut are clearly linear-time operations. The running time of our algorithm is thus asymptotically the same as the running time of the algorithm used to find the minimum cut. If we use Edmonds-Karp to find the minimum cut, the running time is $O(VE^2)$.