# 22.5 Strongly connected components

## 22.5-1

How can the number of strongly connected components of a graph change if a new edge is added?

It can either stay the same or decrease. To see that it is possible to stay the same, just suppose you add some edge to a cycle. To see that it is possible to decrease, suppose that your original graph is on three vertices, and is just a path passing through all of them, and the edge added completes this path to a cycle. To see that it cannot increase, notice that adding an edge cannot remove any path that existed before.

So, if $u$ and $v$ are in the same connected component in the original graph, then there are a path from one to the other, in both directions. Adding an edge wont disturb these two paths, so we know that $u$ and $v$ will still be in the same $\text{SCC}$ in the graph after adding the edge. Since no components can be split apart, this means that the number of them cannot increase since they form a partition of the set of vertices.

## 22.5-2

Show how the procedure $\text{STRONGLY-CONNECTED-COMPONENTS}$ works on the graph of Figure 22.6. Specifically, show the finishing times computed in line 1 and the forest produced in line 3. Assume that the loop of lines 5–7 of $\text{DFS}$ considers vertices in alphabetical order and that the adjacency lists are in alphabetical order.

The finishing times of each vertex were computed in exercise 22.3-2. The forest consists of 5 trees, each of which is a chain. We'll list the vertices of each tree in order from root to leaf: $r$, $u$, $q - y - t$, $x - z$, and $s - w - v$.

## 22.5-3

Professor Bacon claims that the algorithm for strongly connected components would be simpler if it used the original (instead of the transpose) graph in the second depth-first search and scanned the vertices in order of increasing finishing times. Does this simpler algorithm always produce correct results?

Professor Bacon's suggestion doesn't work out. As an example, suppose that our graph is on the three vertices $\{1, 2, 3\}$ and consists of the edges $(2, 1), (2, 3), (3, 2)$. Then, we should end up with $\{2, 3\}$ and $\{1\}$ as our $\text{SCC}$'s. However, a possible $\text{DFS}$ starting at $2$ could explore $3$ before $1$, this would mean that the finish time of $3$ is lower than of $1$ and $2$. This means that when we first perform the $\text{DFS}$ starting at $3$. However, a $\text{DFS}$ starting at $3$ will be able to reach all other vertices. This means that the algorithm would return that the entire graph is a single $\text{SCC}$, even though this is clearly not the case since there is neither a path from $1$ to $2$ of from $1$ to $3$.

## 22.5-4

Prove that for any directed graph $G$, we have $((G^\text T)^{\text{SCC}})^\text T = G^{\text{SCC}}$. That is, the transpose of the component graph of $G^\text T$ is the same as the component graph of $G$.

First observe that $C$ is a strongly connected component of $G$ if and only if it is a strongly connected component of $G^\text T$. Thus the vertex sets of $G^{\text{SCC}}$ and $(G^\text T)^{\text{SCC}}$ are the same, which implies the vertex sets of $((G^\text T)^\text{SCC})^\text T$ and $G^{\text{SCC}}$ are the same. It suffices to show that their edge sets are the same. Suppose $(v_i, v_j)$ is an edge in $((G^\text T)^{\text{SCC}})^\text T$. Then $(v_j, v_i)$ is an edge in $(G^\text T)^{\text{SCC}}$. Thus there exist $x \in C_j$ and $y \in C_i$ such that $(x, y)$ is an edge of $G^\text T$, which implies $(y, x)$ is an edge of $G$. Since components are preserved, this means that $(v_i, v_j)$ is an edge in $G^{\text{SCC}}$. For the opposite implication we simply note that for any graph $G$ we have $(G^\text T)^{\text T} = G$.

## 22.5-5

Give an $O(V + E)$-time algorithm to compute the component graph of a directed graph $G = (V, E)$. Make sure that there is at most one edge between two vertices in the component graph your algorithm produces.

We have at our disposal an $O(V + E)$-time algorithm that computes strongly connected components. Let us assume that the output of this algorithm is a mapping $u.scc$, giving the number of the strongly connected component containing vertex $u$, for each vertex $u$. Without loss of generality, assume that $u.scc$ is an integer in the set $\{1, 2, \ldots, |V|\}$.

Construct the multiset (a set that can contain the same object more than once) $T = \{u.scc: u \in V\}$, and sort it by using counting sort. Since the values we are sorting are integers in the range $1$ to $|V|$, the time to sort is $O(V)$. Go through the sorted multiset $T$ and every time we find an element $x$ that is distinct from the one before it, add $x$ to $V^{\text{SCC}}$. (Consider the first element of the sorted set as "distinct from the one before it.") It takes $O(V)$ time to construct $V^{\text{SCC}}$.

Construct the set of ordered pairs

$$\text{(x, y): there is an edge (u, v) \in E, x = u.scc, and y = v.scc}.$$

We can easily construct this set in $\Theta(E)$ time by going through all edges in $E$ and looking up $u.scc$ and $v.scc$ for each edge $(u, v) \in E$.

Having constructed $S$, remove all elements of the form $(x, x)$. Alternatively, when we construct $S$, do not put an element in $S$ when we find an edge $(u, v)$ for which $u.scc = v.scc$. $S$ now has at most $|E|$ elements.

Now sort the elements of $S$ using radix sort. Sort on one component at a time. The order does not matter. In other words, we are performing two passes of counting sort. The time to do so is $O(V + E)$, since the values we are sorting on are integers in the range $1$ to $|V|$.

Finally, go through the sorted set $S$, and every time we find an element $(x, y)$ that is distinct from the element before it (again considering the first element of the sorted set as distinct from the one before it), add $(x, y)$ to $E^{\text{SCC}}$. Sorting and then adding $(x, y)$ only if it is distinct from the element before it ensures that we add $(x, y)$ at most once. It takes $O(E)$ time to go through $S$ in this way, once $S$ has been sorted.

The total time is $O(V + E)$.

## 22.5-6

Given a directed graph $G = (V, E)$, explain how to create another graph $G' = (V, E')$ such that (a) $G'$ has the same strongly connected components as $G$, (b) $G'$ has the same component graph as $G$, and (c) $E'$ is as small as possible. Describe a fast algorithm to compute $G'$.

The basic idea is to replace the edges within each $\text{SCC}$ by one simple, directed cycle and then remove redundant edges between $\text{SCC}$'s. Since there must be at least $k$ edges within an $\text{SCC}$ that has $k$ vertices, a single directed cycle of $k$ edges gives the $k$-vertex $\text{SCC}$ with the fewest possible edges.

The algorithm works as follows:

1. Identify all $\text{SCC}$'s of $G$. Time: $\Theta(V + E)$, using the $\text{SCC}$ algorithm in Section 22.5.
2. Form the component graph $G^{\text{SCC}}$. Time: $O(V + E)$, by Exercise 22.5-5.
3. Start with $E' = \emptyset$. Time: $O(1)$.
4. For each $\text{SCC}$ of $G$, let the vertices in the $\text{SCC}$ be $v_1, v_2, \ldots, v_k$, and add to $E'$ the directed edges $(v_1, v_2), (v_2, v_3), \ldots, (v_{k - 1}, v_k), (v_k, v_1)$. These edges form a simple, directed cycle that includes all vertices of the $\text{SCC}$. Time for all $\text{SCC}$'s: $O(V)$.
5. For each edge $(u, v)$ in the component graph $G^{\text{SCC}}$, select any vertex $x$ in $u$'s $\text{SCC}$ and any vertex $y$ in $v$'s $\text{SCC}$, and add the directed edge $(x, y)$ to $E'$. Time: $O(E)$.

## 22.5-7

A directed graph $G = (V, E)$ is semiconnected if, for all pairs of vertices $u, v \in V$, we have $u \leadsto v$ or $v \leadsto u$. Give an efficient algorithm to determine whether or not $G$ is semiconnected. Prove that your algorithm is correct, and analyze its running time.

To determine whether $G = (V, E)$ is semiconnected, do the following:

1. Call $\text{STRONGLY-CONNECTED-COMPONENTS}$.
2. Form the component graph. (By Exercise 22.5-5, you may assume that this takes $O(V + E)$ time.)
3. Topologically sort the component graph. (Recall that it's a dag.) Assuming that $G$ contains $k$ $\text{SCC}$'s, the topological sort gives a linear ordering $\langle v_1, v_2, \ldots, v_k \rangle$ of the vertices.
4. Verify that the sequence of vertices $\langle v_1, v_2, \ldots, v_k \rangle$ given by topological sort forms a linear chain in the component graph. That is, verify that the edges $(v_1, v_2), (v_2, v_3), \ldots, (v_{k - 1}, v_k)$ exist in the component graph. If the vertices form a linear chain, then the original graph is semiconnected; otherwise it is not.

Because we know that all vertices in each $\text{SCC}$ are mutually reachable from each other, it suffices to show that the component graph is semiconnected if and only if it contains a linear chain. We must also show that if there's a linear chain in the component graph, it's the one returned by topological sort.

We'll first show that if there's a linear chain in the component graph, then it's the one returned by topological sort. In fact, this is trivial. A topological sort has to respect every edge in the graph. So if there's a linear chain, a topological sort must give us the vertices in order.

Now we'll show that the component graph is semiconnected if and only if it contains a linear chain.

First, suppose that the component graph contains a linear chain. Then for every pair of vertices $u$, $v$ in the component graph, there is a path between them. If $u$ precedes $v$ in the linear chain, then there's a path $u \leadsto v$. Otherwise, $v$ precedes $u$, and there's a path $v \leadsto u$.

Conversely, suppose that the component graph does not contain a linear chain. Then in the list returned by topological sort, there are two consecutive vertices $v_i$ and $v_{i + 1}$, but the edge$(v_i, v_{i + 1})$ is not in the component graph. Any edges out of $v_i$ are to vertices $v_j$, where $j > i + 1$, and so there is no path from $v_i$ to $v_{i + 1}$ in the component graph. And since $v_{i + 1}$ follows $v_i$ in the topological sort, there cannot be any paths at all from $v_{i + 1}$ to $v_i$. Thus, the component graph is not semiconnected.

Running time of each step:

1. $\Theta(V + E)$.
2. $O(V + E)$.
3. Since the component graph has at most $|V|$ vertices and at most $|E|$ edges, $O(V + E)$.
4. Also $O(V + E)$. We just check the adjacency list of each vertex $v_i$ in the component graph to verify that there's an edge $(v_i, v_{i + 1})$. We'll go through each adjacency list once.

Thus, the total running time is $\Theta(V + E)$.