# 22.4 Topological sort

## 22.4-1

Show the ordering of vertices produced by $\text{TOPOLOGICAL-SORT}$ when it is run on the dag of Figure 22.8, under the assumption of Exercise 22.3-2.

Our start and finish times from performing the $\text{DFS}$ are

$$ \begin{array}{ccc} \text{label} & d & f \\ \hline m & 1 & 20 \\ q & 2 & 5 \\ t & 3 & 4 \\ r & 6 & 19 \\ u & 7 & 8 \\ y & 9 & 18 \\ v & 10 & 17 \\ w & 11 & 14 \\ z & 12 & 13 \\ x & 15 & 16 \\ n & 21 & 26 \\ o & 22 & 25 \\ s & 23 & 24 \\ p & 27 & 28 \end{array} $$

And so, by reading off the entries in decreasing order of finish time, we have the sequence $p, n, o, s, m, r, y, v, x, w, z, u, q, t$.

## 22.4-2

Give a linear-time algorithm that takes as input a directed acyclic graph $G = (V, E)$ and two vertices $s$ and $t$, and returns the number of simple paths from $s$ to $t$ in $G$. For example, the directed acyclic graph of Figure 22.8 contains exactly four simple paths from vertex $p$ to vertex $v: pov$, $poryv$, $posryv$, and $psryv$. (Your algorithm needs only to count the simple paths, not list them.)

The algorithm works as follows. The attribute $u.paths$ of node $u$ tells the number of simple paths from $u$ to $v$, where we assume that $v$ is fixed throughout the entire process. First of all, a topo sort should be conducted and list the vertex between $u$, $v$ as $\{v[1], v[2], \dots, v[k - 1]\}$. To count the number of paths, we should construct a solution from $v$ to $u$. Let's call $u$ as $v[0]$ and $v$ as $v[k]$, to avoid overlapping subproblem, the number of paths between $v_k$ and $u$ should be remembered and used as $k$ decrease to $0$. Only in this way can we solve the problem in $\Theta(V + E)$.

An bottom-up iterative version is possible only if the graph uses adjacency matrix so whether $v$ is adjacency to $u$ can be determined in $O(1)$ time. But building a adjacency matrix would cost $\Theta(|V|^2)$, so never mind.

1 2 3 4 5 6 7 8 9 | SIMPLE-PATHS(G, u, v) TOPO-SORT(G) let {v[1], v[2]..v[k - 1]} be the vertex between u and v v[0] = u v[k] = v for j = 0 to k - 1 DP[j] = ∞ DP[k] = 1 return SIMPLE-PATHS-AID(G, DP, 0) |

1 2 3 4 5 6 7 8 9 10 | SIMPLE-PATHS-AID(G, DP, i) if i > k return 0 else if DP[i] != ∞ return DP[i] else DP[i] = 0 for v[m] in G.adj[v[i]] and 0 < m ≤ k DP[i] += SIMPLE-PATHS-AID(G, DP, m) return DP[i] |

## 22.4-3

Give an algorithm that determines whether or not a given undirected graph $G = (V, E)$ contains a cycle. Your algorithm should run in $O(V)$ time, independent of $|E|$.

An undirected graph is acyclic (i.e., a forest) if and only if a $\text{DFS}$ yields no back edges.

- If there's a back edge, there's a cycle.
- If there's no back edge, then by Theorem 22.10, there are only tree edges. Hence, the graph is acyclic.

Thus, we can run $\text{DFS}$: if we find a back edge, there's a cycle.

- Time: $O(V)$. (Not $O(V + E)$!)

If we ever see $|V|$ distinct edges, we must have seen a back edge because (by Theorem B.2 on p. 1174) in an acyclic (undirected) forest, $|E| \le |V| - 1$.

## 22.4-4

Prove or disprove: If a directed graph $G$ contains cycles, then $\text{TOPOLOGICAL-SORT}(G)$ produces a vertex ordering that minimizes the number of "bad" edges that are inconsistent with the ordering produced.

This is not true. Consider the graph $G$ consisting of vertices $a, b, c$, and $d$. Let the edges be $(a, b)$, $(b, c)$, $(a, d)$, $(d, c)$, and $(c, a)$. Suppose that we start the $\text{DFS}$ of $\text{TOPOLOGICAL-SORT}$ at vertex $c$. Assuming that $b$ appears before $d$ in the adjacency list of $a$, the order, from latest to earliest, of finish times is $c, a, d, b$.

The "bad" edges in this case are $(b, c)$ and $(d, c)$. However, if we had instead ordered them by $a, b, d, c$ then the only bad edges would be $(c, a)$. Thus $\text{TOPOLOGICAL-SORT}$ doesn't always minimizes the number of "bad" edges

## 22.4-5

Another way to perform topological sorting on a directed acyclic graph $G = (V, E)$ is to repeatedly find a vertex of $\text{in-degree}$ $0$, output it, and remove it and all of its outgoing edges from the graph. Explain how to implement this idea so that it runs in time $O(V + E)$. What happens to this algorithm if $G$ has cycles?

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | TOPOLOGICAL-SORT(G) // Initialize in-degree, Θ(V) time. for each vertex u ∈ G.V u.in-degree = 0 // Compute in-degree, Θ(V + E) time. for each vertex u ∈ G.V for each v ∈ G.Adj[u] v.in-degree = v.in-degree + 1 // Initialize Queue, Θ(V) time. Q = Ø for each vertex u ∈ G.V if u.in-degree == 0 ENQUEUE(Q, u) // while loop takes O(V + E) time. while Q != Ø u = DEQUEUE(Q) output u // for loop executes O(E) times total. for each v ∈ G.Adj[u] v.in-degree = v.in-degree - 1 if v.in-degree == 0 ENQUEUE(Q, v) // Check for cycles, O(V) time. for each vertex u ∈ G.V if u.in-degree != 0 report that there's a cycle // Another way to check for cycles would be to count the vertices // that are output and report a cycle if that number is < |V|. |

To find and output vertices of $\text{in-degree}$ $0$, we first compute all vertices' $\text{in-degree}$s by making a pass through all the edges (by scanning the adjacency lists of all the vertices) and incrementing the $\text{in-degree}$ of each vertex an edge enters.

- Computing all $\text{in-degree}$s takes $\Theta(V + E)$ time ($|V|$ adjacency lists accessed, $|E|$ edges total found in those lists, $\Theta(1)$ work for each edge).

We keep the vertices with $\text{in-degree}$ $0$ in a FIFO queue, so that they can be enqueued and dequeued in $O(1)$ time. (The order in which vertices in the queue are processed doesn't matter, so any kind of FIFO queue works.)

- Initializing the queue takes one pass over the vertices doing $\Theta(1)$ work, for total time $\Theta(V)$.

As we process each vertex from the queue, we effectively remove its outgoing edges from the graph by decrementing the $\text{in-degree}$ of each vertex one of those edges enters, and we enqueue any vertex whose $\text{in-degree}$ goes to $0$. We do not need to actually remove the edges from the adjacency list, because that adjacency list will never be processed again by the algorithm: Each vertex is enqueued/dequeued at most once because it is enqueued only if it starts out with $\text{in-degree}$ $0$ or if its indegree becomes $0$ after being decremented (and never incremented) some number of times.

- The processing of a vertex from the queue happens $O(V)$ times because no vertex can be enqueued more than once. The per-vertex work (dequeue and output) takes $O(1)$ time, for a total of $O(V)$ time.
- Because the adjacency list of each vertex is scanned only when the vertex is dequeued, the adjacency list of each vertex is scanned at most once. Since the sum of the lengths of all the adjacency lists is $\Theta(E)$, at most $O(E)$ time is spent in total scanning adjacency lists. For each edge in an adjacency list, $\Theta(1)$ work is done, for a total of $O(E)$ time.

Thus the total time taken by the algorithm is $O(V + E)$.

The algorithm outputs vertices in the right order ($u$ before $v$ for every edge $(u, v)$) because vwill not be output until its $\text{in-degree}$ becomes $0$, which happens only when every edge $(u, v)$ leading into $v$ has been "removed" due to the processing (including output) of $u$.

If there are no cycles, all vertices are output.

- Proof: Assume that some vertex $v_0$ is not output. Vertex $v_0$ cannot start out with $\text{in-degree}$ $0$ (or it would be output), so there are edges into $v_0$. Since $v_0$'s $\text{in-degree}$ never becomes $0$, at least one edge $(v_1, v_0)$ is never removed, which means that at least one other vertex $v_1$ was not output. Similarly, $v_1$ not output means that some vertex $v_2$ such that $(v_2, v_1) \in E$ was not output, and so on. Since the number of vertices is finite, this path ($\cdots \to v_2 \to v_1 \to v_0$) is finite, so we must have $v_i = v_j$ for some $i$ and $j$ in this sequence, which means there is a cycle.

If there are cycles, not all vertices will be output, because some $\text{in-degree}$s never become $0$.

- Proof: Assume that a vertex in a cycle is output (its $\text{in-degree}$ becomes $0$). Let $v$ be the first vertex in its cycle to be output, and let $u$ be $v$'s predecessor in the cycle. In order for $v$'s $\text{in-degree}$ to become $0$, the edge $(u, v)$ must have been "removed", which happens only when $u$ is processed. But this cannot have happened, because $v$ is the first vertex in its cycle to be processed. Thus no vertices in cycles are output.