27.2 Multithreaded matrix multiplication

27.2-1

Draw the computation dag for computing $\text{P-SQUARE-MATRIX-MULTIPLY}$ on $2 \times 2$ matrices, labeling how the vertices in your diagram correspond to strands in the execution of the algorithm. Use the convention that spawn and call edges point downward, continuation edges point horizontally to the right, and return edges point upward. Assuming that each strand takes unit time, analyze the work, span, and parallelism of this computation.

(Omit!)

27.2-2

Repeat Exercise 27.2-1 for $\text{P-MATRIX-MULTIPLY-RECURSIVE}$.

(Omit!)

27.2-3

Give pseudocode for a multithreaded algorithm that multiplies two $n \times n$ matrices with work $\Theta(n^3)$ but span only $\Theta(\lg n)$. Analyze your algorithm.

(Removed)

27.2-4

Give pseudocode for an efficient multithreaded algorithm that multiplies a $p \times q$ matrix by a $q \times r$ matrix. Your algorithm should be highly parallel even if any of $p$, $q$, and $r$ are $1$. Analyze your algorithm.

(Removed)

27.2-5

Give pseudocode for an efficient multithreaded algorithm that transposes an $n \times n$ matrix in place by using divide-and-conquer to divide the matrix recursively into four $n / 2 \times n / 2$ submatrices. Analyze your algorithm.

P-MATRIX-TRANSPOSE(A)
    n = A.rows
    if n == 1
        return
    partition A into n / 2 ✕ n / 2 submatrices A11, A12, A21, A22
    spawn P-MATRIX-TRANSPOSE(A11)
    spawn P-MATRIX-TRANSPOSE(A12)
    spawn P-MATRIX-TRANSPOSE(A21)
    P-MATRIX-TRANSPOSE(A22)
    sync
    // exchange A12 with A21
    parallel for i = 1 to n / 2
        parallel for j = 1 + n / 2 to n
            exchange A[i, j] with A[i + n / 2, j - n / 2]

span: $T(n) = T(n / 2) + O(\lg n) = O(\lg^2 n)$.
work: $T(n) = 4T(n / 2) + O(n^2) = O(n^2\lg n)$.

27.2-6

Give pseudocode for an efficient multithreaded implementation of the Floyd-Warshall algorithm (see Section 25.2), which computes shortest paths between all pairs of vertices in an edge-weighted graph. Analyze your algorithm.

(Removed)