# CHAPTER 26: ALL-PAIRS SHORTEST PATHS

We can solve an all-pairs shortest-paths problem by running a single-source shortest-paths algorithm |V| times, once for each vertex as the source. If all edge weights are nonnegative, we can use Dijkstra's algorithm. If we use the linear-array implementation of the priority queue, the running time is O(V³ + VE) = O(V³). The binary-heap implementation of the priority queue yields a running time of O(VE lg V), which is an improvement if the graph is sparse. Alternatively, we can implement the priority queue with a Fibonacci heap, yielding a running time of O(V² lg V + VE).
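To make this strategy concrete, the following Python sketch (ours, not the text's) implements the binary-heap variant: Dijkstra's algorithm using the standard heapq module, run once from each vertex. The representation of the graph as a dict mapping each vertex to a list of (neighbor, weight) pairs is an assumption of the sketch.

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest-path weights for nonnegative edge weights.
    adj maps every vertex to a list of (neighbor, weight) pairs."""
    dist = {u: float('inf') for u in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry; heapq has no DECREASE-KEY, so we re-push instead
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def all_pairs_dijkstra(adj):
    """Run Dijkstra once per source: O(V E lg V) with a binary heap."""
    return {u: dijkstra(adj, u) for u in adj}
```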

If negative-weight edges are allowed, Dijkstra's algorithm can no longer be used. Instead, we must run the slower Bellman-Ford algorithm once from each vertex. The resulting running time is O(V²E), which on a dense graph is O(V⁴). In this chapter we shall see how to do better. We shall also investigate the relation of the all-pairs shortest-paths problem to matrix multiplication and study its algebraic structure.

Unlike the single-source algorithms, which assume an adjacency-list representation of the graph, most of the algorithms in this chapter use an adjacency-matrix representation. (Johnson's algorithm for sparse graphs uses adjacency lists.) The input is an n × n matrix W representing the edge weights of an n-vertex directed graph G = (V, E). That is, W = (wij), where

#### (26.1)

$$w_{ij} = \begin{cases} 0 & \text{if } i = j, \\ \text{the weight of directed edge } (i, j) & \text{if } i \neq j \text{ and } (i, j) \in E, \\ \infty & \text{if } i \neq j \text{ and } (i, j) \notin E. \end{cases}$$

Negative-weight edges are allowed, but we assume for the time being that the input graph contains no negative-weight cycles.

The tabular output of the all-pairs shortest-paths algorithms presented in this chapter is an n × n matrix D = (dij), where entry dij contains the weight of a shortest path from vertex i to vertex j. That is, if we let δ(i, j) denote the shortest-path weight from vertex i to vertex j (as in Chapter 25), then dij = δ(i, j) at termination.

To construct the shortest paths themselves, we also compute a predecessor matrix Π = (πij), where πij is NIL if either i = j or there is no path from i to j, and otherwise πij is some predecessor of j on a shortest path from i. For each vertex i ∈ V, we define the predecessor subgraph of G for i as Gπ,i = (Vπ,i, Eπ,i), where

`Vπ,i = {j ∈ V : πij ≠ NIL} ∪ {i}`

and

`Eπ,i = {(πij, j) : j ∈ Vπ,i and πij ≠ NIL} .`

If Gπ,i is a shortest-paths tree, then the following procedure, which is a modified version of the PRINT-PATH procedure from Chapter 23, prints a shortest path from vertex i to vertex j.

`PRINT-ALL-PAIRS-SHORTEST-PATH(Π, i, j)`

`1  if i = j`

`2      then print i`

`3      else if πij = NIL`

`4              then print "no path from" i "to" j "exists"`

`5              else PRINT-ALL-PAIRS-SHORTEST-PATH(Π, i, πij)`

`6                   print j`
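The recursion translates directly into Python. In the sketch below (our illustration, not the book's code), the predecessor matrix is a list of lists pi, with pi[i][j] holding the predecessor of j on a shortest path from i and None standing in for NIL.

```python
def print_all_pairs_shortest_path(pi, i, j):
    """Print the vertices on a shortest path from i to j, given predecessor matrix pi."""
    if i == j:
        print(i)
    elif pi[i][j] is None:
        print(f"no path from {i} to {j} exists")
    else:
        print_all_pairs_shortest_path(pi, i, pi[i][j])
        print(j)
```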

In order to highlight the essential features of the all-pairs algorithms in this chapter, we won't cover the creation and properties of predecessor matrices as extensively as we dealt with predecessor subgraphs in Chapter 25. The basics are covered by some of the exercises.

## Chapter outline

Section 26.1 presents a dynamic-programming algorithm based on matrix multiplication to solve the all-pairs shortest-paths problem. Using the technique of "repeated squaring," this algorithm can be made to run in Θ(V³ lg V) time. Another dynamic-programming algorithm, the Floyd-Warshall algorithm, is given in Section 26.2. The Floyd-Warshall algorithm runs in time Θ(V³). Section 26.2 also covers the problem of finding the transitive closure of a directed graph, which is related to the all-pairs shortest-paths problem. Johnson's algorithm is presented in Section 26.3. Unlike the other algorithms in this chapter, Johnson's algorithm uses the adjacency-list representation of a graph. It solves the all-pairs shortest-paths problem in O(V² lg V + VE) time, which makes it a good algorithm for large, sparse graphs. Finally, in Section 26.4, we examine an algebraic structure called a "closed semiring," which allows many shortest-paths algorithms to be applied to a host of other all-pairs problems not involving shortest paths.

Before proceeding, we need to establish some conventions for adjacency-matrix representations. First, we shall generally assume that the input graph G = (V, E) has n vertices, so that n = |V|. Second, we shall use the convention of denoting matrices by uppercase letters, such as W or D, and their individual elements by subscripted lowercase letters, such as wij or dij. Some matrices will have parenthesized superscripts, as in D(m) = (d(m)ij), to indicate iterates. Finally, for a given n × n matrix A, we shall assume that the value of n is stored in the attribute rows[A].

# 26.1 Shortest paths and matrix multiplication

Before proceeding, let us briefly recap the steps given in Chapter 16 for developing a dynamic-programming algorithm.

1. Characterize the structure of an optimal solution.

2. Recursively define the value of an optimal solution.

3. Compute the value of an optimal solution in a bottom-up fashion.

(The fourth step, constructing an optimal solution from computed information, is dealt with in the exercises.)

## A recursive solution to the all-pairs shortest-paths problem

Now, let d(m)ij be the minimum weight of any path from vertex i to vertex j that contains at most m edges. When m = 0, there is a shortest path from i to j with no edges if and only if i = j. Thus,

$$d_{ij}^{(0)} = \begin{cases} 0 & \text{if } i = j, \\ \infty & \text{if } i \neq j. \end{cases}$$

For m ≥ 1, we compute d(m)ij as the minimum of d(m-1)ij (the weight of a shortest path from i to j consisting of at most m - 1 edges) and the minimum weight of any path from i to j consisting of at most m edges, obtained by looking at all possible predecessors k of j. Thus, we recursively define

#### (26.2)

$$d_{ij}^{(m)} = \min\left(d_{ij}^{(m-1)},\ \min_{1 \le k \le n}\left\{ d_{ik}^{(m-1)} + w_{kj} \right\}\right) = \min_{1 \le k \le n}\left\{ d_{ik}^{(m-1)} + w_{kj} \right\}.$$

The latter equality follows since wjj = 0 for all j.

What are the actual shortest-path weights δ(i, j)? If the graph contains no negative-weight cycles, then all shortest paths are simple and thus contain at most n - 1 edges. A path from vertex i to vertex j with more than n - 1 edges cannot have less weight than a shortest path from i to j. The actual shortest-path weights are therefore given by

#### (26.3)

$$\delta(i, j) = d_{ij}^{(n-1)} = d_{ij}^{(n)} = d_{ij}^{(n+1)} = \cdots.$$

## Computing the shortest-path weights bottom up

Taking as our input the matrix W = (wij), we now compute a series of matrices D(1), D(2), . . . , D(n-1), where for m = 1, 2, . . . , n - 1, we have D(m) = (d(m)ij). The final matrix D(n-1) contains the actual shortest-path weights. Observe that since d(1)ij = wij for all vertices i, j ∈ V, we have D(1) = W.

The heart of the algorithm is the following procedure, which, given matrices D(m-1) and W, returns the matrix D(m). That is, it extends the shortest paths computed so far by one more edge.

`EXTEND-SHORTEST-PATHS(D, W)`

`1 n ← rows[D]`

`2 let D' = (d'ij) be an n × n matrix`

`3 for i ← 1 to n`

`4      do for j ← 1 to n`

`5             do d'ij ← ∞`

`6                for k ← 1 to n`

`7                    do d'ij ← min(d'ij, dik + wkj)`

`8 return D'`

The procedure computes a matrix D' = (d'ij), which it returns at the end. It does so by computing equation (26.2) for all i and j, using D for D(m-1) and D' for D(m). (It is written without the superscripts to make its input and output matrices independent of m.) Its running time is Θ(n³) due to the three nested for loops.
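In Python, with matrices as lists of lists and ∞ as float('inf') (representation choices of this sketch, not mandated by the text), the procedure might look like this:

```python
INF = float('inf')

def extend_shortest_paths(D, W):
    """One pass of equation (26.2): d'[i][j] = min over k of d[i][k] + w[k][j]."""
    n = len(D)
    Dp = [[INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if D[i][k] + W[k][j] < Dp[i][j]:
                    Dp[i][j] = D[i][k] + W[k][j]
    return Dp
```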

We can now see the relation to matrix multiplication. Suppose we wish to compute the matrix product C = A · B of two n × n matrices A and B. Then, for i, j = 1, 2, . . . , n, we compute

#### (26.4)

$$c_{ij} = \sum_{k=1}^{n} a_{ik} \cdot b_{kj}.$$

Observe that if we make the substitutions

`d(m-1)  →  a ,`

`w  →  b ,`

`d(m)  →  c ,`

`min  →  + ,`

`+  →  ·`

in equation (26.2), we obtain equation (26.4). Thus, if we make these changes to EXTEND-SHORTEST-PATHS and also replace ∞ (the identity for min) by 0 (the identity for +), we obtain the straightforward Θ(n³)-time procedure for matrix multiplication.

`MATRIX-MULTIPLY(A, B)`

`1 n ← rows[A]`

`2 let C be an n × n matrix`

`3 for i ← 1 to n`

`4      do for j ← 1 to n`

`5             do cij ← 0`

`6                for k ← 1 to n`

`7                    do cij ← cij + aik · bkj`

`8 return C`

Returning to the all-pairs shortest-paths problem, we compute the shortest-path weights by extending shortest paths edge by edge. Letting A · B denote the matrix "product" returned by EXTEND-SHORTEST-PATHS(A, B), we compute the sequence of n - 1 matrices

`D(1)  =  D(0) · W  =  W,`

`D(2)  =  D(1) · W  =  W^2,`

`D(3)  =  D(2) · W  =  W^3,`

`    ⋮`

`D(n-1)  =  D(n-2) · W  =  W^(n-1) .`

As we argued above, the matrix D(n-1) = W^(n-1) contains the shortest-path weights. The following procedure computes this sequence in Θ(n⁴) time.

`SLOW-ALL-PAIRS-SHORTEST-PATHS(W)`

`1  n ← rows[W]`

`2  D(1) ← W`

`3  for m ← 2 to n - 1`

`4       do D(m) ← EXTEND-SHORTEST-PATHS(D(m-1), W)`

`5  return D(n-1)`

Figure 26.1 shows a graph and the matrices D(m) computed by the procedure SLOW-ALL-PAIRS-SHORTEST-PATHS.
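A direct Python transcription (a sketch reusing extend_shortest_paths from above, with 0-based indices) makes the Θ(n⁴) structure plain: n - 2 extensions, each costing Θ(n³).

```python
def slow_all_pairs_shortest_paths(W):
    """Compute D(n-1) by successive extensions, as in SLOW-ALL-PAIRS-SHORTEST-PATHS."""
    n = len(W)
    D = W                          # D(1) = W
    for _ in range(2, n):          # m = 2, 3, ..., n - 1
        D = extend_shortest_paths(D, W)
    return D
```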

## Improving the running time

Our goal, however, is not to compute all the D(m) matrices: we are interested only in matrix D(n-1). Recall that in the absence of negative-weight cycles, equation (26.3) implies D(m) = D(n-1) for all integers m ≥ n - 1. We can compute D(n-1) with only ⌈lg(n - 1)⌉ matrix products by computing the sequence

`D(1)  =  W ,`

`D(2)  =  W^2  =  W · W,`

`D(4)  =  W^4  =  W^2 · W^2,`

`D(8)  =  W^8  =  W^4 · W^4,`

`    ⋮`

`D(2^⌈lg(n-1)⌉)  =  W^(2^⌈lg(n-1)⌉)  =  W^(2^(⌈lg(n-1)⌉-1)) · W^(2^(⌈lg(n-1)⌉-1)) .`

Since 2^⌈lg(n-1)⌉ ≥ n - 1, the final product D(2^⌈lg(n-1)⌉) is equal to D(n-1).

#### Figure 26.1 A directed graph and the sequence of matrices D(m) computed by SLOW-ALL-PAIRS-SHORTEST-PATHS. The reader may verify that D(5) = D(4) · W is equal to D(4), and thus D(m) = D(4) for all m ≥ 4.

`FASTER-ALL-PAIRS-SHORTEST-PATHS(W)`

`1  n ← rows[W]`

`2  D(1) ← W`

`3  m ← 1`

`4  while n - 1 > m`

`5      do D(2m) ← EXTEND-SHORTEST-PATHS(D(m), D(m))`

`6         m ← 2m`

`7  return D(m)`

In each iteration of the while loop of lines 4-6, we compute D(2m) = (D(m))², starting with m = 1. At the end of each iteration, we double the value of m. The final iteration computes D(n-1) by actually computing D(2m) for some m satisfying n - 1 ≤ 2m < 2n - 2. By equation (26.3), D(2m) = D(n-1). The next time the test in line 4 is performed, m has been doubled, so now n - 1 ≤ m, the test fails, and the procedure returns the last matrix it computed.

The running time of FASTER-ALL-PAIRS-SHORTEST-PATHS is Θ(n³ lg n) since each of the ⌈lg(n - 1)⌉ matrix products takes Θ(n³) time. Observe that the code is tight, containing no elaborate data structures, and the constant hidden in the Θ-notation is therefore small.
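A matching Python sketch of the repeated-squaring version, again reusing extend_shortest_paths from above:

```python
def faster_all_pairs_shortest_paths(W):
    """Compute D(n-1) with only ceil(lg(n-1)) extensions by repeated squaring."""
    n = len(W)
    D = W                          # D(1) = W
    m = 1
    while n - 1 > m:
        D = extend_shortest_paths(D, D)   # D(2m) = D(m) "squared"
        m *= 2
    return D
```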

## Exercises

Run SLOW-ALL-PAIRS-SHORTEST-PATHS on the weighted, directed graph of Figure 26.2, showing the matrices that result for each iteration of the respective loops. Then do the same for FASTER-ALL-PAIRS-SHORTEST-PATHS.

Why do we require that wii = 0 for all 1 ≤ i ≤ n?

What does the matrix

$$D^{(0)} = \begin{pmatrix} 0 & \infty & \cdots & \infty \\ \infty & 0 & \cdots & \infty \\ \vdots & \vdots & \ddots & \vdots \\ \infty & \infty & \cdots & 0 \end{pmatrix},$$

used in the shortest-paths algorithms correspond to in regular matrix multiplication?

Show how to express the single-source shortest-paths problem as a product of matrices and a vector. Describe how evaluating this product corresponds to a Bellman-Ford-like algorithm (see Section 25.3).

Suppose we also wish to compute the vertices on shortest paths in the algorithms of this section. Show how to compute the predecessor matrix Π from the completed matrix D of shortest-path weights in O(n³) time.

The vertices on shortest paths can also be computed at the same time as the shortest-path weights. Let us define π(m)ij to be the predecessor of vertex j on any minimum-weight path from i to j that contains at most m edges. Modify EXTEND-SHORTEST-PATHS and SLOW-ALL-PAIRS-SHORTEST-PATHS to compute the matrices Π(1), Π(2), . . . , Π(n-1) as the matrices D(1), D(2), . . . , D(n-1) are computed.

Give an efficient algorithm to find the length (number of edges) of a minimum-length negative-weight cycle in a graph.

# 26.2 The Floyd-Warshall algorithm

## The structure of a shortest path

The Floyd-Warshall algorithm is based on the following observation. Let the vertices of G be V = {1, 2, . . . , n}, and consider a subset {1, 2, . . . , k} of vertices for some k. For any pair of vertices i, j ∈ V, consider all paths from i to j whose intermediate vertices are all drawn from {1, 2, . . . , k}, and let p be a minimum-weight path from among them. (Path p is simple, since we assume that G contains no negative-weight cycles.) The Floyd-Warshall algorithm exploits a relationship between path p and shortest paths from i to j with all intermediate vertices in the set {1, 2, . . . , k - 1}. The relationship depends on whether or not k is an intermediate vertex of path p.

#### Figure 26.3 Path p is a shortest path from vertex i to vertex j, and k is the highest-numbered intermediate vertex of p. Path p1, the portion of path p from vertex i to vertex k, has all intermediate vertices in the set {1, 2, . . . , k - 1}. The same holds for path p2 from vertex k to vertex j.

If k is not an intermediate vertex of path p, then all intermediate vertices of path p are in the set {1, 2, . . . , k - 1}. Thus, a shortest path from vertex i to vertex j with all intermediate vertices in the set {1, 2, . . . , k - 1} is also a shortest path from i to j with all intermediate vertices in the set {1, 2, . . . , k}.

If k is an intermediate vertex of path p, then we break p down into i ⇝ k ⇝ j, with subpath p1 from i to k and subpath p2 from k to j, as shown in Figure 26.3. By Lemma 25.1, p1 is a shortest path from i to k with all intermediate vertices in the set {1, 2, . . . , k}. In fact, vertex k is not an intermediate vertex of path p1, and so p1 is a shortest path from i to k with all intermediate vertices in the set {1, 2, . . . , k - 1}. Similarly, p2 is a shortest path from vertex k to vertex j with all intermediate vertices in the set {1, 2, . . . , k - 1}.

## A recursive solution to the all-pairs shortest-paths problem

Based on the above observations, we define a recursive formulation of shortest-path estimates different from the one in Section 26.1. Let d(k)ij be the weight of a shortest path from vertex i to vertex j with all intermediate vertices in the set {1, 2, . . . , k}. When k = 0, a path from vertex i to vertex j with no intermediate vertex numbered higher than 0 has no intermediate vertices at all. It thus has at most one edge, and hence d(0)ij = wij. A recursive definition is given by

#### (26.5)

$$d_{ij}^{(k)} = \begin{cases} w_{ij} & \text{if } k = 0, \\ \min\left(d_{ij}^{(k-1)},\ d_{ik}^{(k-1)} + d_{kj}^{(k-1)}\right) & \text{if } k \ge 1. \end{cases}$$

The matrix D(n) = (d(n)ij) gives the final answer: d(n)ij = δ(i, j) for all i, j ∈ V, because all intermediate vertices are in the set {1, 2, . . . , n}.

## Computing the shortest-path weights bottom up
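Based on recurrence (26.5), the following bottom-up procedure computes the values d(k)ij in order of increasing values of k. Its input is the n × n matrix W defined as in equation (26.1); it returns the matrix D(n) of shortest-path weights.

`FLOYD-WARSHALL(W)`

`1  n ← rows[W]`

`2  D(0) ← W`

`3  for k ← 1 to n`

`4       do for i ← 1 to n`

`5              do for j ← 1 to n`

`6                     do d(k)ij ← min(d(k-1)ij, d(k-1)ik + d(k-1)kj)`

`7  return D(n)`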

Figure 26.4 shows a directed graph and the matrices D(k) computed by the Floyd-Warshall algorithm.

The running time of the Floyd-Warshall algorithm is determined by the triply nested for loops of lines 3-6. Each execution of line 6 takes O(1) time. The algorithm thus runs in time Θ(n³). As in the final algorithm in Section 26.1, the code is tight, with no elaborate data structures, and so the constant hidden in the Θ-notation is small. Thus, the Floyd-Warshall algorithm is quite practical for even moderate-sized input graphs.

## Constructing a shortest path

There are a variety of different methods for constructing shortest paths in the Floyd-Warshall algorithm. One way is to compute the matrix D of shortest-path weights and then construct the predecessor matrix Π from the D matrix. This method can be implemented to run in O(n³) time (Exercise 26.1-5). Given the predecessor matrix Π, the PRINT-ALL-PAIRS-SHORTEST-PATH procedure can be used to print the vertices on a given shortest path.

We can compute the predecessor matrix Π "on-line" just as the Floyd-Warshall algorithm computes the matrices D(k). Specifically, we compute a sequence of matrices Π(0), Π(1), . . . , Π(n), where Π = Π(n) and π(k)ij is defined to be the predecessor of vertex j on a shortest path from vertex i with all intermediate vertices in the set {1, 2, . . . , k}.

We can give a recursive formulation of π(k)ij. When k = 0, a shortest path from i to j has no intermediate vertices at all. Thus,

#### (26.6)

$$\pi_{ij}^{(0)} = \begin{cases} \text{NIL} & \text{if } i = j \text{ or } w_{ij} = \infty, \\ i & \text{if } i \neq j \text{ and } w_{ij} < \infty. \end{cases}$$

For k ≥ 1, if we take the path i ⇝ k ⇝ j, then the predecessor of j we choose is the same as the predecessor of j we chose on a shortest path from k with all intermediate vertices in the set {1, 2, . . . , k - 1}. Otherwise, we choose the same predecessor of j that we chose on a shortest path from i with all intermediate vertices in the set {1, 2, . . . , k - 1}. Formally, for k ≥ 1,

#### (26.7)

$$\pi_{ij}^{(k)} = \begin{cases} \pi_{ij}^{(k-1)} & \text{if } d_{ij}^{(k-1)} \le d_{ik}^{(k-1)} + d_{kj}^{(k-1)}, \\ \pi_{kj}^{(k-1)} & \text{if } d_{ij}^{(k-1)} > d_{ik}^{(k-1)} + d_{kj}^{(k-1)}. \end{cases}$$

We leave the incorporation of the Π(k) matrix computations into the FLOYD-WARSHALL procedure as Exercise 26.2-3. Figure 26.4 shows the sequence of Π(k) matrices that the resulting algorithm computes for the graph of Figure 26.1. The exercise also asks for the more difficult task of proving that the predecessor subgraph Gπ,i is a shortest-paths tree with root i. Yet another way to reconstruct shortest paths is given as Exercise 26.2-6.
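A Python sketch (ours) that computes D and Π together, applying recurrences (26.5)-(26.7) with the space-saving in-place update, None for NIL, and 0-based vertex numbering:

```python
INF = float('inf')

def floyd_warshall(W):
    """Return (D, pi): shortest-path weights and predecessor matrix.
    W[i][i] must be 0 and absent edges INF, as in equation (26.1)."""
    n = len(W)
    D = [row[:] for row in W]
    # Equation (26.6): the predecessor of j is i exactly when a one-edge path exists.
    pi = [[i if i != j and W[i][j] < INF else None for j in range(n)]
          for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Equation (26.7): switch predecessors only on strict improvement via k.
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
                    pi[i][j] = pi[k][j]
    return D, pi
```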

## Transitive closure of a directed graph

Given a directed graph G = (V, E) with vertex set V = {1, 2, . . . , n}, we may wish to determine whether there is a path in G from i to j for all vertex pairs i, j ∈ V. The transitive closure of G is defined as the graph G* = (V, E*), where

E* = {(i, j) : there is a path from vertex i to vertex j in G} .

One way to compute the transitive closure of a graph in Θ(n³) time is to assign a weight of 1 to each edge of E and run the Floyd-Warshall algorithm. If there is a path from vertex i to vertex j, we get dij < n. Otherwise, we get dij = ∞.

There is another, similar way to compute the transitive closure of G in Θ(n³) time that can save time and space in practice. This method substitutes the logical operations ∨ (OR) and ∧ (AND) for the arithmetic operations min and + in the Floyd-Warshall algorithm. For i, j, k = 1, 2, . . . , n, we define t(k)ij to be 1 if there exists a path in graph G from vertex i to vertex j with all intermediate vertices in the set {1, 2, . . . , k}, and 0 otherwise. We construct the transitive closure G* = (V, E*) by putting edge (i, j) into E* if and only if t(n)ij = 1. A recursive definition of t(k)ij, analogous to recurrence (26.5), is

$$t_{ij}^{(0)} = \begin{cases} 0 & \text{if } i \neq j \text{ and } (i, j) \notin E, \\ 1 & \text{if } i = j \text{ or } (i, j) \in E, \end{cases}$$

and for k ≥ 1,

#### (26.8)

$$t_{ij}^{(k)} = t_{ij}^{(k-1)} \vee \left( t_{ik}^{(k-1)} \wedge t_{kj}^{(k-1)} \right).$$

As in the Floyd-Warshall algorithm, we compute the matrices T(k) = (t(k)ij) in order of increasing k.

`TRANSITIVE-CLOSURE(G)`
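`1  n ← |V[G]|`

`2  for i ← 1 to n`

`3       do for j ← 1 to n`

`4              do if i = j or (i, j) ∈ E[G]`

`5                     then t(0)ij ← 1`

`6                     else t(0)ij ← 0`

`7  for k ← 1 to n`

`8       do for i ← 1 to n`

`9              do for j ← 1 to n`

`10                     do t(k)ij ← t(k-1)ij ∨ (t(k-1)ik ∧ t(k-1)kj)`

`11 return T(n)`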

Figure 26.5 shows the matrices T(k) computed by the TRANSITIVE-CLOSURE procedure on a sample graph. Like the Floyd-Warshall algorithm, the TRANSITIVE-CLOSURE procedure runs in Θ(n³) time. On some computers, though, logical operations on single-bit values execute faster than arithmetic operations on integer words of data. Moreover, because the direct transitive-closure algorithm uses only boolean values rather than integer values, its space requirement is less than the Floyd-Warshall algorithm's by a factor corresponding to the size of a word of computer storage.
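One way to realize this advantage in practice is to pack each row of the T matrix into a single integer used as a bit set, so the innermost loop over j collapses into one word-wide OR. The sketch below is our own illustration of the idea, with vertices numbered 0 to n - 1, not a procedure from the text.

```python
def transitive_closure_bitset(n, edges):
    """Return rows of the transitive closure; bit j of row[i] is 1 iff i reaches j."""
    row = [1 << i for i in range(n)]       # t(0)ii = 1: the empty path
    for i, j in edges:
        row[i] |= 1 << j                   # t(0)ij = 1 for every edge (i, j)
    for k in range(n):
        bit_k = 1 << k
        for i in range(n):
            if row[i] & bit_k:             # if i reaches k, i reaches all k reaches:
                row[i] |= row[k]           # recurrence (26.8), one word at a time
    return row
```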

In Section 26.4, we shall see that the correspondence between FLOYD-WARSHALL and TRANSITIVE-CLOSURE is more than coincidence. Both algorithms are based on a type of algebraic structure called a "closed semiring."

## Exercises

As it appears above, the Floyd-Warshall algorithm requires Θ(n³) space, since we compute d(k)ij for i, j, k = 1, 2, . . . , n. Show that the following procedure, which simply drops all the superscripts, is correct, and thus only Θ(n²) space is required.

#### Figure 26.5 A directed graph and the matrices T(k) computed by the transitive-closure algorithm.

`FLOYD-WARSHALL'(W)`

`1  n ← rows[W]`

`2  D ← W`

`3  for k ← 1 to n`

`4       do for i ← 1 to n`

`5              do for j ← 1 to n`

`6                     do dij ← min(dij, dik + dkj)`

`7  return D`

Modify the FLOYD-WARSHALL procedure to include computation of the Π(k) matrices according to equations (26.6) and (26.7). Prove rigorously that for all i ∈ V, the predecessor subgraph Gπ,i is a shortest-paths tree with root i. (Hint: To show that Gπ,i is acyclic, first show that π(k)ij = l implies d(k)ij ≥ d(k)il + wlj. Then, adapt the proof of Lemma 25.8.)

Suppose that we modify the way in which equality is handled in equation (26.7):

$$\pi_{ij}^{(k)} = \begin{cases} \pi_{ij}^{(k-1)} & \text{if } d_{ij}^{(k-1)} < d_{ik}^{(k-1)} + d_{kj}^{(k-1)}, \\ \pi_{kj}^{(k-1)} & \text{if } d_{ij}^{(k-1)} \ge d_{ik}^{(k-1)} + d_{kj}^{(k-1)}. \end{cases}$$

Is this alternative definition of the predecessor matrix Π correct?

Another way to reconstruct shortest paths in the Floyd-Warshall algorithm uses values φ(k)ij for i, j, k = 1, 2, . . . , n, where φ(k)ij is the highest-numbered intermediate vertex of a shortest path from i to j in which all intermediate vertices are in the set {1, 2, . . . , k}. Give a recursive formulation for φ(k)ij, modify the FLOYD-WARSHALL procedure to compute the φ(k)ij values, and rewrite the PRINT-ALL-PAIRS-SHORTEST-PATH procedure to take the matrix Φ = (φ(n)ij) as an input. How is the matrix Φ like the s table in the matrix-chain multiplication problem of Section 16.1?

Give an O(V E)-time algorithm for computing the transitive closure of a directed graph G = (V, E).

Suppose that the transitive closure of a directed acyclic graph can be computed in f(V, E) time, where f(V, E) = Ω(V + E) and f is monotonically increasing. Show that the time to compute the transitive closure of a general directed graph is O(f(V, E)).

# 26.3 Johnson's algorithm for sparse graphs

Johnson's algorithm finds shortest paths between all pairs of vertices in O(V² lg V + VE) time. It either returns a matrix of shortest-path weights for all pairs of vertices or reports that the input graph contains a negative-weight cycle, and it uses both Dijkstra's algorithm and the Bellman-Ford algorithm as subroutines.

Johnson's algorithm uses the technique of reweighting, which works as follows. If all edge weights w in a graph G = (V, E) are nonnegative, we can find shortest paths between all pairs of vertices by running Dijkstra's algorithm once from each vertex. If G has negative-weight edges, we compute a new set of nonnegative edge weights ŵ that allows us to use the same method. The new set of edge weights ŵ must satisfy two important properties.

1. For all pairs of vertices u, v ∈ V, a shortest path from u to v using weight function w is also a shortest path from u to v using weight function ŵ.

2. For all edges (u, v), the new weight ŵ(u, v) is nonnegative.

As we shall see in a moment, the preprocessing of G to determine the new weight function ŵ can be performed in O(VE) time.

## Preserving shortest paths by reweighting

As the following lemma shows, it is easy to come up with a reweighting of the edges that satisfies the first property above. We use δ to denote shortest-path weights derived from weight function w and δ̂ to denote shortest-path weights derived from weight function ŵ.

Lemma 26.1 (Reweighting does not change shortest paths)

Given a weighted, directed graph G = (V, E) with weight function w : E → R, let h : V → R be any function mapping vertices to real numbers. For each edge (u, v) ∈ E, define

#### (26.9)

$$\hat{w}(u, v) = w(u, v) + h(u) - h(v).$$

Let p = ⟨v0, v1, . . . , vk⟩ be a path from vertex v0 to vertex vk. Then, w(p) = δ(v0, vk) if and only if ŵ(p) = δ̂(v0, vk). Also, G has a negative-weight cycle using weight function w if and only if G has a negative-weight cycle using weight function ŵ.

Proof We start by showing that

#### (26.10)

$$\hat{w}(p) = w(p) + h(v_0) - h(v_k).$$

We have

$$\begin{aligned} \hat{w}(p) &= \sum_{i=1}^{k} \hat{w}(v_{i-1}, v_i) \\ &= \sum_{i=1}^{k} \left( w(v_{i-1}, v_i) + h(v_{i-1}) - h(v_i) \right) \\ &= \sum_{i=1}^{k} w(v_{i-1}, v_i) + h(v_0) - h(v_k) \\ &= w(p) + h(v_0) - h(v_k). \end{aligned}$$

The third equality follows from the telescoping sum on the second line.

We now show by contradiction that w(p) = δ(v0, vk) implies ŵ(p) = δ̂(v0, vk). Suppose that there is a shorter path p' from v0 to vk using weight function ŵ. Then, ŵ(p') < ŵ(p). By equation (26.10),

$$w(p') + h(v_0) - h(v_k) = \hat{w}(p') < \hat{w}(p) = w(p) + h(v_0) - h(v_k),$$

which implies that w(p') < w(p). But this contradicts our assumption that p is a shortest path from v0 to vk using w. The proof of the converse is similar.

Finally, we show that G has a negative-weight cycle using weight function w if and only if G has a negative-weight cycle using weight function ŵ. Consider any cycle c = ⟨v0, v1, . . . , vk⟩, where v0 = vk. By equation (26.10),

$$\hat{w}(c) = w(c) + h(v_0) - h(v_k) = w(c),$$

and thus c has negative weight using ŵ if and only if it has negative weight using w.

## Producing nonnegative weights by reweighting

Our next goal is to ensure that the second property holds: we want ŵ(u, v) to be nonnegative for all edges (u, v) ∈ E. Given a weighted, directed graph G = (V, E) with weight function w : E → R, we make a new graph G' = (V', E'), where V' = V ∪ {s} for some new vertex s ∉ V and E' = E ∪ {(s, v) : v ∈ V}. We extend the weight function w so that w(s, v) = 0 for all v ∈ V. Note that because s has no edges that enter it, no shortest paths in G', other than those with source s, contain s. Moreover, G' has no negative-weight cycles if and only if G has no negative-weight cycles. Figure 26.6(a) shows the graph G' corresponding to the graph G of Figure 26.1.

Now suppose that G and G' have no negative-weight cycles. Let us define h(v) = δ(s, v) for all v ∈ V'. By Lemma 25.3, we have h(v) ≤ h(u) + w(u, v) for all edges (u, v) ∈ E'. Thus, if we define the new weights ŵ according to equation (26.9), we have ŵ(u, v) = w(u, v) + h(u) - h(v) ≥ 0, and the second property is satisfied. Figure 26.6(b) shows the graph G' from Figure 26.6(a) with reweighted edges.

## Computing all-pairs shortest paths

Johnson's algorithm to compute all-pairs shortest paths uses the Bellman-Ford algorithm (Section 25.3) and Dijkstra's algorithm (Section 25.2) as subroutines. It assumes that the edges are stored in adjacency lists. The algorithm returns the usual |V| × |V| matrix D = (dij), where dij = δ(i, j), or it reports that the input graph contains a negative-weight cycle. (In order for the indices into the D matrix to make any sense, we assume that the vertices are numbered from 1 to |V|.)
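The following Python sketch is our reconstruction of the strategy just described, not the book's pseudocode: Bellman-Ford from the added source s computes h, the edges are reweighted by equation (26.9), Dijkstra runs from each vertex under ŵ, and the original weights are recovered as δ(u, v) = δ̂(u, v) + h(v) - h(u). Vertices are numbered 0 to n - 1.

```python
import heapq

INF = float('inf')

def johnson(n, edges):
    """All-pairs shortest paths; edges is a list of (u, v, w) triples.
    Returns the n x n matrix D, or None if a negative-weight cycle exists."""
    # Form G': a new source s = n with a zero-weight edge to every vertex.
    aug = edges + [(n, v, 0) for v in range(n)]
    # Bellman-Ford from s computes h(v) = delta(s, v).
    h = [INF] * (n + 1)
    h[n] = 0
    for _ in range(n):                 # |V'| - 1 relaxation passes
        for u, v, w in aug:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    for u, v, w in aug:                # one more pass detects negative cycles
        if h[u] + w < h[v]:
            return None
    # Reweight by equation (26.9): w_hat(u, v) = w(u, v) + h(u) - h(v) >= 0.
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))
    # Dijkstra from each vertex under w_hat, then undo the reweighting.
    D = [[INF] * n for _ in range(n)]
    for s in range(n):
        dist = [INF] * n
        dist[s] = 0
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue               # stale heap entry
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (dist[v], v))
        for v in range(n):
            if dist[v] < INF:
                D[s][v] = dist[v] + h[v] - h[s]
    return D
```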

## Exercises

Use Johnson's algorithm to find the shortest paths between all pairs of vertices in the graph of Figure 26.2. Show the values of h and ŵ computed by the algorithm.

What is the purpose of adding the new vertex s to V, yielding V'?

Suppose that w(u, v) ≥ 0 for all edges (u, v) ∈ E. What is the relationship between the weight functions w and ŵ?

# * 26.4 A general framework for solving path problems in directed graphs

In this section, we examine "closed semirings," an algebraic structure that yields a general framework for solving path problems in directed graphs. We start by defining closed semirings and discussing how they relate to a calculus of directed paths. We then show some examples of closed semirings and a "generic" algorithm for computing all-pairs path information. Both the Floyd-Warshall algorithm and the transitive-closure algorithm from Section 26.2 are instantiations of this generic algorithm.

## Definition of closed semirings

A closed semiring is a system (S, ⊕, ⊙, 0̄, 1̄), where S is a set of elements, ⊕ (the summary operator) and ⊙ (the extension operator) are binary operations on S, and 0̄ and 1̄ are elements of S, satisfying the following properties.

1. (S, ⊕, 0̄) is a monoid:

S is closed under ⊕: a ⊕ b ∈ S for all a, b ∈ S.

⊕ is associative: a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c for all a, b, c ∈ S.

0̄ is an identity for ⊕: a ⊕ 0̄ = 0̄ ⊕ a = a for all a ∈ S.

Likewise, (S, ⊙, 1̄) is a monoid.

2. 0̄ is an annihilator: a ⊙ 0̄ = 0̄ ⊙ a = 0̄ for all a ∈ S.

3. ⊕ is commutative: a ⊕ b = b ⊕ a for all a, b ∈ S.

4. ⊕ is idempotent: a ⊕ a = a for all a ∈ S.

5. ⊙ distributes over ⊕: a ⊙ (b ⊕ c) = (a ⊙ b) ⊕ (a ⊙ c) and (b ⊕ c) ⊙ a = (b ⊙ a) ⊕ (c ⊙ a) for all a, b, c ∈ S.

6. If a1, a2, a3, . . . is a countable sequence of elements of S, then a1 ⊕ a2 ⊕ a3 ⊕ ⋯ is well defined and in S.

7. Associativity, commutativity, and idempotence apply to infinite summaries. (Thus, any infinite summary can be rewritten as an infinite summary in which each term of the summary is included just once and the order of evaluation is arbitrary.)

8. ⊙ distributes over infinite summaries: a ⊙ (b1 ⊕ b2 ⊕ b3 ⊕ ⋯) = (a ⊙ b1) ⊕ (a ⊙ b2) ⊕ (a ⊙ b3) ⊕ ⋯, and symmetrically on the right.

## A calculus of paths in directed graphs

Consider a directed graph G = (V, E), where each ordered pair of vertices (i, j) is labeled with an element λ(i, j) drawn from a closed semiring (S, ⊕, ⊙, 0̄, 1̄). The label of a directed path p = ⟨v1, v2, . . . , vk⟩ is the extension of the labels of its constituent edges:

$$\lambda(p) = \lambda(v_1, v_2) \odot \lambda(v_2, v_3) \odot \cdots \odot \lambda(v_{k-1}, v_k).$$

The identity 1̄ serves as the label of the empty path.

As a running example of an application of closed semirings, we shall use shortest paths with nonnegative edge weights. The codomain S is R≥0 ∪ {∞}, where R≥0 is the set of nonnegative reals, and λ(i, j) = wij for all i, j ∈ V. The extension operator ⊙ corresponds to the arithmetic operator +, and the label of path p = ⟨v1, v2, . . . , vk⟩ is therefore

$$\lambda(p) = w_{v_1 v_2} + w_{v_2 v_3} + \cdots + w_{v_{k-1} v_k} = w(p).$$

Not surprisingly, the role of 1̄, the identity for ⊙, is taken by 0, the identity for +. We denote the empty path by ε, and its label is λ(ε) = 0.

If p1 = ⟨v1, v2, . . . , vk⟩ and p2 = ⟨vk, vk+1, . . . , vl⟩, then their concatenation is

`p1 ∘ p2 = ⟨v1, v2, . . . , vk, vk+1, . . . , vl⟩,`

and the label of their concatenation is λ(p1 ∘ p2) = λ(p1) ⊙ λ(p2).

Our goal will be to compute, for all pairs of vertices i, j ∈ V, the summary of all path labels from i to j:

#### (26.11)

$$l_{ij} = \bigoplus_{p\,:\, i \leadsto j} \lambda(p).$$

We require commutativity and associativity of ⊕ because the order in which paths are summarized should not matter. Because we use the annihilator 0̄ as the label of an ordered pair (u, v) that is not an edge in the graph, any path that attempts to take an absent edge has label 0̄.

For shortest paths, we use min as the summary operator ⊕. The identity for min is ∞, and ∞ is indeed an annihilator for +: a + ∞ = ∞ + a = ∞ for all a ∈ R≥0 ∪ {∞}. Absent edges have weight ∞, and if any edge of a path has weight ∞, so does the path.

We want the summary operator ⊕ to be idempotent, because from equation (26.11), we see that ⊕ should summarize the labels of a set of paths. If p is a path, then {p} ∪ {p} = {p}; if we summarize path p with itself, the resulting label should be the label of p: λ(p) ⊕ λ(p) = λ(p).

Because we consider paths that may not be simple, there may be a countably infinite number of paths in a graph. (Each path, simple or not, has a finite number of edges.) The operator ⊕ should therefore be applicable to a countably infinite number of path labels. That is, if a1, a2, a3, . . . is a countable sequence of elements in codomain S, then the label a1 ⊕ a2 ⊕ a3 ⊕ ⋯ should be well defined and in S. It should not matter in which order we summarize path labels, and thus associativity and commutativity should hold for infinite summaries. Furthermore, if we summarize the same path label a a countably infinite number of times, we should get a as the result, and thus idempotence should hold for infinite summaries.

Returning to the shortest-paths example, we ask if min is applicable to an infinite sequence of values in R≥0 ∪ {∞}. For example, is the value of min{a1, a2, a3, . . .} well defined? It is, if we think of the min operator as actually returning the greatest lower bound (infimum) of its arguments, in which case we get min{a1, a2, a3, . . .} = inf{a1, a2, a3, . . .}.

To compute labels of diverging paths, we need distributivity of the extension operator ⊙ over the summary operator ⊕. As shown in Figure 26.7, suppose that we have a path p1 from u to v and paths p2 and p3 that both leave vertex v. By distributivity, we can summarize the labels of paths p1 ∘ p2 and p1 ∘ p3 by computing either (λ(p1) ⊙ λ(p2)) ⊕ (λ(p1) ⊙ λ(p3)) or λ(p1) ⊙ (λ(p2) ⊕ λ(p3)).

Because there may be a countably infinite number of paths in a graph, ⊙ should distribute over infinite summaries as well as finite ones. Figure 26.8, for example, contains paths p1 and p2 along with the cycle c between them. We must be able to summarize the paths p1 ∘ p2, p1 ∘ c ∘ p2, p1 ∘ c ∘ c ∘ p2, . . . . Distributivity of ⊙ over countably infinite summaries gives us

$$(\lambda(p_1) \odot \lambda(p_2)) \oplus (\lambda(p_1) \odot \lambda(c) \odot \lambda(p_2)) \oplus (\lambda(p_1) \odot \lambda(c) \odot \lambda(c) \odot \lambda(p_2)) \oplus \cdots = \lambda(p_1) \odot (\lambda(c))^{*} \odot \lambda(p_2),$$

where the closure of a semiring element a is defined by

$$a^{*} = \bar{1} \oplus a \oplus (a \odot a) \oplus (a \odot a \odot a) \oplus \cdots,$$

the summary of extending a with itself zero or more times.

#### Figure 26.8 Distributivity of ⊙ over countably infinite summaries of ⊕. Because of cycle c, there are a countably infinite number of paths from vertex v to vertex x. We must be able to summarize the paths p1 ∘ p2, p1 ∘ c ∘ p2, p1 ∘ c ∘ c ∘ p2, . . . .

Thus, in Figure 26.8, we want to compute λ(p1) ⊙ (λ(c))* ⊙ λ(p2).

For the shortest-paths example, for any nonnegative real a ∈ R≥0 ∪ {∞}, we have

$$a^{*} = \min\{0, a, a + a, a + a + a, \ldots\} = 0.$$

The interpretation of this property is that since all cycles have nonnegative weight, no shortest path ever needs to traverse an entire cycle.

## Examples of closed semirings

We have already seen one closed semiring, namely S1 = (R≥0 ∪ {∞}, min, +, ∞, 0), which we used for shortest paths with nonnegative edge weights. We claimed, however, that even if there are negative-weight edges, the Floyd-Warshall algorithm computes shortest-path weights as long as no negative-weight cycles are present. By adding the appropriate closure operator and extending the codomain of labels to R ∪ {-∞, +∞}, we can find a closed semiring to handle negative-weight cycles. Using min for ⊕ and + for ⊙, the reader may verify that the closure of a ∈ R ∪ {-∞, +∞} is

$$a^{*} = \begin{cases} 0 & \text{if } a \ge 0, \\ -\infty & \text{if } a < 0. \end{cases}$$

The second case (a < 0) models the situation in which we can traverse a negative-weight cycle an infinite number of times to obtain a weight of -∞ on any path containing the cycle. Thus, the closed semiring to use for the Floyd-Warshall algorithm with negative edge weights is S2 = (R ∪ {-∞, +∞}, min, +, +∞, 0). (See Exercise 26.4-3.)

For transitive closure, we use the closed semiring S3 = ({0, 1}, ∨, ∧, 0, 1), where λ(i, j) = 1 if (i, j) ∈ E, and λ(i, j) = 0 otherwise. Here we have 0* = 1* = 1.

## A dynamic-programming algorithm for directed-path labels

Suppose we are given a directed graph G = (V, E) in which each ordered pair of vertices (i, j) has a label λ(i, j) drawn from a closed semiring. We wish to compute, for all pairs of vertices i, j ∈ V, the label lij of equation (26.11), which is the result of summarizing all paths from i to j using the summary operator ⊕. For shortest paths, for example, we wish to compute lij = δ(i, j).

There is a dynamic-programming algorithm to solve this problem, and its form is very similar to the Floyd-Warshall algorithm and the transitive-closure algorithm. Let P(k)ij be the set of paths from vertex i to vertex j with all intermediate vertices in the set {1, 2, . . . , k}. We define

$$l_{ij}^{(k)} = \bigoplus_{p \in P_{ij}^{(k)}} \lambda(p).$$

Note the analogy to the definitions of d(k)ij in the Floyd-Warshall algorithm and t(k)ij in the transitive-closure algorithm. We can define l(k)ij recursively by

#### (26.12)

$$l_{ij}^{(k)} = l_{ij}^{(k-1)} \oplus \left( l_{ik}^{(k-1)} \odot \left( l_{kk}^{(k-1)} \right)^{*} \odot l_{kj}^{(k-1)} \right).$$

Recurrence (26.12) is reminiscent of recurrences (26.5) and (26.8), but with an additional factor of (l(k-1)kk)* included. This factor represents the summary of all cycles that pass through vertex k and have all other vertices in the set {1, 2, . . . , k - 1}. (When we assume no negative-weight cycles in the Floyd-Warshall algorithm, (d(k-1)kk)* is 0, corresponding to 1̄, the weight of an empty cycle. In the transitive-closure algorithm, the empty path from k to k gives us (t(k-1)kk)* = 1. Thus, for both of these algorithms, we can ignore the factor of (l(k-1)kk)*, since it is just the identity 1̄ for ⊙.) The basis of the recursive definition is

$$l_{ij}^{(0)} = \begin{cases} \bar{1} \oplus \lambda(i, j) & \text{if } i = j, \\ \lambda(i, j) & \text{if } i \neq j, \end{cases}$$

which we can see as follows. The label of the one-edge path ⟨i, j⟩ is simply λ(i, j) (which is equal to 0̄ if (i, j) is not an edge in E). If, in addition, i = j, then 1̄ is the label of the empty path from i to i.

The running time of this algorithm depends on the time to compute ⊕, ⊙, and *. If we let T⊕, T⊙, and T* represent these times, then the running time of COMPUTE-SUMMARIES is Θ(n³(T⊕ + T⊙ + T*)), which is Θ(n³) if each of the three operations takes O(1) time.
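To make the genericity concrete, here is a Python sketch of a COMPUTE-SUMMARIES-style procedure; the packaging of the semiring as the parameters summary, extend, closure, and one is our own choice, not the book's interface. Instantiating it with S1 yields shortest-path weights for nonnegative edge weights; instantiating it with S3 yields transitive closure.

```python
def compute_summaries(lam, summary, extend, closure, one):
    """Dynamic program over a closed semiring, following recurrence (26.12).
    lam is the n x n matrix of labels lambda(i, j); returns L with L[i][j]
    summarizing the labels of all paths from i to j."""
    n = len(lam)
    # Basis: l(0)ij = lambda(i, j), plus the empty path's label when i = j.
    L = [[summary(one, lam[i][j]) if i == j else lam[i][j] for j in range(n)]
         for i in range(n)]
    for k in range(n):
        star_kk = closure(L[k][k])     # summary of all cycles through k
        L = [[summary(L[i][j], extend(extend(L[i][k], star_kk), L[k][j]))
              for j in range(n)] for i in range(n)]
    return L

INF = float('inf')

# S1 = (R>=0 U {inf}, min, +, inf, 0): a* = 0 for every nonnegative a.
def shortest_path_weights(W):
    return compute_summaries(W, min, lambda a, b: a + b, lambda a: 0, 0)

# S3 = ({0, 1}, OR, AND, 0, 1): 0* = 1* = 1.
def transitive_closure(T0):
    return compute_summaries(T0, lambda a, b: a | b, lambda a, b: a & b,
                             lambda a: 1, 1)
```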

## Exercises

Verify that S1 = (R≥0 ∪ {∞}, min, +, ∞, 0) and S3 = ({0, 1}, ∨, ∧, 0, 1) are closed semirings.

Verify that S2 = (R ∪ {-∞, +∞}, min, +, +∞, 0) is a closed semiring. What is the value of a + (-∞) for a ∈ R? What about (-∞) + (+∞)?

Is the system S4 = (R, +, ·, 0, 1) a closed semiring?

Can we use an arbitrary closed semiring for Dijkstra's algorithm? What about for the Bellman-Ford algorithm? What about for the FASTER-ALL-PAIRS-SHORTEST-PATHS procedure?

A trucking firm wishes to send a truck from Castroville to Boston laden as heavily as possible with artichokes, but each road in the United States has a maximum weight limit on trucks that use the road. Model this problem with a directed graph G = (V, E) and an appropriate closed semiring, and give an efficient algorithm to solve it.

# Problems

a. Show how the transitive closure G* = (V, E*) of a graph G = (V, E) can be updated in O(V²) time when a new edge is added to G.

b. Give an example of a graph G and an edge e such that Ω(V²) time is required to update the transitive closure after the insertion of e into G.

c. Describe an efficient algorithm for updating the transitive closure as edges are inserted into the graph. For any sequence of n insertions, your algorithm should run in total time $\sum_{i=1}^{n} t_i = O(V^3)$, where ti is the time to update the transitive closure when the ith edge is inserted. Prove that your algorithm attains this time bound.

A graph G = (V, E) is ε-dense if |E| = Θ(V^(1+ε)) for some constant ε in the range 0 < ε ≤ 1.

a. What are the asymptotic running times for INSERT, EXTRACT-MIN, and DECREASE-KEY, as a function of d and the number n of elements in a d-ary heap? What are these running times if we choose d = Θ(n^α) for some constant 0 < α ≤ 1? Compare these running times to the amortized costs of these operations for a Fibonacci heap.

b. Show how to compute shortest paths from a single source on an ε-dense directed graph G = (V, E) with no negative-weight edges in O(E) time. (Hint: Pick d as a function of ε.)

c. Show how to solve the all-pairs shortest-paths problem on an ε-dense directed graph G = (V, E) with no negative-weight edges in O(VE) time.

d. Show how to solve the all-pairs shortest-paths problem in O(VE) time on an ε-dense directed graph G = (V, E) that may have negative-weight edges but has no negative-weight cycles.

a. Briefly justify the assertion that S = (R ∪ {-∞, +∞}, min, max, +∞, -∞) is a closed semiring.

Since S is a closed semiring, we can use the COMPUTE-SUMMARIES procedure to determine the minimax weights mij in graph G. Let m(k)ij be the minimax weight over all paths from vertex i to vertex j with all intermediate vertices in the set {1, 2, . . . , k}.

b. Give a recurrence for m(k)ij, where k ≥ 0.

c. Let Tm = {(i, j) ∈ E : w(i, j) = mij}. Prove that the edges in Tm form a spanning tree of G.

d. Show that Tm = T. (Hint: Consider the effect of adding edge (i, j) to T and removing an edge on another path from i to j. Consider also the effect of removing edge (i, j) from T and replacing it with another edge.)

# Chapter notes

Lawler [132] has a good discussion of the all-pairs shortest-paths problem, although he does not analyze solutions for sparse graphs. He attributes the matrix-multiplication algorithm to the folklore. The Floyd-Warshall algorithm is due to Floyd [68], who based it on a theorem of Warshall [198] that describes how to compute the transitive closure of boolean matrices. The closed-semiring algebraic structure appears in Aho, Hopcroft, and Ullman [4]. Johnson's algorithm is taken from [114].