Dynamic programming, like the divide-and-conquer method, solves problems by combining the solutions to subproblems. ("Programming" in this context refers to a tabular method, not to writing computer code.) As we saw in Chapter 1, divide-and-conquer algorithms partition the problem into independent subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem. In contrast, dynamic programming is applicable when the subproblems are not independent, that is, when subproblems share subsubproblems. In this context, a divide-and-conquer algorithm does more work than necessary, repeatedly solving the common subsubproblems. A dynamic-programming algorithm solves every subsubproblem just once and then saves its answer in a table, thereby avoiding the work of recomputing the answer every time the subsubproblem is encountered.

Dynamic programming is typically applied to **optimization problems**. In such problems there can be many possible solutions. Each solution has a value, and we wish to find a solution with the optimal (minimum or maximum) value. We call such a solution *an* optimal solution to the problem, as opposed to *the* optimal solution, since there may be several solutions that achieve the optimal value.

The development of a dynamic-programming algorithm can be broken into a sequence of four steps.

1. Characterize the structure of an optimal solution.

2. Recursively define the value of an optimal solution.

3. Compute the value of an optimal solution in a bottom-up fashion.

4. Construct an optimal solution from computed information.

Our first example of dynamic programming is an algorithm that solves the problem of matrix-chain multiplication. We are given a sequence (chain) *A*_{1}, *A*_{2}, . . . , *A*_{n} of *n* matrices to be multiplied, and we wish to compute the product

A_{1}A_{2} ⋯ A_{n} .    (16.1)

We can evaluate the expression (16.1) using the standard algorithm for multiplying pairs of matrices as a subroutine once we have parenthesized it to resolve all ambiguities in how the matrices are multiplied together. A product of matrices is *fully parenthesized* if it is either a single matrix or the product of two fully parenthesized matrix products, surrounded by parentheses. Matrix multiplication is associative, and so all parenthesizations yield the same product. For example, if the chain of matrices is *A*_{1}, *A*_{2}, *A*_{3}, *A*_{4}, the product *A*_{1}*A*_{2}*A*_{3}*A*_{4} can be fully parenthesized in five distinct ways:

(A_{1}(A_{2}(A_{3}A_{4}))) ,

(A_{1}((A_{2}A_{3})A_{4})) ,

((A_{1}A_{2})(A_{3}A_{4})) ,

((A_{1}(A_{2}A_{3}))A_{4}) ,

(((A_{1}A_{2})A_{3})A_{4}) .

MATRIX-MULTIPLY(A, B)

1  if columns[A] ≠ rows[B]
2     then error "incompatible dimensions"
3     else for i ← 1 to rows[A]
4              do for j ← 1 to columns[B]
5                     do C[i, j] ← 0
6                        for k ← 1 to columns[A]
7                            do C[i, j] ← C[i, j] + A[i, k] · B[k, j]
8  return C
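The pseudocode translates almost line for line into Python; the following is a minimal sketch (the function name `matrix_multiply` and the list-of-lists matrix representation are my own choices):

```python
def matrix_multiply(A, B):
    """Standard algorithm for multiplying a p x q matrix by a q x r matrix."""
    if len(A[0]) != len(B):
        raise ValueError("incompatible dimensions")
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                C[i][j] += A[i][k] * B[k][j]
    return C
```

Multiplying a *p* × *q* matrix by a *q* × *r* matrix this way takes *pqr* scalar multiplications, which is the quantity the matrix-chain problem minimizes.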

Problem 13-4 asked you to show that the solution to this recurrence is the sequence of *Catalan numbers*:

P(n) = C(n - 1),

The first step of the dynamic-programming paradigm is to characterize the structure of an optimal solution. For the matrix-chain multiplication problem, we can perform this step as follows. For convenience, let us adopt the notation *A*_{i..j} for the matrix that results from evaluating the product *A*_{i}*A*_{i+1} ⋯ *A*_{j}.

m[i, j] = m[i, k] + m[k + 1, j] + p_{i-1}p_{k}p_{j} .
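Combining the base case *m*[*i*, *i*] = 0 with this split cost, minimized over all possible positions of the split *k*, gives the full recurrence (restated here for clarity):

```latex
m[i, j] =
\begin{cases}
0 & \text{if } i = j, \\
\min_{\,i \le k < j} \{\, m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \,\} & \text{if } i < j.
\end{cases}
```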

MATRIX-CHAIN-ORDER(p)

1  n ← length[p] - 1
2  for i ← 1 to n
3      do m[i, i] ← 0
4  for l ← 2 to n
5      do for i ← 1 to n - l + 1
6             do j ← i + l - 1
7                m[i, j] ← ∞
8                for k ← i to j - 1
9                    do q ← m[i, k] + m[k + 1, j] + p_{i-1}p_{k}p_{j}
10                       if q < m[i, j]
11                          then m[i, j] ← q
12                               s[i, j] ← k
13 return m and s
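A direct Python transcription of MATRIX-CHAIN-ORDER can look as follows (a sketch; the function name and the 1-indexed tables padded with an unused row 0 are my own conventions):

```python
import math

def matrix_chain_order(p):
    """Bottom-up DP; matrix A_i has dimensions p[i-1] x p[i] (1-indexed)."""
    n = len(p) - 1
    # m[i][j]: minimal scalar multiplications for A_i..A_j; s[i][j]: optimal split.
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):            # l is the length of the chain
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = math.inf
            for k in range(i, j):        # try each split point k
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s
```

For the six-matrix example whose dimensions are tabulated below, p = (30, 35, 15, 5, 10, 20, 25), the entry m[1][6] comes out to 15125 scalar multiplications.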

matrix    dimension
-------------------
A_{1}     30 × 35
A_{2}     35 × 15
A_{3}     15 × 5
A_{4}     5 × 10
A_{5}     10 × 20
A_{6}     20 × 25

Although MATRIX-CHAIN-ORDER determines the optimal number of scalar multiplications needed to compute a matrix-chain product, it does not directly show how to multiply the matrices. Step 4 of the dynamic-programming paradigm is to construct an optimal solution from computed information.

MATRIX-CHAIN-MULTIPLY(A, s, i, j)

1  if j > i
2     then X ← MATRIX-CHAIN-MULTIPLY(A, s, i, s[i, j])
3          Y ← MATRIX-CHAIN-MULTIPLY(A, s, s[i, j] + 1, j)
4          return MATRIX-MULTIPLY(X, Y)
5     else return A_{i}

((A_{1}(A_{2}A_{3}))((A_{4}A_{5})A_{6})) .
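The split table *s* also yields the optimal parenthesization directly as a string, by the same recursive walk MATRIX-CHAIN-MULTIPLY performs. A self-contained sketch (it recomputes the tables with the same bottom-up pass rather than assuming they already exist; the names are my own):

```python
import math

def optimal_parens(p):
    """Return the optimal full parenthesization of A_1..A_n as a string."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = math.inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k

    def walk(i, j):
        # A single matrix needs no parentheses; otherwise split at s[i][j].
        if i == j:
            return "A%d" % i
        return "(" + walk(i, s[i][j]) + walk(s[i][j] + 1, j) + ")"

    return walk(1, n)
```

On the six-matrix example this reproduces the parenthesization displayed above.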

Give an efficient algorithm PRINT-OPTIMAL-PARENS to print the optimal parenthesization of a matrix chain given the table *s* computed by MATRIX-CHAIN-ORDER. Analyze your algorithm.

(*Hint*: You may find the identity

Show that a full parenthesization of an *n*-element expression has exactly *n* - 1 pairs of parentheses.

Although we have just worked through an example of the dynamic-programming method, you might still be wondering just when the method applies. From an engineering perspective, when should we look for a dynamic-programming solution to a problem? In this section, we examine the two key ingredients that an optimization problem must have for dynamic programming to be applicable: optimal substructure and overlapping subproblems. We also look at a variant method, called memoization, for taking advantage of the overlapping-subproblems property.

The first step in solving an optimization problem by dynamic programming is to characterize the structure of an optimal solution. We say that a problem exhibits **optimal substructure** if an optimal solution to the problem contains within it optimal solutions to subproblems. Whenever a problem exhibits optimal substructure, it is a good clue that dynamic programming might apply. (It also might mean that a greedy strategy applies, however. See Chapter 17.)

The second ingredient that an optimization problem must have for dynamic programming to be applicable is that the space of subproblems must be "small" in the sense that a recursive algorithm for the problem solves the same subproblems over and over, rather than always generating new subproblems. Typically, the total number of distinct subproblems is a polynomial in the input size. When a recursive algorithm revisits the same problem over and over again, we say that the optimization problem has * overlapping subproblems*. In contrast, a problem for which a divide-and-conquer approach is suitable usually generates brand-new problems at each step of the recursion. Dynamic-programming algorithms typically take advantage of overlapping subproblems by solving each subproblem once and then storing the solution in a table where it can be looked up when needed, using constant time per lookup.

RECURSIVE-MATRIX-CHAIN(p, i, j)

1  if i = j
2     then return 0
3  m[i, j] ← ∞
4  for k ← i to j - 1
5      do q ← RECURSIVE-MATRIX-CHAIN(p, i, k)
               + RECURSIVE-MATRIX-CHAIN(p, k + 1, j) + p_{i-1}p_{k}p_{j}
6         if q < m[i, j]
7            then m[i, j] ← q
8  return m[i, j]

There is a variation of dynamic programming that often offers the efficiency of the usual dynamic-programming approach while maintaining a top-down strategy. The idea is to *memoize* the natural, but inefficient, recursive algorithm. As in ordinary dynamic programming, we maintain a table with subproblem solutions, but the control structure for filling in the table is more like the recursive algorithm.

The following procedure is a memoized version of RECURSIVE-MATRIX-CHAIN.

MEMOIZED-MATRIX-CHAIN(p)

1  n ← length[p] - 1
2  for i ← 1 to n
3      do for j ← i to n
4             do m[i, j] ← ∞
5  return LOOKUP-CHAIN(p, 1, n)

LOOKUP-CHAIN(p, i, j)

1  if m[i, j] < ∞
2     then return m[i, j]
3  if i = j
4     then m[i, j] ← 0
5     else for k ← i to j - 1
6              do q ← LOOKUP-CHAIN(p, i, k)
                      + LOOKUP-CHAIN(p, k + 1, j) + p_{i-1}p_{k}p_{j}
7                 if q < m[i, j]
8                    then m[i, j] ← q
9  return m[i, j]
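In Python, the same top-down scheme can be sketched with a dictionary serving as the memo table (`functools.lru_cache` would do equally well; the names here are my own):

```python
import math

def memoized_matrix_chain(p):
    """Top-down matrix-chain DP; the memo table is filled on demand."""
    n = len(p) - 1
    memo = {}

    def lookup(i, j):
        if (i, j) in memo:        # subproblem already solved: constant-time lookup
            return memo[(i, j)]
        if i == j:
            q = 0
        else:
            q = min(lookup(i, k) + lookup(k + 1, j) + p[i - 1] * p[k] * p[j]
                    for k in range(i, j))
        memo[(i, j)] = q
        return q

    return lookup(1, n)
```

Each of the Θ(n²) subproblems is solved once, so the running time matches the bottom-up version's O(n³) despite the recursive control structure.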

Draw the recursion tree for the MERGE-SORT procedure from Section 1.3.1 on an array of 16 elements. Explain why memoization is ineffective in speeding up a good divide-and-conquer algorithm such as MERGE-SORT.

The next problem we shall consider is the longest-common-subsequence problem. A subsequence of a given sequence is just the given sequence with some elements (possibly none) left out. Formally, given a sequence *X* = *x*_{1}, *x*_{2}, . . . , *x*_{m}, another sequence *Z* = *z*_{1}, *z*_{2}, . . . , *z*_{k} is a *subsequence* of *X* if there exists a strictly increasing sequence *i*_{1}, *i*_{2}, . . . , *i*_{k} of indices of *X* such that for all *j* = 1, 2, . . . , *k*, we have *x*_{i_j} = *z*_{j}.

Given two sequences *X* and *Y*, we say that a sequence *Z* is a *common subsequence* of *X* and *Y* if *Z* is a subsequence of both *X* and *Y*.

The LCS problem has an optimal-substructure property, however, as the following theorem shows. As we shall see, the natural class of subproblems corresponds to pairs of "prefixes" of the two input sequences. To be precise, given a sequence *X* = *x*_{1}, *x*_{2}, . . . , *x*_{m}, we define the *i*th *prefix* of *X*, for *i* = 0, 1, . . . , *m*, as *X*_{i} = *x*_{1}, *x*_{2}, . . . , *x*_{i}.

1. If *x*_{m} = *y*_{n}, then *z*_{k} = *x*_{m} = *y*_{n} and *Z*_{k-1} is an LCS of *X*_{m-1} and *Y*_{n-1}.

2. If *x*_{m} ≠ *y*_{n}, then *z*_{k} ≠ *x*_{m} implies that *Z* is an LCS of *X*_{m-1} and *Y*.

3. If *x*_{m} ≠ *y*_{n}, then *z*_{k} ≠ *y*_{n} implies that *Z* is an LCS of *X* and *Y*_{n-1}.

(3) The proof is symmetric to (2).
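The three cases give the recurrence that the LCS algorithm computes, where *c*[*i*, *j*] denotes the length of an LCS of the prefixes *X*_{i} and *Y*_{j}:

```latex
c[i, j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0, \\
c[i-1, j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j, \\
\max(c[i, j-1],\; c[i-1, j]) & \text{if } i, j > 0 \text{ and } x_i \ne y_j.
\end{cases}
```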

Procedure LCS-LENGTH takes two sequences *X* = *x*_{1}, *x*_{2}, . . . , *x*_{m} and *Y* = *y*_{1}, *y*_{2}, . . . , *y*_{n} as inputs.

LCS-LENGTH(X, Y)

1  m ← length[X]
2  n ← length[Y]
3  for i ← 1 to m
4      do c[i, 0] ← 0
5  for j ← 0 to n
6      do c[0, j] ← 0
7  for i ← 1 to m
8      do for j ← 1 to n
9             do if x_{i} = y_{j}
10                   then c[i, j] ← c[i - 1, j - 1] + 1
11                        b[i, j] ← "↖"
12                   else if c[i - 1, j] ≥ c[i, j - 1]
13                           then c[i, j] ← c[i - 1, j]
14                                b[i, j] ← "↑"
15                           else c[i, j] ← c[i, j - 1]
16                                b[i, j] ← "←"
17 return c and b
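A Python sketch of the table-filling portion of LCS-LENGTH (the *b* table is omitted since only the length is computed here; the function name is my own):

```python
def lcs_length(X, Y):
    """Length of a longest common subsequence of sequences X and Y (O(mn) DP)."""
    m, n = len(X), len(Y)
    # c[i][j]: LCS length of the prefixes X[:i] and Y[:j]; row/column 0 stay 0.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
            else:
                c[i][j] = c[i][j - 1]
    return c[m][n]
```

For example, lcs_length("ABCBDAB", "BDCABA") returns 4; one LCS of those two strings is "BCBA".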

PRINT-LCS(b, X, i, j)

1  if i = 0 or j = 0
2     then return
3  if b[i, j] = "↖"
4     then PRINT-LCS(b, X, i - 1, j - 1)
5          print x_{i}
6     else if b[i, j] = "↑"
7             then PRINT-LCS(b, X, i - 1, j)
8             else PRINT-LCS(b, X, i, j - 1)

Determine an LCS of 1,0,0,1,0,1,0,1 and 0,1,0,1,1,0,1,1,0.

Give a memoized version of LCS-LENGTH that runs in *O*(*mn*) time.

In this section, we investigate the problem of optimally triangulating a convex polygon. Despite its outward appearance, we shall see that this geometric problem has a strong similarity to matrix-chain multiplication.

A **polygon** is a piecewise-linear, closed curve in the plane. That is, it is a curve ending on itself that is formed by a sequence of straight-line segments, called the *sides* of the polygon. A point joining two consecutive sides is called a *vertex* of the polygon.

Given two nonadjacent vertices *v*_{i} and *v*_{j}, the segment *v*_{i}*v*_{j} is a *chord* of the polygon.

In the *optimal (polygon) triangulation problem*, we are given a convex polygon

There is a surprising correspondence between the triangulation of a polygon and the parenthesization of an expression such as a matrix-chain product. This correspondence is best explained using trees.

A full parenthesization of an expression corresponds to a full binary tree, sometimes called the *parse tree* of the expression. Figure 16.5(a) shows a parse tree for the parenthesized matrix-chain product

((A_{1}(A_{2}A_{3}))(A_{4}(A_{5}A_{6}))) .

Consider an optimal triangulation *T* of an (*n* + 1)-vertex polygon *P* = *v*_{0}, *v*_{1}, . . . , *v*_{n} that includes the triangle *v*_{0}*v*_{k}*v*_{n} for some *k*, where 1 ≤ *k* ≤ *n* - 1.

Find an optimal triangulation of a regular octagon with unit-length sides. Use the weight function

16-1 Bitonic euclidean traveling-salesman problem

The *euclidean traveling-salesman problem* is the problem of determining the shortest closed tour that connects a given set of *n* points in the plane.

J. L. Bentley has suggested that we simplify the problem by restricting our attention to *bitonic tours*, that is, tours that start at the leftmost point, go strictly left to right to the rightmost point, and then go strictly right to left back to the starting point. Figure 16.6(b) shows the shortest bitonic tour of the same 7 points. In this case, a polynomial-time algorithm is possible.

Consider the problem of neatly printing a paragraph on a printer. The input text is a sequence of *n* words of lengths *l*_{1}, *l*_{2}, . . . , *l*_{n}, measured in characters. We want to print this paragraph neatly on a number of lines that hold a maximum of *M* characters each. Our criterion of "neatness" is as follows. If a given line contains words *i* through *j* and we leave exactly one space between words, the number of extra space characters at the end of the line is *M* - *j* + *i* - ∑_{k=i}^{j} *l*_{k}. We wish to minimize the sum, over all lines except the last, of the cubes of the numbers of extra space characters at the ends of lines. Give a dynamic-programming algorithm to print a paragraph of *n* words neatly on a printer. Analyze the running time and space requirements of your algorithm.
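One workable formulation, sketched here under assumptions of my own (every word fits on a line by itself, and cost[j] denotes the minimum badness of printing the first *j* words):

```python
import math

def neat_print_cost(l, M):
    """Minimum total cubed end-of-line slack (last line free) for word lengths l."""
    n = len(l)
    cost = [0] + [math.inf] * n      # cost[j]: best badness for words 1..j
    for j in range(1, n + 1):
        width = -1                   # width of words i..j plus single spaces
        for i in range(j, 0, -1):    # try putting words i..j on the last line
            width += l[i - 1] + 1
            if width > M:            # words i..j no longer fit on one line
                break
            extra = M - width
            line_cost = 0 if j == n else extra ** 3   # last line costs nothing
            cost[j] = min(cost[j], cost[i - 1] + line_cost)
    return cost[n]
```

The double loop gives O(*n* · *M*) time (the inner loop breaks once the line overflows) and O(*n*) space; recording the minimizing *i* for each *j* would recover the actual line breaks.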

When a "smart" terminal updates a line of text, replacing an existing "source" string *x*[1..*m*] with a new "target" string *y*[1..*n*], there are several ways in which the changes can be made. A single character of the source string can be deleted, replaced by another character, or copied to the target string; characters can be inserted; or two adjacent characters of the source string can be interchanged ("twiddled") while being copied to the target string. After all the other operations have occurred, an entire suffix of the source string can be deleted, an operation known as "kill to end of line."

Operation            Target string   Source string
--------------------------------------------------
copy a               a               lgorithm
copy l               al              gorithm
replace g by t       alt             orithm
delete o             alt             rithm
copy r               altr            ithm
insert u             altru           ithm
insert i             altrui          ithm
insert s             altruis         ithm
twiddle it into ti   altruisti       hm
insert c             altruistic      hm
kill hm              altruistic

There are many other sequences of operations that accomplish the same result.

(3 · cost(copy)) + cost(replace) + cost(delete) + (4 · cost(insert))

   + cost(twiddle) + cost(kill) .
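As a warm-up for the full transformation problem, here is a hedged sketch of the classical special case with free copy and unit-cost replace, delete, and insert (i.e., Levenshtein edit distance); the twiddle and kill operations are left out:

```python
def edit_distance(x, y):
    """Minimum cost (copy free; replace/delete/insert cost 1) to turn x into y."""
    m, n = len(x), len(y)
    # d[i][j]: cheapest transformation of the prefix x[:i] into y[:j].
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                  # delete all of x[:i]
    for j in range(n + 1):
        d[0][j] = j                  # insert all of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            copy_or_replace = d[i - 1][j - 1] + (0 if x[i - 1] == y[j - 1] else 1)
            d[i][j] = min(copy_or_replace,
                          d[i - 1][j] + 1,     # delete x[i-1]
                          d[i][j - 1] + 1)     # insert y[j-1]
    return d[m][n]
```

Handling the remaining operations means adding a d[i - 2][j - 2] transition for twiddle and a final minimization over suffix deletions for kill, with the given costs in place of the unit costs assumed here.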

*b.* How can the professor ensure that the president gets invited to his own party?

We can use dynamic programming on a directed graph *G* = (*V*, *E*) for speech recognition. Each edge (*u*, *v*) ∈ *E* is labeled with a sound σ(*u*, *v*) from a finite set of sounds. The labeled graph is a formal model of a person speaking a restricted language. Each path in the graph starting from a distinguished vertex *v*_{0} ∈ *V* corresponds to a possible sequence of sounds produced by the model. The label of a directed path is defined to be the concatenation of the labels of the edges on that path.

The *O*(*mn*)-time algorithm for the longest-common-subsequence problem seems to be a folk algorithm. Knuth [43] posed the question of whether subquadratic algorithms for the LCS problem exist. Masek and Paterson [143] answered this question in the affirmative by giving an algorithm that runs in *O*(*mn*/lg *n*) time, where *n* ≤ *m* and the sequences are drawn from a set of bounded size. For the special case in which no element appears more than once in an input sequence, Szymanski [184] shows that the problem can be solved in *O*((*n* + *m*) lg(*n* + *m*)) time. Many of these results extend to the problem of computing string edit distances (Problem 16-3).