Many problems of practical significance are NP-complete but are too important to abandon merely because obtaining an optimal solution is intractable. If a problem is NP-complete, we are unlikely to find a polynomial-time algorithm for solving it exactly, but this does not imply that all hope is lost. There are two approaches to getting around NP-completeness. First, if the actual inputs are small, an algorithm with exponential running time may be perfectly satisfactory. Second, it may still be possible to find *near-optimal* solutions in polynomial time (either in the worst case or on the average). In practice, near-optimality is often good enough. An algorithm that returns near-optimal solutions is called an *approximation algorithm*. This chapter presents polynomial-time approximation algorithms for several NP-complete problems.

Performance bounds for approximation algorithms

We say that an approximation algorithm for the problem has a *ratio bound* of ρ(*n*) if, for any input of size *n*, the cost *C* of the solution produced by the approximation algorithm is within a factor of ρ(*n*) of the cost *C** of an optimal solution:

max(C/C*, C*/C) ≤ ρ(n) .

Sometimes, it is more convenient to work with a measure of relative error. For any input, the *relative error* of the approximation algorithm is defined to be

|C - C*| / C* ,

and we say that an approximation algorithm has a *relative error bound* of ε(*n*) if |C - C*| / C* ≤ ε(n). For a minimization problem, the relative error bound is related to the ratio bound by

ε(n) ≤ ρ(n) - 1 .

An *approximation scheme* for an optimization problem is an approximation algorithm that takes as input not only an instance of the problem, but also a value ε > 0 such that for any fixed ε, the scheme is an approximation algorithm with relative error bound ε. We say that an approximation scheme is a *polynomial-time approximation scheme* if, for any fixed ε > 0, its running time is polynomial in the size *n* of its input instance.

We say that an approximation scheme is a *fully polynomial-time approximation scheme* if its running time is polynomial both in 1/ε and in the size *n* of the input instance.

The vertex-cover problem was defined and proved NP-complete in Section 36.5.2. A *vertex cover* of an undirected graph *G* = (*V*, *E*) is a subset *V*' ⊆ *V* such that if (*u*, *v*) ∈ *E*, then *u* ∈ *V*' or *v* ∈ *V*' (or both). The *size* of a vertex cover is the number of vertices in it.

The *vertex-cover problem* is to find a vertex cover of minimum size in a given undirected graph. We call such a vertex cover an *optimal vertex cover*.

APPROX-VERTEX-COVER(G)
1 C ← Ø
2 E' ← E[G]
3 while E' ≠ Ø
4   do let (u, v) be an arbitrary edge of E'
5      C ← C ∪ {u, v}
6      remove from E' every edge incident on either u or v
7 return C

Figure 37.1 illustrates the operation of APPROX-VERTEX-COVER. The variable *C* contains the vertex cover being constructed. Line 1 initializes *C* to the empty set. Line 2 sets *E*' to be a copy of the edge set *E*[*G*] of the graph. The loop on lines 3-6 repeatedly picks an edge (*u*, *v*) from *E*', adds its endpoints *u* and *v* to *C*, and deletes all edges in *E*' that are covered by either *u* or *v*. The running time of this algorithm is *O*(*E*), using an appropriate data structure for representing *E*'.
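The same strategy can be sketched in Python. This is a minimal illustration, not the *O*(*E*) implementation mentioned above: the list rebuilt on each iteration makes it quadratic in the worst case.

```python
def approx_vertex_cover(edges):
    """Return a vertex cover of size at most twice the optimum.

    edges: a list of (u, v) pairs. Repeatedly pick an arbitrary
    remaining edge, add both endpoints to the cover, and delete
    every edge covered by either endpoint.
    """
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]          # an arbitrary edge of E'
        cover.update((u, v))         # add both endpoints to C
        # remove every edge incident on either u or v
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover

# Example: on the path a-b-c-d, picking edge (a, b) first and then
# (c, d) yields the cover {a, b, c, d}; an optimal cover is {b, c}.
print(sorted(approx_vertex_cover([("a", "b"), ("b", "c"), ("c", "d")])))
```

This also illustrates Exercise 37.1-1 below: on the path graph, the heuristic always returns a cover of size 4 even though an optimal cover has size 2.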

APPROX-VERTEX-COVER has a ratio bound of 2.

Give an example of a graph for which APPROX-VERTEX-COVER always yields a suboptimal solution.

Give an efficient greedy algorithm that finds an optimal vertex cover for a tree in linear time.

From the proof of Theorem 36.12, we know that the vertex-cover problem and the NP-complete clique problem are complementary in the sense that an optimal vertex cover is the complement of a maximum-size clique in the complement graph. Does this relationship imply that there is an approximation algorithm with constant ratio bound for the clique problem? Justify your answer.

In the traveling-salesman problem introduced in Section 36.5.5, we are given a complete undirected graph *G* = (*V*, *E*) that has a nonnegative integer cost *c*(*u*, *v*) associated with each edge (*u*, *v*) ∈ *E*, and we must find a hamiltonian cycle (a tour) of *G* with minimum cost. As an extension of our notation, let *c*(*A*) denote the total cost of the edges in the subset *A* ⊆ *E*:

c(A) = ∑_{(u,v)∈A} c(u, v) .

In many practical situations, it is always cheapest to go directly from a place *u* to a place *w*; going by way of any intermediate stop *v* can't be less expensive. Putting it another way, cutting out an intermediate stop never increases the cost. We formalize this notion by saying that the cost function *c* satisfies the *triangle inequality* if for all vertices *u*, *v*, *w* ∈ *V*,

c(u, w) ≤ c(u, v) + c(v, w) .

The following algorithm computes a near-optimal tour of an undirected graph *G*, using the minimum-spanning-tree algorithm MST-PRIM from Section 24.2. We shall see that when the cost function satisfies the triangle inequality, the tour that this algorithm returns is no worse than twice as long as an optimal tour.

APPROX-TSP-TOUR(G, c)
1 select a vertex r ∈ V[G] to be a "root" vertex
2 grow a minimum spanning tree T for G from root r
      using MST-PRIM(G, c, r)
3 let L be the list of vertices visited in a preorder tree walk of T
4 return the hamiltonian cycle H that visits the vertices in the order L
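The four lines above can be sketched in Python. In this illustration the cost function is assumed to be a dictionary keyed by ordered vertex pairs, and a heap-based Prim's algorithm stands in for MST-PRIM.

```python
import heapq
from collections import defaultdict

def approx_tsp_tour(vertices, cost):
    """Return a vertex ordering whose tour cost is at most twice
    optimal when cost satisfies the triangle inequality."""
    root = vertices[0]                       # line 1: pick a root
    # line 2: grow a minimum spanning tree from root (Prim's algorithm)
    in_tree = {root}
    children = defaultdict(list)             # tree structure for the walk
    heap = [(cost[root, v], root, v) for v in vertices if v != root]
    heapq.heapify(heap)
    while len(in_tree) < len(vertices):
        c, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue                         # stale entry; v already added
        in_tree.add(v)
        children[u].append(v)
        for w in vertices:
            if w not in in_tree:
                heapq.heappush(heap, (cost[v, w], v, w))
    # line 3: list the vertices in a preorder walk of T
    order = []
    def preorder(u):
        order.append(u)
        for v in children[u]:
            preorder(v)
    preorder(root)
    return order                             # line 4: visit in this order

# Hypothetical example: three points on a line at coordinates 0, 1, 3,
# with costs given by distances (which satisfy the triangle inequality).
pos = {"a": 0, "b": 1, "c": 3}
cost = {(u, v): abs(pos[u] - pos[v]) for u in pos for v in pos if u != v}
print(approx_tsp_tour(["a", "b", "c"], cost))
```

The returned list is a preorder of the minimum spanning tree; closing the cycle back to the root gives the tour *H*.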

Let *H** denote an optimal tour for the given set of vertices. Since we obtain a spanning tree by deleting any edge from a tour, and each edge cost is nonnegative, the cost of the minimum spanning tree *T* is a lower bound on the cost of an optimal tour:

c(T) ≤ c(H*) .                                                  (37.4)

A *full walk* of *T* lists the vertices when they are first visited and also whenever they are returned to after a visit to a subtree. Call this walk *W*. The full walk of our example gives the ordering

a, b, c, b, h, b, a, d, e, f, e, g, e, d, a.

Since the full walk traverses every edge of *T* exactly twice, we have

c(W) = 2c(T) .                                                  (37.5)

Equations (37.4) and (37.5) imply that

c(W) ≤ 2c(H*) ,                                                 (37.6)

and so the cost of *W* is within a factor of 2 of the cost of an optimal tour.

Unfortunately, *W* is generally not a tour, since it visits some vertices more than once. By the triangle inequality, however, we can delete a visit to any vertex from *W* and the cost does not increase. (If a vertex *v* is deleted from *W* between visits to *u* and *w*, the resulting ordering specifies going directly from *u* to *w*.) By repeatedly applying this operation, we can remove from *W* all but the first visit to each vertex. In our example, this leaves the ordering

a, b, c, h, d, e, f, g .

This ordering is the same as that obtained by a preorder walk of *T*, and so it corresponds to the hamiltonian cycle *H* computed by APPROX-TSP-TOUR. Since *H* is obtained from *W* by deleting visits, we have

c(H) ≤ c(W) .                                                   (37.7)

Combining inequalities (37.6) and (37.7) completes the proof.

E' = {(u, v) : u, v ∈ V and u ≠ v} .

Assign an integer cost to each edge in *E*' as follows: let *c*(*u*, *v*) = 1 if (*u*, *v*) ∈ *E*, and let *c*(*u*, *v*) = ρ|*V*| + 1 otherwise.

(ρ|V| + 1) + (|V| - 1) > ρ|V| .

Consider the following **closest-point heuristic** for building an approximate traveling-salesman tour. Begin with a trivial cycle consisting of a single arbitrarily chosen vertex. At each step, identify the vertex *u* that is not on the cycle but whose distance to any vertex on the cycle is minimum. Suppose that the vertex on the cycle that is nearest *u* is vertex *v*. Extend the cycle to include *u* by inserting *u* just after *v*. Repeat until all vertices are on the cycle. Prove that this heuristic returns a tour whose total cost is not more than twice the cost of an optimal tour.

The **bottleneck traveling-salesman problem** is the problem of finding the hamiltonian cycle such that the length of the longest edge in the cycle is minimized. Assuming that the cost function satisfies the triangle inequality, show that there exists a polynomial-time approximation algorithm with ratio bound 3 for this problem. (*Hint:* Show recursively that we can visit all the nodes in a minimum spanning tree exactly once by taking a full walk of the tree and skipping nodes, but without skipping more than two consecutive intermediate nodes.)

An instance of the *set-covering problem* consists of a finite set *X* and a family 𝓕 of subsets of *X*, such that every element of *X* belongs to at least one subset in 𝓕.

We say that a subset *S* ∈ 𝓕 *covers* its elements. The problem is to find a minimum-size subset 𝓒 ⊆ 𝓕 whose members cover all of *X*:

⋃_{S∈𝓒} S = X .                                                 (37.8)

We say that any 𝓒 satisfying equation (37.8) **covers** *X*. Figure 37.3 illustrates the problem.

The greedy method works by picking, at each stage, the set *S* that covers the most remaining uncovered elements:

GREEDY-SET-COVER(X, 𝓕)
1 U ← X
2 𝓒 ← Ø
3 while U ≠ Ø
4   do select an S ∈ 𝓕 that maximizes |S ∩ U|
5      U ← U - S
6      𝓒 ← 𝓒 ∪ {S}
7 return 𝓒
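This greedy strategy can be sketched in Python. The straightforward scan below recomputes each intersection on every pass and makes no attempt at the more efficient implementation discussed in the exercises.

```python
def greedy_set_cover(X, F):
    """Greedy set cover: X is a set of elements, F a list of sets
    whose union is X. Returns a list of sets from F covering X."""
    uncovered = set(X)
    cover = []
    while uncovered:
        # pick the set covering the most still-uncovered elements
        best = max(F, key=lambda S: len(S & uncovered))
        cover.append(best)
        uncovered -= best
    return cover

# Hypothetical example instance (not the one in Figure 37.3):
X = {1, 2, 3, 4, 5}
F = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(greedy_set_cover(X, F))   # picks {1, 2, 3}, then {4, 5}
```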

GREEDY-SET-COVER has a ratio bound H(max{|S| : S ∈ 𝓕}), where H(d) = ∑_{i=1}^{d} 1/i is the *d*th harmonic number.

From inequalities (37.9) and (37.10), it follows that

proving the theorem. It thus remains only to prove inequality (37.10). For any set *S* ∈ 𝓕, let

For integers *a* and *b*, where *a* < *b*, we have

Using this inequality, we obtain the telescoping sum

since *H*(0) = 0. This completes the proof of inequality (37.10).

GREEDY-SET-COVER has a ratio bound of (ln |*X*| + 1).

*Proof* Use inequality (3.12) and Theorem 37.4.

Show that the decision version of the set-covering problem is NP-complete by reduction from the vertex-cover problem.

Show how to implement GREEDY-SET-COVER in such a way that it runs in time O(∑_{S∈𝓕} |S|).

Show that the following weaker form of Theorem 37.4 is trivially true:

An instance of the subset-sum problem is a pair (*S*, *t*), where *S* is a set {*x*_{1}, *x*_{2}, . . . , *x_{n}*} of positive integers and *t* is a positive integer. The decision problem asks whether there is a subset of *S* that adds up exactly to the target value *t*. As an optimization problem, we wish to find a subset of *S* whose sum is as large as possible but not larger than *t*.

S + x = {s + x : s ∈ S} .

We use an auxiliary procedure MERGE-LISTS(*L*, *L*') that returns the sorted list that is the merge of its two sorted input lists *L* and *L*'. Like the MERGE procedure we used in merge sort (Section 1.3.1), MERGE-LISTS runs in time *O*(|*L*| + |*L*'|). (We omit giving pseudocode for MERGE-LISTS.) The procedure EXACT-SUBSET-SUM takes an input set *S* = {*x*_{1}, *x*_{2}, . . . , *x_{n}*} and a target value *t*.

EXACT-SUBSET-SUM(S, t)
1 n ← |S|
2 L_{0} ← ⟨0⟩
3 for i ← 1 to n
4   do L_{i} ← MERGE-LISTS(L_{i-1}, L_{i-1} + x_{i})
5      remove from L_{i} every element that is greater than t
6 return the largest element in L_{n}
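In Python, the procedure can be sketched as follows, with a set-based merge standing in for MERGE-LISTS (this also removes duplicates, which the sorted-list merge would keep).

```python
def merge_lists(L1, L2):
    """Merge two sorted lists into one sorted list without duplicates."""
    return sorted(set(L1) | set(L2))

def exact_subset_sum(S, t):
    """Return the largest subset sum of S that does not exceed t."""
    L = [0]                                      # L_0 = <0>
    for x in S:
        L = merge_lists(L, [s + x for s in L])   # L_{i-1} and L_{i-1} + x_i
        L = [s for s in L if s <= t]             # discard elements > t
    return max(L)

# With S = {1, 4, 5}, the successive lists are exactly P_1, P_2, P_3
# shown above, and exact_subset_sum([1, 4, 5], 10) returns 10.
print(exact_subset_sum([1, 4, 5], 10))
```

Since the list *L_i* can have as many as 2^*i* elements, this exact procedure takes exponential time in general.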

P_{1} = {0, 1} ,

P_{2} = {0, 1, 4, 5} ,

P_{3} = {0, 1, 4, 5, 6, 9, 10} .

We can derive a fully polynomial-time approximation scheme for the subset-sum problem by "trimming" each list L_{i} after it is created. We use a trimming parameter δ such that 0 < δ < 1. To *trim* a list L by δ means to remove as many elements from L as possible, in such a way that if L' is the result of trimming L, then for every element y that was removed from L, there is an element z ≤ y still in L' such that

(1 - δ)y ≤ z ≤ y .

For example, if δ = 0.1 and

L = ⟨10, 11, 12, 15, 20, 21, 22, 23, 24, 29⟩ ,

then L can be trimmed to

L' = ⟨10, 12, 15, 20, 23, 29⟩ ,

where each removed value is represented by a remaining value within 10 percent of it.

The following procedure trims its input list L = ⟨y_{1}, y_{2}, . . . , y_{m}⟩, assuming that L is sorted into monotonically increasing order.

TRIM(L, δ)
1 m ← |L|
2 L' ← ⟨y_{1}⟩
3 last ← y_{1}
4 for i ← 2 to m
5   do if last < (1 - δ) y_{i}
6        then append y_{i} onto the end of L'
7             last ← y_{i}
8 return L'
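A direct Python transcription of TRIM, again assuming the input list is sorted into monotonically increasing order:

```python
def trim(L, delta):
    """Trim sorted list L: keep an element y_i only if it cannot be
    represented (within factor 1 - delta) by the last element kept."""
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:
        if last < (1 - delta) * y:   # y is too far above last to drop
            trimmed.append(y)
            last = y
    return trimmed

# Reproducing the example above with delta = 0.1:
print(trim([10, 11, 12, 15, 20, 21, 22, 23, 24, 29], 0.1))
# [10, 12, 15, 20, 23, 29]
```

Each removed value y is represented by the kept value `last` satisfying (1 - δ)y ≤ last ≤ y.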

APPROX-SUBSET-SUM(S, t, ε)
1 n ← |S|
2 L_{0} ← ⟨0⟩
3 for i ← 1 to n
4   do L_{i} ← MERGE-LISTS(L_{i-1}, L_{i-1} + x_{i})
5      L_{i} ← TRIM(L_{i}, ε/n)
6      remove from L_{i} every element that is greater than t
7 let z be the largest value in L_{n}
8 return z

As an example, suppose we have the instance

S = ⟨104, 102, 201, 101⟩

with t = 308 and ε = 0.20, so that the trimming parameter is δ = ε/n = 0.05. APPROX-SUBSET-SUM computes the following values on the indicated lines:

line 2: L_{0} = ⟨0⟩ ,

line 4: L_{1} = ⟨0, 104⟩ ,

line 5: L_{1} = ⟨0, 104⟩ ,

line 6: L_{1} = ⟨0, 104⟩ ,

line 4: L_{2} = ⟨0, 102, 104, 206⟩ ,

line 5: L_{2} = ⟨0, 102, 206⟩ ,

line 6: L_{2} = ⟨0, 102, 206⟩ ,

line 4: L_{3} = ⟨0, 102, 201, 206, 303, 407⟩ ,

line 5: L_{3} = ⟨0, 102, 201, 303, 407⟩ ,

line 6: L_{3} = ⟨0, 102, 201, 303⟩ ,

line 4: L_{4} = ⟨0, 101, 102, 201, 203, 302, 303, 404⟩ ,

line 5: L_{4} = ⟨0, 101, 201, 302, 404⟩ ,

line 6: L_{4} = ⟨0, 101, 201, 302⟩ .

The algorithm returns z = 302 as its answer, which is well within ε = 20 percent of the optimal answer 307 = 104 + 102 + 101.
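The whole scheme can be sketched as a self-contained Python function, with the trimming step inlined. Running it with t = 308 and ε = 0.20 (the target and error parameter assumed here, chosen to be consistent with the trace above) reproduces the final answer 302.

```python
def approx_subset_sum(S, t, eps):
    """Return a subset sum z of S with (1 - eps) * y_star <= z <= t,
    where y_star is the optimal (largest) subset sum not exceeding t."""
    n = len(S)
    L = [0]                                      # L_0 = <0>
    for x in S:
        L = sorted(set(L) | {s + x for s in L})  # merge L and L + x
        # trim with parameter eps/n
        trimmed, last = [L[0]], L[0]
        for y in L[1:]:
            if last < (1 - eps / n) * y:
                trimmed.append(y)
                last = y
        L = [s for s in trimmed if s <= t]       # discard elements > t
    return max(L)

print(approx_subset_sum([104, 102, 201, 101], 308, 0.20))  # 302
```

With ε = 0 no trimming occurs and the procedure degenerates to EXACT-SUBSET-SUM; the point of trimming is that each list stays short enough for the running time to be polynomial in *n* and 1/ε.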

APPROX-SUBSET-SUM is a fully polynomial-time approximation scheme for the subset-sum problem.

It can be shown by induction on *i* that for every element *y* in *P_{i}* that is at most *t*, there is a *z* ∈ *L_{i}* such that

(1 - ε/n)^{i} y ≤ z ≤ y .

If *y** ∈ *P_{n}* denotes an optimal solution to the subset-sum problem, then there is a *z* ∈ *L_{n}* such that

(1 - ε/n)^{n} y* ≤ z ≤ y* ;

the largest such *z* is the value returned by APPROX-SUBSET-SUM. Since it can be shown that

the function (1 - ε/*n*)^{n} increases with *n*, and since it equals 1 - ε at *n* = 1, we have

1 - ε < (1 - ε/n)^{n} ,

(1 - ε)y* ≤ z .

Prove equations (37.12) and (37.13).

Suppose that we are given a set of *n* objects, where the size *s_{i}* of the *i*th object satisfies 0 < *s_{i}* < 1. We wish to pack all the objects into the minimum number of unit-size bins; each bin can hold any subset of the objects whose total size does not exceed 1. The *first-fit* heuristic takes each object in turn and places it into the first bin that can accommodate it.

*b.* Argue that the optimal number of bins required is at least ⌈S⌉, where S = ∑_{i=1}^{n} s_{i}.

*c.* Argue that the first-fit heuristic leaves at most one bin less than half full.

*d.* Prove that the number of bins used by the first-fit heuristic is never more than ⌈2S⌉, where S = ∑_{i=1}^{n} s_{i}.

*e.* Prove a ratio bound of 2 for the first-fit heuristic.

*f.* Give an efficient implementation of the first-fit heuristic, and analyze its running time.
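For reference, here is a straightforward (quadratic-time) Python sketch of the first-fit heuristic itself; an efficient implementation, as part (f) asks for, would maintain the bins in a suitable search structure instead of scanning them linearly.

```python
def first_fit(sizes):
    """Place each object into the first bin with enough room,
    opening a new bin when none fits. Returns (bin loads, assignment)."""
    bins = []            # bins[j] = total size packed into bin j
    assignment = []      # assignment[i] = index of the bin holding object i
    for s in sizes:
        for j, load in enumerate(bins):
            if load + s <= 1:        # first bin that can accommodate s
                bins[j] = load + s
                assignment.append(j)
                break
        else:                        # no existing bin fits: open a new one
            bins.append(s)
            assignment.append(len(bins) - 1)
    return bins, assignment

# Hypothetical example: objects of size 0.5, 0.7, 0.5, 0.2 end up in two
# bins, with objects 0 and 2 sharing the first bin.
print(first_fit([0.5, 0.7, 0.5, 0.2]))
```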

37-2 Approximating the size of a maximum clique

Let *G* = (*V*, *E*) be an undirected graph. For any *k* ≥ 1, define *G*^{(k)} to be the undirected graph (*V*^{(k)}, *E*^{(k)}), where *V*^{(k)} is the set of all ordered *k*-tuples of vertices from *V* and *E*^{(k)} is defined so that (*v*_{1}, *v*_{2}, . . . , *v_{k}*) is adjacent to (*w*_{1}, *w*_{2}, . . . , *w_{k}*) if and only if for each *i* (1 ≤ *i* ≤ *k*), either *v_{i}* is adjacent to *w_{i}* in *G*, or else *v_{i}* = *w_{i}*.

37-3 Weighted set-covering problem

Suppose that we generalize the set-covering problem so that each set *S_{i}* in the family 𝓕 has an associated weight *w_{i}*, and the weight of a cover 𝓒 is ∑_{S_{i}∈𝓒} *w_{i}*. We wish to determine a minimum-weight cover.