Problem Set 6
Recall the Vertex Cover problem from the video lectures: given an undirected graph (with no parallel edges), compute a minimize-size subset of vertices that includes at least one endpoint of every edge. Consider the following greedy algorithm, given a graph
: (1) initialize ; (2) while is not a vertex cover of : (2a) let denote the edges with neither endpoint in ; (2b) let be some edge of ; (2c) add both endpoints of to ; (3) return . Consider the following statement: for every graph
with vertices, this greedy algorithm returns a vertex cover with size at most times that of an optimal (minimum-size) vertex cover. Which of the following is the smallest choice of the function for which this statement is true? [Hint: suppose the greedy algorithm picks an edge
with endpoints and . What can you say about every feasible solution to the problem?]
ANSWER: What the algorithm finds in the end is a matching (a set of edges no two of which share an endpoint) that is “maximal” (we can’t add any more edges to it and keep it a matching). Note that a “maximum” matching (we cannot do better) is also a “maximal” matching, but the converse is not necessarily true. See the following example:

Option 1 is correct. Let
Claim: The output of the algorithm is feasible.
Proof: We prove this by contradiction: suppose there exists an edge
Lemma: Given algorithm gives a 2-approximation for minimum vertex cover regardless of the choice of
Proof: Let
This technique of lower bounding the optimum is often key in proving approximation factors, as we are usually unable to compute the value of OPT.
This algorithm was discovered independently by Fanica Gavril and Mihalis Yannakakis.
In the set cover problem, you are given
sets , each a subset of a common set with size . The goal is to select as few of these sets as possible, subject to the constraint that the union of the chosen sets is all of . (You can assume that .) For example, if the given sets are , then the optimal solution consists of the sets . Here is a natural iterative greedy algorithm. First, initialize
, where denotes the sets chosen so far. The main loop is: as long as the union of the sets chosen so far is not the entire set :
- Let
be a set that includes the maximum-possible number of elements not in previously-chosen sets (i.e., that maximizes ). - Add
to . Consider the following statement: for every instance of the set cover problem (with
), this greedy algorithm returns a set cover with size at most times that of an optimal (minimum-size) set cover. Which of the following is the smallest choice of the function for which this statement is true? [Hint: what’s the mininum-possible progress that the greedy algorithm can make in each iteration, as a function of the size of an optimal set cover and of the number of elements that have not yet been covered?]
ANSWER: Before giving the proof, we need one useful technical inequality.
Lemma: For all
where
Proof: We use the fact that for any real

We will cheat a bit. Let
Let’s consider how many new elements we cover with each round of the algorithm. Initially, there are
Thus, with each iteration the number of remaining elements decreases by a factor of at least
How long can this go on? Since the greedy heuristic ran for
(In the last step, we just rewrote the expression in a manner that makes it easier to apply the above technical lemma.) By the above lemma we have
Now, if we multiply by
Therefore (ignoring the missing “+1” as mentioned above), the greedy set cover is larger than the optimum set cover by a factor of at most
Suppose you are given
sets , each a subset of a common set . The goal is to choose 2 of the sets, and , to maximize the size of their union. One natural heuristic is to use a greedy algorithm: (i) choose the first set to be as large as possible, breaking ties arbitrarily; (ii) choose the second set to maximize (i.e., as the set that includes as many elements as possible that are not already in ), again breaking ties arbitrarily. For example, if the given sets are , then the algorithm might pick the set in the first step; if it does so, it definitely picks the set in the second step (for an objective function value of 4). Consider the following statement: for every instance of the above problem, the greedy algorithm above chooses two sets
such that is at least times the maximum size of the union of two of the given sets. Which of the following is the largest choice of the constant for which this statement is true?
ANSWER: This is known as the Maximum Coverage Problem. We will prove the lower bound for
Claim 1:
Proof: At each step, the algorithm selects a subset whose inclusion covers the maximum number of uncovered elements. Since the optimal solution uses
Claim 2:
Proof: By induction. For
Lemma: The algorithm is a
Proof: Using Claim 2, and the fact that
Lastly, since
Consider the following job scheduling problem. There are
machines, all identical. There are jobs, and job has size . Each job must be assigned to exactly one machine. The load of a machine is the sum of the sizes of the jobs that get assigned to it. The makespan of an assignment of jobs is the maximum load of a machine; this is the quantity that we want to minimize. For example, suppose there are two machines and jobs with sizes . Assigning the first two jobs to the first machine and the last two jobs to the second machine yields machine loads and , for a makespan of . A better assignment puts the first and last jobs on the first machine and the second and third jobs on the second machine, for a makespan of . Consider the following greedy algorithm. Iterate through the jobs
one-by-one. When considering job , assign it to the machine that currently has the smallest load (breaking ties arbitrarily). For example, in the four-job instance above, this algorithm would assign the first job to the first machine, the second job to the second machine, the third job to the first machine, and the fourth job to the second machine (for a suboptimal makespan of ). Consider the following statement: for every such job scheduling instance, this greedy algorithm computes a job assignment with makespan at most
times that of an optimal (minimum-makespan) job assignment. Which of the following is the smallest choice of the constant that makes this statement true? [Hint: let
and denote the average and maximum job sizes and ). Try to relate both the optimal solution and the output of the greedy algorithm to .]
ANSWER: Let
Lemma 1:
However, there is one case where this lower bound is too weak to be useful; when the size of one job dominates all the other. Thus, an additional lower bound is given by Lemma 2.
Lemma 2:
Consider machine
Consider the same makespan-minimization job scheduling problem studied in the previous problem. Now suppose that, prior to running the greedy algorithm in the previous problem, we first sort the jobs from biggest to smallest. For example, in the four-job instance discussed in the previous problem, the jobs would be considered in the order
, and the greedy algorithm would then produce an optimal schedule, with makespan . Consider the following statement: for every such job scheduling instance, the greedy algorithm (with this sorting preprocessing step) computes a job assignment with makespan at most
times that of an optimal (minimum-makespan) job assignment. Which of the following is the smallest choice of the constant for which this statement is true?
ANSWER:
Lemma: If there are more than
Proof: If there are as many jobs as there are machines, then the optimal assignment is trivial; every machine gets a job. The optimal makespan is
As before, we consider a machine
The rest of the proof is very similar to that of the previous one.
Which of the following statements is NOT true about the generic local search algorithm?
- The generic local search algorithm is guaranteed to eventually converge to an optimal solution.
- The output of the generic local search algorithm generally depends on the choice of the starting point.
- The output of the generic local search algorithm generally depends on the choice of the superior neighboring solution to move to next (in an iteration where there are multiple such solutions).
- In some cases, the generic local search algorithm can require an exponential number of iterations before terminating.
ANSWER: Option 1 is false. Local search is guaranteed to eventually converge to an locally optimal solution, which at times, can be quite far from a globally optimal solution.
Options 2, 3 and 4 is correct; see lecture video 18.4.
Suppose we apply local search to the minimum cut problem. Given an undirected graph, we begin with an arbitrary cut (A,B). We check if there is a vertex v such that switching v from one group to the other would strictly decrease the number of edges crossing the cut. (Also, we disallow vertex switches that would cause A or B to become empty.) If there is such a vertex, we switch it from one group to the other; if there are many such vertices, we pick one arbitrarily to switch. If there are no such vertices, then we return the current locally optimal cut (A,B). Which of the following statements is true about this local search algorithm?
- This local search algorithm is guaranteed to terminate in a polynomial number of iterations.
- This local search algorithm is guaranteed to compute a minimum cut.
- This local search algorithm is guaranteed to compute a cut for which the number of crossing edges is at most twice the minimum possible.
- If this local search algorithm is guaranteed to terminate in a polynomial number of iterations, it would immediately imply P = NP.
ANSWER: Option 1 is true. Every iteration strictly decreases the number of crossing edges, so there can be only
Options 2 and 3 are false.
Option 4 is false. local search is guaranteed to eventually converge to an locally optimal solution, whereas P = NP implies we can find a globally optimal solution in polynomial time.
In the maximum
-cut problem, the input is an undirected graph , and each edge has a nonnegative weight . The goal is to partition into non-empty pieces to maximize the total weight of the cut edges (i.e., edges with endpoints in different ’s). The maximum cut problem (as studied in lecture) corresponds to the special case where . Consider applying local search to the maximum
-cut problem: (i) start with an arbitrary -cut; (ii) repeat: if possible, move a vertex from one set to another set so as to strictly increase the total weight of the cut edges; (iii) once no such move is possible, halt. Consider the following statement: for every instance of the maximum
-cut problem, every local maximum has objective function value at least times that of the maximum possible. Which of the following is the biggest choice of the function for which this statement is true?
ANSWER: This paper presents a local search solution that achieves a lower bound no smaller than
Suppose
is a random variable that has expected value . What is the probability that is or larger? (Choose the strongest statement that is guaranteed to be true.)
- At most
- At most
- At most
ANSWER: Consider the following probability distribution for
Then
Suppose
is a random variable that is always nonnegative and has expected value . What is the probability that is or larger? (Choose the strongest statement that is guaranteed to be true.)
- At most
- At most
- At most
ANSWER: Using Markov’s Inequality, we have:
Therefore, option 4 is correct.
Leave a comment