# Problem Set 1

We are given as input a set of $n$ requests (e.g., for the use of an auditorium), with a known start time $s_i$ and finish time $t_i$ for each request $i$. Assume that all start and finish times are distinct. Two requests conflict if they overlap in time — if one of them starts between the start and finish times of the other. Our goal is to select a maximum-cardinality subset of the given requests that contains no conflicts. (For example, given three requests consuming the intervals [0,3], [2,5], and [4,7], we want to return the first and third requests.) We aim to design a greedy algorithm for this problem with the following form: At each iteration we select a new request $i$, including it in the solution-so-far and deleting from future consideration all requests that conflict with $i$.

Which of the following greedy rules is guaranteed to always compute an optimal solution?

• At each iteration, pick the remaining request with the earliest finish time.
• At each iteration, pick the remaining request which requires the least time (i.e., has the smallest value of $t_i - s_i$) (breaking ties arbitrarily).
• At each iteration, pick the remaining request with the earliest start time.
• At each iteration, pick the remaining request with the fewest number of conflicts with other remaining requests (breaking ties arbitrarily).

ANSWER: Consider the following requests:

i | start | finish | duration
-----------------------------
1 | 0     | 3      | 3
2 | 2     | 4      | 2
3 | 3     | 6      | 3


If we pick the request with the smallest $t_i - s_i$, we can only pick $r_1$, whereas just eyeballing shows that the maximum subset should contain $r_1$ and $r_3$. Therefore, option 2 is incorrect.

Consider the following requests:

i | start | finish
------------------
1 | 0     | 6
2 | 2     | 4
3 | 3     | 6


If we pick the request with the earliest time ($r_1$), we cannot pick any from the table above, but if we somehow don’t pick $r_1$, we can pick either one of $r_2$ or $r_3$. Therefore, option 3 is incorrect.

It is shown here that option 4 is incorrect as well, but I honestly don’t understand how they came up with such an example, so I’m going to refrain from copy-pasting it here.

Option 1 is correct. To prove it, let $S$ be a set of requests and $r_m \in S$ a request with the earliest finish time. Let $R \subseteq S$ be the maximum subset of mutually compatible requests, and $r_k \in R$ a request with the earliest finish time. Clearly, $finish(r_m) \leq finish(r_k)$. If they are equal, we are done with the proof. Otherwise, let $R' = R - \{r_k\} \cup \{r_m\}$. Since ${r_k}$ is the request with the earliest finish time in $R$, and $finish(r_m) \lt finish(r_k)$, therefore, none of the other requests in $R$ conflict with $r_m$, and hence $\lvert R' \rvert = \lvert R \rvert$. Therefore, the maximum subset $R$ contains $r_m$, the request with the earliest finish time.

We are given as input a set of $n$ jobs, where job $j$ has a processing time $p_j$ and a deadline $d_j$. Recall the definition of completion times $C_j$ from the video lectures. Given a schedule (i.e., an ordering of the jobs), we define the lateness $l_j$ of job $j$ as the amount of time $C_j - d_j$ after its deadline that the job completes, or as $0$ if $C_j \le d_j$. Our goal is to minimize the maximum lateness, $\max_{j} l_j$.

Which of the following greedy rules produces an ordering that minimizes the maximum lateness? You can assume that all processing times and deadlines are distinct.

• None of the other answers are correct.
• Schedule the requests in increasing order of deadline $d_j$
• Schedule the requests in increasing order of processing time $p_j$
• Schedule the requests in increasing order of the product $d_j \times p_j$

ANSWER: Let $S$ be a sequence in increasing order of processing time $p_j$. Let the position of the job with the maximum lateness be $j$. If we swap job at $j$ with the one at $j - 1$, since $C_{j-1} < C_j$, we get an even lower value for lateness. Thus, option 3 is incorrect.

By similar argument, option 4 is incorrect.

Let $S$ be a sequence in increasing order of deadline $d_j$. If we swap job at $j$ with the one at $j - 1$, the lateness for the new job at position $j$ increases. Thus, the swap makes the sequence worse, and hence, the sequence is optimal to begin with.

Consider an undirected graph $G = (V,E)$ where every edge $e \in E$ has a given cost $c_e$. Assume that all edge costs are positive and distinct. Let $T$ be a minimum spanning tree of $G$ and $P$ a shortest path from the vertex $s$ to the vertex $t$. Now suppose that the cost of every edge $e$ of $G$ is increased by $1$ and becomes $c_e + 1$. Call this new graph $G'$. Which of the following is true about $G'$?

• $T$ may not be a minimum spanning tree and $P$ may not be a shortest $s-t$ path.
• $T$ may not be a minimum spanning tree but $P$ is always a shortest $s-t$ path.
• $T$ is always a minimum spanning tree and $P$ is always a shortest $s-t$ path.
• $T$ must be a minimum spanning tree but $P$ may not be a shortest $s-t$ path.

ANSWER: Since Prim’s MST algorithm picks the edge with the lowest cost at each iteration, increasing the cost of every edge by $1$ wouldn’t change anything. However, the shortest path may change. Consider the graph $D \xleftarrow{7} A \xrightarrow{2} B \xrightarrow{2} C \xrightarrow{2} D$. The shortest path from $A$ to $D$ is $A \rightarrow B \rightarrow C \rightarrow D$. Adding $1$ to every edge gives us the new graph $D \xleftarrow{8} A \xrightarrow{3} B \xrightarrow{3} C \xrightarrow{3} D$. Now $A \rightarrow B \rightarrow C \rightarrow D$ is no longer the shortest path between $A$ and $D$. Thus, option 4 is correct, and the other ones are not.

Suppose $T$ is a minimum spanning tree of the connected graph $G$. Let $H$ be a connected induced subgraph of $G$. (I.e., $H$ is obtained from $G$ by taking some subset $S \subseteq V$ of vertices, and taking all edges of $E$ that have both endpoints in $S$. Also, assume $H$ is connected.) Which of the following is true about the edges of $T$ that lie in $H$? You can assume that edge costs are distinct, if you wish. [Choose the strongest true statement.]

• For every $G$ and $H$, these edges form a minimum spanning tree of $H$
• For every $G$ and $H$, these edges are contained in some minimum spanning tree of $H$
• For every $G$ and $H$ and spanning tree $T_H$ of $H$, at least one of these edges is missing from $T_H$
• For every $G$ and $H$, these edges form a spanning tree (but not necessary minimum-cost) of $H$

ANSWER: Consider a triangle graph $G = C \xleftarrow{3} A \xrightarrow{1} B \xrightarrow{1} C; \therefore T = {(A, B), (B, C)}$. If $H = {(A, C)}, T \cap H = \emptyset$, which rules out options 1 and 4.

Option 2 is correct. To prove it, let’s establish the Light-Edge Property of a MST.

Light-Edge Property: Let G = (V, E) be a connected undirected weighted graph with distinct edge weights. For any cut of G, the minimum weight edge that crosses the cut is in the minimum spanning tree T of G.

Proof: Suppose $e(v, w) \in E, e \notin T$ be the minimum weight edge. If we take a cut $(A, B) \text{ s.t. } v \in A, w \in B$, then there must be another edge $e' \in T, e' \neq e$ that connects $v$ and $w$ (since by definition $T$ is a connected subgraph). Let $T' = T - \{e'\} \cup \{e\}$; since, $weight(e') > weight(e), weight(T') < weight(T)$. Since $T$ is a MST, this brings us to a contradiction, and hence $e \notin T$ cannot be true.

Back to the original question, suppose $T'$ be a MST of $H$ and an edge $e \in T \cap H, e \notin T'$. Arguing on the same lines as the Light-Edge Property, we can show that this contradicts the assumption $T$ is a MST, and $e$ must be in $T'$. Since $e$ is an arbitrary edge, all edges in $T \cap H$ must be included in $T'$.

Since option 2 is correct, option 3 is not.

Consider an undirected graph $G = (V,E)$ where edge $e \in E$ has cost $c_e$. A minimum bottleneck spanning tree $T$ is a spanning tree that minimizes the maximum edge cost $\max_{e \in T} c_e$. Which of the following statements is true? Assume that the edge costs are distinct.

• A minimum bottleneck spanning tree is not always a minimum spanning tree and a minimum spanning tree is not always a minimum bottleneck spanning tree.
• A minimum bottleneck spanning tree is always a minimum spanning tree and a minimum spanning tree is always a minimum bottleneck spanning tree.
• A minimum bottleneck spanning tree is always a minimum spanning tree but a minimum spanning tree is not always a minimum bottleneck spanning tree.
• A minimum bottleneck spanning tree is not always a minimum spanning tree, but a minimum spanning tree is always a minimum bottleneck spanning tree.

ANSWER: Recall the Light-Edge Property; it implies that every other spanning tree has maximum edge cost at least as large. Therefore, a MST is also a minimum bottleneck spanning tree. But the reverse is not true; to see that, consider the graph $G = C \xleftarrow{4} A \xrightarrow{4} B \xrightarrow{-1} C; MST(G) = \{(A, B), (B, C)\}$. Consider the spanning tree $\{(A, B), (A, C)\}$; it doesn’t increase the bottleneck ($4$), and hence is a minimum bottleneck spanning tree, but isn’t a MST.

In this problem you are given as input a graph $T = (V,E)$ that is a tree (that is, $T$ is undirected, connected, and acyclic). A perfect matching of $T$ is a subset $F \subset E$ of edges such that every vertex $v \in V$ is the endpoint of exactly one edge of $F$. Equivalently, $F$ matches each vertex of $T$ with exactly one other vertex of $T$. For example, a path graph has a perfect matching if and only if it has an even number of vertices.

Consider the following two algorithms that attempt to decide whether or not a given tree has a perfect matching. The degree of a vertex in a graph is the number of edges incident to it. (The two algorithms differ only in the choice of $v$ in line 5.)

Algorithm A:

1 While T has at least one vertex:
2   If T has no edges:
3     halt and output "T has no perfect matching."
4   Else:
5     Let v be a vertex of T with maximum degree.
6     Choose an arbitrary edge e incident to v.
7     Delete e and its two endpoints from T.
8 [end of while loop]
9 Halt and output "T has a perfect matching."


Algorithm B:

1 While T has at least one vertex:
2   If T has no edges:
3     halt and output "T has no perfect matching."
4   Else:
5     Let v be a vertex of T with minimum non-zero degree.
6     Choose an arbitrary edge e incident to v.
7     Delete e and its two endpoints from T.
8 [end of while loop]
9 Halt and output "T has a perfect matching."


Is either algorithm correct?

• Algorithm A always correctly determines whether or not a given tree graph has a perfect matching; algorithm B does not.
• Neither algorithm always correctly determines whether or not a given tree graph has a perfect matching.
• Both algorithms always correctly determine whether or not a given tree graph has a perfect matching.
• Algorithm B always correctly determines whether or not a given tree graph has a perfect matching; algorithm A does not.

ANSWER: The algorithms are understated: When a vertex is deleted, all edges incident to that vertex are also deleted, otherwise we’re left with dangling edges.

Consider the path tree $A - B - C - D$. Algo A chooses vertex $B$, and say, edge $(B, C)$. After deleting vertices $B$ and $C$, and all the edges, we’re left with vertices $A$ and $D$ with no edges. Algo A halts, and incorrectly declares that $T$ doesn’t have perfect matching (it does, $\{ (A, B ), (C, D) \}$).

Lets prove the correctness of algo B by induction. For $\lvert V \rvert = 1$, it correctly identifies that $T$ doesn’t have perfect matching. For $\lvert V \rvert = 2$, it correctly identifies that $T$ has perfect matching. Clearly, for $T$ to have perfect matching, $\lvert V \rvert$ must be even. Assume algo B works for $\lvert V \rvert = k$. Let us add a new vertex $v$ and a new edge $(v, w)$ to $T$, where $w$ is some existing vertex in $T$. Clearly, for $T$ to have perfect matching, it must include edge $(v, w)$ (since that’s the only way to match vertex $v$). By construction, algo B picks $v$ and removes edge $(v, w)$ in some iteration, leaving $\lvert V \rvert = k - 1$, which by inductive hypothesis, is correctly solved by algo B.

Therefore, algo B is correct.