upgrade structures and migrate to nextra v4
353
content/CSE347/CSE347_L8.md
Normal file
@@ -0,0 +1,353 @@
# Lecture 8

## NP-optimization problem

The problems considered here are NP-hard: they cannot be solved exactly in polynomial time unless P = NP.

Example:

- Maximum independent set
- Minimum vertex cover

What can we do?

- Solve small instances
- Hard instances are rare: average-case analysis
- Solve special cases
- Find an approximate solution

## Approximation algorithms

We find a "good" solution in polynomial time, but it may not be optimal.

Example:

- Minimum vertex cover: we will find a small vertex cover, but not necessarily the smallest one.
- Maximum independent set: we will find a large independent set, but not necessarily the largest one.

Question: How do we quantify the quality of the solution?

### Approximation ratio

Intuition:

How good is an algorithm $A$ compared to an optimal solution in the worst case?

Definition:

Consider an algorithm $A$ for an NP-optimization problem $L$. For **any** instance $l$, let $c_A(l)$ be the value of the solution that $A$ outputs, and let $c^*(l)$ be the value of an optimal solution.

The approximation ratio $\alpha$ is the worst-case ratio over all instances:

$$
\min_{l \in L} \frac{c_A(l)}{c^*(l)}=\alpha
$$

for maximization problems (so $\alpha\leq 1$), or

$$
\max_{l \in L} \frac{c_A(l)}{c^*(l)}=\alpha
$$

for minimization problems (so $\alpha\geq 1$).

Example:

Alice's algorithm, $A$, finds a vertex cover of size $c_A(l)$ for instance $l(G)$. The optimal vertex cover has size $c^*(l)$.

We want the approximation ratio to be as close to 1 as possible.

> Vertex cover:
>
> A vertex cover is a set of vertices that touches all edges.

Let's try an approximation algorithm for the vertex cover problem, called Greedy cover.

#### Greedy cover

Repeatedly pick any uncovered edge and add both of its endpoints to the cover $C$, until all edges are covered.

Runtime: $O(m)$, where $m$ is the number of edges.
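
The Greedy cover procedure can be sketched in Python as follows. This is a minimal illustration, assuming the graph is given as a list of edge pairs; the function name `greedy_cover` and the example path graph are hypothetical and not part of the lecture.

```python
def greedy_cover(edges):
    """Return a vertex cover built by taking both endpoints of uncovered edges."""
    cover = set()
    for u, v in edges:
        # An edge is uncovered when neither endpoint is in the cover yet.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover


# Hypothetical example: the path 1-2-3-4.
path_edges = [(1, 2), (2, 3), (3, 4)]
path_cover = greedy_cover(path_edges)
```

On this path, scanning the edges in the listed order picks $(1,2)$ and $(3,4)$, so the cover has four vertices, while $\{2,3\}$ is a cover of size two.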

Claim: Greedy cover is correct, that is, it finds a vertex cover.

Proof:

The algorithm only terminates when all edges are covered.

Claim: Greedy cover is a 2-approximation algorithm.

Proof:

Look at the set $M$ of edges we picked.

No two edges in $M$ share an endpoint: once an edge is picked, both of its endpoints are in the cover, so every other edge touching them is already covered and is never picked.

Any vertex cover, including the optimal one, must contain at least one endpoint of each edge in $M$, and these endpoints are all distinct. So $c^*(l)\geq |M|$.

Greedy cover adds both endpoints of each picked edge, so $c_A(l)=2|M|\leq 2c^*(l)$.

The bound is tight in the worst case: on a graph of disjoint edges, Greedy cover adds both endpoints of every edge, while the optimal cover takes one endpoint per edge.

Thus, the size of the vertex cover found by Greedy cover is at most twice the size of the optimal vertex cover, and Greedy cover is a 2-approximation algorithm.

> Min-cut:
>
> Given a graph $G$ and two vertices $s$ and $t$, find the minimum cut between $s$ and $t$.
>
> Max-cut:
>
> Given a graph $G$, find the maximum cut, that is, a partition of the vertices into two sides that maximizes the number of edges crossing between them.

#### Local cut

Algorithm:

- Start with an arbitrary cut of $G$.
- While moving a vertex from one side to the other increases the size of the cut, do so.
- Return the cut found.
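
The three steps above can be sketched as follows. This is an illustrative sketch, assuming a simple undirected graph without self-loops given as a vertex list and an edge list; the function name `local_cut` and the choice of starting cut (all vertices on side 0) are assumptions.

```python
def local_cut(vertices, edges):
    """Local search for max-cut; returns (side, cut_size)."""
    # Arbitrary starting cut: every vertex on side 0.
    side = {v: 0 for v in vertices}
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    improved = True
    while improved:
        improved = False
        for v in vertices:
            same = sum(1 for w in neighbors[v] if side[w] == side[v])
            crossing = len(neighbors[v]) - same
            # Moving v swaps its crossing and non-crossing edges,
            # so the cut grows exactly when same > crossing.
            if same > crossing:
                side[v] = 1 - side[v]
                improved = True

    cut_size = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut_size


# Hypothetical example: a triangle on vertices 0, 1, 2.
triangle = [(0, 1), (1, 2), (0, 2)]
triangle_side, triangle_cut = local_cut([0, 1, 2], triangle)
```

Each move strictly increases the cut, so the loop stops after at most $|E|$ moves.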

We will prove its:

- Runtime
- Feasibility
- Approximation ratio

##### Runtime for local cut

When we move a vertex from one side to the other side, the size of the cut increases by at least 1.

Since the size of a cut is at most $|E|$, the algorithm terminates after at most $|E|$ moves.

Finding a vertex to move takes $O(|V|^2)$ time per step (check each vertex against its at most $|V|-1$ neighbors), so the runtime is $O(|E||V|^2)$.
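
##### Feasibility for local cut

Every partition of the vertices into two sides is a cut, so the algorithm holds a feasible solution at every step.

The algorithm only terminates when no vertex can be moved to increase the cut, and it returns the cut it holds at that point.

Thus, the cut found is a feasible solution.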

##### Approximation ratio for local cut

This is a half-approximation algorithm.

We need to show that the size of the cut found is at least half of the size of the optimal cut.

First, we upper bound the optimal cut: its size is at most $|E|$.

We will then prove that the solution we found has size at least $\frac{|E|}{2}$ for any graph $G$, which is at least half of the optimal cut.

Proof:

When we terminate, no vertex can be moved to increase the cut.

Therefore, for every vertex $u$, **the number of crossing edges at $u$ is at least the number of non-crossing edges at $u$**. Otherwise, moving $u$ would increase the cut.

Let $d(u)$ be the degree of vertex $u\in V$.

The number of crossing edges incident to $u$ is at least $\frac{1}{2}d(u)$.

Summing over all vertices counts every crossing edge twice (once per endpoint), so twice the number of crossing edges is at least $\frac{1}{2}\sum_{u\in V}d(u)=\frac{1}{2}\cdot 2|E|=|E|$.

So the total number of crossing edges is at least $\frac{|E|}{2}$, and the total number of non-crossing edges is at most $\frac{|E|}{2}$.

QED

#### Set cover

Problem:

You are collecting a set of magic cards.

$X$ is the set of all possible cards. You want at least one of each card.

Each dealer $j$ has a pack $S_j\subseteq X$ of cards. You have to buy the entire pack or nothing from dealer $j$.

Goal: What is the least number of packs you need to buy to get all cards?

Formally:

Input: a universe $X$ of $n$ elements, and a collection $Y=\{S_1, S_2, \ldots, S_m\}$ of subsets of $X$, that is, $S_i\subseteq X$ for each $i$.

Goal: Pick $C\subseteq Y$ such that $\bigcup_{S_i\in C}S_i=X$, and $|C|$ is minimized.

Set cover is an NP-optimization problem. It is a generalization of the vertex cover problem.
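
To see why set cover generalizes vertex cover, take the edges as the universe and give each vertex the set of edges it touches. The sketch below builds that instance; the function name `vertex_cover_as_set_cover` and the dictionary representation of $Y$ are assumptions made for illustration.

```python
def vertex_cover_as_set_cover(vertices, edges):
    """Build a set cover instance (X, Y) from a graph."""
    # Universe: the edges, normalized so (u, v) and (v, u) are one element.
    X = {frozenset(e) for e in edges}
    # One set per vertex: the edges incident to that vertex.
    Y = {v: {e for e in X if v in e} for v in vertices}
    return X, Y


# Hypothetical example: a triangle on vertices 0, 1, 2.
X_tri, Y_tri = vertex_cover_as_set_cover([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
```

Choosing the sets for vertices 0 and 1 covers all three edges, which matches the vertex cover $\{0,1\}$ of the triangle.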

#### Greedy set cover

Algorithm:

- Start with an empty set $C$.
- While there is an element of $X$ that is not covered, pick the set $S_i$ that covers the largest number of uncovered elements.
- Add $S_i$ to $C$.
- Return $C$.

```python
def greedy_set_cover(X, Y):
    # X is the set of elements
    # Y is the collection of sets (hash sets by default)
    C = []
    non_covered = set(X)
    # At most |X| iterations: every loop reduces the size of non_covered by at least 1
    while non_covered:
        max_cover, max_set = 0, None
        # O(|Y|) candidate sets per iteration
        for S in Y:
            # Intersection of two sets is O(min(|non_covered|, |S|))
            cur_cover = len(non_covered & set(S))
            if cur_cover > max_cover:
                max_cover, max_set = cur_cover, S
        if max_set is None:
            # No set covers any remaining element, so no set cover exists
            raise ValueError("Y does not cover X")
        C.append(max_set)
        # Remove the newly covered elements
        non_covered -= set(max_set)
    return C
```

It is not optimal.

Need to prove its:

- Correctness: the algorithm keeps picking sets until all elements are covered.
- Runtime: $O(|X||Y|^2)$. There are at most $|Y|$ iterations, and each iteration scans $|Y|$ sets at $O(|X|)$ per set.
- Approximation ratio: analyzed in the next section.

##### Approximation ratio for greedy set cover

> Harmonic number:
>
> $H_n=\sum_{i=1}^n\frac{1}{i}=\frac{1}{1}+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{n}=\Theta(\log n)$

We claim that the size of the set cover found is at most $O(\log n)$ times the size of the optimal set cover. We prove two bounds of this form.
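
To get a feel for how slowly the harmonic numbers grow, the short sketch below compares $H_n$ with $1+\ln n$ for a few values of $n$. The helper name `harmonic` is an assumption; the sandwich $\ln(n+1)\leq H_n\leq 1+\ln n$ is a standard fact.

```python
import math


def harmonic(n):
    """Return H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))


for n in (10, 100, 1000):
    # ln(n + 1) <= H_n <= 1 + ln(n) holds for every n >= 1.
    print(n, round(harmonic(n), 4), round(1 + math.log(n), 4))
```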

###### First bound:

Claim: Let $n=|X|$. If the optimal solution picks $k$ sets, then greedy set cover picks at most $(1+\ln n)k$ sets.

Proof:

Let $U_i$ be the set of elements that are still not covered after round $i$, and let $x_i$ be the number of new elements covered in round $i$.

Before the first round, the number of elements that are not covered is $n$.

$$
|U_0|=n
$$

After the first round, the number of elements that are not covered is $|U_0|-x_1$, where $x_1=|S_1|$ is the number of elements in the set picked in the first round.

$$
|U_1|=|U_0|-|S_1|
$$

...

In general, $|U_i|=|U_{i-1}|-x_i$.

We claim that $x_i\geq \frac{|U_{i-1}|}{k}$.

We proceed by contradiction.

Suppose every set in the optimal solution covers fewer than $\frac{|U_{i-1}|}{k}$ elements of $U_{i-1}$. Then the $k$ sets of the optimal solution together cover fewer than $|U_{i-1}|$ elements of $U_{i-1}$, so they do not cover all of $X$. This contradicts the fact that the optimal solution is a set cover. (For $i=1$: if all sets in the optimal solution have size $< \frac{|U_0|}{k}$, then the sum of their sizes is $< |U_0|=n$.)

So some set covers at least $\frac{|U_{i-1}|}{k}$ of the uncovered elements, and the greedy choice covers at least as many.
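
Since each round covers at least a $\frac{1}{k}$ fraction of the uncovered elements, $|U_i|\leq |U_{i-1}|(1-\frac{1}{k})\leq n(1-\frac{1}{k})^i$.

> Some math magic:
>
> $$(1-\frac{1}{k})^k\leq \frac{1}{e}$$

Before the last round at least one element is still uncovered, so $1\leq |U_{|C|-1}|\leq n(1-\frac{1}{k})^{|C|-1}\leq ne^{-\frac{|C|-1}{k}}$, which gives $|C|\leq 1+k\ln n$.

Since $k\geq 1$, the size of the set cover found is at most $(1+\ln n)k$.

QED

So the greedy set cover is not too bad...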

###### Second bound:

Greedy set cover is an $H_d$-approximation algorithm for set cover, where $d$ is the largest cardinality of any set in the optimal solution.

Proof:

Assign a cost to the elements of $X$ according to the decisions of the greedy set cover.

Let $S^i$ be the set picked at step $i$, and let $\delta(S^i)$ be the number of new elements covered by $S^i$.

$$
\delta(S^i)=|S^i\cap U_{i-1}|
$$

If the element $x$ is covered at step $i$, when set $S^i$ is picked, then the cost of $x$ is

$$
c(x)=\frac{1}{\delta(S^i)}=\frac{1}{x_i}
$$

Example:

$$
\begin{aligned}
X&=\{A,B,C,D,E,F,G\}\\
S_1&=\{A,C,E\}\\
S_2&=\{B,C,F,G\}\\
S_3&=\{B,D,F,G\}\\
S_4&=\{D,G\}
\end{aligned}
$$

First we select $S_2$, so $cost(B)=cost(C)=cost(F)=cost(G)=\frac{1}{4}$.

Then we select $S_1$, which newly covers $A$ and $E$, so $cost(A)=cost(E)=\frac{1}{2}$.

Then we select $S_3$, which newly covers only $D$, so $cost(D)=1$.

In general, each element $x$ covered at step $i$ by the addition of set $S^i$ has cost $c(x)=\frac{1}{\delta(S^i)}$, so each step contributes a total cost of exactly 1:

$$
\textup{Total cost of GSC}=\sum_{x\in X}c(x)=\sum_{i=1}^{|C|}\sum_{x\textup{ covered at iteration }i}c(x)=\sum_{i=1}^{|C|}\delta(S^i)\frac{1}{\delta(S^i)}=|C|
$$
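
The cost accounting can be checked on the example instance with a short sketch. The function name `greedy_costs`, the dictionary of named sets, and the tie-breaking rule (first set in dictionary order) are assumptions made for illustration; exact fractions are used so the total can be compared with $|C|$ without rounding.

```python
from fractions import Fraction


def greedy_costs(X, sets):
    """Run greedy set cover; return (picked set names, cost per element)."""
    uncovered = set(X)
    picked, cost = [], {}
    while uncovered:
        # Pick the set covering the most uncovered elements
        # (ties go to the first such set in dictionary order).
        name = max(sets, key=lambda s: len(sets[s] & uncovered))
        new = sets[name] & uncovered
        if not new:
            raise ValueError("sets do not cover X")
        # Every newly covered element pays 1 / (number of new elements).
        for x in new:
            cost[x] = Fraction(1, len(new))
        picked.append(name)
        uncovered -= new
    return picked, cost


X = set("ABCDEFG")
sets = {
    "S1": set("ACE"),
    "S2": set("BCFG"),
    "S3": set("BDFG"),
    "S4": set("DG"),
}
picked, cost = greedy_costs(X, sets)
```

With this tie-breaking the sets are picked in the order $S_2$, $S_1$, $S_3$, and the costs sum to 3, the number of sets picked.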

Claim: Consider any set $S\in Y$. The total cost that the greedy set cover assigns to the elements of $S$ is at most $H_{|S|}$.

Suppose that the greedy set cover covers the elements of $S$ in the order $x_1,x_2,\ldots,x_{|S|}$ (ties broken arbitrarily), where $\{x_1,x_2,\ldots,x_{|S|}\}=S$.

Just before GSC covers $x_j$, the elements $\{x_j,x_{j+1},\ldots,x_{|S|}\}$ are still not covered.

At this point, the GSC has the option of picking $S$.

This implies that $\delta(S)$ is at least $|S|-j+1$.

Suppose GSC picks the set $\hat{S}$ at this step, the set for which $\delta(\hat{S})$ is maximized ($\hat{S}$ may be $S$ or another set that covers $x_j$).

So, $\delta(\hat{S})\geq \delta(S)\geq |S|-j+1$.

So the cost of $x_j$ is $\frac{1}{\delta(\hat{S})}\leq \frac{1}{\delta(S)}\leq \frac{1}{|S|-j+1}$.

Summing over all $j$, the cost of $S$ is at most $\sum_{j=1}^{|S|}\frac{1}{|S|-j+1}=H_{|S|}$.

Back to the proof of approximation ratio:

Let $C^*$ be an optimal set cover.

$$
|C|=\sum_{x\in X}c(x)\leq \sum_{S_j\in C^*}\sum_{x\in S_j}c(x)
$$

This inequality holds because $C^*$ covers every element of $X$ at least once, and an element that is covered by more than one set in $C^*$ is counted more than once on the right.

By our claim, $\sum_{x\in S_j}c(x)\leq H_{|S_j|}$.

Let $d$ be the largest cardinality of any set in $C^*$.

$$
|C|\leq \sum_{S_j\in C^*}H_{|S_j|}\leq \sum_{S_j\in C^*}H_d=H_d|C^*|
$$

So the approximation ratio for greedy set cover is $H_d$.

QED