upgrade structures and migrate to nextra v4

2025-07-06 12:40:25 -05:00
parent 76e50de44d
commit 717520624d
317 changed files with 18143 additions and 22777 deletions
--- a/content/CSE347/CSE347_L8.md
+++ b/content/CSE347/CSE347_L8.md
@@ -0,0 +1,353 @@
+# Lecture 8
+
+## NP-optimization problem
+
+These problems cannot be solved exactly in polynomial time unless $P=NP$.
+
+Example:
+
+- Maximum independent set
+- Minimum vertex cover
+
+What can we do?
+
+- solve small instances
+- hard instances are rare - average case analysis
+- solve special cases
+- find an approximate solution
+
+## Approximation algorithms
+
+We find a "good" solution in polynomial time, though it may not be optimal.
+
+Example:
+
+- Minimum vertex cover: we will find a small vertex cover, but not necessarily the smallest one.
+- Maximum independent set: we will find a large independent set, but not necessarily the largest one.
+
+Question: How do we quantify the quality of the solution?
+
+### Approximation ratio
+
+Intuition:
+
+How good is an algorithm $A$ compared to an optimal solution in the worst case?
+
+Definition:
+
+Consider an algorithm $A$ for an NP-optimization problem $L$. For **any** instance $l$, let $c_A(l)$ be the value of the solution that $A$ outputs and let $c^*(l)$ be the value of an optimal solution.
+
+The approximation ratio $\alpha$ is the worst case of $\frac{c_A(l)}{c^*(l)}$ over all instances:
+
+$$
+\max_{l \in L} \frac{c_A(l)}{c^*(l)}=\alpha
+$$
+
+for minimization problems (so $\alpha\geq 1$), or
+
+$$
+\min_{l \in L} \frac{c_A(l)}{c^*(l)}=\alpha
+$$
+
+for maximization problems (so $\alpha\leq 1$).
+
+Example:
+
+Alice's Algorithm, $A$, finds a vertex cover of size $c_A(l)$ for instance $l(G)$. The optimal vertex cover has size $c^*(l)$.
+
+We want the approximation ratio to be as close to 1 as possible.
+
+> Vertex cover:
+> 
+> A vertex cover is a set of vertices that touches all edges.
+
+Let's try an approximation algorithm for the vertex cover problem, called Greedy cover.
+
+#### Greedy cover
+
+Repeatedly pick any uncovered edge and add both of its endpoints to the cover $C$, until all edges are covered.
+
+Runtime: $O(m)$, where $m=|E|$ is the number of edges.
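
Greedy cover can be sketched in Python as follows. This is a minimal sketch, assuming the graph is given as a list of edges `(u, v)`; the function name `greedy_vertex_cover` is ours and does not come from the lecture. A single scan over the edges is enough, because an edge that is covered when we reach it stays covered.

```python
def greedy_vertex_cover(edges):
    """Scan the edges once; whenever an edge is still uncovered,
    add both of its endpoints to the cover."""
    cover = set()
    for u, v in edges:
        # the edge is uncovered if neither endpoint is in the cover yet
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover


# the path a-b-c-d, with edges listed from left to right
path = [("a", "b"), ("b", "c"), ("c", "d")]
cover = greedy_vertex_cover(path)
# every edge has at least one endpoint in the cover
assert all(u in cover or v in cover for u, v in path)
```

On this path the scan picks the edges `(a, b)` and `(c, d)`, so the cover has four vertices, while `{b, c}` is a cover of size two.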
+
+Claim: Greedy cover is correct, that is, it outputs a vertex cover.
+
+Proof:
+
+The algorithm terminates only when all edges are covered.
+
+Claim: Greedy cover is a 2-approximation algorithm.
+
+Proof:
+
+Let $M$ be the set of edges picked by Greedy cover.
+
+When an edge is picked it is uncovered, so neither of its endpoints is in $C$ yet. Both endpoints are then added to $C$, so no edge picked later can share a vertex with it. Hence the edges in $M$ are pairwise disjoint and $|C|=2|M|$.
+
+Any vertex cover, including the optimal one, must contain at least one endpoint of every edge in $M$. Since these edges are disjoint, the optimal vertex cover has size at least $|M|$.
+
+Thus, the size of the vertex cover found by Greedy cover is at most twice the size of the optimal vertex cover. (The bound is tight: consider a graph of disjoint edges.)
+
+Thus, Greedy cover is a 2-approximation algorithm.
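
The factor of 2 can be checked on small graphs by comparing Greedy cover against a brute-force optimum. The sketch below is only an illustration: the helper names and the two test graphs are our own choices, and the brute-force search takes exponential time, so it is usable only on tiny instances.

```python
from itertools import combinations


def greedy_vertex_cover(edges):
    """Greedy cover over an edge list, as described in the notes."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


def optimal_vertex_cover_size(edges):
    """Brute force: try vertex subsets by increasing size (exponential time)."""
    vertices = sorted({x for e in edges for x in e})
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return size


# three disjoint edges: greedy takes both endpoints of each edge
disjoint = [(1, 2), (3, 4), (5, 6)]
# a star: greedy takes the center and one leaf, the optimum is the center alone
star = [(0, 1), (0, 2), (0, 3)]

for edges in (disjoint, star):
    greedy = len(greedy_vertex_cover(edges))
    best = optimal_vertex_cover_size(edges)
    assert greedy <= 2 * best
```

Both graphs reach the ratio 2 exactly, which shows that the analysis cannot be improved for this algorithm.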
+
+> Min-cut:
+>
+> Given a graph $G$ and two vertices $s$ and $t$, find the minimum cut between $s$ and $t$.
+>
+> Max-cut:
+>
+> Given a graph $G$, find a partition of its vertices into two sides that maximizes the number of edges crossing between the two sides.
+
+#### Local cut
+
+Algorithm:
+
+- Start with an arbitrary cut of $G$.
+- While you can move a vertex from one side to the other side while increasing the size of the cut, do so.
+- Return the cut found.
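
The three steps above can be sketched in Python. This is a minimal sketch under our own assumptions: the graph is simple (no self-loops), it is given as a vertex list plus an edge list, and the arbitrary starting cut puts every vertex on side 0. The name `local_cut` and the return format are illustrative.

```python
def local_cut(vertices, edges):
    """Local search for max-cut.

    Returns (side, cut_size), where side maps each vertex to 0 or 1.
    Assumes a simple graph without self-loops.
    """
    # arbitrary starting cut: every vertex on side 0
    side = {v: 0 for v in vertices}
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    improved = True
    while improved:
        improved = False
        for v in vertices:
            crossing = sum(1 for w in neighbors[v] if side[w] != side[v])
            non_crossing = len(neighbors[v]) - crossing
            # moving v swaps its crossing and non-crossing edges,
            # so the cut grows exactly when non_crossing > crossing
            if non_crossing > crossing:
                side[v] = 1 - side[v]
                improved = True

    cut_size = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut_size


# a triangle: no cut can contain all three edges
triangle = [(0, 1), (1, 2), (0, 2)]
side, cut_size = local_cut([0, 1, 2], triangle)
assert 2 * cut_size >= len(triangle)
```

Every move strictly increases the size of the cut, so the loop stops once no vertex has more non-crossing than crossing edges.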
+
+We will prove its:
+
+- Runtime
+- Feasibility
+- Approximation ratio
+
+##### Runtime for local cut
+
+The size of a cut is at most $|E|$.
+
+Each time we move a vertex from one side to the other, the size of the cut increases by at least 1.
+
+Thus, the algorithm terminates after at most $|E|$ moves.
+
+Finding a vertex to move takes at most $O(|V|^2)$ time (compare each vertex against every other vertex), so the runtime is $O(|E||V|^2)$.
+
+##### Feasibility for local cut
+
+Any partition of the vertices into two sides is a cut, and the algorithm maintains such a partition at every step.
+
+Thus, the cut found is a feasible solution.
+
+##### Approximation ratio for local cut
+
+This is a half-approximation algorithm.
+
+We need to show that the size of the cut found is at least half of the size of the optimal cut.
+
+First, the size of the optimal cut is at most $|E|$.
+
+We will then prove that the cut we found has size at least $\frac{|E|}{2}$ for any graph $G$, which is at least half of the optimal cut.
+
+Proof:
+
+When we terminate, no vertex can be moved to increase the size of the cut.
+
+Therefore, **for every vertex, the number of crossing edges at that vertex is at least the number of non-crossing edges at that vertex**. (Moving the vertex would turn its crossing edges into non-crossing edges and vice versa.)
+
+Let $d(u)$ be the degree of vertex $u\in V$.
+
+The number of crossing edges at vertex $u$ is at least $\frac{1}{2}d(u)$.
+
+Summing over all vertices counts every crossing edge twice, once from each endpoint. Since $\sum_{u\in V}d(u)=2|E|$, twice the number of crossing edges is at least $\frac{1}{2}\sum_{u\in V}d(u)=|E|$, so the number of crossing edges is at least $\frac{|E|}{2}$.
+
+So the total number of non-crossing edges is at most $\frac{|E|}{2}$.
+
+QED
+
+#### Set cover
+
+Problem:
+
+You are collecting a set of magic cards.
+
+$X$ is the set of all possible cards. You want at least one of each card.
+
+Each dealer $j$ has a pack $S_j\subseteq X$ of cards. You have to buy the entire pack or nothing from dealer $j$.
+
+Goal: What is the least number of packs you need to buy to get all cards?
+
+Formally:
+
+Input: $X$ is a universe of $n$ elements, and $Y=\{S_1, S_2, \ldots, S_m\}$ is a collection of subsets of $X$, so $S_j\subseteq X$ for every $j$.
+
+Goal: Pick $C\subseteq Y$ such that $\bigcup_{S_i\in C}S_i=X$, and $|C|$ is minimized.
+
+Set cover is an NP-optimization problem. It is a generalization of the vertex cover problem.
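
To see why set cover generalizes vertex cover, take the edges as the universe and give every vertex the set of edges it touches. A minimal sketch of that translation follows, assuming the graph is an edge list; the function name `vertex_cover_as_set_cover` and the dictionary representation of $Y$ are our own.

```python
def vertex_cover_as_set_cover(edges):
    """Build a set cover instance (X, Y) from a graph given as an edge list.

    X is the set of edges. For every vertex v, Y[v] is the set of edges
    incident to v, so picking Y[v] corresponds to putting v in the cover.
    """
    X = {frozenset(e) for e in edges}
    Y = {}
    for e in X:
        for v in e:
            Y.setdefault(v, set()).add(e)
    return X, Y


X, Y = vertex_cover_as_set_cover([("a", "b"), ("b", "c"), ("c", "d")])
# the sets of vertices b and c cover every edge,
# just as {b, c} is a vertex cover of the path a-b-c-d
assert Y["b"] | Y["c"] == X
```

A collection of $k$ sets that covers $X$ corresponds to a vertex cover of size $k$, and vice versa.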
+
+#### Greedy set cover
+
+Algorithm:
+
+- Start with an empty collection $C$.
+- While there is an element of $X$ that is not covered, pick the set $S_i\in Y$ that covers the largest number of uncovered elements.
+- Add $S_i$ to $C$.
+- Return $C$.
+
+```python
+def greedy_set_cover(X, Y):
+    # X is the set of elements
+    # Y is the collection of sets, hashset by default
+    C = []
+    # elements of X that are not yet covered by C
+    non_covered = set(X)
+    # at most |X| iterations: every iteration covers at least one new element
+    while non_covered:
+        max_cover, max_set = 0, None
+        # O(|Y|) candidate sets per iteration
+        for S in Y:
+            # intersection of two sets is O(min(|non_covered|, |S|))
+            cur_cover = len(non_covered & set(S))
+            if cur_cover > max_cover:
+                max_cover, max_set = cur_cover, S
+        if max_set is None:
+            # no set covers any remaining element, so X cannot be covered
+            raise ValueError("Y does not cover X")
+        C.append(max_set)
+        non_covered -= set(max_set)
+    return C
+```
+
+It is not optimal.
+
+Need to prove its:
+
+- Correctness:  
+    Keep picking until all elements are covered.
+- Runtime:  
+    $O(|X||Y|^2)$
+- Approximation ratio:  
+
+##### Approximation ratio for greedy set cover
+
+> Harmonic number:
+>
+> $H_n=\sum_{i=1}^n\frac{1}{i}=\frac{1}{1}+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{n}=\Theta(\log n)$
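
The growth rate $H_n=\Theta(\log n)$ can be checked numerically against the standard bounds $\ln(n+1)\leq H_n\leq 1+\ln n$. The helper name `harmonic` below is our own; the sketch uses exact fractions so that no rounding enters the sum itself.

```python
import math
from fractions import Fraction


def harmonic(n):
    """Return H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


for n in (1, 10, 100, 1000):
    h = float(harmonic(n))
    # standard bounds: ln(n + 1) <= H_n <= 1 + ln(n)
    assert math.log(n + 1) <= h <= 1 + math.log(n)
    print(n, round(h, 4), round(math.log(n), 4))
```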
+
+We claim that the size of the set cover found is at most $O(\log n)$ times the size of the optimal set cover. We prove two bounds.
+
+###### First bound:
+
+Claim: If the optimal solution picks $k$ sets, then the greedy set cover picks at most $(1+\ln n)k$ sets.
+
+Proof:
+
+Let $n=|X|$.
+
+Let $U_i$ be the set of elements that are not covered after round $i$, and let $x_i$ be the number of new elements covered in round $i$.
+
+Before the first round, no element is covered:
+
+$$
+|U_0|=n
+$$
+
+After the first round, the number of uncovered elements drops by $x_1=|S_1|$, the size of the set picked in the first round:
+
+$$
+|U_1|=|U_0|-|S_1|
+$$
+
+In general, $|U_i|=|U_{i-1}|-x_i$.
+
+We claim that $x_i\geq \frac{|U_{i-1}|}{k}$.
+
+We proceed by contradiction.
+
+Suppose every set covers fewer than $\frac{|U_{i-1}|}{k}$ elements of $U_{i-1}$. Then the $k$ sets of the optimal solution together cover fewer than $|U_{i-1}|$ elements of $U_{i-1}$, so they do not cover $X$, a contradiction. Since greedy picks the set that covers the most uncovered elements, $x_i\geq \frac{|U_{i-1}|}{k}$.
+
+Therefore $|U_i|\leq |U_{i-1}|\left(1-\frac{1}{k}\right)$, and by induction $|U_i|\leq n\left(1-\frac{1}{k}\right)^i$.
+
+> Some math magic:
+> $$(1-\frac{1}{k})^k\leq \frac{1}{e}$$
+
+Before the last round at least one element is uncovered, so $1\leq |U_{|C|-1}|\leq n(1-\frac{1}{k})^{|C|-1}\leq n e^{-\frac{|C|-1}{k}}$, which gives $|C|\leq 1+k\ln n$.
+
+So the size of the set cover found is at most $(1+\ln n)k$.
+
+QED
+
+So the greedy set cover is not too bad...
+
+###### Second bound:
+
+Greedy set cover is an $H_d$-approximation algorithm for set cover, where $d$ is the largest cardinality of any set in the optimal solution.
+
+Proof:
+
+Assign a cost to the elements of $X$ according to the decisions of the greedy set cover.
+
+Let $S^i$ be the set picked at step $i$, and let $\delta(S^i)$ be the number of new elements covered by $S^i$.
+
+$$
+\delta(S^i)=|S^i\cap U_{i-1}|
+$$
+
+If the element $x$ is first covered at step $i$, when set $S^i$ is picked, then the cost of $x$ is
+
+$$
+c(x)=\frac{1}{\delta(S^i)}=\frac{1}{x_i}
+$$
+
+Example:
+
+$$
+\begin{aligned}
+X&=\{A,B,C,D,E,F,G\}\\
+S_1&=\{A,C,E\}\\
+S_2&=\{B,C,F,G\}\\
+S_3&=\{B,D,F,G\}\\
+S_4&=\{D,G\}
+\end{aligned}
+$$
+
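+First we select $S_2$ (it ties with $S_3$ at four new elements; we break the tie in favor of $S_2$), so $c(B)=c(C)=c(F)=c(G)=\frac{1}{4}$.
+
+Then we select $S_1$, which covers the new elements $A$ and $E$, so $c(A)=c(E)=\frac{1}{2}$.
+
+Then we select $S_3$ ($S_4$ would do equally well), which covers the new element $D$, so $c(D)=1$.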
+
+If element $x$ was covered by greedy set cover due to the addition of set $S^i$ at step $i$, then the cost of $x$ is $\frac{1}{\delta(S^i)}$.
+
+$$
+\textup{Total cost of GSC}=\sum_{x\in X}c(x)=\sum_{i=1}^{|C|}\sum_{x\textup{ covered at iteration }i}c(x)=\sum_{i=1}^{|C|}\delta(S^i)\frac{1}{\delta(S^i)}=|C|
+$$
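
The cost assignment can be replayed on the example above. The sketch below is an illustration under our own conventions: `greedy_costs` is a hypothetical helper, the sets are stored in a dictionary keyed by name, and ties are broken in favor of the set listed first. It uses exact fractions so that the total can be compared with $|C|$ directly.

```python
from fractions import Fraction


def greedy_costs(X, sets):
    """Run greedy set cover and charge each element 1/delta when it is
    first covered. `sets` maps a name to a set; ties go to the first name."""
    uncovered = set(X)
    picked, cost = [], {}
    while uncovered:
        # the set covering the most uncovered elements (first one on ties)
        name = max(sets, key=lambda s: len(sets[s] & uncovered))
        new = sets[name] & uncovered
        if not new:
            raise ValueError("the sets do not cover X")
        for x in new:
            cost[x] = Fraction(1, len(new))
        picked.append(name)
        uncovered -= new
    return picked, cost


X = set("ABCDEFG")
sets = {
    "S1": set("ACE"),
    "S2": set("BCFG"),
    "S3": set("BDFG"),
    "S4": set("DG"),
}
picked, cost = greedy_costs(X, sets)
assert picked == ["S2", "S1", "S3"]
# the costs of all elements add up to the number of sets picked
assert sum(cost.values()) == len(picked)
```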
+
+Claim: Consider any set $S\in Y$. The total cost that the greedy set cover assigns to the elements of $S$, $\sum_{x\in S}c(x)$, is at most $H_{|S|}$.
+
+Suppose that the greedy set cover covers the elements of $S$ in the order $x_1,x_2,\ldots,x_{|S|}$, where $\{x_1,x_2,\ldots,x_{|S|}\}=S$ (elements covered in the same step are ordered arbitrarily).
+
+Just before GSC covers $x_j$, the elements $\{x_j,x_{j+1},\ldots,x_{|S|}\}$ are not covered.
+
+At this point, GSC has the option of picking $S$.
+
+This implies that $\delta(S)$ is at least $|S|-j+1$.
+
+GSC picks the set $\hat{S}$ for which $\delta(\hat{S})$ is maximized ($\hat{S}$ may be $S$ or another set that covers $x_j$).
+
+So, $\delta(\hat{S})\geq \delta(S)\geq |S|-j+1$.
+
+So the cost of $x_j$ is $\frac{1}{\delta(\hat{S})}\leq \frac{1}{\delta(S)}\leq \frac{1}{|S|-j+1}$.
+
+Summing over all $j$, the cost of $S$ is at most $\sum_{j=1}^{|S|}\frac{1}{|S|-j+1}=H_{|S|}$.
+
+Back to the proof of approximation ratio:
+
+Let $C^*$ be an optimal set cover.
+
+$$
+|C|=\sum_{x\in X}c(x)\leq \sum_{S_j\in C^*}\sum_{x\in S_j}c(x)
+$$
+
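+This inequality holds because $C^*$ covers every element of $X$, and an element that belongs to more than one set of $C^*$ is counted more than once on the right-hand side.
+
+By our claim, $\sum_{x\in S_j}c(x)\leq H_{|S_j|}$ for every $S_j\in C^*$.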
+
+Let $d$ be the largest cardinality of any set in $C^*$.
+
+$$
+|C|\leq \sum_{S_j\in C^*}H_{|S_j|}\leq \sum_{S_j\in C^*}H_d=H_d|C^*|
+$$
+
+So the approximation ratio for greedy set cover is $H_d$.
+
+QED
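
As a sanity check, the $H_d$ bound can be verified on the small example from this section by comparing the greedy result with a brute-force optimum. This is a sketch for illustration only: `optimal_set_cover` is our own helper, it takes exponential time, and the compact `greedy_set_cover` here assumes that $Y$ is a list of Python sets.

```python
from fractions import Fraction
from itertools import combinations


def greedy_set_cover(X, Y):
    """Greedy set cover: repeatedly take the set with the most new elements."""
    uncovered, C = set(X), []
    while uncovered:
        best = max(Y, key=lambda S: len(S & uncovered))
        if not best & uncovered:
            raise ValueError("Y does not cover X")
        C.append(best)
        uncovered -= best
    return C


def optimal_set_cover(X, Y):
    """Brute force over sub-collections by increasing size (exponential time)."""
    for size in range(len(Y) + 1):
        for choice in combinations(Y, size):
            if set().union(*choice) >= set(X):
                return list(choice)
    raise ValueError("Y does not cover X")


X = set("ABCDEFG")
Y = [set("ACE"), set("BCFG"), set("BDFG"), set("DG")]
C = greedy_set_cover(X, Y)
C_star = optimal_set_cover(X, Y)
# d is the largest cardinality of any set in the optimal cover
d = max(len(S) for S in C_star)
H_d = sum(Fraction(1, i) for i in range(1, d + 1))
assert len(C) <= H_d * len(C_star)
```

On this instance greedy picks three sets while two suffice, which is within the bound $H_4\cdot 2=\frac{25}{6}$.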