upgrade structures and migrate to nextra v4
content/CSE347/CSE347_L1.md (new file, 245 lines)
@@ -0,0 +1,245 @@
|
||||
# Lecture 1
|
||||
|
||||
## Greedy Algorithms
|
||||
|
||||
* Builds up a solution by making a series of small decisions that optimize some objective.
|
||||
* Make one irrevocable choice at a time, creating smaller and smaller sub-problems of the same kind as the original problem.
|
||||
* There are many potential greedy strategies and picking the right one can be challenging.
|
||||
|
||||
### A Scheduling Problem
|
||||
|
||||
You manage a giant space telescope.
|
||||
|
||||
* There are $n$ research projects that want to use it to make observations.
|
||||
* Only one project can use the telescope at a time.
|
||||
* Project $p_i$ needs the telescope starting at time $s_i$ and running for a length of time $t_i$.
|
||||
* Goal: schedule as many projects as possible
|
||||
|
||||
Formally
|
||||
|
||||
Input:
|
||||
|
||||
* Given a set $P$ of projects, $|P|=n$
|
||||
* Each request $p_i\in P$ occupies interval $[s_i,f_i)$, where $f_i=s_i+t_i$
|
||||
|
||||
Goal: Choose a subset $\Pi\subseteq P$ such that
|
||||
|
||||
1. No two projects in $\Pi$ have overlapping intervals.
|
||||
2. The number of selected projects $|\Pi|$ is maximized.
|
||||
|
||||
#### Shortest Interval
|
||||
|
||||
Counter-example: `[1,10],[9,12],[11,20]`
|
||||
|
||||
#### Earliest start time
|
||||
|
||||
Counter-example: `[1,10],[2,3],[4,5]`
|
||||
|
||||
#### Fewest Conflicts
|
||||
|
||||
Counter-example: `[1,2],[1,4],[1,4],[3,6],[7,8],[5,8],[5,8]`
|
||||
|
||||
#### Earliest finish time
|
||||
|
||||
Correct... but why
|
||||
|
||||
#### Theorem of Greedy Strategy (Earliest Finishing Time)
|
||||
|
||||
Say this greedy strategy (Earliest Finishing Time) picks a set $\Pi$ of intervals, and some other strategy picks a set $O$ of intervals.

Assume both are sorted by finishing time:

* $\Pi=\{i_1,i_2,...,i_k\},|\Pi|=k$
* $O=\{j_1,j_2,...,j_m\},|O|=m$

We want to show that $|\Pi|\geq|O|$, i.e. $k\geq m$.

#### Lemma: For all $r\leq k$, $f_{i_r}\leq f_{j_r}$

We proceed by induction on $r$.

* Base case, $r=1$.

  The greedy strategy picks the interval with the earliest finish time overall, so $O$ cannot pick an interval that finishes earlier; hence $f_{i_1}\leq f_{j_1}$.

* Inductive step, $r>1$.

  By the inductive hypothesis, $f_{i_{r-1}}\leq f_{j_{r-1}}$. Interval $j_r$ starts no earlier than $f_{j_{r-1}}\geq f_{i_{r-1}}$, so $j_r$ is also compatible with the first $r-1$ greedy choices. Since the greedy strategy picks the compatible interval with the earliest finish time, $f_{i_r}\leq f_{j_r}$.

The theorem follows: if $m>k$, then $j_{k+1}$ starts after $f_{j_k}\geq f_{i_k}$, so it would still be compatible with all $k$ greedy choices and the greedy strategy would not have stopped at $k$; hence $k\geq m$.
|
||||
|
||||
#### Problem of “Greedy Stays Ahead” Proof
|
||||
|
||||
* Every problem requires its own, very different "stays ahead" theorem.
|
||||
* It can be challenging to even write down the correct statement that you must prove.
|
||||
* We want a systematic approach to prove the correctness of greedy algorithms.
|
||||
|
||||
### Road Map to Prove Greedy Algorithm
|
||||
|
||||
#### 1. Make a Choice
|
||||
|
||||
Pick an interval based on greedy choice, say $q$
|
||||
|
||||
Proof: **Greedy Choice Property**: Show that using our first choice is not "fatal" – at least one optimal solution makes this choice.
|
||||
|
||||
Techniques: **Exchange Argument**: "If an optimal solution does not choose $q$, we can turn it into an equally good solution that does."
|
||||
|
||||
Let $\Pi^*$ be any optimal solution for project set $P$.
|
||||
- If $q\in \Pi^*$, we are done.
|
||||
- Otherwise, let $x$ be the interval in $\Pi^*$ with the earliest finishing time. We create another solution $\bar{\Pi^*}$ that replaces $x$ with $q$, and prove that $\bar{\Pi^*}$ is as good as $\Pi^*$.
|
||||
|
||||
#### 2. Create a smaller instance $P'$ of the original problem
|
||||
|
||||
$P'$ has the same optimization criteria.
|
||||
|
||||
Proof: **Inductive Structure**: Show that after making the first choice, we're left with a smaller version of the same problem, whose solution we can safely combine with the first choice.
|
||||
|
||||
Let $P'$ be the subproblem left after making the first choice $q$ in problem $P$, and let $\Pi'$ be an optimal solution to $P'$. Then $\Pi=\Pi'\cup\{q\}$ is an optimal solution to $P$.
|
||||
|
||||
$P'=P-\{q\}-\{$projects conflicting with $q\}$
|
||||
|
||||
#### 3. Solution: Union of choices that we made
|
||||
|
||||
Union of choices that we made.
|
||||
|
||||
Proof: **Optimal Substructure**: Show that if we solve the subproblem optimally, adding our first choice creates an optimal solution to the *whole* problem.
|
||||
|
||||
Let $q$ be the first choice, $P'$ be the subproblem left after making $q$ in problem $P$, $\Pi'$ be an optimal solution to $P'$. We claim that $\Pi=\Pi'\cup \{q\}$ is an optimal solution to $P$.
|
||||
|
||||
We proceed by contradiction.

Assume that $\Pi=\Pi'\cup\{q\}$ is not optimal.

By the greedy choice property (GCP), there exists an optimal solution $\Pi^*$ for problem $P$ that contains $q$. If $\Pi$ is not optimal, then $|\Pi^*|>|\Pi|$. Since $\Pi^*-\{q\}$ is a feasible solution to $P'$, we get $|\Pi^*-\{q\}|>|\Pi-\{q\}|=|\Pi'|$, which contradicts the assumption that $\Pi'$ is an optimal solution to $P'$.
|
||||
|
||||
#### 4. Put 1-3 together to write an inductive proof of the Theorem
|
||||
|
||||
This is independent of problem, same for every problem.
|
||||
|
||||
Use scheduling problem as an example:
|
||||
|
||||
Theorem: given a scheduling problem $P$, if we repeatedly choose the remaining feasible project with the earliest finishing time, we will construct an optimal feasible solution to $P$.
|
||||
|
||||
Proof: We proceed by induction on $|P|$ (the size of problem $P$).

- Base case: $|P|=1$. Picking the single project is trivially optimal.
- Inductive step.
  - Inductive hypothesis: for all problems of size $<n$, earliest finishing time (EFT) gives an optimal solution.
  - Claim: EFT is optimal for problems of size $n$.
  - Proof: Once we pick $q$ (justified by the greedy choice property), $P'=P-\{q\}-\{$intervals that conflict with $q\}$ and $|P'|<n$. By the inductive hypothesis, EFT gives an optimal solution $\Pi'$ to $P'$; by the inductive structure and optimal substructure properties, $\Pi'\cup\{q\}$ is an optimal solution to $P$.
|
||||
|
||||
_this step always holds as long as the previous three properties hold, and we don't usually write the whole proof._
|
||||
|
||||
```python
# Greedy algorithm (earliest finish time) for the interval scheduling problem
def schedule(p):
    # p is a list of (start, finish) intervals; sorting takes O(n log n)
    p = sorted(p, key=lambda x: x[1])
    res = [p[0]]
    # single scan takes O(n)
    for i in p[1:]:
        # intervals are half-open [s, f), so a project may start exactly when
        # the previously selected one finishes
        if res[-1][1] <= i[0]:
            res.append(i)
    return res
```
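For example, with the half-open convention, `schedule([(1,10),(9,12),(11,20)])` returns `[(1, 10), (11, 20)]`, matching the counter-example for the shortest-interval strategy above.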
|
||||
|
||||
## Extra Examples:
|
||||
|
||||
### File compression problem
|
||||
|
||||
You have $n$ files of different sizes $f_i$.
|
||||
|
||||
You want to merge them to create a single file. $merge(f_i,f_j)$ takes time $f_i+f_j$ and creates a file of size $f_k=f_i+f_j$.
|
||||
|
||||
Goal: Find the order of merges such that the total time to merge is minimized.
|
||||
|
||||
Thinking process: the merge process forms a binary tree in which each file is a leaf.
|
||||
|
||||
The total time required =$\sum^n_{i=1} d_if_i$, where $d_i$ is the depth of the file in the compression tree.
|
||||
|
||||
So merging the smallest files first (pushing them deeper in the tree) may yield a smaller total merge time.
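A minimal sketch of this strategy (an illustration added here, assuming the files are given as a list of sizes; `min_total_merge_time` is a hypothetical helper name), using a min-heap in the style of Huffman merging:

```python
import heapq

def min_total_merge_time(sizes):
    # Greedy: repeatedly merge the two smallest files.
    # Each merge of files of sizes a and b costs a + b.
    heap = list(sizes)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total

# e.g. min_total_merge_time([2, 2, 3]) == 11  (merge 2+2=4, then 4+3=7)
```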
|
||||
|
||||
Proof:
|
||||
|
||||
#### Greedy Choice Property
|
||||
|
||||
Construct part of the solution by making a locally good decision.
|
||||
|
||||
Lemma: $\exists$ an optimal solution that merges the two smallest files, say $f_1,f_2$, first.
|
||||
|
||||
Proof: **Exchange argument**
|
||||
|
||||
* Case 1: The optimal solution already merges $f_1,f_2$ together at some point; done. Only the tree structure (the depth of each file) determines the cost, not the chronological order of the merges.
  * e.g. for `[2,2,3]`, the total cost depends only on which pairs are merged, not on when each merge is performed.
* Case 2: The optimal solution does not merge $f_1$ and $f_2$ together.
  * Suppose the optimal solution merges $f_x,f_y$ as the deepest merge.
  * Then $d_x\geq d_1,d_y\geq d_2$. Exchanging $f_1,f_2$ with $f_x,f_y$ yields a solution whose total cost is no greater, since $f_1,f_2$ are the smallest files.
|
||||
|
||||
#### Inductive Structure
|
||||
|
||||
* We can combine feasible solution to the subproblem $P'$ with the greedy choice to get a feasible solution to $P$
|
||||
* After making greedy choice $q$, we are left with a strictly smaller subproblem $P'$ with the same optimality criteria of the original problem
|
||||
|
||||
Proof: **Optimal Substructure**: Show that if we solve the subproblem optimally, adding our first choice creates an optimal solution to the *whole* problem.
|
||||
|
||||
Let $q$ be the first choice, $P'$ be the subproblem left after making $q$ in problem $P$, and $\Pi'$ be an optimal solution to $P'$. We claim that $\Pi=\Pi'\cup \{q\}$ is an optimal solution to $P$.
|
||||
|
||||
We proceed the proof by contradiction.
|
||||
|
||||
Assume that $\Pi=\Pi'\cup\{q\}$ is not optimal.

By the greedy choice property (GCP), there exists an optimal solution $\Pi^*$ to $P$ that contains $q$. If $\Pi$ is not optimal, then $\Pi^*$ is strictly better than $\Pi$. Since $\Pi^*-\{q\}$ is a feasible solution to $P'$ and it is strictly better than $\Pi-\{q\}=\Pi'$, this contradicts the optimality of $\Pi'$ for $P'$.
|
||||
|
||||
Proof: **Smaller problem size**
|
||||
|
||||
After merging the two smallest files into one, we have strictly fewer files left to merge.
|
||||
|
||||
#### Optimal Substructure
|
||||
|
||||
* We can combine optimal solution to the subproblem $P'$ with the greedy choice to get a optimal solution to $P$
|
||||
|
||||
Step 4 is omitted; it is the same for all greedy problems.
|
||||
|
||||
### Conclusion: Greedy Algorithm
|
||||
|
||||
* Algorithm
|
||||
* Runtime Complexity
|
||||
* Proof
|
||||
* Greedy Choice Property
|
||||
* Construct part of the solution by making a locally good decision.
|
||||
* Inductive Structure
|
||||
* We can combine feasible solution to the subproblem $P'$ with the greedy choice to get a feasible solution to $P$
|
||||
* After making greedy choice $q$, we are left with a strictly smaller subproblem $P'$ with the same optimality criteria of the original problem
|
||||
* Optimal Substructure
|
||||
* We can combine optimal solution to the subproblem $P'$ with the greedy choice to get a optimal solution to $P$
|
||||
* Standard Contradiction Argument simplifies it
|
||||
|
||||
## Review:
|
||||
|
||||
### Essence of master method
|
||||
|
||||
Let $a\geq 1$ and $b>1$ be constants, let $f(n)$ be a function, and let $T(n)$ be defined on the nonnegative integers by the recurrence
|
||||
|
||||
$$
|
||||
T(n)=aT(\frac{n}{b})+f(n)
|
||||
$$
|
||||
|
||||
where we interpret $n/b$ to mean either $\lceil n/b\rceil$ or $\lfloor n/b\rfloor$. Let $c_{crit}=\log_b a$. Then $T(n)$ has the following asymptotic bounds.

* Case I: if $f(n) = O(n^{c})$ for some $c<c_{crit}$ (i.e. $n^{c_{crit}}$ "dominates" $f(n)$), then $T(n) = \Theta(n^{c_{crit}})$

* Case II: if $f(n) = \Theta(n^{c_{crit}})$ (neither $f(n)$ nor $n^{c_{crit}}$ dominates), then $T(n) = \Theta(n^{c_{crit}} \log n)$

  Extension for $f(n)=\Theta(n^{c_{crit}}(\log n)^k)$:

  * if $k>-1$: $T(n)=\Theta(n^{c_{crit}}(\log n)^{k+1})$
  * if $k=-1$: $T(n)=\Theta(n^{c_{crit}}\log \log n)$
  * if $k<-1$: $T(n)=\Theta(n^{c_{crit}})$

* Case III: if $f(n) = \Omega(n^{c_{crit}+\epsilon})$ for some constant $\epsilon>0$ (i.e. $f(n)$ "dominates" $n^{c_{crit}}$), and if $a\,f(n/b)\leq c\,f(n)$ for some constant $c<1$ and all sufficiently large $n$ (regularity condition), then $T(n) = \Theta(f(n))$
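For instance, merge sort's recurrence falls under Case II:

$$
T(n)=2T(n/2)+\Theta(n),\quad a=2,\ b=2,\ c_{crit}=\log_2 2=1,\quad f(n)=\Theta(n^{c_{crit}})\ \Rightarrow\ T(n)=\Theta(n\log n)
$$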
|
||||
|
||||
content/CSE347/CSE347_L10.md (new file, 320 lines)
@@ -0,0 +1,320 @@
|
||||
# Lecture 10
|
||||
|
||||
## Online Algorithms
|
||||
|
||||
### Example 1: Elevator
|
||||
|
||||
Problem: You've entered the lobby of a tall building, and want to go to the top floor as quickly as possible. There is an elevator which takes $E$ time to get to the top once it arrives. You can also take the stairs which takes $S$ time to climb (once you start) with $S>E$. However, you **do not know** when the elevator will arrive.
|
||||
|
||||
#### Offline (Clairvoyant) vs. Online
|
||||
|
||||
Offline: If you know that the elevator is arriving in $T$ time, then what will you do?

- Easy. Compare $T+E$ with $S$ and take the smaller one.
|
||||
|
||||
Online: You do not know when the elevator will arrive.
|
||||
|
||||
- You can either wait for the elevator or take the stairs.
|
||||
|
||||
#### Strategies
|
||||
|
||||
**Always take the stairs.**
|
||||
|
||||
Your cost: $S$.

Optimal cost: $E$ (if the elevator arrives immediately, i.e. $T=0$).

Your cost / Optimal cost = $\frac{S}{E}$.

$\frac{S}{E}$ can be arbitrarily large. For example, the Empire State Building has $103$ floors.
|
||||
|
||||
**Wait for the elevator**
|
||||
|
||||
Your cost $T+E$
|
||||
|
||||
Optimal Cost: $S$ (if $T$ is large)
|
||||
|
||||
Your cost / Optimal cost = $\frac{T+E}{S}$.
|
||||
|
||||
$T$ could be arbitrarily large. For an out-of-service elevator, $T$ is effectively infinite.
|
||||
|
||||
#### Online Algorithms
|
||||
|
||||
Definition: An online algorithm must take decisions **without** full information about the problem instance [in this case $T$] and/or it does not know the future [e.g. makes decision immediately as jobs come in without knowing the future jobs].
|
||||
|
||||
An **offline algorithm** has the full information about the problem instance.
|
||||
|
||||
### Competitive Ratio
|
||||
|
||||
Quality of online algorithm is quantified by the **competitive ratio** (Idea is similar to the approximation ratio in optimization).
|
||||
|
||||
Consider a problem $L$ (minimization) and let $l$ be an instance of this problem.
|
||||
|
||||
$C^*(l)$ is the cost of the optimal offline solution with full information and unlimited computational power.
|
||||
|
||||
$A$ is the online algorithm for $L$.
|
||||
|
||||
$C_A(l)$ is the value of $A$'s solution on $l$.
|
||||
|
||||
An online algorithm $A$ is $\alpha$-competitive if
|
||||
|
||||
$$
|
||||
\frac{C_A(l)}{C^*(l)}\leq \alpha
|
||||
$$
|
||||
|
||||
for all instances $l$ of the problem.
|
||||
|
||||
In other words, $\alpha=\max_l\frac{C_A(l)}{C^*(l)}$.
|
||||
|
||||
For maximization problems, the ratio is inverted, $\alpha=\max_l\frac{C^*(l)}{C_A(l)}$, and we still want it as small as possible.
|
||||
|
||||
### Back to the Elevator Problem
|
||||
|
||||
**Strategy 1**: Always take the stairs. Ratio is $\frac{S}{E}$. can be arbitrarily large.
|
||||
|
||||
**Strategy 2**: Wait for the elevator. Ratio is $\frac{T+E}{S}$. can be arbitrarily large.
|
||||
|
||||
**Strategy 3**: We do not make a decision immediately. Let's wait for time $R$ and then take the stairs if the elevator has not arrived.
|
||||
|
||||
Question: What is the value of $R$? (how long to wait?)
|
||||
|
||||
Let's try $R=S$.
|
||||
|
||||
Claim: The competitive ratio is $2$.
|
||||
|
||||
Proof:
|
||||
|
||||
Case 1: The optimal offline solution takes the elevator, then $T+E\leq S$.
|
||||
|
||||
We also take the elevator.
|
||||
|
||||
Competitive ratio = $\frac{T+E}{T+E}=1$.
|
||||
|
||||
Case 2: The optimal offline solution takes the stairs, immediately.
|
||||
|
||||
We wait for time $R=S$ and then take the stairs; in the worst case we pay $R+S=2S$.
|
||||
|
||||
Competitive ratio = $\frac{2R}{R}=2$.
|
||||
|
||||
QED
|
||||
|
||||
Let's try $R=S-E$ instead.
|
||||
|
||||
Claim: The competitive ratio is $\max\{1,2-\frac{E}{S}\}$.
|
||||
|
||||
Proof:
|
||||
|
||||
Case 1: The optimal offline solution takes the elevator, then $T+E\leq S$.
|
||||
|
||||
We also take the elevator.
|
||||
|
||||
Competitive ratio = $\frac{T+E}{T+E}=1$.
|
||||
|
||||
Case 2: The optimal offline solution takes the stairs, immediately.
|
||||
|
||||
We wait for $R=S-E$ times and then take the stairs.
|
||||
|
||||
Competitive ratio = $\frac{S-E+S}{S}=2-\frac{E}{S}$.
|
||||
|
||||
QED
|
||||
|
||||
What if we wait less time? Let's try $R=S-E-\epsilon$ for some $\epsilon>0$
|
||||
|
||||
In the worst case, we wait for $S-E-\epsilon$ time and then take the stairs for $S$ (the elevator arrives just after we give up).
|
||||
|
||||
Competitive ratio = $\frac{(S-E-\epsilon)+S}{(S-E-\epsilon)+E}=\frac{2S-E-\epsilon}{S-\epsilon}>2-\frac{E}{S}$.
|
||||
|
||||
So the optimal competitive ratio is $max\{1,2-\frac{E}{S}\}$ when we wait for $S-E$ time.
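A minimal sketch comparing the wait-$R$ strategy against the clairvoyant optimum (an illustration with made-up function names and example numbers):

```python
def online_cost(T, E, S, R):
    # cost of the wait-R strategy when the elevator arrives at time T
    return T + E if T <= R else R + S

def offline_cost(T, E, S):
    # clairvoyant optimum: knows T in advance
    return min(T + E, S)

E, S = 10, 60
R = S - E
worst = max(online_cost(T, E, S, R) / offline_cost(T, E, S) for T in range(1000))
print(worst)  # about 1.83 = 2 - E/S
```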
|
||||
|
||||
### Example 2: Cache Replacement
|
||||
|
||||
Cache: Data in a cache is organized in blocks (also called pages or cache lines).
|
||||
|
||||
If CPU accesses data that is already in the cache, it is called **cache hit**, then access is fast.
|
||||
|
||||
If the CPU accesses data that is not in the cache, it is called a **cache miss**. The block is then brought into the cache from main memory. If the cache already holds $k$ blocks (it is full), another block needs to be **kicked out** (eviction).
|
||||
|
||||
Goal: Minimize the number of cache misses.
|
||||
|
||||
**Clairvoyant policy**: Knows what will be accessed in the future and in what order.
|
||||
|
||||
FIF: evict the block that will be accessed farthest in the future.
|
||||
|
||||
Example: $k=3$, cache has $3$ blocks.
|
||||
|
||||
Sequence: $A B C D C A B$
|
||||
|
||||
Cache: $A\ B\ C$; at $D$, evict $B$ (it is accessed farthest in the future): 3 warm-up misses and 1 miss. The only later miss is on the final $B$.
|
||||
|
||||
Online Algorithm: Least recently used (LRU)
|
||||
|
||||
LRU: least recently used.
|
||||
|
||||
Example: $A\ B\ C\ D\ C\ A\ B$

Cache: $A\ B\ C$; evict $A$ (the least recently used) for $D$: 3 warm-up misses and 1 miss.

Cache: $D\ B\ C$; $C$ hits, then evict $B$ for $A$: 1 miss.

Cache: $D\ A\ C$; evict $D$ for $B$: 1 miss.
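A small sketch for checking such traces (added as an illustration; `lru_misses` is a hypothetical helper, not from the lecture):

```python
from collections import OrderedDict

def lru_misses(sequence, k):
    # Count cache misses of LRU on an access sequence with cache size k.
    cache = OrderedDict()          # keys ordered from least to most recently used
    misses = 0
    for page in sequence:
        if page in cache:
            cache.move_to_end(page)        # refresh recency on a hit
        else:
            misses += 1
            if len(cache) == k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return misses

print(lru_misses("ABCDCAB", 3))  # 6: three warm-up misses plus D, A, B
```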
|
||||
|
||||
#### Competitive Ratio for LRU
|
||||
|
||||
Claim: LRU is $k+1$-competitive.
|
||||
|
||||
Proof:
|
||||
|
||||
Split the sequence into subsequences such that each subsequence contains $k+1$ distinct blocks.
|
||||
|
||||
For example, suppose $k=3$, sequence $ABCDCEFGEA$, subsequences are $ABCDC$ and $EFGEA$.
|
||||
|
||||
LRU Cache: In each subsequence, it has at most $k+1$ misses.
|
||||
|
||||
The optimal offline solution: In each subsequence, must have at least $1$ miss.
|
||||
|
||||
So the competitive ratio is at most $k+1$.
|
||||
|
||||
QED
|
||||
|
||||
Using similar analysis, we can show that LRU is $k$ competitive.
|
||||
|
||||
Hint for the proof:
|
||||
|
||||
Split the sequence into subsequences such that each subsequence LRU has $k$ misses.
|
||||
|
||||
Argue that OPT has at least $1$ miss in each subsequence.
|
||||
|
||||
QED
|
||||
|
||||
#### Many sensible algorithms are $k$-competitive
|
||||
|
||||
**Lower Bound**: No deterministic online algorithm is better than $k$-competitive.
|
||||
|
||||
**Resource augmentation**: Offline algorithm (which knows the future) has $k$ cache lines in its cache and the online algorithm has $ck$ cache lines with $c>1$.
|
||||
|
||||
##### Lemma: Competitive Ratio is $\sim \frac{c}{c-1}$
|
||||
|
||||
Say $c=2$. LRU cache has twice as much as cache. LRU is $2$-competitive.
|
||||
|
||||
Proof:
|
||||
|
||||
LRU has cache of size $2k$.
|
||||
|
||||
Divide the sequence into subsequences such that you have $ck$ distinct pages.
|
||||
|
||||
In each subsequence, LRU has at most $ck$ misses.
|
||||
|
||||
The OPT has at least $(c-1)k$ misses.
|
||||
|
||||
So competitive ratio is at most $\frac{ck}{(c-1)k}=\frac{c}{c-1}$.
|
||||
|
||||
_Actual competitive ratio is $\sim \frac{c}{c-1+\frac{1}{k}}$._
|
||||
|
||||
QED
|
||||
|
||||
### Conclusion
|
||||
|
||||
- Definition: some information unknown
|
||||
- Clairvoyant vs. Online
|
||||
- Competitive Ratio
|
||||
- Example:
|
||||
- Elevator
|
||||
- Cache Replacement
|
||||
|
||||
### Example 3: Pessimal cache problem
|
||||
|
||||
Maximize number of cache misses.
|
||||
|
||||
Maximization problem: the competitive ratio is $\max_\sigma\left\{\frac{\text{cost of the optimal offline algorithm}}{\text{cost of our algorithm}}\right\}$.

Equivalently, we want to maximize $\min_\sigma\left\{\frac{\text{cost of our algorithm}}{\text{cost of the optimal offline algorithm}}\right\}$.
|
||||
|
||||
The size of the cache is $k$.
|
||||
|
||||
So if OPT has $X$ cache misses, we want $\geq \frac{X}{\alpha}$. cache misses where $\alpha$ is the competitive ratio.
|
||||
|
||||
Claim: OPT can miss on (almost) every access, except when the same page is accessed twice in a row.
|
||||
|
||||
Claim: No deterministic online algorithm has a bounded competitive ratio. (that is independent of the length of the sequence)
|
||||
|
||||
Proof:
|
||||
|
||||
Start with an empty cache. (size of cache is $k$)
|
||||
|
||||
Miss the first $k$ unique pages.
|
||||
|
||||
$P_1,P_2,\cdots,P_k|P_{k+1},P_{k+2},\cdots,P_{2k}$
|
||||
|
||||
Say your deterministic online algorithm choose to evict $P_i$ for $i\in\{1,2,\cdots,k\}$.
|
||||
|
||||
We want to choose $P_i$ for $i\in\{1,2,\cdots,k\}$ such that the number of misses is maximized.
|
||||
|
||||
The optimal offline solution: evict the page that will be accessed furthest in the future. Let's call it $\sigma$.
|
||||
|
||||
The online algorithm: evict $P_i$ for $i\in\{1,2,\cdots,k\}$. Will have $k+1$ misses in the worst case.
|
||||
|
||||
So the competitive ratio is at most $\frac{\sigma}{k+1}$, which is unbounded.
|
||||
|
||||
#### Randomized most recently used (RAND, MRU)
|
||||
|
||||
MRU without randomization is a deterministic algorithm, and thus, by the claim above, its competitive ratio is not bounded; we therefore add randomization.
|
||||
|
||||
First $k$ unique accesses brings all pages to cache.
|
||||
|
||||
On the $k+1$th access, pick a random page from the cache and evict it.
|
||||
|
||||
After that, evict the most recently used (MRU) page on a miss.
|
||||
|
||||
Claim: RAND is $k$-competitive.
|
||||
|
||||
#### Lemma: After the first $k+1$ unique accesses at all times
|
||||
|
||||
1. 1 page is in the cache with probability 1 (the MRU one)
|
||||
2. There exists $k$ pages each of which is in the cache with probability $1-\frac{1}{k}$
|
||||
3. All other pages are in the cache with probability $0$.
|
||||
|
||||
Proof:
|
||||
|
||||
By induction.
|
||||
|
||||
Base case: right after the first $k+1$ unique accesses and before $k+2$th access.
|
||||
|
||||
1. $P_{k+1}$ is in the cache with probability $1$.
|
||||
2. When we brought $P_{k+1}$ to the cache, we evicted one page uniformly at random. (i.e. $P_i$ is evicted with probability $\frac{1}{k}$, $P_i$ is still in the cache with probability $1-\frac{1}{k}$)
|
||||
3. All other $r$ pages are definitely not in the cache because we did not see them yet.
|
||||
|
||||
Inductive cases:
|
||||
|
||||
Let $P$ be a page that is in the cache with probability $0$
|
||||
|
||||
Cache miss and RAND MRU evict $P'$ for another page with probability in this cache with probability $0$.
|
||||
|
||||
1. $P$ is in the cache with probability $1$.
|
||||
2. By induction, there exists a set of $k$ pages each of which is in the cache with probability $1-\frac{1}{k}$.
|
||||
3. All other pages are in the cache with probability $0$.
|
||||
|
||||
Let $P$ be a page in the cache with probability $1-\frac{1}{k}$.
|
||||
|
||||
With probability $\frac{1}{k}$, $P$ is not in the cache and RAND evicts $P'$ in the cache and brings $P$ to the cache.
|
||||
|
||||
QED
|
||||
|
||||
MRU is $k$-competitive.
|
||||
|
||||
Proof:
|
||||
|
||||
Case 1: Access MRU page.
|
||||
|
||||
Both OPT and our algorithm don't miss.
|
||||
|
||||
Case 2: Access some other page.
|
||||
|
||||
OPT definitely misses.
|
||||
|
||||
RAND MRU misses with probability $\geq \frac{1}{k}$.
|
||||
|
||||
Let's define the random variable $X$ as the number of misses of RAND MRU.
|
||||
|
||||
$E[X]\leq 1+\frac{1}{k}$.
|
||||
|
||||
QED
|
||||
content/CSE347/CSE347_L11.md (new file, 152 lines)
@@ -0,0 +1,152 @@
|
||||
# Lecture 11
|
||||
|
||||
## More randomized algorithms
|
||||
|
||||
> Caching problem: You have a cache with $k$ blocks and a sequence of accesses, called $\sigma$. The cost of a randomized caching algorithm is the expected number of cache misses on $\sigma$.
|
||||
|
||||
### Randomized Marking Algorithm
|
||||
|
||||
> A phase $i$ has $n_i$ new pages.
|
||||
|
||||
A key lower bound: the optimal offline algorithm incurs $m^*(\sigma)\geq \frac{1}{2}\sum_{j=1}^{N} n_j$ misses, where $n_j$ is the number of new pages in phase $j$ and $N$ is the number of phases.
|
||||
|
||||
Marking algorithm:
|
||||
|
||||
- at a cache miss, evict an unmarked page uniformly at random
|
||||
- at the beginning of the algorithm, all the entries are unmarked
|
||||
- after $k$ unique accesses and one miss, all the entries are unmarked
|
||||
- old pages: pages in cache at the end of the previous phase
|
||||
- new pages: pages accessed in this phase that are not old.
|
||||
- new pages always cause a miss.
|
||||
- old pages can cause a miss if a new page was accessed and replaced that old page and then the old page was accessed again. This can also be caused by old pages replacing other old pages and creating this cascading effect.
|
||||
|
||||
Reminder: Competitive ratio for our randomized algorithm is
|
||||
|
||||
$$
|
||||
max_\sigma \{\frac{E[m(\sigma)]}{m^*(\sigma)}\}
|
||||
$$
|
||||
|
||||
```python
import random

def randomized_marking_algorithm(sigma, k):
    cache = set()
    marked = set()
    misses = 0
    for page in sigma:
        if page not in cache:
            # once all k blocks are marked, a miss starts a new phase: unmark everything
            if len(marked) == k:
                marked.clear()
            # if the cache is full, evict an unmarked page chosen uniformly at random
            if len(cache) == k:
                victim = random.choice(list(cache - marked))
                cache.remove(victim)
            misses += 1
            cache.add(page)
        # mark the requested page (on hits as well as misses)
        marked.add(page)
    return misses
```
|
||||
|
||||
Example:
|
||||
|
||||
A cache on phase $i$ has $k$ blocks and miss on page $x$:
|
||||
|
||||
[$n_i$ new pages] [$o_i$ old pages] [$x$] [$\ldots$]
|
||||
|
||||
$P[x \text{ causes a miss}] = P[x\text{ was evicted earlier}] \leq \frac{n_j}{k-o_i}$
|
||||
|
||||
Proof:
|
||||
|
||||
**Warning: the first few line of the equation might be wrong.**
|
||||
|
||||
$$
|
||||
\begin{aligned}
|
||||
P\left[x \text{ was evicted earlier}\bigg\vert\begin{array}{c} n_j\text{ new pages}, \\ o_i\text{ old pages}, \\ k \text{ unmarked blocks} \end{array}\right] &=P[x\text{ was unmarked}]+P[x\text{ was marked}] \\
|
||||
&=P[x\text{ was unmarked (new page)}]+P[x\text{ was old page}]+P[x\text{ was in the remaining cache blocks}] \\
|
||||
&= \frac{1}{k}+\frac{o_i}{k} P\left[x \text{ was evicted earlier}\bigg\vert\begin{array}{c} n_j-1\text{ new pages}, \\ o_i-1\text{ old pages}, \\ k-1 \text{ unmarked blocks} \end{array}\right] +\frac{k-1-o_i}{k} P\left[x \text{ was evicted earlier}\bigg\vert\begin{array}{c} n_j-1\text{ new pages}, \\ o_i\text{ old pages}, \\ k-1 \text{ unmarked blocks} \end{array}\right] \\
|
||||
\end{aligned}
|
||||
$$
|
||||
|
||||
Let $P(n_j, o_i, k)$ be the probability that page $x$ causes a miss when the cache has $n_j$ new pages, $o_i$ old pages, and $k$ unmarked blocks.
|
||||
|
||||
Assuming inductively that $P(n', o', k')\leq \frac{n'}{k'-o'}$ for the smaller arguments, we have

$$
\begin{aligned}
P(n_j, o_i, k) &= \frac{1}{k}+\frac{o_i}{k} P(n_j-1, o_i-1, k-1)+\frac{k-1-o_i}{k} P(n_j-1, o_i, k-1) \\
&\leq \frac{1}{k}+\frac{o_i}{k}\cdot \frac{n_j-1}{(k-1)-(o_i-1)}+\frac{k-1-o_i}{k}\cdot \frac{n_j-1}{(k-1)-o_i} \\
&= \frac{n_j}{k}+\frac{o_i(n_j-1)}{k(k-o_i)}\\
&\leq \frac{n_j}{k}+\frac{o_i\,n_j}{k(k-o_i)}\\
&= \frac{n_j}{k-o_i}
\end{aligned}
$$
|
||||
|
||||
Fix a phase $j$, let $x_i$ be an indicator random variable
|
||||
|
||||
$$
|
||||
x_i=\begin{cases}
|
||||
1 & \text{if page } i \text{th old page causes a miss} \\
|
||||
0 & \text{otherwise}
|
||||
\end{cases}
|
||||
$$
|
||||
|
||||
$$
|
||||
\begin{aligned}
|
||||
P[x_i=1]&=P[i\text{th old page causes a miss}]\\
|
||||
&\leq \frac{n_j}{k-(i-1)}\\
|
||||
\end{aligned}
|
||||
$$
|
||||
|
||||
$$
\begin{aligned}
E[\text{misses in phase } j]&=E\left[n_j+\sum_{i=1}^{k-n_j}x_i\right]\\
&=n_j+\sum_{i=1}^{k-n_j} E[x_i]\\
&\leq n_j+\sum_{i=1}^{k-n_j} \frac{n_j}{k-(i-1)}\\
&=n_j+n_j\left(\frac{1}{k}+\frac{1}{k-1}+\cdots+\frac{1}{n_j+1}\right)\\
&\leq n_j H_k
\end{aligned}
$$
|
||||
|
||||
Let $N$ be the total number of phases.
|
||||
|
||||
So the expected total number of misses is
|
||||
|
||||
$$
E[m(\sigma)]=\sum_{j=1}^{N} E[\text{misses in phase } j]\leq\sum_{j=1}^{N} n_j H_k
$$
|
||||
|
||||
So the competitive ratio is
|
||||
|
||||
$$
|
||||
\frac{E[m(\sigma)]}{\frac{1}{2}\sum_{j=1}^{N} n_j}\leq 2H_k=O(\log k)
|
||||
$$
|
||||
|
||||
## Probabilistic boosting for decision problems
|
||||
|
||||
Assume that you have a randomized algorithm that gives you the correct answer with probability $\frac{1}{2}+\epsilon$. for some $\epsilon>0$.
|
||||
|
||||
I want to boost the probability of the correct decision to be $\geq 1-\delta$.
|
||||
|
||||
What we can do is run the algorithm $x$ times independently (each run is correct with probability $\frac{1}{2}+\epsilon$) and take the majority vote.
|
||||
|
||||
The probability of a wrong majority decision is at most

$$
\sum_{i\geq \lceil x/2\rceil}\binom{x}{i} \left(\frac{1}{2}-\epsilon\right)^{i}\left(\frac{1}{2}+\epsilon\right)^{x-i}
$$
|
||||
|
||||
I want to choose $x$ such that this is $\leq \delta$.
|
||||
|
||||
> $$(1-p)^{\frac{1}{p}}\leq e^{-1}$$
|
||||
|
||||
So

$$
\begin{aligned}
\sum_{i\geq \lceil x/2\rceil}\binom{x}{i}\left(\frac{1}{2}-\epsilon\right)^{i}\left(\frac{1}{2}+\epsilon\right)^{x-i}
&\leq 2^x\left(\frac{1}{2}-\epsilon\right)^{x/2}\left(\frac{1}{2}+\epsilon\right)^{x/2}\\
&=\left(1-4\epsilon^2\right)^{x/2}\\
&\leq e^{-2\epsilon^2 x}
\end{aligned}
$$
|
||||
|
||||
We use this to solve for $x$.
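Setting $e^{-2\epsilon^2 x}\leq\delta$ and solving gives $x\geq\frac{\ln(1/\delta)}{2\epsilon^2}$, so a number of repetitions logarithmic in $1/\delta$ (and quadratic in $1/\epsilon$) suffices.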
|
||||
content/CSE347/CSE347_L2.md (new file, 334 lines)
@@ -0,0 +1,334 @@
|
||||
# Lecture 2
|
||||
|
||||
## Divide and conquer
|
||||
|
||||
Review of CSE 247
|
||||
|
||||
1. Divide the problem into (generally equal) smaller subproblems
|
||||
2. Recursively solve the subproblems
|
||||
3. Combine the solutions of subproblems to get the solution of the original problem
|
||||
- Examples: Merge Sort, Binary Search
|
||||
|
||||
Recurrence
|
||||
|
||||
Master Method:
|
||||
|
||||
$$
|
||||
T(n)=aT(\frac{n}{b})+\Theta(f(n))
|
||||
$$
|
||||
|
||||
### Example 1: Multiplying 2 numbers
|
||||
|
||||
Normal Algorithm:
|
||||
|
||||
```python
def multiply(x, y):
    # repeated addition: add x to the running total y times
    p = 0
    for _ in range(y):
        p += x
    return p
```
|
||||
|
||||
divide and conquer approach
|
||||
|
||||
```python
def multiply(x, y):
    n = max(x.bit_length(), y.bit_length())
    if n <= 1:
        return x * y
    half = n // 2
    xh, xl = x >> half, x & ((1 << half) - 1)
    yh, yl = y >> half, y & ((1 << half) - 1)
    return (multiply(xh, yh) << (2 * half)) \
        + ((multiply(xh, yl) + multiply(yh, xl)) << half) \
        + multiply(xl, yl)
```
|
||||
|
||||
$$
|
||||
T(n)=4T(n/2)+\Theta(n)=\Theta(n^2)
|
||||
$$
|
||||
|
||||
Not a useful optimization
|
||||
|
||||
But,
|
||||
|
||||
$$
multiply(xh,yl)+multiply(yh,xl)=multiply(xh-xl,yl-yh)+multiply(xh,yh)+multiply(xl,yl)
$$
|
||||
|
||||
```python
def multiply(x, y):
    n = max(x.bit_length(), y.bit_length())
    if n <= 1:
        return x * y
    half = n // 2
    xh, xl = x >> half, x & ((1 << half) - 1)
    yh, yl = y >> half, y & ((1 << half) - 1)
    zhh = multiply(xh, yh)
    zll = multiply(xl, yl)
    # xh*yl + xl*yh = (xh - xl)*(yl - yh) + zhh + zll; handle signs so the
    # recursion only ever sees non-negative inputs
    a, b = xh - xl, yl - yh
    sign = -1 if (a < 0) != (b < 0) else 1
    zmid = sign * multiply(abs(a), abs(b)) + zhh + zll
    return (zhh << (2 * half)) + (zmid << half) + zll
```
|
||||
|
||||
$$
|
||||
T(n)=3T(n/2)+\Theta(n)=\Theta(n^{\log_2 3})\approx \Theta(n^{1.58})
|
||||
$$
|
||||
|
||||
### Example 2: Closest Pairs
|
||||
|
||||
Input: $P$ is a set of $n$ points in the plane. $p_i=(x_i,y_i)$
|
||||
|
||||
$$
|
||||
d(p_i,p_j)=\sqrt{(x_i-x_j)^2+(y_i-y_j)^2}
|
||||
$$
|
||||
|
||||
Goal: Find the distance between the closest pair of points.
|
||||
|
||||
Naive algorithm: iterate over all pairs, $\Theta(n^2)$ time.
|
||||
|
||||
Divide and conquer algorithm:
|
||||
|
||||
Preprocessing: Sort $P$ by $x$ coordinate to get $P_x$.
|
||||
|
||||
Base case:
|
||||
|
||||
- 1 point: closest distance $=\infty$
- 2 points: closest distance $=d(p_1,p_2)$
|
||||
|
||||
Divide Step:
|
||||
|
||||
Compute mid point and get $Q, R$.
|
||||
|
||||
Recursive step:
|
||||
|
||||
- $d_l$ closest pair in $Q$
|
||||
- $d_r$ closest pair in $R$
|
||||
|
||||
Combine step:
|
||||
|
||||
Calculate $d_c$ closest point such that one point is on the left side and the other is on the right.
|
||||
|
||||
return $min(d_c,d_l,d_r)$
|
||||
|
||||
Total runtime:
|
||||
|
||||
$$
|
||||
T(n)=2T(n/2)+\Theta(n^2)
|
||||
$$
|
||||
|
||||
Still no change.
|
||||
|
||||
Important Insight: Can reduce the number of checks
|
||||
|
||||
**Lemma:** If all points within this square are at least $\delta=min\{d_r,d_l\}$ apart, there are at most 4 points in this square.
|
||||
|
||||
A better algorithm:
|
||||
|
||||
1. Divide $P_x$ into 2 halves using the mid point
|
||||
2. Recursively compute $d_l$ and $d_r$, and take $\delta=\min(d_l,d_r)$.
|
||||
3. Filter points into y-strip: points which are within $(mid_x-\delta,mid_x+\delta)$
|
||||
4. Sort y-strip by y coordinate. For every point $p$, we look at this y-strip in sorted order starting at this point and stop when we see a point with y coordinate $>p_y +\delta$
|
||||
|
||||
```python
# d is the distance function
def closestP(P, d):
    Px = sorted(P, key=lambda p: p[0])

    def closestPRec(P):
        n = len(P)
        if n == 1:
            return float('inf')
        if n == 2:
            return d(P[0], P[1])
        Q, R = P[:n // 2], P[n // 2:]
        midx = R[0][0]
        dl, dr = closestPRec(Q), closestPRec(R)
        dc = min(dl, dr)
        # keep only points inside the vertical strip of width 2*dc around midx
        ys = [p for p in P if midx - dc < p[0] < midx + dc]
        # sort the strip by y coordinate (O(n log n) per level)
        ys.sort(key=lambda p: p[1])
        yn = len(ys)
        for i in range(yn):
            for j in range(i + 1, yn):
                # stop once the y-gap exceeds dc; only O(1) points are checked per i
                if ys[j][1] - ys[i][1] > dc:
                    break
                dc = min(dc, d(ys[i], ys[j]))
        return dc

    return closestPRec(Px)
```
|
||||
|
||||
Runtime analysis:
|
||||
|
||||
$$
|
||||
T(n)=2T(n/2)+\Theta(n\log n)=\Theta(n\log^2 n)
|
||||
$$
|
||||
|
||||
We can do even better by presorting Y
|
||||
|
||||
1. Divide $P_x$ into 2 halves using the mid point
|
||||
2. Recursively compute $d_l$ and $d_r$, and take $\delta=\min(d_l,d_r)$.
|
||||
3. Filter points into y-strip: points which are within $(mid_x-\delta,mid_x+\delta)$ by visiting presorted $P_y$
|
||||
|
||||
```python
# d is the distance function
def closestP(P, d):
    Px = sorted(P, key=lambda p: p[0])
    Py = sorted(P, key=lambda p: p[1])

    def closestPRec(P):
        n = len(P)
        if n == 1:
            return float('inf')
        if n == 2:
            return d(P[0], P[1])
        Q, R = P[:n // 2], P[n // 2:]
        midx = R[0][0]
        dl, dr = closestPRec(Q), closestPRec(R)
        dc = min(dl, dr)
        # the strip is built from the presorted Py, so no per-level sort is needed
        ys = [p for p in Py if midx - dc < p[0] < midx + dc]
        yn = len(ys)
        for i in range(yn):
            for j in range(i + 1, yn):
                # stop once the y-gap exceeds dc; only O(1) points are checked per i
                if ys[j][1] - ys[i][1] > dc:
                    break
                dc = min(dc, d(ys[i], ys[j]))
        return dc

    return closestPRec(Px)
```
|
||||
|
||||
Runtime analysis:
|
||||
|
||||
$$
|
||||
T(n)=2T(n/2)+\Theta(n)=\Theta(n\log n)
|
||||
$$
|
||||
|
||||
## In-person lectures
|
||||
|
||||
$$
|
||||
T(n)=aT(n/b)+f(n)
|
||||
$$
|
||||
|
||||
$a$ is number of sub problems, $n/b$ is size of subproblems, $f(n)$ is the cost of divide and combine cost.
|
||||
|
||||
### Example 3: Max Contiguous Subsequence Sum (MCSS)
|
||||
|
||||
Given: array of integers (positive or negative), $S=[s_1,s_2,...,s_n]$
|
||||
|
||||
Return: $\max\{\sum^j_{k=i} s_k \mid 1\leq i\leq j\leq n\}$
|
||||
|
||||
Trivial solution:
|
||||
|
||||
brute force
|
||||
$O(n^3)$
|
||||
|
||||
A bit better solution:
|
||||
|
||||
$O(n^2)$ use prefix sum to reduce cost for sum.
|
||||
|
||||
Divide and conquer solution.
|
||||
|
||||
```python
def MCSS(S):
    # brute-force middle helper: best sum of a subsequence of S[i:j] that
    # starts at an index < mid and ends at an index >= mid
    def MCSSMid(S, i, j, mid):
        res = float('-inf')
        for l in range(i, mid):
            curS = 0
            for r in range(l, j):
                curS += S[r]
                if r >= mid:
                    res = max(res, curS)
        return res

    def MCSSRec(i, j):
        if j - i == 1:
            return S[i]
        mid = (i + j) // 2
        L, R = MCSSRec(i, mid), MCSSRec(mid, j)
        C = MCSSMid(S, i, j, mid)
        return max(L, C, R)

    return MCSSRec(0, len(S))
```
|
||||
|
||||
If `MCSSMid(S,i,j,mid)` use trivial solution, the running time is:
|
||||
|
||||
$$
|
||||
T(n)=2T(n/2)+O(n^2)=\Theta(n^2)
|
||||
$$
|
||||
|
||||
and we did nothing.
|
||||
|
||||
Observations: Any contiguous subsequence that starts on the left and ends on the right can be split into two parts as `sum(S[i:j])=sum(S[i:mid])+sum(S[mid:j])`
|
||||
|
||||
and let $LS$ be the subsequence that has the largest sum that ends at mid, and $RS$ be the subsequence that has the largest sum on the right that starts at mid.
|
||||
|
||||
**Lemma:** The biggest subsequence that contains `S[mid]` is $LS+RS$.
|
||||
|
||||
Proof:
|
||||
|
||||
By contradiction,
|
||||
|
||||
Let $x=LS+RS$. Assume for the sake of contradiction that some subsequence containing `S[mid]` has sum $y=L'+R'>x$, where $L'$ ends at mid and $R'$ starts right after mid.

Let $z=LS+R'$. Since $LS\geq L'$ by definition of $LS$, we have $z\geq y$. Similarly, $RS\geq R'$ gives $x=LS+RS\geq z\geq y$, which contradicts $y>x$.
|
||||
|
||||
Optimized function as follows:
|
||||
|
||||
```python
def MCSS(S):
    # linear-time middle helper: best crossing sum centred on S[mid], i.e.
    # S[mid] + best (possibly empty) suffix ending at mid-1
    #        + best (possibly empty) prefix starting at mid+1
    def MCSSMid(S, i, j, mid):
        res = S[mid]
        LS, RS = 0, 0
        cl, cr = 0, 0
        for l in range(mid - 1, i - 1, -1):
            cl += S[l]
            LS = max(LS, cl)
        for r in range(mid + 1, j):
            cr += S[r]
            RS = max(RS, cr)
        return res + LS + RS

    def MCSSRec(i, j):
        if j - i == 1:
            return S[i]
        mid = (i + j) // 2
        L, R = MCSSRec(i, mid), MCSSRec(mid, j)
        # any subsequence crossing the split contains S[mid-1]
        C = MCSSMid(S, i, j, mid - 1)
        return max(L, C, R)

    return MCSSRec(0, len(S))
```
|
||||
|
||||
The running time is:
|
||||
|
||||
$$
|
||||
T(n)=2T(n/2)+O(n)=\Theta(n\log n)
|
||||
$$
|
||||
|
||||
Strengthening the recursion (returning extra information from each call):
|
||||
|
||||
```python
def MCSS(S):
    # returns (best, best prefix sum, best suffix sum, total sum) for S[i:j]
    def MCSSRec(i, j):
        if j - i == 1:
            return S[i], S[i], S[i], S[i]
        mid = (i + j) // 2
        L, lp, ls, sl = MCSSRec(i, mid)
        R, rp, rs, sr = MCSSRec(mid, j)
        return max(L, R, ls + rp), max(lp, sl + rp), max(rs, sr + ls), sl + sr

    return MCSSRec(0, len(S))[0]
```
|
||||
|
||||
Precomputed version (suffix sums computed once up front, so the combine step is $O(1)$):
|
||||
|
||||
```python
def MCSS(S):
    n = len(S)
    # sfx[i] = sum(S[i:]) with sfx[n] = 0, precomputed once in O(n)
    sfx = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        sfx[i] = sfx[i + 1] + S[i]

    # returns (best, best prefix sum, best suffix sum) for S[i:j]
    def MCSSRec(i, j):
        if j - i == 1:
            return S[i], S[i], S[i]
        mid = (i + j) // 2
        L, lp, ls = MCSSRec(i, mid)
        R, rp, rs = MCSSRec(mid, j)
        left_total = sfx[i] - sfx[mid]    # sum(S[i:mid])
        right_total = sfx[mid] - sfx[j]   # sum(S[mid:j])
        return max(L, R, ls + rp), max(lp, left_total + rp), max(rs, right_total + ls)

    return MCSSRec(0, n)[0]
```
|
||||
|
||||
$$
|
||||
T(n)=2T(n/2)+O(1)=\Theta(n)
|
||||
$$
|
||||
content/CSE347/CSE347_L3.md (new file, 161 lines)
@@ -0,0 +1,161 @@
|
||||
# Lecture 3
|
||||
|
||||
## Dynamic programming
|
||||
|
||||
When we cannot find a good Greedy Choice, the only thing we can do is to iterate all choices.
|
||||
|
||||
### Example 1: Edit distance
|
||||
|
||||
Input: 2 sequences of some character set, e.g.
|
||||
|
||||
$S=ABCADA$, $T=ABADC$
|
||||
|
||||
Goal: Compute the minimum number of **insertions or deletions** needed to convert $S$ into $T$
|
||||
|
||||
We will call it `Edit Distance(S[1...n],T[1...m])`. where `n` and `m` be the length of `S` and `T` respectively.
|
||||
|
||||
Idea: compute the difference between the sequences.

Observe: The first difference appears at index 3. In this short example it is obvious that deleting 'C' is better, but for a long sequence we do not know what the rest looks like, so it is hard to decide whether to insert 'A' or delete 'C'.
|
||||
|
||||
Use branching algorithm:
|
||||
|
||||
```python
def editDist(S, T, i, j):
    if len(S) <= i:
        # only insertions of T's remaining characters are needed
        return len(T) - j
    if len(T) <= j:
        # only deletions of S's remaining characters are needed
        return len(S) - i
    if S[i] == T[j]:
        return editDist(S, T, i + 1, j + 1)
    else:
        # delete S[i] or insert T[j], each at cost 1
        return 1 + min(editDist(S, T, i + 1, j), editDist(S, T, i, j + 1))
```
|
||||
|
||||
Correctness Proof Outline:
|
||||
|
||||
- ~~Greedy Choice Property~~
|
||||
|
||||
- Complete Choice Property:
|
||||
- The optimal solution makes **one** of the choices that we consider
|
||||
- Inductive Structure:
|
||||
- Once you make **any** choice, you are left with a smaller problem of the same type. **Any** first choice + **feasible** solution to the subproblem = feasible solution to the entire problem.
|
||||
- Optimal Substructure:
|
||||
- If we optimally solve the subproblem for **a particular choice c**, and combine it with c, resulting solution is the **optimal solution that makes choice c**.
|
||||
|
||||
Correctness Proof:
|
||||
|
||||
Claim: For any problem $P$, the branching algorithm finds the optimal solution.
|
||||
|
||||
Proof: Induct on problem size
|
||||
|
||||
- Base case: $|S|=0$ or $|T|=0$, obvious
|
||||
- Inductive Case: By inductive hypothesis: Branching algorithm works for all smaller problems, either $S$ is smaller or $T$ is smaller or both
|
||||
- For each choice we make, we got a strictly smaller problem: by inductive structure, and the answer is correct by inductive hypothesis.
|
||||
- By Optimal substructure, we know for any choice, the solution of branching algorithm for subproblem and the choice we make is an optimal solution for that problem.
|
||||
- Using Complete choice property, we considered all the choices.
|
||||
|
||||
Looking at the recursion tree: its leftmost and rightmost paths have depth about $n$, while paths through the middle have depth up to about $2n$. The tree branches at every mismatch, so the running time is $\Omega(2^n)$ in the worst case.
|
||||
|
||||
#### How could we reduce the complexity?
|
||||
|
||||
There are **overlapping subproblems** that we compute more than once! Number of distinct subproblems is polynomial, we can **share the solution** that we have already computed!
|
||||
|
||||
**Store the result of each subproblem in a 2D array.**
|
||||
|
||||
Use dp:
|
||||
|
||||
```python
def editDist(S, T):
    m, n = len(S), len(T)
    # dp[i][j] = edit distance between S[i:] and T[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        dp[m][j] = n - j          # only insertions remain
    for i in range(m + 1):
        dp[i][n] = m - i          # only deletions remain
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if S[i] == T[j]:
                dp[i][j] = dp[i + 1][j + 1]
            else:
                # assuming the cost of an insertion or a deletion is 1
                dp[i][j] = min(1 + dp[i][j + 1], 1 + dp[i + 1][j])
    return dp[0][0]
```
|
||||
|
||||
We can use backtracking over the table to recover how the final answer was reached. The runtime is the time needed to fill the table, $T(n,m)=\Theta(mn)$.
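For the running example $S=ABCADA$, $T=ABADC$, the table gives an edit distance of $3$ (for instance: delete `C`, delete the final `A`, insert `C`).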
|
||||
|
||||
### Example 2: Weighted Interval Scheduling (IS)
|
||||
|
||||
Input: $P=\{p_1,p_2,...,p_n\}$, $p_i=\{s_i,f_i,w_i\}$
|
||||
$s_i$ is the start time, $f_i$ is the finish time, $w_i$ is the weight of the task for job $i$
|
||||
|
||||
Goal: Pick a set of **non-overlapping** intervals $\Pi$ such that $\sum_{p_i\in \Pi} w_i$ is maximized.
|
||||
|
||||
Trivial solution ($T(n)=O(2^n)$)
|
||||
|
||||
```python
# p = [[s_i, f_i, w_i], ...]
def intervalScheduling(p):
    p.sort()                      # sort by start time
    n = len(p)

    def rec(idx, last_finish):
        # best total weight using intervals idx..n-1, given the finish time
        # of the last interval already chosen
        if idx >= n:
            return 0
        # skip interval idx
        best = rec(idx + 1, last_finish)
        # or take it, if it starts after the last chosen interval finishes
        if p[idx][0] >= last_finish:
            best = max(best, p[idx][2] + rec(idx + 1, p[idx][1]))
        return best

    return rec(0, float('-inf'))
```
|
||||
|
||||
Using dp ($T(n)=O(n^2)$)
|
||||
|
||||
```python
import bisect

def intervalScheduling(p):
    p.sort()                      # sort by start time
    n = len(p)
    dp = [0] * (n + 1)            # dp[i] = best weight using intervals i..n-1
    for i in range(n - 1, -1, -1):
        # initial best case: skip interval i
        dp[i] = dp[i + 1]
        _, e, w = p[i]
        # take interval i, then continue with any interval starting at or after e
        # (bisect with key= requires Python 3.10+)
        for j in range(bisect.bisect_left(p, e, key=lambda x: x[0]), n + 1):
            dp[i] = max(dp[i], w + dp[j])
    return dp[0]
```
|
||||
|
||||
### Example 3: Subset sums
|
||||
|
||||
Input: a set $S$ of positive and unique integers and another integer $K$.
|
||||
|
||||
Problem: Is there a subset $X\subseteq S$ such that $sum(X)=K$
|
||||
|
||||
Brute force takes $O(2^n)$.
|
||||
|
||||
```python
def subsetSum(arr, i, k) -> bool:
    if i >= len(arr):
        return k == 0
    # either take arr[i] or skip it
    return subsetSum(arr, i + 1, k - arr[i]) or subsetSum(arr, i + 1, k)
```
|
||||
|
||||
Using dp $O(nk)$
|
||||
|
||||
```python
|
||||
def subsetSum(arr,k)->bool:
|
||||
n=len(arr)
|
||||
dp=[False]*(k+1)
|
||||
dp[0]=True
|
||||
for e in arr:
|
||||
ndp=[]
|
||||
for i in range(k+1):
|
||||
ndp.append(dp[i])
|
||||
if i-e>=0:
|
||||
ndp[i]|=dp[i-e]
|
||||
dp=ndp
|
||||
return dp[-1]
|
||||
```
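For example, `subsetSum([3, 34, 4, 12, 5, 2], 9)` should return `True`, since $4+5=9$.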
|
||||
content/CSE347/CSE347_L4.md (new file, 321 lines)
@@ -0,0 +1,321 @@
|
||||
# Lecture 4
|
||||
|
||||
## Maximum Flow
|
||||
|
||||
### Example 1: Ship cement from factory to building
|
||||
|
||||
Input $s$: source, $t$: destination
|
||||
|
||||
Graph with **directed** edges weights on each edge: **capacity**
|
||||
|
||||
**Goal:** Ship as much stuff as possible while obeying the capacity constraints.
|
||||
|
||||
Graph: $(V,E)$ directed and weighted
|
||||
|
||||
- Unique source and sink nodes $\to s, t$
|
||||
- Each edge has capacity $c(e)$ [Integer]
|
||||
|
||||
A valid flow assignment assigns an integer $f(e)$ to each edge s.t.
|
||||
|
||||
Capacity constraint: $0\leq f(e)\leq c(e)$
|
||||
|
||||
Flow conservation:
|
||||
|
||||
$$
|
||||
\sum_{e\in E_{in}(v)}f(e)=\sum_{e\in E_{out}(v)}f(e),\forall v\in V-{s,t}
|
||||
$$
|
||||
|
||||
$E_{in}(v)$: set of incoming edges to $v$
|
||||
$E_{out}(v)$: set of outgoing edges from $v$
|
||||
|
||||
Compute: Maximum Flow: Find a valid flow assignment to
|
||||
|
||||
Maximize $|F|=\sum_{e\in E_{in}(t)}f(e)=\sum_{e\in E_{out}(s)}f(e)$ (total units received by end and sent by source)
|
||||
|
||||
Additional assumptions
|
||||
|
||||
1. $s$ has no incoming edges, $t$ has no outgoing edges
|
||||
2. You do not have a cycle of 2 nodes
|
||||
|
||||
A proposed algorithm:
|
||||
|
||||
1. Find a path from $s$ to $t$
|
||||
2. Push as much flow along the path as possible
|
||||
3. Adjust the capacities
|
||||
4. Repeat until we cannot find a path
|
||||
|
||||
**Residual Graph:** If there is an edge $e=(u,v)$ in $G$, we will add a back edge $\bar{e}=(v,u)$. Capacity of $\bar{e}=$ flow on $e$. Call this graph $G_R$.
|
||||
|
||||
Algorithm:
|
||||
|
||||
- Find an "augmenting path" $P$.
|
||||
- $P$ can contain forward or backward edges!
|
||||
- Say the smallest residual capacity along the path is $k$.
|
||||
- Push $k$ flow on the path ($f(e) =f(e) + k$ for all edges on path $P$)
|
||||
- Reduce the capacity of all edges on the path $P$ by $k$
|
||||
- **Increase** the capacity of the corresponding mirror/back edges
|
||||
- Repeat until there are no augmenting paths
|
||||
|
||||
### Formalize: Ford-Fulkerson (FF) Algorithm
|
||||
|
||||
1. Initialize the residual graph $G_R=G$
|
||||
2. Find an augmenting path $P$ with capacity $k$ (min capacity of any edge on $P$)
|
||||
3. Fix up the residual capacities in $G_R$
|
||||
- $c(e)=c(e)-k,\forall e\in P$
|
||||
- $c(\bar{e})=c(\bar{e})+k,\forall \bar{e}\in P$
|
||||
4. Repeat 2 and 3 until no augmenting path can be found in $G_R$.
|
||||
|
||||
```python
from collections import defaultdict

def ford_fulkerson_algo(G, n, s, t):
    """
    Args:
        G: adjacency list; G[u] is a list of (v, capacity) pairs
        n: number of vertices in the graph
        s: source vertex of the flow
        t: sink vertex of the flow
    Returns:
        the value of a maximum s-t flow
    """
    # residual capacities: cap[u][v] is the remaining capacity of edge (u, v);
    # back edges start at 0 and grow as flow is pushed
    cap = [defaultdict(int) for _ in range(n)]
    for u in range(n):
        for v, c in G[u]:
            cap[u][v] += c
    max_flow = 0
    while True:
        # find an augmenting path with a DFS over edges of positive residual capacity
        parent = [None] * n
        visited = {s}
        stack = [s]
        while stack:
            u = stack.pop()
            for v, c in cap[u].items():
                if c > 0 and v not in visited:
                    visited.add(v)
                    parent[v] = u
                    stack.append(v)
        if t not in visited:
            return max_flow
        # bottleneck residual capacity along the path
        path = []
        v = t
        while v != s:
            path.append((parent[v], v))
            v = parent[v]
        k = min(cap[u][v] for u, v in path)
        # push k units: decrease forward residual capacities, increase back edges
        for u, v in path:
            cap[u][v] -= k
            cap[v][u] += k
        max_flow += k
```
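A tiny sanity check of the interface assumed above (adjacency lists of `(vertex, capacity)` pairs; illustration only):

```python
# s=0, t=3, two disjoint unit-capacity paths, so the max flow is 2
G = [[(1, 1), (2, 1)], [(3, 1)], [(3, 1)], []]
print(ford_fulkerson_algo(G, 4, 0, 3))  # 2
```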
|
||||
|
||||
#### Proof of Correctness: Valid Flow
|
||||
|
||||
**Lemma 1:** FF finds a valid flow
|
||||
|
||||
- Capacity and conservation constrains are not violated
|
||||
- Capacity constraint: $0\leq f(e)\leq c(e)$
|
||||
- Flow conservation: $\sum_{e\in E_{in}(v)}f(e)=\sum_{e\in E_{out}(v)}f(e),\forall v\in V-\{s,t\}$
|
||||
|
||||
Proof: We proceed by induction on **augmenting paths**
|
||||
|
||||
##### Base Case
|
||||
|
||||
$f(e)=0$ on all edges
|
||||
|
||||
##### Inductive Case
|
||||
|
||||
By inductive hypothesis, we have a valid flow and the corresponding residual graph $G_R$.
|
||||
|
||||
Inductive Step:
|
||||
|
||||
Now we find an augmented path $P$ in $GR$, pushed $k$ (which is the smallest edge capacity on $P$). Argue that the constraints are not violated.
|
||||
|
||||
**Capacity Constrains:** Consider an edge $e$ in $P$.
|
||||
|
||||
- If $e$ is an forward edge (in the original graph)
|
||||
- by construction of $G_R$, it had left over capacities.
|
||||
- If $e$ is an back edge with residual capacity $\geq k$
|
||||
- flow on real edge reduces, but the real capacity is still $\geq 0$, no capacity constrains violation.
|
||||
|
||||
**Conservation Constrains:** Consider a vertex $v$ on path $P$
|
||||
|
||||
1. Both forward edges
|
||||
- No violation, push $k$ flow into $v$ and out.
|
||||
2. Both back edges
|
||||
- No violation, push $k$ less flow into $v$ and out.
|
||||
3. Redirecting flow
|
||||
- No violation, change of $0$ by $k-k$ on $v$.
|
||||
|
||||
#### Proof of Correctness: Termination
|
||||
|
||||
**Lemma 2:** FF terminate
|
||||
|
||||
Proof:
|
||||
|
||||
Every time it finds an augmenting path that increases the total flow.
|
||||
|
||||
Must terminate either when it finds a max flow or before.
|
||||
|
||||
Each iteration we use $\Theta(m+n)$ to find a valid path.
|
||||
|
||||
The number of iterations is at most $|F|$, so the total is $O(|F|(m+n))$, which is not polynomial in the input size since $|F|$ can be exponential in the number of bits used to encode the capacities.
|
||||
|
||||
#### Proof of Correctness: Optimality
|
||||
|
||||
From Lemma 1 and 2, we know that FF returns a feasible solution, but does it return the **maximum** flow?
|
||||
|
||||
##### Max-flow Min-cut Theorem
|
||||
|
||||
Given a graph $G(V,E)$, a **graph cut** is a partition of vertices into 2 subsets.
|
||||
|
||||
- $S$: $s$ + maybe some other vertices
|
||||
- $V-S$: $t$ + maybe some other vertices
|
||||
|
||||
Define capacity of the cut be the sum of capacity of edges that go from a vertex in $S$ to a vertex in $T$.
|
||||
|
||||
**Lemma 3:** For all valid flows $f$, $|f|\leq C(S)$ for all cut $S$ (Max-flow $\leq$ Min-cut)
|
||||
|
||||
Proof: all flow must go through one of the cut edges.
|
||||
|
||||
**Min-cut:** cut of smallest capacity, $S^*$. $|f|\leq C(S^*)$
|
||||
|
||||
**Lemma 4:** FF produces a flow $=C(S^*)$
|
||||
|
||||
Proof: Let $\hat{f}$ be the flow found by FF. No augmenting paths remain in $G_R$.
|
||||
|
||||
Let $\hat{S}$ be all vertices that can be reached from $s$ using edges with capacities $>0$.
|
||||
|
||||
All forward edges leaving $\hat{S}$ are saturated (otherwise their endpoints would also be reachable), and every edge entering $\hat{S}$ carries zero flow (otherwise its back edge would have residual capacity $>0$ and its tail would be reachable). Therefore $|\hat{f}|=C(\hat{S})$, and by Lemma 3 this means $\hat{f}$ is a maximum flow and $\hat{S}$ is a minimum cut.
|
||||
|
||||
### Example 2: Bipartite Matching
|
||||
|
||||
input: Given $n$ classes and $n$ rooms; we want to match classes to rooms.
|
||||
|
||||
Bipartite graph $G=(V,E)$ (unweighted and undirected)
|
||||
|
||||
- Vertices are either in set $L$ or $R$
|
||||
- Edges only go between vertices of different sets
|
||||
|
||||
Matching: A subset of edges $M\subseteq E$ s.t.
|
||||
|
||||
- Each vertex has at most one edge from $M$ incident on it.
|
||||
|
||||
Maximum Matching: matching of the largest size.
|
||||
|
||||
We will reduce the problem to the problem of finding the maximum flow
|
||||
|
||||
#### Reduction
|
||||
|
||||
Given a bipartite graph $G=(V,E)$, construct a graph $G'=(V',E')$ such that
|
||||
|
||||
$$
|
||||
|\text{max-flow}(G')|=|\text{maximum matching}(G)|
|
||||
$$
|
||||
|
||||
Let $s$ connects to all vertices in $L$ and all vertex in $R$ connects to $t$.
|
||||
|
||||
$G'=G+s+t+$ the added edges from $s$ to every vertex in $L$ and from every vertex in $R$ to $t$, with capacity $1$ on every edge.
|
||||
|
||||
#### Proof of correctness
|
||||
|
||||
Claim: $G'$ has a flow of $k$ iff $G$ has a matching of size $k$
|
||||
|
||||
Proof: Two directions:
|
||||
|
||||
1. Say $G$ has a matching of size $k$, we want to prove $G'$ has a flow of size $k$.
|
||||
2. Say $G'$ has a flow of size $k$, we want to prove $G$ has a matching of size $k$.
|
||||
|
||||
## Conclusion: Maximum Flow
|
||||
|
||||
Problem input and target
|
||||
|
||||
Ford-Fulkerson Algorithm
|
||||
|
||||
- Execution: residual graph
|
||||
- Runtime
|
||||
|
||||
FF correctness proof
|
||||
|
||||
- Max-flow Min-cut Theorem
|
||||
- Graph Cut definition
|
||||
- Capacity of cut
|
||||
|
||||
Reduction to Bipartite Matching
|
||||
|
||||
### Example 3: Image Segmentation: (reduction from min-cut)
|
||||
|
||||
Given:
|
||||
|
||||
- Image consisting of an object and a background.
|
||||
- the object occupies some set of pixels $A$, while the background occupies the remaining pixels $B$.
|
||||
|
||||
Required:
|
||||
|
||||
- Separate $A$ from $B$ but if doesn't know which pixels are each.
|
||||
- For each pixel $i,p_i$ is the probability that $i\in A$
|
||||
- For each pair of adjacent pixels $i,j,c_{ij}$ is the cost of placing the object boundary between them. i.e. putting $i$ in $A$ and $j$ in $B$.
|
||||
- A segmentation of the image is an assignment of each pixel to $A$ or $B$.
|
||||
- The goal is to find a segmentation that maximizes
|
||||
|
||||
$$
|
||||
\sum_{i\in A}p_i+\sum_{i\in B}(1-p_i)-\sum_{i,j\ on \ boundary}c_{ij}
|
||||
$$
|
||||
|
||||
Solution:
|
||||
|
||||
- Let's turn our maximization into a minimization
|
||||
- If the image has $N$ pixels, then we can rewrite the objective as
|
||||
|
||||
$$
|
||||
N-\sum_{i\in A}(1-p_i)-\sum_{i\in B}p_i-\sum_{i,j\ on \ boundary}c_{ij}
|
||||
$$
|
||||
|
||||
because $N=\sum_{i\in A}p_i+\sum_{i\in A}(1-p_i)+\sum_{i\in B}p_i+\sum_{i\in B}(1-p_i)$ boundary
|
||||
|
||||
New maximization problem:
|
||||
|
||||
$$
|
||||
Max\left( N-\sum_{i\in A}(1-p_i)-\sum_{i\in B}p_i-\sum_{i,j\ on \ boundary}c_{ij}\right)
|
||||
$$
|
||||
|
||||
Now, this is equivalent ot minimizing
|
||||
|
||||
$$
|
||||
\sum_{i\in A}(1-p_i)+\sum_{i\in B}p_i+\sum_{i,j\ on \ boundary}c_{ij}
|
||||
$$
|
||||
|
||||
Second steps
|
||||
|
||||
- Form a graph with $n$ vertices, $v_i$ on for each pixel
|
||||
- Add vertices $s$ and $t$
|
||||
- For each $v_i$, add an edge $s\to v_i$ with capacity $p_i$ and an edge $v_i\to t$ with capacity $1-p_i$. Any $S-T$ cut of $G$ assigns each $v_i$ to either the $S$ side or the $T$ side.
|
||||
- The $S$ side of an $S-T$ is the $A$ side, while the $T$ side of the cur is the $B$ side.
|
||||
- Observe that if $v_i$ goes on the $S$ side, it becomes part of $A$, so the cut increases by $1-p_i$. Otherwise, it becomes part of $B$, so the cut increases by $p_i$ instead.
|
||||
- Now add edges $v_i\to v_j$ with capacity $c_{ij}$ for all adjacent pixels pairs $i,j$
|
||||
- If $v_i$ and $v_j$ end up on opposite sides of the cut (boundary), then the cut increases by $c_{ij}$.
|
||||
- Conclude that any $S-T$ cut that assigns $S\subseteq V$ to the $A$ side and $V\backslash S$ to the $B$ side pays a total of
|
||||
1. $1-p_i$ for each $v_i$ on the $A$ side
|
||||
2. $p_i$ for each $v_i$ on the $B$ side
|
||||
3. $c_{ij}$ for each adjacent pair $i,j$ that is at the boundary. i.e. $i\in S\ and\ j\in V\backslash S$
|
||||
- Conclude that any cut with capacity $c$ implies a segmentation with (transformed) objective value $c$.
- The converse can (and should) also be checked: a segmentation with objective value $c$ implies an $S-T$ cut with capacity $c$.
|
||||
|
||||
#### Algorithm
|
||||
|
||||
- Given an image with $N$ pixels, build the graph $G$ as desired.
|
||||
- Use the FF algorithm to find a minimum $S-T$ cut of $G$
|
||||
- Use this cut to assign each pixel to $A$ or $B$ as described, i.e pixels that correspond to vertices on the $S$ side are assigned to $A$ and those corresponding to vertices on the $T$ side to $B$.
|
||||
- Minimizing the cut capacity minimizes our transformed minimization objective function.
|
||||
|
||||
#### Running time
|
||||
|
||||
The graph $G$ contains $\Theta(N)$ edges, because each pixel is adjacent to at most 4 neighbors plus $s$ and $t$.
|
||||
|
||||
The FF algorithm has running time $O((m+n)|F|)$, where $|F|\leq N$ is the value of the max flow (equivalently the min-cut capacity; it is at most $N$ since every $s\to v_i$ edge has capacity at most $1$). The edge count is $m=O(N)$ (roughly $6N$).
|
||||
|
||||
So the total running time is $O(N^2)$.
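As a concrete illustration, here is a minimal sketch of this construction, using `networkx` for the min-cut computation (the library choice, the node names `"s"`/`"t"`, and the input dictionaries `probs`/`costs` are assumptions for illustration, not part of the lecture):

```python
import networkx as nx

def segment(probs, costs):
    # probs: dict pixel -> p_i (probability that the pixel belongs to A)
    # costs: dict (i, j) -> c_ij for adjacent pixel pairs
    G = nx.DiGraph()
    for i, p in probs.items():
        G.add_edge("s", i, capacity=p)       # cut if i lands on the T (B) side: pays p_i
        G.add_edge(i, "t", capacity=1 - p)   # cut if i lands on the S (A) side: pays 1 - p_i
    for (i, j), c in costs.items():
        G.add_edge(i, j, capacity=c)         # cut if i and j end up on opposite sides
        G.add_edge(j, i, capacity=c)
    cut_value, (S_side, T_side) = nx.minimum_cut(G, "s", "t")
    A = S_side - {"s"}                       # S side of the cut -> object A
    B = T_side - {"t"}                       # T side of the cut -> background B
    return A, B, cut_value
```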
|
||||
341
content/CSE347/CSE347_L5.md
Normal file
341
content/CSE347/CSE347_L5.md
Normal file
@@ -0,0 +1,341 @@
|
||||
# Lecture 5
|
||||
|
||||
## Takeaway from Bipartite Matching
|
||||
|
||||
- We saw how to solve a problem (bi-partite matching and others) by reducing it to another problem (maximum flow).
|
||||
- In general, we can design an algorithm to map instances of a new problem to instances of known solvable problem (e.g., max-flow) to solve this new problem!
|
||||
- Mapping from one problem to another which preserves solutions is called reduction.
|
||||
|
||||
## Reduction: Basic Ideas
|
||||
|
||||
Convert solutions to the known problem to the solutions to the new problem
|
||||
|
||||
- Instance of new problem
|
||||
- Instance of known problem
|
||||
- Solution of known problem
|
||||
- Solution of new problem
|
||||
|
||||
## Reduction: Formal Definition
|
||||
|
||||
Problems $L,K$.
|
||||
|
||||
$L$ reduces to $K$ ($L\leq K$) if there is a mapping $\phi$ from **any** instance $l\in L$ to some instance $\phi(l)\in K'\subset K$, such that the solution for $\phi(l)$ yields a solution for $l$.
|
||||
|
||||
This means that **L is no harder than K**
|
||||
|
||||
### Using reduction to design algorithms
|
||||
|
||||
In the example of reduction to solve Bipartite Matching:
|
||||
|
||||
$L:$ Bipartite Matching
|
||||
|
||||
$K:$ Max-flow Problem
|
||||
|
||||
Efficiency:
|
||||
|
||||
1. Reduction: $\phi:l\to\phi(l)$ (Polynomial time reduction $\phi(l)$)
|
||||
2. Solve the problem $\phi(l)$ (polynomial time in $|\phi(l)|$)
|
||||
3. Convert the solution for $\phi(l)$ to a solution to $l$ (polynomial time in $|\phi(l)|$); a schematic sketch follows
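Schematically, the three steps look like the sketch below; all names here are placeholders, not a fixed API:

```python
def solve_via_reduction(l, phi, solve_known, convert_back):
    known_instance = phi(l)                        # 1. map the new instance (poly time)
    known_solution = solve_known(known_instance)   # 2. solve the known problem
    return convert_back(l, known_solution)         # 3. convert the solution back (poly time)
```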
|
||||
|
||||
### Efficient Reduction
|
||||
|
||||
A reduction $\phi:l\to\phi(l)$ is efficient ($L\leq_p K$) if for any $l\in L$:
|
||||
|
||||
1. $\phi(l)$ is computable from $l$ in polynomial ($|l|$) time.
|
||||
2. Solution to $l$ is computable from solution of $\phi(l)$ in polynomial ($|l|$) time.
|
||||
|
||||
We say that $L$ is **poly-time reducible** to $K$, or that $L$ poly-time
|
||||
reduces to $K$.
|
||||
|
||||
### Which problem is harder?
|
||||
|
||||
Theorem: If $L\leq_p K$ and there is a polynomial time algorithm to solve $K$, then there is a polynomial time algorithm to solve $L$.
|
||||
|
||||
Proof: Given an instance $l\in L$, we solve it via the reduction; each step is polynomial with respect to the original instance $l$:
|
||||
|
||||
1. Compute $\phi(l)$: $p(l)$
|
||||
2. Solve $\phi(l)$: $p(\phi(l))$
|
||||
3. Convert solution: $p(\phi(l))$
|
||||
|
||||
Total time: $p(l)+p(\phi(l))+p(\phi(l))=p(l)+p(\phi(l))$
|
||||
Need to show: $|\phi(l)|=poly(|l|)$
|
||||
|
||||
Proof:
|
||||
|
||||
Since we can compute $\phi(l)$ in $p(|l|)$ time, and in each (constant-time) step we can write only a constant amount of data, the output cannot be larger than the running time.
|
||||
|
||||
So $|\phi(l)|=poly(|l|)$
|
||||
|
||||
## Hardness Problems
|
||||
|
||||
Reductions show the relationship between problem hardness!
|
||||
|
||||
Question: Could you solve a problem in polynomial time?
|
||||
|
||||
Easy: polynomial time solution
|
||||
Hard: No polynomial time solution (as far as we know)
|
||||
|
||||
### Types of Problems
|
||||
|
||||
Decision Problem: Yes/No answer
|
||||
|
||||
Examples: Subset sums
|
||||
|
||||
1. Is there a flow of size $F$?
|
||||
2. Is there a shortest path of length $L$ from vertex $u$ to vertex $v$.
|
||||
3. Given a set of intervals, can you schedule $k$ of them?
|
||||
|
||||
Optimization Problem: What is the value of an optimal feasible solution of a problem?
|
||||
|
||||
- Minimization: Minimize cost
|
||||
- min cut
|
||||
- minimal spanning tree
|
||||
- shortest path
|
||||
- Maximization: Maximize profit
|
||||
- interval scheduling
|
||||
- maximum flow
|
||||
- maximum matching
|
||||
|
||||
#### Canonical Decision Problem
|
||||
|
||||
Does the instance $l\in L$ (an optimization problem) have a feasible solution with objective value $k$:
|
||||
|
||||
Objective value $\geq k$ (maximization) $\leq k$ (minimization)
|
||||
|
||||
$DL$ denotes the canonical decision problem derived from the optimization problem $L$.
|
||||
|
||||
##### Hardness of Canonical Decision Problems
|
||||
|
||||
Lemma 1: $DL\leq_p L$ ($DL$ is no harder than $L$)
|
||||
|
||||
Proof: Assume $L$ is a **maximization** problem; $DL(l,k)$ asks: does $l$ have a solution with value $\geq k$?
|
||||
|
||||
Example: Does graph $G$ have flow $\geq k$.
|
||||
|
||||
Let $v^*(l)$ be the maximum objective value obtained by solving $l$.
|
||||
|
||||
Let the instance of $DL$ be $(l,k)$, where $l$ is the optimization instance and $k$ is the objective threshold.
|
||||
|
||||
1. Map $(l,k)$ to the instance $\phi(l,k)=l\in L$ (the optimization problem).
|
||||
2. Is $v^*(l)\geq k$? If so, return true, else return false.
|
||||
|
||||
Lemma 2: If $v^* =O(c^{|l|})$ for some constant $c$, then $L\leq_p DL$.
|
||||
|
||||
Proof: We show $L\leq_p DL$. Suppose $L$ is a maximization problem; the canonical decision problem asks whether there is a solution of value $\geq k$.
|
||||
|
||||
Naïve Linear Search: Ask $DL(l,1),DL(l,2),\dots$, increasing $k$ until the oracle returns false; the last $k$ that returned true is the optimum.
|
||||
|
||||
Runtime: up to $v^*(l)$ oracle queries to iterate all possibilities.
|
||||
|
||||
This is exponential! How to reduce it?
|
||||
|
||||
Our old friend Binary (exponential) Search is back!
|
||||
|
||||
Try powers of 2 until you get a no, then do binary search between the last yes and the first no.
|
||||
|
||||
\# questions: $O(\log_2 v^*(l))=poly(|l|)$
|
||||
|
||||
Binary search in area: from last yes to first no.
|
||||
|
||||
Runtime: binary search, $O(\log v^*(l))$ oracle calls.
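A minimal sketch of this exponential-plus-binary search, assuming a hypothetical decision oracle `DL(l, k)` for a maximization problem (the oracle itself is left abstract):

```python
def optimize_with_oracle(DL, l):
    # Find the largest k with DL(l, k) == True using O(log v*) oracle calls.
    if not DL(l, 1):
        return 0                     # no solution of value >= 1
    hi = 1
    while DL(l, 2 * hi):             # exponential search: find a power of 2 that is a "no"
        hi *= 2
    lo, hi = hi, 2 * hi              # invariant: DL(l, lo) is yes, DL(l, hi) is no
    while lo + 1 < hi:               # binary search between the last yes and the first no
        mid = (lo + hi) // 2
        if DL(l, mid):
            lo = mid
        else:
            hi = mid
    return lo
```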
|
||||
|
||||
### Reduction for Algorithm Design vs Hardness
|
||||
|
||||
For problems $L,K$
|
||||
|
||||
If $K$ is “easy” (exists a poly-time solution), then $L$ is also easy.
|
||||
|
||||
If $L$ is “hard” (no poly-time solution), then $K$ is also hard.
|
||||
|
||||
Every problem that we worked on so far, $K$ is “easy”, so we reduce from new problem to known problem (e.g., max-flow).
|
||||
|
||||
#### Reduction for Hardness: Independent Set (ISET)
|
||||
|
||||
Input: Given an undirected graph $G = (V,E)$,
|
||||
|
||||
A subset of vertices $S\subset V$ is called an **independent set** if no two vertices of $S$ are connected by an edge.
|
||||
|
||||
Problem: Does $G$ contain an independent set of size $\geq k$?
|
||||
|
||||
$ISET(G,k)$ returns true if $G$ contains an independent set of size $\geq k$, and false otherwise.
|
||||
|
||||
Algorithm? NO! We think that this is a hard problem.
|
||||
|
||||
A lot of people have tried and could not find a poly-time solution.
|
||||
|
||||
### Example: Vertex Cover (VC)
|
||||
|
||||
Input: Given an undirected graph $G = (V,E)$
|
||||
|
||||
A subset of vertices $C\subset V$ is called a **vertex cover** if it contains at least one endpoint of every edge.
|
||||
|
||||
Formally, for all edges $(u,v)\in E$, either $u\in C$, or $v\in C$.
|
||||
|
||||
Problem: $VC(G,j)$ returns true if $G$ has a vertex cover of size $\leq j$, and false otherwise (minimization problem)
|
||||
|
||||
Example:
|
||||
|
||||
#### How hard is Vertex Cover?
|
||||
|
||||
Claim: $ISET\leq_p VC$
|
||||
Side Note: when we prove $VC$ is hard, we prove it is no easier than $ISET$.
|
||||
|
||||
DO NOT: $VC\leq_p ISET$
|
||||
|
||||
Proof: Show that $G=(V,E)$ has an independent set of size $k$ **if and only if** the same graph (the mapped instance is not always the same graph in general reductions!) has a vertex cover of size $|V|-k$.
|
||||
|
||||
Map:
|
||||
|
||||
$$
|
||||
ISET(G,k)\to VC(G,|V|-k)
|
||||
$$
|
||||
|
||||
$G'=G$
|
||||
|
||||
##### Proof of reduction: Direction 1
|
||||
|
||||
Claim 1: $ISET$ of size $k\to$ $VC$ of size $|V|-k$
|
||||
|
||||
Proof: Assume $G$ has an $ISET$ of size $k:S$, consider $C = V-S,|C|=|V|-k$
|
||||
|
||||
Claim: $C$ is a vertex cover
|
||||
|
||||
##### Proof of reduction: Direction 2
|
||||
|
||||
|
||||
Claim 2: $VC$ of size $|V|-k\to ISET$ of size $k$
|
||||
|
||||
Proof: Assume $G$ has an $VC$ of size $|V| −k:C$, consider $S = V − C, |S| =k$
|
||||
|
||||
Claim: $S$ is an independent set
|
||||
|
||||
### What does poly-time mean?
|
||||
|
||||
Algorithm runs in time polynomial to input size.
|
||||
|
||||
- If the input has $n$ items, an algorithm that runs in $\Theta(n^c)$ for some constant $c$ is poly-time.
|
||||
- Examples: intervals to schedule, number of integers to sort, # vertices + # edges in a graph
|
||||
- Numerical Value (Integer $n$), what is the input size?
|
||||
- Examples: weights, capacity, total time, flow constraints
|
||||
- It is not straightforward!
|
||||
|
||||
### Real time complexity of F-F?
|
||||
|
||||
In class: $O(F( |V| + |E|))$
|
||||
|
||||
- $|V| + |E|$ = this much space to represent the graph
|
||||
- $F$ : size of the maximum flow.
|
||||
|
||||
If every edge has capacity at most $C$, then $F = O(C|E|)$
|
||||
Running time: $O(C|E|(|V| + |E|))$
|
||||
|
||||
### What is the actual input size?
|
||||
|
||||
Each edge ($|E|$ edges):
|
||||
|
||||
- 2 vertices: $|V|$ distinct symbol, $\log |V|$ bits per symbol
|
||||
- 1 capacity: $\log C$
|
||||
|
||||
Size of graph:
|
||||
|
||||
- $O(|E|(\log|V| + \log C))$
|
||||
- $p( |E| , |V| , \log C)$
|
||||
|
||||
Running time:
|
||||
|
||||
- $P( |E| , |V| , |C| )$
|
||||
- Exponential in the input size if $C$ is exponential in $|V|+|E|$ (which it can be, since only $\log C$ bits are needed to write $C$)
|
||||
|
||||
### Pseudo-polynomial
|
||||
|
||||
Naïve Ford-Fulkerson is bad!
|
||||
|
||||
A problem's input contains some numerical value, say $W$. We need only $\log W$ bits to store it. If an algorithm runs in $p(W)$ time, then it is exponential in the input size; we call it **pseudo-polynomial**.
|
||||
|
||||
In homework, you improved F-F to make it work in
|
||||
$p( |V| ,|E| , \log C)$, to make it a real polynomial algorithm.
|
||||
|
||||
## Conclusion: Reductions
|
||||
|
||||
- Reduction
|
||||
- Construction of mapping with runtime
|
||||
- Bidirectional proof
|
||||
- Efficient Reduction $L\leq p(K)$
|
||||
- Which problem is harder?
|
||||
- If $L$ is hard, then $K$ is hard. $\to$ Used to show hardness
|
||||
- If $K$ is easy, then $L$ is easy. $\to$ Used for design algorithms
|
||||
- Canonical Decision Problem
|
||||
- Reduction to and from the optimization problem
|
||||
- Reduction for hardness
|
||||
- Independent Set $\leq_p$ Vertex Cover
|
||||
|
||||
## On class
|
||||
|
||||
Reduction: $V^* = O(c^k)$
|
||||
|
||||
OPT: Find max flow of at least one instance $(G,s,t)$
|
||||
|
||||
DEC: Is there a flow of size $\geq k$, given $G,s,t$? $\implies$ the instance is defined by the tuple $(G,s,t,k)$
|
||||
|
||||
Yes, if there exists one
|
||||
No, otherwise
|
||||
|
||||
Forget about F-F and assume that you have an oracle that solves the decision problem.
|
||||
|
||||
First solution (the naive solution): iterate over $k = 1, 2, \dots$ until the oracle returns false; the largest $k$ for which it returned true is the max flow.
|
||||
|
||||
Time complexity: $K\cdot X$, where $X$ is the time complexity of the oracle
|
||||
Input size: $poly(|V|,|E|,|E|\log(\text{maxCapacity}))$, and $V^* \leq \sum \text{capacities}$
|
||||
|
||||
A better solution: do a binary search. If there is no upper bound, we use exponential binary search instead. Then,
|
||||
|
||||
$$
|
||||
\begin{aligned}
|
||||
X\cdot\log(V^*) &\leq X\cdot \log\Big(\sum \text{capacities}\Big)\\
&\leq X\cdot \log(|E|\cdot \text{maxCapacity})\\
&= X\cdot \big(\log|E| + \log(\text{maxCapacity})\big)
|
||||
\end{aligned}
|
||||
$$
|
||||
As $\log(\text{maxCapacity})$ is linear in the size of the input, the running time is polynomial in the size of the original problem.
|
||||
|
||||
Assume that ISET is a hard problem, i.e. we don't know of any polynomial time solution. We want to show that vertex cover is also a hard problem here:
|
||||
|
||||
$ISET \leq_{p} VC$
|
||||
|
||||
1. Given an instance of ISET, construct an instance of VC
|
||||
2. Show that the construction can be done in polynomial time
|
||||
3. Show that if the ISET instance is true then the VC instance is true
|
||||
4. Show that if the VC instance is true then the ISET instance is true.
|
||||
|
||||
> ISET: given $(G,K)$, is there a set of $K$ vertices no two of which share an edge?
|
||||
> VC: given $(G,K)$, is there a set of $K$ vertices that covers all edges?
|
||||
|
||||
1. Given $l: (G,K)$ being an instance of ISET, we construct $\phi(l): (G',K')$ as an instance of VC. $\phi(l): (G, |V|-K)$, i.e., $G' = G$ and $K' = |V| - K$
|
||||
2. It is obvious that this is a polynomial time construction, since copying the graph is linear in the size of the graph and the subtraction of integers is constant time.
|
||||
|
||||
**Direction 1**: ISET of size $K$ $\implies$ VC of size $|V| - K$. Assume that $ISET(G,K)$ returns true; show that $VC(G, |V|-K)$ returns true.
|
||||
|
||||
Let $S$ be an independent set of size $K$ and $C = V-S$
|
||||
|
||||
We claim that $C$ is a vertex cover of size $|V|-K$
|
||||
|
||||
Proof:
|
||||
|
||||
We proceed by contradiction. Assume that $C$ is NOT a vertex cover; this means that there is an edge $(u,v)$ such that $u\notin C, v\notin C$. This implies that $u\in S, v\in S$, which contradicts the assumption that $S$ is an independent set.
|
||||
Therefore, $C$ is a vertex cover.
|
||||
|
||||
**Direction 2**: VC of size $|V|-K \implies$ ISET of size $K$
|
||||
|
||||
Let $C$ be a vertex cover of size $|V|-K$, and let $S = V - C$.
|
||||
|
||||
We claim that $S$ is an independent set of size $K$.
|
||||
|
||||
Again, assume, for the sake of contradiction, that $S$ is not an independent set. And we get
|
||||
|
||||
$\exists (u,v) \textup{such that } u\in S, v \in S$
|
||||
|
||||
$u,v \notin C$
|
||||
|
||||
$C \textup{ is not a vertex cover}$
|
||||
|
||||
This contradicts our assumption.
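The whole reduction, including the conversion of solutions, fits in a few lines; this is a sketch that assumes the graph is an adjacency dict and that `vc_solver` is some (hypothetical) procedure for the vertex cover problem:

```python
def iset_via_vc(G, K, vc_solver):
    # G: dict vertex -> set of neighbors; K: required independent-set size
    # vc_solver(G, j) is assumed to return a vertex cover of size <= j, or None
    V = set(G)
    C = vc_solver(G, len(V) - K)   # phi(l): same graph, cover budget |V| - K
    if C is None:
        return None                # no VC of size <= |V|-K  <=>  no ISET of size >= K
    return V - C                   # the complement of a vertex cover is an independent set
```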
|
||||
287
content/CSE347/CSE347_L6.md
Normal file
287
content/CSE347/CSE347_L6.md
Normal file
@@ -0,0 +1,287 @@
|
||||
# Lecture 6
|
||||
|
||||
## NP-completeness
|
||||
|
||||
### $P$: Polynomial-time Solvable
|
||||
|
||||
$P$: Class of decision problems $L$ such that there is a polynomial-time algorithm that correctly answers yes or no for every instance $l\in L$.
|
||||
|
||||
Algorithm $A$ **decides** $L$ if $A$ always answers correctly for any instance $l\in L$.
|
||||
|
||||
Example:
|
||||
|
||||
Is the number $n$ prime? Best algorithm so far: $O(\log^6 n)$, 2002
|
||||
|
||||
## Introduction to NP
|
||||
|
||||
- NP$\neq$ Non-polynomial (Non-deterministic polynomial time)
|
||||
- Let $L$ be a decision problem.
|
||||
- Let $l$ be an instance of the problem that the answer happens to be "yes".
|
||||
- A **certificate** c(l) for $l$ is a "proof" that the answer for $l$ is true. [$l$ is a true instance]
|
||||
- For canonical decision problems for optimization problems, the certificate is often a feasible solution for the corresponding optimization problem.
|
||||
|
||||
### Example of certificates
|
||||
|
||||
- Problem: Is there a path from $s$ to $t$
|
||||
- Instance: graph $G(V,E),s,t$.
|
||||
- Certificate: path from $s$ to $t$.
|
||||
- Problem: Can I schedule $k$ intervals in the room so that they do not conflict.
|
||||
- Instance: $l:(I,k)$
|
||||
- Certificate: set of $k$ non-conflicting intervals.
|
||||
- Problem: ISET
|
||||
- Instance: $G(V,E),k$.
|
||||
- Certificate: $k$ vertices with no edges between them.
|
||||
|
||||
If the answer to the problem is NO, you don't need to provide anything to prove that.
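For concreteness, a minimal sketch of a poly-time verifier for the ISET certificate above (graph given as an adjacency dict; the representation is an assumption):

```python
def verify_iset(G, k, certificate):
    # G: dict vertex -> set of neighbors; certificate: claimed independent set
    S = set(certificate)
    if len(S) < k or not S <= set(G):
        return False                 # wrong size, or vertices not in the graph
    # no two certificate vertices may be adjacent: O(k^2) adjacency lookups
    return all(v not in G[u] for u in S for v in S if u != v)
```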
|
||||
|
||||
### Useful certificates
|
||||
|
||||
For a problem to be in NP, the problem need to have "useful" certificates. What is considered a good certificate?
|
||||
|
||||
- Easy to check
|
||||
- Verifying algorithm which can check a YES answer and a certificate in $poly(l)$
|
||||
- Not too long: [$poly(l)$]
|
||||
|
||||
### Verifier Algorithm
|
||||
|
||||
**Verifier algorithm** is one that takes an instance $l\in L$ and a certificate $c(l)$ and says yes if the certificate proves that $l$ is a true instance and false otherwise.
|
||||
|
||||
$V$ is a poly-time verifier for $L$ if it is a verifier and runs in $poly(|l|,|c|)$ time. ($|c|=poly(|l|)$)
|
||||
|
||||
- The runtime must be polynomial
|
||||
- Must check **every** problem constraint
|
||||
- Not always trivial
|
||||
|
||||
## Class NP
|
||||
|
||||
**NP:** A class of decision problems for which there exist a certificate schema $c$ and a verifier algorithm $V$ such that:
|
||||
|
||||
1. certificate is $poly(l)$ in size.
|
||||
2. $V:poly(l)$ in time.
|
||||
|
||||
**P:** is a class of problems that you can **solve** in polynomial time
|
||||
|
||||
**NP:** is a class of problems that you can **verify** TRUE instances in polynomial time given a poly-size certificate
|
||||
|
||||
**Millennium question**
|
||||
|
||||
$P\subseteq NP$? $NP\subseteq P$?
|
||||
|
||||
$P\subseteq NP$ is true.
|
||||
|
||||
Proof: Let $L$ be a problem in $P$, we want to show that there is a polynomial size certificate with a poly-time verifier.
|
||||
|
||||
There is an algorithm $A$ which solves $L$ in polynomial time.
|
||||
|
||||
**Certificate:** empty thing.
|
||||
|
||||
**Verifier:** $(l,c)$
|
||||
|
||||
1. Discard $c$.
|
||||
2. Run $A$ on $l$ and return the answer.
|
||||
|
||||
Nobody knows whether $NP\subseteq P$. Sad.
|
||||
|
||||
### Class of problem: NP complete
|
||||
|
||||
Informally: hardest problem in NP
|
||||
|
||||
Consider a problem $L$.
|
||||
|
||||
- We want to show that if $L\in P$, then $NP\subseteq P$
|
||||
|
||||
**NP-hard**: A decision problem $L$ is NP-hard if for any problem $K\in NP$, $K\leq_p L$.
|
||||
|
||||
$L$ is at least as hard as all the problems in NP. If we have an algorithm for $L$, we have an algorithm for any problem in NP with only polynomial time extra cost.
|
||||
|
||||
MindMap:
|
||||
|
||||
$K\implies L\implies sol(L)\implies sol(K)$
|
||||
|
||||
#### Lemma $P=NP$
|
||||
|
||||
Let $L$ be an NP-hard problem. If $L\in P$, then $P=NP$.
|
||||
|
||||
Proof:
|
||||
|
||||
Say $L$ has a poly-time solution, and take any problem $K$ in $NP$; since $K\leq_p L$, $K$ also has a poly-time solution.
|
||||
|
||||
So for any $K\in NP$ we get $K\in P$, i.e., $NP\subseteq P$. Since also $P\subseteq NP$, we conclude $P=NP$.
|
||||
|
||||
**NP-complete:** $L$ is **NP-complete** if it is both NP-hard and $L\in NP$.
|
||||
|
||||
**NP-optimization:** $L$ is **NP-optimization** problem if the canonical decision problem is NP-complete.
|
||||
|
||||
**Claim:** If any NP-optimization problem has a polynomial-time solution, then $P=NP$.
|
||||
|
||||
### Is $P=NP$?
|
||||
|
||||
- Answering this problem is hard.
|
||||
- But for any NP-complete problem, if you could find a poly-time algorithm for $L$, then you would have answered this question.
|
||||
- Therefore, finding a poly-time algorithm for $L$ is hard.
|
||||
|
||||
## NP-Complete problem
|
||||
|
||||
### Satisfiability (SAT)
|
||||
|
||||
Boolean Formulas:
|
||||
|
||||
A set of Boolean variables:
|
||||
|
||||
$x,y,a,b,c,w,z,...$ they take values true or false.
|
||||
|
||||
A boolean formula is a formula of Boolean variables with and, or and not.
|
||||
|
||||
Examples:
|
||||
|
||||
$\phi:x\land (\neg y \lor z)\land\neg(y\lor w)$
|
||||
|
||||
$x=1,y=0,z=1,w=0$, the formula is $1$.
|
||||
|
||||
**SAT:** given a formula $\phi$, is there a setting $M$ of variables such that the $\phi$ evaluates to True under this setting.
|
||||
|
||||
If there is such assignment, then $\phi$ is satisfiable. Otherwise, it is not.
|
||||
|
||||
Example: $x\land y\land \neg(x\lor y)$ is not satisfiable.
|
||||
|
||||
A seminal result by Cook and Levin in the early 1970s showed that SAT is NP-complete.
|
||||
|
||||
1. SAT is in NP
|
||||
Proof:
|
||||
$\exists$ a certificate schema and a poly-time verifier.
|
||||
The certificate $c$ is a satisfying assignment $M$, and the verifier $V$ checks that $M$ makes $\phi$ true.
|
||||
2. SAT is NP-hard. We can just accept it as a fact.
|
||||
|
||||
#### How to show a problem is NP-complete?
|
||||
|
||||
Say we have a problem $L$.
|
||||
|
||||
1. Show that $L\in NP$.
|
||||
Exists certificate schema and verification algorithm in polynomial time.
|
||||
2. Prove that we can reduce SAT to $L$. $SAT\leq_p L$ **(NOT $L\leq_p SAT$)**
|
||||
Solving $L$ also solve SAT.
|
||||
|
||||
### CNF-SAT
|
||||
|
||||
**CNF:** Conjunctive normal form of SAT
|
||||
|
||||
The formula $\phi$ must be an "and of ors"
|
||||
|
||||
$$
|
||||
\phi=\land_{i=1}^n(\lor^{m_i}_{j=1}l_{i,j})
|
||||
$$
|
||||
|
||||
$l_{i,j}$: literal; each inner disjunction $(\lor_{j}l_{i,j})$ is a clause
|
||||
|
||||
### 3-CNF-SAT
|
||||
|
||||
**3-CNF-SAT:** where every clause has exactly 3 literals.
|
||||
|
||||
is NP-complete [not all versions are: 2-CNF-SAT is in P]
|
||||
|
||||
Input: 3-CNF expression with $n$ variables and $m$ clauses in the form:
|
||||
|
||||
number of total literals: $3m$
|
||||
|
||||
Output: An assignment of the $n$ variables such that at least one literal from each clause evaluates to true.
|
||||
|
||||
Note:
|
||||
|
||||
1. One variable can be used to satisfy multiple clauses.
|
||||
2. $x_i$ and $\neg x_i$ cannot both evaluate to true.
|
||||
|
||||
Example: ISET is NP-complete.
|
||||
|
||||
Proof:
|
||||
|
||||
Say we have a problem $L$
|
||||
|
||||
1. Show that $ISET\in NP$
|
||||
Certificate: a set $S$ of $k$ vertices: $|S|=k=poly(|G|)$\
|
||||
Verifier: checks that there are no edges between them $O(E k^2)$
|
||||
2. ISET is NP-hard. We need to prove $3SAT\leq_p ISET$
|
||||
- Construct a reduction from $3SAT$ to $ISET$.
|
||||
- Show that $ISET$ is harder than $3SAT$.
|
||||
|
||||
We need to prove $\phi\in 3SAT$ is satisfiable if and only if the constructed $G$ has an $ISET$ of size $\geq k=m$
|
||||
|
||||
#### Reduction mapping construction
|
||||
|
||||
We construct an ISET instance from $3-SAT$.
|
||||
|
||||
Suppose the formula has $n$ variables and $m$ clauses
|
||||
|
||||
1. for each clause, we construct vertex for each literal and connect them (for $x\lor \neg y\lor z$, we connect $x,\neg y,z$ together)
|
||||
2. then we connect each literal vertex with every vertex of its negation (connect $x$ and $\neg x$); a small sketch of this construction follows
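A minimal sketch of this construction, assuming the formula comes as a list of clauses whose literals are `(variable, is_positive)` pairs (this encoding is an assumption):

```python
def three_sat_to_iset(clauses):
    # clauses: list of 3-literal clauses; a literal is a (variable, is_positive) pair
    # returns (vertices, edges, k) with k = number of clauses
    vertices, edges = [], set()
    for c_idx, clause in enumerate(clauses):
        ids = [(c_idx, lit) for lit in clause]       # one vertex per literal occurrence
        vertices.extend(ids)
        for a in range(len(ids)):                    # triangle inside each clause
            for b in range(a + 1, len(ids)):
                edges.add(frozenset((ids[a], ids[b])))
    for u in vertices:                               # connect every literal to its negations
        for v in vertices:
            if u[1][0] == v[1][0] and u[1][1] != v[1][1]:
                edges.add(frozenset((u, v)))
    return vertices, edges, len(clauses)
```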
|
||||
|
||||
$\implies$
|
||||
|
||||
If $\phi$ has a satisfiable assignment, then $G$ has an independent set of size $\geq m$,
|
||||
|
||||
For a set $S$ we pick exactly one true literal from every clause and take the corresponding vertex to that clause, $|S|=m$
|
||||
|
||||
Must also argue that $S$ is an independent set.
|
||||
|
||||
Example: picked a set of vertices $|S|=4$.
|
||||
|
||||
A literal has edges:
|
||||
|
||||
- To all literals in the same clause: We never pick two literals from the same clause.
|
||||
- To its negation.
|
||||
|
||||
Since it is a satisfiable 3-SAT assignment, $x$ and $\neg x$ cannot both evaluate to true, those edges are not a problem, so $S$ is an independent set.
|
||||
|
||||
$\impliedby$
|
||||
|
||||
If $G$ has an independent set of size $\geq m$, then $\phi$ is satisfiable.
|
||||
|
||||
Say that $S$ is an independent set of $m$, we need to construct a satisfiable assignment for the original $\phi$.
|
||||
|
||||
- If $S$ contains a vertex corresponding to literal $x_i$, then set $x_i$ to true.
|
||||
- If $S$ contains a vertex corresponding to literal $\neg x_i$, then set $x_i$ to false.
|
||||
- Other variables can be set arbitrarily
|
||||
|
||||
Question: Is it a valid 3-SAT assignment?
|
||||
|
||||
Your ISET $S$ can contain at most $1$ vertex from each clause. Since vertices in a clause are connected by edges.
|
||||
|
||||
- Since $S$ contains $m$ vertices, it must contain exactly $1$ vertex from each clause.
|
||||
- Therefore, we will make at least $1$ literal from each clause true.
|
||||
- Therefore, all the clauses are true and $\phi$ is satisfied.
|
||||
|
||||
## Conclusion: NP-completeness
|
||||
|
||||
- Prove NP-Complete:
|
||||
- If NP-optimization, convert to canonical decision problem
|
||||
- Certificate, Verification algorithm
|
||||
- Prove NP-hard: reduce from existing NP-Complete
|
||||
problems
|
||||
- 3-SAT Problem:
|
||||
- Input, output, constraints
|
||||
- A well-known NP-Complete problem
|
||||
- Reduce from 3-SAT to ISET to show ISET is NP-Complete
|
||||
|
||||
## On class
|
||||
|
||||
### NP-complete
|
||||
|
||||
$p\in NP$, if we have a certificate schema and a verifier algorithm.
|
||||
|
||||
### NP-complete proof
|
||||
|
||||
#### P is in NP
|
||||
|
||||
Describe what a certificate would look like, and show that its size is polynomial in the problem size.
|
||||
|
||||
Design a verifier algorithm that checks whether a certificate indeed proves that the answer is YES, and that has polynomial time complexity. Inputs: the certificate and the problem input; runtime $poly(|l|,|c|)=poly(|l|)$.
|
||||
|
||||
#### P is NP hard
|
||||
|
||||
select an already known NP-hard problem: eg. 3-SAT, ISET, VC,...
|
||||
|
||||
show that $3-SAT\leq_p p$
|
||||
|
||||
- present an algorithm that maps any instance of 3-SAT (or of the chosen NP-hard problem) to an instance of $p$.
|
||||
- show that the construction is done in polynomial time.
|
||||
- show that if $p$'s instance answer is YES, then the instance of 3-SAT is YES.
|
||||
- show that if 3-SAT's instance answer is YES then the instance of $p$ is YES.
|
||||
312
content/CSE347/CSE347_L7.md
Normal file
312
content/CSE347/CSE347_L7.md
Normal file
@@ -0,0 +1,312 @@
|
||||
# Lecture 7
|
||||
|
||||
## Known NP-Complete Problems
|
||||
|
||||
- SAT and 3-SAT
|
||||
- Vertex Cover
|
||||
- Independent Set
|
||||
|
||||
## How to show a problem $L$ is NP-Complete
|
||||
|
||||
- Show $L \in$ NP
|
||||
- Give a polynomial time certificate
|
||||
- Give a polynomial time verifier
|
||||
- Show $L$ is NP-Hard: for some known NP-Complete problem $K$, show $K \leq_p L$
|
||||
- Construct a mapping $\phi$ from instance in $K$ to instance in $L$, given an instance $k\in K$, $\phi(k)\in L$.
|
||||
- Show that you can compute $\phi(k)$ in polynomial time.
|
||||
- Show that $k \in K$ is true if and only if $\phi(k) \in L$ is true.
|
||||
|
||||
### Example 1: Subset Sum
|
||||
|
||||
Input: A set $S$ of integers and a target positive integer $t$.
|
||||
|
||||
Problem: Determine if there exists a subset $S' \subseteq S$ such that $\sum_{a_i\in S'} a_i = t$.
|
||||
|
||||
We claim that Subset Sum is NP-Complete.
|
||||
|
||||
Step 1: Subset Sum $\in$ NP
|
||||
|
||||
- Certificate: $S' \subseteq S$
|
||||
- Verifier: Check that $\sum_{a_i\in S'} a_i = t$
|
||||
|
||||
Step 2: Subset Sum is NP-Hard
|
||||
|
||||
We claim that 3-SAT $\leq_p$ Subset Sum
|
||||
|
||||
Given any $3$-CNF formula $\Psi$, we will construct an instance $(S, t)$ of Subset Sum such that $\Psi$ is satisfiable if and only if there exists a subset $S' \subseteq S$ such that $\sum_{a_i\in S'} a_i = t$.
|
||||
|
||||
#### How to construct $(S,t)$ from $\Psi$?
|
||||
|
||||
Reduction construction:
|
||||
|
||||
Assumption: No clause contains both a literal and its negation.
|
||||
|
||||
3-SAT problem: $\Psi$ has $n$ variables and $m$ clauses.
|
||||
|
||||
Need to: construct $S$ of positive numbers and a target $t$
|
||||
|
||||
Ideas of construction:
|
||||
|
||||
For 3-SAT instance $\Psi$:
|
||||
|
||||
- At least one literal in each clause is true
|
||||
- A variable and its negation cannot both be true
|
||||
|
||||
$S$ contains integers with $n+m$ digits (base 10)
|
||||
|
||||
$$
|
||||
p_1p_2\cdots p_n q_1 q_2 \cdots q_m
|
||||
$$
|
||||
|
||||
where $p_i$ are representations of variables that are either $0$ or $1$ and $q_j$ are representations of clauses.
|
||||
|
||||
For each variable $x_i$, we will have two integers in $S$, called $v_i$ and $\overline{v_i}$.
|
||||
|
||||
- For each variable $x_i$, both $v_i$ and $\overline{v_i}$ have digits $p_i=1$. all other $p$ positions are zero
|
||||
|
||||
- Each digit $q_j$ in $v_i$ is $1$ if $x_i$ appears in clause $j$ (and in $\overline{v_i}$ if $\neg x_i$ appears in clause $j$); otherwise $q_j=0$
|
||||
|
||||
For example:
|
||||
|
||||
$\Psi=(x_1\lor \neg x_2 \lor x_3) \land (\neg x_1 \lor x_2 \lor x_3)$
|
||||
|
||||
| | $p_1$ | $p_2$ | $p_3$ | $q_1$ | $q_2$ |
|
||||
| ---------------- | ----- | ----- | ----- | ----- | ----- |
|
||||
| $v_1$ | 1 | 0 | 0 | 1 | 0 |
|
||||
| $\overline{v_1}$ | 1 | 0 | 0 | 0 | 1 |
|
||||
| $v_2$ | 0 | 1 | 0 | 0 | 1 |
|
||||
| $\overline{v_2}$ | 0 | 1 | 0 | 1 | 0 |
|
||||
| $v_3$ | 0 | 0 | 1 | 1 | 1 |
|
||||
| $\overline{v_3}$ | 0 | 0 | 1 | 0 | 0 |
|
||||
| t | 1 | 1 | 1 | 1 | 1 |
|
||||
|
||||
Let's try to prove correctness of the reduction.
|
||||
|
||||
Direction 1: Say subset sum has a solution $S'$.
|
||||
|
||||
We must prove that there is a satisfying assignment for $\Psi$.
|
||||
|
||||
Set $x_i=1$ if $v_i\in S'$
|
||||
|
||||
Set $x_i=0$ if $\overline{v_i}\in S'$
|
||||
|
||||
1. We never set $x_i$ to be both true and false: since the $p_i$ digit of $t$ is $1$, $S'$ contains exactly one of $v_i$ and $\overline{v_i}$.
|
||||
2. For each clause we have at least one literal that is true, since the $q_j$ digit of $t$ cannot be reached unless at least one chosen literal number has a $1$ in position $q_j$.
|
||||
|
||||
Direction 2: Say $\Psi$ has a satisfying assignment.
|
||||
|
||||
We must prove that there is a subset $S'$ such that $\sum_{a_i\in S'} a_i = t$.
|
||||
|
||||
If $x_i=1$, then $v_i\in S'$
|
||||
|
||||
If $x_i=0$, then $\overline{v_i}\in S'$
|
||||
|
||||
Problem: 1,2 or 3 literals in every clause can be true.
|
||||
|
||||
Fix
|
||||
|
||||
Add 2 numbers to $S$ for each clause $j$. We add $y_j,z_j$.
|
||||
|
||||
- All $p$ digits are zero
|
||||
- $q_j$ of $y_j$ is $1$, $q_j$ of $z_j$ is $2$, for all $j$, other digits are zero.
|
||||
- Intuitively, these numbers account for the number of literals in clause $j$ that are true.
|
||||
|
||||
New target are as follows:
|
||||
|
||||
| | $p_1$ | $p_2$ | $p_3$ | $q_1$ | $q_2$ |
|
||||
| ----- | ----- | ----- | ----- | ----- | ----- |
|
||||
| $y_1$ | 0 | 0 | 0 | 1 | 0 |
|
||||
| $z_1$ | 0 | 0 | 0 | 2 | 0 |
|
||||
| $y_2$ | 0 | 0 | 0 | 0 | 1 |
|
||||
| $z_2$ | 0 | 0 | 0 | 0 | 2 |
|
||||
| $t$ | 1 | 1 | 1 | 4 | 4 |
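The full construction (including the slack numbers $y_j,z_j$) can be sketched as below, treating each number as a base-10 digit string; the `(variable, is_positive)` clause encoding is an assumption:

```python
def three_sat_to_subset_sum(variables, clauses):
    # variables: list of variable names; clauses: list of clauses of (variable, is_positive) literals
    n, m = len(variables), len(clauses)

    def number(p_digits, q_digits):
        # interpret the digit list p_1..p_n q_1..q_m as one base-10 integer
        return int("".join(map(str, p_digits + q_digits)))

    S = []
    for i, x in enumerate(variables):
        for positive in (True, False):                       # v_i and its negation
            p = [1 if a == i else 0 for a in range(n)]
            q = [1 if (x, positive) in clause else 0 for clause in clauses]
            S.append(number(p, q))
    for j in range(m):                                       # slack numbers y_j and z_j
        S.append(number([0] * n, [1 if a == j else 0 for a in range(m)]))
        S.append(number([0] * n, [2 if a == j else 0 for a in range(m)]))
    t = number([1] * n, [4] * m)                             # target: p digits 1, q digits 4
    return S, t
```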
|
||||
|
||||
#### Time Complexity of construction for Subset Sum
|
||||
|
||||
- $O(n+m)$
|
||||
- $n$ is the number of variables
|
||||
- $m$ is the number of clauses
|
||||
|
||||
How many integers are in $S$?
|
||||
|
||||
- $2n$ for variables
|
||||
- $2m$ for new numbers
|
||||
- Total: $2n+2m$ integers
|
||||
|
||||
How many digits are in each integer?
|
||||
|
||||
- $n+m$ digits
|
||||
- Time complexity: $O((n+m)^2)$
|
||||
|
||||
#### Proof of reduction for Subset Sum
|
||||
|
||||
Claim 1: If Subset Sum has a solution, then $\Psi$ is satisfiable.
|
||||
|
||||
Proof:
|
||||
|
||||
Say $S'$ is a solution to Subset Sum. Then there exists a subset $S' \subseteq S$ such that $\sum_{a_i\in S'} a_i = t$. Here is an assignment of truth values to variables in $\Psi$ that satisfies $\Psi$:
|
||||
|
||||
- Set $x_i=1$ if $v_i\in S'$
|
||||
- Set $x_i=0$ if $\overline{v_i}\in S'$
|
||||
|
||||
This is a valid assignment since:
|
||||
|
||||
- We pick either $v_i$ or $\overline{v_i}$
|
||||
- For each clause, at least one literal is true
|
||||
|
||||
QED
|
||||
|
||||
Claim 2: If $\Psi$ is satisfiable, then Subset Sum has a solution.
|
||||
|
||||
Proof:
|
||||
|
||||
If $A$ is a satisfiable assignment for $\Psi$, then we can construct a subset $S'$ of $S$ such that $\sum_{a_i\in S'} a_i = t$.
|
||||
|
||||
If $x_i=1$, then $v_i\in S'$
|
||||
|
||||
If $x_i=0$, then $\overline{v_i}\in S'$
|
||||
|
||||
We check that $t=\sum$ of the elements we picked from $S$, digit by digit:
|
||||
|
||||
- All $p_i$ in $t$ are $1$
|
||||
- Each $q_j$ digit of $t$ is $4$; the chosen literal numbers contribute $1$, $2$, or $3$ to it (one per true literal in clause $j$).
- If they contribute $1$, put both $y_j,z_j\in S'$
- If they contribute $2$, put $z_j\in S'$
- If they contribute $3$, put $y_j\in S'$
|
||||
|
||||
QED
|
||||
|
||||
### Example 2: 3 Color
|
||||
|
||||
Input: Graph $G$
|
||||
|
||||
Problem: Determine if $G$ is 3-colorable.
|
||||
|
||||
We claim that 3-Color is NP-Complete.
|
||||
|
||||
#### Proof of NP for 3-Color
|
||||
|
||||
Homework
|
||||
|
||||
#### Proof of NP-Hard for 3-Color
|
||||
|
||||
We claim that 3-SAT $\leq_p$ 3-Color
|
||||
|
||||
Given a 3-CNF formula $\Psi$, we will construct a graph $G$ such that $\Psi$ is satisfiable if and only if $G$ is 3-colorable.
|
||||
|
||||
Construction:
|
||||
|
||||
1. Construct a core triangle (3 vertices for 3 colors)
|
||||
2. 2 vertices for each variable $x_i:v_i,\overline{v_i}$
|
||||
3. Clause widget
|
||||
|
||||
Clause widget:
|
||||
|
||||
- 3 vertices for each clause $C_j:y_j,z_j,t_j$ (clause widget)
|
||||
- 3 edges extended from clause widget
|
||||
- variable vertex connected to extended edges
|
||||
|
||||
Key for dangler design:
|
||||
|
||||
Connect all $v_i$ with value true to the same color, and connect all $v_i$ with value false to another color.
|
||||
|
||||
'''
|
||||
TODO: Add dangler design image here.
|
||||
'''
|
||||
|
||||
#### Proof of reduction for 3-Color
|
||||
|
||||
Direction 1: If $\Psi$ is satisfiable, then $G$ is 3-colorable.
|
||||
|
||||
Proof:
|
||||
|
||||
Say $\Psi$ is satisfiable. Then $v_i$ and $\overline{v_i}$ are in different colors.
|
||||
|
||||
For the color in central triangle, we can pick any color.
|
||||
|
||||
Each dangler is connected to the blue vertex, so the literal vertices cannot be blue.
|
||||
|
||||
...
|
||||
|
||||
QED
|
||||
|
||||
Direction 2: If $G$ is 3-colorable, then $\Psi$ is satisfiable.
|
||||
|
||||
Proof:
|
||||
|
||||
QED
|
||||
|
||||
### Example 3:Hamiltonian cycle problem (HAMCYCLE)
|
||||
|
||||
Input: $G(V,E)$
|
||||
|
||||
Output: Does $G$ have a Hamiltonian cycle? (A cycle that visits each vertex exactly once.)
|
||||
|
||||
Proof is too hard.
|
||||
|
||||
but it is an existing NP-complete problem.
|
||||
|
||||
## On lecture
|
||||
|
||||
### Example 4: Scheduling problem (SCHED)
|
||||
|
||||
Scheduling with release times, deadlines, and execution times.
|
||||
|
||||
Given $n$ jobs, where job $i$ has release time $r_i$, deadline $d_i$, and execution time $t_i$.
|
||||
|
||||
Example:
|
||||
|
||||
$S=\{2,3,7,5,4\}$. We create 5 jobs; the release time is 0, the deadline is 26, and the execution time is $1$.
|
||||
|
||||
Problem: Can you schedule these jobs so that each job starts after its release time and finishes before its deadline, and executed for $t_i$ time units?
|
||||
|
||||
#### Proof of NP-completeness
|
||||
|
||||
Step 1: Show that the problem is in NP.
|
||||
|
||||
Certificate: $\langle (h_1,j_1),(h_2,j_2),\cdots,(h_n,j_n)\rangle$, where $h_i$ is the start time of job $i$ and $j_i$ is the machine that job $i$ is assigned to.
|
||||
|
||||
Verifier: Check that $r_i\leq h_i$ and $h_i + t_i \leq d_i$ for all $i$, and that no two jobs assigned to the same machine overlap.
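A minimal sketch of such a verifier (the tuple encodings of the jobs and of the certificate are assumptions):

```python
def verify_schedule(jobs, certificate):
    # jobs: list of (r_i, d_i, t_i); certificate: list of (h_i, machine_i), one entry per job
    by_machine = {}
    for (r, d, t), (h, machine) in zip(jobs, certificate):
        if h < r or h + t > d:                   # must respect release time and deadline
            return False
        by_machine.setdefault(machine, []).append((h, h + t))
    for intervals in by_machine.values():        # no two jobs may overlap on one machine
        intervals.sort()
        for (_, end1), (start2, _) in zip(intervals, intervals[1:]):
            if start2 < end1:
                return False
    return True
```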
|
||||
|
||||
Step 2: Show that the problem is NP-hard.
|
||||
|
||||
We proceed by proving that $SSS\leq_p$ Scheduling.
|
||||
|
||||
Consider an instance of SSS: $\{ a_1,a_2,\cdots,a_n\}$ and sum $b$. We can create a scheduling instance with release time 0, deadline $b$, and execution time $1$.
|
||||
|
||||
Then we prove that the scheduling instance is a "yes" instance if and only if the SSS instance is a "yes" instance.
|
||||
|
||||
Ideas of proof:
|
||||
|
||||
If there is a subset of $\{a_1,a_2,\cdots,a_n\}$ that sums to $b$, then we can schedule the jobs in that order on one machine.
|
||||
|
||||
If there is a schedule where all jobs are finished by time $b$, then the sum of the scheduled jobs is exactly $b$.
|
||||
|
||||
### Example 5: Component grouping problem (CG)
|
||||
|
||||
Given an undirected graph which is not necessarily connected. (A component is a maximal connected subgraph.)
|
||||
|
||||
Problem: Component Grouping: Given a graph $G$ that is not connected and a positive integer $k$, is there a subset of its components whose sizes sum to $k$?
|
||||
|
||||
Denoted as $CG(G,k)$.
|
||||
|
||||
#### Proof of NP-completeness for Component Grouping
|
||||
|
||||
Step 1: Show that the problem is in NP.
|
||||
|
||||
Certificate: $\langle S\rangle$, where $S$ is the subset of components that sums up to $k$.
|
||||
|
||||
Verifier: Check that the sum of the sizes of the components in $S$ is $k$. This can be done in polynomial time using breadth-first search.
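A minimal sketch of that BFS step, computing the component sizes (graph as an adjacency dict; the representation is an assumption):

```python
from collections import deque

def component_sizes(G):
    # G: dict vertex -> set of neighbors; returns the size of every connected component
    seen, sizes = set(), []
    for start in G:
        if start in seen:
            continue
        queue, size = deque([start]), 0
        seen.add(start)
        while queue:                      # breadth-first search inside one component
            u = queue.popleft()
            size += 1
            for v in G[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        sizes.append(size)
    return sizes
```

The verifier then checks that the sizes of the components listed in $S$ add up to exactly $k$.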
|
||||
|
||||
Step 2: Show that the problem is NP-hard.
|
||||
|
||||
We proceed by proving that $SSS\leq_p CG$. (Subset Sum $\leq_p$ Component Grouping)
|
||||
|
||||
Consider an instance of SSS: $\langle a_1,a_2,\cdots,a_n,b\rangle$.
|
||||
|
||||
We construct an instance of CG as follows:
|
||||
|
||||
For each $a_i\in S$, we create a chain of $a_i$ vertices.
|
||||
|
||||
WARNING: this is not a valid proof of NP-hardness, since the reduction is not polynomial in the size of the SSS instance: each $a_i$ is written with only $\log a_i$ bits, but the constructed graph has $\sum_i a_i$ vertices, which can be exponential in the input size.
|
||||
|
||||
353
content/CSE347/CSE347_L8.md
Normal file
353
content/CSE347/CSE347_L8.md
Normal file
@@ -0,0 +1,353 @@
|
||||
# Lecture 8
|
||||
|
||||
## NP-optimization problem
|
||||
|
||||
Cannot be solved in polynomial time (as far as we know, i.e., unless $P=NP$).
|
||||
|
||||
Example:
|
||||
|
||||
- Maximum independent set
|
||||
- Minimum vertex cover
|
||||
|
||||
What can we do?
|
||||
|
||||
- solve small instances
|
||||
- hard instances are rare - average case analysis
|
||||
- solve special cases
|
||||
- find an approximate solution
|
||||
|
||||
## Approximation algorithms
|
||||
|
||||
We find a "good" solution in polynomial time, but may not be optimal.
|
||||
|
||||
Example:
|
||||
|
||||
- Minimum vertex cover: we will find a small vertex cover, but not necessarily the smallest one.
|
||||
- Maximum independent set: we will find a large independent set, but not necessarily the largest one.
|
||||
|
||||
Question: How do we quantify the quality of the solution?
|
||||
|
||||
### Approximation ratio
|
||||
|
||||
Intuition:
|
||||
|
||||
How good is an algorithm $A$ compared to an optimal solution in the worst case?
|
||||
|
||||
Definition:
|
||||
|
||||
Consider algorithm $A$ for an NP-optimization problem $L$. Say for **any** instance $l$, $A$ finds a solution output $c_A(l)$ and the optimal solution is $c^*(l)$.
|
||||
|
||||
Approximation ratio is either:
|
||||
|
||||
$$
|
||||
\min_{l \in L} \frac{c_A(l)}{c^*(l)}=\alpha
|
||||
$$
|
||||
|
||||
for maximization problems, or
|
||||
|
||||
$$
|
||||
\max_{l \in L} \frac{c_A(l)}{c^*(l)}=\alpha
|
||||
$$
|
||||
|
||||
for minimization problems.
|
||||
|
||||
Example:
|
||||
|
||||
Alice's Algorithm, $A$, finds a vertex cover of size $c_A(l)$ for instance $l(G)$. The optimal vertex cover has size $c^*(l)$.
|
||||
|
||||
We want approximation ratio to be as close to 1 as possible.
|
||||
|
||||
> Vertex cover:
|
||||
>
|
||||
> A vertex cover is a set of vertices that touches all edges.
|
||||
|
||||
Let's try an approximation algorithm for the vertex cover problem, called Greedy cover.
|
||||
|
||||
#### Greedy cover
|
||||
|
||||
Pick any uncovered edge and add both its endpoints to the cover $C$; repeat until all edges are covered.
|
||||
|
||||
Runtime: $O(m)$
|
||||
|
||||
Claim: Greedy cover is correct, and it finds a vertex cover.
|
||||
|
||||
Proof:
|
||||
|
||||
Algorithm only terminates when all edges are covered.
|
||||
|
||||
Claim: Greedy cover is a 2-approximation algorithm.
|
||||
|
||||
Proof:
|
||||
|
||||
Consider the edges for which Greedy cover added both endpoints. These edges are vertex-disjoint (when such an edge is picked it is still uncovered, so neither endpoint was added before), so they form a matching $M$, and the cover found has size exactly $2|M|$.

Any vertex cover, including an optimal one, must contain at least one endpoint of every edge of $M$, so $c^*\geq |M|$.

Therefore the size of the cover found is $2|M|\leq 2c^*$; in the worst case the factor of $2$ is attained (consider a graph consisting of disjoint edges).
|
||||
|
||||
Thus, Greedy cover is a 2-approximation algorithm.
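A minimal sketch of Greedy cover (graph given as an edge list; the representation is an assumption):

```python
def greedy_cover(edges):
    # edges: iterable of (u, v) pairs; returns a vertex cover of size at most twice optimal
    C = set()
    for u, v in edges:
        if u not in C and v not in C:    # the edge is still uncovered: take both endpoints
            C.add(u)
            C.add(v)
    return C
```

The edges for which both endpoints are added are exactly the matching $M$ used in the proof above.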
|
||||
|
||||
> Min-cut:
|
||||
>
|
||||
> Given a graph $G$ and two vertices $s$ and $t$, find the minimum cut between $s$ and $t$.
|
||||
>
|
||||
> Max-cut:
|
||||
>
|
||||
> Given a graph $G$, find the maximum cut.
|
||||
|
||||
#### Local cut
|
||||
|
||||
Algorithm:
|
||||
|
||||
- start with an arbitrary cut of $G$.
|
||||
- While you can move a vertex from one side to the other side while increasing the size of the cut, do so.
|
||||
- Return the cut found.
|
||||
|
||||
We will prove its:
|
||||
|
||||
- Runtime
|
||||
- Feasibility
|
||||
- Approximation ratio
|
||||
|
||||
##### Runtime for local cut
|
||||
|
||||
When we move a vertex from one side to the other, the size of the cut increases by at least 1.

Since the size of any cut is at most $|E|$, the algorithm performs at most $|E|$ improving moves.

Each pass over the vertices to find an improving move takes $O(|V|+|E|)$ time, so the total runtime is polynomial, $O(|E|(|V|+|E|))$.
|
||||
|
||||
##### Feasibility for local cut
|
||||
|
||||
The algorithm only terminates when no more vertices can be moved.
|
||||
|
||||
Thus, the cut found is a feasible solution.
|
||||
|
||||
##### Approximation ratio for local cut
|
||||
|
||||
This is a half-approximation algorithm.
|
||||
|
||||
We need to show that the size of the cut found is at least half of the size of the optimal cut.
|
||||
|
||||
We first upper bound the optimal cut: its size is at most $|E|$.
|
||||
|
||||
We will then prove that the cut we find has at least $\frac{|E|}{2}$ crossing edges for any graph $G$, and hence is at least half of the optimal cut.
|
||||
|
||||
Proof:
|
||||
|
||||
When we terminate, no vertex could be moved
|
||||
|
||||
Therefore, **The number of crossing edges is at least the number of non-crossing edges**.
|
||||
|
||||
Let $d(u)$ be the degree of vertex $u\in V$.
|
||||
|
||||
The total number of crossing edges for vertex $u$ is at least $\frac{1}{2}d(u)$.
|
||||
|
||||
Summing over all vertices counts each crossing edge twice, so $2\cdot(\text{number of crossing edges})\geq \frac{1}{2}\sum_{u\in V}d(u)=|E|$, i.e., the number of crossing edges is at least $\frac{|E|}{2}$.
|
||||
|
||||
Since the optimal cut has at most $|E|$ edges, the cut we found is at least half of the optimal cut.
|
||||
|
||||
QED
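A minimal sketch of the local-search procedure analyzed above (graph as an adjacency dict; the representation and the arbitrary starting cut are assumptions):

```python
def local_max_cut(G):
    # G: dict vertex -> set of neighbors; returns one side of a locally optimal cut
    side_a = set(list(G)[: len(G) // 2])       # arbitrary starting cut
    improved = True
    while improved:                            # every improving move raises the cut by >= 1
        improved = False
        for v in G:
            same = sum((u in side_a) == (v in side_a) for u in G[v])
            crossing = len(G[v]) - same
            if same > crossing:                # moving v gains (same - crossing) >= 1
                if v in side_a:
                    side_a.discard(v)
                else:
                    side_a.add(v)
                improved = True
    return side_a
```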
|
||||
|
||||
#### Set cover
|
||||
|
||||
Problem:
|
||||
|
||||
You are collecting a set of magic cards.
|
||||
|
||||
$X$ is the set of all possible cards. You want at least one of each card.
|
||||
|
||||
Each dealer $j$ has a pack $S_j\subseteq X$ of cards. You have to buy entire pack or none from dealer $j$.
|
||||
|
||||
Goal: What is the least number of packs you need to buy to get all cards?
|
||||
|
||||
Formally:
|
||||
|
||||
Input: $X$ is a universe of $n$ elements, and $Y=\{S_1, S_2, \ldots, S_m\}$ is a collection of subsets of $X$ (each $S_i\subseteq X$).
|
||||
|
||||
Goal: Pick $C\subseteq Y$ such that $\bigcup_{S_i\in C}S_i=X$, and $|C|$ is minimized.
|
||||
|
||||
Set cover is an NP-optimization problem. It is a generalization of the vertex cover problem.
|
||||
|
||||
#### Greedy set cover
|
||||
|
||||
Algorithm:
|
||||
|
||||
- Start with empty set $C$.
|
||||
- While there is an element $x$ in $X$ that is not covered, pick the set $S_i$ that covers the largest number of uncovered elements.
|
||||
- Add $S_i$ to $C$.
|
||||
- Return $C$.
|
||||
|
||||
```python
|
||||
def greedy_set_cover(X, Y):
    # X is the universe of elements
    # Y is the collection of candidate sets
    C = []

    def non_covered_elements(X, C):
        # return the elements of X that are not covered by the sets in C: O(|X|*|C|)
        return {x for x in X if not any(x in c for c in C)}

    non_covered = non_covered_elements(X, C)
    while non_covered:                 # every loop covers at least one new element
        max_cover, max_set = 0, None
        for S in Y:                    # pick the set covering the most uncovered elements
            cur_cover = len(non_covered & set(S))
            if cur_cover > max_cover:
                max_cover, max_set = cur_cover, S
        if max_set is None:
            break                      # no remaining set helps: the instance has no cover
        C.append(max_set)
        non_covered = non_covered_elements(X, C)
    return C
|
||||
```
|
||||
|
||||
It is not optimal.
|
||||
|
||||
Need to prove its:
|
||||
|
||||
- Correctness:
|
||||
Keep picking until all elements are covered.
|
||||
- Runtime:
|
||||
$O(|X||Y|^2)$
|
||||
- Approximation ratio:
|
||||
|
||||
##### Approximation ratio for greedy set cover
|
||||
|
||||
> Harmonic number:
|
||||
>
|
||||
> $H_n=\sum_{i=1}^n\frac{1}{i}=\frac{1}{1}+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{n}=\Theta(\log n)$
|
||||
|
||||
We claim that the size of the set cover found is at most $H_n=\Theta(\log n)$ times the size of the optimal set cover.
|
||||
|
||||
###### First bound:
|
||||
|
||||
Proof:
|
||||
|
||||
If the optimal picks $k$ sets, then the size of the set cover found is at most $(1+\ln n)k$ sets.
|
||||
|
||||
Let $n=|X|$.
|
||||
|
||||
Observe that
|
||||
|
||||
In the first round, the number of elements that are not yet covered is $n$.
|
||||
$$
|
||||
|U_0|=n
|
||||
$$
|
||||
|
||||
In the second round, the number of elements that are not yet covered is at most $|U_0|-x_1$, where $x_1=|S_1|$ is the number of elements covered by the set picked in the first round.
|
||||
|
||||
$$
|
||||
|U_1|=|U_0|-|S_1|
|
||||
$$
|
||||
|
||||
...
|
||||
|
||||
So $x_i\geq \frac{|U_{i-1}|}{k}$.
|
||||
|
||||
We proceed by contradiction.
|
||||
|
||||
Suppose every set in the optimal solution covers $< \frac{|U_0|}{k}$ of the uncovered elements. Then the $k$ optimal sets cover $< k\cdot\frac{|U_0|}{k}=|U_0|=n$ elements in total, so they could not be a cover: a contradiction.
|
||||
|
||||
_In every round, some set covers at least a $\frac{1}{k}$ fraction of the still-uncovered elements; otherwise the $k$ optimal sets could not cover them and the process could not terminate._
|
||||
|
||||
> Some math magics:
|
||||
> $$(1-\frac{1}{k})^k\leq \frac{1}{e}$$
|
||||
|
||||
So $n(1-\frac{1}{k})^{|C|-1}\geq1$, which gives $|C|\leq 1+k\ln n$.
|
||||
|
||||
So the size of the set cover found is at most $(1+\ln n)k$.
|
||||
|
||||
QED
|
||||
|
||||
So the greedy set cover is not too bad...
|
||||
|
||||
###### Second bound:
|
||||
|
||||
Greedy set cover is an $H_d$-approximation algorithm for set cover, where $d$ is the size of the largest set.
|
||||
|
||||
Proof:
|
||||
|
||||
Assign a cost to the elements of $X$ according to the decisions of the greedy set cover.
|
||||
|
||||
Let $\delta(S^i)$ be the number of new elements covered by $S^i$, the set picked at step $i$.
|
||||
|
||||
$$
|
||||
\delta(S^i)=|S_i\cap U_{i-1}|
|
||||
$$
|
||||
|
||||
If the element $x$ is first covered at step $i$, when set $S^i$ is picked, then the cost of $x$ is set to
|
||||
|
||||
$$
|
||||
\frac{1}{\delta(S^i)}=\frac{1}{x_i}
|
||||
$$
|
||||
|
||||
Example:
|
||||
|
||||
$$
|
||||
\begin{aligned}
|
||||
X&=\{A,B,C,D,E,F,G\}\\
|
||||
S_1&=\{A,C,E\}\\
|
||||
S_2&=\{B,C,F,G\}\\
|
||||
S_3&=\{B,D,F,G\}\\
|
||||
S_4&=\{D,G\}
|
||||
\end{aligned}
|
||||
$$
|
||||
|
||||
First we select $S_2$, then $cost(B)=cost(C)=cost(F)=cost(G)=\frac{1}{4}$.
|
||||
|
||||
Then we select $S_1$, then $cost(A)=cost(E)=\frac{1}{2}$.
|
||||
|
||||
Then we select $S_3$, then $cost(D)=1$.
|
||||
|
||||
If element $x$ was covered by greedy set cover due to the addition of set $S^i$ at step $i$, then the cost of $x$ is $\frac{1}{\delta(S^i)}$.
|
||||
|
||||
$$
|
||||
\textup{Total cost of GSC}=\sum_{x\in X}c(x)=\sum_{i=1}^{|C|}\sum_{x\textup{ covered at iteration }i}c(x)=\sum_{i=1}^{|C|}\delta(S^i)\frac{1}{\delta(S^i)}=|C|
|
||||
$$
|
||||
|
||||
Claim: Consider any set $S$ that is a subset of $X$. The cost paid by the greedy set cover for $S$ is at most $H_{|S|}$.
|
||||
|
||||
Suppose that the greedy set covers $S$ in order $x_1,x_2,\ldots,x_{|S|}$, where $\{x_1,x_2,\ldots,x_{|S|}\}=S$.
|
||||
|
||||
When GSC covers $x_j$, $\{x_j,x_{j+1},\ldots,x_{|S|}\}$ are not covered.
|
||||
|
||||
At this point, the GSC has the option of picking $S$
|
||||
|
||||
This implies that the $\delta(S)$ is at least $|S|-j+1$.
|
||||
|
||||
At that step, GSC actually picks the set $\hat{S}$ for which $\delta(\hat{S})$ is maximized ($\hat{S}$ may be $S$ or some other set that covers $x_j$).
|
||||
|
||||
So, $\delta(\hat{S})\geq \delta(S)\geq |S|-j+1$.
|
||||
|
||||
So the cost of $x_j$ is $\frac{1}{\delta(\hat{S})}\leq \frac{1}{\delta(S)}\leq \frac{1}{|S|-j+1}$.
|
||||
|
||||
Summing over all $j$, the cost of $S$ is at most $\sum_{j=1}^{|S|}\frac{1}{|S|-j+1}=H_{|S|}$.
|
||||
|
||||
Back to the proof of approximation ratio:
|
||||
|
||||
Let $C^*$ be optimal set cover.
|
||||
|
||||
$$
|
||||
|C|=\sum_{x\in X}c(x)\leq \sum_{S_j\in C^*}\sum_{x\in S_j}c(x)
|
||||
$$
|
||||
|
||||
This inequality holds because an element covered by more than one set of $C^*$ is counted more than once on the right-hand side.
|
||||
|
||||
Since $\sum_{x\in S_j}c(x)\leq H_{|S_j|}$, by our claim.
|
||||
|
||||
Let $d$ be the largest cardinality of any set in $C^*$.
|
||||
|
||||
$$
|
||||
|C|\leq \sum_{S_j\in C^*}H_{|S_j|}\leq \sum_{S_j\in C^*}H_d=H_d|C^*|
|
||||
$$
|
||||
|
||||
So the approximation ratio for greedy set cover is $H_d$.
|
||||
|
||||
QED
|
||||
349
content/CSE347/CSE347_L9.md
Normal file
349
content/CSE347/CSE347_L9.md
Normal file
@@ -0,0 +1,349 @@
|
||||
# Lecture 9
|
||||
|
||||
## Randomized Algorithms
|
||||
|
||||
### Hashing
|
||||
|
||||
Hashing with chaining:
|
||||
|
||||
Input: We have integers from a universe $U=[0,N-1]$. We want to map them to a hash table $T$ with $m$ slots.
|
||||
|
||||
Hash function: $h:U\rightarrow [m]$
|
||||
|
||||
Goal: Hashing a set $S\subseteq U$, $|S|=n$ into $T$ such that the number of elements in each slot is at most $1$.
|
||||
|
||||
#### Collisions
|
||||
|
||||
When multiple keys are mapped to the same slot, we call it a collision, we keep a linked list of all the keys that map to the same slot.
|
||||
|
||||
**Runtime** of insert, query, delete of elements $=\Theta(\textup{length of the chain})$
|
||||
|
||||
**Worst-case** runtime of insert, query, delete of elements $=\Theta(n)$
|
||||
|
||||
Therefore, we want chains to be short, or $\Theta(1)$, as long as $|S|$ is reasonably sized, or equivalently, we want the number in any set $S$ to hash **uniformly** across all slots.
|
||||
|
||||
#### Simple Uniform Hashing Assumptions
|
||||
|
||||
The $n$ elements we want to hash (the set $S$) is picked uniformly at random from $U$. Therefore, we could see that this simple hash function works fine:
|
||||
|
||||
$$
|
||||
h(x)=x\mod m
|
||||
$$
|
||||
|
||||
Question: What happens if an adversary knows this function and designs $S$ to make the worst-case runtime happen?
|
||||
|
||||
Answer: The adversary can make the runtime of each operation $\Theta(n)$ by simply making all the elements hash to the same slot.
|
||||
|
||||
#### Randomization to the rescue
|
||||
|
||||
We don't want the adversary to know the hash function based on just looking at the code.
|
||||
|
||||
Ideas: Randomize the choice of the hash function.
|
||||
|
||||
### Randomized Algorithm
|
||||
|
||||
#### Definition
|
||||
|
||||
A randomized algorithm is an algorithm that makes internal random choices.
|
||||
|
||||
2 kinds of randomized algorithms:
|
||||
|
||||
1. Las Vegas: The runtime is random, but the output is always correct.
|
||||
2. Monte Carlo: The runtime is fixed, but the output is sometimes incorrect.
|
||||
|
||||
We will focus on Las Vegas algorithms in this course.
|
||||
|
||||
We analyze the **expected** runtime $E[T(n)]$, or some other probabilistic quantity.
|
||||
|
||||
#### Randomization can help
|
||||
|
||||
Ideas: Randomize the choice of hash function $h$ from a family of hash functions, $H$.
|
||||
|
||||
If we randomly pick a hash function from this family, then the probability that the hash function is bad on **any particular** set $S$ is small.
|
||||
|
||||
Intuitively, the adversary can not pick a bad input since most hash functions are good for any particular input $S$.
|
||||
|
||||
#### Universal Hashing: Goal
|
||||
|
||||
We want to design a universal family of hash functions, $H$, such that the probability that the hash table behaves badly on any input $S$ is small.
|
||||
|
||||
#### Universal Hashing: Definition
|
||||
|
||||
Suppose we have $m$ buckets in the hash table. We also have $2$ inputs $x\neq y$ and $x,y\in U$. We want $x$ and $y$ to be unlikely to hash to the same bucket.
|
||||
|
||||
$H$ is a universal **family** of hash functions if for any two elements $x\neq y$,
|
||||
|
||||
$$
|
||||
Pr_{h\in H}[h(x)=h(y)]=\frac{1}{m}
|
||||
$$
|
||||
|
||||
where $h$ is picked uniformly at random from the family $H$.
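The notes leave the family abstract; one standard example (given here only as an illustration) is the family $h_{a,b}(x)=((ax+b)\bmod p)\bmod m$ with a fixed prime $p$ larger than every key and random $a,b$:

```python
import random

def make_universal_hash(m, p=2_147_483_647):
    # p: a prime larger than any key (2^31 - 1 here); (a, b) are the random choices
    a = random.randrange(1, p)
    b = random.randrange(0, p)
    return lambda x: ((a * x + b) % p) % m
```

Picking $a,b$ at random corresponds to picking $h$ at random from $H$; for two distinct keys the collision probability is about $\frac{1}{m}$.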
|
||||
|
||||
#### Universal Hashing: Analysis
|
||||
|
||||
Claim: If we choose $h$ randomly from a universal family of hash functions, $H$, then the hash table will exhibit good behavior on any set $S$ of size $n$ with high probability.
|
||||
|
||||
Question: What are some good properties and what does it mean by with high probability?
|
||||
|
||||
Claim: Given a universal family of hash functions, $H$, $S=\{a_1,a_2,\cdots,a_n\}\subset \mathbb{N}$. For any probability $0\leq \delta\leq 1$, if $n\leq \sqrt{2m\delta}$, the chance that no two keys hash to the same slot is $\geq1-\delta$.
|
||||
|
||||
Example: If we pick $\delta=\frac{1}{2}$: as long as $n\leq\sqrt{m}$, the chance that no two keys hash to the same slot is $\geq\frac{1}{2}$.
|
||||
|
||||
If we pick $\delta=\frac{1}{3}$: as long as $n\leq\sqrt{\frac{2}{3}m}$, the chance that no two keys hash to the same slot is $\geq\frac{2}{3}$.
|
||||
|
||||
Proof Strategy:
|
||||
|
||||
1. Compute the **expected value** of collisions. Note that collisions occurs when two different values are hashed to the same slot. (Indicator random variables)
|
||||
2. Apply a "tail" bound that converts the expected value to probability. (Markov's inequality)
|
||||
|
||||
##### Compute the expected number of collisions
|
||||
|
||||
Let $m$ be the size of the hash table. $n$ is the number of keys in the set $S$. $N$ is the size of the universe.
|
||||
|
||||
For inputs $x,y\in S,x\neq y$, we define a random variable
|
||||
|
||||
$$
|
||||
C_{xy}=
|
||||
\begin{cases}
|
||||
1 & \text{if } h(x)=h(y) \\
|
||||
0 & \text{otherwise}
|
||||
\end{cases}
|
||||
$$
|
||||
|
||||
$C_{xy}$ is called an indicator random variable, that takes value $0$ or $1$.
|
||||
|
||||
The expected number of collisions is
|
||||
|
||||
$$
|
||||
E[C_{xy}]=1\times Pr[C_{xy}=1]+0\times Pr[C_{xy}=0]=Pr[C_{xy}=1]=\frac{1}{m}
|
||||
$$
|
||||
|
||||
Define $C_x$: random variable that represents the cost of inserting/searching/deleting $x$ from the hash table.
|
||||
|
||||
$C_x\leq$ total number of elements that collide with $x$ (= number of elements $y$ such that $h(x)=h(y)$).
|
||||
|
||||
$$
|
||||
C_x=\sum_{y\in S,y\neq x,h(x)=h(y)}1
|
||||
$$
|
||||
|
||||
So, $C_x=\sum_{y\in S,y\neq x}C_{xy}$.
|
||||
|
||||
By linearity of expectation,
|
||||
|
||||
$$
|
||||
E[C_x]=\sum_{y\in S,y\neq x}E[C_{xy}]=\sum_{y\in S,y\neq x}\frac{1}{m}=\frac{n-1}{m}
|
||||
$$
|
||||
|
||||
$E[C_x]=\Theta(1)$ if $n=O(m)$. The total expected cost of $k$ insert/search operations is $O(k)$, by linearity of expectation.
|
||||
|
||||
Say $C$ is the total number of collisions.
|
||||
|
||||
$C=\frac{\sum_{x\in S}C_x}{2}$ because each collision is counted twice.
|
||||
|
||||
$$
|
||||
E[C]=\frac{1}{2}\sum_{x\in S}E[C_x]=\frac{1}{2}\sum_{x\in S}\frac{n-1}{m}=\frac{n(n-1)}{2m}
|
||||
$$
|
||||
|
||||
If we want $E[C]\leq \delta$, then it suffices to have $n\leq\sqrt{2m\delta}$ (using $n(n-1)\leq n^2$).
|
||||
|
||||
#### The probability of no collisions $C=0$
|
||||
|
||||
We know that the expected value of number of collisions is now $\leq \delta$, but what about the probability of **NO** collisions?
|
||||
|
||||
> Markov's inequality: $$P[X\geq k]\leq\frac{E[X]}{k}$$
|
||||
> For non-negative random variable $X$, $Pr[X\geq k\cdot E[X]]\leq \frac{1}{k}$.
|
||||
|
||||
Use Markov's inequality: For non-negative random variable $X$, $Pr[X\geq k\cdot E[X]]\leq \frac{1}{k}$.
|
||||
|
||||
Apply this to $C$:
|
||||
|
||||
$$
|
||||
Pr[C\geq \frac{1}{\delta}E[C]]<\delta\Rightarrow Pr[C\geq 1]<\delta
|
||||
$$
|
||||
|
||||
So, if $n\leq\sqrt{2m\delta}$, then with probability at least $1-\delta$ you will have no collisions, i.e., $Pr[C=0]\geq 1-\delta$.
|
||||
|
||||
#### More general conclusion
|
||||
|
||||
Claim: For a universal hash function family $H$, if number of keys $n\leq \sqrt{Bm\delta}$, then the probability that at most $B+1$ keys hash to the same slot is $> 1-\delta$.
|
||||
|
||||
### Example: Quicksort
|
||||
|
||||
Based on partitioning [assume all elements are distinct]: Partition($A[p\cdots r]$)
|
||||
|
||||
- Rearranges $A$ into $A[p\cdots q-1],A[q],A[q+1\cdots r]$
|
||||
|
||||
Runtime: $O(r-p)$, linear time.
|
||||
|
||||
```python
|
||||
def partition(A,p,r):
|
||||
x=A[r]
|
||||
lo=p
|
||||
for i in range(p,r):
|
||||
if A[i]<x:
|
||||
A[lo],A[i]=A[i],A[lo]
|
||||
lo+=1
|
||||
A[lo],A[r]=A[r],A[lo]
|
||||
return lo
|
||||
|
||||
def quicksort(A,p,r):
|
||||
if p<r:
|
||||
q=partition(A,p,r)
|
||||
quicksort(A,p,q-1)
|
||||
quicksort(A,q+1,r)
|
||||
```
|
||||
|
||||
#### Runtime analysis
|
||||
|
||||
Let the number of element in $A_{low}$ be $k$.
|
||||
|
||||
$$
|
||||
T(n)=\Theta(n)+T(k)+T(n-k-1)
|
||||
$$
|
||||
|
||||
By even split assumption, $k=\frac{n}{2}$.
|
||||
|
||||
$$
|
||||
T(n)=T(\frac{n}{2})+T(\frac{n}{2}-1)+\Theta(n)\approx \Theta(n\log n)
|
||||
$$
|
||||
|
||||
Which is approximately the same as merge sort.
|
||||
|
||||
_Average case analysis is always suspicious._
|
||||
|
||||
### Randomized Quicksort
|
||||
|
||||
- Pick a random pivot element.
|
||||
- Analyze the expected runtime. over the random choices of pivot.
|
||||
|
||||
```python
|
||||
|
||||
import random

def randomized_partition(A, p, r):
    ix = random.randint(p, r)       # pick the pivot position uniformly at random
    x = A[ix]
    A[r], A[ix] = A[ix], A[r]       # move the pivot to the end
    lo = p
    for i in range(p, r):
        if A[i] < x:
            A[lo], A[i] = A[i], A[lo]
            lo += 1
    A[lo], A[r] = A[r], A[lo]       # place the pivot between the two parts
    return lo

def randomized_quicksort(A, p, r):
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)
|
||||
```
|
||||
|
||||
$$
E[T(n)]=E[T(n-k-1)+T(k)+cn]=E[T(n-k-1)]+E[T(k)]+cn
$$

by linearity of expectation.

$$
Pr[\textup{pivot has rank }k]=\frac{1}{n}
$$

So,

$$
\begin{aligned}
E[T(n)]&=cn+\frac{1}{n}\sum_{k=0}^{n-1}\left(E[T(k)]+E[T(n-k-1)]\right)\\
&=cn+\frac{1}{n}\sum_{j=0}^{n-1}E[T(j)]+\frac{1}{n}\sum_{j=0}^{n-1}E[T(j)]\\
&=cn+\frac{2}{n}\sum_{j=0}^{n-1}E[T(j)]
\end{aligned}
$$

(as $k$ ranges over $0,\dots,n-1$, both $k$ and $n-k-1$ take each value $j\in\{0,\dots,n-1\}$ exactly once).

Claim: the solution to this recurrence is $E[T(n)]=O(n\log n)$; concretely, $E[T(n)]\leq c'n\log n+1$ for a suitable constant $c'$.

Proof:

We proceed by induction, writing $T(n)$ for $E[T(n)]$ below.

Base case: $n=1$, $T(1)=c$.

Inductive step: assume that $T(k)\leq c'k\log k+1$ for all $k<n$.

Then,

$$
\begin{aligned}
T(n)&=cn+\frac{2}{n}\sum_{k=0}^{n-1}T(k)\\
&\leq cn+\frac{2}{n}\sum_{k=0}^{n-1}(c'k\log k+1)\\
&=cn+\frac{2c'}{n}\sum_{k=0}^{n-1}k\log k+\frac{2}{n}\sum_{k=0}^{n-1}1
\end{aligned}
$$

Then we use the fact that $\sum_{k=0}^{n-1}k\log k\leq \frac{n^2\log n}{2}-\frac{n^2}{8}$ (which can be proved by induction).
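One quick way to see this bound (a sketch of the standard splitting argument, offered as an alternative to the induction mentioned above; logs are base 2, and the $k=0$ term contributes nothing): split the sum at $\lceil n/2\rceil$ and use $\log k\leq\log n-1$ for $k<\frac{n}{2}$:

$$
\begin{aligned}
\sum_{k=0}^{n-1}k\log k&=\sum_{k=1}^{\lceil n/2\rceil-1}k\log k+\sum_{k=\lceil n/2\rceil}^{n-1}k\log k\\
&\leq(\log n-1)\sum_{k=1}^{\lceil n/2\rceil-1}k+\log n\sum_{k=\lceil n/2\rceil}^{n-1}k\\
&=\log n\sum_{k=1}^{n-1}k-\sum_{k=1}^{\lceil n/2\rceil-1}k\\
&\leq\frac{n^2\log n}{2}-\frac{n^2}{8}\qquad\text{for }n\geq 2.
\end{aligned}
$$

The last step uses $\sum_{k=1}^{n-1}k\leq\frac{n^2}{2}$, $\sum_{k=1}^{\lceil n/2\rceil-1}k\geq\frac{n^2}{8}-\frac{n}{4}$, and $\frac{n}{4}\leq\frac{n\log n}{2}$ for $n\geq 2$.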
$$
\begin{aligned}
T(n)&\leq cn+\frac{2c'}{n}\left(\frac{n^2\log n}{2}-\frac{n^2}{8}\right)+\frac{2}{n}n\\
&=c'n\log n-\frac{1}{4}c'n+cn+2\\
&=(c'n\log n+1)-\left(\frac{1}{4}c'n-cn-1\right)
\end{aligned}
$$

We need $\frac{1}{4}c'n-cn-1\geq 0$, so choose $c'$ large enough that $\frac{1}{4}c'n\geq cn+1$ for all $n\geq 2$.

If $c'\geq 8c$ (and $c\geq 1$, which we may assume), then $T(n)\leq c'n\log n+1$.

Therefore $E[T(n)]\leq c'n\log n+1=O(n\log n)$.

QED
A more elegant proof:

Let $X_{ij}$ be an indicator random variable that is $1$ if the element of rank $i$ is ever compared to the element of rank $j$.

The running time is dominated by the total number of comparisons $$X=\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}X_{ij}$$

The expectation of a single indicator is

$$
E[X_{ij}]=Pr[X_{ij}=1]\times 1+Pr[X_{ij}=0]\times 0=Pr[X_{ij}=1]
$$

Summing over all pairs gives the expected number of comparisons made by randomized quicksort:

$$
\begin{aligned}
E[X]&=E\left[\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}X_{ij}\right]\\
&=\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}E[X_{ij}]\\
&=\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}Pr[X_{ij}=1]
\end{aligned}
$$

For any two elements $z_i,z_j\in S$ with $i<j$, $z_i$ is compared to $z_j$ exactly when either $z_i$ or $z_j$ is the first element, among those of rank $i$ through $j$, to be picked as a pivot. So

$$
\begin{aligned}
Pr[X_{ij}=1]&=Pr[z_i\text{ is picked first}]+Pr[z_j\text{ is picked first}]\\
&=\frac{1}{j-i+1}+\frac{1}{j-i+1}\\
&=\frac{2}{j-i+1}
\end{aligned}
$$

So, using the harmonic number $H_n=\sum_{k=1}^{n}\frac{1}{k}=O(\log n)$,

$$
\begin{aligned}
E[X]&=\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}\frac{2}{j-i+1}\\
&\leq 2\sum_{i=0}^{n-2}\sum_{k=1}^{n-i-1}\frac{1}{k}\\
&\leq 2\sum_{i=0}^{n-2}c\log(n)\\
&=2c\log(n)\sum_{i=0}^{n-2}1\\
&=O(n\log n)
\end{aligned}
$$

QED
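A small experiment that counts comparisons and checks them against $\sum_{i<j}\frac{2}{j-i+1}$ (a sketch; the instrumented functions below are my own variant of the code above, with a global counter added, and the sizes are arbitrary):

```python
import math
import random

comparisons = 0

def instrumented_partition(A, p, r):
    """randomized_partition with a counter on element-vs-pivot comparisons."""
    global comparisons
    ix = random.randint(p, r)
    A[r], A[ix] = A[ix], A[r]
    x, lo = A[r], p
    for i in range(p, r):
        comparisons += 1
        if A[i] < x:
            A[lo], A[i] = A[i], A[lo]
            lo += 1
    A[lo], A[r] = A[r], A[lo]
    return lo

def instrumented_quicksort(A, p, r):
    if p < r:
        q = instrumented_partition(A, p, r)
        instrumented_quicksort(A, p, q - 1)
        instrumented_quicksort(A, q + 1, r)

if __name__ == "__main__":
    n, trials, total = 2000, 50, 0
    for _ in range(trials):
        comparisons = 0
        A = list(range(n))
        random.shuffle(A)
        instrumented_quicksort(A, 0, n - 1)
        total += comparisons
    # E[X] = sum over pairs of 2/(j-i+1), grouped by the gap d = j-i
    expected = sum((n - d) * 2.0 / (d + 1) for d in range(1, n))
    print("average comparisons :", total / trials)
    print("sum of 2/(j-i+1)    :", round(expected, 1))
    print("2 n ln n (for scale):", round(2 * n * math.log(n), 1))
```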
34
content/CSE347/Exam_reviews/CSE347_E1.md
Normal file
@@ -0,0 +1,34 @@
# Exam 1 review

## Greedy

A greedy algorithm is one that applies the same choice rule at each step, over and over, until no more choices can be made.

- Stating and Proving a Greedy Algorithm
  - State your algorithm (“at this step, make this choice”)
  - Greedy Choice Property (Exchange Argument)
  - Inductive Structure
  - Optimal Substructure
  - "Simple Induction"
  - Asymptotic Runtime
## Divide and conquer

Stating and Proving a Divide and Conquer Algorithm

- Describe the divide, conquer, and combine steps of your algorithm.
  - The combine step is the most important part of a divide and conquer algorithm; in your recurrence this step is the "$f(n)$", or work done at each subproblem level. You need to show that you can combine the results of your subproblems somehow to get the solution for the entire problem.
- Provide and prove a base case (when you can divide no longer).
- Prove your induction step: suppose subproblems (two problems of size $n/2$, usually) of the same kind are solved optimally. Then, because of the combine step, the overall problem (of size $n$) will be solved optimally.
- Provide a recurrence and solve for its runtime (Master Method); see the worked example after this list.
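As a quick worked example of the last step (merge sort's recurrence, included here only for illustration):

$$
T(n)=2T\!\left(\frac{n}{2}\right)+\Theta(n),\qquad a=2,\ b=2,\ f(n)=\Theta(n)=\Theta(n^{\log_b a}),\qquad\text{so }T(n)=\Theta(n\log n)
$$

(case 2 of the Master Method, since $f(n)$ matches $n^{\log_b a}$ exactly).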
## Maximum Flow

Given a weighted directed graph with a source and a sink node, the goal is to see how much "flow" you can push from the source to the sink simultaneously.

The maximum flow can be found with the Ford-Fulkerson Algorithm. Runtime (from lecture slides): $O(F(|V|+|E|))$, where $F$ is the value of the maximum flow.

Fattest Path improvement: $O(\log|V|(|V|+|E|))$.

Min Cut-Max Flow: the maximum flow from source $s$ to sink $t$ is equal to the minimum total capacity of an $s$-$t$ cut.

A cut is a partition of a graph into two disjoint sets obtained by removing the edges connecting the two parts. An $s$-$t$ cut puts $s$ and $t$ into different sets.
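A minimal Ford-Fulkerson sketch (DFS-based augmenting paths on an adjacency-matrix residual graph; this is my own illustrative version, not the lecture's code, and it assumes integer capacities):

```python
def ford_fulkerson(capacity, s, t):
    """capacity: n x n matrix of nonnegative integer edge capacities.
    Returns the value of a maximum s-t flow."""
    n = len(capacity)
    residual = [row[:] for row in capacity]   # residual capacities

    def dfs(u, bottleneck, visited):
        """Find an augmenting path u -> t; return the flow pushed along it."""
        if u == t:
            return bottleneck
        visited.add(u)
        for v in range(n):
            if v not in visited and residual[u][v] > 0:
                pushed = dfs(v, min(bottleneck, residual[u][v]), visited)
                if pushed > 0:
                    residual[u][v] -= pushed
                    residual[v][u] += pushed   # allow "undoing" flow later
                    return pushed
        return 0

    max_flow = 0
    while True:
        pushed = dfs(s, float("inf"), set())
        if pushed == 0:            # no augmenting path left: flow is maximum
            return max_flow
        max_flow += pushed

if __name__ == "__main__":
    # Hypothetical capacities on 4 nodes; the max flow from 0 to 3 is 3.
    example = [
        [0, 2, 2, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 2],
        [0, 0, 0, 0],
    ]
    print(ford_fulkerson(example, 0, 3))
```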
139
content/CSE347/Exam_reviews/CSE347_E2.md
Normal file
@@ -0,0 +1,139 @@
# Exam 2 Review

## Reductions

We say that a problem $A$ reduces to a problem $B$ (written $A \leq B$) if there is a **polynomial time** reduction function $f$ such that for all $x$, $x \in A \iff f(x) \in B$.

To prove a reduction, we need to show that the reduction function $f$:

1. runs in polynomial time, and
2. satisfies $x \in A \iff f(x) \in B$.

### Useful results from reductions

1. $B$ is at least as hard as $A$ if $A \leq B$.
2. If we can solve $B$ in polynomial time, then we can solve $A$ in polynomial time.
3. If we want to solve problem $A$, and we already know an efficient algorithm for $B$, then we can use the reduction $A \leq B$ to solve $A$ efficiently.
4. If we want to show that $B$ is NP-hard, we can do this by showing that $A \leq B$ for some known NP-hard problem $A$.

$P$ is the class of problems that can be solved in polynomial time. $NP$ is the class of problems whose solutions can be verified in polynomial time.

We know that $P \subseteq NP$.
### NP-complete problems

A problem is NP-complete if it is in $NP$ and it is also NP-hard.

#### NP

A problem is in $NP$ if

1. there is a polynomial size certificate for the problem, and
2. there is a polynomial time verifier for the problem that takes the certificate and checks whether it is a valid solution.

#### NP-hard

A problem is NP-hard if every problem in $NP$ can be reduced to it in polynomial time.

List of known NP-hard problems:

1. 3-SAT (or SAT):
   - Statement: Given a boolean formula in CNF with at most 3 literals per clause, is there an assignment of truth values to the variables that makes the formula true?
2. Independent Set:
   - Statement: Given a graph $G$ and an integer $k$, does $G$ contain a set of $k$ vertices such that no two vertices in the set are adjacent?
3. Vertex Cover:
   - Statement: Given a graph $G$ and an integer $k$, does $G$ contain a set of $k$ vertices such that every edge in $G$ is incident to at least one vertex in the set?
4. 3-coloring:
   - Statement: Given a graph $G$, can each vertex be assigned one of 3 colors such that no two adjacent vertices have the same color?
5. Hamiltonian Cycle:
   - Statement: Given a graph $G$, does $G$ contain a cycle that visits every vertex exactly once?
6. Hamiltonian Path:
   - Statement: Given a graph $G$, does $G$ contain a path that visits every vertex exactly once?
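As a concrete illustration of a reduction function between two problems on this list (my own example, using the standard fact that $S$ is an independent set of $G$ iff $V \setminus S$ is a vertex cover):

```python
def independent_set_to_vertex_cover(instance):
    """Reduction function f: Independent-Set -> Vertex-Cover.
    instance = (vertices, edges, k); output = (vertices, edges, n - k).
    G has an independent set of size k  iff  G has a vertex cover of size n - k,
    and the transformation clearly runs in polynomial time."""
    vertices, edges, k = instance
    return (vertices, edges, len(vertices) - k)

if __name__ == "__main__":
    # Hypothetical instance: a triangle has an independent set of size 1
    # and, correspondingly, a vertex cover of size 2.
    triangle = ({1, 2, 3}, {(1, 2), (2, 3), (1, 3)}, 1)
    print(independent_set_to_vertex_cover(triangle))
```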
## Approximation Algorithms

- Consider optimization problems whose decision problem variant is NP-hard. Unless P=NP, finding an optimal solution to these problems cannot be done in polynomial time.
- In approximation algorithms, we make a trade-off: we're willing to accept sub-optimal solutions in exchange for polynomial runtime.
- The Approximation Ratio of our algorithm is the worst-case ratio of our solution to the optimal solution.
  - For minimization problems, this ratio is $$\max_{l\in L}\left(\frac{c_A(l)}{c_{OPT}(l)}\right)$$ since our solution will be larger than OPT.
  - For maximization problems, this ratio is $$\min_{l\in L}\left(\frac{c_{OPT}(l)}{c_A(l)}\right)$$ since our solution will be smaller than OPT.
- If you are given an algorithm and need to show it has some desired approximation ratio, there are a few approaches.
  - In recitation, we saw Max-Subset Sum. We found upper bounds on the optimal solution and showed that the given algorithm would always give a solution with value at least half of the upper bound, giving our approximation ratio of 2.
  - In lecture, you saw the Vertex Cover 2-approximation: select any uncovered edge $(u, v)$ and add both $u$ and $v$ to the cover. We argued that at least one of $u$ or $v$ must be in the optimal cover, as the edge must be covered, so at every step we added at least one vertex from an optimal solution, and potentially one extra. So, the size of our cover cannot be more than twice the optimal. (A code sketch follows this list.)
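A sketch of that 2-approximation (my own illustrative code; it takes both endpoints of an arbitrary uncovered edge each round):

```python
def vertex_cover_2_approx(edges):
    """Greedy 2-approximation for Vertex Cover.
    edges: iterable of (u, v) pairs. Returns a set of vertices covering every edge."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge still uncovered
            cover.add(u)                        # take BOTH endpoints
            cover.add(v)
    return cover

if __name__ == "__main__":
    # Hypothetical graph: the path 1-2-3-4. Optimal cover {2, 3} has size 2;
    # the approximation returns {1, 2, 3, 4}, size 4 = 2 * OPT.
    print(vertex_cover_2_approx([(1, 2), (2, 3), (3, 4)]))
```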
## Randomized Algorithms

Sometimes, we can get better expected performance from an algorithm by introducing randomness.

We trade the _guaranteed_ runtime and solution quality of a deterministic algorithm for _expected_ runtime and solution quality from a randomized algorithm.

We can use various bounds and tricks to calculate and amplify the probability of success.

### Chernoff Bound

Statement:

$$
Pr[X < (1-\delta)E[X]] \leq e^{-\frac{\delta^2 E[X]}{2}}
$$

Requirements:

- $X$ is the sum of $n$ independent random variables.
- You used the Chernoff bound to bound the probability of getting less than $d$ good partitions, since the probability of each partition being good is independent – the quality of one partition does not affect the quality of the next.
- Usage: If you have some probability $Pr[X < \text{something}]$ that you want to bound, find $E[X]$, then find a value of $\delta$ such that $(1-\delta)E[X] = \text{something}$. You can then plug $\delta$ and $E[X]$ into the Chernoff bound.
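A quick worked example of that usage (the numbers are hypothetical): suppose $X$ is a sum of independent indicators with $E[X]=100$ and we want to bound $Pr[X<50]$. Setting $(1-\delta)\cdot 100=50$ gives $\delta=\frac{1}{2}$, so

$$
Pr[X<50]\leq e^{-\frac{(1/2)^2\cdot 100}{2}}=e^{-12.5}\approx 3.7\times 10^{-6}
$$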
### Markov's Inequality

Statement:

$$
Pr[X \geq a] \leq \frac{E[X]}{a}
$$

Requirements:

- $X$ is a non-negative random variable.
- No assumptions about independence are needed.
- Usage: If you have some probability $Pr[X \geq \text{something}]$ that you want to bound, find $E[X]$, then set $a = \text{something}$. You can then plug $a$ and $E[X]$ into Markov's inequality.
### Union Bound

Statement:

$$
Pr[\bigcup_{i=1}^n e_i] \leq \sum_{i=1}^n Pr[e_i]
$$

- Conceptually, it says that the probability that at least one event out of a collection occurs is no more than the sum of the probabilities of the individual events.
- Usage: To bound some bad event $e$ that happens whenever any of the bad events $e_i$ happens, sum up the probabilities of the $e_i$ and use that sum to bound $Pr[e]$.
#### Probabilistic Boosting via Repeated Trials

- If we want to reduce the probability of some bad event $e$ to some value $\delta$, we can run the algorithm repeatedly and take a majority vote over the decisions.
- Assume we run the algorithm $k$ times, and each run succeeds with probability $\frac{1}{2} + \epsilon$.
- The probability that all $k$ trials fail is at most $(1-\epsilon)^k$.
- The majority vote of $k$ runs is wrong exactly when more than $\frac{k}{2}$ of the trials fail.
- So, the probability is

$$
\begin{aligned}
Pr[\text{majority fails}] &\leq\sum_{i=\lceil k/2\rceil}^{k}\binom{k}{i}\left(\frac{1}{2}-\epsilon\right)^i\left(\frac{1}{2}+\epsilon\right)^{k-i}\\
&\leq \sum_{i=\lceil k/2\rceil}^{k}\binom{k}{i}\left(\frac{1}{4}-\epsilon^2\right)^{k/2}\\
&\leq 2^k\left(\frac{1}{4}-\epsilon^2\right)^{k/2}=\left(1-4\epsilon^2\right)^{k/2}
\end{aligned}
$$

(the second line uses $\left(\frac{1}{2}-\epsilon\right)^i\left(\frac{1}{2}+\epsilon\right)^{k-i}\leq\left(\frac{1}{4}-\epsilon^2\right)^{k/2}$ for $i\geq\frac{k}{2}$, and the third uses $\sum_i\binom{k}{i}=2^k$).

- If we want this failure probability to be at most $\delta$, we solve for $k$ in the inequality $\left(1-4\epsilon^2\right)^{k/2} \leq \delta$.
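A small simulation of the boosting idea (a sketch; the per-run success probability $0.6$, i.e. $\epsilon=0.1$, and the trial counts are my own choices):

```python
import random

def majority_of_k_correct(k, p_success=0.6, trials=20000):
    """Estimate the probability that a majority vote over k independent runs
    (each correct with probability p_success) gives the right answer."""
    wins = 0
    for _ in range(trials):
        correct_runs = sum(random.random() < p_success for _ in range(k))
        if correct_runs > k / 2:
            wins += 1
    return wins / trials

if __name__ == "__main__":
    for k in (1, 11, 51, 101):
        print(k, majority_of_k_correct(k))   # should climb toward 1 as k grows
```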
## Online Algorithms

- We make decisions on the fly, without knowing the future.
- The _offline optimum_ is the optimal solution that knows the future.
- The _competitive ratio_ of an online algorithm is the worst-case ratio of the cost of the online algorithm to the cost of the offline optimum. (When the offline problem is NP-complete, an online algorithm for the problem is also an approximation algorithm.) $$\text{Competitive Ratio} = \frac{C_{online}}{C_{offline}}$$
- We do case-by-case analysis to show that the competitive ratio is at most some value, just like approximation ratio proofs.
18
content/CSE347/_meta.js
Normal file
@@ -0,0 +1,18 @@
export default {
  //index: "Course Description",
  "---": {
    type: 'separator'
  },
  Exam_reviews: "Exam reviews",
  CSE347_L1: "Analysis of Algorithms (Lecture 1)",
  CSE347_L2: "Analysis of Algorithms (Lecture 2)",
  CSE347_L3: "Analysis of Algorithms (Lecture 3)",
  CSE347_L4: "Analysis of Algorithms (Lecture 4)",
  CSE347_L5: "Analysis of Algorithms (Lecture 5)",
  CSE347_L6: "Analysis of Algorithms (Lecture 6)",
  CSE347_L7: "Analysis of Algorithms (Lecture 7)",
  CSE347_L8: "Analysis of Algorithms (Lecture 8)",
  CSE347_L9: "Analysis of Algorithms (Lecture 9)",
  CSE347_L10: "Analysis of Algorithms (Lecture 10)",
  CSE347_L11: "Analysis of Algorithms (Lecture 11)"
}
21
content/CSE347/index.md
Normal file
@@ -0,0 +1,21 @@
# CSE 347

This is a course about fancy algorithms.

Topics include:

1. Greedy Algorithms
2. Dynamic Programming
3. Divide and Conquer
4. Maximum Flows
5. Reductions
6. NP-Complete Problems
7. Approximation Algorithms
8. Randomized Algorithms
9. Online Algorithms

It's hard if you don't know the tricks for solving LeetCode problems.

I had been doing LeetCode daily problems for almost two years when I got into the course.

It's relatively easy for me, but I do have a hard time getting every proof right.
Block a user