upgrade structures and migrate to nextra v4
This commit is contained in:
245
content/CSE347/CSE347_L1.md
Normal file
245
content/CSE347/CSE347_L1.md
Normal file
@@ -0,0 +1,245 @@
|
||||
# Lecture 1
|
||||
|
||||
## Greedy Algorithms
|
||||
|
||||
* Builds up a solution by making a series of small decisions that optimize some objective.
|
||||
* Make one irrevocable choice at a time, creating smaller and smaller sub-problems of the same kind as the original problem.
|
||||
* There are many potential greedy strategies and picking the right one can be challenging.
|
||||
|
||||
### A Scheduling Problem
|
||||
|
||||
You manage a giant space telescope.
|
||||
|
||||
* There are $n$ research projects that want to use it to make observations.
|
||||
* Only one project can use the telescope at a time.
|
||||
* Project $p_i$ needs the telescope starting at time $s_i$ and running for a length of time $t_i$.
|
||||
* Goal: schedule as many as possible
|
||||
|
||||
Formally
|
||||
|
||||
Input:
|
||||
|
||||
* Given a set $P$ of projects, $|P|=n$
|
||||
* Each request $p_i\in P$ occupies interval $[s_i,f_i)$, where $f_i=s_i+t_i$
|
||||
|
||||
Goal: Choose a subset $\Pi\sqsubseteq P$ such that
|
||||
|
||||
1. No two projects in $\Pi$ have overlapping intervals.
|
||||
2. The number of selected projects $|\Pi|$ is maximized.
|
||||
|
||||
#### Shortest Interval
|
||||
|
||||
Counter-example: `[1,10],[9,12],[11,20]`
|
||||
|
||||
#### Earliest start time
|
||||
|
||||
Counter-example: `[1,10],[2,3],[4,5]`
|
||||
|
||||
#### Fewest Conflicts
|
||||
|
||||
Counter-example: `[1,2],[1,4],[1,4],[3,6],[7,8],[5,8],[5,8]`
|
||||
|
||||
#### Earliest finish time
|
||||
|
||||
Correct... but why
|
||||
|
||||
#### Theorem of Greedy Strategy (Earliest Finishing Time)
|
||||
|
||||
Say this greedy strategy (Earliest Finishing Time) picks a set $\Pi$ of intervals, some other strategy picks a set $O$ of intervals.
|
||||
|
||||
Assume sorted by finishing time
|
||||
|
||||
* $\Pi=\{i_1,i_2,...,i_k\},|\Pi|=k$
|
||||
* $O=\{j_1,j_2,...,j_m\},|O|=m$
|
||||
|
||||
We want to show that $|\Pi|\geq|O|,k>m$
|
||||
|
||||
#### Lemma: For all $r<k,f_{i_r}\leq f_{j_r}$
|
||||
|
||||
We proceed the proof by induction.
|
||||
|
||||
* Base Case, when r=1.
|
||||
$\Pi$ is the earliest finish time, and $O$ cannot pick a interval with earlier finish time, so $f_{i_r}\leq f_{j_r}$
|
||||
|
||||
* Inductive step, when r>1.
|
||||
Since $\Pi_r$ is the earliest finish time, so for any set in $O_r$, $f_{i_{r-1}}\leq f_{j_{r-1}}$, for any $j_r$ inserted to $O_r$, it can also be inserted to $\Pi_r$. So $O_r$ cannot pick an interval with earlier finish time than $Pi$ since it will also be picked by definition if $O_r$ is the optimal solution $OPT$.
|
||||
|
||||
#### Problem of “Greedy Stays Ahead” Proof
|
||||
|
||||
* Every problem has very different theorem.
|
||||
* It can be challenging to even write down the correct statement that you must prove.
|
||||
* We want a systematic approach to prove the correctness of greedy algorithms.
|
||||
|
||||
### Road Map to Prove Greedy Algorithm
|
||||
|
||||
#### 1. Make a Choice
|
||||
|
||||
Pick an interval based on greedy choice, say $q$
|
||||
|
||||
Proof: **Greedy Choice Property**: Show that using our first choice is not "fatal" – at least one optimal solution makes this choice.
|
||||
|
||||
Techniques: **Exchange Argument**: "If an optimal solution does not choose $q$, we can turn it into an equally good solution that does."
|
||||
|
||||
Let $\Pi^*$ be any optimal solution for project set $P$.
|
||||
- If $q\in \Pi^*$, we are done.
|
||||
- Otherwise, let $x$ be the optimal solution from $\Pi^*$ that does not pick $q$. We create another solution $\bar{\Pi^*}$ that replace $x$ with $q$, and prove that the $\bar{\Pi^*}$ is as optimal as $\Pi^*$
|
||||
|
||||
#### 2. Create a smaller instance $P'$ of the original problem
|
||||
|
||||
$P'$ has the same optimization criteria.
|
||||
|
||||
Proof: **Inductive Structure**: Show that after making the first choice, we're left with a smaller version of the same problem, whose solution we can safely combine with the first choice.
|
||||
|
||||
Let $P'$ be the subproblem left after making first choice $q$ in problem $P$ and let $\Pi'$ be an optimal solution to $P'$. Then $\Pi=\Pi^*\cup\{q\}$ is an optimal solution to $P$.
|
||||
|
||||
$P'=P-\{q\}-\{$projects conflicting with $q\}$
|
||||
|
||||
#### 3. Solution: Union of choices that we made
|
||||
|
||||
Union of choices that we made.
|
||||
|
||||
Proof: **Optimal Substructure**: Show that if we solve the subproblem optimally, adding our first choice creates an optimal solution to the *whole* problem.
|
||||
|
||||
Let $q$ be the first choice, $P'$ be the subproblem left after making $q$ in problem $P$, $\Pi'$ be an optimal solution to $P'$. We claim that $\Pi=\Pi'\cup \{q\}$ is an optimal solution to $P$.
|
||||
|
||||
We proceed the proof by contradiction.
|
||||
|
||||
Assume that $\Pi=\Pi'+\{q\}$ is not optimal.
|
||||
|
||||
|
||||
By Greedy choice property $GCP$. we already know that $\exists$ an optimal solution $\Pi^*$ for problem $P$ that contains $q$. If $\Pi$ is not optimal, $cost(\Pi^*)<cost(\Pi)$. Then since $\Pi^*-q$ is also a feasible solution to $P'$. $cost(\Pi^*-q)>cost(\Pi-q)=\Pi'$ which leads to contradiction that $\Pi'$ is an optimal solution to $P'$.
|
||||
|
||||
#### 4. Put 1-3 together to write an inductive proof of the Theorem
|
||||
|
||||
This is independent of problem, same for every problem.
|
||||
|
||||
Use scheduling problem as an example:
|
||||
|
||||
Theorem: given a scheduling problem $P$, if we repeatedly choose the remaining feasible project with the earliest finishing time, we will construct an optimal feasible solution to $P$.
|
||||
|
||||
Proof: We proceed by induction on $|P|$. (based on the size of problem $P$).
|
||||
|
||||
- Base case: $|P|=1$.
|
||||
- Inductive step.
|
||||
- Inductive hypothesis: For all problems of size $<n$, earliest finishing time (EFT) gives us an optimal solution.
|
||||
- EFT is optimal for problem of size $n$.
|
||||
- Proof: Once we pick q, because of greedy choice. $P'=P=\{q\} -\{$interval that conflict with $q\}$. $|P'|<n$, By Inductive hypothesis, EFT gives us an optimal solution to $P'$, but by inductive substructure, and optimal substructure. $\Pi'$ (optimal solution to $P'$), we have optimal solution to $P$.
|
||||
|
||||
_this step always holds as long as the previous three properties hold, and we don't usually write the whole proof._
|
||||
|
||||
```python
|
||||
# Algorithm construction for Interval scheduling problem
|
||||
def schedule(p):
|
||||
# sorting takes O(n)=nlogn
|
||||
p=sorted(p,key=lambda x:x[1])
|
||||
res=[P[0]]
|
||||
# O(n)=n
|
||||
for i in p[1:]:
|
||||
if res[-1][-1]<i[0]:
|
||||
res.append(i)
|
||||
return res
|
||||
```
|
||||
|
||||
## Extra Examples:
|
||||
|
||||
### File compression problem
|
||||
|
||||
You have $n$ files of different sizes $f_i$.
|
||||
|
||||
You want to merge them to create a single file. $merge(f_i,f_j)$ takes time $f_i+f_j$ and creates a file of size $f_k=f_i+f_j$.
|
||||
|
||||
Goal: Find the order of merges such that the total time to merge is minimized.
|
||||
|
||||
Thinking process: The merge process is a binary tree and each of the file is the leaf of the tree.
|
||||
|
||||
The total time required =$\sum^n_{i=1} d_if_i$, where $d_i$ is the depth of the file in the compression tree.
|
||||
|
||||
So compressing the smaller file first may yield a faster run time.
|
||||
|
||||
Proof:
|
||||
|
||||
#### Greedy Choice Property
|
||||
|
||||
Construct part of the solution by making a locally good decision.
|
||||
|
||||
Lemma: $\exist$ some optimal solution that merges the two smallest file first, lets say $[f_1,f_2]$
|
||||
|
||||
Proof: **Exchange argument**
|
||||
|
||||
* Case 1: Optimal choice already merges $f_1,f_2$, done. Time order does not matter in this problem at some point.
|
||||
* eg: [2,2,3], merge 2,3 and 2,2 first don't change the total cost
|
||||
* Case 2: Optimal choice does not merges $f_1$ and $f_2$.
|
||||
* Suppose the optimal solution merges $f_x,f_y$ as the deepest merge.
|
||||
* Then $d_x\geq d_1,d_y\geq d_2$. Exchanging $f_1,f_2$ with $f_x,f_y$ would yield a strictly less greater solution since $f_1,f_2$ already smallest.
|
||||
|
||||
#### Inductive Structure
|
||||
|
||||
* We can combine feasible solution to the subproblem $P'$ with the greedy choice to get a feasible solution to $P$
|
||||
* After making greedy choice $q$, we are left with a strictly smaller subproblem $P'$ with the same optimality criteria of the original problem
|
||||
*
|
||||
Proof: **Optimal Substructure**: Show that if we solve the subproblem optimally, adding our first choice creates an optimal solution to the *whole* problem.
|
||||
|
||||
Let $q$ be the first choice, $P'$ be the subproblem left after making $q$ in problem $P$, $\Pi^*$ be an optimal solution to $P'$. We claim that $\Pi=\Pi'\cup \{q\}$ is an optimal solution to $P$.
|
||||
|
||||
We proceed the proof by contradiction.
|
||||
|
||||
Assume that $\Pi=\Pi^*+\{q\}$ is not optimal.
|
||||
|
||||
By Greedy choice property $GCP$. we already know that $\Pi^*$ is optimal solution that contains $q$. Then $|\Pi^*|>|\Pi|$ $\Pi^*-q$ is also feasible solution to $P'$. $|\Pi^*-q|>|\Pi-q|=\Pi'$ which is an optimal solution to $P'$ which leads to contradiction.
|
||||
|
||||
Proof: **Smaller problem size**
|
||||
|
||||
After merging the smallest two files into one, we have strictly less files waiting to merge.
|
||||
|
||||
#### Optimal Substructure
|
||||
|
||||
* We can combine optimal solution to the subproblem $P'$ with the greedy choice to get a optimal solution to $P$
|
||||
|
||||
Step 4 ignored, same for all greedy problems.
|
||||
|
||||
### Conclusion: Greedy Algorithm
|
||||
|
||||
* Algorithm
|
||||
* Runtime Complexity
|
||||
* Proof
|
||||
* Greedy Choice Property
|
||||
* Construct part of the solution by making a locally good decision.
|
||||
* Inductive Structure
|
||||
* We can combine feasible solution to the subproblem $P'$ with the greedy choice to get a feasible solution to $P$
|
||||
* After making greedy choice $q$, we are left with a strictly smaller subproblem $P'$ with the same optimality criteria of the original problem
|
||||
* Optimal Substructure
|
||||
* We can combine optimal solution to the subproblem $P'$ with the greedy choice to get a optimal solution to $P$
|
||||
* Standard Contradiction Argument simplifies it
|
||||
|
||||
## Review:
|
||||
|
||||
### Essence of master method
|
||||
|
||||
Let $a\geq 1$ and $b>1$ be constants, let $f(n)$ be a function, and let $T(n)$ be defined on the nonnegative integers by the recurrence
|
||||
|
||||
$$
|
||||
T(n)=aT(\frac{n}{b})+f(n)
|
||||
$$
|
||||
|
||||
where we interpret $n/b$ to mean either ceiling or floor of $n/b$. $c_{crit}=\log_b a$ Then $T(n)$ has to following asymptotic bounds.
|
||||
|
||||
* Case I: if $f(n) = O(n^{c})$ ($f(n)$ "dominates" $n^{\log_b a-c}$) where $c<c_{crit}$, then $T(n) = \Theta(n^{c_{crit}})$
|
||||
|
||||
* Case II: if $f(n) = \Theta(n^{c_{crit}})$, ($f(n), n^{\log_b a-c}$ have no dominate) then $T(n) = \Theta(n^{\log_b a} \log_2 n)$
|
||||
|
||||
Extension for $f(n)=\Theta(n^{critical\_value}*(\log n)^k)$
|
||||
|
||||
* if $k>-1$
|
||||
|
||||
$T(n)=\Theta(n^{critical\_value}*(\log n)^{k+1})$
|
||||
|
||||
* if $k=-1$
|
||||
|
||||
$T(n)=\Theta(n^{critical\_value}*\log \log n)$
|
||||
|
||||
* if $k<-1$
|
||||
|
||||
$T(n)=\Theta(n^{critical\_value})$
|
||||
|
||||
* Case III: if $f(n) = \Omega(n^{log_b a+c})$ ($n^{log_b a-c}$ "dominates" $f(n)$) for some constant $c >0$, and if a $f(n/b)<= c f(n)$ for some constant $c <1$ then for all sufficiently large $n$, $T(n) = \Theta(n^{log_b a+c})$
|
||||
|
||||
Reference in New Issue
Block a user