update notes

2024-11-18 14:07:36 -06:00
parent ee64606236
commit f08d8ff674
47 changed files with 5863 additions and 0 deletions
--- a/pages/CSE347/CSE347_L1.md
+++ b/pages/CSE347/CSE347_L1.md
@@ -0,0 +1,245 @@
 # Lecture 1
 ## Greedy Algorithms
 * Builds up a solution by making a series of small decisions that optimize some objective.
 * Make one irrevocable choice at a time, creating smaller and smaller sub-problems of the same kind as the original problem.
 * There are many potential greedy strategies and picking the right one can be challenging.
 ### A Scheduling Problem
 You manage a giant space telescope.
 * There are $n$ research projects that want to use it to make observations.
 * Only one project can use the telescope at a time.
 * Project $p_i$ needs the telescope starting at time $s_i$ and running for a length of time $t_i$.
 * Goal: schedule as many as possible
 Formally
 Input:
 * Given a set $P$ of projects, $|P|=n$
 * Each request $p_i\in P$ occupies interval $[s_i,f_i)$, where $f_i=s_i+t_i$
 Goal: Choose a subset $\Pi\subseteq P$ such that
 1. No two projects in $\Pi$ have overlapping intervals.
 2. The number of selected projects $|\Pi|$ is maximized.
 #### Shortest Interval 
 Counter-example: `[1,10],[9,12],[11,20]`
 #### Earliest start time
 Counter-example: `[1,10],[2,3],[4,5]`
 #### Fewest Conflicts
 Counter-example: `[1,2],[1,4],[1,4],[3,6],[7,8],[5,8],[5,8]`
 #### Earliest finish time
 Correct... but why
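The first two counter-examples above can be checked mechanically. The sketch below is only an illustration: `greedy` and the three key functions are hypothetical names introduced here, not part of the lecture, and intervals are treated as half-open $[s,f)$ as in the formal definition.

```python
def greedy(intervals, key):
    """Scan intervals in the order given by `key`, keeping each one that fits."""
    chosen = []
    for s, f in sorted(intervals, key=key):
        # half-open intervals [s, f) and [cs, cf) are compatible
        # if one of them ends before the other starts
        if all(f <= cs or cf <= s for cs, cf in chosen):
            chosen.append((s, f))
    return chosen

shortest = lambda iv: iv[1] - iv[0]
earliest_start = lambda iv: iv[0]
earliest_finish = lambda iv: iv[1]

a = [(1, 10), (9, 12), (11, 20)]   # counter-example for shortest interval
b = [(1, 10), (2, 3), (4, 5)]      # counter-example for earliest start time

print(len(greedy(a, shortest)), len(greedy(a, earliest_finish)))        # 1 2
print(len(greedy(b, earliest_start)), len(greedy(b, earliest_finish)))  # 1 2
```

In both cases the failing strategy schedules one project while earliest finish time schedules two.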
 #### Theorem of Greedy Strategy (Earliest Finishing Time)
 Say this greedy strategy (Earliest Finishing Time) picks a set $\Pi$ of intervals, some other strategy picks a set $O$ of intervals.
 Assume sorted by finishing time
 * $\Pi=\{i_1,i_2,...,i_k\},|\Pi|=k$
 * $O=\{j_1,j_2,...,j_m\},|O|=m$
 We want to show that $|\Pi|\geq|O|$, i.e. $k\geq m$.
 #### Lemma: For all $r\leq \min(k,m)$, $f_{i_r}\leq f_{j_r}$
 We proceed by induction on $r$.
 * Base case, $r=1$.
    The greedy strategy picks the interval with the earliest finish time overall, so $O$ cannot pick an interval with an earlier finish time, and $f_{i_1}\leq f_{j_1}$.
 * Inductive step, $r>1$.
    By the inductive hypothesis, $f_{i_{r-1}}\leq f_{j_{r-1}}$. Since the intervals in $O$ do not overlap, $s_{j_r}\geq f_{j_{r-1}}\geq f_{i_{r-1}}$, so $j_r$ was still available when the greedy strategy picked $i_r$. The greedy strategy picks the available interval with the earliest finish time, so $f_{i_r}\leq f_{j_r}$.
 
 The lemma implies the theorem: if $m>k$, then $j_{k+1}$ starts at or after $f_{j_k}\geq f_{i_k}$, so the greedy strategy could still have picked it and would not have stopped at $k$ intervals. Hence $k\geq m$.
 #### Problem of “Greedy Stays Ahead” Proof
 * Every problem needs a very different theorem.
 * It can be challenging to even write down the correct statement that you must prove.
 * We want a systematic approach to prove the correctness of greedy algorithms.
 ### Road Map to Prove Greedy Algorithm
 #### 1. Make a Choice
 Pick an interval based on greedy choice, say $q$
 Proof: **Greedy Choice Property**: Show that using our first choice is not "fatal" – at least one optimal solution makes this choice.
 Techniques: **Exchange Argument**: "If an optimal solution does not choose $q$, we can turn it into an equally good solution that does."
 Let $\Pi^*$ be any optimal solution for project set $P$.
 - If $q\in \Pi^*$, we are done.
 - Otherwise, let $x$ be the choice that $\Pi^*$ makes in place of $q$ (for scheduling, the interval in $\Pi^*$ with the earliest finish time). We create another solution $\bar{\Pi^*}$ that replaces $x$ with $q$, and prove that $\bar{\Pi^*}$ is feasible and as good as $\Pi^*$.
 #### 2. Create a smaller instance $P'$ of the original problem
 $P'$ has the same optimization criteria.
 Proof: **Inductive Structure**: Show that after making the first choice, we're left with a smaller version of the same problem, whose solution we can safely combine with the first choice.
 Let $P'$ be the subproblem left after making first choice $q$ in problem $P$ and let $\Pi'$ be a feasible solution to $P'$. Then $\Pi=\Pi'\cup\{q\}$ is a feasible solution to $P$.
 $P'=P-\{q\}-\{$projects conflicting with $q\}$
 #### 3. Solution: Union of choices that we made
 Proof: **Optimal Substructure**: Show that if we solve the subproblem optimally, adding our first choice creates an optimal solution to the *whole* problem.
 Let $q$ be the first choice, $P'$ be the subproblem left after making $q$ in problem $P$, $\Pi'$ be an optimal solution to $P'$. We claim that $\Pi=\Pi'\cup \{q\}$ is an optimal solution to $P$.
 We proceed by contradiction.
 Assume that $\Pi=\Pi'\cup\{q\}$ is not optimal.
 By the greedy choice property (GCP), we already know that $\exists$ an optimal solution $\Pi^*$ for problem $P$ that contains $q$. If $\Pi$ is not optimal, then $cost(\Pi^*)<cost(\Pi)$ (for a maximization problem such as scheduling, reverse the inequalities). Since $\Pi^*-\{q\}$ is also a feasible solution to $P'$, we get $cost(\Pi^*-\{q\})<cost(\Pi-\{q\})=cost(\Pi')$, which contradicts the assumption that $\Pi'$ is an optimal solution to $P'$.
 #### 4. Put 1-3 together to write an inductive proof of the Theorem
 This step is independent of the problem; it is the same for every problem.
 Using the scheduling problem as an example:
 Theorem: given a scheduling problem $P$, if we repeatedly choose the remaining feasible project with the earliest finishing time, we will construct an optimal feasible solution to $P$.
 Proof: We proceed by induction on $|P|$. (based on the size of problem $P$).
 - Base case: $|P|=1$.
 - Inductive step.
  - Inductive hypothesis: For all problems of size $<n$, earliest finishing time (EFT) gives us an optimal solution.
  - EFT is optimal for problem of size $n$.
  - Proof: Once we pick $q$ by the greedy choice, $P'=P-\{q\}-\{$intervals that conflict with $q\}$ and $|P'|<n$. By the inductive hypothesis, EFT gives us an optimal solution $\Pi'$ to $P'$. By the inductive structure and the optimal substructure, $\Pi'\cup\{q\}$ is an optimal solution to $P$.
 _this step always holds as long as the previous three properties hold, and we don't usually write the whole proof._
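 ```python
 # Earliest-finish-time (EFT) greedy for the interval scheduling problem
 def schedule(p):
     # p is a list of (start, finish) pairs; intervals are half-open [start, finish)
     if not p:
         return []
     # sorting by finish time takes O(n log n)
     p=sorted(p,key=lambda x:x[1])
     res=[p[0]]
     # a single pass over the sorted intervals takes O(n)
     for i in p[1:]:
         # compatible if it starts no earlier than the last chosen finish time
         if res[-1][1]<=i[0]:
             res.append(i)
     return res
 ```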
 ## Extra Examples:
 ### File compression problem
 You have $n$ files of different sizes $f_i$.
 You want to merge them to create a single file. $merge(f_i,f_j)$ takes time $f_i+f_j$ and creates a file of size $f_k=f_i+f_j$.
 Goal: Find the order of merges such that the total time to merge is minimized.
 Thinking process: The merge process is a binary tree and each file is a leaf of the tree.
 The total time required =$\sum^n_{i=1} d_if_i$, where $d_i$ is the depth of the file in the compression tree.
 So compressing the smaller file first may yield a faster run time.
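A minimal sketch of this "smallest first" strategy, assuming the file sizes are given as a list of numbers. The function name `merge_cost` is introduced here for illustration; a binary heap keeps the two smallest files easy to find.

```python
import heapq

def merge_cost(sizes):
    """Total time to merge all files when the two smallest are merged first."""
    heap = list(sizes)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b               # merge(a, b) takes time a + b
        heapq.heappush(heap, a + b)  # the merged file goes back into the pool
    return total

print(merge_cost([2, 2, 3]))  # 11
```

For `[2, 2, 3]` the two files of size 2 end at depth 2 and the file of size 3 at depth 1, so the depth formula gives $2\cdot 2+2\cdot 2+1\cdot 3=11$, matching the output.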
 Proof:
 #### Greedy Choice Property
 Construct part of the solution by making a locally good decision.
 Lemma: $\exists$ some optimal solution that merges the two smallest files first, say $f_1$ and $f_2$.
 Proof: **Exchange argument**
 * Case 1: The optimal solution already merges $f_1,f_2$, done. The order in which independent merges are performed does not matter; only the depths $d_i$ do.
  * e.g. for [1,1,2,2], merging 1,1 before 2,2 or 2,2 before 1,1 does not change the total cost
 * Case 2: The optimal solution does not merge $f_1$ and $f_2$.
  * Suppose the optimal solution merges $f_x,f_y$ as the deepest merge.
  * Then $d_x\geq d_1,d_y\geq d_2$. Exchanging $f_1,f_2$ with $f_x,f_y$ yields a solution that is no worse, since $f_1,f_2$ are already the smallest.
 #### Inductive Structure
 * We can combine a feasible solution to the subproblem $P'$ with the greedy choice to get a feasible solution to $P$
 * After making greedy choice $q$, we are left with a strictly smaller subproblem $P'$ with the same optimality criteria as the original problem
 Proof: **Optimal Substructure**: Show that if we solve the subproblem optimally, adding our first choice creates an optimal solution to the *whole* problem.
 Let $q$ be the first choice, $P'$ be the subproblem left after making $q$ in problem $P$, $\Pi'$ be an optimal solution to $P'$. We claim that $\Pi=\Pi'\cup \{q\}$ is an optimal solution to $P$.
 We proceed by contradiction.
 Assume that $\Pi=\Pi'\cup\{q\}$ is not optimal.
 By the greedy choice property (GCP), we already know that there is an optimal solution $\Pi^*$ that contains $q$. Since $\Pi$ is not optimal, $cost(\Pi^*)<cost(\Pi)$. $\Pi^*-\{q\}$ is also a feasible solution to $P'$, and $cost(\Pi^*-\{q\})<cost(\Pi-\{q\})=cost(\Pi')$, which contradicts the assumption that $\Pi'$ is an optimal solution to $P'$.
 Proof: **Smaller problem size**
 After merging the smallest two files into one, we have strictly fewer files waiting to merge.
 #### Optimal Substructure
 * We can combine an optimal solution to the subproblem $P'$ with the greedy choice to get an optimal solution to $P$
 Step 4 ignored, same for all greedy problems.
 ### Conclusion: Greedy Algorithm
 * Algorithm
 * Runtime Complexity
 * Proof
  * Greedy Choice Property
    * Construct part of the solution by making a locally good decision.
  * Inductive Structure
     * We can combine a feasible solution to the subproblem $P'$ with the greedy choice to get a feasible solution to $P$
     * After making greedy choice $q$, we are left with a strictly smaller subproblem $P'$ with the same optimality criteria as the original problem
   * Optimal Substructure
     * We can combine an optimal solution to the subproblem $P'$ with the greedy choice to get an optimal solution to $P$
 * Standard Contradiction Argument simplifies it
 ## Review:
 ### Essence of master method
 Let $a\geq 1$ and $b>1$ be constants, let $f(n)$ be a function, and let $T(n)$ be defined on the nonnegative integers by the recurrence
 $$
 T(n)=aT(\frac{n}{b})+f(n)
 $$
 where we interpret $n/b$ to mean either ceiling or floor of $n/b$. Let $c_{crit}=\log_b a$. Then $T(n)$ has the following asymptotic bounds.
 * Case I: if $f(n) = O(n^{c})$ where $c<c_{crit}$ ($n^{c_{crit}}$ "dominates" $f(n)$), then $T(n) = \Theta(n^{c_{crit}})$
 * Case II: if $f(n) = \Theta(n^{c_{crit}})$ (neither $f(n)$ nor $n^{c_{crit}}$ dominates), then $T(n) = \Theta(n^{c_{crit}} \log n)$
  Extension for $f(n)=\Theta(n^{c_{crit}}(\log n)^k)$
  * if $k>-1$
    $T(n)=\Theta(n^{c_{crit}}(\log n)^{k+1})$
  * if $k=-1$
    $T(n)=\Theta(n^{c_{crit}}\log \log n)$
  * if $k<-1$
    $T(n)=\Theta(n^{c_{crit}})$
 * Case III: if $f(n) = \Omega(n^{c})$ where $c>c_{crit}$ ($f(n)$ "dominates" $n^{c_{crit}}$), and if $af(n/b)\leq \delta f(n)$ for some constant $\delta<1$ and all sufficiently large $n$, then $T(n) = \Theta(f(n))$
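As a quick check of the three cases, here are three standard recurrences worked out. These are illustrative examples chosen here, not taken from the lecture.

$$
\begin{aligned}
T(n)&=4T(n/2)+n: & c_{crit}&=\log_2 4=2,\ f(n)=O(n^{1}),\ 1<c_{crit} &&\Rightarrow T(n)=\Theta(n^2)\\
T(n)&=2T(n/2)+n: & c_{crit}&=\log_2 2=1,\ f(n)=\Theta(n^{1}) &&\Rightarrow T(n)=\Theta(n\log n)\\
T(n)&=2T(n/2)+n^2: & c_{crit}&=1,\ f(n)=\Omega(n^{2}),\ 2\left(\tfrac{n}{2}\right)^2=\tfrac{1}{2}n^2 &&\Rightarrow T(n)=\Theta(n^2)
\end{aligned}
$$

The first line is Case I, the second is Case II, and the third is Case III, where the regularity condition holds with constant $\frac{1}{2}$.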
--- a/pages/CSE347/CSE347_L10.md
+++ b/pages/CSE347/CSE347_L10.md
@@ -0,0 +1,320 @@
 # Lecture 10
 ## Online Algorithms
 ### Example 1: Elevator
 Problem: You've entered the lobby of a tall building, and want to go to the top floor as quickly as possible. There is an elevator which takes $E$ time to get to the top once it arrives. You can also take the stairs which takes $S$ time to climb (once you start) with $S>E$. However, you **do not know** when the elevator will arrive.
 #### Offline (Clairvoyant) vs. Online
 Offline: If you know that the elevator is arriving in $T$ time, then what will you do?
 - Easy. I will compare $E+T$ with $S$ and take the smaller one.
 Online: You do not know when the elevator will arrive.
 - You can either wait for the elevator or take the stairs.
 #### Strategies
 **Always take the stairs.**
 Your cost: $S$.
 Optimal cost: $E$ (if the elevator arrives immediately).
 Your cost / Optimal cost = $\frac{S}{E}$.
 $S$ could be arbitrarily large. For example, the Empire State Building has $103$ floors.
 **Wait for the elevator**
 Your cost: $T+E$.
 Optimal cost: $S$ (if $T$ is large).
 Your cost / Optimal cost = $\frac{T+E}{S}$.
 $T$ could be arbitrarily large. For an out-of-service elevator, $T$ could be infinite.
 #### Online Algorithms
 Definition: An online algorithm must take decisions **without** full information about the problem instance [in this case $T$] and/or it does not know the future [e.g. makes decision immediately as jobs come in without knowing the future jobs].
 An **offline algorithm** has the full information about the problem instance.
 ### Competitive Ratio
 Quality of online algorithm is quantified by the **competitive ratio** (Idea is similar to the approximation ratio in optimization).
 Consider a problem $L$ (minimization) and let $l$ be an instance of this problem.
 $C^*(l)$ is the cost of the optimal offline solution with full information and unlimited computational power.
 $A$ is the online algorithm for $L$.
 $C_A(l)$ is the value of $A$'s solution on $l$.
 An online algorithm $A$ is $\alpha$-competitive if
 $$
 \frac{C_A(l)}{C^*(l)}\leq \alpha
 $$
 for all instances $l$ of the problem.
 In other words, $\alpha=\max_l\frac{C_A(l)}{C^*(l)}$.
 For maximization problems, the ratio is inverted (optimal value over our value), and we still want the competitive ratio to be as small as possible.
 ### Back to the Elevator Problem
 **Strategy 1**: Always take the stairs. The ratio is $\frac{S}{E}$, which can be arbitrarily large.
 **Strategy 2**: Wait for the elevator. The ratio is $\frac{T+E}{S}$, which can be arbitrarily large.
 **Strategy 3**: We do not make a decision immediately. Let's wait for $R$ time and then take the stairs if the elevator has not arrived.
 Question: What is the value of $R$? (How long should we wait?)
 Let's try $R=S$.
 Claim: The competitive ratio is $2$.
 Proof:
 Case 1: The optimal offline solution takes the elevator, then $T+E\leq S$.
 We also take the elevator.
 Competitive ratio = $\frac{T+E}{T+E}=1$.
 Case 2: The optimal offline solution takes the stairs, immediately.
 We wait for $R$ time and then take the stairs. In the worst case, we wait for $R=S$ time and then climb the stairs for $S$ time.
 Competitive ratio = $\frac{2R}{R}=2$.
 EOP
 Let's try $R=S-E$ instead.
 Claim: The competitive ratio is $max\{1,2-\frac{E}{S}\}$.
 Proof:
 Case 1: The optimal offline solution takes the elevator, then $T+E\leq S$.
 We also take the elevator.
 Competitive ratio = $\frac{T+E}{T+E}=1$.
 Case 2: The optimal offline solution takes the stairs, immediately.
 We wait for $R=S-E$ time and then take the stairs.
 Competitive ratio = $\frac{S-E+S}{S}=2-\frac{E}{S}$.
 EOP
 What if we wait less time? Let's try $R=S-E-\epsilon$ for some $\epsilon>0$
 In the worst case, the elevator arrives right after we leave: we wait for $S-E-\epsilon$ time and then take the stairs for $S$, while the optimal offline solution takes the elevator at cost $S-E-\epsilon+E$.
 Competitive ratio = $\frac{(S-E-\epsilon)+S}{S-E-\epsilon+E}=\frac{2S-E-\epsilon}{S-\epsilon}>2-\frac{E}{S}$.
 So the optimal competitive ratio is $max\{1,2-\frac{E}{S}\}$ when we wait for $S-E$ time.
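The three waiting times discussed above can be compared numerically. This is a rough sketch under stated assumptions: the function names, the values $E=1$, $S=10$, and the grid of arrival times are all chosen here for illustration.

```python
def online_cost(T, E, S, R):
    """Cost of 'wait R, then take the stairs' when the elevator arrives at time T."""
    return T + E if T <= R else R + S

def offline_cost(T, E, S):
    """Clairvoyant cost: the cheaper of waiting for the elevator or the stairs."""
    return min(T + E, S)

def worst_ratio(E, S, R, arrivals):
    """Largest online/offline ratio over the given elevator arrival times."""
    return max(online_cost(T, E, S, R) / offline_cost(T, E, S) for T in arrivals)

E, S = 1.0, 10.0
arrivals = [t / 100 for t in range(0, 3001)]  # T from 0 to 30 in steps of 0.01
for R in (S, S - E, S - E - 1):
    print(R, worst_ratio(E, S, R, arrivals))
```

On this grid, $R=S$ gives a worst ratio of $2$, $R=S-E$ gives $2-\frac{E}{S}=1.9$, and the shorter wait $R=S-E-1$ does worse than $1.9$, in line with the analysis.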
 ### Example 2: Cache Replacement
 Cache: Data in a cache is organized in blocks (also called pages or cache lines).
 If the CPU accesses data that is already in the cache, it is called a **cache hit**, and the access is fast.
 If the CPU accesses data that is not in the cache, it is called a **cache miss**. The block is brought into the cache from main memory. If the cache already has $k$ blocks (full), then another block needs to be **kicked out** (eviction).
 Goal: Minimize the number of cache misses.
 **Clairvoyant policy**: Knows what will be accessed in the future and the sequence of accesses.
 FIF (farthest in the future): evict the block whose next access is farthest in the future.
 Example: $k=3$, cache has $3$ blocks.
 Sequence: $A B C D C A B$
 Cache: $A B C$, then evict $B$ for $D$: 3 warm-up misses and 1 miss. $C$ and $A$ then hit, and the final access to $B$ is 1 more miss.
 Online Algorithm: Least recently used (LRU)
 LRU: evict the least recently used block.
 Example: $A B C D C A B$
 Cache: $A B C$, then evict $A$ for $D$: 3 warm-up misses and 1 miss.
 Cache: $D B C$, then evict $B$ for $A$: 1 miss.
 Cache: $D A C$, then evict $D$ for $B$: 1 miss.
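The two traces above can be reproduced with a small simulation. This is a sketch for illustration only: the function names `lru_misses` and `fif_misses` are introduced here, and the FIF version scans ahead naively rather than efficiently.

```python
from collections import OrderedDict

def lru_misses(seq, k):
    """Count misses of LRU with a cache of k blocks (warm-up misses included)."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in seq:
        if block in cache:
            cache.move_to_end(block)
        else:
            misses += 1
            if len(cache) == k:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return misses

def fif_misses(seq, k):
    """Count misses of the clairvoyant farthest-in-the-future policy."""
    cache = set()
    misses = 0
    for t, block in enumerate(seq):
        if block in cache:
            continue
        misses += 1
        if len(cache) == k:
            def next_use(b):
                # position of the next access to b, or infinity if there is none
                for u in range(t + 1, len(seq)):
                    if seq[u] == b:
                        return u
                return float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(block)
    return misses

print(lru_misses("ABCDCAB", 3), fif_misses("ABCDCAB", 3))  # 6 5
```

Counting the 3 warm-up misses, LRU misses 6 times on this sequence and FIF misses 5 times.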
 #### Competitive Ratio for LRU
 Claim: LRU is $(k+1)$-competitive.
 Proof:
 Split the sequence into subsequences such that each subsequence contains $k+1$ distinct blocks.
 For example, suppose $k=3$, sequence $ABCDCEFGEA$, subsequences are $ABCDC$ and $EFGEA$.
 LRU Cache: In each subsequence, it has at most $k+1$ misses.
 The optimal offline solution: In each subsequence, must have at least $1$ miss.
 So the competitive ratio is at most $k+1$.
 EOP
 Using similar analysis, we can show that LRU is $k$-competitive.
 Hint for the proof: 
 Split the sequence into subsequences such that each subsequence LRU has $k$ misses.
 Argue that OPT has at least $1$ miss in each subsequence.
 EOP
 #### Many sensible algorithms are $k$-competitive
 **Lower Bound**: No deterministic online algorithm is better than $k$-competitive.
 **Resource augmentation**: Offline algorithm (which knows the future) has $k$ cache lines in its cache and the online algorithm has $ck$ cache lines with $c>1$.
 ##### Lemma: Competitive Ratio is $\sim \frac{c}{c-1}$
 Say $c=2$. LRU has twice as much cache as OPT, and LRU is $2$-competitive.
 Proof:
 LRU has a cache of size $ck$ ($2k$ when $c=2$).
 Divide the sequence into subsequences such that each subsequence has $ck$ distinct pages.
 In each subsequence, LRU has at most $ck$ misses.
 The OPT has at least $(c-1)k$ misses.
 So competitive ratio is at most $\frac{ck}{(c-1)k}=\frac{c}{c-1}$.
 _Actual competitive ratio is $\sim \frac{c}{c-1+\frac{1}{k}}$._
 EOP
 ### Conclusion
 - Definition: some information unknown
 - Clairvoyant vs. Online
 - Competitive Ratio
 - Example:
  - Elevator
  - Cache Replacement
 ### Example 3: Pessimal cache problem
 Maximize number of cache misses.
 Maximization problem: competitive ratio is $max\{\frac{\text{cost of the optimal offline algorithm}}{\text{cost of our algorithm}}\}$.
 Equivalently, bound $min\{\frac{\text{cost of our algorithm}}{\text{cost of the optimal offline algorithm}}\}$.
 The size of the cache is $k$.
 So if OPT has $X$ cache misses, we want $\geq \frac{X}{\alpha}$ cache misses, where $\alpha$ is the competitive ratio.
 Claim: OPT could always miss (not quite), except when the page is accessed twice in a row.
 Claim: No deterministic online algorithm has a bounded competitive ratio. (that is independent of the length of the sequence)
 Proof:
 Start with an empty cache. (size of cache is $k$)
 Miss the first $k$ unique pages.
 $P_1,P_2,\cdots,P_k|P_{k+1},P_{k+2},\cdots,P_{2k}$
 Say your deterministic online algorithm chooses to evict $P_i$ for some $i\in\{1,2,\cdots,k\}$.
 We want to choose $P_i$ for $i\in\{1,2,\cdots,k\}$ such that the number of misses is maximized.
 The optimal offline solution knows the future and can choose its evictions to miss as often as possible. Let $\sigma$ be its number of misses, which grows with the length of the sequence.
 The online algorithm evicts $P_i$ for some $i\in\{1,2,\cdots,k\}$ and will have only $k+1$ misses in the worst case.
 So the competitive ratio is at least $\frac{\sigma}{k+1}$, which is unbounded.
 #### Randomized most recently used (RAND, MRU)
 MRU without randomization is a deterministic algorithm, and thus its competitive ratio is unbounded.
 The first $k$ unique accesses bring all pages into the cache.
 On the $(k+1)$th unique access, pick a random page from the cache and evict it.
 After that, evict the MRU page on a miss.
 Claim: RAND MRU is $k$-competitive.
 #### Lemma: After the first $k+1$ unique accesses at all times
 1. 1 page is in the cache with probability 1 (the MRU one)
 2. There exists $k$ pages each of which is in the cache with probability $1-\frac{1}{k}$
 3. All other pages are in the cache with probability $0$.
 Proof:
 By induction.
 Base case: right after the first $k+1$ unique accesses and before the $(k+2)$th access.
 1. $P_{k+1}$ is in the cache with probability $1$.
 2. When we brought $P_{k+1}$ to the cache, we evicted one page uniformly at random. (i.e. $P_i$ is evicted with probability $\frac{1}{k}$, $P_i$ is still in the cache with probability $1-\frac{1}{k}$)
 3. All other $r$ pages are definitely not in the cache because we did not see them yet.
 Inductive cases:
 Let $P$ be a page that is in the cache with probability $0$
 Accessing $P$ is a cache miss, and RAND MRU evicts the MRU page $P'$ to make room for $P$. Afterwards:
 1. $P$ is in the cache with probability $1$.
 2. By induction, there exists a set of $k$ pages each of which is in the cache with probability $1-\frac{1}{k}$.
 3. All other pages are in the cache with probability $0$.
 Let $P$ be a page in the cache with probability $1-\frac{1}{k}$.
 With probability $\frac{1}{k}$, $P$ is not in the cache and RAND evicts $P'$ in the cache and brings $P$ to the cache.
 EOP
 Claim: RAND MRU is $k$-competitive.
 Proof:
 Case 1: Access the MRU page.
 Both OPT and our algorithm don't miss.
 Case 2: Access some other page.
 OPT definitely misses.
 RAND MRU misses with probability $\geq \frac{1}{k}$.
 Let's define the random variable $X$ as the number of misses of RAND MRU.
 By linearity of expectation, $E[X]\geq \frac{1}{k}\cdot(\text{number of misses of OPT})$, so the ratio is at most $k$.
 EOP
--- a/pages/CSE347/CSE347_L11.md
+++ b/pages/CSE347/CSE347_L11.md
@@ -0,0 +1 @@
 # Lecture 11
--- a/pages/CSE347/CSE347_L12.md
+++ b/pages/CSE347/CSE347_L12.md
@@ -0,0 +1 @@
 # Lecture 12
--- a/pages/CSE347/CSE347_L13.md
+++ b/pages/CSE347/CSE347_L13.md
@@ -0,0 +1 @@
 # Lecture 13
--- a/pages/CSE347/CSE347_L14.md
+++ b/pages/CSE347/CSE347_L14.md
@@ -0,0 +1 @@
 # Lecture 14
--- a/pages/CSE347/CSE347_L15.md
+++ b/pages/CSE347/CSE347_L15.md
@@ -0,0 +1 @@
 # Lecture 15
--- a/pages/CSE347/CSE347_L2.md
+++ b/pages/CSE347/CSE347_L2.md
@@ -0,0 +1,334 @@
 # Lecture 2
 ## Divide and conquer
 Review of CSE 247
 1. Divide the problem into (generally equal) smaller subproblems
 2. Recursively solve the subproblems
 3. Combine the solutions of subproblems to get the solution of the original problem
    - Examples: Merge Sort, Binary Search
 Recurrence
 Master Method:
 $$
 T(n)=aT(\frac{n}{b})+\Theta(f(n))
 $$
 ### Example 1: Multiplying 2 numbers
 Normal Algorithm:
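 ```python
 def multiply(x,y):
     # grade-school (shift-and-add) multiplication of non-negative integers
     p=0
     shift=0
     while y>0:
         # add a shifted copy of x for every 1 bit of y
         if y&1:
             p+=x<<shift
         y>>=1
         shift+=1
     return p
 ```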
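 Divide and conquer approach:
 ```python
 def multiply(x,y):
     # x and y are non-negative integers; n is the number of bits
     n=max(x.bit_length(),y.bit_length())
     if n<=1:
         return x*y
     h=n//2
     # split each number into a high half and a low half of h bits
     xh,xl=x>>h,x&((1<<h)-1)
     yh,yl=y>>h,y&((1<<h)-1)
     return (multiply(xh,yh)<<(2*h))+((multiply(xh,yl)+multiply(yh,xl))<<h)+multiply(xl,yl)
 ```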
 $$
 T(n)=4T(n/2)+\Theta(n)=\Theta(n^2)
 $$
 Not a useful optimization
 But,
 $$
 multiply(xh,yl)+multiply(yh,xl)=multiply(xh,yh)+multiply(xl,yl)-multiply(xh-xl,yh-yl)
 $$
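 ```python
 def multiply(x,y):
     # xh-xl and yh-yl can be negative, so handle signs separately
     if x<0 or y<0:
         s=-1 if (x<0)!=(y<0) else 1
         return s*multiply(abs(x),abs(y))
     n=max(x.bit_length(),y.bit_length())
     if n<=1:
         return x*y
     h=n//2
     xh,xl=x>>h,x&((1<<h)-1)
     yh,yl=y>>h,y&((1<<h)-1)
     zhh=multiply(xh,yh)
     zll=multiply(xl,yl)
     # xh*yl+yh*xl = zhh+zll-(xh-xl)*(yh-yl): 3 recursive calls instead of 4
     zmid=zhh+zll-multiply(xh-xl,yh-yl)
     return (zhh<<(2*h))+(zmid<<h)+zll
 ```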
 $$
 T(n)=3T(n/2)+\Theta(n)=\Theta(n^{\log_2 3})\approx \Theta(n^{1.58})
 $$
 ### Example 2: Closest Pairs
 Input: $P$ is a set of $n$ points in the plane. $p_i=(x_i,y_i)$
 $$
 d(p_i,p_j)=\sqrt{(x_i-x_j)^2+(y_i-y_j)^2}
 $$
 Goal: Find the distance between the closest pair of points.
 Naive algorithm: iterate over all pairs ($\Theta(n^2)$).
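A minimal sketch of the naive algorithm, useful as a reference when testing the faster versions. The function name `closest_pair_naive` is introduced here; points are assumed to be `(x, y)` tuples.

```python
from itertools import combinations
from math import dist, inf

def closest_pair_naive(points):
    """Check all pairs; returns inf when there are fewer than two points."""
    return min((dist(p, q) for p, q in combinations(points, 2)), default=inf)

print(closest_pair_naive([(0, 0), (3, 4), (1, 1)]))
```

Here the closest pair is $(0,0)$ and $(1,1)$, at distance $\sqrt{2}$.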
 Divide and conquer algorithm:
 Preprocessing: Sort $P$ by $x$ coordinate to get $P_x$.
 Base case:
 - 1 point: closest d = inf
 - 2 points: closest d = $d(p_1,p_2)$
 Divide Step: 
 Compute mid point and get $Q, R$.
 Recursive step:
 - $d_l$ closest pair in $Q$
 - $d_r$ closest pair in $R$
 Combine step: 
 Calculate $d_c$, the distance of the closest pair such that one point is on the left side and the other is on the right.
 return $min(d_c,d_l,d_r)$
 Total runtime:
 $$
 T(n)=2T(n/2)+\Theta(n^2)=\Theta(n^2)
 $$
 Still no improvement over the naive algorithm.
 Important Insight: Can reduce the number of checks
 **Lemma:** If all points within a $\delta\times\delta$ square are at least $\delta=min\{d_r,d_l\}$ apart, there are at most 4 points in this square.
 A better algorithm:
 1. Divide $P_x$ into 2 halves using the mid point
 2. Recursively compute $d_l$ and $d_r$, take $\delta=min(d_l,d_r)$.
 3. Filter points into y-strip: points which are within $(mid_x-\delta,mid_x+\delta)$
 4. Sort y-strip by y coordinate. For every point $p$, we look at this y-strip in sorted order starting at this point and stop when we see a point with y coordinate $>p_y +\delta$
 ```python
 # d is distance function
 def closestP(P,d):
     Px=sorted(P,key=lambda p:p[0])
     def closestPRec(Px):
         n=len(Px)
         if n<=1:
             return float('inf')
         if n==2:
             return d(Px[0],Px[1])
         Q,R=Px[:n//2],Px[n//2:]
         midx=R[0][0]
         dl,dr=closestPRec(Q),closestPRec(R)
         dc=min(dl,dr)
         # y-strip: points within dc of the dividing line, sorted by y
         ys=[p for p in Px if midx-dc<p[0]<midx+dc]
         ys.sort(key=lambda p:p[1])
         yn=len(ys)
         # each point is compared with only a constant number of later points,
         # so this loop runs in O(n)
         for i in range(yn):
             for j in range(i+1,yn):
                 # stop once the y gap alone rules out a closer pair
                 if ys[j][1]-ys[i][1]>=dc:
                     break
                 dc=min(dc,d(ys[i],ys[j]))
         return dc
     return closestPRec(Px)
 ```
 Runtime analysis:
 $$
 T(n)=2T(n/2)+\Theta(n\log n)=\Theta(n\log^2 n)
 $$
 We can do even better by presorting Y
 1. Divide $P_x$ into 2 halves using the mid point
 2. Recursively compute $d_l$ and $d_r$, take $\delta=min(d_l,d_r)$.
 3. Filter points into y-strip: points which are within $(mid_x-\delta,mid_x+\delta)$ by visiting presorted $P_y$
 ```python
 # d is distance function
 def closestP(P,d):
     Px=sorted(P,key=lambda p:p[0])
     # Py holds indices into Px, sorted by y coordinate
     Py=sorted(range(len(Px)),key=lambda i:Px[i][1])
     def closestPRec(lo,hi,Py):
         # solves Px[lo:hi]; Py lists the same points (as indices) in y order
         n=hi-lo
         if n<=1:
             return float('inf')
         if n==2:
             return d(Px[lo],Px[lo+1])
         mid=(lo+hi)//2
         midx=Px[mid][0]
         # split Py in linear time while keeping the y order
         Qy=[i for i in Py if i<mid]
         Ry=[i for i in Py if i>=mid]
         dc=min(closestPRec(lo,mid,Qy),closestPRec(mid,hi,Ry))
         # y-strip, already sorted by y because Py is
         ys=[Px[i] for i in Py if midx-dc<Px[i][0]<midx+dc]
         yn=len(ys)
         for i in range(yn):
             for j in range(i+1,yn):
                 if ys[j][1]-ys[i][1]>=dc:
                     break
                 dc=min(dc,d(ys[i],ys[j]))
         return dc
     return closestPRec(0,len(Px),Py)
 ```
 Runtime analysis:
 $$
 T(n)=2T(n/2)+\Theta(n)=\Theta(n\log n)
 $$
 ## In-person lectures
 $$
 T(n)=aT(n/b)+f(n)
 $$
 $a$ is the number of subproblems, $n/b$ is the size of each subproblem, and $f(n)$ is the cost of the divide and combine steps.
 ### Example 3: Max Contiguous Subsequence Sum (MCSS)
 Given: array of integers (positive or negative), $S=[s_1,s_2,...,s_n]$
 Return: $max\{\sum^j_{k=i} s_k|1\leq i\leq n, i\leq j\leq n\}$
 Trivial solution: 
 brute force
 $O(n^3)$
 A bit better solution: 
 $O(n^2)$ use prefix sum to reduce cost for sum.
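A minimal sketch of the $O(n^2)$ prefix-sum idea, assuming a non-empty list of integers. The name `mcss_prefix` is introduced here for illustration.

```python
def mcss_prefix(S):
    """O(n^2) max contiguous subsequence sum using prefix sums (S non-empty)."""
    n = len(S)
    pfx = [0]
    for s in S:
        pfx.append(pfx[-1] + s)  # pfx[t] = S[0] + ... + S[t-1]
    best = S[0]
    for i in range(n):
        for j in range(i + 1, n + 1):
            # the sum of S[i:j] is available in O(1)
            best = max(best, pfx[j] - pfx[i])
    return best

print(mcss_prefix([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```

The best run in this example is `[4, -1, 2, 1]`. The same function can serve as a reference when testing the divide and conquer versions.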
 Divide and conquer solution.
 ```python
 def MCSS(S):
     # assumes S is non-empty; ranges are half-open [i,j)
     def MCSSMid(i,j,mid):
         # brute force over all runs S[l..r] that contain S[mid]
         res=S[mid]
         for l in range(i,mid+1):
             curS=0
             for r in range(l,j):
                 curS+=S[r]
                 if r>=mid:
                     res=max(res,curS)
         return res
     def MCSSRec(i,j):
         if j-i==1:
             return S[i]
         mid=(i+j)//2
         L,R=MCSSRec(i,mid),MCSSRec(mid,j)
         C=MCSSMid(i,j,mid)
         return max(L,C,R)
     return MCSSRec(0,len(S))
 ```
 If `MCSSMid` uses the trivial solution, the running time is:
 $$
 T(n)=2T(n/2)+O(n^2)=\Theta(n^2)
 $$
 and we gained nothing.
 Observations: Any contiguous subsequence that starts on the left and ends on the right can be split into two parts as `sum(S[i:j])=sum(S[i:mid])+sum(S[mid:j])`
 and let $LS$ be the largest sum of a subsequence on the left that ends at mid, and $RS$ be the largest sum of a subsequence on the right that starts at mid.
 **Lemma:** The biggest sum of a subsequence that contains `S[mid]` is $x=LS+RS$
 Proof:
 By contradiction,
 Assume for the sake of contradiction that $y=L'+R'$ is the sum of such a subsequence that is larger than $x$ ($y>x$).
 Let $z=LS+R'$. Since $LS\geq L'$ by definition of $LS$, $z\geq y$. Likewise $RS\geq R'$, so $x=LS+RS\geq z\geq y$, which contradicts $y>x$.
 Optimized function as follows:
 ```python
 def MCSS(S):
     # assumes S is non-empty; ranges are half-open [i,j)
     def MCSSMid(i,j,mid):
         # best sum of a contiguous run that contains S[mid]
         LS,RS=0,0
         cl,cr=0,0
         for l in range(mid-1,i-1,-1):
             cl+=S[l]
             LS=max(LS,cl)
         for r in range(mid+1,j):
             cr+=S[r]
             RS=max(RS,cr)
         return S[mid]+LS+RS
     def MCSSRec(i,j):
         if j-i==1:
             return S[i]
         mid=(i+j)//2
         L,R=MCSSRec(i,mid),MCSSRec(mid,j)
         C=MCSSMid(i,j,mid)
         return max(L,C,R)
     return MCSSRec(0,len(S))
 ```
 The running time is:
 $$
 T(n)=2T(n/2)+O(n)=\Theta(n\log n)
 $$
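 Strengthening the recursion:
 ```python
 def MCSS(S):
     # assumes S is non-empty; ranges are half-open [i,j)
     def MCSSRec(i,j):
         # returns (best sum, best prefix sum, best suffix sum, total sum) of S[i:j]
         if j-i==1:
             return S[i],S[i],S[i],S[i]
         mid=(i+j)//2
         L,lp,ls,sl=MCSSRec(i,mid)
         R,rp,rs,sr=MCSSRec(mid,j)
         return max(L,R,ls+rp),max(lp,sl+rp),max(rs,sr+ls),sl+sr
     return MCSSRec(0,len(S))[0]
 ```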
 Precomputed version:
 ```python
 def MCSS(S):
     # assumes S is non-empty; ranges are half-open [i,j)
     n=len(S)
     # pfx[t]=S[0]+...+S[t-1], so sum(S[i:j])=pfx[j]-pfx[i]
     pfx=[0]
     for s in S:
         pfx.append(pfx[-1]+s)
     def MCSSRec(i,j):
         # returns (best sum, best prefix sum, best suffix sum) of S[i:j]
         if j-i==1:
             return S[i],S[i],S[i]
         mid=(i+j)//2
         L,lp,ls=MCSSRec(i,mid)
         R,rp,rs=MCSSRec(mid,j)
         return max(L,R,ls+rp),max(lp,pfx[mid]-pfx[i]+rp),max(rs,pfx[j]-pfx[mid]+ls)
     return MCSSRec(0,n)[0]
 ```
 $$
 T(n)=2T(n/2)+O(1)=\Theta(n)
 $$
--- a/pages/CSE347/CSE347_L3.md
+++ b/pages/CSE347/CSE347_L3.md
@@ -0,0 +1,161 @@
 # Lecture 3
 ## Dynamic programming
 When we cannot find a good Greedy Choice, the only thing we can do is to iterate all choices.
 ### Example 1: Edit distance
 Input: 2 sequences of some character set, e.g.
 $S=ABCADA$, $T=ABADC$
 Goal: Compute the minimum number of **insertions or deletions** you could do to convert $S$ into $T$
 We will call it `Edit Distance(S[1...n],T[1...m])`, where `n` and `m` are the lengths of `S` and `T` respectively.
 Idea: compute the difference between the sequences.
 Observe: The first difference appears at index 3, and in this example where the sequences are short, it is obvious that it is better to delete 'C'. But for long sequences, we do not know what the rest of the sequence looks like, so it is hard to decide whether to insert 'A' or delete 'C'.
 Use branching algorithm:
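 ```python
 def editDist(S,T,i,j):
     # S[i:] is exhausted: insert the rest of T
     if len(S)<=i:
         return len(T)-j
     # T[j:] is exhausted: delete the rest of S
     if len(T)<=j:
         return len(S)-i
     if S[i]==T[j]:
         return editDist(S,T,i+1,j+1)
     else:
         # delete S[i] or insert T[j], each at cost 1
         return 1+min(editDist(S,T,i+1,j),editDist(S,T,i,j+1))
 ```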
 Correctness Proof Outline:
 - ~~Greedy Choice Property~~
 - Complete Choice Property:
  - The optimal solution makes **one** of the choices that we consider
 - Inductive Structure:
  - Once you make **any** choice, you are left with a smaller problem of the same type. **Any** first choice + **feasible** solution to the subproblem = feasible solution to the entire problem.
 - Optimal Substructure:
  - If we optimally solve the subproblem for **a particular choice c**, and combine it with c, resulting solution is the **optimal solution that makes choice c**.
 Correctness Proof:
 Claim: For any problem $P$, the branching algorithm finds the optimal solution.
 Proof: Induct on problem size
 - Base case: $|S|=0$ or $|T|=0$, obvious
 - Inductive Case: By inductive hypothesis: Branching algorithm works for all smaller problems, either $S$ is smaller or $T$ is smaller or both
  - For each choice we make, we got a strictly smaller problem: by inductive structure, and the answer is correct by inductive hypothesis.
  - By Optimal substructure, we know for any choice, the solution of branching algorithm for subproblem and the choice we make is an optimal solution for that problem.
  - Using Complete choice property, we considered all the choices.
 Drawing the recursion tree, the left and right parts of the tree have height $n$, but the middle part of the tree has height $2n$. So the running time is $\Omega(2^n)$, i.e. at least $2^n$.
 #### How could we reduce the complexity?
 There are **overlapping subproblems** that we compute more than once! Number of distinct subproblems is polynomial, we can **share the solution** that we have already computed!
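One way to share solutions is to memoize the branching recursion, so that each pair $(i,j)$ is solved only once. The sketch below is an illustration using Python's `functools.lru_cache`; the name `edit_dist_memo` is introduced here and is not from the lecture.

```python
from functools import lru_cache

def edit_dist_memo(S, T):
    """Insert/delete edit distance, solving each subproblem (i, j) once."""
    @lru_cache(maxsize=None)
    def rec(i, j):
        if i == len(S):
            return len(T) - j   # insert the rest of T
        if j == len(T):
            return len(S) - i   # delete the rest of S
        if S[i] == T[j]:
            return rec(i + 1, j + 1)
        # delete S[i] or insert T[j], each at cost 1
        return 1 + min(rec(i + 1, j), rec(i, j + 1))
    return rec(0, 0)

print(edit_dist_memo("ABCADA", "ABADC"))  # 3
```

There are at most $(n+1)(m+1)$ distinct pairs $(i,j)$, so the memoized recursion does polynomial work. For very long strings an iterative table avoids Python's recursion limit.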
 **Store the result of each subproblem in a 2D array**
 Use dp:
 ```python
 def editDist(S,T):
     m,n=len(S),len(T)
     # dp[i][j] = edit distance between S[i:] and T[j:]
     dp=[[0]*(n+1) for _ in range(m+1)]
     for i in range(m+1):
         dp[i][n]=m-i
     for j in range(n+1):
         dp[m][j]=n-j
     # fill from the bottom-right, since dp[i][j] depends on larger i and j
     for i in range(m-1,-1,-1):
         for j in range(n-1,-1,-1):
             if S[i]==T[j]:
                 dp[i][j]=dp[i+1][j+1]
             else:
                 # assuming the cost of insertion and deletion is 1
                 dp[i][j]=1+min(dp[i][j+1],dp[i+1][j])
     return dp[0][0]
 ```
 The runtime is now the time used to fill in the table, which is $T(n,m)=\Theta(mn)$. We can then backtrack through the table to find out how we reached the final answer.
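The backtracking step can be sketched as follows. This is a self-contained illustration: `edit_script` and the `("delete", c)` / `("insert", c)` tuples are a hypothetical output format chosen here. It rebuilds the table and then replays, from the top-left entry, the choice that produced each value.

```python
def edit_script(S, T):
    """Return (distance, operations) for insert/delete edit distance."""
    m, n = len(S), len(T)
    # dp[i][j] = edit distance between S[i:] and T[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][n] = m - i
    for j in range(n + 1):
        dp[m][j] = n - j
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if S[i] == T[j]:
                dp[i][j] = dp[i + 1][j + 1]
            else:
                dp[i][j] = 1 + min(dp[i + 1][j], dp[i][j + 1])
    # walk from (0, 0) towards (m, n), replaying the choice behind each entry
    ops, i, j = [], 0, 0
    while i < m and j < n:
        if S[i] == T[j]:
            i, j = i + 1, j + 1
        elif dp[i][j] == 1 + dp[i + 1][j]:
            ops.append(("delete", S[i]))
            i += 1
        else:
            ops.append(("insert", T[j]))
            j += 1
    # whatever is left on either side must be deleted or inserted
    ops += [("delete", c) for c in S[i:]]
    ops += [("insert", c) for c in T[j:]]
    return dp[0][0], ops

dist, ops = edit_script("ABCADA", "ABADC")
print(dist, ops)
```

The walk takes $O(m+n)$ steps on top of the $\Theta(mn)$ table fill, and the number of recorded operations equals the distance.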
 ### Example 2: Weighted Interval Scheduling (IS)
 Input: $P=\{p_1,p_2,...,p_n\}$, $p_i=(s_i,f_i,w_i)$
 $s_i$ is the start time, $f_i$ is the finish time, $w_i$ is the weight of the task for job $i$
 Goal: Pick a set of **non-overlapping** intervals $\Pi$ such that $\sum_{p_i\in \Pi} w_i$ is maximized.
 Trivial solution ($T(n)=O(2^n)$)
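 ```python
 # p=[[s_i,f_i,w_i],...]
 p=[]
 # sort by start time
 p.sort()
 n=len(p)
 def intervalScheduling(idx,lastEnd=float("-inf")):
     # best total weight using intervals p[idx:] that start at or after lastEnd
     res=0
     for i in range(idx,n):
         # skip intervals that overlap the previously picked one
         if p[i][0]<lastEnd:
             continue
         # pick interval i, then recurse on the intervals after it
         res=max(intervalScheduling(i+1,p[i][1])+p[i][2],res)
     return res
 best=intervalScheduling(0)
 ```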
 Using dp ($T(n)=O(n^2)$)
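 ```python
 import bisect
 def intervalScheduling(p):
     # p=[[s_i,f_i,w_i],...], sorted by start time
     p.sort()
     n=len(p)
     # dp[i] = best total weight using only intervals p[i:]
     dp=[0]*(n+1)
     for i in range(n-1,-1,-1):
         # load initial best case: do nothing
         dp[i]=dp[i+1]
         _,e,w=p[i]
         # first j whose start time is >= e (the key argument needs Python 3.10+)
         for j in range(bisect.bisect_left(p,e,key=lambda x:x[0]),n+1):
             dp[i]=max(dp[i],w+dp[j])
     return dp[0]
 ```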
 ### Example 3: Subset sums
 Input: a set $S$ of positive and unique integers and another integer $K$.
 Problem: Is there a subset $X\subseteq S$ such that $sum(X)=K$
 Brute force takes $O(2^n)$.
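 ```python
 def subsetSum(arr,i,k)->bool:
     # all elements decided: success iff the remaining target is 0
     if i>=len(arr):
         if k==0:
             return True
         return False
     # either take arr[i] or skip it
     return subsetSum(arr,i+1,k-arr[i]) or subsetSum(arr,i+1,k)
 ```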
 Using dp $O(nk)$
 ```python
 def subsetSum(arr,k)->bool:
    n=len(arr)
    dp=[False]*(k+1)
    dp[0]=True
    for e in arr:
        ndp=[]
        for i in range(k+1):
            ndp.append(dp[i])
            if i-e>=0:
                ndp[i]|=dp[i-e]
        dp=ndp
    return dp[-1]
 ```
--- a/pages/CSE347/CSE347_L4.md
+++ b/pages/CSE347/CSE347_L4.md
@@ -0,0 +1,321 @@
 # Lecture 4
 ## Maximum Flow
 ### Example 1: Ship cement from factory to building
 Input $s$: source, $t$: destination
 Graph with **directed** edges weights on each edge: **capacity**
 **Goal:** Ship as much stuff as possible while obeying capacity constraints.
 Graph: $(V,E)$ directed and weighted
 - Unique source and sink nodes $\to s, t$
 - Each edge has capacity $c(e)$ [Integer]
 A valid flow assignment assigns an integer $f(e)$ to each edge s.t.
 Capacity constraint: $0\leq f(e)\leq c(e)$
 Flow conservation: 
 $$
 \sum_{e\in E_{in}(v)}f(e)=\sum_{e\in E_{out}(v)}f(e),\forall v\in V-\{s,t\}
 $$
 $E_{in}(v)$: set of incoming edges to $v$
 $E_{out}(v)$: set of outgoing edges from $v$
 Compute: Maximum Flow: Find a valid flow assignment to
 Maximize $|F|=\sum_{e\in E_{in}(t)}f(e)=\sum_{e\in E_{out}(s)}f(e)$ (total units received by end and sent by source)
 Additional assumptions
 1. $s$ has no incoming edges, $t$ has no outgoing edges
 2. You do not have a cycle of 2 nodes
 A proposed algorithm:
 1. Find a path from $s$ to $t$
 2. Push as much flow along the path as possible
 3. Adjust the capacities
 4. Repeat until we cannot find a path
 **Residual Graph:** If there is an edge $e=(u,v)$ in $G$, we will add a back edge $\bar{e}=(v,u)$. Capacity of $\bar{e}=$ flow on $e$. Call this graph $G_R$.
 Algorithm:
 - Find an "augmenting path" $P$.
  - $P$ can contain forward or backward edges!
 - Say the smallest residual capacity along the path is $k$.
 - Push $k$ flow on the path ($f(e) =f(e) + k$ for all edges on path $P$)
  - Reduce the capacity of all edges on the path $P$ by $k$
  - **Increase** the capacity of the corresponding mirror/back edges
 - Repeat until there are no augmenting paths
 ### Formalize: Ford-Fulkerson (FF) Algorithm
 1. Initialize the residual graph $G_R=G$
 2. Find an augmenting path $P$ with capacity $k$ (min capacity of any edge on $P$)
 3. Fix up the residual capacities in $G_R$
    - $c(e)=c(e)-k,\forall e\in P$
    - $c(\bar{e})=c(\bar{e})+k,\forall \bar{e}\in P$
 4. Repeat 2 and 3 until no augmenting path can be found in $G_R$.
 ```python
 from collections import defaultdict

 def ford_fulkerson_algo(G,n,s,t):
     """
     Args:
         G: is the graph for max_flow, G[u] is a dict {v: capacity of edge (u,v)}
            for every vertex u in 0..n-1 (modified in place: it ends up holding
            the leftover forward capacities)
         n: is the number of vertex in the graph
         s: start vertex of flow
         t: end vertex of flow
     Returns:
         the max flow in graph from s to t
     """
     # Initialize the residual graph $G_R=G$
     # GR[v][u] is the capacity of the back edge of (u,v), i.e. the flow on (u,v)
     GR=[defaultdict(int) for i in range(n)]
     for u in range(n):
         for v in G[u]:
             GR[v][u]=0
     def augP(cur,path,visited):
         # Find an augmenting path $P$ by DFS; path collects the edges (a,b,isR)
         if cur==t: return True
         visited.add(cur)
         # isR is True for a back edge in the residual graph, False for an edge in the graph
         for isR,adj in ((False,G[cur]),(True,GR[cur])):
             for v,w in adj.items():
                 if w==0 or v in visited: continue
                 path.append((cur,v,isR))
                 if augP(v,path,visited): return True
                 path.pop()
         return False
     flow=0
     while True:
         path=[]
         if not augP(s,path,set()): break
         # capacity $k$ of the path (min capacity of any edge on $P$)
         k=min(GR[a][b] if isR else G[a][b] for a,b,isR in path)
         # Fix up the residual capacities in $G_R$
         # - $c(e)=c(e)-k,\forall e\in P$
         # - $c(\bar{e})=c(\bar{e})+k,\forall \bar{e}\in P$
         for a,b,isR in path:
             if isR:
                 # pushing along a back edge cancels k units of flow on the real edge (b,a)
                 GR[a][b]-=k
                 G[b][a]+=k
             else:
                 G[a][b]-=k
                 GR[b][a]+=k
         flow+=k
     return flow
 ```
 #### Proof of Correctness: Valid Flow
 **Lemma 1:** FF finds a valid flow
 - Capacity and conservation constraints are not violated
 - Capacity constraint: $0\leq f(e)\leq c(e)$
 - Flow conservation: $\sum_{e\in E_{in}(v)}f(e)=\sum_{e\in E_{out}(v)}f(e),\forall v\in V-\{s,t\}$
 Proof: We proceed by induction on **augmenting paths**
 ##### Base Case
 $f(e)=0$ on all edges
 ##### Inductive Case
 By inductive hypothesis, we have a valid flow and the corresponding residual graph $G_R$.
 Inductive Step:
 Now we find an augmenting path $P$ in $G_R$ and push $k$ units of flow along it (where $k$ is the smallest residual capacity on $P$). We argue that the constraints are not violated.
 **Capacity Constraints:** Consider an edge $e$ in $P$.
 - If $e$ is a forward edge (in the original graph)
  - by construction of $G_R$, it had at least $k$ leftover capacity, so $f(e)+k\leq c(e)$.
 - If $e$ is a back edge with residual capacity $\geq k$
  - the flow on the real edge is reduced by $k$, but it was at least $k$, so it is still $\geq 0$: no capacity constraint violation.
 **Conservation Constraints:** Consider a vertex $v$ on path $P$
 1. Both forward edges
   - No violation, push $k$ flow into $v$ and out.
 2. Both back edges
   - No violation, push $k$ less flow into $v$ and out.
 3. Redirecting flow
   - No violation, change of $0$ by $k-k$ on $v$.
 #### Proof of Correctness: Termination
 **Lemma 2:** FF terminates
 Proof:
 Every iteration finds an augmenting path that increases the total flow by at least 1 (capacities are integers).
 So it must terminate when it finds a max flow, or before.
 Each iteration takes $\Theta(m+n)$ time to find an augmenting path.
 The number of iterations is $\leq |F|$, so the total is $O(|F|(m+n))$ (not polynomial time in the input size)
 #### Proof of Correctness: Optimality
 From Lemma 1 and 2, we know that FF returns a feasible solution, but does it return the **maximum** flow?
 ##### Max-flow Min-cut Theorem
 Given a graph $G(V,E)$, a **graph cut** is a partition of vertices into 2 subsets.
 - $S$: $s$ + maybe some other vertices
 - $V-S$: $t$ + maybe some other vertices
 Define the capacity of the cut, $C(S)$, to be the sum of the capacities of the edges that go from a vertex in $S$ to a vertex in $V-S$.
 **Lemma 3:** For all valid flows $f$, $|f|\leq C(S)$ for all cut $S$ (Max-flow $\leq$ Min-cut)
 Proof: all flow must go through one of the cut edges.
 **Min-cut:** cut of smallest capacity, $S^*$. $|f|\leq C(S^*)$
 **Lemma 4:** FF produces a flow $=C(S^*)$
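 Proof: Let $\hat{f}$ be the flow found by FF. There are no augmenting paths in $G_R$.
 Let $\hat{S}$ be all vertices that can be reached from $s$ in $G_R$ using edges with capacities $>0$. Since there is no augmenting path, $t\notin\hat{S}$, so $\hat{S}$ is a cut.
 All the forward edges going out of the cut are saturated (otherwise their endpoint would be reachable). Since the back edges leaving $\hat{S}$ have capacity 0, no flow is going into the cut $\hat{S}$.
 If some flow were coming in from $V-\hat{S}$, then there would be a back edge leaving $\hat{S}$ with capacity $>0$, a contradiction. So $|\hat{f}|=C(\hat{S})\geq C(S^*)$, and together with Lemma 3, $|\hat{f}|=C(S^*)$.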
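The construction of $\hat{S}$ in the proof can be checked on a small example. This is only an illustrative sketch: `max_flow` is a compact BFS-based variant of the augmenting-path idea (Edmonds-Karp style, not the DFS version used in the lecture code), and the names `max_flow`, `min_cut` and the dict-of-dicts graph format are assumptions made for this example.

```python
from collections import deque

def max_flow(cap, s, t):
    """cap: {u: {v: capacity}}. Returns (flow value, residual capacities)."""
    res = {s: {}, t: {}}
    for u in cap:
        for v, c in cap[u].items():
            res.setdefault(u, {})
            res.setdefault(v, {})
            res[u][v] = res[u].get(v, 0) + c   # forward residual capacity
            res[v].setdefault(u, 0)            # back edge starts at 0
    total = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total, res
        # k = smallest residual capacity on the path
        k, v = float("inf"), t
        while parent[v] is not None:
            k = min(k, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= k   # use up capacity on the path edge
            res[v][u] += k   # grow the mirror edge
            v = u
        total += k

def min_cut(cap, s, t):
    value, res = max_flow(cap, s, t)
    # S_hat: vertices reachable from s through residual capacity > 0
    S_hat, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v, c in res[u].items():
            if c > 0 and v not in S_hat:
                S_hat.add(v)
                stack.append(v)
    # capacity of the cut: original edges leaving S_hat
    cut = sum(c for u in cap for v, c in cap[u].items()
              if u in S_hat and v not in S_hat)
    return value, S_hat, cut
```

On the graph `{"s": {"a": 3, "b": 2}, "a": {"t": 1, "b": 1}, "b": {"t": 3}}` the flow value and the capacity of the cut $\hat{S}=\{s,a\}$ are both 4.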
 ### Example 2: Bipartite Matching
 input: Given $n$ classes and $n$ rooms; we want to match classes to rooms.
 Bipartite graph $G=(V,E)$ (unweighted and undirected)
 - Vertices are either in set $L$ or $R$
 - Edges only go between vertices of different sets
 Matching: A subset of edges $M\subseteq E$ s.t.
 - Each vertex has at most one edge from $M$ incident on it.
 Maximum Matching: matching of the largest size.
 We will reduce the problem to the problem of finding the maximum flow
 #### Reduction
 Given a bipartite graph $G=(V,E)$, construct a graph $G'=(V',E')$ such that
 $$
 |max-flow(G')|=|max-matching(G)|
 $$
 Let $s$ connect to all vertices in $L$ and let all vertices in $R$ connect to $t$.
 $G'=G+s+t+$ added edges from $s$ to $L$ and from $R$ to $t$, with every edge directed from left to right and given capacity 1.
 #### Proof of correctness
 Claim: $G'$ has a flow of $k$ iff $G$ has a matching of size $k$
 Proof: Two directions:
 1. Say $G$ has a matching of size $k$, we want to prove $G'$ has a flow of size $k$.
 2. Say $G'$ has a flow of size $k$, we want to prove $G$ has a matching of size $k$.
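A minimal sketch of the reduction in code, under the assumption that every edge of $G'$ has capacity 1. The helper `max_flow` is a small BFS-based augmenting-path routine written for this example (it is not the lecture's implementation), and the vertex labels `("L", u)`, `("R", v)`, `"s"`, `"t"` are made up to keep the two sides apart.

```python
from collections import deque

def max_flow(cap, s, t):
    """cap: {u: {v: capacity}}. Returns (flow value, residual capacities)."""
    res = {s: {}, t: {}}
    for u in cap:
        for v, c in cap[u].items():
            res.setdefault(u, {})
            res.setdefault(v, {})
            res[u][v] = res[u].get(v, 0) + c
            res[v].setdefault(u, 0)
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total, res
        k, v = float("inf"), t
        while parent[v] is not None:
            k = min(k, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= k
            res[v][u] += k
            v = u
        total += k

def bipartite_matching(L, R, edges):
    """edges: list of distinct (u, v) pairs with u in L and v in R."""
    # Build G': s -> every left vertex, every right vertex -> t, all capacity 1
    cap = {"s": {("L", u): 1 for u in L}}
    for u in L:
        cap[("L", u)] = {}
    for v in R:
        cap[("R", v)] = {"t": 1}
    for u, v in edges:
        cap[("L", u)][("R", v)] = 1
    size, res = max_flow(cap, "s", "t")
    # an L -> R edge carries one unit of flow iff its residual capacity is 0
    matching = [(u, v) for u, v in edges if res[("L", u)][("R", v)] == 0]
    return size, matching
```

Reading the matching back from the saturated $L\to R$ edges is the "flow of size $k$ gives a matching of size $k$" direction of the claim.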
 ## Conclusion: Maximum Flow
 Problem input and target
 Ford-Fulkerson Algorithm
 - Execution: residual graph
 - Runtime
 FF correctness proof
 - Max-flow Min-cut Theorem
 - Graph Cut definition
 - Capacity of cut
 Reduction to Bipartite Matching
 ### Example 3: Image Segmentation: (reduction from min-cut)
 Given:
 - Image consisting of an object and a background.
 - the object occupies some set of pixels $A$, while the background occupies the remaining pixels $B$.
 Required:
 - Separate $A$ from $B$, but it is not known in advance which pixels belong to which.
 - For each pixel $i,p_i$ is the probability that $i\in A$
 - For each pair of adjacent pixels $i,j,c_{ij}$ is the cost of placing the object boundary between them. i.e. putting $i$ in $A$ and $j$ in $B$.
 - A segmentation of the image is an assignment of each pixel to $A$ or $B$.
 - The goal is to find a segmentation that maximizes
 $$
 \sum_{i\in A}p_i+\sum_{i\in B}(1-p_i)-\sum_{i,j\ on \ boundary}c_{ij}
 $$
 Solution:
 - Let's turn our maximization into a minimization
 - If the image has $N$ pixels, then we can rewrite the objective as
 $$
 N-\sum_{i\in A}(1-p_i)-\sum_{i\in B}p_i-\sum_{i,j\ on \ boundary}c_{ij}
 $$
 because $N=\sum_{i\in A}p_i+\sum_{i\in A}(1-p_i)+\sum_{i\in B}p_i+\sum_{i\in B}(1-p_i)$
 New maximization problem:
 $$
 Max\left( N-\sum_{i\in A}(1-p_i)-\sum_{i\in B}p_i-\sum_{i,j\ on \ boundary}c_{ij}\right)
 $$
 Now, this is equivalent to minimizing
 $$
 \sum_{i\in A}(1-p_i)+\sum_{i\in B}p_i+\sum_{i,j\ on \ boundary}c_{ij}
 $$
 Second steps
 - Form a graph with $n$ vertices, one vertex $v_i$ for each pixel
 - Add vertices $s$ and $t$
 - For each $v_i$, add edges $s\to v_i$ with capacity $p_i$ and $v_i\to t$ with capacity $1-p_i$. An $S-T$ cut of $G$ assigns each $v_i$ to either the $S$ side or the $T$ side.
 - The $S$ side of an $S-T$ cut is the $A$ side, while the $T$ side of the cut is the $B$ side.
 - Observe that if $v_i$ goes on the $S$ side, it becomes part of $A$ and the edge $v_i\to t$ is cut, so the cut increases by $1-p_i$. Otherwise, it becomes part of $B$ and the edge $s\to v_i$ is cut, so the cut increases by $p_i$ instead.
 - Now add edges $v_i\to v_j$ with capacity $c_{ij}$ for all adjacent pixels pairs $i,j$
 - If $v_i$ and $v_j$ end up on opposite sides of the cut (boundary), then the cut increases by $c_{ij}$.
 - Conclude that any $S-T$ cut that assigns $S\subseteq V$ to the $A$ side and $V\backslash S$ to the $B$ side pays a total of 
    1. $1-p_i$ for each $v_i$ on the $A$ side
    2. $p_i$ for each $v_i$ on the $B$ side
    3. $c_{ij}$ for each adjacent pair $i,j$ that is at the boundary. i.e. $i\in S\ and\ j\in V\backslash S$
 - Conclude that a cut with capacity $c$ implies a segmentation with (minimization) objective value $c$.
 - The converse can (and should) also be checked: a segmentation with objective value $c$ implies an $S-T$ cut with capacity $c$.
 #### Algorithm
 - Given an image with $N$ pixels, build the graph $G$ as desired.
 - Use the FF algorithm to find a minimum $S-T$ cut of $G$
 - Use this cut to assign each pixel to $A$ or $B$ as described, i.e pixels that correspond to vertices on the $S$ side are assigned to $A$ and those corresponding to vertices on the $T$ side to $B$.
 - Minimizing the cut capacity minimizes our transformed minimization objective function.
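To make the correspondence between cuts and segmentations concrete, here is a small sketch. It assumes the edge capacities $s\to v_i = p_i$, $v_i\to t = 1-p_i$ and $v_i\leftrightarrow v_j = c_{ij}$, chosen so that a cut pays exactly the three costs listed in the solution; the function names and the dictionary-based input format are made up for this example, and pixel labels must differ from `"s"` and `"t"`.

```python
def build_graph(p, adj_cost):
    """p: {pixel: probability of being in A};
    adj_cost: {(i, j): c_ij} with one entry per adjacent pair."""
    cap = {"s": {}, "t": {}}
    for i, pi in p.items():
        cap["s"][i] = pi          # cut when pixel i lands on the T (background) side
        cap[i] = {"t": 1 - pi}    # cut when pixel i lands on the S (object) side
    for (i, j), c in adj_cost.items():
        cap[i][j] = c             # one edge per direction, so either
        cap[j][i] = c             # orientation of the boundary is charged
    return cap

def cut_capacity(cap, S):
    # total capacity of edges leaving the source side S
    return sum(c for u in S for v, c in cap[u].items() if v not in S)

def objective(p, adj_cost, A):
    # the original maximization objective for segmentation (A, rest)
    gain = sum(pi if i in A else 1 - pi for i, pi in p.items())
    boundary = sum(c for (i, j), c in adj_cost.items() if (i in A) != (j in A))
    return gain - boundary
```

For every choice of $A$, `objective(p, adj_cost, A)` should equal `len(p) - cut_capacity(cap, {"s"} | A)`, which is the $N-(\dots)$ rewriting used in the solution.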
 #### Running time
 The graph $G$ contains $\Theta(N)$ edges, because each pixel is adjacent to a maximum of 4 neighbors plus $s$ and $t$.
 The FF algorithm has running time $O((m+n)|F|)$, where $|F|\leq n$ is the capacity of the min-cut. The edge count is $m\leq 6n$.
 So the total running time is $O(n^2)$
--- a/pages/CSE347/CSE347_L5.md
+++ b/pages/CSE347/CSE347_L5.md
@@ -0,0 +1,341 @@
 # Lecture 5
 ## Takeaway from Bipartite Matching
 - We saw how to solve a problem (bi-partite matching and others) by reducing it to another problem (maximum flow).
 - In general, we can design an algorithm to map instances of a new problem to instances of known solvable problem (e.g., max-flow) to solve this new problem!
 - Mapping from one problem to another which preserves solutions is called reduction.
 ## Reduction: Basic Idea
 Convert solutions to the known problem to the solutions to the new problem
 - Instance of new problem
 - Instance of known problem
 - Solution of known problem
 - Solution of new problem
 ## Reduction: Formal Definition
 Problems $L,K$.
 $L$ reduces to $K$ ($L\leq K$) if there is a mapping $\phi$ from **any** instance $l\in L$ to some instance $\phi(l)\in K'\subset K$, such that the solution for $\phi(l)$ yields a solution for $l$.
 This means that **L is no harder than K**
 ### Using reduction to design algorithms
 In the example of reduction to solve Bipartite Matching:
 $L:$ Bipartite Matching
 $K:$ Max-flow Problem
 Efficiency:
 1. Reduction: $\phi:l\to\phi(l)$ (Polynomial time reduction $\phi(l)$)
 2. Solve problem $\phi(l)$ (Polynomial time to solve $poly(g)$)
 3. Convert the solution for $\phi(l)$ to a solution to $l$ (Polynomial time to convert $poly(g)$)
 ### Efficient Reduction
 A reduction $\phi:l\to\phi(l)$ is efficient ($L\leq_p K$) if for any $l\in L$:
 1. $\phi(l)$ is computable from $l$ in polynomial ($|l|$) time.
 2. Solution to $l$ is computable from solution of $\phi(l)$ in polynomial ($|l|$) time.
 We call $L$ is **poly-time reducible** to $K$, or $L$ poly-time
 reduces to $K$.
 ### Which problem is harder?
 Theorem: If $L\leq_p K$ and there is a polynomial time algorithm to solve $K$, then there is a polynomial time algorithm to solve $L$.
 Proof: Given an instance $l\in L$, we run the following three steps, each in time polynomial in the size of its input.
 1. Compute $\phi(l)$: $p(l)$
 2. Solve $\phi(l)$: $p(\phi(l))$
 3. Convert solution: $p(\phi(l))$
 Total time: $p(l)+p(\phi(l))+p(\phi(l))=p(l)+p(\phi(l))$
 Need to show: $|\phi(l)|=poly(|l|)$
 Proof:
 Since we can compute $\phi(l)$ in $p(|l|)$ time, and in every time step we can only write a constant amount of data, the output cannot be longer than the running time.
 So $|\phi(l)|=poly(|l|)$
 ## Hardness Problems
 Reductions show the relationship between problem hardness!
 Question: Could you solve a problem in polynomial time?
 Easy: polynomial time solution
 Hard: No polynomial time solution (as far as we know)
 ### Types of Problems
 Decision Problem: Yes/No answer
 Examples: Subset sums
 1. Is there a flow of size $F$
 2. Is there a shortest path of length $L$ from vertex $u$ to vertex $v$.
 3. Given a set of intervals, can you schedule $k$ of them.
 Optimization Problem: What is the value of an optimal feasible solution of a problem?
 - Minimization: Minimize cost
  - min cut
  - minimal spanning tree
  - shortest path
 - Maximization: Maximize profit
  - interval scheduling
  - maximum flow
  - maximum matching
 #### Canonical Decision Problem
 Does the instance $l\in L$ (an optimization problem) have a feasible solution with objective value $k$:
 Objective value $\geq k$ (maximization) $\leq k$ (minimization)
 $DL$ denotes the canonical decision problem of $L$
 ##### Hardness of Canonical Decision Problems
 Lemma 1: $DL\leq_p L$ ($DL$ is no harder than $L$)
 Proof: Assume $L$ is a **maximization** problem. $DL(l,k)$: does $l$ have a solution with value $\geq k$?
 Example: Does graph $G$ have flow $\geq k$.
 Let $v^*(l)$ be the maximum objective value on $l$, obtained by solving $l$.
 An instance of $DL$ is a pair $(l,k)$, where $l$ is the problem instance and $k$ is the target objective value.
 1. $(l,k)\to \phi(l,k)\in L$ (optimization problem), with $\phi(l,k)=l$
 2. Is $v^*(l)\geq k$? If so, return true, else return false.
 Lemma 2: If $v^* =O(c^{|l|})$ for some constant $c$, then $L\leq_p DL$.
 Proof: Suppose $L$ is a maximization problem, so the canonical decision problem asks: is there a solution with value $\geq k$?
 Naïve Linear Search: Ask $DL(l,1)$; while it returns true, ask $DL(l,2)$, $DL(l,3)$, ... until it returns false. The last $k$ that returned true is $v^*$.
 Runtime: Up to $v^*(l)$ queries to iterate over all possibilities.
 This is exponential! How to reduce it?
 Our old friend Binary (exponential) Search is back!
 We do not know an upper bound in advance: try powers of 2 until you get a no, then do binary search.
 \# questions: $O(\log_2(v^*(l)))=poly(|l|)$
 Binary search range: from the last yes to the first no.
 Runtime: $O(\log(v^*(l)))$ calls to $DL$
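The doubling-then-binary-search idea can be sketched as follows. This is a minimal sketch rather than code from the lecture: `decide` is a hypothetical oracle for $DL$ on a fixed instance $l$ (assumed monotone, with `decide(0)` true), and the function name is made up for illustration.

```python
def optimum_via_decision(decide):
    """decide(k) -> True iff there is a feasible solution of value >= k
    (maximization). Returns the largest k for which decide(k) is True."""
    # Exponential phase: try powers of 2 until the oracle says no
    hi = 1
    while decide(hi):
        hi *= 2
    lo = hi // 2          # last known yes (0 if even decide(1) failed)
    # Binary search between the last yes (lo) and the first no (hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if decide(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

For an optimum $v^*$ this makes about $2\log_2 v^*$ oracle calls, instead of the $v^*$ calls of the linear search.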
 ### Reduction for Algorithm Design vs Hardness
 For problems $L,K$
 If $K$ is “easy” (exists a poly-time solution), then $L$ is also easy.
 If $L$ is “hard” (no poly-time solution), then $K$ is also hard.
 Every problem that we worked on so far, $K$ is “easy”, so we reduce from new problem  to known problem  (e.g., max-flow).
 #### Reduction for Hardness: Independent Set (ISET)
 Input: Given an undirected graph $G = (V,E)$, 
 A subset of vertices $S\subset V$ is called an **independent set** if no two vertices of $S$ are connected by an edge.
 Problem: Does $G$ contain an independent set of size $\geq k$?
 $ISET(G,k)$  returns true if $G$ contains an independent set of size $\geq k$, and false otherwise.
 Algorithm? NO! We think that this is a hard problem.
 A lot of people have tried and could not find a poly-time solution
 ### Example: Vertex Cover (VC)
 Input: Given an undirected graph $G = (V,E)$
 A subset of vertices $C\subset V$ is called a **vertex cover** if it contains at least one endpoint of every edge.
 Formally, for all edges $(u,v)\in E$, either $u\in C$, or $v\in C$.
 Problem: $VC(G,j)$ returns true if $G$ has a vertex cover of size $\leq j$, and false otherwise (minimization problem)
 Example:
 #### How hard is Vertex Cover?
 Claim:  $ISET\leq_p VC$
 Side Note: when we prove $VC$ is hard, we prove it is no easier than $ISET$.
 DO NOT: $VC\leq_p ISET$
 Proof: Show that $G=(V,E)$ has an independent set of size $k$ **if and only if** the same graph (in this reduction the graph is unchanged; in general it need not be) has a vertex cover of size $|V|-k$.
 Map:
 $$
 ISET(G,k)\to VC(G,|V|-k)
 $$
 $G'=G$
 ##### Proof of reduction: Direction 1
 Claim 1: $ISET$ of size $k\to$ $VC$ of size $|V|-k$
 Proof: Assume $G$ has an $ISET$ of size $k:S$, consider $C = V-S,|C|=|V|-k$
 Claim: $C$ is a vertex cover
 ##### Proof of reduction: Direction 2
 Claim 2: $VC$ of size $|V|-k\to ISET$ of size $k$
 Proof: Assume $G$ has a $VC$ of size $|V|-k:C$, consider $S = V - C, |S| =k$
 Claim: $S$ is an independent set
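As a quick sanity check of the mapping $ISET(G,k)\to VC(G,|V|-k)$, the sketch below answers an ISET question with a single call to a vertex cover decider. The names are illustrative, and `vc_brute` is an exponential brute-force stand-in for a VC solver, included only so that the example runs on tiny graphs.

```python
from itertools import combinations

def is_independent_set(edges, S):
    # no edge has both endpoints in S
    return not any(u in S and v in S for u, v in edges)

def is_vertex_cover(edges, C):
    # every edge has at least one endpoint in C
    return all(u in C or v in C for u, v in edges)

def vc_brute(V, edges, j):
    """Is there a vertex cover of size <= j? (brute force, tiny graphs only)"""
    return any(is_vertex_cover(edges, set(C))
               for size in range(min(j, len(V)) + 1)
               for C in combinations(V, size))

def iset_via_vc(V, edges, k, vc_oracle=vc_brute):
    """Decide ISET(G, k) by asking VC(G, |V| - k) on the same graph."""
    return vc_oracle(V, edges, len(V) - k)
```

The two directions of the proof correspond to the fact that `is_independent_set(edges, S)` and `is_vertex_cover(edges, set(V) - S)` agree for every subset `S`.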
 ### What does poly-time mean?
 Algorithm runs in time polynomial to input size.
 - If the input has $n$ items, an algorithm that runs in $\Theta(n^c)$ for some constant $c$ is poly-time.
  - Examples: intervals to schedule, number of integers to sort, # vertices + # edges in a graph
 - Numerical Value (Integer $n$), what is the input size?
  - Examples: weights, capacity, total time, flow constraints
  - It is not straightforward!
 ### Real time complexity of F-F?
 In class: $O(F( |V| + |E|))$
 - $|V| + |E|$ = this much space to represent the graph
 - $F$ : size of the maximum flow.
 If every edge has capacity at most $C$, then $F = O(C|E|)$
 Running time: $O(C|E|(|V|  + |E| ))$
 ### What is the actual input size?
 Each edge ($|E|$ edges):
 - 2 vertices: $|V|$ distinct symbol, $\log |V|$ bits per symbol
 - 1 capacity: $\log C$
 Size of graph:
 - $O(|E|(\log|V| + \log C))$
  - $p( |E| , |V| , \log C)$
 Running time:
 - $P( |E| , |V| , C )$
  - Exponential in the input size if $C$ is exponential in $|V|+|E|$
 ### Pseudo-polynomial
 Naïve Ford-Fulkerson is bad!
 A problem's inputs contain some numerical value, say $W$. We need only $\log W$ bits to store $W$. If an algorithm runs in $p(W)$ time, then it is exponential in the input size, or **pseudo-polynomial**.
 In homework, you improved F-F to make it work in
 $p( |V| ,|E|  , \log C)$, to make it a real polynomial algorithm.
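A small arithmetic sketch of why the naive bound is only pseudo-polynomial. The two helper functions are made up for illustration: they count bits as described in the input-size discussion (two vertex ids and one capacity per edge) and evaluate the $O(C|E|(|V|+|E|))$ bound with constants dropped.

```python
def input_bits(V, E, C):
    """Bits needed to write down E edges, each with two vertex ids
    and one capacity of value at most C."""
    vertex_bits = max(1, (V - 1).bit_length())
    return E * (2 * vertex_bits + C.bit_length())

def naive_ff_steps(V, E, C):
    """The O(C|E|(|V|+|E|)) bound for naive Ford-Fulkerson, constants dropped."""
    return C * E * (V + E)
```

For $|V|=4$ and $|E|=5$, going from $C=2^{10}$ to $C=2^{20}$ adds only 50 bits to the input, but multiplies the step bound by 1024.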
 ## Conclusion: Reductions
 - Reduction
  - Construction of mapping with runtime
  - Bidirectional proof
 - Efficient Reduction $L\leq_p K$
  - Which problem is harder?
  - If $L$ is hard, then $K$ is hard. $\to$ Used to show hardness
  - If $K$ is easy, then $L$ is easy. $\to$ Used for design algorithms
 - Canonical Decision Problem
  - Reduction to and from the optimization problem
 - Reduction for hardness
  - Independent Set $\leq_p$ Vertex Cover
 ## On class
 Reduction: $V^* = O(c^k)$
 OPT: Find max flow of at least one instance $(G,s,t)$
 DEC: Is there a flow of size $\geq k$, given $G,s,t  \implies$ the instance is defined by the tuple $(G,s,t,k)$
 Yes, if there exists one
 No, otherwise
 Forget about F-F and assume that you have an oracle that solves the decision problem.
 First solution (the naive solution): iterate over $k = 1, 2, \dots$ until the oracle returns false; the last $k$ for which it returned true is the max flow.
 Time complexity: $K\cdot X$, where $X$ is the time complexity of the oracle
 Input size: $poly(|V|,|E|, |E|\log(maxCapacity))$, and $V^* \leq \sum$ capacities
 A better solution: do a binary search. If there is no upper bound, we use exponential binary search instead. Then,
 $$
 \begin{aligned}
 X\cdot \log(V^*) &\leq X\cdot \log(\sum capacities)\\
 &\leq X\cdot \log(|E|\cdot maxCapacity)\\
 &\leq X\cdot (\log|E| + \log(maxCapacity))
 \end{aligned}
 $$
 As $\log(maxCapacity)$ is linear in the size of the input, the running time is polynomial in the input size of the original problem.
 Assume that ISET is a hard problem, i.e. we don't know of any polynomial time solution. We want to show that vertex cover is also a hard problem here:
 $ISET \leq_{p} VC$
 1. Given an instance of ISET, construct an instance of VC
 2. Show that the construction can be done in polynomial time
 3. Show that if the ISET instance is true then the VC instance is true
 4. Show that if the VC instance is true then the ISET instance is true.
 > ISET: given $(G,K)$, is there a set of $K$ vertices no two of which share an edge?  
 > VC: given $(G,K)$, is there a set of $K$ vertices that covers all edges?
 1. Given $l: (G,K)$ being an instance of ISET, we construct $\phi(l): (G',K')$ as an instance of VC. $\phi(l): (G, |V|-K)$, i.e., $G' = G$ and $K' = |V| - K$
 2. It is obvious that it is a polynomial time construction since copying the graph is linear, in the size of the graph and the subtraction of integers is constant time.
 **Direction 1**: ISET of size k $\implies$ VC of size $|V| - K$ Assume that ISET(G,K) returns true, show that $VC(G, |V|-K)$ returns true
 Let $S$ be an independent set of size $K$ and $C = V-S$
 We claim that $C$ is a vertex cover of size $|V|-K$
 Proof: 
 We proceed by contradiction. Assume that $C$ is NOT a vertex cover. This means that there is an edge $(u,v)$ such that $u\notin C , v\notin C$. It implies that $u\in S , v\in S$, which contradicts the assumption that $S$ is an independent set.
 Therefore, $C$ is a vertex cover
 **Direction 2**: VC of size $|V|-K \implies$ ISET of size $K$
 Let $C$ be a vertex cover of size $|V|-K$, and let $S = V - C$.
 We claim that $S$ is an independent set of size $K$.
 Again, assume, for the sake of contradiction, that $S$ is not an independent set. And we get
 $\exists (u,v)\in E \textup{ such that } u\in S, v \in S$
 $u,v \notin C$
 $C \textup{ is not a vertex cover}$
 And this is a contradiction with our assumption.
--- a/pages/CSE347/CSE347_L6.md
+++ b/pages/CSE347/CSE347_L6.md
@@ -0,0 +1,287 @@
 # Lecture 6
 ## NP-completeness
 ### $P$: Polynomial-time Solvable
 $P$: Class of decision problems $L$ such that there is a polynomial-time algorithm that correctly answers yes or no for every instance $l\in L$.
 Algorithm "$A$ decides $L$". If algorithm $A$ always correctly answers for any instance $l\in L$.
 Example:
 Is the number $n$ prime? Best algorithm so far: $O(\log^6 n)$, 2002
 ## Introduction to NP
 - NP$\neq$ Non-polynomial (Non-deterministic polynomial time)
 - Let $L$ be a decision problem.
 - Let $l$ be an instance of the problem that the answer happens to be "yes".
 - A **certificate** c(l) for $l$ is a "proof" that the answer for $l$ is true. [$l$ is a true instance]
  - For canonical decision problems for optimization problems, the certificate is often a feasible solution for the corresponding optimization problem.
 ### Example of certificates
 - Problem: Is there a path from $s$ to $t$
  - Instance: graph $G(V,E),s,t$.
  - Certificate: path from $s$ to $t$.
 - Problem: Can I schedule $k$ intervals in the room so that they do not conflict.
  - Instance: $l:(I,k)$
  - Certificate: set of $k$ non-conflicting intervals.
 - Problem: ISET
  - Instance: $G(V,E),k$.
  - Certificate: $k$ vertices with no edges between them.
 If the answer to the problem is NO, you don't need to provide anything to prove that.
 ### Useful certificates
 For a problem to be in NP, the problem need to have "useful" certificates. What is considered a good certificate?
 - Easy to check
  - Verifying algorithm which can check a YES answer and a certificate in $poly(l)$
 - Not too long: [$poly(l)$]
 ### Verifier Algorithm
 **Verifier algorithm** is one that takes an instance $l\in L$ and a certificate $c(l)$ and says yes if the certificate proves that $l$ is a true instance and false otherwise.
 $V$ is a poly-time verifier for $L$ if it is a verifier and runs in $poly(|l|,|c|)$ time. ($|c|=poly(|l|)$)
 - The runtime must be polynomial
 - Must check **every** problem constraint
 - Not always trivial
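As an illustration of a verifier that checks every constraint, here is a sketch for the interval scheduling decision problem from the certificate examples. The input format is an assumption made for this example: intervals are half-open `(start, finish)` pairs, and the certificate is a list of indices into the interval list.

```python
def verify_schedule(intervals, k, certificate):
    """Return True iff the certificate proves that (intervals, k) is a
    YES instance: at least k distinct, valid, pairwise non-conflicting
    intervals."""
    idx = list(certificate)
    # enough intervals, and no interval counted twice
    if len(idx) < k or len(set(idx)) != len(idx):
        return False
    # every index must refer to an interval of the instance
    if any(not (isinstance(i, int) and 0 <= i < len(intervals)) for i in idx):
        return False
    # sorted by start time, it is enough to compare neighbours
    chosen = sorted(intervals[i] for i in idx)
    return all(chosen[a][1] <= chosen[a + 1][0] for a in range(len(chosen) - 1))
```

The checks run in $O(|c|\log|c|)$ time, which is polynomial in the instance size because the certificate is no longer than the list of intervals.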
 ## Class NP
 **NP:** A class of decision problems such that exists a certificate schema $c$ and a verifier algorithm $V$ such that:
 1. certificate is $poly(l)$ in size.
 2. $V:poly(l)$ in time.
 **P:** is a class of problems that you can **solve** in polynomial time
 **NP:** is a class of problems that you can **verify** TRUE instances in polynomial time given a poly-size certificate
 **Millennium question**
 $P\subseteq NP$? $NP\subseteq P$?
 $P\subseteq NP$ is true.
 Proof: Let $L$ be a problem in $P$, we want to show that there is a polynomial size certificate with a poly-time verifier.
 There is an algorithm $A$ which solves $L$ in polynomial time.
 **Certificate:** empty thing.
 **Verifier:** $(l,c)$
 1. Discard $c$.
 2. Run $A$ on $l$ and return the answer.
 Nobody knows the solution $NP\subseteq P$. Sad.
 ### Class of problem: NP complete
 Informally: hardest problem in NP
 Consider a problem $L$.
 - We want to show if $L\in P$, then $NP\subseteq P$
 **NP-hard**: A decision problem $L$ is NP-hard if for any problem $K\in NP$, $K\leq_p L$.
 $L$ is at least as hard as all the problems in NP. If we have an algorithm for $L$, we have an algorithm for any problem in NP with only polynomial time extra cost.
 MindMap:
 $K\implies L\implies sol(L)\implies sol(K)$
 #### Lemma $P=NP$
 Let $L$ be an NP-hard problem. If $L\in P$, then $P=NP$.
 Proof:
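 Say $L$ has a poly-time solution, and take any problem $K$ in $NP$.
 Since $L$ is NP-hard, $K\leq_p L$, so $K$ also has a poly-time solution, i.e. $K\in P$. Hence $NP\subseteq P$; together with $P\subseteq NP$, we get $P=NP$.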
 **NP-complete:** $L$ is **NP-complete** if it is both NP-hard and $L\in NP$.
 **NP-optimization:** $L$ is **NP-optimization** problem if the canonical decision problem is NP-complete.
 **Claim:** If any NP-optimization problem have polynomial-time solution, then $P=NP$.
 ### Is $P=NP$?
 - Answering this problem is hard.
 - But for any NP-complete problem, if you could find a poly-time algorithm for $L$, then you would have answered this question.
 - Therefore, finding a poly-time algorithm for $L$ is hard.
 ## NP-Complete problem
 ### Satisfiability (SAT)
 Boolean Formulas:
 A set of Boolean variables:
 $x,y,a,b,c,w,z,...$ they take values true or false.
 A boolean formula is a formula of Boolean variables with and, or and not.
 Examples:
 $\phi:x\land (\neg y \lor z)\land\neg(y\lor w)$
 $x=1,y=0,z=1,w=0$, the formula is $1$.
 **SAT:** given a formula $\phi$, is there a setting $M$ of variables such that the $\phi$ evaluates to True under this setting.
 If there is such assignment, then $\phi$ is satisfiable. Otherwise, it is not.
 Example: $x\land y\land \neg(x\lor y)$ is not satisfiable.
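To make the definition concrete, here is a brute-force sketch that tries all $2^n$ settings of the variables. Representing a formula as a Python function of an assignment dictionary is an assumption of this example, not a standard SAT input format.

```python
from itertools import product

def satisfying_assignment(phi, variables):
    """phi: function taking {variable: bool} and returning a bool.
    Returns a satisfying setting M, or None if phi is unsatisfiable."""
    for values in product([False, True], repeat=len(variables)):
        M = dict(zip(variables, values))
        if phi(M):
            return M
    return None

# The two formulas from the examples
phi1 = lambda M: M["x"] and (not M["y"] or M["z"]) and not (M["y"] or M["w"])
phi2 = lambda M: M["x"] and M["y"] and not (M["x"] or M["y"])
```

Checking one given assignment is a single call to `phi`, which is the cheap verification step; the loop over all assignments is the expensive part.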
 Seminal papers by Cook and Levin in the early 1970s showed that SAT is NP-complete.
 1. SAT is in NP  
    Proof:  
    $\exists$ a certificate schema and a poly-time verifier.  
    $c$ satisfying assignment $M$ and $v$ check that $M$ makes $\phi$ true.
 2. SAT is NP-hard. We can just accept it as a fact.
 #### How to show a problem is NP-complete?
 Say we have a problem $L$.
 1. Show that $L\in NP$.  
   Exists certificate schema and verification algorithm in polynomial time.
 2. Prove that we can reduce SAT to $L$. $SAT\leq_p L$ **(NOT $L\leq_p SAT$)**
    Solving $L$ also solve SAT.
 ### CNF-SAT
 **CNF:** Conjunctive normal form of SAT
 The formula $\phi$ must be an "and of ors"
 $$
 \phi=\land_{i=1}^n(\lor^{m_i}_{j=1}l_{i,j})
 $$
 $l_{i,j}$: literal (each inner "or" of literals is a clause)
 ### 3-CNF-SAT
 **3-CNF-SAT:** CNF-SAT where every clause has exactly 3 literals.
 It is NP-complete [not all versions are, 2-CNF-SAT is in P]
 Input: 3-CNF expression with $n$ variables and $m$ clauses in the form:
 number of total literals: $3m$
 Output: An assignment of the $n$ variables such that at least one literal from each clauses evaluates to true.
 Note:
 1. One variable can be used to satisfy multiple clauses.
 2. $x_i$ and $\neg x_i$ cannot both evaluate to true.
 Example: ISET is NP-complete.
 Proof:
 Say we have a problem $L$
 1. Show that $ISET\in NP$  
    Certificate: set of $k$ vertices: $|S|=k\in poly(g)$  
    Verifier: checks that there are no edges between them $O(E k^2)$
 2. ISET is NP-hard. We need to prove $3SAT\leq_p ISET$
    - Construct a reduction from $3SAT$ to $ISET$.
    - Show that $ISET$ is harder than $3SAT$.
 We need to prove $\phi\in 3SAT$ is satisfiable if and only if the constructed $G$ has an $ISET$ of size $\geq k=m$
 #### Reduction mapping construction
 We construct an ISET instance from $3-SAT$.
 Suppose the formula has $n$ variables and $m$ clauses
 1. for each clause, we construct vertex for each literal and connect them (for $x\lor \neg y\lor z$, we connect $x,\neg y,z$ together)
 2. then we connect all the literals with their negations (connects $x$ and $\neg x$)
 $\implies$
 If $\phi$ has a satisfiable assignment, then $G$ has an independent set of size $\geq m$,
 For a set $S$ we pick exactly one true literal from every clause and take the corresponding vertex to that clause, $|S|=m$
 Must also argue that $S$ is an independent set.
 Example: picked a set of vertices $|S|=4$.
 A literal has edges:
 - To all literals in the same clause: We never pick two literals from the same clause.
 - To its negation.
 Since it is a satisfiable 3-SAT assignment, $x$ and $\neg x$ cannot both evaluate to true, those edges are not a problem, so $S$ is an independent set.
 $\impliedby$
 If $G$ has an independent set of size $\geq m$, then $\phi$ is satisfiable.
 Say that $S$ is an independent set of $m$, we need to construct a satisfiable assignment for the original $\phi$.
 - If $S$ contains a vertex corresponding to literal $x_i$, then set $x_i$ to true.
 - If contains a vertex corresponding to literal $\neg x_i$, then set $x_i$ to false.
 - Other variables can be set arbitrarily
 Question: Is it a valid 3-SAT assignment?
 Your ISET $S$ can contain at most $1$ vertex from each clause. Since vertices in a clause are connected by edges.
 - Since $S$ contains $m$ vertices, it must contain exactly $1$ vertex from each clause.
 - Therefore, we will make at least $1$ literal from each clause true.
 - Therefore, all the clauses are true and $\phi$ is satisfied.
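The construction and the way back from an independent set to an assignment can be sketched as follows. The encoding is an assumption made for this example: a literal is a nonzero integer (`+i` for $x_i$, `-i` for $\neg x_i$), a clause is a tuple of literals, and a vertex is a pair `(clause index, position)`.

```python
def sat_to_iset(clauses):
    """Build the ISET instance (vertices, edges, k) for a 3-CNF formula."""
    vertices = [(c, pos) for c in range(len(clauses))
                for pos in range(len(clauses[c]))]
    lit = {v: clauses[v[0]][v[1]] for v in vertices}
    edges = set()
    for a in vertices:
        for b in vertices:
            # connect literals of the same clause, and every literal
            # with each occurrence of its negation
            if a < b and (a[0] == b[0] or lit[a] == -lit[b]):
                edges.add((a, b))
    return vertices, edges, len(clauses)

def assignment_from_iset(clauses, S):
    """S: independent set of vertices. Returns {variable: bool}; variables
    that do not appear in S may be set arbitrarily."""
    return {abs(clauses[c][pos]): clauses[c][pos] > 0 for c, pos in S}
```

The graph has $3m$ vertices and $O(m^2)$ edges, so the construction runs in polynomial time.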
 ## Conclusion: NP-completeness
 - Prove NP-Complete:
  - If NP-optimization, convert to canonical decision problem
  - Certificate, Verification algorithm
  - Prove NP-hard: reduce from an existing NP-Complete problem
 - 3-SAT Problem:
  - Input, output, constraints
  - A well-known NP-Complete problem
  - Reduce from 3-SAT to ISET to show ISET is NP-Complete
 ## On class
 ### NP-complete
 $p\in NP$, if we have a certificate schema and a verifier algorithm.
 ### NP-complete proof
 #### P is in NP
 Describe what a certificate looks like, and show that its size is polynomial in the problem size.
 Design a verifier algorithm that checks whether a certificate indeed proves that the answer is YES, and show that it runs in polynomial time. Inputs: the certificate $c$ and the problem input $l$; the runtime is $poly(|l|,|c|)=poly(|p|)$, where $|p|$ is the problem size.
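As a concrete instance, a verifier for ISET might look like the sketch below. The certificate is assumed to be a list of vertices and the graph is assumed to be given as a vertex list plus a collection of 2-element edges; the name `verify_iset` is hypothetical.

```python
def verify_iset(vertices, edges, k, certificate):
    """Return True if the certificate proves that the graph has an ISET of size >= k.

    edges is a collection of 2-element sets (or tuples) of vertices.
    """
    S = set(certificate)
    # The certificate must consist of at least k distinct vertices of the graph.
    if len(S) < k or not S <= set(vertices):
        return False
    # No edge may have both endpoints inside S.
    return all(not (set(e) <= S) for e in edges)
```

The certificate has at most $|V|$ vertices and the check visits every edge once, so both the certificate size and the verification time are polynomial in the input size.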
 #### P is NP hard
 select an already known NP-hard problem: eg. 3-SAT, ISET, VC,...
 show that $3-SAT\leq_p p$
 - present an algorithm that maps any instance of 3-SAT (or the chosen NP-hard problem) to an instance of $p$.
 - show that the construction is done in polynomial time.
 - show that if $p$'s instance answer is YES, then the instance of 3-SAT is YES.
 - show that if 3-SAT's instance answer is YES then the instance of $p$ is YES.
--- a/pages/CSE347/CSE347_L7.md
+++ b/pages/CSE347/CSE347_L7.md
--- a/pages/CSE347/CSE347_L8.md
+++ b/pages/CSE347/CSE347_L8.md
@@ -0,0 +1,353 @@
 # Lecture 8
 ## NP-optimization problem
 Not known to be solvable in polynomial time (no polynomial-time algorithm exists unless $P=NP$).
 Example:
 - Maximum independent set
 - Minimum vertex cover
 What can we do?
 - solve small instances
 - hard instances are rare - average case analysis
 - solve special cases
 - find an approximate solution
 ## Approximation algorithms
 We find a "good" solution in polynomial time, but may not be optimal.
 Example:
 - Minimum vertex cover: we will find a small vertex cover, but not necessarily the smallest one.
 - Maximum independent set: we will find a large independent set, but not necessarily the largest one.
 Question: How do we quantify the quality of the solution?
 ### Approximation ratio
 Intuition:
 How good is an algorithm $A$ compared to an optimal solution in the worst case?
 Definition:
 Consider algorithm $A$ for an NP-optimization problem $L$. Say for **any** instance $l$, $A$ finds a solution of value $c_A(l)$ and the optimal solution has value $c^*(l)$. 
 The approximation ratio $\alpha$ is the worst case of the ratio over all instances:
 $$
 \min_{l \in L} \frac{c_A(l)}{c^*(l)}=\alpha
 $$
 for maximization problems (so $\alpha\leq 1$), or
 $$
 \max_{l \in L} \frac{c_A(l)}{c^*(l)}=\alpha
 $$
 for minimization problems (so $\alpha\geq 1$).
 Example:
 Alice's Algorithm, $A$, finds a vertex cover of size $c_A(l)$ for instance $l(G)$. The optimal vertex cover has size $c^*(l)$.
 We want approximation ratio to be as close to 1 as possible.
 > Vertex cover:
 > 
 > A vertex cover is a set of vertices that touches all edges.
 Let's try an approximation algorithm for the vertex cover problem, called Greedy cover.
 #### Greedy cover
 Pick any uncovered edge, both its endpoints are added to the cover $C$, until all edges are covered.
 Runtime: $O(m)$
 Claim: Greedy cover is correct, and it finds a vertex cover.
 Proof:
 Algorithm only terminates when all edges are covered.
 Claim: Greedy cover is a 2-approximation algorithm.
 Proof:
 Look at the edges we picked.
 No two picked edges share an endpoint: an edge is picked only if it is uncovered, and after picking it both of its endpoints are in $C$.
 Any vertex cover, including the optimal one, must contain at least one endpoint of each picked edge, so $c^*(l)$ is at least the number of picked edges.
 Greedy cover adds both endpoints of each picked edge, so $|C|$ is exactly twice the number of picked edges. (The worst case is a graph of disjoint edges.)
 Thus, the size of the vertex cover found by Greedy cover is at most twice the size of the optimal vertex cover.
 Thus, Greedy cover is a 2-approximation algorithm.
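Greedy cover is short enough to sketch directly. The function name `greedy_cover` and the edge-list input format are assumptions for illustration; the function also returns the picked edges so that the bound $|C|=2\cdot(\text{picked edges})$ is easy to check.

```python
def greedy_cover(edges):
    """Greedy 2-approximation for vertex cover.

    edges is an iterable of (u, v) pairs. Returns (cover, picked_edges).
    """
    cover = set()
    picked = []
    for u, v in edges:
        # An edge is uncovered only if neither endpoint is in the cover yet.
        if u not in cover and v not in cover:
            cover.update((u, v))
            picked.append((u, v))
    return cover, picked
```

On the path 1-2-3-4 the sketch picks edges (1,2) and (3,4) and returns all four vertices, while the optimal cover {2,3} has size 2, so the factor of 2 is attained.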
 > Min-cut:
 >
 > Given a graph $G$ and two vertices $s$ and $t$, find the minimum cut between $s$ and $t$.
 >
 > Max-cut:
 >
 > Given a graph $G$, find the maximum cut.
 #### Local cut
 Algorithm:
 - start with an arbitrary cut of $G$.
 - While you can move a vertex from one side to the other side while increasing the size of the cut, do so.
 - Return the cut found.
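A minimal sketch of Local cut follows, assuming an undirected graph without self-loops given as a vertex list and an edge list; the name `local_cut` and the choice of starting with every vertex on side 0 are assumptions, since the algorithm allows any starting cut.

```python
def local_cut(vertices, edges):
    """Local search for max-cut. Returns (side, cut_size), where side[v] is 0 or 1."""
    side = {v: 0 for v in vertices}  # arbitrary starting cut
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    improved = True
    while improved:
        improved = False
        for v in vertices:
            same = sum(1 for w in adj[v] if side[w] == side[v])
            cross = len(adj[v]) - same
            # Moving v turns its same-side edges into crossing edges and vice versa,
            # so the cut grows by same - cross >= 1 whenever same > cross.
            if same > cross:
                side[v] = 1 - side[v]
                improved = True
    cut_size = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut_size
```

Every move strictly increases the cut, so the loop terminates, and at termination every vertex has at least as many crossing as non-crossing incident edges.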
 We will prove its:
 - Runtime
 - Feasibility
 - Approximation ratio
 ##### Runtime for local cut
 The size of a cut is at most $|E|$.
 When we move a vertex from one side to the other side, the size of the cut increases by at least 1.
 Thus, the algorithm terminates after at most $|E|$ moves.
 Finding a vertex to move takes at most $O(|V|^2)$ time (check every vertex against every other vertex), so the runtime is $O(|E||V|^2)$.
 ##### Feasibility for local cut
 The algorithm only terminates when no more vertices can be moved.
 Thus, the cut found is a feasible solution.
 ##### Approximation ratio for local cut
 This is a half-approximation algorithm.
 We need to show that the size of the cut found is at least half of the size of the optimal cut.
 We first upper bound the optimal cut: its size is at most $|E|$.
 We then prove that the cut we found has size at least $\frac{|E|}{2}$, which is at least half of the optimal cut, for any graph $G$.
 Proof:
 When we terminate, no vertex can be moved to increase the cut.
 Therefore, **for every vertex, the number of incident crossing edges is at least the number of incident non-crossing edges**.
 Let $d(u)$ be the degree of vertex $u\in V$.
 The number of crossing edges incident to vertex $u$ is at least $\frac{1}{2}d(u)$.
 Summing over all vertices counts every crossing edge twice, so twice the number of crossing edges is at least $\frac{1}{2}\sum_{u\in V}d(u)=\frac{1}{2}\cdot 2|E|=|E|$.
 So the number of crossing edges is at least $\frac{|E|}{2}$, and the number of non-crossing edges is at most $\frac{|E|}{2}$.
 EOP
 #### Set cover
 Problem:
 You are collecting a set of magic cards.
 $X$ is the set of all possible cards. You want at least one of each card.
 Each dealer $j$ has a pack $S_j\subseteq X$ of cards. You have to buy entire pack or none from dealer $j$.
 Goal: What is the least number of packs you need to buy to get all cards?
 Formally:
 Input $X$ is a universe of $n$ elements, and a collection of subsets of $X$, $Y=\{S_1, S_2, \ldots, S_m\}$ with each $S_i\subseteq X$.
 Goal: Pick $C\subseteq Y$ such that $\bigcup_{S_i\in C}S_i=X$, and $|C|$ is minimized.
 Set cover is an NP-optimization problem. It is a generalization of the vertex cover problem.
 #### Greedy set cover
 Algorithm:
 - Start with empty set $C$.
 - While there is an element of $X$ that is not covered, pick the set $S_i$ that covers the most uncovered elements.
 - Add $S_i$ to $C$.
 - Return $C$.
 ```python
 def greedy_set_cover(X, Y):
    # X is the set of elements
    # Y is the collection of sets, hashset by default
    C = []
    def non_covered_elements(X, C):
        # return the elements in X that are not covered by C
        # O(|X|)
        return [x for x in X if not any(x in c for c in C)]
    non_covered = non_covered_elements(X, C)
    # O(|X|) every loop reduce the size of non_covered by 1
    while non_covered:
        max_cover,max_set = 0,None
        # O(|Y|)
        for S in Y:
            # Intersection of two sets is O(min(|X|,|S|))
            cur_cover = len(set(non_covered) & set(S))
            if cur_cover > max_cover:
                max_cover,max_set = cur_cover,S
        C.append(max_set)
        non_covered = non_covered_elements(X, C)
    return C
 ```
 It is not optimal.
 Need to prove its:
 - Correctness:  
    Keep picking until all elements are covered.
 - Runtime:  
    $O(|X||Y|^2)$
 - Approximation ratio:  
 ##### Approximation ratio for greedy set cover
 > Harmonic number:
 >
 > $H_n=\sum_{i=1}^n\frac{1}{i}=\frac{1}{1}+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{n}=\Theta(\log n)$
 We claim that the size of the set cover found is at most $H_n=\Theta(\log n)$ times the size of the optimal set cover.
 ###### First bound:
 Proof:
 If the optimal solution picks $k$ sets, then the set cover found has at most $(1+\ln n)k$ sets.
 Let $n=|X|$.
 Observe that
 Before the first round, the number of elements that are not covered is $n$.
 $$
 |U_0|=n
 $$
 After the first round, the number of uncovered elements is $|U_0|-x_1$, where $x_1=|S_1|$ is the number of elements in the set picked in the first round.
 $$
 |U_1|=|U_0|-|S_1|
 $$
 ...
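 In general, let $x_i=|S_i\cap U_{i-1}|$ be the number of new elements covered in round $i$, so $|U_i|=|U_{i-1}|-x_i$. We claim $x_i\geq \frac{|U_{i-1}|}{k}$.
 We proceed by contradiction.
 Suppose every set in the optimal solution covers $< \frac{|U_{i-1}|}{k}$ elements of $U_{i-1}$. Then the $k$ optimal sets together cover $< |U_{i-1}|$ elements of $U_{i-1}$, contradicting that they cover all of $X$.
 So some set covers at least $\frac{|U_{i-1}|}{k}$ uncovered elements, and greedy picks a set at least that good. Hence $|U_i|\leq |U_{i-1}|(1-\frac{1}{k})\leq n(1-\frac{1}{k})^i$.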
 > Some math magics:
 > $$(1-\frac{1}{k})^k\leq \frac{1}{e}$$
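 Before its last round the algorithm still has an uncovered element, so $1\leq |U_{|C|-1}|\leq n(1-\frac{1}{k})^{|C|-1}\leq ne^{-\frac{|C|-1}{k}}$, which gives $|C|\leq 1+k\ln n$.
 So the size of the set cover found is at most $(1+\ln n)k$.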
 EOP
 So the greedy set cover is not too bad...
 ###### Second bound:
 Greedy set cover is a $H_d$-approximation algorithm of set cover.
 Proof:
 Assign a cost to the elements of $X$ according to the decisions of the greedy set cover.
 Let $\delta(S^i)$ be the number of new elements covered by set $S^i$, the set picked at step $i$.
 $$
 \delta(S^i)=|S_i\cap U_{i-1}|
 $$
 If the element $x$ is first covered at step $i$, when set $S_i$ is picked, then the cost of $x$ is
 $$
 \frac{1}{\delta(S^i)}=\frac{1}{x_i}
 $$
 Example:
 $$
 \begin{aligned}
 X&=\{A,B,C,D,E,F,G\}\\
 S_1&=\{A,C,E\}\\
 S_2&=\{B,C,F,G\}\\
 S_3&=\{B,D,F,G\}\\
 S_4&=\{D,G\}
 \end{aligned}
 $$
 First we select $S_2$, then $cost(B)=cost(C)=cost(F)=cost(G)=\frac{1}{4}$.
 Then we select $S_1$, then $cost(A)=cost(E)=\frac{1}{2}$.
 Then we select $S_3$, then $cost(D)=1$.
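The cost assignment in this example can be traced with a short sketch. The function name `greedy_costs`, the dictionary input, and the tie-breaking rule (the first set in dictionary order wins ties) are assumptions for illustration; exact fractions are used so the costs match the hand computation.

```python
from fractions import Fraction

def greedy_costs(X, sets):
    """Run greedy set cover and charge each element 1/(number of new elements).

    sets maps a name to a set of elements. Returns (picked_names, cost_per_element).
    """
    uncovered = set(X)
    picked, cost = [], {}
    while uncovered:
        # Pick the set covering the most uncovered elements; ties go to the first name.
        name = max(sets, key=lambda s: len(sets[s] & uncovered))
        new = sets[name] & uncovered
        if not new:
            raise ValueError("the sets do not cover X")
        for x in new:
            cost[x] = Fraction(1, len(new))
        picked.append(name)
        uncovered -= new
    return picked, cost

sets = {"S1": set("ACE"), "S2": set("BCFG"), "S3": set("BDFG"), "S4": set("DG")}
picked, cost = greedy_costs("ABCDEFG", sets)
```

With this tie-breaking rule the sketch picks $S_2$, then $S_1$, then $S_3$, and the costs sum to $3$, the number of sets picked.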
 If element $x$ was covered by greedy set cover due to the addition of set $S^i$ at step $i$, then the cost of $x$ is $\frac{1}{\delta(S^i)}$.
 $$
 \textup{Total cost of GSC}=\sum_{x\in X}c(x)=\sum_{i=1}^{|C|}\sum_{x\textup{ covered at iteration }i}c(x)=\sum_{i=1}^{|C|}\delta(S^i)\frac{1}{\delta(S^i)}=|C|
 $$
 Claim: Consider any set $S$ that is a subset of $X$. The cost paid by the greedy set cover for $S$ is at most $H_{|S|}$.
 Suppose that the greedy set cover covers the elements of $S$ in order $x_1,x_2,\ldots,x_{|S|}$, where $\{x_1,x_2,\ldots,x_{|S|}\}=S$.
 Just before GSC covers $x_j$, the elements $\{x_j,x_{j+1},\ldots,x_{|S|}\}$ are not covered.
 At this point, the GSC has the option of picking $S$.
 This implies that $\delta(S)$ is at least $|S|-j+1$.
 GSC picks the set $\hat{S}$ for which $\delta(\hat{S})$ is maximized ($\hat{S}$ may be $S$ or another set that covers $x_j$).
 So, $\delta(\hat{S})\geq \delta(S)\geq |S|-j+1$.
 So the cost of $x_j$ is $\frac{1}{\delta(\hat{S})}\leq \frac{1}{\delta(S)}\leq \frac{1}{|S|-j+1}$.
 Summing over all $j$, the cost of $S$ is at most $\sum_{j=1}^{|S|}\frac{1}{|S|-j+1}=H_{|S|}$.
 Back to the proof of approximation ratio:
 Let $C^*$ be optimal set cover.
 $$
 |C|=\sum_{x\in X}c(x)\leq \sum_{S_j\in C^*}\sum_{x\in S_j}c(x)
 $$
 This inequality holds because an element covered by more than one set of $C^*$ is counted more than once on the right-hand side.
 Since $\sum_{x\in S_j}c(x)\leq H_{|S_j|}$, by our claim.
 Let $d$ be the largest cardinality of any set in $C^*$.
 $$
 |C|\leq \sum_{S_j\in C^*}H_{|S_j|}\leq \sum_{S_j\in C^*}H_d=H_d|C^*|
 $$
 So the approximation ratio for greedy set cover is $H_d$.
 EOP
--- a/pages/CSE347/CSE347_L9.md
+++ b/pages/CSE347/CSE347_L9.md
@@ -0,0 +1,349 @@
 # Lecture 9
 ## Randomized Algorithms
 ### Hashing
 Hashing with chaining:
 Input: We have integers in range $[0,N-1]=U$. We want to map them to a hash table $T$ with $m$ slots.
 Hash function: $h:U\rightarrow [m]$
 Goal: Hashing a set $S\subseteq U$, $|S|=n$ into $T$ such that the number of elements in each slot is at most $1$.
 #### Collisions
 When multiple keys are mapped to the same slot, we call it a collision. We keep a linked list of all the keys that map to the same slot.
 **Runtime** of insert, query, delete of elements $=\Theta(\textup{length of the chain})$
 **Worst-case** runtime of insert, query, delete of elements $=\Theta(n)$
 Therefore, we want chains to be short, or $\Theta(1)$, as long as $|S|$ is reasonably sized, or equivalently, we want the keys in any set $S$ to hash **uniformly** across all slots.
 #### Simple Uniform Hashing Assumptions
 The $n$ elements we want to hash (the set $S$) are picked uniformly at random from $U$. Under this assumption, this simple hash function works fine:
 $$
 h(x)=x\mod m
 $$
 Question: What happens if an adversary knows this function and designs $S$ to make the worst-case runtime happen?
 Answer: The adversary can make the runtime of each operation $\Theta(n)$ by simply making all the elements hash to the same slot.
 #### Randomization to the rescue
 We don't want the adversary to know the hash function based on just looking at the code.
 Idea: Randomize the choice of the hash function.
 ### Randomized Algorithm
 #### Definition
 A randomized algorithm is an algorithm that makes internal random choices.
 2 kinds of randomized algorithms:
 1. Las Vegas: The runtime is random, but the output is always correct.
 2. Monte Carlo: The runtime is fixed, but the output is sometimes incorrect.
 We will focus on Las Vegas algorithms in this course.
 We analyze the expected runtime, e.g. $E[T(n)]=O(n)$, or some other probabilistic quantity.
 #### Randomization can help
 Idea: Randomize the choice of hash function $h$ from a family of hash functions, $H$.
 If we randomly pick a hash function from this family, then the probability that the hash function is bad on **any particular** set $S$ is small.
 Intuitively, the adversary can not pick a bad input since most hash functions are good for any particular input $S$.
 #### Universal Hashing: Goal
 We want to design a universal family of hash functions, $H$, such that the probability that the hash table behaves badly on any input $S$ is small.
 #### Universal Hashing: Definition
 Suppose we have $m$ buckets in the hash table. We also have $2$ inputs $x\neq y$ and $x,y\in U$. We want $x$ and $y$ to be unlikely to hash to the same bucket.
 $H$ is a universal **family** of hash functions if for any two elements $x\neq y$,
 $$
 Pr_{h\in H}[h(x)=h(y)]=\frac{1}{m}
 $$
 where $h$ is picked uniformly at random from the family $H$.
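One well-known family with this kind of guarantee (not necessarily the one used in lecture) is $h_{a,b}(x)=((ax+b)\bmod p)\bmod m$ for a prime $p>|U|$, which has collision probability at most $\frac{1}{m}$. The sketch below is an illustration under that assumption; the names `make_hash` and `collision_probability` are hypothetical, and the exhaustive check is only feasible for tiny $p$.

```python
import random

def make_hash(p, m, rng=random):
    """Pick h(x) = ((a*x + b) mod p) mod m with random a in [1, p-1], b in [0, p-1]."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda x: ((a * x + b) % p) % m

def collision_probability(x, y, p, m):
    """Exact probability, over all (a, b), that x and y land in the same slot."""
    hits = sum(
        1
        for a in range(1, p)
        for b in range(p)
        if ((a * x + b) % p) % m == ((a * y + b) % p) % m
    )
    return hits / (p * (p - 1))
```

The adversary sees this code but not the random $a,b$, so no fixed pair of keys collides with probability more than $\frac{1}{m}$.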
 #### Universal Hashing: Analysis
 Claim: If we choose $h$ randomly from a universal family of hash functions, $H$, then the hash table will exhibit good behavior on any set $S$ of size $n$ with high probability.
 Question: What are some good properties and what does it mean by with high probability?
 Claim: Given a universal family of hash functions, $H$, $S=\{a_1,a_2,\cdots,a_n\}\subset \mathbb{N}$. For any probability $0\leq \delta\leq 1$, if $n\leq \sqrt{2m\delta}$, the chance that no two keys hash to the same slot is $\geq1-\delta$.
 Example: If we pick $\delta=\frac{1}{2}$. As long as $n\leq\sqrt{m}$, the chance that no two keys hash to the same slot is $\geq\frac{1}{2}$.
 If we pick $\delta=\frac{1}{3}$. As long as $n\leq\sqrt{\frac{2}{3}m}$, the chance that no two keys hash to the same slot is $\geq\frac{2}{3}$.
 Proof Strategy:
 1. Compute the **expected value** of collisions. Note that collisions occurs when two different values are hashed to the same slot. (Indicator random variables)
 2. Apply a "tail" bound that converts the expected value to probability. (Markov's inequality)
 ##### Compute the expected number of collisions
 Let $m$ be the size of the hash table. $n$ is the number of keys in the set $S$. $N$ is the size of the universe.
 For inputs $x,y\in S,x\neq y$, we define a random variable
 $$
 C_{xy}=
 \begin{cases}
 1 & \text{if } h(x)=h(y) \\
 0 & \text{otherwise}
 \end{cases}
 $$
 $C_{xy}$ is called an indicator random variable, that takes value $0$ or $1$.
 The expected value of $C_{xy}$ is
 $$
 E[C_{xy}]=1\times Pr[C_{xy}=1]+0\times Pr[C_{xy}=0]=Pr[C_{xy}=1]=\frac{1}{m}
 $$
 Define $C_x$: random variable that represents the cost of inserting/searching/deleting $x$ from the hash table.
 $C_x\leq$ total number of elements that collide with $x$ (= number of elements $y$ such that $h(x)=h(y)$).
 $$
 C_x=\sum_{y\in S,y\neq x,h(x)=h(y)}1
 $$
 So, $C_x=\sum_{y\in S,y\neq x}C_{xy}$.
 By linearity of expectation,
 $$
 E[C_x]=\sum_{y\in S,y\neq x}E[C_{xy}]=\sum_{y\in S,y\neq x}\frac{1}{m}=\frac{n-1}{m}
 $$
 $E[C_x]=O(1)$ if $n=O(m)$. The total expected cost of $k$ insert/search operations is $O(k)$, by linearity of expectation.
 Say $C$ is the total number of collisions.
 $C=\frac{\sum_{x\in S}C_x}{2}$ because each collision is counted twice.
 $$
 E[C]=\frac{1}{2}\sum_{x\in S}E[C_x]=\frac{1}{2}\sum_{x\in S}\frac{n-1}{m}=\frac{n(n-1)}{2m}
 $$
 If we want $E[C]\leq \delta$, it suffices to have $n\leq\sqrt{2m\delta}$, since then $\frac{n(n-1)}{2m}<\frac{n^2}{2m}\leq\delta$.
 #### The probability of no collisions $C=0$
 We know that the expected value of number of collisions is now $\leq \delta$, but what about the probability of **NO** collisions?
 > Markov's inequality: for a non-negative random variable $X$ and $k>0$, $$P[X\geq k]\leq\frac{E[X]}{k}$$
 > Equivalently, $Pr[X\geq k\cdot E[X]]\leq \frac{1}{k}$.
 Apply this to $C$ with $k=\frac{1}{\delta}$. Since $E[C]\leq\delta$, we have $\frac{1}{\delta}E[C]\leq 1$, so
 $$
 Pr[C\geq 1]\leq Pr[C\geq \frac{1}{\delta}E[C]]\leq\delta
 $$
 So if $n\leq\sqrt{2m\delta}$, then $Pr[C=0]\geq 1-\delta$: with probability at least $1-\delta$, there are no collisions.
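To make the bound concrete, the sketch below evaluates $E[C]=\frac{n(n-1)}{2m}$ and the largest $n$ allowed by $n\leq\sqrt{2m\delta}$. The helper names `expected_collisions` and `max_keys` are hypothetical, and the sketch only plugs numbers into the formulas derived above.

```python
import math

def expected_collisions(n, m):
    """E[C] = n(n-1)/(2m) for n keys hashed into m slots by a universal family."""
    return n * (n - 1) / (2 * m)

def max_keys(m, delta):
    """Largest integer n with n <= sqrt(2*m*delta)."""
    return math.floor(math.sqrt(2 * m * delta))

n = max_keys(100, 0.5)             # table with 100 slots, failure probability 1/2
e_c = expected_collisions(n, 100)  # expected number of colliding pairs
```

For $m=100$ and $\delta=\frac{1}{2}$ this gives $n=10$ and $E[C]=0.45\leq\delta$, so by Markov's inequality there is no collision with probability at least $\frac{1}{2}$.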
 #### More general conclusion
 Claim: For a universal hash function family $H$, if number of keys $n\leq \sqrt{Bm\delta}$, then the probability that at most $B+1$ keys hash to the same slot is $> 1-\delta$.
 ### Example: Quicksort
 Based on partitioning [assume all elements are distinct]: Partition($A[p\cdots r]$)
 - Rearranges $A$ into $A[p\cdots q-1],A[q],A[q+1\cdots r]$
 Runtime: $O(r-p)$, linear time.
 ```python
 def partition(A,p,r):
    x=A[r]
    lo=p
    for i in range(p,r):
        if A[i]<x:
            A[lo],A[i]=A[i],A[lo]
            lo+=1
    A[lo],A[r]=A[r],A[lo]
    return lo
 def quicksort(A,p,r):
    if p<r:
        q=partition(A,p,r)
        quicksort(A,p,q-1)
        quicksort(A,q+1,r)
 ```
 #### Runtime analysis
 Let the number of elements in $A_{low}$ be $k$.
 $$
 T(n)=\Theta(n)+T(k)+T(n-k-1)
 $$
 By even split assumption, $k=\frac{n}{2}$.
 $$
 T(n)=T(\frac{n}{2})+T(\frac{n}{2}-1)+\Theta(n)\approx \Theta(n\log n)
 $$
 Which is approximately the same as merge sort.
 _Average case analysis is always suspicious._
 ### Randomized Quicksort
 - Pick a random pivot element.
 - Analyze the expected runtime over the random choices of pivot.
 ```python
 import random
 def randomized_partition(A,p,r):
    ix=random.randint(p,r)
    x=A[ix]
    A[r],A[ix]=A[ix],A[r]
    lo=p
    for i in range(p,r):
        if A[i]<x:
            A[lo],A[i]=A[i],A[lo]
            lo+=1
    A[lo],A[r]=A[r],A[lo]
    return lo
 def randomized_quicksort(A,p,r):
    if p<r:
        q=randomized_partition(A,p,r)
        randomized_quicksort(A,p,q-1)
        randomized_quicksort(A,q+1,r)
 ```
 $$
 E[T(n)]=E(T(n-k-1)+T(k)+cn)=E(T(n-k-1))+E(T(k))+cn
 $$
 by linearity of expectation.
 $$
 Pr[\textup{pivot has rank }k]=\frac{1}{n}
 $$
 So,
 $$
 \begin{aligned}
 E[T(n)]&=\frac{1}{n}\sum_{k=0}^{n-1}(E[T(k)]+E[T(n-k-1)])+cn\\
 &=cn+\frac{1}{n}\sum_{k=0}^{n-1}E[T(k)]+\frac{1}{n}\sum_{k=0}^{n-1}E[T(n-k-1)]\\
 &=cn+\frac{1}{n}\sum_{k=0}^{n-1}E[T(k)]+\frac{1}{n}\sum_{j=0}^{n-1}E[T(j)]\quad(j=n-k-1)\\
 &=cn+\frac{2}{n}\sum_{k=0}^{n-1}E[T(k)]
 \end{aligned}
 $$
 Claim: the solution to this recurrence is $E[T(n)]=O(n\log n)$; more precisely $T(n)\leq c'n\log n+1$, writing $T(n)$ for $E[T(n)]$.
 Proof:
 We prove by induction.
 Base case: $n=1,T(n)=T(1)=c$
 Inductive step: Assume that $T(k)\leq c'k\log k+1$ for all $k<n$.
 Then,
 $$
 \begin{aligned}
 T(n)&=cn+\frac{2}{n}\sum_{k=0}^{n-1}T(k)\\
 &\leq cn+\frac{2}{n}\sum_{k=0}^{n-1}(c'k\log k+1)\\
 &=cn+\frac{2c'}{n}\sum_{k=0}^{n-1}k\log k+\frac{2}{n}\sum_{k=0}^{n-1}1
 \end{aligned}
 $$
 Then we use the fact that $\sum_{k=0}^{n-1}k\log k\leq \frac{n^2\log n}{2}-\frac{n^2}{8}$ (can be proved by induction).
 $$
 \begin{aligned}
 T(n)&\leq cn+\frac{2c'}{n}\left(\frac{n^2\log n}{2}-\frac{n^2}{8}\right)+\frac{2}{n}n\\
 &=c'n\log n-\frac{1}{4}c'n+cn+2\\
 &=(c'n\log n+1)-\left(\frac{1}{4}c'n-cn-1\right)
 \end{aligned}
 $$
 We need $\frac{1}{4}c'n-cn-1\geq 0$.
 Choose $c'$ such that $\frac{1}{4}c'n\geq cn+1$ for all $n\geq 2$.
 If $c'\geq 8c$ (and $cn\geq 1$), then $\frac{1}{4}c'n\geq 2cn\geq cn+1$, so $T(n)\leq c'n\log n+1$.
 $E[T(n)]\leq c'n\log n+1=O(n\log n)$
 EOP
 A more elegant proof:
 Let $X_{ij}$ be an indicator random variable that is $1$ if element of rank $i$ is compared to element of rank $j$.
 Running time: $$X=\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}X_{ij}$$
 The expected value of each indicator is
 $$
 E[X_{ij}]=Pr[X_{ij}=1]\times 1+Pr[X_{ij}=0]\times 0=Pr[X_{ij}=1]
 $$
 The running time is dominated by the number of comparisons, so the expected running time is the expected number of comparisons in randomized quicksort:
 $$
 \begin{aligned}
 E[X]&=E[\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}X_{ij}]\\
 &=\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}E[X_{ij}]\\
 &=\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}Pr[X_{ij}=1]
 \end{aligned}
 $$
 For any two elements $z_i,z_j\in S$, $z_i$ is compared to $z_j$ exactly when $z_i$ or $z_j$ is the first element picked as a pivot among the $j-i+1$ elements of ranks $i$ through $j$. Each of these elements is equally likely to be picked first, so
 $$
 \begin{aligned}
 Pr[X_{ij}=1]&=Pr[z_i\text{ is picked first}]+Pr[z_j\text{ is picked first}]\\
 &=\frac{1}{j-i+1}+\frac{1}{j-i+1}\\
 &=\frac{2}{j-i+1}
 \end{aligned}
 $$
 So, with harmonic number, $H_n=\sum_{k=1}^{n}\frac{1}{k}$,
 $$
 \begin{aligned}
 E[X]&=\sum_{i=0}^{n-2}\sum_{j=i+1}^{n-1}\frac{2}{j-i+1}\\
 &\leq 2\sum_{i=0}^{n-2}\sum_{k=1}^{n-i-1}\frac{1}{k}\\
 &\leq 2\sum_{i=0}^{n-2}c\log(n)\\
 &=2c\log(n)\sum_{i=0}^{n-2}1\\
 &=O(n\log n)
 \end{aligned}
 $$
 EOP
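The double sum can be evaluated exactly for small $n$ as a sanity check. The sketch below uses exact fractions; the helper names `expected_comparisons` and `harmonic` are hypothetical, and the comparison against $2nH_n$ is just the upper bound from the chain of inequalities above.

```python
from fractions import Fraction

def expected_comparisons(n):
    """Exact value of the sum over i < j of 2/(j-i+1), with ranks 0..n-1."""
    return sum(
        Fraction(2, j - i + 1)
        for i in range(n - 1)
        for j in range(i + 1, n)
    )

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))
```

For $n=3$ the pairs contribute $1+\frac{2}{3}+1=\frac{8}{3}$ expected comparisons, which is below $2\cdot 3\cdot H_3=11$.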
--- a/pages/CSE347/Exam_reviews/CSE347_E1.md
+++ b/pages/CSE347/Exam_reviews/CSE347_E1.md
@@ -0,0 +1,34 @@
 # Exam 1 review
 ## Greedy
 A Greedy Algorithm is an algorithm whose solution applies the same choice rule at each step over and over until no more choices can be made.
 - Stating and Proving a Greedy Algorithm
 - State your algorithm (“at this step, make this choice”)
 - Greedy Choice Property (Exchange Argument)
 - Inductive Structure
 - Optimal Substructure
 - "Simple Induction"
 - Asymptotic Runtime
 ## Divide and conquer
 Stating and Proving a Divide and Conquer Algorithm
 - Describe the divide, conquer, and combine steps of your algorithm.
 - The combine step is the most important part of a divide and conquer algorithm, and in your recurrence this step is the "$f(n)$", or work done at each subproblem level. You need to show that you can combine the results of your subproblems somehow to get the solution for the entire problem.
 - Provide and prove a base case (when you can divide no longer)
 - Prove your induction step: suppose subproblems (two problems of size n/2, usually) of the same kind are solved optimally. Then, because of the combine step, the overall problem (of size n) will be solved optimally.
 - Provide recurrence and solve for its runtime (Master Method)
 ## Maximum Flow
 Given a weighted directed graph with a source and a sink node, where the edge weights are capacities, the goal is to see how much "flow" you can push from the source to the sink simultaneously.
 Finding the maximum flow can be solved by the Ford-Fulkerson Algorithm. Runtime (from lecture slides): $O(F(|V|+|E|))$, where $F$ is the value of the maximum flow.
 Fattest Path improvement: $O(\log|V|(|V|+|E|))$
 Min Cut-Max Flow: the maximum flow from source $s$ to sink $t$ is equal to the minimum capacity of an $s-t$ cut.
 A cut is a partition of a graph into two disjoint sets by removing edges connecting the two parts. An $s-t$ cut will put $s$ and $t$ into the different sets.
--- a/pages/CSE347/_meta.js
+++ b/pages/CSE347/_meta.js
@@ -0,0 +1,29 @@
 export default {
    Exam_reviews: "Exam reviews",
    CSE347_L1: "Lecture 1",
    CSE347_L2: "Lecture 2",
    CSE347_L3: "Lecture 3",
    CSE347_L4: "Lecture 4",
    CSE347_L5: "Lecture 5",
    CSE347_L6: "Lecture 6",
    CSE347_L7: "Lecture 7",
    CSE347_L8: "Lecture 8",
    CSE347_L9: "Lecture 9",
    CSE347_L10: "Lecture 10",
    CSE347_L11: "Lecture 11",
    CSE347_L12: {
        display: 'hidden'
    },
    CSE347_L13: {
        display: 'hidden'
    },
    CSE347_L14: {
        display: 'hidden'
    },
    CSE347_L15: {
        display: 'hidden'
    },
    index: {
        display: 'hidden'
    }
 }
--- a/pages/CSE347/index.mdx
+++ b/pages/CSE347/index.mdx
@@ -0,0 +1 @@
 # Test
--- a/pages/CSE347/lecture_note_generator.py
+++ b/pages/CSE347/lecture_note_generator.py
@@ -0,0 +1,13 @@
 course_code=input('We will follow the naming pattern of {class}_L{lecture number}.md, enter the course code to start.\n')
 start=input('enter the number of lecture that you are going to start.\n')
 end=input('Enter the end of lecture (exclusive).\n')
 start=int(start)
 end=int(end)
 while start<end:
    # create a empty text file
    fp = open(f'{course_code}_L{start}.md', 'w')
    fp.write(f'# Lecture {start}')
    fp.close()
    start+=1
 print("Complete")
--- a/pages/CSE442T/CSE442T_L1.md
+++ b/pages/CSE442T/CSE442T_L1.md
@@ -0,0 +1,125 @@
 # Lecture 1
 > I changed all the element in set to lowercase letters. I don't know why K is capitalized.
 Brian Garnett
 bcgarnett@wustl.edu
 Math Phd... Great!
 Proof based course and write proofs.
 CSE 433 for practical applications.
 OH: Right after class! 4-5 Mon, Urbauer Hall 227
 Pass and Shelat
 ## Alice sending information to Bob
 Assuming _Eve_ can always listen
 Rule 1. Message, Encryption to Code and Decryption to original Message.
 ## Kerckhoffs' principle
 It states that the security of a cryptographic system shouldn't rely on the secrecy of the algorithm (Assuming Eve knows how everything works.)
 **Security is due to the security of the key.**
 ## Private key encryption scheme
 Let $\mathcal{M}$ be the set of messages that Alice will send to Bob. (The message space) "plaintext"
 Let $\mathcal{K}$ be the set of keys that will ever be used. (The key space)
 $Gen$ is the key generation algorithm.
 $k\gets Gen(\mathcal{K})$
 $c\gets Enc_k(m)$ denotes cipher encryption.
 $m'\gets Dec_k(c')$ $m'$ might be null for incorrect $c'$.
 $Pr[k\gets \mathcal{K}:Dec_k(Enc_k(m))=m]=1$ The probability that decrypting an encrypted message gives back the original message is 1.
 *_in some cases we can allow the probability to not be 1_
 ## Some examples of crypto system
 Let $\mathcal{M}=$ {all five letter strings}.
 And $\mathcal{K}=$ {1-$10^{10}$}
 Example:
 $P[k=k']=\frac{1}{10^{10}}$
 $Enc_{1234567890}("brion")="brion1234567890"$
 $Dec_{1234567890}(brion1234567890)="brion"$
 This does not seem very secure, but it is a valid crypto system.
 ## Early attempts for crypto system.
 ### Caesar cipher
 $\mathcal{M}=$ finite string of texts
 $\mathcal{K}=$ {1-26}
 $Enc_k=[(i+k)\% 26\ for\ i \in m]=c$
 $Dec_k=[(i+26-k)\% 26\ for\ i \in c]$
 ```python
 def caesar_cipher_enc(s: str, k:int):
    return ''.join([chr((ord(i)-ord('a')+k)%26+ord('a')) for i in s])
 def caesar_cipher_dec(s: str, k:int):
    return ''.join([chr((ord(i)-ord('a')+26-k)%26+ord('a')) for i in s])
 ```
 ### Substitution cipher
 $\mathcal{M}=$ finite string of texts
 $\mathcal{K}=$ permutations (bijections) of the alphabet (for English alphabet, $|\mathcal{K}|=26!$)
 $Enc_k=[k(i)\ for\ i \in m]=c$
 $Dec_k=[k^{-1}(i)\ for\ i \in c]$
 Vulnerable to frequency analysis
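In the style of the Caesar cipher code above, a substitution cipher can be sketched as follows. The key is assumed to be a string containing each of the 26 lowercase letters exactly once, and the function names are hypothetical.

```python
import string

def substitution_enc(s: str, key: str) -> str:
    """Replace the i-th letter of the alphabet by key[i]; key is a permutation of a-z."""
    table = str.maketrans(string.ascii_lowercase, key)
    return s.translate(table)

def substitution_dec(c: str, key: str) -> str:
    """Apply the inverse permutation."""
    table = str.maketrans(key, string.ascii_lowercase)
    return c.translate(table)

key = "qwertyuiopasdfghjklzxcvbnm"
c = substitution_enc("hello", key)
```

With this key, `hello` encrypts to `itssg`. The repeated `ss` mirrors the repeated `ll` in the plaintext, which is exactly the kind of structure that frequency analysis exploits.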
 ### Vigenere Cipher
 $\mathcal{M}=$ finite string of texts
 $\mathcal{K}=$ key phrase of a fixed length
 ```python
 from typing import List
 def viginere_cipher_enc(s: str, k: List[int]):
    res=''
    n,m=len(s),len(k)
    j=0
    for i in s:
        res+=caesar_cipher_enc(i,k[j])
        j=(j+1)%m
    return res
 def viginere_cipher_dec(s: str, k: List[int]):
    res=''
    n,m=len(s),len(k)
    j=0
    for i in s:
        res+=caesar_cipher_dec(i,k[j])
        j=(j+1)%m
    return res
 ```
 ### One time pad
 Completely random string, sufficiently long.
--- a/pages/CSE442T/CSE442T_L10.md
+++ b/pages/CSE442T/CSE442T_L10.md
@@ -0,0 +1,199 @@
 # Lecture 10
 ## Continue
 ### Discrete Log Assumption
 This is a collection of one-way functions
 $$
 p\gets \tilde\Pi_n(\textup{ safe primes }), p=2q+1
 $$
 $$
 a\gets \mathbb{Z}^*_{p};g=a^2(\textup{ make sure }g\neq 1)
 $$
 $$
 f_{g,p}(x)=g^x\mod p
 $$
 $$
 f:\mathbb{Z}_q\to \mathbb{Z}^*_p
 $$
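A toy instance makes the definitions concrete. The numbers below ($p=23$, $q=11$, $a=5$) are chosen here purely for illustration and are far too small to be secure; the function names `f` and `brute_force_dlog` are hypothetical.

```python
def f(g, p, x):
    """The candidate one-way function f_{g,p}(x) = g^x mod p."""
    return pow(g, x, p)

def brute_force_dlog(g, p, q, y):
    """Invert f by trying every exponent in Z_q; only feasible for toy sizes."""
    for x in range(q):
        if pow(g, x, p) == y:
            return x
    return None

p, q = 23, 11     # toy safe prime: 23 = 2*11 + 1
a = 5
g = pow(a, 2, p)  # g = a^2 mod p = 2, and g != 1
y = f(g, p, 7)    # forward direction: one fast modular exponentiation
```

Here $g=2$ and $f(7)=2^7\bmod 23=13$. Computing $f$ takes one modular exponentiation, while the brute-force inverse tries up to $q$ exponents, which is exponential in the bit length of $p$.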
 #### Evidence for Discrete Log Assumption
 Best known algorithm to always solve discrete log  mod p, $p\in \Pi_n$
 $$
 O(2^{\sqrt{2}\sqrt{\log(n)}})
 $$
 ### RSA Assumption
 Let $e$ be the exponent
 $$
 P[p,q\gets \Pi_n;N\gets p\cdot q;e\gets \mathbb{Z}_{\phi(N)}^*;y\gets \mathbb{Z}_N^*;x\gets \mathcal{A}(N,e,y):x^e\equiv y\mod N]<\varepsilon(n)
 $$
 #### Theorem RSA Algorithm
 This is a collection of one-way functions
 $I=\{(N,e):N=p\cdot q,p,q\in \Pi_n \textup{ and } e\in \mathbb{Z}_{\phi(N)}^*\}$
 $D_{(N,e)}=\mathbb{Z}_N^*$
 $R_{(N,e)}=\mathbb{Z}_N^*$
 $f_{(N,e)}(x)=x^e\mod N$
 Example:
 On encryption side
 $p=5,q=11,N=5\times 11=55$, $\phi(N)=4*10=40$
 pick $e\in \mathbb{Z}_{40}^*$. say $e=3$, and $f(x)=x^3\mod 55$
 pick $y\in \mathbb{Z}_{55}^*$. say $y=17$. We have $(55,3,17)$
 $x^{40}\equiv 1\mod 55$
 $x^{41}\equiv x\mod 55$
 $x^{40k+1}\equiv x \mod 55$
 Since $x^a\equiv x^{a\mod 40}\mod 55$ (by a corollary of Euler's theorem: $a^x\mod N=a^{x\mod \phi(N)}\mod N$ for $a\in\mathbb{Z}_N^*$)
 The problem is, what can we multiply by $3$ to get $1\mod \phi(N)=1\mod 40$.
 by computing the multiplicative inverse using extended Euclidean algorithm we have $3\cdot 27\equiv 1\mod 40$.
 $x^3\equiv 17\mod 55$
 $x\equiv 17^{27}\mod 55$
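The arithmetic in this example can be checked with Python's built-in `pow`, which computes modular inverses (negative exponent, Python 3.8+) and modular powers. The variable names are chosen here for illustration.

```python
p, q, e, y = 5, 11, 3, 17
N = p * q                # 55
phi = (p - 1) * (q - 1)  # 40
d = pow(e, -1, phi)      # modular inverse of e mod phi(N)
x = pow(y, d, N)         # x = y^d mod N inverts x -> x^e mod N
```

This gives $d=27$ and $x=8$, and indeed $8^3=512\equiv 17\mod 55$.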
 On adversary side.
 they don't know $\phi(N)=40$
 $$
 f(N,e):\mathbb{Z}_N^*\to \mathbb{Z}_N^*
 $$
 is a bijection.
 Proof: Suppose $x_1^e\equiv x_2^e\mod N$
 Then let $d=e^{-1}\mod \phi(N)$ (exists b/c $e\in\mathbb{Z}_{\phi(N)}^*$)
 So $(x_1^e)^d\equiv (x_2^e)^d\mod N$
 So $x_1^{e\cdot d\mod \phi(N)}\equiv x_2^{e\cdot d\mod \phi(N)}\mod N$ (Euler's Theorem)
 $x_1\equiv x_2\mod N$
 So it's one-to-one.
 EOP
 It is also onto: let $y\in \mathbb{Z}_N^*$, and let $x=y^d\mod N$, where $d\equiv e^{-1}\mod \phi(N)$
 $x^e\equiv (y^d)^e \equiv y\mod N$
 Proof: 
 It's easy to sample from $I$:
 * pick $p,q\in \Pi_n$. $N=p\cdot q$
 * compute $\phi(N)=(p-1)(q-1)$
 * pick $e\gets \mathbb{Z}_{\phi(N)}$. If $gcd(e,\phi(N))\neq 1$, pick again ($\mathbb{Z}_{\phi(N)}^*$ has plenty of elements.)
 Easy to sample $\mathbb{Z}_N^*$ (domain).
 Easy to compute $x^e\mod N$.
 Hard to invert:
 $$
 \begin{aligned}
 &~~~~P[(N,e)\gets I;x\gets \mathbb{Z}_N^*;y=x^e\mod N:f_{(N,e)}(\mathcal{A}((N,e),y))=y]\\
 &=P[(N,e)\gets I;x\gets \mathbb{Z}_N^*;y=x^e\mod N;x'\gets \mathcal{A}((N,e),y):(x')^e\equiv y\mod N]\\
 &=P[(N,e)\gets I;y\gets \mathbb{Z}_N^*;x'\gets \mathcal{A}((N,e),y):(x')^e\equiv y\mod N]\\
 &<\varepsilon(n)
 \end{aligned}
 $$
 By RSA assumption
 The second equality follows because for any finite $D$ and bijection $f:D\to D$, sampling $y\in D$ directly is equivalent to sampling $x\gets D$, then computing $y=f(x)$.
 EOP
 #### Theorem If inverting RSA is hard, then factoring is hard.
 $$
 \textup{ RSA assumption }\implies \textup{ Factoring assumption}
 $$
 If inverting RSA is hard, then factoring is hard.
 i.e If factoring is easy, then inverting RSA is easy.
 Proof:
 Suppose $\mathcal{A}$ is an adversary that breaks the factoring assumption, then
 $$
 P[p\gets \Pi_n;q\gets \Pi_n;N=p\cdot q;\mathcal{A}(N)=(p,q)]>\frac{1}{p(n)}
 $$
 infinitely often, for a polynomial $p$.
 Then we design $B$ to invert RSA.
 Suppose
 $p,q\gets \Pi_n;N=p\cdot q;e\gets \mathbb{Z}_{\phi(N)}^*;x\gets \mathbb{Z}_N^*;y=x^e\mod N$
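 ```python
 def B(N,e,y):
     """
     Goal: find x such that (x**e)%N == y.
     A is the assumed factoring adversary: A(N) returns a candidate (p,q).
     """
     p,q=A(N)
     if N!=p*q:
         return None
     phiN=(p-1)*(q-1)
     # find the modular inverse of e mod phi(N) (extended Euclidean algorithm)
     d=pow(e,-1,phiN)
     # fast modular exponentiation: returns (y**d)%N
     x=pow(y,d,N)
     return x
 ```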
 So the probability that $B$ succeeds equals the probability that $\mathcal{A}$ succeeds, which is $>\frac{1}{p(n)}$ infinitely often, breaking the RSA assumption.
 Remaining question: Can $x$ be found without factoring $N$? $y=x^e\mod N$
 ### Trapdoor permutations
 Idea: $f:D\to R$ is a one-way permutation.
 $y\gets R$.
 * Finding $x$ such that $f(x)=y$ is hard.
 * With some secret info about $f$, finding $x$ is easy.
 $\mathcal{F}=\{f_i:D_i\to R_i\}_{i\in I}$
 1. $\forall i,f_i$ is a permutation
 2. $(i,t)\gets Gen(1^n)$ efficient. ($i\in I$ paired with $t$), $t$ is the "trapdoor info"
 3. $\forall i,D_i$ can be sampled efficiently.
 4. $\forall i,\forall x,f_i(x)$ can be computed in polynomial time.
 5. $P[(i,t)\gets Gen(1^n);y\gets R_i:f_i(\mathcal{A}(1^n,i,y))=y]<\varepsilon(n)$ (note: $\mathcal{A}$ is not given $t$)
 6. (trapdoor) There is a p.p.t. $B$ such that given $i,y,t$, B always finds x such that $f_i(x)=y$. $t$ is the "trapdoor info"
 #### Theorem RSA is a trapdoor
 The RSA collection is a family of trapdoor permutations, with the factorization $(p,q)$ of $N$, or $\phi(N)$, as the trapdoor info $t$.
--- a/pages/CSE442T/CSE442T_L11.md
+++ b/pages/CSE442T/CSE442T_L11.md
@@ -0,0 +1,112 @@
 # Lecture 11
 Exam info posted tonight.
 ## Pseudo-randomness
 Idea: **Efficiently** produce many bits
 which "appear" truly random.
 ### One-time pad
 $m\in\{0,1\}^n$
 $Gen(1^n):k\gets \{0,1\}^n$
 $Enc_k(m)=m\oplus k$
 $Dec_k(c)=c\oplus k$
 Advantage: Perfectly secret
 Disadvantage: Impractical
 The goal of pseudo-randomness is to make encryption both computationally secure and practical.
 Let $\{X_n\}$ be a sequence of distributions over $\{0,1\}^{l(n)}$, where $l(n)$ is a polynomial of $n$.
 "Probability ensemble"
 Example:
 Let $U_n$ be the uniform distribution over $\{0,1\}^n$
 For all $x\in \{0,1\}^n$
 $P[x\gets U_n]=\frac{1}{2^n}$
 For $1\leq i\leq n$, $P[x_i=1]=\frac{1}{2}$
 For $1\leq i<j\leq n,P[x_i=1 \textup{ and } x_j=1]=\frac{1}{4}$ (by independence of different bits.)
 Let $\{X_n\}_n$ and $\{Y_n\}_n$ be probability ensembles (sequences of distributions over $\{0,1\}^{l(n)}$)
 $\{X_n\}_n$ and $\{Y_n\}_n$ are computationally **indistinguishable** if for all non-uniform p.p.t. adversaries $D$ ("distinguishers")
 $$
 |P[x\gets X_n:D(x)=1]-P[y\gets Y_n:D(y)=1]|<\varepsilon(n)
 $$
 Informally, no efficient test can detect a pattern that separates samples of the two ensembles with more than negligible advantage.
 If there is a $D$ such that
 $$
 |P[x\gets X_n:D(x)=1]-P[y\gets Y_n:D(y)=1]|\geq \mu(n)
 $$
 then $D$ distinguishes the two ensembles by $\mu(n)$
 If $\mu(n)\geq\frac{1}{p(n)}$ for some polynomial $p$ and infinitely many $n$, then $D$ is distinguishing the two $\implies X_n\cancel{\approx} Y_n$
 ### Prediction lemma
 $X_n^0$ and $X_n^1$ ensembles over $\{0,1\}^{l(n)}$
 Suppose $\exists$ distinguisher $D$ which distinguish by $\geq \mu(n)$. Then $\exists$ adversary $\mathcal{A}$ such that 
 $$
 P[b\gets\{0,1\};t\gets X_n^b:\mathcal{A}(t)=b]\geq \frac{1}{2}+\frac{\mu(n)}{2}
 $$
 Proof:
 Without loss of generality, suppose
 $$
 P[t\gets X^1_n:D(t)=1]-P[t\gets X_n^0:D(t)=1]\geq \mu(n)
 $$
 $\mathcal{A}=D$ (outputs 1 if and only if $D$ outputs 1, otherwise 0.)
 $$
 \begin{aligned}
    &~~~~~P[b\gets \{0,1\};t\gets X_n^b:\mathcal{A}(t)=b]\\
     &=P[t\gets X_n^1;\mathcal{A}(t)=1]\cdot P[b=1]+P[t\gets X_n^0;\mathcal{A}(t)=0]\cdot P[b=0]\\
    &=\frac{1}{2}P[t\gets X_n^1;\mathcal{A}(t)=1]+\frac{1}{2}(1-P[t\gets X_n^0;\mathcal{A}(t)=1])\\
    &=\frac{1}{2}+\frac{1}{2}(P[t\gets X_n^1;\mathcal{A}(t)=1]-P[t\gets X_n^0;\mathcal{A}(t)=1])\\
    &\geq\frac{1}{2}+\frac{1}{2}\mu(n)\\
 \end{aligned}
 $$
 ### Pseudo-random
 $\{X_n\}$ over $\{0,1\}^{l(n)}$ is **pseudorandom** if $\{X_n\}\approx\{U_{l(n)}\}$. i.e. indistinguishable from the true randomness.
 Example:
 Building distinguishers
 1. $X_n$: always outputs $0^n$, $D$: [outputs $1$ if $t=0^n$]  
    $$
    \vert P[t\gets X_n:D(t)=1]-P[t\gets U_n:D(t)=1]\vert=1-\frac{1}{2^n}\approx 1
    $$
 2. $X_n$: the first $n-1$ bits are truly random $\gets U_{n-1}$; the $n$th bit is $1$ with probability 0.50001 and $0$ with probability 0.49999, $D$: [outputs $1$ if $t_n=1$]  
    $$
    \vert P[t\gets X_n:D(t)=1]-P[t\gets U_n:D(t)=1]\vert=0.50001-0.5=0.00001\neq 0
    $$
 3. $X_n$: each bit $x_i\gets\{0,1\}$ is uniform **unless** the previous 1 million bits were all $0$, in which case the next bit is $1$. $D$: [outputs $1$ if $x_1=x_2=...=x_{1000001}=0$]
   $$
    \vert P[t\gets X_n:D(t)=1]-P[t\gets U_n:D(t)=1]\vert=|0-\frac{1}{2^{1000001}}|\neq 0
   $$
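The advantage in the first example can also be estimated by sampling. The sketch below is only an illustration: the helper names (`estimate_advantage`, `sample_zeros`, `sample_uniform`, `D_all_zero`), the choice $n=16$, and the number of trials are all assumptions made for the demo.

```python
import random

n = 16                   # length of the bit strings (illustrative)
rng = random.Random(0)   # fixed seed so the run is reproducible


def sample_zeros():
    """X_n from example 1: always outputs 0^n."""
    return [0] * n


def sample_uniform():
    """U_n: n independent uniform bits."""
    return [rng.randrange(2) for _ in range(n)]


def D_all_zero(t):
    """Distinguisher from example 1: output 1 iff t = 0^n."""
    return 1 if all(bit == 0 for bit in t) else 0


def estimate_advantage(sample_x, sample_y, D, trials=2000):
    """Monte Carlo estimate of |P[D(x)=1] - P[D(y)=1]|."""
    hits_x = sum(D(sample_x()) for _ in range(trials))
    hits_y = sum(D(sample_y()) for _ in range(trials))
    return abs(hits_x - hits_y) / trials


adv = estimate_advantage(sample_zeros, sample_uniform, D_all_zero)
print(adv)  # close to 1 - 1/2^n
```

Sampling is useless for the second example: an advantage of $0.00001$ only shows up after far more trials, although it is still non-negligible in the asymptotic sense.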
--- a/pages/CSE442T/CSE442T_L12.md
+++ b/pages/CSE442T/CSE442T_L12.md
@@ -0,0 +1,152 @@
 # Lecture 12
 ## Continue on pseudo-randomness
 $\{X_n\}$ and $\{Y_n\}$ are distinguishable by $\mu(n)$ if $\exists$ distinguisher $D$
 $$
 |P[x\gets X_n:D(x)=1]-P[y\gets Y_n:D(y)=1]|\geq \mu(n)
 $$
 - If $\mu(n)\geq \frac{1}{p(n)}$ for some polynomial $p$ and infinitely many $n$, then $\{X_n\}$ and $\{Y_n\}$ are distinguishable.
 - Otherwise, indistinguishable ($|diff|<\varepsilon(n)$)
 Property: Closed under efficient procedures.
 If $M$ is any n.u.p.p.t. machine that takes a sample $t$ from $X_n$ or $Y_n$ as input, write $M(X_n)$ for the distribution of its output.
 If $\{X_n\}\approx\{Y_n\}$, then $\{M(X_n)\}\approx\{M(Y_n)\}$
 Proof:
 If $D$ distinguishes $M(X_n)$ and $M(Y_n)$ by $\mu(n)$, then $D(M(\cdot))$ is a polynomial-time distinguisher of $X_n,Y_n$ by the same $\mu(n)$.
 ### Hybrid Lemma
 Let $X^0_n,X^1_n,\dots,X^m_n$ be ensembles indexed by $0,\dots,m$
 If $D$ distinguishes $X_n^0$ and $X_n^m$ by $\mu(n)$, then $\exists i,1\leq i\leq m$ where $X_{n}^{i-1}$ and $X_n^i$ are distinguished by $D$ by $\frac{\mu(n)}{m}$
 Proof: (we use the triangle inequality.) Let $p_i=P[t\gets X_n^i:D(t)=1],0\leq i\leq m$. We have $|p_0-p_m|\geq \mu(n)$
 Using telescoping tricks:
 $$
 \begin{aligned}
 |p_0-p_m|&=|p_0-p_1+p_1-p_2+\dots +p_{m-1}-p_m|\\
 &\leq |p_0-p_1|+|p_1-p_2|+\dots+|p_{m-1}-p_m|\\
 \end{aligned}
 $$
 If all $|p_{i-1}-p_i|<\frac{\mu(n)}{m}$, then $|p_0-p_m|<\mu(n)$, a contradiction.
 In applications, this is only useful if $m\leq q(n)$ for a polynomial $q$.
 If $X^0$ and $X^m$ are distinguishable by $\frac{1}{p(n)}$, then two neighboring "hybrids" are distinguishable by $\frac{1}{p(n)q(n)}=\frac{1}{poly(n)}$
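The pigeonhole step in the proof is easy to check numerically. In the sketch below the probabilities `p` are made-up values for a hypothetical distinguisher, and `best_neighbor_gap` is a helper name introduced only for this illustration.

```python
def best_neighbor_gap(p):
    """Given p[0..m], return (i, gap) with gap = |p[i-1] - p[i]| maximal."""
    m = len(p) - 1
    i = max(range(1, m + 1), key=lambda j: abs(p[j - 1] - p[j]))
    return i, abs(p[i - 1] - p[i])


# Hypothetical values p_0, ..., p_4 of P[t <- X^i : D(t) = 1].
p = [0.10, 0.15, 0.40, 0.45, 0.90]
m = len(p) - 1
mu = abs(p[0] - p[-1])            # D separates X^0 and X^m by mu

i, gap = best_neighbor_gap(p)
print(i, gap, mu / m)             # the largest neighboring gap is at least mu/m
```

The lemma only promises that some neighboring pair has a gap of at least $\frac{\mu(n)}{m}$; it says nothing about which pair.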
 Example:
 For some Brian in Week 1 and Week 50, a distinguisher $D$ outputs 1 if hair is considered "long".
 Then there is some week $i,1\leq i\leq 50$, with $|p_{i-1}-p_i|\geq 0.02$
 By the prediction lemma, there is a machine $\mathcal{A}$ such that
 $$
 P[b\gets \{0,1\};pic\gets X^{i-1+b}:\mathcal{A}(pic)=b]\geq \frac{1}{2}+\frac{0.02}{2}=0.51
 $$
 ### Next bit test (NBT)
 We say $\{X_n\}$ on $\{0,1\}^{l(n)}$ passes the next bit test if $\forall i\in\{0,1,...,l(n)-1\}$ and for all adversaries $\mathcal{A}$: $P[t\gets X_n:\mathcal{A}(t_1,t_2,...,t_i)=t_{i+1}]\leq \frac{1}{2}+\varepsilon(n)$ (given the first $i$ bits, the probability of correctly predicting the $(i+1)$th bit is at most negligibly better than a random guess, $\frac{1}{2}$)
 Note that for any $\mathcal{A}$, and any $i$,
 $$
 P[t\gets U_{l(n)}:\mathcal{A}(t_1,...t_i)=t_{i+1}]=\frac{1}{2}
 $$
 If $\{X_n\}\approx\{U_{l(n)}\}$ (pseudorandom), then $X_n$ must pass NBT for all $i$.
 Otherwise $\exists \mathcal{A},i$ and a polynomial $p$ where for infinitely many $n$,
 $$
 P[t\gets X_n:\mathcal{A}(t_1,t_2,...,t_i)=t_{i+1}]\geq \frac{1}{2}+\frac{1}{p(n)}
 $$
 and we can build a distinguisher $D$ from $\mathcal{A}$.
 The converse is true!
 The NBT (next bit test) is complete.
 If $\{X_n\}$ on $\{0,1\}^{l(n)}$ passes NBT, then it's pseudorandom.
 Idea of proof: the full proof is in the text.
 Our idea is that we want to create $H^{l(n)}_n=\{X_n\}$ and $H^0_n=\{U_{l(n)}\}$
 We construct "random" bit stream:
 $$
 H_n^i=\{x\gets X_n;u\gets U_{l(n)};t=x_1x_2\dots x_i u_{i+1}u_{i+2}\dots u_{l(n)}\}
 $$
 If $\{X_n\}$ were not pseudorandom, there is a $D$
 $$
 |P[x\gets X_n:D(x)=1]-P[u\gets U_{l(n)}:D(u)=1]|=\mu(n)\geq \frac{1}{p(n)}
 $$
 By hybrid lemma, there is $i,1\leq i\leq l(n)$ where:
 $$
 |P[t\gets H^{i-1}:D(t)=1]-P[t\gets H^i:D(t)=1]|\geq \frac{1}{p(n)l(n)}=\frac{1}{poly(n)}
 $$
 $l(n)$ is the number of steps needed to transform $U_{l(n)}$ into $X_n$
 Shifting the index by one, let
 $$
 H^i=x_1\dots x_i u_{i+1}u_{i+2}\dots u_{l(n)}\\
 H^{i+1}=x_1\dots x_i x_{i+1}u_{i+2}\dots u_{l(n)}
 $$
 Notice that the two hybrids differ only in position $i+1$.
 $D$ can distinguish $x_{i+1}$ from a truly random $u_{i+1}$, knowing that the first $i$ bits $x_1\dots x_i$ came from $x\gets X_n$
 So $D$ can be used to predict $x_{i+1}$ from $x_1\dots x_i$ (contradicting that $X$ passes NBT)
 EOP
 ## Pseudorandom Generator
 $G:\{0,1\}^*\to\{0,1\}^*$ is a pseudorandom generator if the following is true:
 1. $G$ is efficiently computable.
 2. $|G(x)|> |x|\ \forall x$ (expansion)
 3. $\{x\gets U_n:G(x)\}_n$ is pseudorandom
 $n$ truly random bits $\to$ $n^2$ pseudorandom bits
 ### PRG exists if and only if one-way function exists
 The other part of proof will be your homework, damn.
 If one-way function exists, then Pseudorandom Generator exists.
 Idea of proof:
 Let $f:\{0,1\}^n\to \{0,1\}^n$ be a strong one-way permutation (bijection).
 $x\gets U_n$
 $f(x)||x$
 Not all bits of $x$ would be hard to predict.
 **Hard-core bit:** One bit of information about $x$ which is hard to determine from $f(x)$. $P[$ success $]\leq \frac{1}{2}+\varepsilon(n)$
 Depends on $f(x)$
--- a/pages/CSE442T/CSE442T_L13.md
+++ b/pages/CSE442T/CSE442T_L13.md
@@ -0,0 +1,157 @@
 # Lecture 13
 ## Pseudorandom Generator (PRG)
 $G:\{0,1\}^n\to\{0,1\}^{l(n)}$ is a pseudorandom generator if the following is true:
 1. $G$ is efficiently computable.
 2. $l(n)> n$ (expansion)
 3. $\{x\gets \{0,1\}^n:G(x)\}_n\approx \{u\gets \{0,1\}^{l(n)}\}$
 ### Hard-core bit (predicate) (HCB)
 Hard-core bit (predicate) (HCB): $h:\{0,1\}^n\to \{0,1\}$ is a hard-core bit of $f:\{0,1\}^n\to \{0,1\}^*$ if for every adversary $A$,
 $$
 Pr[x\gets \{0,1\}^n;y=f(x);A(1^n,y)=h(x)]\leq \frac{1}{2}+\epsilon(n)
 $$
 Idea: $f:\{0,1\}^n\to \{0,1\}^*$ is a one-way function.
 Given $y=f(x)$, it is hard to recover $x$. A cannot produce all of $x$ but can know some bits of $x$.
 $h(x)$ is just a yes/no question regarding $x$.
 Example:
 In the RSA function, we pick primes $p,q\in \Pi_n$ and $N=pq$, $e\gets \mathbb{Z}_{\phi(N)}^*$ and $f(x)=x^e\mod N$.
 $h(x)=x_n$ is a HCB of $f$, given the RSA assumption.
 **h(x) is not necessarily one of the bits of $x=x_1x_2\cdots x_n$.**
 #### Theorem Any one-way function has a HCB.
 A HCB can be produced for any one-way function.
 Let $f:\{0,1\}^n\to \{0,1\}^*$ be a strong one-way function.
 Define $g:\{0,1\}^{2n}\to \{0,1\}^*$ as $g(x,r)=(f(x), r),x\in \{0,1\}^n,r\in \{0,1\}^n$. $g$ is a strong one-way function. (proved in homework)
 $$
 h(x,r)=\langle x,r\rangle=x_1r_1+ x_2r_2+\cdots + x_nr_n\mod 2
 $$
 $\langle x,1^n\rangle=x_1+x_2+\cdots +x_n\mod 2$
 $\langle x,0^{n-1}1\rangle=x_ n$
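The inner-product predicate and the two special cases above can be written out directly. This is a minimal sketch that represents bit strings as Python lists of 0/1; the function name `inner_product_bit` is an assumption made for the illustration.

```python
def inner_product_bit(x, r):
    """h(x, r) = <x, r> = x_1 r_1 + ... + x_n r_n mod 2 for equal-length bit lists."""
    if len(x) != len(r):
        raise ValueError("x and r must have the same length")
    return sum(xi & ri for xi, ri in zip(x, r)) % 2


x = [1, 0, 1, 1]
parity = inner_product_bit(x, [1, 1, 1, 1])    # <x, 1^n>: parity of all bits of x
last_bit = inner_product_bit(x, [0, 0, 0, 1])  # <x, 0^{n-1}1>: the last bit x_n
print(parity, last_bit)
```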
 Idea of proof:
 If A could reliably find $\langle x,r\rangle$, with $r$ being completely random, then it could find $x$ too often.
 ### Pseudorandom Generator from HCB
 1. $G:\{0,1\}^n\to \{0,1\}^{n+1}$
 2. $G:\{0,1\}^n\to \{0,1\}^{l(n)}$
 For (1),
 #### Theorem HCB generates PRG
 Let $f:\{0,1\}^n\to \{0,1\}^n$ be a one-way permutation (bijective) with a HCB $h$. Then $G(x)=f(x)|| h(x)$ is a PRG.
 Proof:
 Efficiently computable: $f$ is a one-way permutation, so it is efficiently computable, and $h$ is efficiently computable as a hard-core predicate.
 Expansion: $n<n+1$
 Pseudorandomness:
 We proceed by contradiction.
 Suppose $\{G(U_n)\}\cancel{\approx} \{U_{n+1}\}$. Then, by completeness of the next bit test, there would be a next-bit predictor $A$ such that for some bit $i$ and some polynomial $p$,
 $$
 Pr[x\gets \{0,1\}^n;t=G(x);A(t_1t_2\cdots t_{i-1})=t_i]\geq \frac{1}{2}+\frac{1}{p(n)}
 $$
 Since $f$ is a bijection, $x\gets U_n$ implies that $f(x)$ is distributed as $U_n$.
 $G(x)=f(x)|| h(x)$
 So $A$ cannot predict $t_i$ with probability better than $\frac{1}{2}$ for any $i\leq n$, because the first $n$ bits are uniform:
 $$
 Pr[t_i=1|t_1t_2\cdots t_{i-1}]= \frac{1}{2}
 $$
 So $i=n+1$: the bit that $A$ predicts is the last bit, $h(x)$.
 $$
 Pr[x\gets \{0,1\}^n;y=f(x);A(y)=h(x)]\geq\frac{1}{2}+\frac{1}{p(n)}
 $$
 This contradicts the HCB definition of $h$.
 ### Construction of PRG
 $G':\{0,1\}^n\to \{0,1\}^{l(n)}$
 using PRG $G:\{0,1\}^n\to \{0,1\}^{n+1}$
 Let $s\gets \{0,1\}^n$ be a random string.
 We proceed by the following construction:
 $G(s)=X_1||b_1$
 $G(X_1)=X_2||b_2$
 $G(X_2)=X_3||b_3$
 $\cdots$
 $G(X_{l(n)-1})=X_{l(n)}||b_{l(n)}$
 $G'(s)=b_1b_2b_3\cdots b_{l(n)}$
 We claim $G':\{0,1\}^n\to \{0,1\}^{l(n)}$ is a PRG.
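The iteration above can be sketched as a loop. Everything concrete here is an assumption for illustration: the one-bit-stretch generator `G` is faked with SHAKE-256 (a stand-in, not a proven PRG), seeds are byte strings of `N_BYTES` bytes rather than $n$-bit strings, and the names `G` and `G_prime` only mirror the notation of the notes.

```python
import hashlib

N_BYTES = 16  # seed length in bytes (illustrative)


def G(x):
    """Toy stand-in for a PRG with one bit of stretch: returns (X, b)."""
    out = hashlib.shake_256(x).digest(N_BYTES + 1)
    return out[:N_BYTES], out[N_BYTES] & 1


def G_prime(s, length):
    """G'(s) = b_1 b_2 ... b_length, where G(X_{j-1}) = X_j || b_j and X_0 = s."""
    bits = []
    state = s
    for _ in range(length):
        state, b = G(state)
        bits.append(b)
    return bits


seed = bytes(N_BYTES)          # fixed all-zero seed, just for the demo
stream = G_prime(seed, 40)
print("".join(map(str, stream)))
```

Only the bits $b_j$ are output; the intermediate states $X_j$ stay hidden, which is what the hybrid argument relies on.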
 #### Corollary: Combining constructions
 $f:\{0,1\}^n\to \{0,1\}^n$ is a one-way permutation with a HCB $h: \{0,1\}^n\to \{0,1\}$.
 $G'(x)=h(x)||h(f(x))||h(f^2(x))\cdots h(f^{l(n)-1}(x))$ is a PRG, where $f^a(x)=f(f^{a-1}(x))$.
 Proof:
 $G'$ is a PRG:
 1. Efficiently computable: we compute $G'$ by applying $G$ $l(n)$ times, which is polynomially many.
 2. Expansion: $n<l(n)$.
 3. Pseudorandomness: We proceed by contradiction. Suppose the output is not pseudorandom. Then there exists a distinguisher $D$ that can distinguish $G'(U_n)$ from $U_{l(n)}$ by some $\frac{1}{p(n)}$.
 Strategy: use hybrid argument to construct distributions.
 $$
 \begin{aligned}
 H^0&=U_{l(n)}=u_1u_2\cdots u_{l(n)}\\
 H^1&=u_1u_2\cdots u_{l(n)-1}b_{l(n)}\\
 H^2&=u_1u_2\cdots u_{l(n)-2}b_{l(n)-1}b_{l(n)}\\
 &\cdots\\
 H^{l(n)}&=b_1b_2\cdots b_{l(n)}
 \end{aligned}
 $$
 By the hybrid argument, there exists an $i$, $0\leq i\leq l(n)-1$, such that $D$ can distinguish $H^i$ and $H^{i+1}$ by $\frac{1}{p(n)l(n)}$
 From this, show that there exists a distinguisher $D'$ for
 $$
 \{u\gets U_{n+1}\}\text{ vs. }\{x\gets U_n:G(x)\}
 $$
 with non-negligible advantage. (contradiction)
--- a/pages/CSE442T/CSE442T_L14.md
+++ b/pages/CSE442T/CSE442T_L14.md
@@ -0,0 +1,176 @@
 # Lecture 14
 ## Recap
 $\exists$ one-way functions $\implies$ $\exists$ PRG expand by any polynomial amount
 $\exists G:\{0,1\}^n \to \{0,1\}^{l(n)}$ s.t. $G$ is efficiently computable, $l(n) > n$, and $G$ is pseudorandom
 $$
 \{G(U_n)\}\approx \{U_{l(n)}\}
 $$
 Back to the experiment we did long time ago:
 ||Group 1|Group 2|
 |---|---|---|
 |$00000$ or $11111$|3|16|
 |4 of 1's|42|56|
 |balanced|too often|usual|
 |consecutive repeats|0|4|
 So Group 1 is human, Group 2 is computer.
 ## New material
 ### Computationally secure encryption
 Recall with perfect security,
 $$
 P[k\gets Gen(1^n):Enc_k(m_1)=c] = P[k\gets Gen(1^n):Enc_k(m_2)=c]
 $$
 for all $m_1,m_2\in M$ and $c\in C$.
 $(Gen,Enc,Dec)$ is **single message secure** if $\forall$ n.u.p.p.t. $\mathcal{D}$, for all $n\in \mathbb{N}$ and $\forall m_1,m_2\in \mathcal{M}_n$, $\mathcal{D}$ distinguishes $Enc_k(m_1)$ and $Enc_k(m_2)$ with at most negligible probability.
 $$
 |P[k\gets Gen(1^n):\mathcal{D}(Enc_k(m_1))=1]-P[k\gets Gen(1^n):\mathcal{D}(Enc_k(m_2))=1]| \leq \epsilon(n)
 $$
 By the prediction lemma, ($\mathcal{A}$ is a p.p.t.; you can also name it $\mathcal{D}$)
 $$
 P[b\gets \{0,1\};k\gets Gen(1^n):\mathcal{A}(Enc_k(m_b)) = b] \leq \frac{1}{2} + \frac{\epsilon(n)}{2}
 $$
 and the above probability is $\frac{1}{2}$ for perfect secrecy.
 ### Construction of single message secure cryptosystem
 Goal: a cryptosystem with shorter keys. Mimic the OTP (one-time pad), replacing the truly random pad with pseudorandom bits.
 $K=\{0,1\}^n$, $\mathcal{M}=\{0,1\}^{l(n)}$, $G:K \to \mathcal{M}$ is a PRG.
 $Gen(1^n)$: $k\gets \{0,1\}^n$; output $k$.
 $Enc_k(m)$: output $G(k)\oplus m$.
 $Dec_k(c)$: output $G(k)\oplus c$.
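A minimal sketch of this scheme follows. The PRG is an assumption: SHAKE-256 is used as a stand-in for $G$ (nothing in these notes proves it is a PRG), keys and messages are byte strings, and the function names `G`, `gen`, `enc`, `dec` simply mirror the notation above.

```python
import hashlib
import os


def G(k, out_len):
    """Stand-in PRG: stretch the key k to out_len pseudorandom bytes (assumption)."""
    return hashlib.shake_256(k).digest(out_len)


def gen(n_bytes=16):
    """Gen(1^n): sample a uniformly random short key."""
    return os.urandom(n_bytes)


def enc(k, m):
    """Enc_k(m) = G(k) xor m."""
    pad = G(k, len(m))
    return bytes(p ^ b for p, b in zip(pad, m))


def dec(k, c):
    """Dec_k(c) = G(k) xor c, the same operation as encryption."""
    return enc(k, c)


k = gen()
m = b"a message much longer than the 16-byte key"
c = enc(k, m)
print(dec(k, c) == m)
```

The key stays 16 bytes no matter how long the message is, which is the whole point compared with the one-time pad. Encryption is deterministic, so the same key must not be reused for a second message.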
 Proof of security:
 Let $m_0,m_1\in \mathcal{M}$ be two messages, and $\mathcal{D}$ is a n.u.p.p.t distinguisher.
 Suppose $\{k\gets Gen(1^n):Enc_k(m_i)\}$, $i=0,1$, are distinguished by $\mathcal{D}$ by $\mu(n)\geq\frac{1}{poly(n)}$.
 Strategy: Move to OTP, then flip message.
 $$
 H_0(Enc_k(m_0)) = \{k\gets \{0,1\}^n: m_0\oplus G(k)\}
 $$
 $$
 H_1(OTP(m_0)) = \{u\gets U_{l(n)}: m_0\oplus u\}
 $$
 $$
 H_2(OTP(m_1)) = \{u\gets U_{l(n)}: m_1\oplus u\}
 $$
 $$
 H_3(Enc_k(m_1)) = \{k\gets \{0,1\}^n: m_1\oplus G(k)\}
 $$
 By the hybrid lemma, some pair of neighboring hybrids is distinguished by $\mathcal{D}$ by at least $\frac{\mu(n)}{3}$.
 However, $H_0$ and $H_1$ are indistinguishable since $G(U_n)$ and $U_{l(n)}$ are indistinguishable.
 $H_1$ and $H_2$ are indistinguishable by perfect secrecy of OTP.
 $H_2$ and $H_3$ are indistinguishable since $G(U_n)$ and $U_{l(n)}$ are indistinguishable.
 This is a contradiction.
 ### Multi-message secure encryption
 $(Gen,Enc,Dec)$ is multi-message secure if for all n.u.p.p.t. $\mathcal{D}$, all $n\in \mathbb{N}$, all $q(n)\in poly(n)$, and all
 $$
 \overline{m}=(m_1,\dots,m_{q(n)})
 $$
 $$
 \overline{m}'=(m_1',\dots,m_{q(n)}')
 $$
 that are lists of $q(n)$ messages in $\{0,1\}^n$,
 $\mathcal{D}$ distinguishes $Enc_k(\overline{m})$ and $Enc_k(\overline{m}')$ with at most negligible probability.
 $$
 |P[k\gets Gen(1^n):\mathcal{D}(Enc_k(\overline{m}))=1]-P[k\gets Gen(1^n):\mathcal{D}(Enc_k(\overline{m}'))=1]| \leq \epsilon(n)
 $$
 **THE PRG-BASED SCHEME ABOVE IS NOT MULTI-MESSAGE SECURE.**
 Take $\overline{m}=(0^n,0^n)\to (G(k),G(k))$ and $\overline{m}'=(0^n,1^n)\to (G(k),G(k)\oplus 1^n)$: the distinguisher can easily tell whether some message was sent twice.
 What we need is that the distinguisher cannot tell whether some message was sent twice. To achieve multi-message security, we need our encryption function to use randomness (or change state) for each message, otherwise $Enc_k(0^n)$ will return the same ciphertext on consecutive messages.
 Our fix: suppose we could agree on a random function $F:\{0,1\}^n\to \{0,1\}^n$ such that for each input $x\in\{0,1\}^n$, $F(x)$ is chosen uniformly at random.
 $Gen(1^n):$ Choose random function $F:\{0,1\}^n\to \{0,1\}^n$.
 $Enc_F(m):$ let $r\gets U_n$; output $(r,F(r)\oplus m)$.
 $Dec_F(r,c):$ Given $(r,c)$, output $m=F(r)\oplus c$.
 Idea: Adversary sees $r$ but has no idea about $F(r)$. (we choose all outputs at random)
 If we could do this, this is MMS (multi-message secure).
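The idealized scheme can be simulated on one machine by sampling $F$ lazily. This is a sketch under assumptions: messages and nonces are $n$-bit integers with $n=128$, and the shared table `F_TABLE` plays the role of the random function that both parties would somehow have to agree on.

```python
import secrets

n = 128        # block length in bits (illustrative)
F_TABLE = {}   # values of F fixed so far


def F(x):
    """Lazily sampled random function {0,1}^n -> {0,1}^n."""
    if x not in F_TABLE:
        F_TABLE[x] = secrets.randbits(n)
    return F_TABLE[x]


def enc(m):
    """Enc_F(m): pick fresh r, output (r, F(r) xor m)."""
    r = secrets.randbits(n)
    return r, F(r) ^ m


def dec(r, c):
    """Dec_F(r, c) = F(r) xor c."""
    return F(r) ^ c


m = 0x1234
c1 = enc(m)
c2 = enc(m)
print(dec(*c1) == m, dec(*c2) == m, c1 != c2)
```

Encrypting the same message twice gives unrelated ciphertexts unless the two nonces collide, which is exactly the bad event bounded in the proof.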
 Proof:
 Suppose $m_1,m_2,\dots,m_{q(n)}$, $m_1',\dots,m_{q(n)}'$ are sent to the encryption oracle.
 Suppose the encryptions are distinguished by $\mathcal{D}$ with non-negligible advantage.
 Strategy: move to OTP with hybrid argument.
 Suppose we choose a random function
 $$
 H_0:\{F\gets RF_n:((r_1,m_1\oplus F(r_1)),(r_2,m_2\oplus F(r_2)),\dots,(r_{q(n)},m_{q(n)}\oplus F(r_{q(n)})))\}
 $$
 and
 $$
 H_1:\{OTP:(r_1,m_1\oplus u_1),(r_2,m_2\oplus u_2),\dots,(r_{q(n)},m_{q(n)}\oplus u_{q(n)})\}
 $$
 $r_i,u_i\gets U_n$.
 If $r_1,\dots,r_{q(n)}$ are all different, then $H_0$ and $H_1$ are identically distributed, because
 $F(r_1),\dots,F(r_{q(n)})$ are chosen uniformly and independently at random.
 The only possible problem is $r_i=r_j$ for some $i\neq j$, and $P[r_i=r_j]=\frac{1}{2^n}$.
 The probability that at least one pair is equal satisfies
 $$
 P[\text{at least one pair are equal}] =P[\bigcup_{i< j}\{r_i=r_j\}] \leq \sum_{i< j}P[r_i=r_j]=\binom{q(n)}{2}\frac{1}{2^n} < \frac{q(n)^2}{2^{n+1}}
 $$
 which is negligible.
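To get a feeling for the size of this bound, the sketch below compares the union bound with the exact collision probability for $q$ uniform $n$-bit strings. The function names and the sample values of $q$ and $n$ are assumptions chosen for illustration.

```python
from math import comb


def collision_bound(q, n):
    """Union bound C(q, 2) / 2^n on the probability that some pair r_i, r_j is equal."""
    return comb(q, 2) / 2**n


def collision_exact(q, n):
    """Exact probability that q uniform n-bit strings are not all distinct."""
    p_distinct = 1.0
    for i in range(q):
        p_distinct *= (2**n - i) / 2**n
    return 1.0 - p_distinct


q, n = 10, 16
print(collision_exact(q, n), collision_bound(q, n), q**2 / 2**(n + 1))
```

For fixed polynomial $q$, doubling $n$ squares the denominator, so the bound shrinks faster than any inverse polynomial.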
 Unfortunately, we cannot do this in practice.
 How many functions $F:\{0,1\}^n\to\{0,1\}^n$ are there?
 For each $x\in \{0,1\}^n$, there are $2^n$ possible values for $F(x)$.
 So the total number of functions is $(2^n)^{2^n}=2^{n2^n}$,
 and the length of the description of $F$ is $n 2^n$ bits, which is exponential in $n$.
--- a/pages/CSE442T/CSE442T_L15.md
+++ b/pages/CSE442T/CSE442T_L15.md
@@ -0,0 +1,187 @@
 # Lecture 15
 ## Random Function
 $F:\{0,1\}^n\to \{0,1\}^n$
 For each $x\in \{0,1\}^n$, there are $2^n$ possible values for $F(x)$.
 pick $y=F(x)\gets \{0,1\}^n$ independently at random. ($n$ bits)
 This generates $n\cdot 2^n$ random bits to specify $F$.
 ### Equivalent description of $F$
 ```python
 import random

 # L stores the values of F fixed so far (lazy sampling)
 L = {}
 # n is the output length in bits
 n = 10
 def F(x):
     """ simulation of random function
     param:
         x: n bits
     return:
         y: n bits
     """
     if x not in L:
         # y is a fresh uniformly random n-bit string
         L[x] = random.getrandbits(n)
     return L[x]
 ```
 However, this alone does not give a usable scheme, since the two communicating parties cannot agree on the same $F$ without sharing the whole table.
 ### Pseudorandom Function
 $f:\{0,1\}^n\to \{0,1\}^n$
 #### Oracle Access (for function $g$)
 $O_g$ is a p.p.t. that given $x\in \{0,1\}^n$ outputs $g(x)$.
 The distinguisher $D$ is given oracle access to $O_g$ and outputs $1$ if $g$ is random and $0$ otherwise. It can make polynomially many queries.
 ### Oracle indistinguishability
 $\{F_n\}$ and $\{G_n\}$ are sequences of distributions on functions
 $$
 f:\{0,1\}^{l_1(n)}\to \{0,1\}^{l_2(n)}
 $$
 They are computationally indistinguishable,
 $$
 \{F_n\}\approx \{G_n\}
 $$
 if for all p.p.t. $D$ (with oracle access to the sampled function),
 $$
 \left|P[f\gets F_n:D^f(1^n)=1]-P[g\gets G_n:D^g(1^n)=1]\right|< \epsilon(n)
 $$
 where $\epsilon(n)$ is negligible.
 Under this property, we still have:
 - Closure under efficient procedures.
 - Prediction lemma.
 - Hybrid lemma.
 ### Pseudorandom Function Family
 Definition: $\{f_s:\{0,1\}^{|s|}\to \{0,1\}^{|s|}\}_{s\in \{0,1\}^*}$ is a pseudorandom function family if:
 - $f_s(x)$ is easy to compute for every $s$ and every $x\in \{0,1\}^{|s|}$.
 - $\{s \gets\{0,1\}^n:f_s\}_n\approx \{F\gets RF_n:F\}_n$, i.e. the two are oracle indistinguishable.
  - $F$ is a truly random function, $RF_n$ being the uniform distribution over functions $\{0,1\}^n\to\{0,1\}^n$.
 Example:
 For $s\in \{0,1\}^n$, define $f_s:\overline{x}\mapsto \overline{s}\oplus \overline{x}$ (this family is not pseudorandom).
 $\mathcal{D}$ queries the oracle for $g(0^n)=\overline{y_0}$ and $g(1^n)=\overline{y_1}$. If $\overline{y_0}+\overline{y_1}=1^n$, then $\mathcal{D}$ outputs $1$, otherwise $0$.
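 ```python
 n = 16  # block length in bits (illustrative)

 def D(O_g):
     """Distinguisher with oracle access to g; n-bit strings are stored as ints."""
     zeros = 0              # 0^n
     ones = (1 << n) - 1    # 1^n
     y0 = O_g(zeros)
     y1 = O_g(ones)
     # "+" on bit strings is bitwise XOR
     if y0 ^ y1 == ones:
         return 1
     else:
         return 0
 ```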
 If $g=f_s$, then $\overline{y_0}+\overline{y_1}=\overline{s}+\overline{s}+1^n =1^n$, so $D$ outputs $1$.
 $$
 P[s\gets \{0,1\}^n:D^{f_s}(1^n)=1]=1
 $$
 $$
 P[F\gets RF_n:D^F(1^n)=1]=\frac{1}{2^n}
 $$
 #### Theorem PRG exists then PRF family exists.
 Proof:
 Let $g:\{0,1\}^n\to \{0,1\}^{2n}$ be a PRG.
 $$
 g(\overline{x})=[g_0(\overline{x})] [g_1(\overline{x})]
 $$
 Then we choose a random $s\in \{0,1\}^n$ (initial seed) and, for an input $\overline{x}\in \{0,1\}^n$, $\overline{x}=x_1\cdots x_n$, define
 $$
 f_s(\overline{x})=f_s(x_1\cdots x_n)=g_{x_n}(\dots (g_{x_2}(g_{x_1}(s))))
 $$
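 ```python
 import hashlib

 n = 16  # seed length in bytes (illustrative)

 def g(x):
     """Length-doubling map; SHAKE-256 is only a stand-in for a real PRG."""
     out = hashlib.shake_256(x).digest(2 * n)
     return out[:n], out[n:]  # (g_0(x), g_1(x))

 def f_s(s, x_bits):
     """f_s(x) = g_{x_n}(...g_{x_2}(g_{x_1}(s))) for a list of bits x_bits."""
     state = s
     for bit in x_bits:
         # keep the left half on a 0 bit, the right half on a 1 bit
         state = g(state)[bit]
     return state
 ```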
 Suppose $g:\{0,1\}^3\to \{0,1\}^6$ is a PRG.
 | $x$ | $g(x)$ |
 | --- | -------- |
 | 000 | 110011 |
 | 001 | 010010 |
 | 010 | 001001 |
 | 011 | 000110 |
 | 100 | 100000 |
 | 101 | 110110 |
 | 110 | 000111 |
 | 111 | 001110 |
 Suppose the initial seed is $011$, then the constructed function tree goes as follows:
 Example: 
 $$
 \begin{aligned}
 f_s(110)&=g_0(g_1(g_1(s)))\\
 &=g_0(g_1(110))\\
 &=g_0(111)\\
 &=001
 \end{aligned}
 $$
 $$
 \begin{aligned}
 f_s(010)&=g_0(g_1(g_0(s)))\\
 &=g_0(g_1(000))\\
 &=g_0(011)\\
 &=000
 \end{aligned}
 $$
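The tree walk can be replayed mechanically on the table. The sketch below reads the table as $x\mapsto g(x)$ with $g_0$ the first three output bits and $g_1$ the last three; that reading, and the helper names `G_TABLE`, `g_half` and `f`, are assumptions made for this illustration.

```python
# The 3-bit -> 6-bit table from the example, read as x -> g(x).
G_TABLE = {
    "000": "110011", "001": "010010", "010": "001001", "011": "000110",
    "100": "100000", "101": "110110", "110": "000111", "111": "001110",
}


def g_half(bit, x):
    """g_0(x) is the first half of g(x), g_1(x) is the second half."""
    out = G_TABLE[x]
    return out[:3] if bit == "0" else out[3:]


def f(s, x):
    """f_s(x) = g_{x_n}(... g_{x_2}(g_{x_1}(s))), consuming x from left to right."""
    state = s
    for bit in x:
        state = g_half(bit, state)
    return state


seed = "011"
for x in ("110", "010"):
    print(x, f(seed, x))
```

Each evaluation makes $n$ calls to $g$, one per input bit, so $f_s$ is efficiently computable even though the full tree has $2^n$ leaves.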
 Assume that $D$ distinguishes $f_s$ and $F\gets RF_n$ with non-negligible probability.
 By hybrid argument, there exists a hybrid $H_i$ such that $D$ distinguishes $H_i$ and $H_{i+1}$ with non-negligible probability.
 For $H_0$, 
 EOP
--- a/pages/CSE442T/CSE442T_L16.md
+++ b/pages/CSE442T/CSE442T_L16.md
@@ -0,0 +1,132 @@
 # Lecture 16
 ## Continue on PRG
 PRG exists $\implies$ Pseudorandom function family exists.
 ### Multi-message secure encryption
 $Gen(1^n):$ Output $f_i:\{0,1\}^n\to \{0,1\}^n$ from PRF family
 $Enc_i(m):$ Random $r\gets \{0,1\}^n$
 Output $(r,m\oplus f_i(r))$
 $Dec_i(r,c):$ Output $c\oplus f_i(r)$
 Proof of security:
 Suppose $D$ distinguishes, for infinitely many $n$,
 the encryptions of a pair of lists of messages.
 (1) $\{i\gets Gen(1^n):(r_1,m_1\oplus f_i(r_1)),(r_2,m_2\oplus f_i(r_2)),(r_3,m_3\oplus f_i(r_3)),\ldots,(r_q,m_q\oplus f_i(r_q)), \}$
 (2) $\{F\gets RF_n: (r_1,m_1\oplus F(r_1))\ldots\}$
 (3) One-time pad $\{(r_1,m_1\oplus s_1)\}$
 (4) One-time pad $\{(r_1,m_1'\oplus s_1)\}$
 If (1) (2) distinguished, 
 $(r_1,f_i(r_1)),\ldots,(r_q,f_i(r_q))$ is distinguished from 
 $(r_1,F(r_1)),\ldots, (r_q,F(r_q))$
 So $D$ distinguishes the outputs of the PRF on $r_1,\ldots, r_q$ from those of the RF, which contradicts the definition of a PRF.
 EOP
 Now we have
 (The RSA assumption and the discrete log assumption each imply that one-way functions exist.)
 One-way function exists $\implies$
 Pseudo random generator exists $\implies$
 Pseudo random function family exists $\implies$
 Multi-message secure encryption exists.
 ## Public key cryptography
 1970s.
 The goal was to agree/share a key without meeting in advance
 ### Diffie-Hellman Key exchange
 A and B create a secret key together without meeting.
 Rely on discrete log assumption.
 They publicly agree on modulus $p$ and generator $g$. 
 Alice picks random exponent $a$ and computes $g^a\mod p$
 Bob picks random exponent $b$ and computes $g^b\mod p$
 and they send the results to each other.
 Alice then computes $(g^b)^a$ and Bob computes $(g^a)^b$; both equal $g^{ab}\mod p$, the shared key.
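The exchange is short enough to run end to end. The parameters below ($p=23$, $g=5$) are toy values assumed for illustration and are far too small to be secure; the variable names are likewise made up for the demo.

```python
import secrets

# Toy public parameters (assumptions for illustration, not secure).
p = 23  # prime modulus
g = 5   # generator of Z_23^*

a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent in [1, p-2]
b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent in [1, p-2]

A_pub = pow(g, a, p)               # Alice sends g^a mod p
B_pub = pow(g, b, p)               # Bob sends g^b mod p

key_alice = pow(B_pub, a, p)       # (g^b)^a mod p
key_bob = pow(A_pub, b, p)         # (g^a)^b mod p

print(key_alice == key_bob)
```

An eavesdropper sees only `p`, `g`, `A_pub` and `B_pub`; with parameters of realistic size, recovering the key from these is the problem assumed to be hard.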
 #### Diffie-Hellman assumption
 Given only $g^a,g^b$, no efficient adversary can compute $g^{ab}$.
 ### Public key encryption scheme
 Idea: The recipient Bob distributes opened Bob-locks
 - Once closed, only Bob can open it.
 Public-key encryption scheme:
 1. $Gen(1^n):$ Outputs $(pk,sk)$
 2. $Enc_{pk}(m):$ Efficient for all $m,pk$
 3. $Dec_{sk}(c):$ Efficient for all $c,sk$
 4. $P[(pk,sk)\gets Gen(1^n):Dec_{sk}(Enc_{pk}(m))=m]=1$
 Let $A$ and $E$ know $pk$ but not $sk$, while $B$ knows both $pk$ and $sk$.
 The adversary can now encrypt any message $m$ with the public key.
 - Perfect secrecy impossible
 - Randomness necessary 
 Security of public key encryption:
 $\forall$ n.u.p.p.t. $D,\exists$ negligible $\epsilon(n)$ such that $\forall n$, $\forall m_0,m_1\in \{0,1\}^n$,
 $$
 \{(pk,sk)\gets Gen(1^n):(pk,Enc_{pk}(m_0))\} \text{ and } \{(pk,sk)\gets Gen(1^n):(pk,Enc_{pk}(m_1))\} 
 $$ 
 are distinguished by $D$ by at most $\epsilon (n)$
 This "single" message security implies multi-message security!
 _Left as exercise_
 We will first achieve security for sending a single bit, $0$ or $1$.
 Time for trapdoor permutations. (e.g. RSA)
 Encryption Scheme: Given a family of trapdoor permutations $\{f_i\}$ with hardcore bits $h_i$
 $Gen(1^n):(f_i,f_i^{-1})$, where $f_i^{-1}$ uses the trapdoor info $t$
 $Output ((f_i,h_i),f_i^{-1})$
 $m=0$ or $1$.
 $Enc_{pk}(m):r\gets\{0,1\}^n$
 $Output (f_i(r),h_i(r)+m)$
 $Dec_{sk}(c_1,c_2)$
 $r=f_i^{-1}(c_1)$
 $m=c_2+h_i(r)$
--- a/pages/CSE442T/CSE442T_L17.md
+++ b/pages/CSE442T/CSE442T_L17.md
@@ -0,0 +1,159 @@
 # Lecture 17
 ## Strength through Truth
 ### Public key encryption scheme (1-bit)
 $Gen(1^n):(f_i, f_i^{-1})$
 $f_i$ is the trapdoor permutation. (eg. RSA)
 $Output((f_i, h_i), f_i^{-1})$, where $(f_i, h_i)$ is the public key and $f_i^{-1}$ is the secret key.
 $Enc_{pk}(m):r\gets \{0, 1\}^n$
 $Output(f_i(r), h_i(r)\oplus m)$
 where $f_i(r)$ is denoted as $c_1$ and $h_i(r)\oplus m$ is the tag $c_2$.
 The decryption function is:
 $Dec_{sk}(c_1, c_2)$:
 $r=f_i^{-1}(c_1)$
 $m=c_2\oplus h_i(r)$
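A toy instantiation makes the three algorithms concrete. This is a sketch under assumptions: the trapdoor permutation is textbook RSA with tiny illustrative primes (insecure), the hard-core bit is taken to be the least significant bit of $r$, and $r$ is sampled from $\mathbb{Z}_N^*$ rather than $\{0,1\}^n$. The function names are made up for the demo.

```python
import secrets
from math import gcd


def gen():
    """Gen: toy RSA trapdoor permutation; the primes are illustrative only."""
    p, q = 61, 53
    N = p * q
    phi = (p - 1) * (q - 1)
    e = 17
    d = pow(e, -1, phi)       # trapdoor: inverse of e mod phi(N)
    return (N, e), (N, d)     # (pk, sk)


def h(r):
    """Assumed hard-core bit: least significant bit of r."""
    return r & 1


def sample_unit(N):
    """Sample r uniformly from Z_N^*."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if gcd(r, N) == 1:
            return r


def enc(pk, m):
    """Enc_pk(m) = (f(r), h(r) xor m) for a single bit m."""
    N, e = pk
    r = sample_unit(N)
    return pow(r, e, N), h(r) ^ m


def dec(sk, c):
    """Dec_sk(c1, c2): recover r with the trapdoor, then unmask m."""
    N, d = sk
    c1, c2 = c
    r = pow(c1, d, N)
    return c2 ^ h(r)


pk, sk = gen()
print([dec(sk, enc(pk, m)) for m in (0, 1)])
```

Decryption is correct because $f_i^{-1}(f_i(r))=r$, so the mask $h_i(r)$ cancels.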
 #### Validity of the decryption
 Proof of the validity of the decryption: Exercise.
 #### Security of the encryption scheme
 The encryption scheme is secure under this construction (Trapdoor permutation (TDP), Hardcore bit (HCB)).
 Proof:
 We proceed by contradiction. (Constructing contradiction with definition of hardcore bit.)
 Assume that there exists a distinguisher $\mathcal{D}$ that can distinguish the encryption of $0$ and $1$ with non-negligible probability $\mu(n)$.
 $$
 \{(pk,sk)\gets Gen(1^n):(pk,Enc_{pk}(0))\} \text{ vs. }\{(pk,sk)\gets Gen(1^n):(pk,Enc_{pk}(1))\} \text{ are distinguished by } \mathcal{D} \text{ by } \geq \mu(n)
 $$
 By the prediction lemma, the distinguisher can be turned into an adversary $\mathcal{A}$ that guesses the encrypted bit with non-negligible advantage:
 $$
 P[m\gets \{0,1\}; (pk,sk)\gets Gen(1^n):\mathcal{A}(pk,Enc_{pk}(m))=m]\geq \frac{1}{2}+\frac{\mu(n)}{2}
 $$
 We will use this to construct an agent $B$ which can determine the hardcore bit $h_i(r)$ from $f_i(r)$ with non-negligible advantage.
 $f_i,h_i$ are determined.
 $B$ is given $y=f_i(r)$ and outputs a guess for $b=h_i(r)\in \{0,1\}$.
 - $r\gets \{0,1\}^n$ is chosen uniformly at random.
 - $y=f_i(r)$ is given to $B$.
 - $b=h_i(r)$ is the bit $B$ must guess; it is not given to $B$.
 - $B$ chooses $c_2\gets \{0,1\}$ uniformly at random. This implicitly defines $m=c_2\oplus h_i(r)$, which is also uniform.
 - Then $(y,c_2)=(f_i(r), h_i(r)\oplus m)$ is distributed exactly as $Enc_{pk}(m)$ with $pk=(f_i, h_i)$, so $B$ can use $\mathcal{A}$ to guess whether $m$ is $0$ or $1$.
 - Let $m'\gets \mathcal{A}(pk,(y,c_2))$.
 - Since $c_2=h_i(r)\oplus m$, we have $b=m\oplus c_2$; if $m'=m$, then $b=m'\oplus c_2$.
 - Output $m'\oplus c_2$.
 The probability that $B$ correctly guesses $b$ given $f_i,h_i$ is:
 $$
 \begin{aligned}
 &~~~~~P[r\gets \{0,1\}^n: y=f_i(r), b=h_i(r): B(f_i,h_i,y)=b]\\
 &=P[r\gets \{0,1\}^n,c_2\gets \{0,1\}: y=f_i(r), b=h_i(r):\mathcal{A}((f_i,h_i),(y,c_2))=(c_2+b)]\\
 &=P[r\gets \{0,1\}^n,m\gets \{0,1\}: y=f_i(r), b=h_i(r):\mathcal{A}((f_i,h_i),(y,b\oplus m))=m]\\
 &\geq\frac{1}{2}+\frac{\mu(n)}{2}
 \end{aligned}
 $$
 This contradicts the definition of hardcore bit.
 EOP
 ### Public key encryption scheme (multi-bit)
 Let $m\in \{0,1\}^k$.
 For each bit $m_j$ of $m$, choose a fresh random $r_j\in \{0,1\}^n$ and set $y_j=f_i(r_j)$, $b_j=h_i(r_j),c_j=m_j\oplus b_j$.
 $Enc_{pk}(m)=((y_1,c_1),\cdots,(y_k,c_k))$, with $c_1\cdots c_k\in \{0,1\}^k$
 $Dec_{sk}$: for each $j$, $r_j=f_i^{-1}(y_j)$ and $m_j=h_i(r_j)\oplus c_j$
 ### Special public key cryptosystem: El-Gamal (based on Diffie-Hellman Assumption)
 #### Definition: Decisional Diffie-Hellman Assumption (DDH)
 > Define the group of squares mod $p$ as follows:
 > 
 > $p=2q+1$, $q\in \Pi_{n-1}$, $g\gets \mathbb{Z}_p^*/\{1\}$, $y=g^2$
 >
 > $G=\{y,y^2,\cdots,y^q=1\}\mod p$
 These two listed below are indistinguishable.
 $\{p\gets \tilde{\Pi_n};y\gets Gen_q;a,b\gets \mathbb{Z}_q:(p,y,y^a,y^b,y^{ab})\}_n$
 $\{p\gets \tilde{\Pi_n};y\gets Gen_q;a,b,\bold{z}\gets \mathbb{Z}_q:(p,y,y^a,y^b,y^\bold{z})\}_n$
 > Diffie-Hellman Assumption:
 >
 > Hard to compute $y^{ab}$ given $p,y,y^a,y^b$.
 So the DDH assumption implies the discrete logarithm assumption.
 Idea:
 If one can find $a,b$ from $y^a,y^b$, then one can compute $y^{ab}$ and compare it with the last component $y^\bold{z}$ to check whether the tuple is a valid DDH tuple.
 #### El-Gamal encryption scheme (public key cryptosystem)
 $Gen(1^n)$:
 $p\gets \tilde{\Pi_n},g\gets \mathbb{Z}_p^*/\{1\},y\gets Gen_q,a\gets \mathbb{Z}_q$
 Output:
 $pk=(p,y,y^a\mod p)$ (public key)
 $sk=(p,y,a)$ (secret key)
 **Message space:** $G_q=\{y,y^2,\cdots,y^q=1\}$
 $Enc_{pk}(m)$:
 $b\gets \mathbb{Z}_q$
 $c_1=y^b\mod p,c_2=(y^{ab}\cdot m)\mod p$
 Output: $(c_1,c_2)$
 $Dec_{sk}(c_1,c_2)$:
 Since $c_2=(y^{ab}\cdot m)\mod p$, we have $m=\frac{c_2}{c_1^a}\mod p$
 Output: $m$
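The scheme can be run on a toy group. The sketch assumes the tiny safe prime $p=23=2\cdot 11+1$ and $y=2^2=4$, which generates the squares mod $23$; these values and the function names are illustrative assumptions and give no security.

```python
import secrets

# Toy parameters (assumptions for illustration): p = 2q + 1 with q prime.
q = 11
p = 2 * q + 1          # 23
y = pow(2, 2, p)       # y = g^2 with g = 2; y generates the squares mod p


def gen():
    """Gen: pk = (p, y, y^a mod p), sk = (p, y, a)."""
    a = secrets.randbelow(q)             # a <- Z_q
    return (p, y, pow(y, a, p)), (p, y, a)


def enc(pk, m):
    """Enc_pk(m) = (y^b, y^{ab} * m) mod p for m in G_q."""
    _, _, ya = pk
    b = secrets.randbelow(q)             # b <- Z_q
    c1 = pow(y, b, p)
    c2 = (pow(ya, b, p) * m) % p
    return c1, c2


def dec(sk, c):
    """Dec_sk(c1, c2) = c2 / c1^a mod p."""
    _, _, a = sk
    c1, c2 = c
    return (c2 * pow(c1, -a, p)) % p     # negative exponent = modular inverse


pk, sk = gen()
G_q = [pow(y, i, p) for i in range(1, q + 1)]   # message space
print(all(dec(sk, enc(pk, m)) == m for m in G_q))
```

Decryption works because $c_1^a=y^{ab}$, the same mask the sender computed as $(y^a)^b$.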
 #### Security of El-Gamal encryption scheme
 Proof:
 If not secure, then there exists a distinguisher $\mathcal{D}$ that can distinguish the encryptions of some $m_1,m_2\in G_q$ with non-negligible advantage $\mu(n)$:
 $$
 \{(pk,sk)\gets Gen(1^n):(pk,Enc_{pk}(m_1))\}\text{ vs. }\\
 \{(pk,sk)\gets Gen(1^n):(pk,Enc_{pk}(m_2))\}\text{ are distinguished by }\mathcal{D}\text{ by }\geq \mu(n)
 $$
 Such a $\mathcal{D}$ can be used to distinguish $(p,y,y^a,y^b,y^{ab})$ from $(p,y,y^a,y^b,y^\bold{z})$, which contradicts the DDH assumption.
 EOP
--- a/pages/CSE442T/CSE442T_L18.md
+++ b/pages/CSE442T/CSE442T_L18.md
@@ -0,0 +1,148 @@
 # Lecture 18
 ## Chapter 5: Authentication
 ### 5.1 Introduction
 Signatures
 **private key**
 Alice and Bob share a secret key $k$.
 Message Authentication Codes (MACs)
 **public key**
 Any one can verify the signature.
 Digital Signatures
 #### Definitions 134.1
 A message authentication code (MAC) is a triple $(Gen, Tag, Ver)$ where
 - $k\gets Gen(1^n)$ is a p.p.t. algorithm that takes as input a security parameter $n$ and outputs a key $k$.
 - $\sigma\gets Tag_k(m)$ is a p.p.t. algorithm that takes as input a key $k$ and a message $m$ and outputs a tag $\sigma$.
 - $Ver_k(m, \sigma)$ is a deterministic algorithm that takes as input a key $k$, a message $m$, and a tag $\sigma$ and outputs "Accept" if $\sigma$ is a valid tag for $m$ under $k$ and "Reject" otherwise.
 For all $n\in\mathbb{N}$, all $m\in\mathcal{M}_n$.
 $$
 P[k\gets Gen(1^n):Ver_k(m, Tag_k(m))=\textup {``Accept''}]=1
 $$
 #### Definition 134.2 (Security of MACs)
 Security: Prevent an adversary from producing any accepted $(m, \sigma)$ pair that they haven't seen before.
 - Assume they have seen some history of signed messages. $(m_1, \sigma_1), (m_2, \sigma_2), \ldots, (m_q, \sigma_q)$.
 - Adversary $\mathcal{A}$ has oracle access to $Tag_k$. Its goal is to produce a new $(m, \sigma)$ pair that is accepted but is none of $(m_1, \sigma_1), (m_2, \sigma_2), \ldots, (m_q, \sigma_q)$.
 $\forall$ n.u.p.p.t. adversary $\mathcal{A}$ with oracle access to $Tag_k(\cdot)$,
 $$
 \Pr[k\gets Gen(1^n);(m, \sigma)\gets\mathcal{A}^{Tag_k(\cdot)}(1^n);\mathcal{A}\textup{ did not query }m \textup{ and } Ver_k(m, \sigma)=\textup{``Accept''}]<\epsilon(n)
 $$
 #### MACs scheme
 $F=\{f_s\}$ is a PRF family.
 $f_s:\{0,1\}^{|s|}\to\{0,1\}^{|s|}$
 $Gen(1^n): s\gets \{0,1\}^n$
 $Tag_s(m)$ outputs $f_s(m)$.
 $Ver_s(m, \sigma)$ outputs "Accept" if $f_s(m)=\sigma$ and "Reject" otherwise.
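The scheme is three one-liners once a PRF is fixed. In the sketch below HMAC-SHA256 from the Python standard library stands in for $f_s$; treating it as a PRF is an assumption, and unlike $f_s$ it accepts messages of any length. The function names mirror the notation above.

```python
import hashlib
import hmac
import secrets


def gen(n_bytes=32):
    """Gen: sample a uniformly random key s."""
    return secrets.token_bytes(n_bytes)


def tag(s, m):
    """Tag_s(m) = f_s(m), with HMAC-SHA256 as the stand-in PRF."""
    return hmac.new(s, m, hashlib.sha256).digest()


def ver(s, m, sigma):
    """Ver_s(m, sigma): accept iff sigma equals the recomputed tag."""
    return hmac.compare_digest(tag(s, m), sigma)


s = gen()
sigma = tag(s, b"pay Bob 10")
print(ver(s, b"pay Bob 10", sigma), ver(s, b"pay Bob 1000", sigma))
```

Verification recomputes the tag, so it is deterministic, and `hmac.compare_digest` compares in constant time to avoid leaking how many leading bytes matched.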
 Proof of security (Outline):
 Suppose we used $F\gets RF_n$ (true random function).
 If $\mathcal{A}$ wants $F(m)$ for $m\notin \{m_1, \ldots, m_q\}$, then $F(m)\gets U_n$ is independent of everything $\mathcal{A}$ has seen.
 $$
 \begin{aligned}
 &P[F\gets RF_n; (m, \sigma)\gets\mathcal{A}^{F(\cdot)}(1^n);\mathcal{A}\textup{ did not query }m \textup{ and } Ver_k(m, \sigma)=\textup{``Accept''}]\\
 &\leq P[F\gets RF_n: \sigma=F(m)]\\
 &=\frac{1}{2^n}<\epsilon(n)
 \end{aligned}
 $$
 Suppose an adversary $\mathcal{A}$ has $\frac{1}{p(n)}$ chance of success with our PRF-based scheme...
 This could be used to distinguish PRF $f_s$ from a random function.
 The distinguisher runs as follows:
 - Runs $\mathcal{A}(1^n)$
 - Whenever $\mathcal{A}$ asks for $Tag_k(m)$, we ask our oracle for $f(m)$
 - $(m, \sigma)\gets\mathcal{A}^{F(\cdot)}(1^n)$
 - Query oracle for $f(m)$
 - If $\sigma=f(m)$, output 1
 - Otherwise, output 0
 $D$ will output 1 for PRF with probability $\frac{1}{p(n)}$ and for RF with probability $\frac{1}{2^n}$.
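The PRF-based MAC can be sketched in a few lines. This is only an illustration under an assumption: HMAC-SHA256 from the Python standard library stands in for the PRF $f_s$, and the function names `gen`, `tag`, `ver` are chosen here to mirror $(Gen, Tag, Ver)$; none of this comes from the lecture itself.

```python
import hashlib
import hmac
import secrets

def gen(n_bytes=32):
    # Gen: sample a uniformly random key s
    return secrets.token_bytes(n_bytes)

def tag(s, m):
    # Tag_s(m) = f_s(m); HMAC-SHA256 is an assumed stand-in for the PRF f_s
    return hmac.new(s, m, hashlib.sha256).digest()

def ver(s, m, sigma):
    # Ver_s(m, sigma): recompute the tag and compare in constant time
    return hmac.compare_digest(tag(s, m), sigma)

s = gen()
sigma = tag(s, b"pay 10 to Bob")
print(ver(s, b"pay 10 to Bob", sigma))   # an honest tag is accepted
print(ver(s, b"pay 99 to Bob", sigma))   # reusing the tag on a new message is rejected
```

Verification is deterministic because it simply recomputes the tag, which matches the requirement that $Ver$ be a deterministic algorithm.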
 #### Definition 135.1 (Digital Signature D.S. over $\{M_n\}_n$)
 A digital signature scheme is a triple $(Gen, Sign, Ver)$ where
 - $(pk,sk)\gets Gen(1^n)$ is a p.p.t. algorithm that takes as input a security parameter $n$ and outputs a public key $pk$ and a secret key $sk$.
 - $\sigma\gets Sign_{sk}(m)$ is a p.p.t. algorithm that takes as input a secret key $sk$ and a message $m$ and outputs a signature $\sigma$.
 - $Ver_{pk}(m, \sigma)$ is a deterministic algorithm that takes as input a public key $pk$, a message $m$, and a signature $\sigma$ and outputs "Accept" if $\sigma$ is a valid signature for $m$ under $pk$ and "Reject" otherwise.
 For all $n\in\mathbb{N}$, all $m\in\mathcal{M}_n$.
 $$
 P[(pk,sk)\gets Gen(1^n); \sigma\gets Sign_{sk}(m): Ver_{pk}(m, \sigma)=\textup{``Accept''}]=1
 $$
 #### Security of Digital Signature
 For all n.u.p.p.t. adversaries $\mathcal{A}$ with oracle access to $Sign_{sk}(\cdot)$, there is a negligible function $\epsilon(n)$ such that
 $$
 \Pr[(pk,sk)\gets Gen(1^n); (m, \sigma)\gets\mathcal{A}^{Sign_{sk}(\cdot)}(pk,1^n):\mathcal{A}\textup{ did not query }m \textup{ and } Ver_{pk}(m, \sigma)=\textup{``Accept''}]<\epsilon(n)
 $$
 ### 5.4 One time security: $\mathcal{A}$ can only use oracle once.
 $\mathcal{A}$ queries a single message $m$ and wins if it outputs an accepted $(m', \sigma)$ with $m'\neq m$.
 Security parameter $n$
 One time security on $\{0,1\}^n$
 One time security on $\{0,1\}^*$
 Regular security on $\{0,1\}^*$
 Note: the adversary automatically has access to $Ver_{pk}(\cdot)$
 #### One time security scheme (Lamport Scheme on $\{0,1\}^n$)
 $Gen(1^n)$: sample $2n$ random $n$-bit strings
 $sk$: List 0: $\bar{x_1}^0, \bar{x_2}^0, \ldots, \bar{x_n}^0$
 List 1: $\bar{x_1}^1, \bar{x_2}^1, \ldots, \bar{x_n}^1$
 All $\bar{x_i}^j\in\{0,1\}^n$
 $pk$: For a strong one-way function $f$
 List 0: $f(\bar{x_1}^0), f(\bar{x_2}^0), \ldots, f(\bar{x_n}^0)$
 List 1: $f(\bar{x_1}^1), f(\bar{x_2}^1), \ldots, f(\bar{x_n}^1)$
 $Sign_{sk}(m):(m_1, m_2, \ldots, m_n)\mapsto(\bar{x_1}^{m_1}, \bar{x_2}^{m_2}, \ldots, \bar{x_n}^{m_n})$
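 $Ver_{pk}(m, \sigma)$: output "Accept" if $f(\sigma_i)=f(\bar{x_i}^{m_i})$ for every $i$ (each $\sigma_i$ maps under $f$ to the entry of $pk$ at position $i$ of list $m_i$) and "Reject" otherwise.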
 > Example: When we sign a message $01100$, $$Sign_{sk}(01100)=(\bar{x_1}^0, \bar{x_2}^1, \bar{x_3}^1, \bar{x_4}^0, \bar{x_5}^0)$$
 > We only reveal the $x_1^0, x_2^1, x_3^1, x_4^0, x_5^0$
 > A second signature on a different message reveals additional secret strings, at the positions where the two messages differ.  
 > With two queries, the adversary can ask the oracle for $Sign_{sk}(0^n)$ (reveals list 0) and $Sign_{sk}(1^n)$ (reveals list 1) and then produce any valid signature they want.
--- a/pages/CSE442T/CSE442T_L19.md
+++ b/pages/CSE442T/CSE442T_L19.md
@@ -0,0 +1,112 @@
 # Lecture 19
 ## Chapter 5: Authentication
 ### Lamport's One-Time Signature
 Given a one-way function $f$, we can create a signature scheme as follows:
 We construct a key pair $(sk, pk)$ as follows:
 $sk$ is two lists of random $n$-bit strings, 
 where $sk_0=\{\bar{x_1}^0, \bar{x_2}^0, \ldots, \bar{x_n}^0\}$ 
 and $sk_1=\{\bar{x_1}^1, \bar{x_2}^1, \ldots, \bar{x_n}^1\}$.
 $pk$ is the image of $sk$ under $f$, i.e. $pk = f(sk)$.
 where $pk_0 = \{f(\bar{x_1}^0), f(\bar{x_2}^0), \ldots, f(\bar{x_n}^0)\}$
 and $pk_1 = \{f(\bar{x_1}^1), f(\bar{x_2}^1), \ldots, f(\bar{x_n}^1)\}$.
 To sign a message $m\in\{0,1\}^n$, we output the signature $Sign_{sk}(m=m_1m_2\ldots m_n) = \{\bar{x_1}^{m_1}, \bar{x_2}^{m_2}, \ldots, \bar{x_n}^{m_n}\}$.
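 To verify a signature $\sigma=(\sigma_1,\ldots,\sigma_n)$ on $m$, we check that $f(\sigma_i) = f(\bar{x_i}^{m_i})$ for every $i$, i.e. that $f(\sigma_i)$ equals the $i$th entry of $pk_{m_i}$.
 This is not more than one-time secure since the adversary can ask the oracle for $Sign_{sk}(0^n)$ and $Sign_{sk}(1^n)$ to reveal lists $sk_0$ and $sk_1$, and can then sign any message.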
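A minimal sketch of Lamport's scheme, including the two-query attack, might look as follows. SHA-256 is used here as an assumed stand-in for the one-way function $f$, the message length `N = 16` is arbitrary, and all function names are illustrative rather than part of the notes.

```python
import hashlib
import secrets

N = 16  # message length in bits (small, for illustration)

def f(x):
    # assumed stand-in for the one-way function f
    return hashlib.sha256(x).digest()

def gen():
    # sk[b][i] is the secret string x_i^b; pk[b][i] = f(x_i^b)
    sk = [[secrets.token_bytes(32) for _ in range(N)] for _ in range(2)]
    pk = [[f(x) for x in row] for row in sk]
    return sk, pk

def sign(sk, m):
    # m is a list of N bits; reveal x_i^{m_i} for each position i
    return [sk[bit][i] for i, bit in enumerate(m)]

def ver(pk, m, sigma):
    # accept iff f(sigma_i) equals the public value at position i of list m_i
    return len(sigma) == len(m) == N and all(
        f(s) == pk[bit][i] for i, (s, bit) in enumerate(zip(sigma, m)))

sk, pk = gen()
m = [0, 1, 1, 0] * 4
sigma = sign(sk, m)
print(ver(pk, m, sigma))

# Two queries break the scheme: signatures on 0^N and 1^N reveal all of sk.
sig0, sig1 = sign(sk, [0] * N), sign(sk, [1] * N)
target = [1, 0] * 8
forged = [(sig0, sig1)[bit][i] for i, bit in enumerate(target)]
print(ver(pk, target, forged))
```

The forgery at the end never touches `sk` directly; it only recombines strings that the two signatures already revealed.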
 We will show it is one-time secure.
 Idea of proof:
 Say their query is $Sign_{sk}(0^n)$, which reveals $sk_0$. 
 Now they must sign $m\neq 0^n$. There must be a 1 somewhere in the message. Say the $i$th bit is the first 1. Then they need to produce $x'$ such that $f(x')=f(\bar{x_i}^1)$, which inverts the one-way function.
 Proof of one-time security:
 Suppose there exists an adversary $\mathcal{A}$ that produces a valid signature on a different message after one oracle query with non-negligible probability $\mu(n)>\frac{1}{p(n)}$.
 We will design an algorithm $B$ which uses $\mathcal{A}$ to invert the one-way function with non-negligible probability.
 Let $x\gets \{0,1\}^n$ be uniformly random, $y=f(x)$.
 $B$: input is $y$ and $1^n$. Our goal is to find $x'$ such that $f(x')=y$.
 Create 2 lists of random $n$-bit strings:
 $sk_0=\{x_1^0, x_2^0, \ldots, x_n^0\}$
 $sk_1=\{x_1^1, x_2^1, \ldots, x_n^1\}$
 and compute $pk$ by applying $f$ to every entry.
 Then we pick a random $(c,i)\gets \{0,1\}\times [n]$ ($2n$ possibilities).
 In $pk$, replace $f(x_i^c)$ with $y$. $B$ now knows a preimage for every entry of $pk$ except the one at position $(c,i)$.
 Run $\mathcal{A}$ on input $pk$ and $1^n$. It will query $Sign_{sk}$ on some message $m$.
 Case 1: $m_i=1-c$
 We can answer with $x_1^{m_1}, x_2^{m_2}, \ldots, x_i^{1-c}, \ldots, x_n^{m_n}$, all of which we know.
 Case 2: $m_i=c$
 We must abort, since we do not know a preimage of $y$.
 Since $y=f(x)$ for uniform $x$, the modified $pk$ is distributed exactly like a real one, so $\mathcal{A}$'s view is independent of $i$ and Case 1 occurs with probability $\frac{1}{2}$.
 In Case 1, $\mathcal{A}$ outputs a valid $(m',\sigma)$ with $m'\neq m$ with probability $\mu(n)$, and we are hoping that $m_i'=c$. Then $\sigma_i$ is a preimage of $y$.
 Since $m'$ differs from $m$ in at least 1 bit and $i$ is uniform over the $n$ positions, $P[m_i'=c]\geq \frac{1}{n}$.
 Write $\sigma=(\sigma_1,\sigma_2,\ldots,\sigma_n)$.
 Check if $f(\sigma_i)=y$. If so, output $x'=\sigma_i$.
 If not, $B$ fails.
 $B$ inverts $f$ with prob $\geq \frac{1}{2}\cdot\frac{1}{n}\cdot\mu(n)>\frac{1}{2n\,p(n)}$, which is non-negligible, contradicting that $f$ is one-way.
 ### Collision Resistant Hash Functions (CRHF)
 We now have a one-time secure signature scheme.
 We want a one-time secure signature scheme that increases the size of messages relative to the keys.
 A family $H=\{h_i:D_i\to R_i\}_{i\in I}$ is a family of CRHFs if
 Easy to pick: 
 $Gen(1^n)$: outputs $i\in I$ (p.p.t.)
 Compression
 $|R_i|<|D_i|$ for each $i\in I$
 Easy to compute:
 Can compute $h_i(x),\forall i,x\in D_i$ with a p.p.t.
 Collision resistant:
 $\forall$ n.u.p.p.t. $\mathcal{A}$, $\exists$ negligible $\epsilon(n)$ such that $\forall n$, 
 $$
 P[i\gets Gen(1^n); (x_1,x_2)\gets \mathcal{A}(1^n,i): h_i(x_1)=h_i(x_2)\land x_1\neq x_2]\leq \epsilon(n)
 $$
 CRHF implies one-way function.
 But not the other way around. (CRHF is a stronger notion than one-way function.)
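To see why compression alone does not give collision resistance, here is a small sketch under stated assumptions: the toy hash `h` is SHA-256 truncated to 16 bits (a choice made purely for illustration), and `find_collision` is a hypothetical helper that runs a birthday-style search.

```python
import hashlib

def h(x, out_bytes=2):
    # toy compressing hash: SHA-256 truncated to out_bytes bytes
    # (an assumption for illustration; 16 output bits cannot resist collisions)
    return hashlib.sha256(x).digest()[:out_bytes]

def find_collision(out_bytes=2):
    # birthday search: remember every digest seen until one repeats
    seen = {}
    i = 0
    while True:
        x = i.to_bytes(8, "big")
        d = h(x, out_bytes)
        if d in seen:
            return seen[d], x
        seen[d] = x
        i += 1

x1, x2 = find_collision()
print(x1 != x2, h(x1) == h(x2))
```

With only $2^{16}$ possible digests the loop must stop within $2^{16}+1$ inputs, which is why a real CRHF needs a range large enough that such a search is infeasible.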
--- a/pages/CSE442T/CSE442T_L2.md
+++ b/pages/CSE442T/CSE442T_L2.md
@@ -0,0 +1,97 @@
 # Lecture 2
 ## Probability review
 Sample space $S=$ set of outcomes (possible results of experiments)
 Event $A\subseteq S$
 $P[A]=P[$ outcome $x\in A]$
 $P[\{x\}]=P(x)$
 Conditional probability:
 $P[A|B]={P[A\cap B]\over P[B]}$
 Assuming $B$ is the known information. Moreover, $P[B]>0$
 Probability that both $A$ and $B$ occur: $P[A\cap B]=P[A|B]\cdot P[B]$
 $P[B\cap A]=P[B|A]\cdot P[A]$
 So  $P[A|B]={P[B|A]\cdot P[A]\over P[B]}$ (Bayes Theorem)
 **There is always a chance that random guess would be the password... Although really, really, low...**
 ### Law of total probability
 Let $S=\bigcup_{i=1}^n B_i$, where the $B_i$ are disjoint events.
 $A=\bigcup_{i=1}^n A\cap B_i$ ($A\cap B_i$ are all disjoint)
 $P[A]=\sum^n_{i=1} P[A|B_i]\cdot P[B_i]$
 ## Back to cryptography
 Defining security.
 ### Perfect Secrecy (Shannon Secrecy)
 $K\gets Gen()$ $K\in\mathcal{K}$
 $c\gets Enc_K(m)$ or we can also write as $c\gets Enc(K,m)$ for $m\in \mathcal{M}$
 And the decryption procedure:
 $m'\gets Dec_K(c')$, $m'$ might be null.
 $P[K\gets Gen(): Dec_K(Enc_K(m))=m]=1$
 #### Shannon Secrecy
 Distribution $D$ over the message space $\mathcal{M}$
 $P[K\gets Gen;m\gets D: m=m'|c\gets Enc_K(m)]=P[m\gets D: m=m']$
 Basically, we cannot gain any information from the encrypted message.
 The ciphertext must not contain any information that changes the distribution of the message after the ciphertext has been seen.
 **NO INFO GAINED**
 #### Perfect Secrecy
 For any 2 messages, say $m_1,m_2\in \mathcal{M}$ and for any possible ciphertext $c$,
 $P[K\gets Gen():c\gets Enc_K(m_1)]=P[K\gets Gen():c\gets Enc_K(m_2)]$
 For a fixed $c$, every message is equally likely to be encrypted to it...
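The definition can be checked exhaustively on a tiny example. The sketch below assumes the one-time pad (bitwise XOR with a uniformly random key) on 3-bit messages, which is not introduced in this lecture and is used here only as a convenient test case; `cipher_dist` is a hypothetical helper.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

n = 3
space = list(product([0, 1], repeat=n))   # messages, keys and ciphertexts

def enc(k, m):
    # one-time pad: bitwise XOR of key and message
    return tuple(a ^ b for a, b in zip(k, m))

def cipher_dist(m):
    # exact distribution of Enc_K(m) over a uniformly random key K
    counts = Counter(enc(k, m) for k in space)
    return {c: Fraction(counts[c], len(space)) for c in space}

dists = [cipher_dist(m) for m in space]
print(all(d == dists[0] for d in dists))
```

Every message induces the same ciphertext distribution, which is exactly the equality demanded by perfect secrecy.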
 #### Theorem 
 Shannon secrecy is equivalent to perfect secrecy.
 Proof:
 If a crypto-system satisfies perfect secrecy, then it also satisfies Shannon secrecy.
 Let $(Gen, Enc,Dec)$ be a perfectly secret crypto-system with $\mathcal{K}$ and $\mathcal{M}$.
 Let $D$ be any distribution over messages.
 Let $m'\in \mathcal{M}$.
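 By Bayes' theorem,
 $$
 P[K\gets Gen();m\gets D:m=m'|c\gets Enc_K(m)]={P_{K,m}[c\gets Enc_K(m)\vert m=m']\cdot P[m=m']\over P_{K,m}[c\gets Enc_K(m)]}={P_K[c\gets Enc_K(m')]\cdot P[m=m']\over P_{K,m}[c\gets Enc_K(m)]}
 $$
 By the law of total probability, writing $\mathcal{M}=\{m_1,\ldots,m_n\}$,
 $$
 \begin{aligned}
 P_{K,m}[c\gets Enc_K(m)]&=\sum^n_{i=1}P_{K,m}[c\gets Enc_K(m)|m=m_i]\cdot P[m=m_i]\\
 &=\sum^n_{i=1}P_{K}[c\gets Enc_K(m_i)]\cdot P[m=m_i]
 \end{aligned}
 $$
 and $P_{K}[c\gets Enc_K(m_i)]$ is the same for every $m_i$ due to perfect secrecy, so it equals $P_K[c\gets Enc_K(m')]$ and
 $$
 \sum^n_{i=1}P_{K}[c\gets Enc_K(m_i)]\cdot P[m=m_i]=P_K[c\gets Enc_K(m')]\sum^n_{i=1} P[m=m_i]=P_K[c\gets Enc_K(m')]
 $$
 The numerator and denominator share this factor, so the conditional probability equals $P[m=m']$, which is Shannon secrecy.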
--- a/pages/CSE442T/CSE442T_L20.md
+++ b/pages/CSE442T/CSE442T_L20.md
@@ -0,0 +1 @@
 # Lecture 20
--- a/pages/CSE442T/CSE442T_L21.md
+++ b/pages/CSE442T/CSE442T_L21.md
@@ -0,0 +1 @@
 # Lecture 21
--- a/pages/CSE442T/CSE442T_L22.md
+++ b/pages/CSE442T/CSE442T_L22.md
@@ -0,0 +1 @@
 # Lecture 22
--- a/pages/CSE442T/CSE442T_L23.md
+++ b/pages/CSE442T/CSE442T_L23.md
@@ -0,0 +1 @@
 # Lecture 23
--- a/pages/CSE442T/CSE442T_L24.md
+++ b/pages/CSE442T/CSE442T_L24.md
@@ -0,0 +1 @@
 # Lecture 24
--- a/pages/CSE442T/CSE442T_L3.md
+++ b/pages/CSE442T/CSE442T_L3.md
@@ -0,0 +1,114 @@
 # Lecture 3
 All algorithms $C(x)\to y$, $x,y\in \{0,1\}^*$ 
 P.P.T= Probabilistic Polynomial-time Turing Machine.
 ## Turing Machine: Mathematical model for a computer program
 A machine that can:
 1. Read input
 2. Read/Write working tape move left/right
 3. Can change state
 ### Assumptions
 Anything that can be accomplished by a real computer program can be accomplished by a "sufficiently complicated" Turing Machine (TM).
 ## Polynomial time
 We say $C(x),|x|=n,n\to \infty$ runs in polynomial time if it uses at most $T(n)$ operations, where $T$ is bounded by some polynomial: $\exists c>0$ such that $T(n)=O(n^c)$
 If we can argue that an algorithm runs in polynomially many constant-time operations, then the same is true for the T.M.
 $p,q$ are polynomials in $n$,
 $p(n)+q(n),p(n)q(n),p(q(n))$ are polynomial of $n$.
 Polynomial-time $\approx$ "efficient" for this course.
 ## Probabilistic
 Our algorithms have access to random "coin-flips"; we can produce poly(n) random bits.
 $P[C(x)$ takes at most $T(n)$ steps $]=1$
 Our adversary $a(x)$ will be a P.P.T. which is non-uniform (n.u.) (the program's description size can grow polynomially in $n$)
 ## Efficient private key encryption scheme 
 $m\in\{0,1\}^n$
 $Gen(1^n)$ p.p.t output $k\in \mathcal{K}$
 $Enc_k(m)$ p.p.t outputs $c$
 $Dec_k(c')$ p.p.t outputs $m$ or "null"
 $P_k[Dec_k(Enc_k(m))=m]=1$
 ## Negligible function
 $\varepsilon:\mathbb{N}\to \mathbb{R}$ is a negligible function if $\forall c>0$, $\exists N\in\mathbb{N}$ such that $\forall n\geq N, \varepsilon(n)<\frac{1}{n^c}$
 Idea: for any polynomial, even $n^{100}$, in the long run $\varepsilon(n)\leq \frac{1}{n^{100}}$
 Example: $\varepsilon (n)=\frac{1}{2^n}$, $\varepsilon (n)=\frac{1}{n^{\log (n)}}$
 Non-example: $\varepsilon (n)=\frac{1}{n^c}$ for any fixed $c$
 ## One-way function
 Idea: We are always okay with our chance of failure being negligible.
 Foundational concept of cryptography
 Goal: making $Enc_k(m),Dec_k(c')$ easy and $Dec^{-1}(c')$ hard.
 ### Strong one-way function
 #### Definition: Strong one-way function
 $$
 f:\{0,1\}^n\to \{0,1\}^*(n\to \infty)
 $$
 There is a negligible function $\varepsilon (n)$ such that for any adversary $a$ (n.u.p.p.t)
 $$
 P[x\gets\{0,1\}^n;y=f(x):f(a(y))=y,a(y)=x']\leq\varepsilon(n)
 $$
 _Probability of guessing correct message is negligible_
 and
 there is a p.p.t which computes $f(x)$ for any $x$.
 - Hard to go back from output
 - Easy to find output
 $a$ sees output $y$, they want to find some $x'$ such that $f(x')=y$.
 Example: Suppose $f$ is one-to-one, then $a$ must find our $x$; a random guess succeeds with $P[x'=x]=\frac{1}{2^n}$, which is negligible.
 Why do we allow $a$ to get a different $x'$?
 > Suppose the definition is $P[x\gets\{0,1\}^n;y=f(x):a(y)=x]\leq\varepsilon(n)$, then a trivial function such as the constant $f(x)=0$ would also satisfy the definition, since $y$ reveals nothing about $x$.
 To be technically fair, $a(y)=a(y,1^n)$, size of input $\approx n$, let them use $poly(n)$ operations.
 ### Do one-way functions exist?
 Unknown, actually...
 But we think so!
 We will need to use various assumptions, ones that we believe very strongly based on evidence/experience
 Ex. $p,q$ are large random primes
 $N=p\cdot q$
 Factoring $N$ is hard. (without knowing $p,q$)
--- a/pages/CSE442T/CSE442T_L4.md
+++ b/pages/CSE442T/CSE442T_L4.md
@@ -0,0 +1,129 @@
 # Lecture 4
 ## Recap
 Negligible function $\varepsilon(n)$ if $\forall c>0,\exists N$ such that $\forall n>N$, $\varepsilon (n)<\frac{1}{n^c}$
 Ex: $\varepsilon(n)=2^{-n},\varepsilon(n)=\frac{1}{n^{\log (\log n)}}$
 ### Strong One-Way Function
 1. $\exists$ a P.P.T. that computes $f(x),\forall x\in\{0,1\}^n$
 2. $\forall a$ adversaries, $\exists \varepsilon(n),\forall n$.
    $$
    P[x\gets \{0,1\}^n;y=f(x):f(a(y,1^n))=y]<\varepsilon(n)
    $$
 _That is, the probability of successfully inverting should shrink faster than any inverse polynomial as the input length grows..._
 To negate statement 2:
 $$
 P[x\gets \{0,1\}^n;y=f(x):f(a(y,1^n))=y]=\mu_a(n)
 $$
 is a negligible function.
 Negation:
 $\exists a$, $P[x\gets \{0,1\}^n;y=f(x):f(a(y,1^n))=y]=\mu_a(n)$ is not  a negligible function.
 That is, $\exists c>0,\forall N, \exists n>N$ such that $\mu_a(n)>\frac{1}{n^c}$
 $\mu_a(n)>\frac{1}{n^c}$ for infinitely many $n$, or "infinitely often".
 > Keep in mind: if $P[success]=\frac{1}{n^c}$, the adversary can try $O(n^c)$ times and have a good chance of succeeding at least once.
 ## New materials
 ### Weak One-Way Function
 $f:\{0,1\}^n\to \{0,1\}^*$
 1. $\exists$ a P.P.T. that computes $f(x),\forall x\in\{0,1\}^n$
 2. $\exists$ a polynomial $p(n)$ such that $\forall a$ adversaries and all sufficiently large $n$,
    $$
    P[x\gets \{0,1\}^n;y=f(x):f(a(y,1^n))=y]<1-\frac{1}{p(n)}
    $$
    _The probability of success should not be too close to 1_
 ### Probability
 ### Useful bound $0<p<1$
 $1-p<e^{-p}$
 (most useful when $p$ is small)
 Suppose an experiment has probability $p$ of failure and $1-p$ of success.
 We run experiment $n$ times independently.
 $P[$success all n times$]=(1-p)^n<(e^{-p})^n=e^{-np}$
 Theorem: If there exists a weak one-way function, then there exists a strong one-way function
 In particular, if $f:\{0,1\}^n\to \{0,1\}^*$ is a weak one-way function,
 $\exists$ polynomial $q(n)$ such that
 $$
 g(x):\{0,1\}^{nq(n)}\to \{0,1\}^*
 $$
 and for every $n$ bits $x_i$
 $$
 g(x_1,x_2,..,x_{q(n)})=(f(x_1),f(x_2),...,f(x_{q(n)}))
 $$
 is a strong one-way function.
 Proof:
 1. Since $\exists$ a P.P.T. that computes $f(x),\forall x$, we run it $q(n)$ times (polynomially many) to compute $g$.
 2. (Idea) $a$ has to succeed in inverting $f$ all $q(n)$ times.
    Since $f$ is weak one-way, $\exists$ polynomial $p(n)$ such that $\forall a, P[a$ inverts $f]<1-\frac{1}{p(n)}$ (Here we use $<$ since we can always find a polynomial that works)
    Let $q(n)=np(n)$.
    Then $P[a$ inverts $g]\sim P[a$ inverts $f$ all $q(n)$ times$]<(1-\frac{1}{p(n)})^{q(n)}=(1-\frac{1}{p(n)})^{np(n)}<(e^{-\frac{1}{p(n)}})^{np(n)}=e^{-n}$, which is a negligible function.
 EOP
 _We can always force the adversary to invert the weak one-way function polynomially many times to obtain a strong one-way function_
 Example: $(1-\frac{1}{n^2})^{n^3}<e^{-n}$
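The example bound is easy to check numerically. The sketch below assumes $p(n)=n^2$ and $q(n)=n\cdot p(n)=n^3$ as in the example; the helper name `all_succeed_prob` is made up for illustration.

```python
import math

def all_succeed_prob(n):
    # probability that an inverter with per-instance success 1 - 1/n^2
    # succeeds on all n^3 independent instances
    return (1 - 1 / n**2) ** (n**3)

for n in [2, 5, 10, 20]:
    # each line prints n, the exact product, and the bound e^{-n}
    print(n, all_succeed_prob(n), math.exp(-n))
```

In every row the middle value sits below $e^{-n}$, as the inequality $1-p<e^{-p}$ predicts.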
 ### Some candidates of one-way function
 #### Multiplication
 $Mult(m_1,m_2)=\begin{cases}
    1,m_1=1 | m_2=1\\
    m_1\cdot m_2
 \end{cases}$
 But we don't want trivial answers like (1,1000000007)
 Idea: Our "secret" is 373 and 481, Eve can see the product 179413.
 Not strong one-way for all integer inputs because there are trivial answer for $\frac{3}{4}$ of all outputs. `Mult(2,y/2)`
 Factoring Assumption:
 Essentially the only known way to factor the product of two primes is to iterate over candidate primes, which is not efficient.
 In other words:
 $\forall a,\exists \varepsilon(n)$ such that $\forall n$,
 $$
 P[p_1\gets \Pi_n;p_2\gets \Pi_n;N=p_1\cdot p_2:a(N)=\{p_1,p_2\}]<\varepsilon(n)
 $$
 where $\Pi_n=\{$ all primes $p<2^n\}$
 We'll show that multiplication is a weak one-way function under the Factoring Assumption.
--- a/pages/CSE442T/CSE442T_L5.md
+++ b/pages/CSE442T/CSE442T_L5.md
@@ -0,0 +1,114 @@
 # Lecture 5
 Proving that there are one-way functions relies on assumptions.
 Factoring Assumption: $\forall a, \exists \varepsilon (n)$ such that, for random primes $p,q<2^n$,
 $$
 P[p\gets \Pi_n;q\gets \Pi_n;N=p\cdot q:a(N)\in \{p,q\}]<\varepsilon(n)
 $$
 Evidence: To this point, the best known procedure that always factors has run time $O(2^{\sqrt{n}\sqrt{\log(n)}})$
 Distribution of prime numbers:
 - There are infinitely many primes
 - Prime Number Theorem $\pi(n)\approx\frac{n}{\ln(n)}$, that means, about $\frac{1}{\ln n}$ of all integers up to $n$ are prime.
 We want a guaranteed lower bound on how often we find a prime:
 $\pi(2^n)>\frac{2^n}{2n}$
 e.g. 
 $$
 P[x\gets \{0,1\}^n:x\in prime]\geq {\frac{2^n}{2n}\over 2^n}=\frac{1}{2n}
 $$
 Theorem:
 $$
 f_{mult}:\{0,1\}^{2n}\to \{0,1\}^{2n},f_{mult}(x_1,x_2)=x_1\cdot x_2
 $$
 Idea: There are enough pairs of primes to make this difficult.
 > Reminder: Weak one-way if easy to compute and $\exists p(n)$,
 > $$P[a\ inverts=success]<1-\frac{1}{p(n)}$$
 > $$P[failure]>\frac{1}{p(n)}$$ high enough
 ## Prove one-way function (under assumptions)
 To prove $f$ is one-way (under assumption)
 1. Show $\exists p.p.t$ that computes $f(x),\forall x$.
 2. Proof by contradiction.
   - For weak: Provide $p(n)$ that we know works.
     - Assume $\exists a$ such that $P[a\ inverts]>1-\frac{1}{p(n)}$
   - For strong: Let $p(n)$ be an arbitrary polynomial.
     - Assume $\exists a$ such that $P[a\ inverts]>\frac{1}{p(n)}$
 Construct p.p.t B
 which uses $a$ to solve a problem, which contradicts assumption or known fact.
 Back to Theorem:
 We will show that $p(n)=8n^2$ works.
 We claim $\forall a$,
 $$
 P[(x_1,x_2)\gets \{0,1\}^{2n};y=f_{mult}(x_1,x_2):f(a(y))=y]<1-\frac{1}{8n^2}
 $$
 For the sake of contradiction, suppose
 $$
 \exists a \textup{ such that} P[success]>1-\frac{1}{8n^2}
 $$
 We will use this $a$ to design p.p.t $B$ which can factor 2 random primes with non-negligible prob.
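 ```python
 import math
 import random
 def A(y):
     # the adversary for f_mult: given y = x1 * x2 (factors need not be prime), returns such a pair
     # stand-in by trial division; the proof only assumes that some efficient A exists
     for d in range(2, math.isqrt(y) + 1):
         if y % d == 0:
             return d, y // d
     return 1, y
 def is_prime(x):
     # test if x is a prime (trial division, fine for small demo inputs)
     return x > 1 and all(x % d for d in range(2, math.isqrt(x) + 1))
 def gen(n):
     # generate a uniformly random number of up to n bits
     return random.getrandbits(n)
 def B(N, n):
     # N = p * q is the product of two random n-bit primes that B must factor
     x1, x2 = gen(n), gen(n)
     # if both samples are prime, swap in N: it has the same distribution as x1 * x2
     y = N if is_prime(x1) and is_prime(x2) else x1 * x2
     factors = A(y)
     # B succeeds only when N was fed to A and A returned a nontrivial factorization
     if y == N and 1 not in factors and factors[0] * factors[1] == N:
         return factors
     return None
 ```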
 How often does B succeed/fail?
 B fails to factor $N=p\cdot q$, if:
 - $x_1$ and $x_2$ are not both prime (event $E$)
  - $P[E]=1-P(x_1\in prime)P(x_2\in prime)\leq 1-(\frac{1}{2n})^2=1-\frac{1}{4n^2}$
 - $a$ fails to factor (event $F$)
  - $P[F]<\frac{1}{8n^2}$
 So
 $$
 P[B\ fails]\leq P[E\cup F]\leq P[E]+P[F]\leq (1-\frac{1}{4n^2}+\frac{1}{8n^2})=1-\frac{1}{8n^2}
 $$
 So
 $$
 P[B\ succeed]\geq \frac{1}{8n^2}\ (non\ negligible)
 $$
 This contradicts the factoring assumption. Therefore, our assumption that $a$ exists was wrong.
 Therefore $\forall a$, $P[(x_1,x_2)\gets \{0,1\}^{2n};y=f_{mult}(x_1,x_2):f(a(y))=y]<1-\frac{1}{8n^2}$ holds, so $f_{mult}$ is a weak one-way function.
--- a/pages/CSE442T/CSE442T_L6.md
+++ b/pages/CSE442T/CSE442T_L6.md
@@ -0,0 +1,114 @@
 # Lecture 6
 ## Review
 $$
 f_{mult}:\{0,1\}^{2n}\to \{0,1\}^{2n}
 $$
 is a weak one-way.
 $P[a\ invert]\leq 1-\frac{1}{8n^2}$ over $x,y\in$ random integers $\{0,1\}^n$
 ## Converting to strong one-way function
 By factoring assumptions, $\exists$ strong one-way function
 $f:\{0,1\}^N\to \{0,1\}^N$ for infinitely many $N$.
 $f=\left(f_{mult}(x_1,y_1),f_{mult}(x_2,y_2),\dots,f_{mult}(x_q,y_q)\right)$, $x_i,y_i\in \{0,1\}^n$.
 $f:\{0,1\}^{8n^4}\to \{0,1\}^{8n^4}$
 Idea: With high probability, at least one pair $(x_i,y_i)$ are both prime.
 Factoring assumption: $a$ has low chance of factoring $f_{mult}(x_i,y_i)$
 Use $P[x \textup{ is prime}]\geq\frac{1}{2n}$
 With $q=4n^3$ independent pairs,
 $$
 P[\textup{no pair } (x_i,y_i) \textup{ is both prime}]=P[(x_1,y_1) \textup{ is not both prime}]^q
 $$
 $$
 P[\textup{no pair } (x_i,y_i) \textup{ is both prime}]\leq(1-\frac{1}{4n^2})^{4n^3}\leq (e^{-\frac{1}{4n^2}})^{4n^3}=e^{-n}
 $$
 ### Proof of strong one-way
 1. $f_{mult}$ is efficiently computable, and we compute it poly-many times.
 2. Suppose it's not hard to invert. Then
    $\exists n.u.p.p.t.\ a$ such that $P[w\gets \{0,1\}^{8n^4};z=f(w):f(a(z))=z]=\mu (n)>\frac{1}{p(n)}$
 We will use this to construct $B$ that breaks factoring assumption.
 $p\gets \Pi_n,q\gets \Pi_n,N=p\cdot q$
 ```pseudocode
 function B:
    Receives N
    Sample (x_i,y_i) for i = 1..q
    Compute z_i = f_mult(x_i,y_i) for each i
    For i = 1 to q
        check if both x_i and y_i are prime
        If yes,
            z_i = N; k = i
            break   // replace first instance only
    Let z = (z_1,z_2,...,z_q) // z_k = N hopefully
    ((x_1,y_1),...,(x_k,y_k),...,(x_q,y_q)) <- a(z)
    if some z_k was replaced and x_k * y_k = N
        return x_k,y_k
    else
        return null
 ```
 Let $E$ be the event that all pairs of sampled integers were not both prime.
 Let $F$ be the event that $a$ failed to invert
 $P(B\ fails)\leq P[E\cup F]\leq P[E]+P[F]\leq e^{-n}+(1-\frac{1}{p(n)})=1-(\frac{1}{p(n)}-e^{-n})\leq 1-\frac{1}{2p(n)}$
 $P[B\ succeeds]=P[p\gets \Pi_n,q\gets \Pi_n,N=p\cdot q:B(N)\in \{p,q\}]\geq \frac{1}{2p(n)}$
 Contradicting factoring assumption
 We've defined one-way functions to have domain $\{0,1\}^n$ for some $n$.
 Our strong one-way function $f$
 - Takes $4n^3$ pairs of random integers
 - Multiplies all pairs
 - Hope at least one pair are both prime $p,q$ b/c we know $N=p\cdot q$ is hard to factor
 ### General collection of strong one-way functions
 $F=\{f_i:D_i\to R_i\},i\in I$, $I$ is the index set.
 1. We can efficiently choose $i\gets I$ using $Gen$.
 2. $\forall i$ we can efficiently sample $x\gets D_i$.
 3. $\forall i\forall x\in D_i,f_i(x)$ is efficiently computable
 4. For any n.u.p.p.t $a$, $\exists$ negligible function $\varepsilon (n)$.
    $P[i\gets Gen(1^n);x\gets D_i;y=f_i(x):f_i(a(y,i,1^n))=y]\leq \varepsilon(n)$
 #### Theorem
 $f_{mult,n}:(\Pi_n\times \Pi_n)\to \{0,1\}^{2n}$ is a collection of strong one way function.
 Ideas of proof:
 1. $n\gets Gen(1^n)$
 2. We can efficiently sample $p,q$ (with justifications)
 3. Factoring assumption
 Algorithm for sampling a random prime $p\gets \Pi_n$
 1. $x\gets \{0,1\}^n$ (n bit integer)
 2. Check if $x$ is prime.
   - Deterministic poly-time procedure
   - In practice, a much faster randomized procedure (Miller-Rabin) used
        $P[x\notin prime|test\ said\ x\ prime]<\varepsilon(n)$
 3. If not, repeat. Do this for polynomial number of times
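The sampling procedure can be sketched as rejection sampling with a Miller-Rabin test. This is a sketch rather than the course's reference code: the names `is_probable_prime` and `sample_prime`, the 40 rounds, and the retry budget of $20n^2$ are all choices made here for illustration.

```python
import random

def is_probable_prime(x, rounds=40):
    # Miller-Rabin: a composite x passes one round with probability at most 1/4
    if x < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if x % p == 0:
            return x == p
    d, s = x - 1, 0
    while d % 2 == 0:          # write x - 1 = d * 2^s with d odd
        d, s = d // 2, s + 1
    for _ in range(rounds):
        a = random.randrange(2, x - 1)
        y = pow(a, d, x)
        if y in (1, x - 1):
            continue
        for _ in range(s - 1):
            y = pow(y, 2, x)
            if y == x - 1:
                break
        else:
            return False       # a witnesses that x is composite
    return True

def sample_prime(n, max_tries=None):
    # rejection sampling: each n-bit draw is prime with probability >= 1/(2n)
    max_tries = max_tries or 20 * n * n
    for _ in range(max_tries):
        x = random.getrandbits(n)
        if is_probable_prime(x):
            return x
    return None   # happens with probability at most (1 - 1/(2n))^(20 n^2)

print(sample_prime(32))
```

Because each draw is prime with probability at least $\frac{1}{2n}$, a polynomial number of retries fails only with probability about $e^{-10n}$.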
 > $;$ means and, $:$ means given that. $1$ usually interchangeable with $\{0,1\}^n$
--- a/pages/CSE442T/CSE442T_L7.md
+++ b/pages/CSE442T/CSE442T_L7.md
@@ -0,0 +1,84 @@
 # Lecture 7
 ## Letter choosing experiment
 For 100 letter tiles,
 $p_1,...,p_{27}$ (with one blank)
 $(p_1)^2+\dots +(p_{27})^2\geq\frac{1}{27}$
 For any $p_1,...,p_n$, $0\leq p_i\leq 1$.
 $\sum p_i=1$
 $P[$the same event twice in a row$]=p_1^2+p_2^2....+p_n^2$
 By Cauchy-Schwarz: $|u\cdot v|^2 \leq ||u||^2\cdot ||v||^2$.
 let $\vec{u}=(p_1,...,p_n)$, $\vec{v}=(1,..,1)$, so $(p_1+p_2+\dots+p_n)^2\leq (p_1^2+p_2^2+\dots+p_n^2)\cdot n$. Since $\sum p_i=1$, $p_1^2+p_2^2+\dots+p_n^2\geq \frac{1}{n}$
 So for an adversary $A$ who randomly chooses $x'$ and outputs it if $f(x')=f(x)$, $P[f(x)=f(x')]\geq\frac{1}{|Y|}$, where $Y$ is the range of $f$
 So $P[x\gets \{0,1\}^n;y=f(x):f(a(y,1^n))=y]\geq \frac{1}{|Y|}$
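The collision bound $\sum p_i^2\geq\frac{1}{n}$ can be checked with exact arithmetic. The two distributions below are made-up examples, and `collision_probability` is a hypothetical helper name.

```python
from fractions import Fraction

def collision_probability(probs):
    # probability that two independent draws from the distribution agree
    return sum(p * p for p in probs)

uniform = [Fraction(1, 4)] * 4
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
print(collision_probability(uniform), collision_probability(skewed))
```

The uniform distribution attains the bound $\frac{1}{n}$ exactly, and any skew only makes a repeat more likely.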
 ## Modular arithmetic
 For $a,b\in \mathbb{Z}$, $N\in \mathbb{Z}^+$
 $a\equiv b \mod N\iff N|(a-b)\iff \exists k\in \mathbb{Z}, a-b=kN,a=kN+b$
 Ex: $N=23$, $-20\equiv 3\equiv 26\equiv 49\equiv 72\mod 23$.
 ### Equivalence relation for any $N$ on $\mathbb{Z}$
 $a\equiv a\mod N$
 $a\equiv b\mod N\iff b\equiv a\mod N$
 $a\equiv b\mod N$ and $b\equiv c\mod N\implies a\equiv c\mod N$
 ### Division Theorem
 For any $a\in \mathbb{Z}$, and $N\in\mathbb{Z}^+$, $\exists$ unique $q,r$ with $0\leq r<N$ such that $a=qN+r$.
 $\mathbb{Z}_N=\{0,1,2,...,N-1\}$ with modular arithmetic.
 $a+b\mod N,a\cdot b\mod N$
 Theorem: If $a\equiv b\mod N$ and $c\equiv d\mod N$, then $a\cdot c\equiv b\cdot d\mod N$.
 Definition: $gcd(a,b)=d,a,b\in \mathbb{Z}^+$, is the maximum number such that $d|a$ and $d|b$.
 Computing it by factoring is slow... (Example: large primes $p,q,r$, $N=p\cdot q,M=p\cdot r$)
 #### Euclidean algorithm.
 Recursively relying on fact that $(a>b>0)$
 $gcd(a,b)=gcd(b,a\mod b)$
 ```python
 def euclidean_algorithm(a,b):
    if a<b: return euclidean_algorithm(b,a)
    if b==0: return a
    return euclidean_algorithm(b,a%b)
 ```
 Proof:
 We'll show $d|a$ and $d|b\iff d|b$ and $d|(a\mod b)$
 $\impliedby$ $a=q\cdot b+r$, $r=a\mod b$, so $d|b$ and $d|r$ give $d|a$
 $\implies$ $r=a-q\cdot b$, so $d|a$ and $d|b$ give $d|r$, $r=a\mod b$
 Runtime analysis:
 Fact: $b_{i+2}<\frac{1}{2}b_i$
 Proof:
 Since $a_i=q_i\cdot b_i+b_{i+1}$, we have $b_1=q_2\cdot b_2+b_3$ with $b_2>b_3$ and $q_2\geq 1$, so $b_1\geq b_2+b_3>2b_3$, i.e. $b_3<\frac{b_1}{2}$
 $T=2\Theta(\log b)=O(\log b)$ iterations, which is linear in the bit length of the input
--- a/pages/CSE442T/CSE442T_L8.md
+++ b/pages/CSE442T/CSE442T_L8.md
@@ -0,0 +1,72 @@
 # Lecture 8
 ## Computational number theory/arithmetic
 We want to have easy-to-use one-way functions for cryptography.
 How do we find $a^x\mod N$ quickly? $a,x,N$ are positive integers. First reduce $a$ to $[a\mod N]$
 Example: $129^{39}\mod 41\equiv (129\mod 41)^{39}\mod 41=6^{39}\mod 41$
 Find the binary representation of $x$. e.g. express as sums of powers of 2.
 `x=39=bin(1,0,0,1,1,1)`
 Repeatedly square $floor(\log_2(x))$ times.
 $$
 \begin{aligned}
    6^{39}\mod 41&=6^{32+4+2+1}\mod 41\\
    &=(6^{32}\mod 41)(6^{4}\mod 41)(6^{2}\mod 41)(6^{1}\mod 41)\mod 41\\
    &=(-4)(25)(-5)(6)\mod 41\\
    &=7
 \end{aligned}
 $$
 The total number of squarings is $floor(\log_2(x))$, plus at most as many multiplications to combine them
 _looks like fast exponentiation right?_
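The repeated-squaring procedure above can be written as a short loop. This is a sketch with an illustrative name, `mod_exp`; Python's built-in three-argument `pow` does the same job and is used in the test as a cross-check.

```python
def mod_exp(a, x, N):
    # compute a^x mod N by scanning the bits of x from least significant
    result = 1
    base = a % N               # reduce a mod N first
    while x > 0:
        if x & 1:              # this power of two appears in x
            result = (result * base) % N
        base = (base * base) % N   # a^(2^i) -> a^(2^(i+1))
        x >>= 1
    return result

print(mod_exp(129, 39, 41))    # 7, matching the worked example
```

Scanning from the least significant bit means each squaring is reused by all higher bits, so the loop runs once per bit of $x$.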
 Goal: $f_{g,p}(x)=g^x\mod p$ is a one-way function, for certain choice of $p,g$ (and assumptions)
 ### A group (Nice day one for MODERN ALGEBRA)
 A group $G$ is a set with a binary operation $\oplus$ ($\forall a,b\in G$, $a \oplus b\to c$) such that
 1. $a,b\in G\implies a\oplus b\in G$
 2. $(a\oplus b)\oplus c=a\oplus(b\oplus c)$
 3. $\exists e$ such that $\forall g\in G, e\oplus g=g=g\oplus e$
 4. $\forall g\in G$, $\exists g^{-1}\in G$ such that $g\oplus g^{-1}=e$
 Example: 
 - $\mathbb{Z}_N=\{0,1,2,3,...,N-1\}$ with addition $\mod N$, with identity element $0$. $a\in \mathbb{Z}_N, a^{-1}=N-a$.
 - An even simpler group is $\mathbb{Z}$ with addition.
 - $\mathbb{Z}_N^*=\{x:x\in \mathbb{Z},1 \leq x\leq N: gcd(x,N)=1\}$ with multiplication $\mod N$ (we can do division here! yeah...).
  - If $N=p$ is prime, then $\mathbb{Z}_p^*=\{1,2,3,...,p-1\}$
  - If $N=24$, then $\mathbb{Z}_{24}^*=\{1,5,7,11,13,17,19,23\}$
    - Identity is $1$.
     - Let $a\in \mathbb{Z}_N^*$, so $gcd(a,N)=1$. By the (extended) Euclidean algorithm, $\exists x,y \in \mathbb{Z}$ such that $ax+Ny=1$, so $ax\equiv 1\mod N$ and $x=a^{-1}$
     - $a,b\in \mathbb{Z}_N^*$. Want to show $gcd(ab,N)=1$. If $gcd(ab,N)=d>1$, then some prime $p|d$, so $p|ab$ and $p|N$, which means $p|a$ or $p|b$. In either case, $gcd(a,N)\geq p>1$ or $gcd(b,N)\geq p>1$, which contradicts that $a,b\in \mathbb{Z}_N^*$
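The inverse in $\mathbb{Z}_N^*$ comes from the coefficients $x,y$ with $ax+Ny=1$. A minimal sketch of that computation follows; the names `extended_gcd` and `mod_inverse` are illustrative.

```python
def extended_gcd(a, b):
    # returns (d, x, y) with d = gcd(a, b) and a*x + b*y = d
    if b == 0:
        return a, 1, 0
    d, x, y = extended_gcd(b, a % b)
    return d, y, x - (a // b) * y

def mod_inverse(a, N):
    # inverse of a in Z_N^*, which exists exactly when gcd(a, N) = 1
    d, x, _ = extended_gcd(a % N, N)
    if d != 1:
        raise ValueError("a is not in Z_N^*")
    return x % N

print(mod_inverse(7, 24))   # 7, since 7 * 7 = 49 = 2 * 24 + 1
```

The recursion mirrors the Euclidean algorithm step $gcd(a,b)=gcd(b,a\mod b)$ and back-substitutes the coefficients on the way out.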
 ### Euler's totient function
 $\phi:\mathbb{Z}^+\to \mathbb{Z}^+,\phi(N)=|\mathbb{Z}_N^*|=|\{1\leq x\leq N:gcd(x,N)=1\}|$
 Example: $\phi(1)=1$, $\phi(24)=8$, $\phi (p)=p-1$, $\phi(p\cdot q)=(p-1)(q-1)$
 ### Euler's Theorem
 For any $a\in \mathbb{Z}_N^*$, $a^{\phi(N)}\equiv 1\mod N$
 Consequence: $a^x\mod N$, $x=K\cdot \phi(N)+r,0\leq r< \phi(N)$
 $$
 a^x\equiv a^{K \cdot \phi (N) +r}\equiv ( a^{\phi(N)} )^K \cdot a^r \equiv a^r \mod N
 $$
 So computing $a^x\mod N$ is polynomial in $\log (N)$ by reducing $a\mod N$ and $x\mod \phi(N)<N$
 Corollary: Fermat's little theorem:
 $1\leq a\leq p-1,a^{p-1}\equiv 1 \mod p$
--- a/pages/CSE442T/CSE442T_L9.md
+++ b/pages/CSE442T/CSE442T_L9.md
@@ -0,0 +1,118 @@
 # Lecture 9
 ## Continue on Cyclic groups
 $$
 \begin{aligned}
 107^{662}\mod 51&=(107\mod 51)^{662}\mod 51\\
 &=5^{662}\mod 51
 \end{aligned}
 $$
 Remind that $\phi(p),p\in\Pi,\phi(p)=p-1$.
 $51=3\times 17,\phi(51)=\phi(3)\times \phi(17)=2\times 16=32$, so $5^{32}\equiv 1\mod 51$
 $5^2\equiv 25\mod 51$  
 $5^4\equiv (5^2)^2\equiv(25)^2 \equiv 625\equiv 13\mod 51$  
 $5^8\equiv (5^4)^2\equiv(13)^2 \equiv 169\equiv 16\mod 51$  
 $5^{16}\equiv (5^8)^2\equiv(16)^2 \equiv 256\equiv 1\mod 51$  
 $$
 \begin{aligned}
 5^{662}\mod 51&=5^{662\mod 32}\mod 51\\
 &=5^{22}\mod 51\\
 &=5^{16}\cdot 5^4\cdot 5^2\mod 51\\
 &=19
 \end{aligned}
 $$
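The worked computation can be replayed in code. The sketch below computes $\phi(N)$ by brute-force counting, which is only reasonable for small $N$; the helper name `phi` is an assumption of this example.

```python
from math import gcd

def phi(N):
    # Euler's totient by direct counting (fine for small N)
    return sum(1 for x in range(1, N + 1) if gcd(x, N) == 1)

a, x, N = 107, 662, 51
r = x % phi(N)                       # 662 mod 32 = 22
print(phi(N), r, pow(a % N, r, N), pow(a, x, N))
```

The last two printed values agree, confirming that reducing the exponent modulo $\phi(N)$ does not change the result when $gcd(a,N)=1$.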
 For $a\in \mathbb{Z}_N^*$, the order of $a$, $o(a)$ is the smallest positive $k$ such that $a^k\equiv 1\mod N$. $o(a)\leq \phi(N),o(a)|\phi (N)$
 In a general finite group
 $g^{|G|}=e$ (identity)
 $o(g)\vert |G|$
 If a group $G=\{a,a^2,a^3,...,e\}$ consists of the powers of a single element $a$, then $G$ is cyclic
 In a cyclic group, if $o(a)=|G|$, then $a$ is a generator of $G$.
 Fact: $\mathbb{Z}^*_p$ is cyclic
 $|\mathbb{Z}^*_p|=p-1$, so $\exists$ generator $g$ of order $p-1$, e.g. $|\mathbb{Z}_{13}^*|=\phi(13)=12$
 For example, $2$ is a generator for $\mathbb{Z}_{13}^*$ with $2,4,8,3,6,12,11,9,5,10,7,1$.
 If $g$ is a generator, $f:\mathbb{Z}_p^*\to \mathbb{Z}_p^*$, $f(x)=g^x \mod p$ is onto.
 What type of prime $p$?
 - Large prime.
 - If $p-1$ is very factorable, that is very bad.
  - Pohlig-Hellman algorithm
  - If $p=2^n+1$, discrete logs only need polynomial time to compute
 - We want $p=2q+1$, where $q$ is prime. ($q$ is a Sophie Germain prime and $p$ is a safe prime)
 There are _probably_ infinitely many safe primes, and they are efficient to sample as well.
 If $p$ is safe, $g$ generator.
 $$
 \mathbb{Z}_p^*=\{g,g^2,..,e\}
 $$
 Then $S_{g,p}=\{g^2,g^4,...,g^{2q}\}\subseteq \mathbb{Z}_p^*$ is a subgroup; $g^{2k}\cdot g^{2l}=g^{2(k+l)}\in S_{g,p}$
 It is cyclic with generator $g^2$.
 It is easy to find a generator.
 - Pick $a\in \mathbb{Z}_p^*$
 - Let $x=a^2\mod p$. If $x\neq 1$, it is a generator of subgroup $S_p$ (the subgroup has prime order $q$, so every element other than $1$ generates it)
  - $S_p=\{x,x^2,...,x^q\}\mod p$
 Example: $p=2\cdot 11+1=23$
 we have a subgroup with generator $4$ and $S_4=\{4,16,18,3,12,2,8,9,13,6,1\}$
 ```python
 def get_generator(p):
    """
    p should be a prime, or you need to do factorization
    """
    g=[]
    for i in range(2,p-1):
        k=i
        sg=[]
        step=p
        while k!=1 and step>0:
            if k==0:
                raise ValueError(f"Damn, {i} generates 0 for group {p}")
            sg.append(k)
            k=(k*i)%p
            step-=1
        sg.append(1)
        # if len(sg)!=(p-1): continue
        g.append((i,[j for j in sg]))
    return g
 ```
 ### Diffie-Hellman assumption
 Let $p$ be a randomly sampled safe prime.
 Denote the set of safe primes by $\tilde{\Pi}_n=\{p\in \Pi_n:q=\frac{p-1}{2}\in \Pi_{n-1}\}$
 Then for every n.u.p.p.t. $\mathcal{A}$ there is a negligible $\varepsilon(n)$ such that
 $$
 P\left[p\gets \tilde{\Pi_n};a\gets\mathbb{Z}_p^*;g=a^2\neq 1;x\gets \mathbb{Z}_q;y=g^x\mod p:\mathcal{A}(y)=x\right]\leq \varepsilon(n)
 $$
 $p\gets \tilde{\Pi_n};a\gets\mathbb{Z}_p^*;g=a^2\neq 1$ is how the parameters are sampled when we encrypt over cyclic groups.
 Notes: $f:\mathbb{Z}_q\to \mathbb{Z}_p^*$, $f(x)=g^x\mod p$, is one-to-one, so $f(\mathcal{A}(y))=y\iff \mathcal{A}(y)=x$
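For a toy safe prime the discrete log can be found by exhaustive search, which makes the one-to-one property concrete. The sketch uses $p=23$, $g=4$ from the earlier example; `subgroup` and `brute_force_dlog` are hypothetical helpers, and the search takes time proportional to $q$, so it says nothing about large parameters.

```python
def subgroup(g, p):
    # list g^1, g^2, ... mod p until the identity 1 appears
    # (assumes gcd(g, p) = 1 and g != 1, otherwise the loop is meaningless)
    elems, y = [], g % p
    while y != 1:
        elems.append(y)
        y = (y * g) % p
    return elems + [1]

def brute_force_dlog(y, g, p):
    # exhaustive search for x in {1, ..., q} with g^x = y mod p
    # (x = q plays the role of 0 in Z_q, since g^q = 1)
    for x, gx in enumerate(subgroup(g, p), start=1):
        if gx == y:
            return x
    return None

p, g = 23, 4              # safe prime p = 2*11 + 1, generator g = 2^2
print(subgroup(g, p))
print(brute_force_dlog(pow(g, 7, p), g, p))   # recovers the exponent 7
```

Each element of the subgroup appears exactly once in the list, so the exponent returned is the unique preimage of $y$.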
--- a/pages/CSE442T/Exam_reviews/CSE442T_E1.md
+++ b/pages/CSE442T/Exam_reviews/CSE442T_E1.md
@@ -0,0 +1,210 @@
 # System check for exam list
 **The exam will take place in class on Monday, October 21.**
 The topics will cover Chapters 1 and 2, as well as the related probability discussions we've had (caveats below).  Assignments 1 through 3 span this material.
 ## Specifics on material:
 NOT "match-making game" in 1.2 (seems fun though)
 NOT the proof of Theorem 31.3 (but definitely the result!)
 NOT 2.4.3 (again, definitely want to know this result, and we have discussed the idea behind it)
 NOT 2.6.5, 2.6.6
 NOT 2.12, 2.13
 The probability knowledge/techniques I've expanded on include conditional probability, independence, law of total probability, Bayes' Theorem, union bound, 1-p bound (or "useful bound"), collision
 I expect you to demonstrate understanding of the key definitions, theorems, and proof techniques.  The assignments are designed to reinforce all of these.  However, exam questions will be written with the understanding of the time limitations.
 The exam is "closed-book," with no notes of any kind allowed.  The advantage of this is that some questions might be very basic.  However, I will expect that you will have not just memorized definitions and theorems, but you can also explain their meaning and apply them.
 ## Chapter 1
 ### Prove security
 #### Definition 11.1 Shannon secrecy
 $(\mathcal{M},\mathcal{K}, Gen, Enc, Dec)$ (a crypto-system) is said to be a private-key encryption scheme that is *Shannon-secret with respect to distribution $D$ over the message space $\mathcal{M}$* if for all $m'\in \mathcal{M}$ and for all $c$,
 $$
 P[k\gets Gen;m\gets D:m=m'|Enc_k(m)=c]=P[m\gets D:m=m']
 $$
 (The adversary cannot learn all of, part of, any letter of, any function of, or any partial information about the plaintext.)
 #### Definition 11.2 Perfect Secrecy
 $(\mathcal{M},\mathcal{K}, Gen, Enc, Dec)$ (a crypto-system) is said to be a private-key encryption scheme that is *perfectly secret* if for all $m_1,m_2\in \mathcal{M}$ and for all $c$:
 $$
 P[k\gets Gen:Enc_k(m_1)=c]=P[k\gets Gen:Enc_k(m_2)=c]
 $$
 (Over the random choice of key, any two messages are equally likely to be encrypted to a given ciphertext $c$.)
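 As a small sanity check of this definition, the sketch below enumerates a 2-bit one-time pad (XOR with a uniformly random key) and computes $P[k\gets Gen:Enc_k(m)=c]$ exactly for every pair $(m,c)$. The names `enc`, `prob_cipher` and `table` are hypothetical helpers introduced for this example, and the one-time pad is used as the example scheme by assumption.

```python
from fractions import Fraction
from itertools import product

n = 2
# Message, key and ciphertext space: all bit strings of length n.
strings = list(product((0, 1), repeat=n))

def enc(k, m):
    """One-time pad: bitwise XOR of key and message."""
    return tuple(ki ^ mi for ki, mi in zip(k, m))

def prob_cipher(m, c):
    """P[k <- Gen : Enc_k(m) = c] for a uniformly random key k."""
    hits = sum(1 for k in strings if enc(k, m) == c)
    return Fraction(hits, len(strings))

# Probability of every ciphertext for every message.
table = {(m, c): prob_cipher(m, c) for m in strings for c in strings}

# Perfect secrecy: the probability does not depend on the message.
print(set(table.values()))
```

 Every entry of the table is the same value, so for any fixed $c$ the probability is identical for all messages, which is exactly the condition in Definition 11.2.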
 #### Theorem 12.3
 A private-key encryption scheme is perfectly secret if and only if it is Shannon secret.
 ## Chapter 2
 ### Efficient Private-key Encryption
 #### Definition 24.7
 A triplet of algorithms $(Gen,Enc,Dec)$ is called an efficient private-key encryption scheme if the following holds.
 1. $k\gets Gen(1^n)$ is a p.p.t. algorithm that, for every $n\in \mathbb{N}$, samples a key $k$.
 2. $c\gets Enc_k(m)$ is a p.p.t. algorithm that, given $k$ and $m\in \{0,1\}^n$, produces a ciphertext $c$.
 3. $m\gets Dec_k(c)$ is a p.p.t. algorithm that, given a ciphertext $c$ and key $k$, produces a message $m\in \{0,1\}^n\cup \{\perp\}$.
 4. For all $n\in \mathbb{N},m\in \{0,1\}^n$
 $$
 P[k\gets Gen(1^n):Dec_k(Enc_k(m))=m]=1
 $$
 ### One-Way functions
 #### Definition 26.1
 A function $f:\{0,1\}^*\to\{0,1\}^*$ is worst-case one-way if the function is:
 1. Easy to compute. There is a p.p.t $C$ that computes $f(x)$ on all inputs $x\in \{0,1\}^*$, and 
 2. Hard to invert. There is no adversary $\mathcal{A}$ such that
 $$
 \forall x,P[\mathcal{A}(f(x))\in f^{-1}(f(x))]=1
 $$
 #### Definition 27.2 Negligible function
 A function $\varepsilon(n)$ is negligible if for every $c$, there exists some $n_0$ such that for all $n>n_0$, $\varepsilon(n)\leq \frac{1}{n^c}$.
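 As a quick worked check of this definition (an illustration, not taken from the lecture): $2^{-n}$ is negligible, because for any fixed $c$ we have $c\log_2 n\leq n$ once $n$ is large enough, hence
 $$
 2^{-n}\leq 2^{-c\log_2 n}=\frac{1}{n^c}\quad\text{for all sufficiently large } n.
 $$
 By contrast, $n^{-10}$ is not negligible: taking $c=11$ gives $n^{-10}>n^{-11}$ for every $n>1$, so no $n_0$ works.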
 #### Definition 27.3 Strong One-Way Function
 A function mapping strings to strings $f:\{0,1\}^*\to \{0,1\}^*$ is a strong one-way function if it satisfies the following two conditions:
 1. Easy to compute. There is a p.p.t $C$ that computes $f(x)$ on all inputs $x\in \{0,1\}^*$, and 
 2. Hard to invert. For every p.p.t. adversary $\mathcal{A}$, there exists a negligible function $\epsilon$ such that for all $n\in\mathbb{N}$,
 $$
 P[x\gets\{0,1\}^n;y\gets f(x):f(\mathcal{A}(1^n,y))=y]\leq \epsilon(n)
 $$
 #### Definition 28.4 (Weak One-Way Function)
 A function mapping strings to strings $f:\{0,1\}^*\to \{0,1\}^*$ is a weak one-way function if it satisfies the following two conditions:
 1. Easy to compute. There is a p.p.t $C$ that computes $f(x)$ on all inputs $x\in \{0,1\}^*$, and 
 2. Hard to invert. There exists a polynomial $q:\mathbb{N}\to\mathbb{N}$ such that for every p.p.t. adversary $\mathcal{A}$ and all sufficiently large $n$,
 $$
 P[x\gets\{0,1\}^n;y\gets f(x):f(\mathcal{A}(1^n,y))=y]\leq 1-\frac{1}{q(n)}
 $$
 #### Notation for prime numbers
 Denote the (finite) set of primes that are smaller than $2^n$ as
 $$
 \Pi_n=\{q|q<2^n\textup{ and } q \textup{ is prime}\}
 $$
 #### Assumption 30.1 (Factoring)
 For every adversary $\mathcal{A}$, there exists a negligible function $\epsilon$ such that
 $$
 P[p\gets \Pi_n;q\gets \Pi_n;N\gets pq:\mathcal{A}(N)\in \{p,q\}]<\epsilon(n)
 $$
 (For the product of two random primes, the probability that any adversary finds a prime factor is negligible.)
 (No polynomial-time algorithm is known for factoring the product of two $n$-bit primes; the best known algorithm runs in time $2^{O(n^{\frac{1}{3}}\log^{\frac{2}{3}}n)}$.)
 #### Theorem 35.1
 For any weak one-way function $f:\{0,1\}^*\to \{0,1\}^*$, there exists a polynomial $m(\cdot)$ such that the function $f':(\{0,1\}^n)^{m(n)}\to(\{0,1\}^*)^{m(n)}$ defined by
 $$
 f'(x_1,x_2,\dots, x_{m(n)})=(f(x_1),f(x_2),\dots, f(x_{m(n)}))
 $$
 is strong one-way.
 ### RSA
 #### Definition 46.7
 A group $G$ is a set of elements with a binary operator $\oplus:G\times G\to G$ that satisfies the following properties
 1. Closure: $\forall a,b\in G, a\oplus b\in G$
 2. Identity: $\exists i\in G$ such that $\forall a\in G, i\oplus a=a\oplus i=a$
 3. Associativity: $\forall a,b,c\in G,(a\oplus b)\oplus c=a\oplus(b\oplus c)$.
 4. Inverse: $\forall a\in G$, there is an element $b\in G$ such that $a\oplus b=b\oplus a=i$
 #### Definition Euler totient function $\Phi(N)$
 If $p$ is prime,
 $$
 \Phi(p)=p-1
 $$
 If $N=pq$ where $p,q$ are distinct primes,
 $$
 \Phi(N)=(p-1)(q-1)
 $$
 #### Theorem 47.10
 $\forall a\in \mathbb{Z}_N^*,a^{\Phi(N)}\equiv 1\mod N$
 #### Corollary 48.11
 $\forall a\in \mathbb{Z}_p^*,a^{p-1}\equiv 1\mod p$.
 #### Corollary 48.12
 $a^x\mod N=a^{x\mod \Phi(N)}\mod N$
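 Theorem 47.10 and Corollary 48.12 can be checked exhaustively on a small modulus. The sketch below is only an illustration: the primes `3` and `11` and the variable names are assumptions chosen to keep the search tiny.

```python
from math import gcd

p, q = 3, 11                      # toy primes, for illustration only
N = p * q
phi = (p - 1) * (q - 1)           # Euler totient of N = pq

# Z_N^*: residues coprime to N; there are exactly phi of them.
units = [a for a in range(1, N) if gcd(a, N) == 1]
assert len(units) == phi

# Theorem 47.10: a^phi(N) = 1 mod N for every a in Z_N^*.
euler_ok = all(pow(a, phi, N) == 1 for a in units)

# Corollary 48.12: exponents can be reduced mod phi(N).
reduce_ok = all(
    pow(a, x, N) == pow(a, x % phi, N)
    for a in units
    for x in range(100)
)

print(euler_ok, reduce_ok)
```

 Note that both checks range only over elements of $\mathbb{Z}_N^*$; the exponent reduction is stated for $a$ coprime to $N$.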
 ## Some other important results
 ### Exponent
 $$
 (1-\frac{1}{n})^n\approx \frac{1}{e}
 $$
 when $n$ is large.
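 A quick numerical check that $(1-\frac{1}{n})^n$ approaches $\frac{1}{e}$ as $n$ grows. The helper name `one_minus_inv_pow` is hypothetical, and floating-point arithmetic is used, so the comparison is only approximate.

```python
import math

def one_minus_inv_pow(n: int) -> float:
    """Evaluate (1 - 1/n)^n in floating point."""
    return (1 - 1 / n) ** n

target = 1 / math.e

# The gap to 1/e shrinks as n grows.
for n in (10, 1_000, 1_000_000):
    value = one_minus_inv_pow(n)
    print(n, value, abs(value - target))
```

 This is the bound that shows up when estimating the chance that $n$ independent trials, each succeeding with probability $\frac{1}{n}$, all fail.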
 ### Primes
 Let $\pi(x)$ be the number of primes less than or equal to $x$.
 #### Theorem 31.3 Chebyshev
 For $x>1$,$\pi(x)>\frac{x}{2\log x}$
 #### Corollary 31.3
 For $2^n>1$, $p(n)>\frac{1}{n}$
 (The probability that a uniformly sampled n-bit integer is prime is greater than $\frac{1}{n}$)
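 The Chebyshev bound can be compared against exact prime counts for small $x=2^n$. This is a sketch under assumptions: `prime_count` is a hypothetical helper using a plain sieve of Eratosthenes, and $\log$ is taken to be base 2, so the bound reads $\pi(2^n)>\frac{2^n}{2n}$.

```python
def prime_count(x: int) -> int:
    """pi(x): number of primes <= x, via a sieve of Eratosthenes."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(x ** 0.5) + 1):
        if sieve[i]:
            # Cross out multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(sieve)

# (n, pi(2^n), Chebyshev lower bound 2^n / (2n)) for small n.
rows = [(n, prime_count(2 ** n), 2 ** n / (2 * n)) for n in range(2, 17)]

for n, count, bound in rows:
    print(n, count, bound, count > bound)
```

 In this range the exact count stays well above the bound, which is consistent with the theorem giving only a lower estimate of the prime density.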
 ### Modular Arithmetic
 #### Extended Euclid Algorithm
 ```python
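 def eea(a: int, b: int) -> tuple[int, int]:
    # Assume a >= b > 0.
    # Return (x, y) such that a*x + b*y = gcd(a, b) = d.
    # When d == 1, x is the modular inverse of a mod b
    # and y is the modular inverse of b mod a.
    if a % b == 0:
        # Base case: gcd(a, b) = b = a*0 + b*1.
        return (0, 1)
    # Recursive case: b*x + (a % b)*y = d, and a % b = a - (a // b)*b,
    # so d = a*y + b*(x - y*(a // b)).
    x, y = eea(b, a % b)
    return (y, x - y * (a // b))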
 ```
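 A typical use of the extended Euclid algorithm is computing modular inverses, for example an RSA decryption exponent. The sketch below is self-contained and uses an iterative variant rather than the recursive routine above; the names `egcd` and `mod_inverse` and the numbers `7` and `40` are assumptions for illustration.

```python
def egcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with a*x + b*y == d == gcd(a, b)."""
    # Invariant: current a == x0*A + y0*B and current b == x1*A + y1*B,
    # where A, B are the original arguments.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def mod_inverse(a: int, n: int) -> int:
    """Return the inverse of a modulo n, if gcd(a, n) == 1."""
    d, x, _ = egcd(a, n)
    if d != 1:
        raise ValueError(f"{a} has no inverse modulo {n}")
    return x % n

# Example: invert 7 modulo 40 (40 = (5-1)*(11-1), the totient of 55).
inv = mod_inverse(7, 40)
print(inv, (7 * inv) % 40)
```

 On Python 3.8 and later, the built-in `pow(a, -1, n)` computes the same inverse and can be used to cross-check the result.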
--- a/pages/CSE442T/_meta.js
+++ b/pages/CSE442T/_meta.js
@@ -0,0 +1,36 @@
 export default {
    Exam_reviews: "Exam reviews",
    CSE442T_L1: "Lecture 1",
    CSE442T_L2: "Lecture 2",
    CSE442T_L3: "Lecture 3",
    CSE442T_L4: "Lecture 4",
    CSE442T_L5: "Lecture 5",
    CSE442T_L6: "Lecture 6",
    CSE442T_L7: "Lecture 7",
    CSE442T_L8: "Lecture 8",
    CSE442T_L9: "Lecture 9",
    CSE442T_L10: "Lecture 10",
    CSE442T_L11: "Lecture 11",
    CSE442T_L12: "Lecture 12",
    CSE442T_L13: "Lecture 13",
    CSE442T_L14: "Lecture 14",
    CSE442T_L15: "Lecture 15",
    CSE442T_L16: "Lecture 16",
    CSE442T_L17: "Lecture 17",
    CSE442T_L18: "Lecture 18",
    CSE442T_L19: "Lecture 19",
    CSE442T_L20: "Lecture 20",
    CSE442T_L21: "Lecture 21",
    CSE442T_L22: {
        display: 'hidden'
    },
    CSE442T_L23: {
        display: 'hidden'
    },
    CSE442T_L24: {
        display: 'hidden'
    },
    index: {
        display: 'hidden'
    }
 }
--- a/pages/CSE442T/index.mdx
+++ b/pages/CSE442T/index.mdx
--- a/pages/_meta.js
+++ b/pages/_meta.js
@@ -21,6 +21,14 @@ export default {
      title: 'Math 4111',
      type: 'page'
    },
    CSE442T: {
      title: 'CSE 442T',
      type: 'page'
    },
    CSE347: {
      title: 'CSE347',
      type: 'page'
    },
    about: {
        display: 'hidden'
    },