Exam 2 Review

Reductions

We say that a problem A reduces to a problem B if there is a polynomial time reduction function f such that for all x, x \in A \iff f(x) \in B.

To prove a reduction, we need to show that the reduction function f:

  1. runs in polynomial time
  2. x \in A \iff f(x) \in B.

Useful results from reductions

  1. B is at least as hard as A if A \leq B.
  2. If we can solve B in polynomial time, then we can solve A in polynomial time.
  3. If we want to solve problem A, and we already know an efficient algorithm for B, then we can use the reduction A \leq B to solve A efficiently (see the sketch after this list).
  4. If we want to show that B is NP-hard, we can do this by showing that A \leq B for some known NP-hard problem A.
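
To make result 3 concrete, here is a minimal sketch in Python. The functions `reduction_f` and `solve_B` are hypothetical placeholders for any polynomial-time reduction f and any polynomial-time solver for B:

```python
# Sketch of result 3: if A <= B via reduction f, then a polynomial-time
# solver for B yields a polynomial-time solver for A.
# `reduction_f` and `solve_B` are hypothetical placeholders.

def solve_A(x, reduction_f, solve_B):
    """Decide whether x is in A using the fact that x in A <=> f(x) in B."""
    return solve_B(reduction_f(x))
```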

P is the class of problems that can be solved in polynomial time. NP is the class of problems that can be verified in polynomial time.

We know that P \subseteq NP.

NP-complete problems

A problem is NP-complete if it is in NP and it is also NP-hard.

NP

A problem is in NP if

  1. there is a polynomial size certificate for the problem, and
  2. there is a polynomial time verifier for the problem that takes the certificate and checks whether it is a valid solution.
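
As a concrete example, here is a minimal Python sketch of both pieces for Independent Set (defined in the list of NP-hard problems below): the certificate is the claimed set of k vertices, which has polynomial size, and the verifier checks in polynomial time that no two of them are adjacent.

```python
from itertools import combinations

def verify_independent_set(graph, k, certificate):
    """Polynomial-time verifier for Independent Set.

    graph: dict mapping each vertex to the set of its neighbors
    k: target set size
    certificate: the claimed independent set (polynomial size)
    """
    if len(certificate) < k:
        return False
    # No two vertices in the certificate may be adjacent: O(|certificate|^2) checks.
    return all(v not in graph[u] for u, v in combinations(certificate, 2))

# Example: vertices 1 and 3 are non-adjacent, so {1, 3} is a valid certificate.
graph = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(verify_independent_set(graph, 2, {1, 3}))  # True
```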

NP-hard

A problem is NP-hard if every problem in NP can be reduced to it in polynomial time.

List of known NP-hard problems:

  1. 3-SAT (or SAT):
    • Statement: Given a boolean formula in CNF with at most 3 literals per clause, is there an assignment of truth values to the variables that makes the formula true?
  2. Independent Set:
    • Statement: Given a graph G and an integer k, does G contain a set of k vertices such that no two vertices in the set are adjacent? (A reduction relating this problem to Vertex Cover is sketched after this list.)
  3. Vertex Cover:
    • Statement: Given a graph G and an integer k, does G contain a set of k vertices such that every edge in G is incident to at least one vertex in the set?
  4. 3-coloring:
    • Statement: Given a graph G, can each vertex be assigned one of 3 colors such that no two adjacent vertices have the same color?
  5. Hamiltonian Cycle:
    • Statement: Given a graph G, does G contain a cycle that visits every vertex exactly once?
  6. Hamiltonian Path:
    • Statement: Given a graph G, does G contain a path that visits every vertex exactly once?
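
As a worked example relating problems 2 and 3 above, here is a minimal Python sketch of a reduction from Independent Set to Vertex Cover. It uses the standard fact that S is an independent set of G if and only if the complement V - S is a vertex cover of G, so (G, k) is a yes-instance of Independent Set exactly when (G, |V| - k) is a yes-instance of Vertex Cover.

```python
def independent_set_to_vertex_cover(graph, k):
    """Reduction f from Independent Set to Vertex Cover.

    graph: dict mapping each vertex to the set of its neighbors
    Correctness: S is an independent set iff V - S is a vertex cover,
    so (G, k) in Independent Set <=> (G, |V| - k) in Vertex Cover.
    Runtime: clearly polynomial (it only computes |V| - k).
    """
    return graph, len(graph) - k
```

Note that the code only witnesses that f is easy to compute; proving the reduction still requires arguing both directions of the iff.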

Approximation Algorithms

  • Consider optimization problems whose decision problem variant is NP-hard. Unless P=NP, finding an optimal solution to these problems cannot be done in polynomial time.
  • In approximation algorithms, we make a trade-off: we're willing to accept sub-optimal solutions in exchange for polynomial runtime.
  • The Approximation Ratio of our algorithm is the worst-case ratio of our solution to the optimal solution.
  • For minimization problems, this ratio is \text{Approximation Ratio} = \frac{\text{Our Solution}}{\text{Optimal Solution}}, since our solution will be at least as large as OPT.
  • For maximization problems, this ratio is \text{Approximation Ratio} = \frac{\text{Optimal Solution}}{\text{Our Solution}}, since our solution will be at most as large as OPT.
  • If you are given an algorithm and need to show that it has some desired approximation ratio, there are a few approaches.
  • In recitation, we saw Max-Subset Sum. We found upper bounds on the optimal solution and showed that the given algorithm would always give a solution with value at least half of the upper bound, giving our approximation ratio of 2.
  • In lecture, you saw the Vertex Cover 2-approximation. Here, you would select any uncovered edge (u, v) and add both u and v to the cover. We argued that at least one of u or v must be in the optimal cover, as the edge must be covered, so at every step we added at least one vertex from an optimal solution, and potentially one extra. So, the size of our cover could not be any larger than twice the optimal.
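
A minimal Python sketch of that Vertex Cover 2-approximation, assuming the graph is given as a list of edges:

```python
def vertex_cover_2_approx(edges):
    """2-approximation for Vertex Cover: repeatedly pick an
    uncovered edge (u, v) and add both endpoints to the cover."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge (u, v) is still uncovered
            cover.add(u)
            cover.add(v)
    return cover

# Example: the path 1-2-3-4. The algorithm returns {1, 2, 3, 4},
# while an optimal cover is {2, 3}; the ratio is exactly 2 here.
print(vertex_cover_2_approx([(1, 2), (2, 3), (3, 4)]))
```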

Randomized Algorithms

Sometimes, we can get better expected performance from an algorithm by introducing randomness.

We trade the guaranteed runtime and solution quality of a deterministic algorithm for expected runtime and solution quality from a randomized algorithm.

We can use various bounds and tricks to calculate and amplify the probability of success.

Chernoff Bound

Statement:


Pr[X < (1-\delta)E[X]] \leq e^{-\frac{\delta^2 E[X]}{2}}

Requirements:

  • X is the sum of n independent random variables
  • In recitation, you used the Chernoff bound to bound the probability of getting fewer than d good partitions: since each partition being good is independent, the quality of one partition does not affect the quality of the next.
  • Usage: If you have some probability Pr[X < \text{something}] that you want to bound, you must find E[X] and a value for \delta such that (1-\delta)E[X] = \text{something}. You can then plug \delta and E[X] into the Chernoff bound.
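
A worked instance of that recipe in Python, with hypothetical numbers (X = heads among 100 independent fair coin flips, so E[X] = 50, and we bound Pr[X < 40]):

```python
import math

# Hypothetical numbers: X = number of heads in 100 independent fair
# coin flips, so E[X] = 50; we want to bound Pr[X < 40].
expected = 100 * 0.5                   # E[X] = 50
delta = 1 - 40 / expected              # (1 - delta) * 50 = 40  =>  delta = 0.2
bound = math.exp(-(delta ** 2) * expected / 2)
print(bound)                           # e^{-1} ~ 0.368, so Pr[X < 40] <= 0.368
```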

Markov's Inequality

Statement:


Pr[X \geq a] \leq \frac{E[X]}{a}

Requirements:

  • X is a non-negative random variable
  • No assumptions about independence
  • Usage: If you have some probability Pr[X \geq \text{something}] that you want to bound, you must find E[X] and choose a = \text{something}. You can then plug a and E[X] into Markov's inequality.
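
For example (numbers hypothetical): if X is a non-negative random variable with E[X] = 10, then taking a = 50 gives Pr[X \geq 50] \leq \frac{E[X]}{a} = \frac{10}{50} = \frac{1}{5}.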

Union Bound

Statement:


Pr[\bigcup_{i=1}^n e_i] \leq \sum_{i=1}^n Pr[e_i]
  • Conceptually, it says that the probability that at least one event in a collection occurs is no more than the sum of the probabilities of the individual events.
  • Usage: To bound the probability of some bad event e that is the union of events e_i, we can use the union bound to sum up the probabilities of the e_i and use that sum to bound Pr[e].
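
For example (numbers hypothetical): if an algorithm performs 100 steps and each step fails with probability at most 10^{-4}, then Pr[\text{any step fails}] \leq 100 \cdot 10^{-4} = 10^{-2}, with no independence assumption needed.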

Probabilistic Boosting via Repeated Trials

  • If we want to reduce the probability of some bad event e to some value \delta, we can run the algorithm repeatedly and take the majority vote of the outcomes.
  • Assume we run the algorithm k times, and the probability of success is \frac{1}{2} + \epsilon.
  • The probability that all k trials fail is at most (\frac{1}{2}-\epsilon)^k.
  • The probability that the majority vote of k runs is wrong is the same as the probability that at least \frac{k}{2}+1 of the trials fail.
  • So, the probability is
    $$\begin{aligned} Pr[\text{majority fails}] &= \sum_{i=\frac{k}{2}+1}^{k}\binom{k}{i}(\frac{1}{2}-\epsilon)^i(\frac{1}{2}+\epsilon)^{k-i} \\ &\leq \binom{k}{\frac{k}{2}+1}(\frac{1}{2}-\epsilon)^{\frac{k}{2}+1} \end{aligned}$$
  • If we want this probability to be at most some \delta, we solve for k in the inequality \binom{k}{\frac{k}{2}+1}(\frac{1}{2}-\epsilon)^{\frac{k}{2}+1} \leq \delta.
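
A minimal Python sketch of this boosting scheme; `run_once` is a hypothetical stand-in for any randomized decision algorithm that is correct with probability \frac{1}{2}+\epsilon:

```python
import random

def majority_boost(run_once, k):
    """Run a randomized decision algorithm k times and return the
    majority vote. If each run is correct with probability 1/2 + eps,
    the majority is wrong with probability exponentially small in k."""
    votes = sum(1 for _ in range(k) if run_once())
    return votes > k / 2

# Hypothetical demo: an oracle for the answer True that is correct
# with probability 0.6 (eps = 0.1).
noisy_oracle = lambda: random.random() < 0.6
print(majority_boost(noisy_oracle, 1001))  # True with overwhelming probability
```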

Online Algorithms

  • We make decisions on the fly, without knowing the future.
  • The offline optimum is the optimal solution that knows the future.
  • The competitive ratio of an online algorithm is the worst-case ratio of the cost of the online algorithm to the cost of the offline optimum: \text{Competitive Ratio} = \frac{C_{online}}{C_{offline}}. (When the offline problem is NP-hard, an online algorithm for the problem is also an approximation algorithm.)
  • We do case-by-case analysis to show that the competitive ratio is at most some value, just like in approximation ratio proofs.