diff --git a/content/CSE510/CSE510_L17.md b/content/CSE510/CSE510_L17.md
index 86cf02b..332c0c2 100644
--- a/content/CSE510/CSE510_L17.md
+++ b/content/CSE510/CSE510_L17.md
@@ -10,3 +10,56 @@
 - Explainability
 - Super-human performance in practice
+### Deterministic Environment: Cross-Entropy Method
+
+#### Stochastic Optimization
+
+Abstract away optimal control/planning as an optimization over the action sequence:
+
+$$
+a_1,\ldots, a_T =\argmax_{a_1,\ldots, a_T} J(a_1,\ldots, a_T)
+$$
+
+or, writing $A$ for the whole action sequence $a_1,\ldots, a_T$:
+
+$$
+A=\argmax_{A} J(A)
+$$
+
+Simplest method, guess and check (the "random shooting method"):
+
+- Pick $A_1, A_2, \ldots, A_n$ from some distribution (e.g. uniform)
+- Choose $A_i$ based on $\argmax_i J(A_i)$
+
+#### Cross-Entropy Method with continuous-valued inputs
+
+1. Sample $A_1, A_2, \ldots, A_n$ from some distribution $p(A)$
+2. Evaluate $J(A_1), J(A_2), \ldots, J(A_n)$
+3. Pick the _elites_ $A_1, A_2, \ldots, A_m$ with the highest $J(A_i)$, where $m