From 5ac36745e265fd46837f59bf4a711a47d5e5e0e5 Mon Sep 17 00:00:00 2001
From: Zheyuan Wu <60459821+Trance-0@users.noreply.github.com>
Date: Thu, 23 Oct 2025 10:58:10 -0500
Subject: [PATCH] Update CSE510_L17.md

---
 content/CSE510/CSE510_L17.md | 53 ++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/content/CSE510/CSE510_L17.md b/content/CSE510/CSE510_L17.md
index 86cf02b..332c0c2 100644
--- a/content/CSE510/CSE510_L17.md
+++ b/content/CSE510/CSE510_L17.md
@@ -10,3 +10,56 @@
 - Explainability
 - Super-human performance in practice

+### Deterministic Environment: Cross-Entropy Method
+
+#### Stochastic Optimization
+
+abstract away optimal control/planning:
+
+$$
+a_1,\ldots, a_T =\argmax_{a_1,\ldots, a_T} J(a_1,\ldots, a_T)
+$$
+
+$$
+A=\argmax_{A} J(A)
+$$
+
+Simplest method: guess and check: "random shooting method"
+
+- pick $A_1, A_2, ..., A_n$ from some distribution (e.g. uniform)
+- Choose $A_i$ based on $\argmax_i J(A_i)$
+
+#### Cross-Entropy Method with continuous-valued inputs
+
+1. sample $A_1, A_2, ..., A_n$ from some distribution $p(A)$
+2. evaluate $J(A_1), J(A_2), ..., J(A_n)$
+3. pick the _elites_ $A_1, A_2, ..., A_m$ with the highest $J(A_i)$, where $m