breaking update

Trance-0
2025-09-16 10:29:48 -05:00
parent 03baf25685
commit 5e7b8a141d
4 changed files with 93 additions and 3 deletions

@@ -1,4 +1,4 @@
-# CSE510 Lecture 6
+# CSE510 Deep Reinforcement Learning (Lecture 6)
## Active reinforcement learning
@@ -242,6 +242,6 @@ From the example we see that it can take many learning trials for the final rewa
$$
5. Goto 2
-> [!NOTES]
+> [!NOTE]
>
> Compared with Q-learning, SARSA (on-policy) usually takes more "safer" actions.