[2025-10-11 22:54:20,323][__main__][INFO] - Training for 50000 timesteps with NormalQNetwork and NormalReplayBuffer
[2025-10-11 22:54:29,346][core][INFO] - Step: 2000, Eval mean: 9.2, Eval std: 0.6
[2025-10-11 22:54:39,848][core][INFO] - Step: 4000, Eval mean: 9.2, Eval std: 0.6
[2025-10-11 22:54:50,222][core][INFO] - Step: 6000, Eval mean: 9.2, Eval std: 0.6
[2025-10-11 22:55:00,611][core][INFO] - Step: 8000, Eval mean: 9.2, Eval std: 0.6
[2025-10-11 22:55:11,300][core][INFO] - Step: 10000, Eval mean: 9.4, Eval std: 0.66332495807108
[2025-10-11 22:55:22,394][core][INFO] - Step: 12000, Eval mean: 9.4, Eval std: 0.66332495807108