MountainAI

Q-learning and DQN

Value-based reinforcement learning: Q-tables, temporal-difference updates, and deep Q-networks.

Depth levels

L0 Intro (~0 h)

Knows Q-learning estimates action values; has heard of DQN playing Atari.

L1 Basics (~10 h)

Implements tabular Q-learning on GridWorld; understands epsilon-greedy exploration and TD updates.
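
As a rough illustration of what this level covers, the sketch below runs tabular Q-learning with epsilon-greedy exploration on a tiny one-dimensional corridor. The environment, the function names (`epsilon_greedy`, `td_update`, `train_corridor`), and all hyperparameter values are assumptions made for this example, not part of the course material.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so the all-zero initial table does not favor one action.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def td_update(q, s, a, r, s_next, done, alpha, gamma):
    """One Q-learning step: Q(s, a) += alpha * (target - Q(s, a))."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def train_corridor(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                   epsilon=0.1, seed=0):
    """Hypothetical 1-D GridWorld: start in state 0, reward 1 on reaching the last state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = epsilon_greedy(q[s], epsilon, rng)
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0
            td_update(q, s, a, r, s_next, done, alpha, gamma)
            s = s_next
    return q


if __name__ == "__main__":
    for state, row in enumerate(train_corridor()):
        print(state, [round(v, 3) for v in row])
```

The update bootstraps from `max(q[s_next])` rather than from the action actually taken next, which is what makes it off-policy Q-learning rather than SARSA.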

L2 Working (~25 h)

Builds DQN with experience replay and target networks; applies Double DQN, Dueling DQN, Prioritised Replay.
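
To make the pieces named here more concrete, the following sketch shows a minimal replay buffer and the difference between the standard DQN target and the Double DQN target. It deliberately leaves out the neural networks: Q-values are passed in as plain lists, and the names `ReplayBuffer`, `dqn_target`, and `double_dqn_target` are hypothetical helpers chosen for this illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def dqn_target(reward, done, q_target_next, gamma=0.99):
    """Standard DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]
```

Decoupling action selection from evaluation is the mechanism Double DQN uses to reduce the overestimation that the plain `max` introduces when Q-value estimates are noisy.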

L3 Advanced (~35 h)

Understands distributional RL (C51) and Rainbow; analyses overestimation bias; implements multi-step returns.
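
For the multi-step returns mentioned here, one common formulation sums n discounted rewards and then bootstraps from a value estimate at the n-th next state. The helper below is a minimal sketch of that calculation; the name `n_step_target` and its parameters are assumptions for illustration only.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Return r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value.

    `rewards` holds the n rewards observed after the current state, and
    `bootstrap_value` is an estimate such as max_a Q(s_n, a), or 0.0 if the
    episode ended within those n steps.
    """
    g = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

With a single reward in `rewards` this reduces to the usual one-step TD target.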

L4 Research (~70 h)

Contributes to offline RL, model-based RL, or sample-efficient value-based methods.

Resources