rl-blogs 6
- Why Does Neural TD Converge?
- Function Approximation in RL: From Tables to Linear Models to Neural Networks
- The Beauty of a Simple Proof: TD Learning Without Projection
- Stochastic Gradient Descent: Why Randomness Works
- Why Gradient Descent Works: A Small Mathematical Story
- Why Vanilla Q-Learning Breaks Under Corrupted Rewards