rl-blogs 14

Why TD Learning Likes the 2-Norm: Mean Directions, Inner Products, and Bellman Geometry Jun 13, 2026
Linear TD vs Neural TD: A Tale of Two Geometries Jun 12, 2026
From Snail Trails to Robust POMDPs: Safe Learning with Hidden Monsters Jun 11, 2026
TD Learning Is Almost Gradient Descent: A Finite-Time View of Linear TD Jun 11, 2026
Bellman Operators and Bellman Optimality Jun 10, 2026
Discounted vs. Average Reward Reinforcement Learning Jun 9, 2026
Concentration Inequalities: A Researcher's Guide from Markov to Freedman Jun 9, 2026
The One-Pixel Attack: Fooling a Neural Network by Changing One Pixel Jun 4, 2026
Why Does Neural TD Converge? Jun 2, 2026
Function Approximation in RL: From Tables to Linear Models to Neural Networks May 27, 2026
The Beauty of a Simple Proof: TD Learning Without Projection May 26, 2026
Stochastic Gradient Descent: Why Randomness Works May 13, 2026
Why Gradient Descent Works: A Small Mathematical Story May 12, 2026
Why Vanilla Q-Learning Breaks Under Corrupted Rewards Apr 12, 2026