theory 2

Why Gradient Descent Works: A Small Mathematical Story
May 12, 2026

Why Vanilla Q-Learning Breaks Under Corrupted Rewards
Apr 12, 2026