reinforcement-learning-theory 1 TD Learning Is Almost Gradient Descent: A Finite-Time View of Linear TD Jun 11, 2026