Tuan Dam: Optimal Regret Bounds via Low-Rank Structured Variation in Non-Stationary Reinforcement Learning. NeurIPS 2025