Learning Value Functions in Deep Policy Gradients using Residual Variance
Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, and Philippe Preux. "Learning value functions in deep policy gradients using residual variance." International Conference on Learning Representations. 2021.