Latent Variable Representation for Reinforcement Learning

1 Google Research, Brain Team; 2 University of Texas at Austin; 3 University of Alberta; 4 UC Berkeley; 5 Harvard University; 6 Northwestern University

*indicates equal contribution.
arXiv 2022

Abstract

Deep latent variable models have achieved significant empirical success in model-based reinforcement learning (RL) due to their expressiveness in modeling complex transition dynamics. On the other hand, it remains unclear, both theoretically and empirically, how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of RL. In this paper, we provide a representation view of latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle in the face of uncertainty for exploration. In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models. Theoretically, we establish the sample complexity of the proposed approach in both the online and offline settings. Empirically, we demonstrate superior performance over current state-of-the-art algorithms across various benchmarks.
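
Below is a minimal sketch, not the authors' implementation, of the kind of UCB-style exploration bonus the abstract refers to: an elliptical bonus computed on features phi(s, a) extracted from a learned latent variable model. The function name, the feature map, and the hyperparameters beta and reg are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def ucb_bonus(phi_batch, phi_query, beta=1.0, reg=1.0):
        """Elliptical UCB bonus on latent-model features.
        phi_batch: (n, d) features of previously observed (s, a) pairs.
        phi_query: (m, d) features of the (s, a) pairs to score.
        Returns beta * sqrt(phi^T Lambda^{-1} phi) for each query row."""
        d = phi_batch.shape[1]
        # Regularized empirical covariance: Lambda = sum_i phi_i phi_i^T + reg * I
        cov = phi_batch.T @ phi_batch + reg * np.eye(d)
        cov_inv = np.linalg.inv(cov)
        # Quadratic form phi^T Lambda^{-1} phi, one value per query feature
        quad = np.einsum('md,de,me->m', phi_query, cov_inv, phi_query)
        return beta * np.sqrt(np.maximum(quad, 0.0))

Adding such a bonus to a value estimate gives an optimistic target for online exploration; subtracting it gives the pessimistic counterpart used in offline settings.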



Citation



    @article{ren2022latent,
      title={Latent Variable Representation for Reinforcement Learning},
      author={Ren, Tongzheng and Xiao, Chenjun and Zhang, Tianjun and Li, Na and Wang, Zhaoran and Sanghavi, Sujay and Schuurmans, Dale and Dai, Bo},
      journal={arXiv preprint arXiv:2212.08765},
      year={2022}
    }
This webpage template was recycled from here.