Model-based methods in machine learning aim to speed up the learning process by exploiting an explicit representation of the underlying model. In Reinforcement Learning (RL), classic model-based approaches leverage the available samples to construct an estimate of the underlying environment. There are several advantages to having an explicit model representation: I) learning behaviors is usually faster and more sample efficient; II) prior knowledge and experience can be integrated more easily; III) the model can be flexibly reused for a wide variety of goals and objectives. Model-based approaches also enable counterfactual reasoning (“what would have happened if …”), which is exceedingly difficult without a model (e.g., using value-based approaches), and they make transfer learning easier when the reward (and, to some extent, the dynamics) changes. More generally, the ability to build an internal representation of the environment can be viewed as a hallmark of intelligence; indeed, prediction and intuitive physics often figure in neuroscience, psychology, and cognitive science research into the development of internal representations in the human brain. Finally, model-based methods are of substantial theoretical and practical importance: they have been shown to learn faster in large continuous environments, to provide insights into the way humans behave, and to lie at the core of theoretically efficient/optimal methods for exploration-exploitation in discrete domains. However, model-based algorithms are not without their challenges: constructing accurate models in complex real-world environments can be difficult, and imperfect models can give rise to highly suboptimal behavior. Although recent years have seen substantial advances in generative modeling, prediction, image generation, and other forecasting applications, many of these advances have yet to produce a large impact on reinforcement learning and control.

The aim of this workshop is to investigate questions in model-based reinforcement learning, as well as how tools and ideas from other generative modeling and prediction fields can influence the development of novel decision making and control algorithms. For example, can generative adversarial networks provide an answer to the question of which loss function should be used to fit a model? How can model-free reinforcement learning ideas influence model-based learning while benefiting from its improved efficiency and flexibility? Can we design hybrid approaches that integrate model-free and model-based learning? How can the best innovations in prediction and time series modeling translate into improved reinforcement learning algorithms? Alongside these, we encourage submissions on any topic related to core model-based reinforcement learning. Some of the open questions are: How can we exploit side information? Is it possible to design algorithms for optimal exploration-exploitation in large domains? How can we incorporate safety in model-based approaches? What are the current limits of model-based approaches, and what can we expect in the future? Which classes of environments are we able to represent (e.g., MDPs, POMDPs, and PSRs), and which models (e.g., NNs, RNNs) are best suited to them? Can we design (even theoretically) efficient approaches for particular classes of problems (e.g., linearly-solvable MDPs or the linear-quadratic regulator)?

Speakers

Organizers

Important dates

Schedule

08:30 Introduction and opening remarks
08:40 Marc Deisenroth – Probabilistic Prediction Models for Data-Efficient Reinforcement Learning
09:10 Junhyuk Oh – Value Prediction Network: A Minimal Model for Planning
09:40 Poster session 1 (+ Coffee Break)
11:00 Jess Hamrick – Structured Computation and Representation in Deep Reinforcement Learning
11:30 Ian Osband – Deep Exploration via Randomized Value Functions
12:00 Lunch Break
13:30 Evangelos Theodorou – From Finite to Infinite Dimensional Representations: The Next Frontier in Stochastic Control and AI Research
14:00 Contributed talk 1 – Hybrid Global Search for Sample Efficient Controller Optimization
14:15 Contributed talk 2 – VFunc: A Deep Generative Model for Functions
14:30 Poster session 2 (+ Coffee Break)
15:30 Yann LeCun – Learning World Models with Self-Supervised Learning
16:00 Panel discussion
17:00 End

Accepted Papers

Program Committee

We thank the program committee for shaping this excellent technical program (in alphabetical order): Kavosh Asadi, Ronan Fruit, Gregory Kahn, Sanket Kamthe, Rowan McAllister, Simone Parisi, Marcello Restelli, Samuele Tosatto, Grady Williams.

Contact us

For any questions, you can contact us at pgmrl2018@reinforcement-learning.ml