Model-based methods in machine learning aim to speed up the learning process by exploiting an explicit representation of the underlying model. In Reinforcement Learning (RL), classic model-based approaches leverage the available samples to construct an estimate of the underlying environment. There are several advantages to having an explicit model representation: I) learning behaviors is usually faster and more sample efficient; II) prior knowledge and experience can be integrated more easily; III) the model can be flexibly reused for a wide variety of goals and objectives. Model-based approaches can also enable counterfactual reasoning (“what would have happened if …”), which is exceedingly difficult without a model (e.g., using value-based approaches). They also enable easier transfer learning when the reward (and, to some extent, the dynamics) changes. More generally, the ability to build an internal representation of the environment can be viewed as a hallmark of intelligence. Indeed, prediction and intuitive physics often figure in neuroscience, psychology, and cognitive science research into the development of internal representations in the human brain. Finally, model-based methods are of substantial theoretical and practical importance, since they have been shown to learn faster in large continuous environments, to provide insights into the way humans behave, and to be at the core of theoretically efficient/optimal methods for exploration-exploitation in discrete domains. However, model-based algorithms are not without their challenges: constructing accurate models in complex real-world environments can be difficult, and imperfect models can give rise to highly suboptimal behavior. Although recent years have seen substantial advances in generative modeling, prediction, image generation, and other types of forecasting applications, many of these advances have yet to produce a large impact on reinforcement learning and control.

The aim of this workshop is to investigate questions in model-based reinforcement learning, as well as how tools and ideas from other generative modelling and prediction fields can influence the development of novel decision-making and control algorithms. For example, can generative adversarial networks provide an answer to the question of which loss function should be used to fit a model? How can model-free reinforcement learning ideas influence model-based learning while benefiting from its improved efficiency and flexibility? Can we design hybrid approaches that integrate model-free and model-based learning? How can the best innovations in prediction and time series modelling translate into improved reinforcement learning algorithms? Alongside that, we encourage submissions on any topic related to core model-based reinforcement learning. Some of the open questions are: How can we exploit side information? Is it possible to design algorithms for optimal exploration-exploitation in large domains? How can we incorporate safety in model-based approaches? What are the current limits of model-based approaches and what can we expect in the future? Which classes of environments are we able to represent (e.g., MDPs, POMDPs, and PSRs)? And which models are suitable (e.g., NNs, RNNs)? Can we design efficient (even theoretically) approaches for particular classes of problems (e.g., linearly-solvable MDPs or linear-quadratic regulators)?

## Speakers

- Marc Deisenroth (Imperial College London)
- Yann LeCun (Facebook & NYU)
- Junhyuk Oh (University of Michigan)
- Jess Hamrick (DeepMind)
- Ian Osband (DeepMind)
- Evangelos Theodorou (Georgia Institute of Technology)

## Organizers

- Matteo Pirotta (INRIA)
- Roberto Calandra (UC Berkeley)
- Sergey Levine (UC Berkeley)
- Martin Riedmiller (Google DeepMind)
- Alessandro Lazaric (Facebook)

## Important dates

- Submission deadline: ~~04 June 2018~~ (Anywhere on Earth)
- Notification: **19 June 2018**
- Camera ready: **9 July 2018**
- Workshop: **15 July 2018**

## Schedule

| Time | Event |
|---|---|
| 08:30 | Introduction and opening remarks |
| 08:40 | Marc Deisenroth – Probabilistic Prediction Models for Data-Efficient Reinforcement Learning |
| 09:10 | Junhyuk Oh – Value Prediction Network: A Minimal Model for Planning |
| 09:40 | Poster session 1 (+ Coffee Break) |
| 11:00 | Jess Hamrick – Structured Computation and Representation in Deep Reinforcement Learning |
| 11:30 | Ian Osband – Deep Exploration via Randomized Value Functions |
| 12:00 | Lunch Break |
| 13:30 | Evangelos Theodorou – From Finite to Infinite Dimensional Representations: The Next Frontier in Stochastic Control and AI Research |
| 14:00 | Contributed talk 1 – Hybrid Global Search for Sample Efficient Controller Optimization |
| 14:15 | Contributed talk 2 – VFunc: A Deep Generative Model for Functions |
| 14:30 | Poster session 2 (+ Coffee Break) |
| 15:30 | Yann LeCun – Learning World Models with Self-Supervised Learning |
| 16:00 | Panel discussion |
| 17:00 | End |

## Accepted Papers

- Sample-Efficient Deep RL with Generative Adversarial Tree Search

  Kamyar Azizzadenesheli, Zachary Lipton, Animashree Anandkumar, Weitang Liu

- Imitating Latent Policies from Observation

  Ashley Edwards, Himanshu Sahni, Yannick Schroecker, Charles Isbell

- Model-based Reinforcement Learning with Non-linear Expectation Models and Stochastic Environments

  Yi Wan, Muhammad Zaheer, Martha White, Richard Sutton

- Hybrid Global Search for Sample Efficient Controller Optimization

  Akshara Rai, Rika Antonova, Danica Kragic, Christopher G. Atkeson

- Feature Selection by Singular Value Decomposition for Reinforcement Learning

  Bahram Behzadian, Marek Petrik

- As Expected? An Analysis of Distributional Reinforcement Learning

  Clare Lyle, Marc G. Bellemare

- The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces

  G. Zacharias Holland, Erik Talvitie, Michael Bowling

- Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

  Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee

- Equivalence Between Wasserstein and Value-Aware Loss for Model-based Reinforcement Learning

  Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman

- Algorithmic Framework for Model-based Reinforcement Learning with Theoretical Guarantees

  Huazhe (Harry) Xu, Yuanzhi Li, Yuandong Tian, Trevor Darrell, Tengyu Ma

- Learning and Querying Fast Generative Models for Reinforcement Learning

  Lars Buesing, Théophane Weber, Sébastien Racanière, S. M. Ali Eslami, Danilo Rezende, David P. Reichert, Fabio Viola, Frédéric Besse, Karol Gregor, Demis Hassabis, Daan Wierstra

- Navigation and planning in latent maps

  Baris Kayalibay, Atanas Mirchev, Maximilian Soelch, Patrick van der Smagt, Justin Bayer

- Generalizing Value Estimation over Timescale

  Craig Sherstan, James MacGlashan, Patrick M. Pilarski

- Failure Modes of Variational Inference for Decision Making

  Carlos Riquelme, Matthew Johnson, Matthew D. Hoffman

- Task-Relevant Embeddings for Robust Perception in Reinforcement Learning

  Eric Liang, Roy Fox, Joseph E. Gonzalez, Ion Stoica

- Planning in Dynamic Environments with Conditional Autoregressive Models

  Johanna Hansen, Kyle Kastner, Aaron Courville, Gregory Dudek

- VFunc: A Deep Generative Model for Functions

  Philip Bachman, Riashat Islam, Alessandro Sordoni, Zafarali Ahmed

- SULFR: Simulation of Urban Logistics for Reinforcement

  Guillaume Bono, Jilles Steeve Dibangoye, Laëtitia Matignon, Florian Pereyron, Olivier Simonin

- Iterative Model-Fitting and Local Controller Optimization - Towards a Better Understanding of Convergence Properties

  Manuel Wüthrich, Bernhard Schölkopf

- Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

  Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine

## Program Committee

We thank the program committee for shaping this excellent technical program (in alphabetical order): Kavosh Asadi, Ronan Fruit, Gregory Kahn, Sanket Kamthe, Rowan McAllister, Simone Parisi, Marcello Restelli, Samuele Tosatto, Grady Williams.

## Contact us

For any questions, you can contact us at pgmrl2018@reinforcement-learning.ml