Session Overview

Reinforcement Learning (RL) has seen an explosion of work in the last few years, with high-profile results in game playing and the evolution of simulated multi-agent behavior. Many of these successes, while impressive and pushing the state of the art, depend on properties that don’t translate to the real-world challenges practitioners face. First, they rely on the ability to simulate the task at the massive scale needed to train these RL algorithms. Second, the task environment is often fully observable: everything about the state of the world is available to the learning agent at each iteration. For many sequential decision processes, however, there may be no simulator, and the state of the world is only partially observable at any given time.

So what can we use today? Two related approaches to agent-based learning that are practical for real-world use cases are Contextual Bandits and Imitation Learning. Both can be seen as simplifications of the full RL problem that relax certain assumptions: Contextual Bandits reduce the number of environmental states the agent must reason over, and Imitation Learning removes the need to balance online exploitation against exploration. This talk will introduce the formal Contextual Bandit and Imitation Learning problems, how they differ from full RL, what their limitations are, and where they can be used to solve real-world problems.



Alternatives to Reinforcement Learning for Real World Problems

