AI work tends to focus on how to optimize a specified reward function, but rewards that lead to the desired behavior consistently are not so easy to specify. Rather than optimizing specified reward, which is already hard, robots have the much harder job of optimizing intended reward. While the specified reward does not have as much information as we make our robots pretend, the good news is that humans constantly leak information about what the robot should optimize. In this talk, we will explore how to read the right amount of information from different types of human behavior -- and even the lack thereof.
Learning outcomes: After participating, you should be able to articulate the common pitfalls we face in defining an AI reward, loos, or objective function. You should also develop a basic understanding of the main algorithmic tools we have for avoiding these pitfalls.
Target audience: Participants with some AI experience, be in supervised or reinforcement learning.
Abstract & Bio
Learning Intended Reward Functions: Extracting all the Right Information from All the Right Places