Description

Preference Learning from Minimal Human Feedback for Interactive Autonomy

The lack of large robotics datasets is arguably the most important obstacle to robot learning and interactive autonomy. While large pretrained models and algorithms like reinforcement learning from human feedback (RLHF) have led to breakthroughs in other domains such as natural language processing and computer vision, robotics has not experienced a comparable breakthrough, largely due to the high cost of collecting large datasets. In this talk, I will discuss techniques that enable us to train robots from very little human feedback. I will dive into RLHF and describe how active learning methods can make it more data-efficient. Finally, I will propose an alternative type of human feedback, based on language corrections, to further improve both data-efficiency and time-efficiency.
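For readers unfamiliar with the setup, the sketch below illustrates the core loop behind active preference-based reward learning: a reward model, assumed here to be linear in trajectory features, is fit to pairwise comparisons under a Bradley-Terry likelihood, and each new query is the comparison the current model is most uncertain about. Everything in it (the feature dimension, the uncertainty-based query heuristic, the simulated human) is an illustrative assumption, not the specific methods presented in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: trajectories are summarized by feature vectors, and the (hidden)
# true reward is linear in those features. All sizes here are illustrative.
n_features = 4
true_w = rng.normal(size=n_features)

def preference_prob(w, phi_a, phi_b):
    """Bradley-Terry likelihood that trajectory A is preferred over B."""
    return 1.0 / (1.0 + np.exp(-(w @ (phi_a - phi_b))))

def simulate_human(phi_a, phi_b):
    """Noisy human comparison drawn from the hidden true reward."""
    return rng.random() < preference_prob(true_w, phi_a, phi_b)

# Candidate queries: pairs of trajectory feature vectors we could show a human.
candidates = [(rng.normal(size=n_features), rng.normal(size=n_features))
              for _ in range(200)]

w = np.zeros(n_features)   # learned reward parameters
lr, n_queries = 0.5, 30

for _ in range(n_queries):
    # Active query selection: ask about the pair the current model is most
    # uncertain about (predicted preference closest to 50/50).
    probs = [preference_prob(w, a, b) for a, b in candidates]
    i = int(np.argmin([abs(p - 0.5) for p in probs]))
    phi_a, phi_b = candidates.pop(i)

    # Query the (simulated) human, then take one gradient-ascent step on the
    # Bradley-Terry log-likelihood of the observed comparison.
    label = 1.0 if simulate_human(phi_a, phi_b) else 0.0
    p = preference_prob(w, phi_a, phi_b)
    w += lr * (label - p) * (phi_a - phi_b)

cos = true_w @ w / (np.linalg.norm(true_w) * np.linalg.norm(w))
print(f"cosine similarity to true reward after {n_queries} queries: {cos:.2f}")
```

The point of the uncertainty heuristic is the data-efficiency claim in the abstract: instead of labeling comparisons at random, the learner spends each of its few human queries where the answer is least predictable, so far fewer comparisons are needed to recover the reward.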

Instructor's Bio

Erdem Bıyık, PhD

Assistant Professor of Computer Science | Lead of the Learning and Interactive Robot Autonomy Lab (Lira Lab) | University of Southern California

Erdem Bıyık is an assistant professor in the Thomas Lord Department of Computer Science at the University of Southern California, and, by courtesy, in the Ming Hsieh Department of Electrical and Computer Engineering. He leads the Learning and Interactive Robot Autonomy Lab (Lira Lab). Prior to joining USC, he was a postdoctoral researcher at UC Berkeley's Center for Human-Compatible Artificial Intelligence. He received his Ph.D. and M.Sc. degrees in Electrical Engineering from Stanford University, working at the Stanford Artificial Intelligence Lab (SAIL), and his B.Sc. degree in Electrical and Electronics Engineering from Bilkent University in Ankara, Türkiye. During his studies, he worked in the research departments of Google and Aselsan. Erdem was an HRI 2022 Pioneer and received an honorable mention award for his work at HRI 2020. His work has been published in premier robotics and artificial intelligence journals and conferences such as IJRR, CoRL, RSS, and NeurIPS.