
arXiv: https://arxiv.org/abs/2509.15155
This episode of "The AI Research Deep Dive" explores a groundbreaking Google DeepMind paper that tackles a major roadblock in robotics: the "imitation learning ceiling," where a robot trained on human demonstrations cannot improve beyond them. The host explains how the researchers built a two-stage system that lets robots become their own coaches. In the first stage, a foundation model learns from human videos not only how to perform a task but also how to judge progress, by predicting the "steps-to-go" until completion. In the second stage, that learned judgment becomes a self-generated reward signal. Listeners will learn how this signal allows the robot to practice autonomously, improve its skills, and even learn entirely new behaviors for objects it has never seen before, effectively breaking through the imitation barrier.
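To make the second stage concrete, here is a minimal Python sketch of how a "steps-to-go" prediction could be turned into a reward. It assumes the reward at each step is simply the drop in predicted steps-to-go between consecutive observations; the episode description does not give the paper's exact formula, so treat this as an illustration rather than the authors' method. The function name `steps_to_go_rewards` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def steps_to_go_rewards(predicted_steps: Sequence[float]) -> List[float]:
    """Turn a trajectory of predicted steps-to-go into per-step rewards.

    Assumption (not taken from the paper): the reward for a transition is
    the decrease in predicted steps-to-go, so moving closer to completion
    gives a positive reward and moving away gives a negative one.
    """
    rewards = []
    # Pair each prediction with the one that follows it.
    for current, following in zip(predicted_steps, predicted_steps[1:]):
        rewards.append(current - following)
    return rewards


# Hypothetical predictions from the learned model over a five-frame rollout.
predictions = [10.0, 8.0, 9.0, 5.0, 0.0]
rewards = steps_to_go_rewards(predictions)
print(rewards)
print(sum(rewards))
```

Under this assumption the rewards telescope: their sum equals the first prediction minus the last, so a rollout is rewarded for its net progress toward completion no matter how it got there.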