Google’s AI robots learn from watching movies – just like the rest of us

Google DeepMind’s robotics team is teaching robots to learn the way a human trainee would: by watching a video. In a new paper, the team demonstrates how Google’s RT-2 robots, equipped with the Gemini 1.5 Pro generative AI model, can absorb information from videos to learn how to move around a space and even carry out commands at their destination.

The Gemini 1.5 Pro model’s long context window is what makes it possible to train a robot like a new trainee: the window allows the AI to process large amounts of information at once. The researchers would film a video tour of a designated area, such as a home or office. The robot would then watch the video and learn about its surroundings.