How DeepMind’s New AI Predicts What It Cannot See
Two Minute Papers explores DeepMind's D4RT, an AI system for full 4D scene reconstruction: three spatial dimensions plus time. A single transformer model estimates depth, motion, and camera pose together, removing the need for multiple specialized AIs. D4RT can track objects through occlusion, predicting where they are even when they are hidden from view, and it runs up to 300 times faster than previous methods. Its output is a point cloud, which prioritizes geometric accuracy over photorealism or easy editing, but the system still marks a significant step toward creating dynamic digital worlds.
https://youtube.com/watch?v=ssbHkYB0jYM