Tristan GreeneEditor, Neural by TNW
Tristan is a futurist covering human-centric artificial intelligence advances, quantum computing, STEM, physics, and space stuff. Pronouns: He/him
What if developing a 3D gaming world were as easy as snapping pics with your phone? Nvidia researchers recently developed an AI system capable of predicting a complete 3D model from any 2D image.
Called “DIB-R,” the AI takes a 2D image of any object – a picture of a bird, for example – and predicts what it would look like in three dimensions. This prediction includes lighting, texture, and depth.
DIB-R stands for differentiable interpolation-based renderer, meaning it takes what it “sees” (a 2D image) and makes inferences based on a 3D “understanding” of the world. This is strikingly similar to how humans translate the 2D input from our eyes into a 3D mental image.
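The key word is “differentiable”: because every step of the rendering pipeline can be differentiated, an error measured in the 2D image can be traced back to the 3D parameters that produced it. The toy sketch below illustrates that idea only, and is not Nvidia’s DIB-R code; the one-parameter “scene” (an object’s depth), the pinhole-projection “renderer,” and all names are illustrative assumptions.

```python
# Toy illustration of a differentiable renderer: back-propagate a 2D
# image error to the 3D parameter (depth) that produced it.
FOCAL = 1.0  # assumed focal length of a pinhole camera

def render(z):
    """'Render' an object at depth z to its apparent 2D size (pinhole model)."""
    return FOCAL / z

def loss(z, target_size):
    """Squared error between the rendered size and the observed 2D size."""
    return (render(z) - target_size) ** 2

def grad(z, target_size):
    """Analytic gradient of the loss w.r.t. depth: d/dz (f/z - t)^2."""
    return 2.0 * (render(z) - target_size) * (-FOCAL / z ** 2)

def fit_depth(target_size, z=5.0, lr=2.0, steps=500):
    """Recover a 3D depth from a 2D observation by gradient descent."""
    for _ in range(steps):
        z -= lr * grad(z, target_size)
    return z

# An object observed at apparent size 0.5 sits at depth FOCAL / 0.5;
# the fitted depth should approach 2.0.
fitted = fit_depth(0.5)
```

A real system like DIB-R optimizes thousands of mesh vertices, colors, and lighting parameters against full images rather than one scalar, but the mechanism is the same: gradients flow from the 2D output back into the 3D scene description.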
According to Nvidia, this research has numerous implications for the field of robotics:
For an autonomous robot to interact safely and efficiently with its environment, it must be able to sense and understand its surroundings. DIB-R could potentially improve those depth perception capabilities.
With further development, the researchers hope to expand DIB-R into what would essentially be a virtual reality renderer. One day, the team hopes, such a system will make it possible for the AI to create fully immersive 3D worlds in milliseconds using only photographs:
Sanja Fidler, Nvidia’s director of AI and coauthor on the team’s paper, told VentureBeat’s Khari Johnson:
Imagine you can just take a photo and out comes a 3D model, which means that you can now look at that scene that you have taken a picture of [from] all sorts of different viewpoints. You can go inside it potentially, view it from different angles — you can take old photographs in your photo collection and turn them into a 3D scene and inspect them like you were there, basically.
The ability to render the world from photographs could lead to amazing content creation pipelines. Technology such as Google Maps could become more immersive than ever. And, possibly, creatives more skilled at photography or painting than coding could leave the heavy development work to the machines.
Imagine if making huge open-world games such as Skyrim and Grand Theft Auto, the kind traditionally reserved for companies with hundreds of staff members, were something a handful of creatives and an AI could handle on their own.