Tristan GreeneEditor, Neural by TNW
Tristan is a futurist covering human-centric artificial intelligence advances, quantum computing, STEM, physics, and space stuff. Pronouns: He/him
What if developing a 3D gaming world were as easy as snapping pics with your phone? Nvidia researchers recently developed an AI system capable of predicting a complete 3D model from any 2D image.
Called “DIB-R,” the AI takes a 2D image of any object – a picture of a bird, for example – and predicts what it would look like in three dimensions. This prediction includes lighting, texture, and depth.
DIB-R stands for differentiable interpolation-based renderer, meaning it takes what it “sees” (a 2D image) and makes inferences based on a 3D “understanding” of the world. This is strikingly similar to how humans translate the 2D input from our eyes into a 3D mental image.
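The key word is “differentiable”: because every step of the rendering pipeline can be differentiated, an error measured in the 2D image can be traced back to the 3D parameters that produced it. The toy sketch below illustrates that idea only, and is not Nvidia’s DIB-R code; the one-parameter “scene” (an object’s depth), the pinhole-projection “renderer,” and all names are illustrative assumptions.

```python
# Toy illustration of a differentiable renderer: back-propagate a 2D
# image error to the 3D parameter (depth) that produced it.
FOCAL = 1.0  # assumed focal length of a pinhole camera

def render(z):
    """'Render' an object at depth z to its apparent 2D size (pinhole model)."""
    return FOCAL / z

def loss(z, target_size):
    """Squared error between the rendered size and the observed 2D size."""
    return (render(z) - target_size) ** 2

def grad(z, target_size):
    """Analytic gradient of the loss w.r.t. depth: d/dz (f/z - t)^2."""
    return 2.0 * (render(z) - target_size) * (-FOCAL / z ** 2)

def fit_depth(target_size, z=5.0, lr=2.0, steps=500):
    """Recover a 3D depth from a 2D observation by gradient descent."""
    for _ in range(steps):
        z -= lr * grad(z, target_size)
    return z

# An object observed at apparent size 0.5 sits at depth FOCAL / 0.5;
# the fitted depth should approach 2.0.
fitted = fit_depth(0.5)
```

A real system like DIB-R optimizes thousands of mesh vertices, colors, and lighting parameters against full images rather than one scalar, but the mechanism is the same: gradients flow from the 2D output back into the 3D scene description.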
According to Nvidia, this research has numerous implications for the field of robotics:
For an autonomous robot to interact safely and efficiently with its environment, it must be able to sense and understand its surroundings. DIB-R could potentially improve those depth perception capabilities.
With further development, the researchers hope to expand DIB-R into what would essentially be a virtual reality renderer. One day, the team hopes, such a system will make it possible for the AI to create fully immersive 3D worlds in milliseconds using only photographs:
Sanja Fidler, Nvidia’s director of AI and coauthor on the team’s paper, told VentureBeat’s Khari Johnson:
Imagine you can just take a photo and out comes a 3D model, which means that you can now look at that scene that you have taken a picture of [from] all sorts of different viewpoints. You can go inside it potentially, view it from different angles — you can take old photographs in your photo collection and turn them into a 3D scene and inspect them like you were there, basically.
The ability to render the world from photographs could lead to amazing content creation pipelines. Technology such as Google Maps could become more immersive than ever. And, possibly, creatives more skilled at photography or painting than coding could leave the heavy development work to the machines.
Imagine if making huge open-world games such as Skyrim and Grand Theft Auto, the kind traditionally reserved for companies with hundreds of staff members, were something a handful of creatives and an AI could handle on their own.