OpenAI’s new image generator sparks both excitement and fear

OpenAI has unveiled a new AI tool that turns text into images — and the results are stunning.

Named DALL-E 2, the system is the successor to a model unveiled last year. While its predecessor generated some impressive outputs, the new version is a major upgrade.

This picture in the style of Claude Monet illustrates the improvements of the new model. Credit: OpenAI

DALL-E 2 adds enhanced textual comprehension, faster image generation, and four times greater resolution.

“When approaching DALL-E 2 we focused on improving the image resolution quality and improving latency, rather than building a bigger system,” OpenAI researcher Aditya Ramesh told TNW.

Animal helicopter chimeras generated with DALL·E 2: pic.twitter.com/5b8a9iq3k9

— Aditya Ramesh (@model_mechanic) April 7, 2022

The new tool also introduces two extra capabilities: reinterpretations of existing images and an editing feature called inpainting.

Inpainting makes edits to an existing image by analyzing a natural language caption.

It can add and remove components, while integrating the expected changes to shadows, reflections, and textures.

In this example, a corgi is placed in the user’s choice of location. Credit: OpenAI

DALL·E 2 was trained on pairs of images and their corresponding captions, which taught the model about the relationships between pictures and words.

New images are generated through a process called diffusion.

This begins with a pattern of random dots. The system then gradually transforms the pattern into a picture when it recognizes specific aspects of that image.
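As a rough illustration of that idea, the toy sketch below starts from random values and nudges them toward a picture over many small steps. It is not OpenAI's method: the names `toy_denoise` and `mean_abs_error` are hypothetical, the "image" is just a short list of numbers, and the `target` list stands in for what a trained model would have to predict from the caption.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Toy illustration only: start from noise and step toward a picture.

    `target` is an assumption standing in for a trained model's prediction;
    a real diffusion model learns that prediction and never sees the answer.
    """
    rng = random.Random(seed)
    # Step 0: a "pattern of random dots" (one random value per pixel).
    image = [rng.gauss(0.0, 1.0) for _ in target]
    history = [list(image)]
    for step in range(steps):
        # Close a growing fraction of the remaining gap at each step.
        rate = 1.0 / (steps - step)
        image = [px + rate * (t - px) for px, t in zip(image, target)]
        history.append(list(image))
    return history


def mean_abs_error(image, target):
    """Average per-pixel distance between an image and the target."""
    return sum(abs(a - b) for a, b in zip(image, target)) / len(target)


# A tiny one-dimensional "picture": brightness rising and then falling.
target = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]
history = toy_denoise(target)
start_error = mean_abs_error(history[0], target)
end_error = mean_abs_error(history[-1], target)
print(f"error at start: {start_error:.3f}, at end: {end_error:.3f}")
```

The only point of the sketch is the shape of the loop: noise first, then repeated small corrections until a picture emerges.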

DALL-E 2 can combine concepts, attributes, and styles in its creations. Credit: OpenAI

Some of DALL-E 2’s creations look almost too good to be true. Yet the researchers say the system tends to generate visually coherent images for most captions that people try.

The above pictures of an astronaut, for example, were curated from a set of nine produced by the model. Prafulla Dhariwal, a research scientist at OpenAI, said the results are generally consistent:

Sometimes, it can be helpful to iterate with the model in a feedback loop by modifying the prompt based on its interpretation of the previous one or by trying a different style like ‘an oil painting,’ ‘digital art,’ ‘a photo,’ ‘an emoji,’ etcetera. This can be helpful for achieving a desired style or aesthetic.

A reimagining of Gustav Klimt’s “The Kiss.” Budding fraudsters may note that Klimt’s less renowned “Adele Bloch-Bauer I” sold for $135 million in 2006, and that another Klimt painting sold for $150 million in 2016. Credit: OpenAI

DALL-E 2’s potential uses are vast.

Graphic designers, app developers, media outlets, architects, commercial illustrators, and product designers could all use the tool for inspiration, new creations, and editing.

Commercial artists may be nervous about their future employment prospects. Ramesh acknowledges that many jobs could change:

We have seen AI be a good tool for people in the creative space. For example, as photo editing software has become more powerful and accessible it has allowed more people to enter the photography field. In recent years, we’ve also seen artists use AI to create new kinds of art.

It’s hard to predict the future, but we do know AI will have an impact on jobs much like personal computers did. The nature of many jobs will change, jobs that never existed before will be created, and others may be eliminated.

Created with DALL·E 2 by @OpenAI

Prompt:
"Mona Lisa is drinking wine with da Vinci."

// Even if we don't see Maestro, the composition is perfect. Note the horizontal level of liquid in the glass.

Made with #DALLE // #DALLEmerz pic.twitter.com/wk8Kf6DKcd

— Merzmensch Kosmopol (@Merzmensch) April 6, 2022

The system hasn’t yet been released to the public. OpenAI CEO Sam Altman hopes to launch the product this summer, but the researchers first want to investigate the risks.

They plan to integrate safeguards that prevent the system from generating deceptive and otherwise harmful content.

In addition, DALL·E 2 inherits various biases from its training data — and its outputs sometimes reinforce societal stereotypes.

The team has already removed explicit content from the training data and banned violent, hateful, and adult content in their content policy.

If filters identify images and text prompts that break the rules, the system won’t generate the outputs. Automated and human monitoring systems have also been implemented as safeguards against misuse.

Altman believes DALL-E’s mechanism could change how we interact with machines.

“This is another example of what I think is going to be a new computer interface trend: you say what you want in natural language or with contextual clues, and the computer does it,” he said in a blog post.

DALL-E may also boost our understanding of how AI sees the world. OpenAI hopes this helps them create systems that benefit humanity — and aren’t manipulated to foster hatred and deception.

Story by Thomas Macaulay

Managing editor

Thomas is the managing editor of TNW. He leads our coverage of European tech and oversees our talented team of writers. Away from work, he enjoys playing chess (badly) and the guitar (even worse).
