Microsoft has developed an AI to draw entirely original images based on nothing more than text. You type it, a computer draws it, and we’re one step closer to a world where using software like Photoshop and Illustrator is a hands-off experience.
Researchers created a text-to-image bot that spits out pretty amazing images when fed a series of descriptive words like “this bird is red with white and has a very short beak.” They accomplished this by building a neural network called an Attentional Generative Adversarial Network (AttnGAN) that creates the image pixel by pixel. Like any other artist or designer, it works in layers, handling both broad strokes and fine details.
The Deep Learning Group at the Redmond company created this as part of a trilogy of AI projects that includes Caption Bot, which provides text descriptions for images, and another that provides audio answers to questions about images. Each was developed to provide useful applications that combine computer vision and natural language processing.
The idea with all three is to teach machines how to understand humans and the world the same way we do. The researchers are trying to fix the “this robot thinks a turtle is a rifle” problem, and it looks like they’re succeeding.
Xiaodong He, an AI research manager with the group, said in a Microsoft blog post:
“If you go to Bing and you search for a bird, you get a bird picture. But here, the pictures are created by the computer, pixel by pixel, from scratch. These birds may not exist in the real world; they are just an aspect of our computer’s imagination of birds.”
That’s quite beautiful, actually. But AI doesn’t exactly have imagination. In no sense is this AI capable of inspiration, yet it does present another exhibit for the philosophical arguments to come.
Perhaps the most enticing potential application for this, aside from fine art and faking photography, is in the design industry. Imagine lying on your couch with your fingers laced behind your head while you conjure user interfaces or model specifications in your mind’s eye, and then telling your virtual assistant to draw them for you with a simple voice command.
We’re not quite there yet though. And some of those pictures are fugly — the Salvador Dali-inspired melting stop signs are a bit unsettling. Still, it’s amazing to think we’re on the cusp of a world where, potentially, designers will rely solely on human imagination and artificial intelligence — not computer skills and software training.
Better yet: here’s hoping Microsoft does the smart thing and makes this the next version of the beloved Paint.