MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) just developed an AI-assisted image editing tool that automates object selection. It’s the Holy Grail of selection tools, and you can behold it in the video below.
For millions of people, Photoshop is a program that’s used to bring out the best visual features in images. But at TNW we’re more likely to use it to make Mark Zuckerberg look like a vampire or to put a sombrero on a hacker. And, speaking only for myself, using Photoshop is time-consuming and hard.
What we need is an object-grabber AI. We could call it Grabber Bot 2000. Unfortunately, the MIT CSAIL researchers who built one didn’t bother naming it; they simply call the technique behind their AI-assisted image editor “Semantic Soft Segmentation,” or SSS.
The editor separates the objects and background in an image into different segments, which allows for easy selection. Unlike the magnetic lasso or magic wand tools in most photo editing software, it doesn’t rely on user input for context: you don’t have to trace around an object or zoom in to catch the fine details. The AI just works.
Of course, the secret sauce behind the magic involves a lot of complex algorithms and computation. The team uses a neural network to process the image’s features and work out the soft transitions where objects blend into each other and into the background.
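For intuition, here’s a minimal Python sketch of the general idea of soft segmentation. To be clear, this is our own toy illustration, not CSAIL’s pipeline: the “deep features” are faked with random numbers standing in for a network’s output, and the layer centers are just picked at random. The point is that every pixel gets a fractional membership in each layer rather than a hard yes/no, so selecting an object becomes a per-pixel multiply.

```python
# Toy illustration of soft segmentation (not CSAIL's actual method):
# turn per-pixel feature vectors into soft, overlapping layer masks.
import numpy as np

rng = np.random.default_rng(0)

H, W, D, K = 64, 64, 8, 3                 # image size, feature dim, number of layers
features = rng.normal(size=(H, W, D))     # stand-in for deep per-pixel features

# Pick K "semantic centers" at random; a real system would learn or cluster these.
centers = features.reshape(-1, D)[rng.choice(H * W, size=K, replace=False)]

# Soft assignment: each pixel gets a weight for every layer via a softmax over
# negative distances to the centers, so weights lie in [0, 1] and sum to 1.
# Pixels near a boundary end up partially belonging to several layers.
dists = ((features[..., None, :] - centers) ** 2).sum(axis=-1)    # (H, W, K)
logits = -dists
weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
soft_masks = weights / weights.sum(axis=-1, keepdims=True)        # (H, W, K)

# Grabbing "layer 0" is now just a per-pixel multiply; no lasso tracing needed.
image = rng.uniform(size=(H, W, 3))
selection = image * soft_masks[..., 0:1]
print(soft_masks.shape, selection.shape)
```

The real system is far more sophisticated about where those soft weights come from, but the payoff is the same: once the layers exist, selection is trivial.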
When humans look at a picture, we’re pretty good at making inferences based on context. If there’s a giraffe standing in front of an elephant in an image, we don’t tend to struggle with figuring out where one ends and the other begins. Computers have to be taught how to do this, and it’s not a simple task.
According to visiting MIT CSAIL researcher Yagiz Aksoy:
The tricky thing about these images is that not every pixel solely belongs to one object. In many cases it can be hard to determine which pixels are part of the background and which are part of a specific person.
This is because soft transitions can cause two different objects, or an object and the background, to share pixels around edges. MIT’s AI takes this into account and does the tedious detail work of splitting the difference autonomously.
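To make that concrete, here’s a hedged, minimal example (again our own illustration, not code from the paper) of the standard compositing relationship behind those shared pixels: an edge pixel’s observed color is a blend of foreground and background, and a soft mask stores that blend fraction instead of forcing an all-or-nothing choice.

```python
# A pixel on a soft boundary records a mix of two colors:
#   observed = alpha * foreground + (1 - alpha) * background
# Soft segmentation keeps that per-pixel alpha around.
import numpy as np

foreground = np.array([0.9, 0.6, 0.2])    # e.g., a strand of hair
background = np.array([0.1, 0.3, 0.8])    # e.g., the sky behind it
alpha = 0.4                               # 40% of this pixel belongs to the hair

observed = alpha * foreground + (1 - alpha) * background
print(observed)                           # the blended color the camera actually sees

# Pasting the object onto a new background reuses the same alpha, which is
# what avoids the hard, jagged edge you get from an all-or-nothing cutout.
new_background = np.array([1.0, 1.0, 1.0])
recomposited = alpha * foreground + (1 - alpha) * new_background
print(recomposited)
```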
The applications for this technology are obvious, whether we’re talking about Instagram filters that let you seamlessly swap the background or add depth-of-field effects, or the potential to scale this up to work with video.
The future of image and video editing is certainly AI, but we’re not quite there. The current process MIT’s CSAIL team is working on doesn’t work with video just yet. And it takes about four minutes to process an image – a human Photoshop expert could probably beat it in a race.
But this isn’t one of those pie-in-the-sky projects that could pay dividends in 10 or 20 years when society catches up to its ambition – this could provide an immediate benefit to anyone who uses any sort of photo or video editing software, including the built-in tools that come with our phones.
Rest assured, once this AI hits the prime time we’ll use it for more than just making Zuckerberg vampires and fake Elon Musk Narcos promos. We’ll still do that, of course, but we’ll finally have time for the less serious projects we’ve had in mind too.
For more information, check out the CSAIL team’s white paper. And don’t forget to visit our artificial intelligence section for all the latest machine learning news and analysis.