This article was published on June 29, 2017

Instagram is now using AI to fight trolls and spammers


Using social media comes with the unfortunate risk of encountering hate speech and offensive comments. The companies behind these platforms are constantly looking for new ways to combat it, and now Instagram is turning to the logical next step: artificial intelligence.

Last year, Instagram introduced a keyword filter that allowed you to automatically remove specific offensive words that may appear on your feed. The tool removed words “often reported as inappropriate” and let you set custom keywords, but at the time, we mentioned some of its caveats; mainly, fighting hateful comments can be exhausting as trolls come up with new offensive language all the time:

…trolls have been finding their way around filters since the internet was born, using misspelled slurs or sometimes creating entirely new ones.

Thankfully, you can add your own custom keywords and phrases, but Instagram will have to do more than just a simple word filter to make the network feel welcome; bigots can often shout offensive or abusive comments without using specific offensive words.

This new filter tries to get around those caveats using machine intelligence. Instagram is using AI to try to understand the context around offensive speech and remove abuse even if it doesn’t trigger specific keywords. That should help mitigate the cat-and-mouse nature of fighting hateful comments, although it currently only works in English.

Wired goes into more detail about how the AI system, called DeepText, is able to identify offensive language. For example, it can understand that the word “white” may not be offensive as a color or title (White Sox, white snow), but that “white power” could be offensive in many uses.
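To see why context matters, here is a toy sketch in Python. It is not DeepText, which is a learned model whose internals aren't described here; the word list, phrase list, and function names are all made up for illustration. It simply contrasts a single-word filter with one that looks at pairs of adjacent words.

```python
import re

# Hypothetical block lists, invented for this illustration only.
BLOCKED_WORDS = {"white"}
BLOCKED_PHRASES = {("white", "power")}


def tokenize(comment):
    """Lowercase a comment and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", comment.lower())


def keyword_filter(comment):
    """Flag a comment if any single token is on the word list."""
    return any(token in BLOCKED_WORDS for token in tokenize(comment))


def phrase_filter(comment):
    """Flag a comment only if a blocked two-word phrase appears."""
    tokens = tokenize(comment)
    return any(pair in BLOCKED_PHRASES for pair in zip(tokens, tokens[1:]))


if __name__ == "__main__":
    for text in ["Go White Sox!", "white power", "wh1te power"]:
        print(text, keyword_filter(text), phrase_filter(text))
```

The single-word filter wrongly flags "Go White Sox!", while the phrase check lets it through and still catches "white power". Neither catches the misspelled "wh1te power", which is the kind of evasion a fixed list can't keep up with and a system that learns from context is meant to handle.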

Of course, defining hate speech will always be murky (humans aren’t even all that good at identifying it, after all), and there is the risk of false positives. What if you are only quoting hate speech you don’t agree with, perhaps for the purposes of a response? Not to mention there are offensive words that are homonyms or that have different meanings in different contexts.

But the system should get smarter over time, as AI is wont to do. Still, if you don’t like the idea of a robot brain deciding what’s offensive or not, you have the option to turn it off from the comments section of Instagram’s settings. Instagram is also using DeepText to identify spam, which is hopefully something we can all agree deserves to disappear.
