
This article was published on May 6, 2021

Twitter will now reprimand you for nasty replies



If you’re planning to reply “FUCK YOU!” or something similar to a tweet, Twitter will make you think twice, literally. Starting today, the social network company is rolling out new prompts that aim to stop you from posting mean replies.

Twitter started this experiment last May with a limited set of users on iOS. Now it’s expanding to all users on Android and iOS.

The company said that this will cover potentially harmful or offensive replies — such as insults, strong language, or hateful remarks — in English for now. If the app’s algorithm detects such a reply, it’ll ask the user to reconsider sending it. You can delete the tweet or edit your response, but if you’re determined, you can still send the tweet with profanities.

Twitter’s warning prompt for replies with profanities

The firm admitted that in last year’s test, the algorithm failed to distinguish between a mean reply, sarcasm, and friendly banter. While the team has since made some changes to address this, there’s still a chance that the algorithm might get it wrong. In that case, you can tap on the “Did we get this wrong?” link to submit your feedback.


Twitter also considers if you and the person you’re sending your reply to interact frequently, to gauge if the reply is mean or just meant as a joke.
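To make the two signals described above concrete, here is a minimal sketch of how a prompt decision could combine them. Twitter has not published its implementation, so everything here is an assumption for illustration: the word list, the interaction threshold, and the names `Reply`, `looks_harmful`, and `should_prompt` are all hypothetical, and a simple keyword check stands in for whatever classifier Twitter actually uses.

```python
from dataclasses import dataclass

# Hypothetical values: Twitter has not disclosed its word lists or thresholds.
FLAGGED_TERMS = {"idiot", "stupid", "hate"}
FREQUENT_INTERACTION_THRESHOLD = 5


@dataclass
class Reply:
    text: str
    prior_interactions: int  # how often the two accounts have interacted before


def looks_harmful(text: str) -> bool:
    """Toy stand-in for the real classifier: flag replies containing a listed term."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return bool(words & FLAGGED_TERMS)


def should_prompt(reply: Reply) -> bool:
    """Prompt only when the reply looks harmful and the accounts rarely interact."""
    if not looks_harmful(reply.text):
        return False
    # Frequent interaction is treated as a hint that the reply may be banter.
    return reply.prior_interactions < FREQUENT_INTERACTION_THRESHOLD
```

In this sketch, a flagged reply between accounts that rarely talk triggers the prompt, while the same words between frequent contacts do not. A keyword check like this is also exactly the kind of filter that misspelled or modified words would slip past.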

Submitting feedback if Twitter’s algorithm for mean reply detection gets it wrong

Twitter said that the prompts yielded encouraging results in its tests: 34% of people who were prompted decided to alter or delete their replies.

That also means that 66% of people still decided to send their original reply. Plus, there are ways to modify words and fool the algorithm into thinking that it’s a clean reply. And the feature doesn’t cover languages other than English, so multilingual users can get away with abusive replies in another language.

Despite all these hiccups, Twitter’s new feature is a positive step in reducing toxicity on the platform if it can bring down hateful comments by just a few notches.
