Twitter is experimenting with new solutions to reduce toxicity

One of the biggest challenges Twitter faces right now is reducing abuse and bullying on its platform. Last week, the company’s head of product, Kayvon Beykpour, sat down with Wired editor-in-chief Nicholas Thompson during the Consumer Electronics Show (CES) in Las Vegas to discuss toxicity on the platform, the health of conversations, and more. During the interview, he revealed some aspects of Twitter’s work to tackle abusive and offensive content.

Beykpour said one of the steps the company takes to reduce toxicity is to de-rank abusive replies using machine learning:

I think increasingly, leveraging machine learning to try and model the behaviors that we think are most optimal for that area. So for example, we would like to show replies that are most likely to be replied to. That’s one attribute you might want to optimize for, not the only attribute by any means. You’ll want to deemphasize replies that are likely to be blocked or reported for abuse.

He added that Twitter optimizes for replies that are more likely to get reactions or replies. However, it tweaks its algorithm to de-rank replies that are reaction-worthy, yet abusive.
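
As a rough illustration of the trade-off described above, the sketch below ranks replies by a predicted engagement probability, discounted by a predicted abuse probability. The `Reply` type, its two probability fields, and the `abuse_penalty` weight are all hypothetical; the interview does not describe Twitter's actual scoring formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reply:
    text: str
    p_reply: float  # assumed model output: probability the reply gets replied to
    p_abuse: float  # assumed model output: probability it gets blocked or reported


def rank_score(reply: Reply, abuse_penalty: float = 2.0) -> float:
    """Hypothetical score: reward likely engagement, penalize likely abuse."""
    return reply.p_reply - abuse_penalty * reply.p_abuse


def rank_replies(replies: List[Reply], abuse_penalty: float = 2.0) -> List[Reply]:
    """Return replies sorted from highest to lowest hypothetical score."""
    return sorted(replies, key=lambda r: rank_score(r, abuse_penalty), reverse=True)
```

With a penalty of zero, the most engaging reply always comes first, even if it is likely to be reported. Raising the penalty pushes such replies down, which is the behavior Beykpour describes.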

When Thompson asked him how the company tries to control the system so it doesn’t incentivize toxicity, Beykpour said the social network trains its AI models rigorously to understand its rules and regulations:

Today, a very prominent way that we leverage AI to try to determine toxicity is basically having a very good definition of what our rules are, and then having a huge amount of sample data around tweets that violate rules and building models around that.

Basically we’re trying to predict the tweets that are likely to violate our rules. And that’s just one form of what people might consider abusive, because something that you might consider abusive may not be against our policies, and that’s where it gets tricky.
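
The approach in the quote, learning from labeled examples of rule-violating tweets, can be sketched with a toy naive Bayes text classifier. This is only an assumption about the general shape of such a model, not Twitter's system: the function names, the word-level features, and the tiny training set in the usage note are invented for illustration.

```python
import math
from collections import Counter


def train(samples):
    """Count words per class from (text, violates_rules) pairs.

    Assumes at least one sample of each class is present.
    """
    word_counts = {True: Counter(), False: Counter()}
    class_counts = Counter()
    for text, violates in samples:
        class_counts[violates] += 1
        word_counts[violates].update(text.lower().split())
    return word_counts, class_counts


def predict_violation(model, text):
    """Return True if the text scores higher for the 'violates rules' class."""
    word_counts, class_counts = model
    vocab = set(word_counts[True]) | set(word_counts[False])
    total = sum(class_counts.values())
    scores = {}
    for label in (True, False):
        n_words = sum(word_counts[label].values())
        # Start from the class prior, then add per-word log-likelihoods.
        score = math.log(class_counts[label] / total)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]
```

Trained on a handful of made-up samples such as `("shut up idiot", True)` and `("thanks this is a great thread", False)`, the model only flags text that resembles its labeled violations. That mirrors the caveat in the quote: a tweet can feel abusive to a reader without matching anything the rules, and therefore the training data, define as a violation.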

The last line is quite intriguing, and is likely at the heart of many a controversy surrounding Twitter. Users who get banned often complain that Twitter’s moderation wasn’t adequately nuanced to understand the context of the tweets that got them in trouble. On the flip side, some accounts aren’t banned when they tweet controversial or abusive content.

When Thompson jokingly asked if Twitter planned to give abusers a ‘red tick’ or roll out a toxicity score to de-incentivize them, Beykpour waved it off, and said the company is experimenting with more subtle features in its beta app, such as hiding like counts and retweet counts.

Twitter’s challenge in training its AI and moderation team is to account for the ever-changing social and political context of different geographies. Some terms or statements that were normalized a few years ago might be abusive in the current context. So, the company needs to review and refine its policy constantly.

The whole interview is full of interesting tidbits about how Twitter is thinking about the future of its platform, including open-sourcing it. Find it on Wired here.

Story by Ivan Mehta

Ivan covers Big Tech, India, policy, AI, security, platforms, and apps for TNW. That's one heck of a mixed bag. He likes to say "Bleh."
