Oh no... Someone trained an AI on 4chan

An AI chatbot trained on 4chan has sparked outrage and fascination

If you’re concerned about the biases and bigotry of AI models, you’re gonna love the latest addition to the ranks: a text generator trained on 4chan’s /pol/ board.

Short for “Politically Incorrect,” /pol/ is a bastion of hate speech, conspiracy theories, and far-right extremism. It’s also 4chan’s most active board, accumulating around 150,000 daily posts.

These attributes attracted Yannic Kilcher, an AI whizz and YouTuber, to use /pol/ as a testing ground for bots.

Kilcher first fine-tuned the GPT-J language model on over 134.5 million posts made on /pol/ across three and a half years.

He then incorporated the board’s thread structure into the system. The result: an AI that could post in the style of a real /pol/ user.

Kilcher named his monstrous creation GPT-4chan.

“The model was good — in a terrible sense,” he said on YouTube. “It perfectly encapsulated the mix of offensiveness, nihilism, trolling, and deep distrust of any information whatsoever that permeates most posts on /pol/.

“It could respond to context and coherently talk about things and events that happened a long time after the last training data was collected. I was quite happy.”

Kilcher further assessed GPT-4chan on the Language Model Evaluation Harness, which tests AI systems on various tasks.

He was particularly impressed by the performance in one category: truthfulness.

On the benchmark, Kilcher says GPT-4chan was “significantly better” at generating truthful replies to questions than both GPT-J and GPT-3.

Yet this may merely be an indictment of the benchmark’s shortcomings — as Kilcher himself suggested.

Regardless, it wouldn’t be the ultimate test of GPT-4chan.

In the wild

Kilcher wasn’t content with merely mimicking 4chan in private. The engineer chose to go a step further — and let the AI run rampant on /pol/.

He converted GPT-4chan into a chatbot that automatically posted on the board. Bearing a Seychelles flag on its profile, the bot quickly racked up thousands of messages.

/pol/ users soon realized something was up. Some suspected a bot was behind the posts, but others blamed undercover government officials.

The biggest clue left by the culprit was an abundance of replies devoid of text.

While authentic users also post empty replies, they usually include an image — something GPT-4chan was incapable of doing.

“After 48 hours, it was clear to many it is a bot, and I turned it off,” said Kilcher. “But see, that’s only half the story, because what most users didn’t realize was that Seychelle anon was not alone.”

For the previous 24 hours, the engineer had nine other bots running in parallel. Collectively, they’d left over 15,000 replies, more than 10% of all the posts on /pol/ that day.

Kilcher then gave the botnet an upgrade and ran it for another day. After producing over 30,000 posts in 7,000 threads, he finally retired GPT-4chan.

“People are still discussing the user but also things like the consequences of having AIs interact with people on the site,” Kilcher said. “And it also seems the word Seychelles has become sort of general slang — and that seems like a good legacy for now.”

But not everyone shares this rosy outlook.

The backlash

Kilcher’s experiment has proven controversial.

While the idea of evaluating a 4chan-based model won support, the decision to unleash the chatbot on /pol/ sparked condemnation.

“Imagine the ethics submission!” tweeted Lauren Oakden-Rayner, an AI safety researcher at the University of Adelaide.

“Plan: to see what happens, an AI bot will produce 30k discriminatory comments on a publicly accessible forum with many underage users and members of the groups targeted in the comments. We will not inform participants or obtain consent.”

This week an #AI model was released on @huggingface that produces harmful + discriminatory text and has already posted over 30k vile comments online (says it's author).

This experiment would never pass a human research #ethics board. Here are my recommendations.

1/7 https://t.co/tJCegPcFan

— Lauren Oakden-Rayner (Dr.Dr. 🥳) (@DrLaurenOR) June 6, 2022

Roman Ring, a research engineer at DeepMind, added that the exercise had amplified and solidified 4chan’s echo chamber.

“It’s not impossible that GPT-4chan pushed somebody over the edge in their worldview,” he said.

Critics also slammed the move to make the model freely accessible. It was downloaded over 1,000 times before being removed from the Hugging Face platform.

“We don’t advocate or support the training and experiments done by the author with this model,” said Clement Delangue, the cofounder and CEO of Hugging Face, in a post on the platform.

“In fact, the experiment of having the model post messages on 4chan was IMO pretty bad and inappropriate and if the author would have asked us, we would probably have tried to discourage them from doing it.”

FYI we rushed a first version of the gating that is now live (that’s the first thing that the tech team in Paris worked on as soon they woke up) and will improve during the day.

— clem 🤗 (@ClementDelangue) June 7, 2022

The concerns about GPT-4chan have detracted from potentially powerful insights.

The experiment highlights AI’s ability to automate harassment, disrupt online communities, and manipulate public opinion. Yet it also spread discriminatory language at scale.

Nonetheless, Kilcher and his critics have raised awareness about the threats of language models. With their capabilities rapidly expanding, the risks seem set to rise.

Story by Thomas Macaulay

Senior reporter

Thomas is a senior reporter at TNW. He covers European tech, with a focus on deeptech, startups, and government policy.
