Who’s afraid of OpenAI’s big, bad text generator?

The existential threat of harmful AI recently reached ‘Cuban Missile Crisis’ proportions after a major research institute (checks notes…) modified the model size of a text generator. Apparently we’ve finally run out of real things to be scared of.

What happened

If you follow AI news you’ve already heard about it: OpenAI, a non-profit co-founded by Elon Musk (who’s no longer involved), developed a text generator and chose not to disclose the full model in its research paper. That’s it.

It’s kind of boring, actually. Don’t get us wrong, the text generator – called GPT-2 – is pretty cool. It can sometimes generate coherent blocks of text from a single phrase. Here’s an example:

System Prompt (human written):

A train carriage containing controlled nuclear materials was stolen in Cincinnati today. Its whereabouts are unknown.

Model completion (machine written):

The incident occurred on the downtown train line, which runs from Covington and Ashland stations.

In an email to Ohio news outlets, the U.S. Department of Energy said it is working with the Federal Railroad Administration to find the thief.

‘The theft of this nuclear material will have significant negative consequences on public and environmental health, our workforce and the economy of our nation,’ said Tom Hicks, the U.S. Energy Secretary, in a statement. ‘Our top priority is to secure the theft and ensure it doesn’t happen again.’

The stolen material was taken from the University of Cincinnati’s Research Triangle Park nuclear research site, according to a news release from Department officials.

The Nuclear Regulatory Commission did not immediately release any information.

According to the release, the U.S. Department of Energy’s Office of Nuclear Material Safety and Security is leading that team’s investigation.

‘The safety of people, the environment and the nation’s nuclear stockpile is our highest priority,’ Hicks said, ‘We will get to the bottom of this and make no excuses.’

Pretty cool, right? None of the events in the AI-generated article actually happened; it’s easy to verify that it’s fake news. But it’s impressive to see a machine riff like that. Impressive, not terrifying.

My what big models you have

The OpenAI researchers took millions of web pages linked from Reddit posts, fed them to a big-ass AI model, and trained it to spit out coherent text. The novel accomplishment here was not the text generator. That’s old hat. It was having the resources available to train a bigger model than anyone had before. To put that in layman’s terms: OpenAI added more computers so the AI could use more data at once. The result was better text generation than the previous, smaller model could manage.

Here’s what the headlines should’ve looked like: “OpenAI improves machine learning model for text generator.” It’s not sexy or scary, but neither is GPT-2. Here’s what the headlines actually looked like:

What the heck happened? OpenAI took a fairly normal approach to revealing the GPT-2 developments. It sent out an email full of information to select journalists who agreed not to publish anything before a specific time (called an embargo) – par for the course. This is why we saw a slew of reports on February 14 about the AI so dangerous it couldn’t be released.

In the initial email, and in a subsequent blog post, OpenAI policy director Jack Clark stated:

Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT-2 along with sampling code. We are not releasing the dataset, training code, or GPT-2 model weights.

Clark goes on to explain that OpenAI doesn’t naively believe holding back the model will save the world from bad actors — but someone has to start the conversation. He explicitly states the release strategy is an experiment:

This decision, as well as our discussion of it, is an experiment: while we are not sure that it is the right decision today, we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas.

OpenAI “will further publicly discuss this strategy in six months,” wrote Clark.

Six seconds later

All hell broke loose in the machine learning world when the story broke on Valentine’s Day. OpenAI came under immediate scrutiny for having the audacity to withhold portions of its research (something not uncommon):

For over-exaggerating the problem:

And for allegedly shutting out researchers and academia in favor of soliciting input from journalists and politicians:

Poor Jack tried his best to contain the insanity:

But GPT-2’s mythology was no longer up to OpenAI. No matter how much the team loved their monster, the media never met an AI development it couldn’t wave pitchforks and torches at. The story immediately became about the decision to withhold the full model. Few news outlets covered the researchers’ progress straight-up.

GPT-2’s release spurred plenty of argument, but not the debate that Clark and OpenAI were likely hoping for. Instead of discussing the ethics and merits of cutting-edge AI, detecting fake text, or the potential implications of releasing unsupervised learning models to the public, the AI community became embroiled in a debate over hyperbolic media coverage. Again.

An unhappy ending

There’s a lot of blame to go around here, but let’s start with OpenAI. Whether intentional or not, it manipulated the press. OpenAI is on the record stating it didn’t intend for journalists to believe it was withholding this specific model because it was known to be dangerous – the institute just wasn’t completely sure it wasn’t. Moreover, representatives stated that the concerns were more about AI-powered text generators in general, not GPT-2 specifically.

Let’s be clear here: sending journalists an email that’s half about a specific AI system and half about the ethics of releasing the models for AI systems in general played a significant role in this kerfuffle. Those are two entirely different stories and they probably shouldn’t have been conflated for the media. We won’t editorialize about why OpenAI chose to do it that way, but the results speak for themselves.

The technology journalists reporting on GPT-2 also deserve a modicum of reproach for allowing themselves to be used as a mouthpiece for nobody’s message. Despite the fact that most of the actual reporting was quite deep, the headlines weren’t.

The general public likely still believes OpenAI made a text generator so dangerous it couldn’t be released, because that’s what they saw when they scrolled through their news aggregator of choice. But it’s not true: there’s nothing definitively dangerous about this particular text generator, just like Facebook never developed an AI so dangerous it had to be shut down after inventing its own language. The kernels of truth in these stories are far more interesting than the lies in the headlines about them – but sadly, nowhere near as exciting.

The biggest problem here is that, by virtue of an onslaught of misleading headlines, the general public’s perception of what AI can and cannot do is now even further skewed from reality. It’s too late for damage control, though OpenAI did try to set the record straight.

Sam Charrington’s excellent “This Week In Machine Learning” program recently hosted a couple of OpenAI’s representatives alongside a panel of experts to discuss what happened with the GPT-2 release. The OpenAI reps relayed again what Clark explained in the aforementioned blog post: this was all just a big experiment to help plot the path forward for ethical public disclosure of potentially harmful AI models. The detractors made their objections heard. The entire 1:07:06 video can be viewed here.

Unfortunately, the general public probably isn’t going to watch an hour-long interview with a bunch of polite, rational people calmly discussing ethics in AI. And they’re probably not going to read the sober follow-up articles on GPT-2 with the same voracious appetite as the hyperbolically titled ones.

No amount of slow news reporting can entirely undo the damage that’s done when dozens of news outlets report that an AI system is “too dangerous” when that’s clearly not the case. It hurts research, destroys media credibility, and distorts politicians’ views.

To paraphrase Anima Anandkumar: I’m not worried about AI-generated fake news, I’m worried about fake news about AI.

