A team of researchers from University College Maastricht recently published a study exploring the use of GPT-3 as an email manager. As someone with an inbox that can only be described as ludicrous, color me intrigued.
The big idea: We spend hours a day reading and responding to emails. What if an AI could automate both processes?
The Maastricht team explored the idea of letting GPT-3 loose in our email systems from a pragmatic point of view. Rather than focus on exactly how good GPT-3 is at responding to specific emails, the team examined whether there’d be any merit to even trying.
Their paper breaks down the potential efficacy of GPT-3 as an email secretary by examining how useful it is versus fine-tuned machines, how financially viable it is versus human workers, and how impactful machine-generated mistakes would be to senders and recipients.
Background: The quest to build a better email client is a never-ending one, but ultimately we’re talking about letting GPT-3 respond to incoming emails. According to the researchers:
Our research indicates that a market for GPT-3-based email rationalisation exists in several different sectors of the economy, of which we shall explore just a few. In all sectors, the damage of a small mistake in wording seems minor as content generally involves neither vast amounts of money nor human safety.
The authors go on to describe use-cases in the insurance, energy, and public administration sectors.
Objections: First off, it’s worth pointing out that this is a pre-print paper. Often this means the science is good, but the paper itself is still in revisions. This particular paper is currently a bit of a mess. Three separate sections contain the same information, for example, so it’s difficult to truly discern the point of the study.
It does seem to indicate that it would save us both time and money if GPT-3 could be applied to the task of responding to our work emails. But that’s a gigantic “if.”
GPT-3 lives in a black box. A human would have to proofread every email it sends out because there’s no way to ever be certain it won’t say something that invites litigation. Aside from fears the machine would generate offensive or false text, there’s also the question of what good a general-knowledge bot would be for this task.
GPT-3 was trained on the internet, so it may be able to tell you the wingspan of an albatross or who won the 1967 World Series, but it certainly can’t decide whether you want to chip in for a birthday card for a co-worker or if you’re interested in heading up a new subcommittee.
The point is, GPT-3 would likely be worse at responding to general emails than a simple chatbot trained to select a pre-generated response.
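To make that comparison concrete, here is a minimal sketch of what a "select a pre-generated response" bot could look like. It is not from the paper. It also swaps plain keyword overlap in for a trained classifier to keep the example short, and the categories, keywords, reply texts, and the `pick_reply` function are all hypothetical.

```python
import re

# Hypothetical canned replies: label -> (trigger keywords, reply text).
CANNED_REPLIES = {
    "meeting": (
        {"meeting", "schedule", "reschedule", "calendar"},
        "Thanks for reaching out. I'll check my calendar and get back to you.",
    ),
    "invoice": (
        {"invoice", "payment", "billing"},
        "Thanks, I've forwarded this to our billing team.",
    ),
    "birthday": (
        {"birthday", "card", "chip"},
        "Count me in for the card!",
    ),
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def pick_reply(email_body, min_overlap=1):
    """Return the canned reply whose keywords overlap most with the email.

    Returns None when nothing matches well enough, so the email can be
    left for a human instead of getting a guessed answer.
    """
    words = tokenize(email_body)
    best_label, best_score = None, 0
    for label, (keywords, _reply) in CANNED_REPLIES.items():
        score = len(words & keywords)
        if score > best_score:
            best_label, best_score = label, score
    if best_label is None or best_score < min_overlap:
        return None
    return CANNED_REPLIES[best_label][1]


if __name__ == "__main__":
    print(pick_reply("Can we reschedule the meeting to Friday?"))
    print(pick_reply("What is the wingspan of an albatross?"))
```

The design choice worth noticing is the `None` fallback: a bot like this can only ever send text a human already approved, and it stays silent when it has no match, which is exactly the predictability a free-form text generator lacks.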
Quick take: A bit of Googling tells me the landline telephone wasn’t ubiquitous in the US until 1998. And now, just a couple decades later, only a tiny fraction of US homes still have a landline.
I can’t help but wonder if email will be the standard for communication much longer – especially if the latest line of innovation involves coming up with ways to keep us out of our own inboxes. Who knows how far away we could be from a hypothetical version of OpenAI’s GPT that’s trustworthy enough to make it worth using on any commercial level.
The research here is laudable and the paper makes for an interesting read, but ultimately the usefulness of GPT-3 as an email-responder is purely academic. There are better solutions to inbox filtering and automated response out there than a brute-force text generator.