This article was published on February 14, 2022

DeepMind: Reward may NOT be enough for AGI — but it’s worth a try

Reinforcement learning is just one approach being pursued

Image by: ANTONI SHKRABA (edited)

Story by Thomas Macaulay, writer at Neural by TNW

DeepMind has been connected to artificial general intelligence since birth.

The lab was launched with a mission to develop AGI, was cofounded by a researcher who coined the term, and has made some compelling advances in the field.

It also recently produced a provocative paper on the subject: “Reward is Enough.”

The study hypothesizes that AGI could be achieved through a single approach: reinforcement learning.

This technique provides feedback in the form of a “reward” — a positive number that tells an algorithm that the action it just performed will benefit its goal.
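To make the reward idea concrete, here is a minimal sketch of learning from reward alone. It is not DeepMind's code or the method from the paper: the two-action "bandit" environment, the payout probabilities, and the function name `run_bandit` are all assumptions chosen for illustration. The agent is never told which action is better; it only sees a reward of 1 or 0 after each choice and keeps a running average per action.

```python
import random


def run_bandit(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a hypothetical two-action bandit."""
    rng = random.Random(seed)
    reward_prob = [0.2, 0.8]  # assumed chance that each action pays a reward of 1
    value = [0.0, 0.0]        # agent's running estimate of each action's reward
    count = [0, 0]            # how often each action has been tried

    for _ in range(steps):
        # Explore at random some of the time; otherwise pick the action
        # whose estimated reward is currently highest.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = 0 if value[0] >= value[1] else 1

        # The environment answers with a reward: 1 for a payout, 0 otherwise.
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0

        # Update the running average for the chosen action.
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]

    return value, count


if __name__ == "__main__":
    value, count = run_bandit()
    print("estimated rewards:", value)
    print("times each action was chosen:", count)
```

Under these assumptions the agent ends up choosing the second action far more often, purely because that action has been rewarded more. Systems such as MuZero are vastly more sophisticated, but the feedback signal plays the same role.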

The approach has shown promise in programs such as MuZero, which mastered multiple games without being told their rules. DeepMind called the system a “significant step forward in the pursuit of general-purpose algorithms.” 

“Reward is Enough” suggests that reinforcement learning alone could lead to AGI.

This theory has been challenged by many computer scientists — including some at DeepMind. But Doina Precup, one of the paper’s coauthors, told TNW that the study merely sought to probe the possibilities.

“Ultimately, we want to test this as a hypothesis and to think of it in the context of other methods as well,” said Precup, who heads up DeepMind’s Montreal office.

Indeed, reinforcement learning is just one approach that the Alphabet subsidiary is exploring. In a new episode of the DeepMind podcast, the lab’s researchers discuss the promise of various pathways to AGI.

Among the reward-is-enough skeptics is Raia Hadsell, the company’s director of robotics, who notes the difficulty of designing an all-powerful reward that leads to AGI. DeepMind cofounder Shane Legg, meanwhile, suspects that reinforcement learning may have to be combined with other learning algorithms.

Precup also has doubts that reward alone is enough, but she believes it could be a crucial ingredient in AGI.

“Because it’s learning from interaction in an incremental way, it feels very much like what biological intelligence systems do,” she said.

“Is it at the end of the day going to be the only technology that contributes to AGI? Well, that’s not clear at all — there’s a lot of other really interesting things that are going on.”

Precup is nonetheless optimistic that we’re already on a path to AGI. Ultimately, she’s more concerned about the safety of the destination than the route that takes us there.

“The road to AGI,” the fifth episode in season two of “DeepMind: The Podcast,” is available here from February 15.
