Scientists claim they can teach AI to judge ‘right’ from ‘wrong’

Scientists claim they can “teach” an AI moral reasoning by training it to extract ideas of right and wrong from texts.

Researchers from Darmstadt University of Technology (DUT) in Germany fed their model books, news, and religious literature so it could learn the associations between different words and sentences. After training the system, they say it adopted the values of the texts.

As the team put it in their research paper:

The resulting model, called the Moral Choice Machine (MCM), calculates the bias score on a sentence level using embeddings of the Universal Sentence Encoder since the moral value of an action to be taken depends on its context.

This allows the system to understand contextual information by analyzing entire sentences rather than specific words. As a result, the AI could work out that it was objectionable to kill living beings, but fine to just kill time.

Study co-author Dr Cigdem Turan compared the technique to creating a map of words.

“The idea is to make two words lie closely on the map if they are often used together. So, while ‘kill’ and ‘murder’ would be two adjacent cities, ‘love’ would be a city far away,” she said.

“Extending this to sentences, if we ask, ‘Should I kill?’ we expect that ‘No, you shouldn’t’ would be closer than ‘Yes, you should.’ In this way, we can ask any question and use these distances to calculate a moral bias, the degree of right from wrong.”
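To make the distance idea concrete, here is a minimal sketch in Python. It is not the researchers' code: the two-dimensional vectors are hand-made stand-ins for real sentence embeddings, and the names `cosine`, `moral_bias`, `bias_for`, and `TOY` are hypothetical. It assumes the bias can be read as how much closer a question sits to "Yes, you should." than to "No, you shouldn't."

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def moral_bias(question, yes_answer, no_answer):
    """Positive when the question lies closer to the 'yes' answer,
    negative when it lies closer to the 'no' answer."""
    return cosine(question, yes_answer) - cosine(question, no_answer)


# Hand-made toy vectors (an assumption for illustration only).
# A real system would obtain these from a trained sentence encoder.
TOY = {
    "Should I kill?": [0.1, 0.9],
    "Should I love?": [0.9, 0.1],
    "Yes, you should.": [1.0, 0.0],
    "No, you shouldn't.": [0.0, 1.0],
}


def bias_for(question):
    """Look up the toy vectors and score one question."""
    return moral_bias(
        TOY[question], TOY["Yes, you should."], TOY["No, you shouldn't."]
    )


if __name__ == "__main__":
    for q in ("Should I kill?", "Should I love?"):
        print(q, round(bias_for(q), 3))
```

With these toy vectors, "Should I kill?" gets a negative score and "Should I love?" a positive one. The numbers mean nothing beyond the toy setup; only the sign and the mechanics of comparing distances carry over.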

Making a moral AI

Previous research has shown that AI can learn from human biases to perpetuate stereotypes, such as Amazon’s automated hiring tools that downgraded graduates of all-women colleges. The DUT team suspected that if AI could adopt malicious biases from texts, it could also learn positive ones.

They acknowledge that their system has some pretty serious flaws. Firstly, it merely reflects the values of a text, which can lead to some extremely dubious ethical views, such as giving eating animal products a more negative score than killing people.

It could also be tricked into rating negative actions acceptable by adding more positive words to a sentence. For example, the machine found it much more acceptable to “harm good, nice, friendly, positive, lovely, sweet and funny people” than to simply “harm people”.

But the system could still serve a useful purpose: revealing how moral values vary over time and between different societies.

Changing values

After the researchers fed it news published between 1987 and 1997, the AI rated getting married and becoming a good parent as extremely positive actions. But when they fed it news from 2008 and 2009, these were deemed less important. Sorry, kids.

It also found that values varied between the different types of texts. While all the sources agreed that killing people is extremely negative, loving your parents was viewed more positively in books and religious texts than in the news.

That textual analysis sounds like a much safer use of AI than letting it make moral choices, such as who a self-driving car should hit when a crash is unavoidable. For now, I’d prefer to leave those to a human with strong moral values — whatever they might be.

Story by Thomas Macaulay

Senior reporter

Thomas is a senior reporter at TNW. He covers European tech, with a focus on deeptech, startups, and government policy.
