Microsoft’s speech recognition is now just as accurate as humans

Robots are now just as good at transcribing speech as humans.

In a paper published yesterday, a team of Microsoft engineers in the Artificial Intelligence and Research division reported that their system reached a word error rate (WER) of 5.9 percent, roughly equal to that of humans.
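For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition with a word-level edit distance. It is not Microsoft's scoring code, and the function name and sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word out of nine.
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"
print(f"{word_error_rate(reference, hypothesis):.1%}")  # prints 22.2%
```

By this definition, a WER of 5.9 percent means roughly six word errors for every hundred words in the reference transcript.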

“We’ve reached human parity,” said Xuedong Huang, the company’s chief speech scientist. “This is a historic achievement.”

The milestone, decades in the making, comes on the heels of last month’s ‘close but no cigar’ WER of 6.3 percent and figures to have wide-reaching implications as the battle for digital assistant supremacy heats up. Cortana, Xbox, and Windows could see the biggest initial impact.

To achieve these levels of accuracy, researchers trained deep neural networks on large amounts of data, called training sets, which helped the systems learn to recognize patterns in human input. Sounds and images were both used to train the network to make better use of what it had learned.

Researchers want to be clear that parity is far from perfection. In this case, it just means the system is as good as humans, and we’re far from flawless.

Moving forward, the team hopes to achieve even higher levels of accuracy and to ensure that speech recognition works better in real-world conditions, such as noisy restaurants, crowded streets, and powerful winds. In the future, the team dreams of a system that will not just recognize speech, but truly understand it.

We’re still a ways off, but the future is a world where we no longer have to understand computers; they have to understand us.

Story by Bryan Clark

Former Managing Editor, TNW

Bryan is a freelance journalist.
