Apple fixes Siri’s robotic voice with deep learning

Ahead of the launch of iOS 11 this fall, Apple has published a research paper detailing its methods for improving Siri to make the voice assistant sound more natural, with the help of machine learning.

Beyond capturing several hours of high-quality audio that can be sliced and diced to create voice responses, developers face the challenge of getting the prosody – the patterns of stress and intonation in spoken language – just right. The problem is compounded by how heavily these processes can tax a processor: straightforward methods of stringing sounds together would be too much for a phone to handle.

That’s where machine learning comes in. With enough training data, it can help a text-to-speech system understand how to select segments of audio that pair well together to create natural-sounding responses.
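To make the idea of picking segments that "pair well together" concrete, here is a minimal sketch in Python. It is a toy illustration, not Apple's implementation: each candidate audio segment is reduced to a single made-up pitch number, the cost functions are simple absolute differences, and the names `select_units` and `join_weight` are hypothetical. In a real system, a trained model would supply far richer costs; here a dynamic-programming search just finds the sequence of segments that best matches the desired pitch while keeping the joins between neighbours smooth.

```python
from typing import List, Sequence, Tuple


def select_units(
    candidates: Sequence[Sequence[float]],
    targets: Sequence[float],
    join_weight: float = 1.0,
) -> Tuple[List[int], float]:
    """Pick one candidate per position, minimising target cost plus join cost.

    candidates[i] lists the (hypothetical) pitch values of the recorded
    segments available for position i; targets[i] is the pitch we want there.
    Returns the chosen candidate index per position and the total cost.
    """
    if not candidates or len(candidates) != len(targets):
        raise ValueError("candidates and targets must be non-empty and equal length")
    if any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[i][j]: lowest total cost of any path ending in candidate j at position i
    best = [[abs(c - targets[0]) for c in candidates[0]]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor of candidate j at position i

    for i in range(1, len(candidates)):
        row, ptr = [], []
        for cur in candidates[i]:
            target_cost = abs(cur - targets[i])
            # Cost of arriving from each previous candidate, including the join penalty
            arrive = [
                best[i - 1][k] + join_weight * abs(prev - cur)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_best = min(range(len(arrive)), key=arrive.__getitem__)
            row.append(arrive[k_best] + target_cost)
            ptr.append(k_best)
        best.append(row)
        back.append(ptr)

    # Trace the cheapest path backwards from the final position
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return path, total


if __name__ == "__main__":
    # Made-up pitch values (Hz) for three positions in an utterance
    candidates = [[100.0, 120.0], [105.0, 200.0], [110.0, 90.0]]
    targets = [120.0, 110.0, 110.0]
    print(select_units(candidates, targets))
```

Raising `join_weight` makes the search favour smooth transitions over matching each target exactly, which is the kind of trade-off a learned model would tune from training data rather than from a hand-set constant.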

For iOS 11, the engineers at Apple worked with a new female voice actor to record 20 hours of speech in US English and generate between 1 and 2 million audio segments, which were then used to train a deep learning system. The team noted in its paper that test subjects greatly preferred the new version over the old one found in iOS 9 from back in 2015.

The results speak for themselves (ba dum tiss): Siri’s navigation instructions, responses to trivia questions and ‘request completed’ notifications sound a lot less robotic than they did two years ago. You can hear them for yourself at the end of this paper from Apple.

That’s another neat treat to look forward to in iOS 11. If you’re keen to check out all the cool features that are coming soon, you can install the beta and try them out right away.

Story by Abhimanyu Ghoshal

Managing Editor

Abhimanyu is TNW's Managing Editor, and is all about personal devices, Asia's tech ecosystem, as well as the intersection of technology and culture.
