Millions of people around the world rely on Wikipedia for free information on just about every topic under the sun, in nearly 300 languages. But the encyclopedia is far less accessible to people with vision impairments and learning disabilities.
To address this, the online encyclopedia is teaming up with Sweden’s KTH Royal Institute of Technology to develop an open-source speech synthesis engine that will read out articles with more natural pronunciation than traditional text-to-speech tools.
The idea is to create a way to make content on Wikipedia and any other site built using the MediaWiki platform easier to consume, whether you’re blind, have dyslexia or simply find audio more convenient than reading from a screen.
In addition, the engine will be designed to work with a crowdsourced lexicon of pronunciations of words and syllables. This means it can be expanded to work with any number of languages and it can be improved upon by the community, similar to how Wikipedia entries are updated and corrected by volunteers.
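To illustrate the idea (a hypothetical sketch only; the project's actual data format and tooling aren't specified here), a pronunciation lexicon can be thought of as a community-editable mapping from words to phonetic transcriptions:

```python
# Hypothetical sketch of a crowdsourced pronunciation lexicon.
# This only illustrates the concept of volunteer-maintained
# word-to-pronunciation entries, not the project's real format.

lexicon = {
    "wikipedia": "w ɪ k ɪ ˈp iː d i ə",   # approximate IPA-style transcription
    "sweden": "ˈs w iː d ə n",
}

def pronounce(word: str) -> str:
    """Return a stored pronunciation, or fall back to spelling the word out."""
    return lexicon.get(word.lower(), " ".join(word.lower()))

# A volunteer "contribution" is simply a new or corrected entry:
lexicon["dyslexia"] = "d ɪ s ˈl ɛ k s i ə"

print(pronounce("Wikipedia"))
print(pronounce("dyslexia"))
```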
High-quality text-to-speech tools don’t exist for every language, so Wikipedia’s engine could help fill those gaps. According to Wikimedia Sweden, about 125 million people per month need or prefer to consume text in spoken form.
The team behind the project hopes to have the engine ready with support for English, Swedish and Arabic by September 2017. It will then seek the assistance of the Wikipedia community to add the other 285 languages in which the encyclopedia is available.
In addition, KTH will back the effort with a co-project on improved intonation modelling in speech synthesis for more natural-sounding audio.
You can learn more about the project’s progress and find its complete pilot study on this Wiki page.
➤ KTH and Wikipedia develop first crowdsourced speech engine [KTH Royal Institute of Technology]