Microsoft has unveiled an AI system called Speller100 that corrects spelling in over 100 languages used in search queries on Bing.
“We believe Speller100 is the most comprehensive spelling correction system ever made in terms of language coverage and accuracy,” the company said in a blog post.
Bing previously provided high-quality spelling corrections for around two dozen languages. However, it didn’t have enough training data to work well on languages with little web presence and user feedback.
Speller100 overcomes these limitations by exploiting similarities between languages that belong to the same large language family.
It also applies zero-shot learning to correct errors without needing extra language-specific labeled training data.
Microsoft said it built around a dozen language family-based models to maximize the zero-shot benefit:
“Imagine someone had taught you how to spell in English and you automatically learned to also spell in German, Dutch, Afrikaans, Scots, and Luxembourgish. That is what zero-shot learning enables, and it is a key component in Speller100 that allows us to expand to languages with very little to no data.”
The system also reduces the need for human-labeled annotations by extracting text from web pages to generate common errors.
“This text can easily be extracted through web crawling, and there is a sufficient amount of text for the training of hundreds of languages,” Microsoft said.
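To make the idea concrete, here is a minimal Python sketch of how crawled text could be turned into training pairs by injecting synthetic typos. It is an illustration only, not Microsoft's implementation: the function names (`add_noise`, `make_training_pairs`), the four edit operations, the default Latin alphabet, and the `error_rate` parameter are all assumptions made for this example.

```python
import random
import string


def add_noise(word, rng, alphabet=string.ascii_lowercase):
    """Apply one random character-level edit to `word`.

    Hypothetical noise function: the edit types and the default
    alphabet are assumptions, not details from the article.
    """
    if len(word) < 2:
        return word  # too short to corrupt meaningfully
    op = rng.choice(["delete", "insert", "replace", "swap"])
    i = rng.randrange(len(word) - 1)  # guarantees that i + 1 is a valid index
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + rng.choice(alphabet) + word[i:]
    if op == "replace":
        # May pick the same character, leaving the word unchanged.
        return word[:i] + rng.choice(alphabet) + word[i + 1:]
    # swap: transpose two adjacent characters
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def make_training_pairs(sentences, seed=0, error_rate=0.3):
    """Return (noisy, clean) pairs built from clean sentences."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    pairs = []
    for clean in sentences:
        noisy_words = [
            add_noise(w, rng) if rng.random() < error_rate else w
            for w in clean.split()
        ]
        pairs.append((" ".join(noisy_words), clean))
    return pairs


if __name__ == "__main__":
    for noisy, clean in make_training_pairs(["the quick brown fox"]):
        print(noisy, "->", clean)
```

A spelling model would then be trained to map the noisy side of each pair back to the clean side. Because the clean text comes from the web, no human has to label the errors.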
In tests, Speller100 reduced the number of pages with no results by up to 30%. It also increased the rate at which users clicked on spelling suggestions from single digits to 67%.
Microsoft said shipping the system to Bing is just the first step. The company plans to add the tech to “many more” of its products in the near future.