This article was published on December 17, 2020

Google’s new open-source AI model understands Indic languages better

Story by Ivan Mehta

Ivan covers Big Tech, India, policy, AI, security, platforms, and apps for TNW. That's one heck of a mixed bag. He likes to say "Bleh."

Google’s various products, such as Search and Assistant, are already available in India in multiple local languages. The company is now turning to a new AI to potentially make more of its offerings accessible to Indic language speakers — more specifically, it’s using a technology called MuRIL.

At its virtual event today, the Big G unveiled a new language model called Multilingual Representations for Indian Languages (MuRIL). This is the first model to support interoperation between 16 different Indic languages.  

The 16 are Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Sindhi, Tamil, Telugu, and Urdu. The model also covers English.

While MuRIL is based on Google’s own BERT (Bidirectional Encoder Representations from Transformers) model, researchers claim it’s more efficient for Indian languages.

Partha Talukdar, a researcher at Google India, said that the new model understands the context of statements in local languages better. 

For example, the previous model read the following Hindi statement as expressing a negative emotion: “Accha hua account bandh ho gaya” (It’s good that the account got closed). The new model correctly predicts that the statement is positive.

Users in India often use their English language keyboard to type in local languages, as in the sentence above. To handle that, the researchers have included support for detecting transliterated text, where other languages are written in the Roman script.
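To make the Roman-script point concrete, here is a minimal sketch that reports which script a piece of text is mostly written in. It is an illustration only and not how MuRIL works: the function name `dominant_script` is hypothetical, and it relies on nothing more than the Unicode character names in Python's standard library.

```python
import unicodedata


def dominant_script(text: str) -> str:
    """Return the most common script among the letters in `text`.

    Assumption for this sketch: the first word of a character's Unicode
    name (e.g. "LATIN", "DEVANAGARI") is a good enough script label.
    """
    counts = {}
    for ch in text:
        if ch.isalpha():  # skip spaces, digits, punctuation, combining marks
            script = unicodedata.name(ch, "UNKNOWN").split()[0]
            counts[script] = counts.get(script, 0) + 1
    # "NONE" when the text contains no letters at all
    return max(counts, key=counts.get) if counts else "NONE"


# The Hindi sentence from the article, typed on an English keyboard
print(dominant_script("Accha hua account bandh ho gaya"))  # LATIN
```

A check like this only says which alphabet was used. Working out that Roman-script text is actually Hindi, and what it means, is the harder problem the language model is meant to address.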

Google is making this model open-source for other researchers and startups to use.

Currently, MuRIL is not embedded in any of Google’s products. However, based on input from researchers and programmers, the company aims to build the model into its offerings in the future for better accuracy.

You can learn more and check out MuRIL’s code here.
