At just 2.2MB, Google’s new speech filtering tech is perfect for mobile apps

Google has plenty of apps on your phone that use speech detection, from Google Assistant to Google Translate and Pixel’s nifty Recorder app. However, one of the challenges these apps face is separating your voice from other people’s voices or background noise.

To overcome these challenges, Google’s AI team has built a new lightweight model called VoiceFilter-Lite. In 2018, the team unveiled the first VoiceFilter model, which used the company’s Voice Match tech. Voice Match is used in Google Assistant to analyze your speech and sound when you enroll for a service.

A lot of the time, separating voices well (technically, achieving a better source-to-distortion ratio, or SDR) takes a large model, plenty of CPU power, and a heavy toll on the battery.
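For a rough sense of what SDR measures, here is a minimal sketch in Python. It uses a simplified definition (energy of the clean signal divided by energy of the leftover error, in decibels), which is an assumption for illustration and not necessarily the exact metric Google's researchers report. The function name `sdr_db` and the sample values are made up.

```python
import math


def sdr_db(clean, estimate):
    """Simplified source-to-distortion ratio in decibels.

    Assumption: SDR = 10 * log10(||clean||^2 / ||clean - estimate||^2).
    Research toolkits often use more elaborate variants.
    """
    if len(clean) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(s * s for s in clean)
    error_energy = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    if signal_energy == 0:
        raise ValueError("clean signal must not be silent")
    if error_energy == 0:
        return math.inf  # a perfect estimate has no distortion at all
    return 10 * math.log10(signal_energy / error_energy)


# Made-up samples: the estimate is the clean signal scaled down by 10%.
clean = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9, 0.0, -0.9, 0.0]
print(round(sdr_db(clean, estimate), 2))  # 20.0
```

A higher number means the filtered audio is closer to the target speaker's clean voice.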

That’s why the Google team came up with VoiceFilter-Lite, a model that weighs in at just 2.2MB, making it suitable for lightweight mobile applications.


The model uses a user’s already enrolled voice to improve recognition even when speech overlaps. Google claims the model improves word error rate (WER) by 25%. WER is a ratio that measures how many words a model gets wrong when its output is compared against a reference sentence.
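To make WER concrete, here is a small sketch using the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference. The function name and the example sentences are made up for illustration and are not taken from Google's evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from output
                curr[j - 1] + 1,     # extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word ("the") and one wrong word ("light") out of five.
print(word_error_rate("turn on the kitchen lights", "turn on kitchen light"))  # 0.4
```

Lower is better: a WER of 0 means every word matched the reference.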

An advantage of this model is that it sits apart from your speech recognition model rather than being built into it. So, if a speaker’s voice was not enrolled previously, your app can bypass VoiceFilter-Lite and carry on recognizing commands. This also helps if an enrolled user wants to issue some commands to a digital assistant in incognito mode.
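In app code, that bypass could look something like the sketch below. Everything here is hypothetical: `voice_filter`, `recognizer`, `speaker_embedding`, and the `incognito` flag are stand-ins invented for illustration, not a real Google API.

```python
from typing import Callable, Optional, Sequence

Features = Sequence[float]


def recognize(
    features: Features,
    speaker_embedding: Optional[Features],
    voice_filter: Callable[[Features, Features], Features],
    recognizer: Callable[[Features], str],
    incognito: bool = False,
) -> str:
    """Run the filter only when an enrolled voice is available and wanted."""
    if speaker_embedding is not None and not incognito:
        # Enrolled user: clean up the input before recognition.
        features = voice_filter(features, speaker_embedding)
    # Not enrolled, or incognito: go straight to the recognizer.
    return recognizer(features)


# Toy stand-ins so the sketch runs end to end.
def toy_filter(features, embedding):
    return [value * 0.5 for value in features]


def toy_recognizer(features):
    return f"sum={sum(features)}"


print(recognize([1.0, 1.0], None, toy_filter, toy_recognizer))   # sum=2.0
print(recognize([1.0, 1.0], [0.3], toy_filter, toy_recognizer))  # sum=1.0
```

Keeping the filter as an optional step in front of the recognizer means the recognizer itself never has to change.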

As next steps, the researchers will try to apply this model to languages other than English. Plus, they want to improve direct speech recognition so the model can be used for more than picking out voices from overlapping speech.

You can read more about VoiceFilter-Lite here

Story by Ivan Mehta

Ivan covers Big Tech, India, policy, AI, security, platforms, and apps for TNW. That's one heck of a mixed bag. He likes to say "Bleh."
