How Google uses lots of data about your searches and the Web to improve Voice Search and YouTube

Google Research today announced new findings on how the search giant uses large amounts of data from the Web to improve its automatic speech recognition. With large language models, Google uses anonymized queries from Google.com to improve Voice Search and analyzes data from the Web in general to improve YouTube speech transcription. The full results are available in a seven-page paper titled “Large Scale Language Modeling in Automatic Speech Recognition” (PDF).

The abstract should give you a better taste of what this is about:

Large language models have been proven quite beneficial for a variety of automatic speech recognition tasks in Google. We summarize results on Voice Search and a few YouTube speech transcription tasks to highlight the impact that one can expect from increasing both the amount of training data, and the size of the language model estimated from such data. Depending on the task, availability and amount of training data used, language model size and amount of work and care put into integrating them in the lattice rescoring step we observe reductions in word error rate between 6% and 10% relative, for systems on a wide range of operating points between 17% and 52% word error rate.
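The word "relative" in the abstract is easy to misread, so here is a short arithmetic sketch of what a relative reduction in word error rate means. The helper name is hypothetical, and pairing the quoted endpoints (6% and 10% relative, 17% and 52% word error rate) with each other is purely illustrative; the abstract does not say which reduction was observed at which operating point.

```python
def apply_relative_reduction(baseline_wer: float, relative_reduction: float) -> float:
    """Return the word error rate after a relative reduction (both given as fractions)."""
    return baseline_wer * (1.0 - relative_reduction)


# Endpoints quoted in the abstract; the pairings below are illustrative only.
for baseline in (0.17, 0.52):
    for reduction in (0.06, 0.10):
        new_wer = apply_relative_reduction(baseline, reduction)
        print(f"{baseline:.0%} WER with a {reduction:.0%} relative reduction -> {new_wer:.2%}")
```

In other words, a 10% relative reduction on a system at 17% word error rate would bring it to about 15.3%, not to 7%.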

Speech recognizers use language models to assign probabilities to the words being said, based on the words that came before them. For example, if you’re saying “I’m going to go walk the…” then a good language model will assign a higher probability to the word “dog” than to the word “bog.”
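To make the “dog” versus “bog” example concrete, here is a minimal sketch of a count-based model that looks only at the previous two words. The four-sentence corpus is made up, and the add-one smoothing is an assumption chosen for simplicity; it is not a description of how Google builds its models.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, purely for illustration.
corpus = [
    "i am going to go walk the dog",
    "she likes to walk the dog",
    "we walk the dog every morning",
    "they got stuck in the bog",
]

# Count which word follows each two-word context.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(2, len(words)):
        context = (words[i - 2], words[i - 1])
        next_word_counts[context][words[i]] += 1


def next_word_probability(context, word, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(word | context); the smoothing is an assumption."""
    counts = next_word_counts[tuple(context)]
    return (counts[word] + alpha) / (sum(counts.values()) + alpha * vocab_size)


vocab = {w for s in corpus for w in s.split()}
p_dog = next_word_probability(("walk", "the"), "dog", len(vocab))
p_bog = next_word_probability(("walk", "the"), "bog", len(vocab))
print(f"P(dog | walk the) = {p_dog:.3f}")
print(f"P(bog | walk the) = {p_bog:.3f}")
```

Because “dog” follows “walk the” three times in this toy corpus and “bog” never does, the model scores “dog” higher, which is the behavior the example describes.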

Google uses an n-gram approach to language modeling (predicting the next word based on the previous n-1 words) because, the company says, it is well suited to large amounts of data: “it scales gracefully” as more data becomes available. As you can see in the graph above, both word error rate (a metric Google uses to measure speech recognition accuracy) and search error rate (a metric Google uses to evaluate speech recognition effectiveness for search) decrease significantly with larger language models.
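For readers unfamiliar with word error rate, here is a minimal sketch of the standard definition: the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The function name and the example sentences are illustrative, and Google's own scoring setup may differ in details such as text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between hypothesis and reference, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate(
    "i am going to go walk the dog",
    "i am going to go walk the bog",
)
print(f"Word error rate: {wer:.1%}")
```

Here one wrong word out of eight gives a word error rate of 12.5%.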

This is a perfect example of how Google improves its services by collecting data on you and the broader Web. As long as it stays anonymous, everyone wins.

Image credit: Petre Birlea

Story by Emil Protalinski

Emil was a reporter for The Next Web between 2012 and 2014. Over the years, he has covered the tech industry for multiple publications, including Ars Technica, Neowin, TechSpot, ZDNet, and CNET. Stay in touch via Facebook, Twitter, and Google+.
