
This article was published on August 26, 2016


Stanford speech recognition study suggests you should give dictation apps a chance

Story by

Abhimanyu Ghoshal

Managing Editor

Abhimanyu is TNW's Managing Editor, and is all about personal devices, Asia's tech ecosystem, as well as the intersection of technology and culture. Hit him up on Twitter, or write in: [email protected].

Even though voice recognition is now available on more apps and devices than ever, I rarely find myself using the feature to dictate text or issue commands. I’ve certainly noticed improvements in the past few years, but it can still be awkward and inaccurate.

A new study conducted by researchers at Stanford University, the University of Washington and Chinese search firm Baidu suggests that I should probably try it more often. According to the paper (PDF), which was published earlier this week, speech recognition tools have been found to be three times faster than typing for English and Mandarin text entry.

The researchers hypothesized that speech recognition, which has come a long way in recent times thanks to advancements in big data analysis and deep learning, is more accurate and faster than humans typing on keyboards.

They tested this in a study with 38 participants, who used Baidu’s Deep Speech 2 system, as well as standard software keyboards on the iPhone. The test subjects’ performance in dictating and typing 120 phrases each in English and Mandarin was monitored and compared.

The study revealed that speech recognition was not only 3x faster than typing in English and 2.8x faster in Mandarin, but also produced error rates that were 20.4 percent and 63.4 percent lower, respectively.
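To make that arithmetic concrete, here is a minimal sketch of how a speed ratio and a relative error reduction can be calculated. The function names and input numbers are hypothetical and are not taken from the paper, and the sketch assumes the "percent lower" figures describe relative rather than absolute reductions.

```python
def speedup(speech_wpm: float, typing_wpm: float) -> float:
    """Return how many times faster speech input is than typing."""
    return speech_wpm / typing_wpm


def relative_reduction(typing_error_rate: float, speech_error_rate: float) -> float:
    """Return the percentage by which the speech error rate is lower than the typing one."""
    return (typing_error_rate - speech_error_rate) / typing_error_rate * 100


# Hypothetical inputs for illustration only; these are NOT the study's measurements.
print(speedup(speech_wpm=150.0, typing_wpm=50.0))
print(round(relative_reduction(typing_error_rate=0.05, speech_error_rate=0.04), 1))
```

Read this way, a "20 percent lower" error rate means the speech error rate is four-fifths of the typing error rate, not that it dropped by 20 percentage points.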

Not bad, huh? Those figures are significant enough to convince me to try using voice commands and dictation tools more often.

Of course, it’s worth noting that this test was conducted using Baidu’s latest speech recognition system – so your mileage with Google Now, Siri and Cortana will certainly vary.

Deep Speech 2 powers Baidu’s Duer personal assistant app and has found favor among Mandarin speakers, because typing in that language is time-consuming and not everyone is familiar with the Pinyin phonetic system used for transcribing on software keyboards.

Do you use speech recognition often? What’s your experience with it been lately? Let us know in the comments.