Eerie tech promises to copy anyone’s voice from just 1 minute of audio

I’m not sure how I feel about the upcoming launch of Montreal-based Lyrebird’s new service. The company says its API will let you synthesize speech in anyone’s voice from just a minute-long recording – which means you could, for instance, generate a clip of President Trump declaring war on Canada.

Lyrebird has posted some audio examples that sound pretty convincing. The company says the speaker doesn’t need to have said the words you want the generated voice to speak, and that it’ll also be able to create different intonations.

If any of this sounds familiar, it might be because you’re thinking of Adobe’s demo of similar tech last November. But while Adobe’s Project VoCo requires 20 minutes of audio and appears to use local system resources for speech synthesis, Lyrebird needs only a minute-long recording and says it’s close to launching a cloud-based API to process audio and spit out results.


As I wrote when we covered Project VoCo last year, it’s likely that such software will lead to the creation and distribution of plenty of misleading information that people might believe to be genuine.

On its Ethics page, Lyrebird says that its technology “questions the validity of such evidence as it allows to easily manipulate audio recordings.” It added:

“By releasing our technology publicly and making it available to anyone, we want to ensure that there will be no such risks. We hope that everyone will soon be aware that such technology exists and that copying the voice of someone else is possible. More generally, we want to raise attention about the lack of evidence that audio recordings may represent in the near future.”

Lyrebird might be on to something there: the widespread availability of image manipulation tools has led to people questioning the veracity of photographs that are circulated in the press and on the web, as well as the integrity of their sources. But there’s still a huge risk of people falling prey to scams and misinformation through tampered audio.

And we’re not just talking about copying the voices of world leaders: people could be duped into handing over sensitive data when they think they’re speaking with a significant other or a family member, and company employees could find themselves following counter-productive orders from someone on the phone who happens to sound an awful lot like their boss.

We’ve contacted Lyrebird to learn more and will update this post if there’s a response.

Story by Abhimanyu Ghoshal

Managing Editor

Abhimanyu is TNW's Managing Editor, and is all about personal devices, Asia's tech ecosystem, as well as the intersection of technology and culture. Hit him up on Twitter, or write in: [email protected].
