Eerie tech promises to copy anyone’s voice from just 1 minute of audio

I’m not sure how I feel about the upcoming launch of Montreal-based Lyrebird’s new service. The company says its API will let you synthesize speech in anyone’s voice from just a minute-long recording – which means you could, for instance, generate a clip of President Trump declaring war on Canada.

Lyrebird has posted some audio examples that sound pretty convincing (listen below, and find more on this page). The company says the speaker doesn’t need to have said the words you want the generated audio to contain, and that it’ll also be able to create different intonations.

If any of this sounds familiar, it might be because you’re thinking of Adobe’s demo of similar tech last November. But while Adobe’s Project VoCo requires 20 minutes of audio and appears to run speech synthesis on the user’s own machine, Lyrebird needs only a minute-long recording and says it’s close to launching a cloud-based API to process audio and spit out results.

As I wrote when we covered Project VoCo last year, it’s likely that such software will lead to the creation and distribution of plenty of misleading information that people might believe to be genuine.

On its Ethics page, Lyrebird says that its technology “questions the validity of such evidence as it allows to easily manipulate audio recordings.” It added:

By releasing our technology publicly and making it available to anyone, we want to ensure that there will be no such risks. We hope that everyone will soon be aware that such technology exists and that copying the voice of someone else is possible. More generally, we want to raise attention about the lack of evidence that audio recordings may represent in the near future.

Lyrebird might be on to something there: the widespread availability of image manipulation tools has led to people questioning the veracity of photographs that are circulated in the press and on the web, as well as the integrity of their sources. But there’s still a huge risk of people falling prey to scams and misinformation through tampered audio.

And we’re not just talking about copying the voices of world leaders: people could be duped into handing over sensitive data when they think they’re speaking with a significant other or a family member, and company employees could find themselves following counter-productive orders from someone on the phone who happens to sound an awful lot like their boss.

We’ve contacted Lyrebird to learn more and will update this post if there’s a response.

Story by Abhimanyu Ghoshal

Managing Editor

Abhimanyu is TNW's Managing Editor, and is all about personal devices, Asia's tech ecosystem, as well as the intersection of technology and culture. Hit him up on Twitter, or write in: abhimanyu@thenextweb.com.
