OpenAI’s new voice synthesizer can copy your voice from just 15 seconds of audio

OpenAI has rapidly developed its generative AI chatbot ChatGPT and its Sora AI video maker over the past year, and now has a new artificial intelligence tool to show off: Voice Engine, which can create synthetic voices from just 15 seconds of audio.

In a blog post (via The Verge), OpenAI says it is running “a small-scale preview” of Voice Engine, which has been in development since late 2022. It’s actually already used in the Read Aloud feature in the ChatGPT app, which (as the name suggests) reads replies out loud to you.

Once the voice has been trained on a 15-second sample, you can have it read any text you want in an “emotional and realistic” way. OpenAI suggests it could be used for educational purposes, for translating podcasts into new languages, for reaching remote communities, and for supporting people who are non-verbal.

This isn’t something everyone can use right now, but you can go and listen to the sample clips created by Voice Engine. The clips that OpenAI has published sound quite impressive, although there’s a slightly robotic and stilted edge to them.

Safety first

Voice Engine is already used in ChatGPT’s Read Aloud feature (Image credit: OpenAI)

Concerns about misuse are the main reason why Voice Engine is only in a limited preview for now: OpenAI says it wants to do more research into how it can protect tools like this from being used to spread disinformation and clone voices without permission.

“We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities,” says OpenAI. “Based on these conversations and the results of these small-scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

With major elections taking place in both the US and the UK this year, and generative AI tools becoming increasingly sophisticated, this is a concern across every type of AI-generated content: audio, text, and video. It’s becoming ever more difficult to know what to trust.

As OpenAI itself points out, this kind of technology could undermine voice-based authentication measures and enable scams where you can’t be sure who you’re talking to on the phone, or who has left you a voicemail. These aren’t easy problems to solve, but we’re going to have to find ways to deal with them.
