Microsoft’s AI can clone your voice after analyzing a 3-second audio clip

By William On Jan 11, 2023

Microsoft’s AI can clone your voice after analyzing a 3-second audio clip of you speaking – but scammers could use the technology to steal your voice

Microsoft has announced a new artificial intelligence tool that replicates voices
VALL-E was trained on 60,000 hours of English speech
The AI needs only three seconds of an audio clip to analyze how a person speaks
It breaks the sounds down into components, then uses its training data to find something similar and replicate the original sample

By Stacy Liberatore For Dailymail.com

Published: 19:30 GMT, 11 January 2023 | Updated: 19:48 GMT, 11 January 2023

Microsoft has developed artificial intelligence that clones a person’s voice perfectly after analyzing just three seconds of an audio clip of them speaking – but some fear it provides a tool for scammers to steal your voice.

Called VALL-E, the system could let a telephone scammer capture just three seconds of your voice and replicate it, including your emotional range and acoustic environment.

This would allow bad actors to bypass systems that use your voice as a password.

VALL-E is not available to the public and Microsoft has not revealed plans for when or if it will be.

While the AI sparks fear among some users, others see the technology as a way for people who have lost their voice to throat disease, ALS or an injury to regain their speech.

Microsoft has developed a new AI tool called VALL-E. It can clone a person’s voice just by listening to three seconds of an audio clip

However, some Twitter users have raised an important question – do you own the sound of your voice?

The Microsoft VALL-E team has addressed the ethics question with a statement: ‘The experiments in this work were carried out under the assumption that the user of the model is the target speaker and has been approved by the speaker.

‘However, when the model is generalized to unseen speakers, relevant components should be accompanied by speech editing models, including the protocol to ensure that the speaker agrees to execute the modification and the system to detect the edited speech.’

VALL-E was trained on 60,000 hours of English speech, and Microsoft claims it can replicate American, British and several European-sounding accents.

VALL-E can only turn written text into speech, but this is enough for someone to use the technology to steal your voice and ‘put words in your mouth.’

Microsoft has not yet released it to the public, but the company has high hopes for its AI – it is poised to revolutionize how we hear audiobooks and smart assistants.

The creators of VALL-E said the AI tool is designed for high-quality text-to-speech applications.

This includes editing speech in a recording of a person – such as an audiobook.

VALL-E analyzes how the person in the audio clip sounds, breaks that information into different components, then uses its training data to find something similar and combines the two.

The AI is making waves on Twitter where it has received mixed opinions.

One user said VALL-E does not have any use except for scam and impersonation purposes, while another is hopeful it will be a game changer for people who have lost their speech.

Another Twitter user said this would have been great for the late Stephen Hawking, who lost his voice and spoke through a computer-generated one.

Several people highlighted that VALL-E spells terrible news for voice-over actors.

‘Now they are going after voice actors, who’s next,’ a user named ‘Gabriel’ tweeted.
