How to Use ElevenLabs AI to Clone Your Voice & Generate Natural Speech from Text

In 2024, it’s hard to discuss technology without talking about Generative AI. Be it AI coding tools, local LLMs, or AI image generators, the technology has found its way into almost everything, and companies are quickly adapting to it. ElevenLabs is one such company, specializing in AI-based speech synthesis and voice cloning. You can use ElevenLabs AI to generate natural-sounding speech from text and to clone your own voice. This tutorial walks through both.

How to Use ElevenLabs AI to Generate Natural Speech from Text

ElevenLabs is free to use for individual users. Under the free tier, you can convert up to 10,000 characters of text to speech per month. You can also generate speech in multiple languages and accents. That said, here’s how this AI tool works:

1. Head to the ElevenLabs website and click “Sign up” to create a free account.

2. After signing up, you will land on the Speech Synthesis page. Here, in Settings, you can preview different voices and choose your preferred voice.

3. You can also choose the audio model right below. If your text is in English only, choose “Eleven Monolingual v1”. If your text is in any of the other supported languages (German, Hindi, Spanish, Italian, French, Portuguese, or Polish) or mixes them with English, choose “Eleven Multilingual v1” here.

4. Finally, enter the text below and click on “Generate” to have ElevenLabs AI convert your text to speech.

5. Here, I have generated the speech from a sample text in Sam’s voice. You can click on the “Download” button to get the generated speech in MP3 format.

https://beebom.com/wp-content/uploads/2023/06/ElevenLabs_2023-06-20T14_58_41.000Z_Sam_gSUzRTKklsxiS4gDGuar.mp3

6. You can also enter text in a different language, and ElevenLabs will generate speech in that language. Make sure to select the “Multilingual” model from the drop-down menu.

https://beebom.com/wp-content/uploads/2023/06/synthesized_audio.mp3

7. You can also clone your own voice using ElevenLabs AI, which is demonstrated in the next section. If you would rather use a ready-made voice, there is a library of voice samples in different accents that you can add from Voice Library.

8. Click on “Add to VoiceLab” next to your preferred voice. For example, here I am adding a young male voice in a British accent.

9. Now, simply select the voice from the drop-down menu and generate the speech. You are done.

https://beebom.com/wp-content/uploads/2023/06/synthesized_audio-1.mp3
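The steps above use the web interface only. For readers who would rather script the same flow, here is a minimal sketch in Python. It is not part of the original walkthrough and rests on several assumptions: that ElevenLabs exposes a REST endpoint at `https://api.elevenlabs.io/v1/text-to-speech/{voice_id}`, that it authenticates with an `xi-api-key` header, and that the two models from step 3 are identified as `eleven_monolingual_v1` and `eleven_multilingual_v1`. Check all of these against the current API documentation before relying on them. The function names are hypothetical, and the quota check assumes usage is counted as plain text length.

```python
import json
import urllib.request

# Free-tier limit stated in this article: 10,000 characters per month.
FREE_TIER_MONTHLY_CHARS = 10_000

# ASSUMPTION: API identifiers for the two models named in step 3.
MONOLINGUAL = "eleven_monolingual_v1"
MULTILINGUAL = "eleven_multilingual_v1"

# Languages listed in step 3 for the multilingual model.
MULTILINGUAL_LANGUAGES = {
    "english", "german", "hindi", "spanish",
    "italian", "french", "portuguese", "polish",
}


def choose_model(language: str) -> str:
    """Mirror step 3: English-only text gets the monolingual model."""
    lang = language.strip().lower()
    if lang == "english":
        return MONOLINGUAL
    if lang in MULTILINGUAL_LANGUAGES:
        return MULTILINGUAL
    raise ValueError(f"{language!r} is not in the article's language list")


def remaining_characters(text: str, used_this_month: int = 0) -> int:
    """Characters left in the free tier after `text` (negative if over).

    ASSUMPTION: usage is counted as len(text).
    """
    return FREE_TIER_MONTHLY_CHARS - used_this_month - len(text)


def build_tts_request(text: str, voice_id: str, language: str,
                      api_key: str) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request.

    ASSUMPTION: endpoint URL, header name, and JSON field names.
    """
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    body = json.dumps({"text": text, "model_id": choose_model(language)})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
    )
```

If those assumptions hold, passing the returned request to `urllib.request.urlopen` and writing the response bytes to a `.mp3` file would be the scripted equivalent of the “Generate” and “Download” buttons. Building the request separately from sending it makes it easy to check the character budget first, so a long text does not use up the monthly quota by accident.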

How to Use ElevenLabs AI to Clone Your Voice

You might have already seen people on Instagram and TikTok using voice cloning to get prominent figures like Obama, Drake, and many others to say random things. ElevenLabs used to offer Voice Cloning for free, but you now need to pay $5 to create up to 10 custom voices. In case you don’t want to pay, you can use PlayHT to clone your voice for free. In this article, I am going to use ElevenLabs AI to clone my voice.

1. To clone your voice with ElevenLabs AI, click on “Voice Lab” at the top. After that, click on Add Generative or Cloned Voice.

2. Next, click on Instant Voice Cloning.

3. Here, give a name to your voice. After that, upload your recorded audio. Make sure the recording doesn’t have loud background noise. It’s recommended to upload at least 5 minutes of audio for better speech synthesis. Write a description below and click the “Add Voice” button.

4. After a few seconds, your voice will be cloned and ready to use. Click on “Use” to convert text to speech using it right away.

5. Here, make sure your voice is selected in the drop-down menu. Now, add your text, and click on Generate. It will take a few seconds to synthesize your speech and generate audio. You can now download the audio as well.

https://beebom.com/wp-content/uploads/2023/06/ElevenLabs_2023-06-20T15_00_45.000Z_Arjun_Eyhw5AVkoxbacpKkDS7K.mp3
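Step 3 recommends uploading at least 5 minutes of audio. Before uploading, you can check the length of a recording locally. The following is a small sketch, assuming the recording is an uncompressed WAV file (Python’s standard `wave` module does not read MP3). The function names and the threshold constant are illustrative, not anything ElevenLabs provides.

```python
import wave

# Recommendation from step 3: at least 5 minutes of audio.
RECOMMENDED_SECONDS = 5 * 60


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, minimum: float = RECOMMENDED_SECONDS) -> bool:
    """True if the recording meets the recommended minimum length."""
    return wav_duration_seconds(path) >= minimum
```

For example, `is_long_enough("my_voice.wav")` (a hypothetical file name) returns `False` for a recording shorter than five minutes. Note that this only checks length; it says nothing about background noise, which step 3 also asks you to avoid.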

In my opinion, ElevenLabs AI didn’t do a good job of cloning my voice, even though I uploaded a 5-minute audio file. Perhaps I need to add more audio samples and train the model again. Or it could be because the AI model delivers the output in English (US) instead of localizing the accent for India. Also, my audio sample had some background noise, which may have reduced the quality. Nevertheless, it’s an exciting AI project, and we will keep track of all the new advancements in speech synthesis.
