How to Use ElevenLabs AI to Clone Your Voice & Generate Natural Speech from Text

In 2024, it is hard to discuss technology without mentioning generative AI. Be it AI coding tools, local LLMs, or AI image generators, the technology has made its way into almost everything, and companies are quickly adapting to it. ElevenLabs is one such company, specializing in AI-based speech synthesis and voice cloning. You can use ElevenLabs AI to generate natural-sounding speech from text and to clone your voice from a recorded sample. This tutorial shows how to do both.

How to Use ElevenLabs AI to Generate Natural Speech from Text

ElevenLabs is free to use for individual users. Under the free tier, you can generate speech from up to 10,000 characters of text per month. You can also generate speech in multiple languages and accents. That said, here’s how this AI tool works:

1. Head to the ElevenLabs website and click “Sign up” to create a free account.

2. After signing up, you will land on the Speech Synthesis page. Here, in Settings, you can preview different voices and choose your preferred voice.

3. You can also choose the audio model right below. If your text is in English, choose “Eleven Monolingual v1”. If your text is in another language, choose “Eleven Multilingual v1”, which supports English, German, Hindi, Spanish, Italian, French, Portuguese, and Polish.

4. Finally, enter the text below and click on “Generate” to have ElevenLabs AI convert your text to speech.

5. Here, I have generated the speech from a sample text in Sam’s voice. You can click on the “Download” button to get the generated speech in MP3 format.

6. You can also enter text in a different language, and ElevenLabs will generate speech in that language. Make sure to select the “Multilingual” model from the drop-down menu first.

7. Apart from cloning your own voice, which is demonstrated in the next section, you can add ready-made voices in different accents from the Voice Library.

8. Click on “Add to VoiceLab” next to your preferred voice. For example, here I am adding a young male voice with a British accent.

9. Now, simply select the voice from the drop-down menu and generate the speech. You are done.
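
The model choice in step 3 and the free-tier limit mentioned earlier can be sketched as a small pre-flight check before you paste text into the page. This is only an illustration, not part of ElevenLabs itself: the function names `pick_model` and `remaining_chars` are hypothetical, the language list is the one given in this article, and the character count assumes one Python character per billed character, which may not match how ElevenLabs actually counts.

```python
FREE_TIER_MONTHLY_CHARS = 10_000  # free-tier limit quoted in this article

# Languages this article lists for the multilingual model.
MULTILINGUAL_LANGUAGES = {
    "english", "german", "hindi", "spanish",
    "italian", "french", "portuguese", "polish",
}


def pick_model(language: str) -> str:
    """Return the model name to select in the drop-down (hypothetical helper)."""
    lang = language.strip().lower()
    if lang == "english":
        # English-only text: the article suggests the monolingual model.
        return "Eleven Monolingual v1"
    if lang in MULTILINGUAL_LANGUAGES:
        return "Eleven Multilingual v1"
    raise ValueError(f"{language!r} is not in the article's language list")


def remaining_chars(text: str, used_this_month: int = 0) -> int:
    """Characters left in the free tier after generating `text`.

    A negative result means the text does not fit in this month's quota.
    Assumption: each Python character counts as one billed character.
    """
    return FREE_TIER_MONTHLY_CHARS - used_this_month - len(text)


if __name__ == "__main__":
    print(pick_model("Hindi"))
    print(remaining_chars("Hello from ElevenLabs.", used_this_month=9_000))
```

If `remaining_chars` comes back negative, trimming the text or splitting it across months is the only option on the free tier.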

How to Use ElevenLabs AI to Clone Your Voice

You might have already seen people on Instagram and TikTok using voice cloning to get prominent figures like Obama, Drake, and many others to say random things. ElevenLabs used to offer Voice Cloning for free, but you now need to pay $5 to create up to 10 custom voices. In case you don’t want to pay, you can use PlayHT to clone your voice for free. In this article, I am going to use ElevenLabs AI to clone my voice.

1. To clone your voice with ElevenLabs AI, click on “Voice Lab” at the top. After that, click on Add Generative or Cloned Voice.

2. Next, click on Instant Voice Cloning.

3. Here, give a name to your voice. After that, upload your recorded audio. Make sure the recording doesn’t have loud background noise. It’s recommended to upload at least 5 minutes of audio for better speech synthesis. Write a description below and click the “Add Voice” button.

4. After a few seconds, your voice will be cloned and ready to use. Click on “Use” to convert text to speech using it right away.

5. Here, make sure your voice is selected in the drop-down menu. Now, add your text, and click on Generate. It will take a few seconds to synthesize your speech and generate audio. You can now download the audio as well.
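
Before uploading a recording in step 3, it can help to confirm that it meets the 5-minute recommendation. The sketch below is a local check only and does not talk to ElevenLabs. It assumes the recording is an uncompressed WAV file, since Python's standard `wave` module cannot read MP3 or other formats, and the helper names `wav_duration_seconds` and `is_long_enough` are made up for this example.

```python
import wave

MIN_SECONDS = 5 * 60  # this article recommends at least 5 minutes of audio


def wav_duration_seconds(path: str) -> float:
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_SECONDS) -> bool:
    """True if the recording meets the recommended minimum length."""
    return wav_duration_seconds(path) >= min_seconds
```

For example, `is_long_enough("my_voice.wav")` returns `False` for a 3-minute recording, where `my_voice.wav` is a placeholder file name. Length is only one factor, so the check says nothing about background noise.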

In my opinion, ElevenLabs AI didn’t do a good job of cloning my voice, even though I uploaded a 5-minute audio file. Perhaps I need to add more audio samples and train the model again. Or it could be because the AI model delivers its output in English (US) instead of localizing the accent for India. Also, my audio sample had some background noise, which may have reduced the quality. Nevertheless, it’s an exciting AI project, and we will keep track of all the new advancements in speech synthesis.
