ElevenLabs

How to Clone a Voice With AI Using ElevenLabs Eleven v3

ElevenLabs offers two voice cloning tiers that map directly to how much reference audio you have. Instant Voice Cloning trains on a 10-second sample and is ready in seconds — fine for internal narration drafts, prototype dubs, and personal video voiceover. Professional Voice Cloning needs 30+ minutes of clean studio audio, but the resulting voice can carry an entire course or audiobook without drifting. On Martini, both modes feed Eleven v3 (or Multilingual v2 for non-English work), so once your voice is registered you can generate new narration in 70+ languages with inline emotion tags. Critical: only clone voices you own or have explicit written permission to clone. ElevenLabs requires voice verification for your own voice, and consent matters whether the platform enforces it or not.


Step-by-Step Guide

Confirm rights and pick the cloning tier

Before recording or uploading, get explicit written consent from the voice owner. ElevenLabs requires voice verification when you clone your own voice, and Martini follows the same policy. Then decide between Instant Voice Cloning (IVC: 10-second sample, ready in seconds, good for drafts and short narration) and Professional Voice Cloning (PVC: 30+ minutes of clean studio audio, ready in about 24 hours, and the only mode suited to long-form courses, audiobooks, or branded narrator voices). If your sample is a 30-second phone recording, IVC is your only option, because PVC will reject low-quality input. If you control the recording session, plan for at least 30 minutes of varied scripted content with no background noise. That one-time effort buys you a voice that holds up across thousands of generations.
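The tier decision above can be sketched as a small function. This is only an illustration of the guide's rules, not anything ElevenLabs or Martini exposes: the `Sample` fields and the returned labels are hypothetical names, and the thresholds are the 10-second and 30-minute figures quoted in this guide.

```python
from dataclasses import dataclass

# Thresholds taken from the guide text, not from any official API.
IVC_MIN_SECONDS = 10
PVC_MIN_SECONDS = 30 * 60


@dataclass
class Sample:
    """Hypothetical description of a reference recording."""
    duration_seconds: float
    studio_quality: bool        # clean room, no noise, single speaker
    has_written_consent: bool   # or it is your own verified voice


def pick_tier(sample: Sample) -> str:
    """Return 'PVC', 'IVC', or a 'blocked: ...' reason for this sample."""
    # Consent comes first: no tier is acceptable without it.
    if not sample.has_written_consent:
        return "blocked: no written consent"
    # PVC needs both length and studio quality.
    if sample.studio_quality and sample.duration_seconds >= PVC_MIN_SECONDS:
        return "PVC"
    # Anything shorter or rougher falls back to IVC if it is long enough.
    if sample.duration_seconds >= IVC_MIN_SECONDS:
        return "IVC"
    return "blocked: sample shorter than 10 seconds"
```

For example, `pick_tier(Sample(30, False, True))` returns `"IVC"`, which matches the 30-second phone recording case described above.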

Record or upload a clean reference sample

For Instant Voice Cloning, record 10 seconds of conversational speech in a quiet room: no music bed, no AC hum, no second speaker. Speak the way you want the cloned voice to sound. A measured, narrator pace produces a measured cloned voice; an excited reading produces an excited clone. For Professional Voice Cloning, you need 30+ minutes of studio-grade recordings. Vary the content (narrative paragraphs, conversational lines, technical readings, emotional ranges) so the model captures your full delivery range. Convert all uploads to 44.1 kHz WAV or 320 kbps MP3. Hiss, room reverb, lip smacks, and breath pops in the source audio will train into the clone, and you cannot strip them out later.
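A quick pre-upload check can catch the mechanical problems (wrong sample rate, clip too short) before you spend a cloning attempt. The sketch below uses only Python's standard `wave` module, so it handles uncompressed PCM WAV files only, and it cannot detect hiss, reverb, or mouth noise. The function name `check_wav` and its defaults are illustrative assumptions based on the numbers in this guide.

```python
import wave
from typing import List


def check_wav(path: str, min_seconds: float = 10.0, want_rate: int = 44100) -> List[str]:
    """Return a list of human-readable problems; an empty list means the file passed."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        seconds = wav.getnframes() / rate
    # The guide asks for 44.1 kHz WAV uploads.
    if rate != want_rate:
        problems.append(f"sample rate is {rate} Hz, expected {want_rate} Hz")
    # 10 seconds is the IVC minimum quoted in the guide; pass 1800 for PVC.
    if seconds < min_seconds:
        problems.append(f"clip is {seconds:.1f} s, need at least {min_seconds:.0f} s")
    return problems
```

Call it as `check_wav("reference.wav")` for an IVC sample, or `check_wav("session.wav", min_seconds=1800)` for a PVC session. Listening back on headphones is still the only reliable test for noise.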

Register the voice and test on the canvas

Add an Audio node, select ElevenLabs Eleven v3 (or Multilingual v2 for non-English work), and pick your newly cloned voice from the voice picker. Generate a test passage of about 30 seconds that uses sounds and patterns your sample didn't cover: a word with a strong "s" sound, a question intonation, a number sequence, an exclamation. This is where IVC clones tend to fail and PVC clones hold up. If the IVC clone struggles on questions or numbers, that's the trade-off: re-record with more varied content or upgrade to PVC. Once the test passes, the cloned voice is reusable across every Audio node in the project, and you can wire it into Lipsync nodes (OmniHuman, Kling LipSync) for talking-head delivery.
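Before generating, it can help to confirm that the test passage really contains each of the stress cases listed above. The following is a rough text-level sketch, assuming English text; the check names and the function `coverage_gaps` are hypothetical, and simple punctuation and regex checks are only a stand-in for listening to the result.

```python
import re
from typing import List


def coverage_gaps(script: str) -> List[str]:
    """Return the stress cases from the guide that the test script does not contain."""
    checks = {
        # A question mark is used as a proxy for question intonation.
        "question": "?" in script,
        "exclamation": "!" in script,
        # At least two numbers separated by spaces or commas, e.g. "1, 2, 3".
        "number sequence": re.search(r"\d+(?:[\s,]+\d+)+", script) is not None,
        # Any word that starts with "s" as a simple sibilant check.
        "s word": re.search(r"\b[sS]\w+", script) is not None,
    }
    return [name for name, present in checks.items() if not present]
```

A passage such as `"Is the sample ready? Count 1, 2, 3. Seven silver swans swim!"` returns an empty list, meaning all four cases are present.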

Direct delivery with Eleven v3 inline tags

A cloned voice still needs direction. Eleven v3 understands inline tags like [whispers], [laughs], [sighs], [excited], [pause] placed near the words they should affect. For a course intro, "Hi everyone, [excited] welcome to module three!" produces noticeably warmer delivery than the same line without the tag. Keep tags sparse and local: three tags in a 60-second narration is plenty, while ten tags fight each other and produce inconsistent reads. Punctuation also drives pacing: ellipses create contemplative pauses, em dashes create sharp transitions, and short sentences read at a faster, more confident clip than long ones.
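The "about three tags per 60 seconds" guideline can be checked mechanically before you generate. This sketch counts bracketed tags and estimates narration length from the word count. The 150 words-per-minute speaking rate is an assumption for illustration, not a figure from ElevenLabs, and `tag_report` is a hypothetical helper name.

```python
import re

# Matches lowercase bracketed tags such as [excited] or [pause].
TAG_PATTERN = re.compile(r"\[([a-z ]+)\]")

# Assumed narration pace; adjust to match your own delivery.
WORDS_PER_MINUTE = 150


def tag_report(script: str) -> dict:
    """Summarize tag usage and flag scripts that exceed ~3 tags per minute."""
    tags = TAG_PATTERN.findall(script)
    # Remove tags so they do not count as spoken words.
    spoken = TAG_PATTERN.sub(" ", script)
    seconds = len(spoken.split()) / WORDS_PER_MINUTE * 60
    # Guide's rule of thumb: roughly three tags per 60 seconds, minimum one.
    budget = max(1, round(seconds / 60 * 3))
    return {
        "tags": tags,
        "estimated_seconds": round(seconds, 1),
        "over_budget": len(tags) > budget,
    }
```

Running it on the course-intro line above reports a single `excited` tag and stays within budget. A short line carrying three or four tags is flagged as over budget.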

Prompt Examples

Podcast intro with cloned host voice: the [excited] tag raises energy on the sentence that follows the welcome, and the ellipsis sets up a contemplative pause before the show kickoff. Works equally well with IVC for prototype recordings or PVC for the released cut.

Hi everyone, welcome back to the show. [excited] Today we're diving into something I've been waiting weeks to talk about... let's get into it.

Course narration with cloned founder voice: numbered structure gives the cloned voice clear pacing markers, and the [pause] before "Ready?" creates a deliberate moment that mirrors how a real instructor would let the class catch up.

In this module, we'll cover three core concepts. First, we look at how the data flows through the pipeline. Then, we trace each transformation step. Finally, we audit the output for quality. [pause] Ready? Let's start.

Parameter Tips

Instant Voice Cloning needs 10 seconds of clean audio and is ready in seconds. Professional Voice Cloning needs 30+ minutes of audio and about 24 hours of training time, but produces a voice that holds up across long-form content. There is no middle tier.

For non-English cloned voices, switch the Audio node from Eleven v3 to ElevenLabs Multilingual v2 — both can use the same cloned voice ID, but Multilingual v2 produces more natural prosody outside English.
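If you script your narration jobs, the model switch can be made automatic from a language code. This is a minimal sketch of that rule only. The returned strings are placeholder labels for the two models named in this guide and are not confirmed identifiers for Martini's Audio node, and `pick_model` is a hypothetical helper.

```python
def pick_model(language_code: str) -> str:
    """Choose a model label from a language code such as 'en', 'en-US', or 'de_DE'."""
    # Normalize 'en_US' and 'en-US' to the primary subtag 'en'.
    primary = language_code.strip().lower().replace("_", "-").split("-")[0]
    # Guide's rule: Eleven v3 for English, Multilingual v2 otherwise.
    # These label strings are assumptions for illustration.
    return "eleven_v3" if primary == "en" else "eleven_multilingual_v2"
```

The same cloned voice ID would be passed alongside either label, as described above.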

Voice consent is non-optional. Document written permission for any voice that is not your own, even for internal drafts. ElevenLabs voice verification covers your own voice; permission for others is on you.

A cloned voice is reusable across every Audio node in the workspace and connects to Lipsync nodes (OmniHuman, Kling LipSync) for talking-head delivery.

What to Expect

A cloned ElevenLabs voice is the closest you can get to your real voice without re-recording. IVC drafts in seconds and works for everything except marquee content; PVC takes a day to train and is the only mode you should ship to a course or audiobook. The output stays consistent across generations because the voice is registered once — every subsequent Audio node call reuses the same voice ID. Trade-off vs. Fish Audio S2-Pro: ElevenLabs has the broader voice ecosystem and stronger English emotional inflection; Fish Audio S2-Pro has open-source serving and natural-language bracket control. For a creator already running ElevenLabs voiceovers in their pipeline, cloning into the same family keeps everything in one canvas.

Use ElevenLabs Eleven v3 on Martini

Connect ElevenLabs Eleven v3 with other AI models on Martini's infinite canvas. No GPU required — start free.


Related features

Docs

nodes/audio

Try Other Models for This Task

Fish Audio

Fish Audio S2-Pro

Fish Audio S2-Pro is the open-source alternative to ElevenLabs cloning, with two real differentiators: natural-language bracket control inside the prompt (`[whispering]`, `[laughing nervously]`, `[pause]`) and an open serving stack you can self-host. Voice cloning needs a clean reference audio sample plus a matching transcript — Fish Audio uses the transcript text to disambiguate phonemes, so a misaligned transcript hurts cloning quality more than it does on ElevenLabs. Coverage is 80+ languages with automatic detection. Critical: only clone voices you own or have explicit written permission to clone. Fish Audio is open-source, which means consent enforcement is on you, not the platform — make the rights clearance explicit before you upload reference audio.
