ElevenLabs

How to Create AI Voiceovers with ElevenLabs TTS Eleven v3

ElevenLabs Eleven v3 produces the most natural-sounding English voiceovers of any TTS model on Martini. It offers 21 distinct voices, from warm narrator tones (Rachel, Sarah) to authoritative male voices (Roger, Brian, Daniel), each with realistic emotional inflection that adapts to your script's content. For Chinese, Minimax Speech remains the stronger pick. ElevenLabs also offers a faster Turbo v2.5 variant and a Multilingual v2 model for non-English languages.

Step-by-Step Guide

Choose the right voice for your content type

ElevenLabs offers 21 voices, each with a distinct personality. For product narration and brand videos, try Rachel (warm, professional female) or Brian (confident, authoritative male). For tutorials and explainers, try Sarah (clear, friendly) or Daniel (calm, instructional). For storytelling and podcasts, try Aria (expressive, versatile) or Callum (engaging male narrator). Generate a short test sentence with 2-3 voices before committing to a full script — voice-content fit has more impact on quality than any other factor.
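The voice suggestions above can be kept as a small lookup so that every audition uses the same test sentence. This is a minimal sketch: the category keys ("brand", "tutorial", "storytelling") and the function names are illustrative assumptions, not identifiers used by Martini or ElevenLabs.

```python
# Voice suggestions copied from the guide text; the category keys are
# hypothetical labels chosen for this sketch.
VOICE_SUGGESTIONS = {
    "brand": ["Rachel", "Brian"],
    "tutorial": ["Sarah", "Daniel"],
    "storytelling": ["Aria", "Callum"],
}


def shortlist(content_type: str) -> list[str]:
    """Return the voices to audition for a content type (empty if unknown)."""
    return list(VOICE_SUGGESTIONS.get(content_type.strip().lower(), []))


def audition_plan(content_type: str, test_sentence: str) -> list[tuple[str, str]]:
    """Pair every shortlisted voice with the same short test sentence."""
    return [(voice, test_sentence) for voice in shortlist(content_type)]


if __name__ == "__main__":
    for voice, line in audition_plan("tutorial", "Now, let's look at the dashboard."):
        print(f"{voice}: {line}")
```

Generating each pair by hand on the canvas and listening side by side is still the actual test; the lookup only keeps the comparison fair.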

Write your script for spoken delivery, not reading

The most common mistake in TTS scripts is writing formal text that sounds stiff when spoken. Write conversationally: use contractions ("we'll" not "we will"), shorter sentences, and natural transitions ("Now, let's look at..." not "The following section demonstrates..."). ElevenLabs v3 handles emotional nuance — if you want excitement, write excitedly. If you want calm authority, write with measured, declarative sentences. The model infers tone from the writing style.
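One way to catch stiff phrasing before generating audio is a quick pass that swaps common formal phrases for contractions. The sketch below is an assumption about how you might pre-process a script; the phrase list is hand-picked and incomplete, and the output should still be read aloud, because a contraction can sound wrong at the end of a sentence ("Yes, it is.").

```python
import re

# A small, hand-picked mapping; extend it for your own scripts.
CONTRACTIONS = {
    "we will": "we'll",
    "you will": "you'll",
    "you are": "you're",
    "it is": "it's",
    "do not": "don't",
    "let us": "let's",
}

# Match any listed phrase as whole words, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(phrase) for phrase in CONTRACTIONS) + r")\b",
    flags=re.IGNORECASE,
)


def _swap(match: re.Match) -> str:
    original = match.group(0)
    replacement = CONTRACTIONS[original.lower()]
    # Preserve a leading capital, e.g. "We will" -> "We'll".
    if original[0].isupper():
        replacement = replacement[0].upper() + replacement[1:]
    return replacement


def make_conversational(text: str) -> str:
    """Replace the listed formal phrases with their contractions."""
    return _PATTERN.sub(_swap, text)


if __name__ == "__main__":
    print(make_conversational("We will start now. Do not worry, it is easy."))
```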

Control pacing with punctuation

Punctuation is your primary pacing tool. Periods create natural pauses between ideas. Commas create brief pauses within sentences. Ellipses (...) create dramatic or contemplative pauses. Em dashes (—) create sharp transitions. Line breaks between paragraphs add slightly longer pauses than periods. For a 30-second ad, aim for 60-75 words, which works out to 120-150 words per minute. For tutorial narration, stay near the slower end of that range, 120-130 words per minute (about 60-65 words per 30 seconds), and add more punctuation breaks.
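To see how much pacing a draft already carries, you can tally the pause markers listed above. The following is a rough sketch under simple assumptions: it counts characters only, so a decimal such as "2.5" is counted as a period, and it says nothing about how long the model actually pauses. The function name is hypothetical.

```python
import re

_ELLIPSIS = re.compile(r"\.{3}|…")


def pause_profile(script: str) -> dict[str, int]:
    """Count the pause markers in a script, grouped by type."""
    ellipses = len(_ELLIPSIS.findall(script))
    # Remove ellipses first so their dots are not also counted as periods.
    without_ellipses = _ELLIPSIS.sub(" ", script)
    return {
        "periods": without_ellipses.count("."),
        "commas": without_ellipses.count(","),
        "ellipses": ellipses,
        "em_dashes": script.count("—"),
        # A blank line between blocks of text marks a paragraph break.
        "paragraph_breaks": len(re.findall(r"\n\s*\n", script.strip())),
    }


if __name__ == "__main__":
    draft = (
        "Welcome to our new collection. Each piece is carefully crafted "
        "from sustainable materials... designed to last a lifetime. "
        "Discover what makes us different."
    )
    print(pause_profile(draft))
```

A draft with long sentences and almost no markers is a likely candidate for rushed-sounding delivery.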

Chain into a video production pipeline

The real power of ElevenLabs on Martini is the canvas pipeline: connect the Audio output directly to a Lipsync node (OmniHuman or Kling LipSync) to create a talking head video, or connect it to a Video node to pair narration with AI-generated visuals. This enables complete ad production — script → voiceover → video — in a single workflow without leaving Martini.
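The chaining described above can be pictured as a small graph of nodes and connections. The sketch below is purely illustrative: the `Node` and `Pipeline` classes, the kind labels, and the node names are hypothetical and are not Martini's API. The allowed connections are only the two this guide mentions, and the canvas may support others.

```python
from dataclasses import dataclass, field

# The only connections this sketch knows about, taken from the guide text:
# an Audio output feeds a Lipsync node or a Video node.
ALLOWED_CONNECTIONS = {("audio", "lipsync"), ("audio", "video")}


@dataclass
class Node:
    name: str
    kind: str   # "audio", "lipsync", or "video" (hypothetical labels)
    model: str  # display name of the model running in the node


@dataclass
class Pipeline:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[tuple[str, str]] = field(default_factory=list)

    def add(self, node: Node) -> None:
        self.nodes[node.name] = node

    def connect(self, source: str, target: str) -> None:
        """Record source -> target if the guide describes that connection."""
        pair = (self.nodes[source].kind, self.nodes[target].kind)
        if pair not in ALLOWED_CONNECTIONS:
            raise ValueError(f"unsupported connection: {pair[0]} -> {pair[1]}")
        self.edges.append((source, target))


if __name__ == "__main__":
    ad = Pipeline()
    ad.add(Node("voiceover", "audio", "ElevenLabs TTS Eleven v3"))
    ad.add(Node("talking_head", "lipsync", "OmniHuman"))
    ad.connect("voiceover", "talking_head")
    print(ad.edges)
```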

Prompt Examples

Brand narration — the ellipsis before "designed to last a lifetime" creates a contemplative pause that emphasizes the value proposition. Short, declarative sentences give the voice a confident, premium feel. Try this with Rachel or Brian for different brand personalities.

Welcome to our new collection. Each piece is carefully crafted from sustainable materials... designed to last a lifetime. Discover what makes us different.

Tutorial narration: the sequence words ("First... Then... Finally") give the TTS natural pacing markers. The exclamation mark on "you're all set!" signals ElevenLabs to add upbeat energy at the end. Try this with Sarah or Daniel for clear, instructional delivery.

In this tutorial, we'll walk through three simple steps to set up your account. First, click the sign-up button on the homepage. Then, enter your email and choose a password. Finally, verify your email — and you're all set!

Parameter Tips

For long-form narration where turnaround time matters, use the Turbo v2.5 variant: it is slightly less expressive than Eleven v3 but renders faster.

The 21 voices are: Rachel, Aria, Roger, Sarah, Laura, Charlie, George, Callum, River, Liam, Charlotte, Alice, Matilda, Will, Jessica, Eric, Chris, Brian, Daniel, Lily, Bill. Always test 2-3 before committing.

For non-English voiceovers, use ElevenLabs TTS Multilingual v2 instead — it supports 29+ languages. For Chinese specifically, Minimax Speech 2.5 HD produces more natural Mandarin.

Size your scripts for a delivery rate of 120-150 words per minute, which is a comfortable listening speed. A 60-second ad should be 120-150 words, not 200+.
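The word-count guidance in the last tip is simple arithmetic: duration in seconds times words per minute, divided by 60. A minimal sketch, assuming the 120-150 words-per-minute range from this guide and counting words by whitespace; the function names are hypothetical.

```python
import math


def word_budget(duration_seconds: float,
                min_wpm: float = 120, max_wpm: float = 150) -> tuple[int, int]:
    """Return the (lowest, highest) word count that fits the duration."""
    low = math.ceil(duration_seconds * min_wpm / 60)
    high = math.floor(duration_seconds * max_wpm / 60)
    return low, high


def fits_budget(script: str, duration_seconds: float) -> bool:
    """True if the script's word count falls inside the budget."""
    words = len(script.split())
    low, high = word_budget(duration_seconds)
    return low <= words <= high


if __name__ == "__main__":
    print(word_budget(60))  # budget for a 60-second ad
    print(word_budget(30))  # budget for a 30-second ad
```

A script that falls below the range is not necessarily a problem, since pauses from punctuation take up time as well; one that exceeds it will usually sound rushed.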

What to Expect

ElevenLabs Eleven v3 produces the most human-sounding English TTS on Martini — emotional inflection, natural breathing patterns, and expressive delivery that sounds like a professional voice actor rather than AI. The trade-off vs. Minimax Speech: ElevenLabs is the clear winner for English, but Minimax Speech 2.5 HD produces more natural Chinese (especially Mandarin tonal accuracy). For multilingual projects, use ElevenLabs Multilingual v2 for Western languages and Minimax for Chinese/Asian languages.
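The model choice described in this guide reduces to a short decision rule. The sketch below encodes it under stated assumptions: the returned strings are display labels for readability, not API identifiers, and the function name and language spellings are hypothetical. The guide also leans toward Minimax for other Asian languages, which this sketch leaves out for simplicity.

```python
def pick_tts_model(language: str, prefer_speed: bool = False) -> str:
    """Suggest a TTS model for a script language, following the guide."""
    lang = language.strip().lower()
    if lang in {"zh", "chinese", "mandarin"}:
        # The guide rates Minimax as more natural for Mandarin.
        return "Minimax Speech 2.5 HD"
    if lang in {"en", "english"}:
        # Turbo trades some expressiveness for faster rendering.
        return "ElevenLabs Turbo v2.5" if prefer_speed else "ElevenLabs TTS Eleven v3"
    # Every other language falls back to the multilingual model.
    return "ElevenLabs TTS Multilingual v2"


if __name__ == "__main__":
    for language in ("English", "Mandarin", "Spanish"):
        print(language, "->", pick_tts_model(language))
```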

Use ElevenLabs TTS Eleven v3 on Martini

Connect ElevenLabs TTS Eleven v3 with other AI models on Martini's infinite canvas. No GPU required — start free.

Try Other Models for This Task

Minimax Speech 2.5 HD

Minimax Speech 2.5 HD is the best text-to-speech model for Mandarin Chinese and multilingual voiceovers. While ElevenLabs dominates English-language TTS, Minimax Speech handles Chinese tonal accuracy with a naturalness that Western TTS models cannot match — the four Mandarin tones, sentence-level intonation, and emotional cadence all sound native rather than robotic. The model offers 17 distinct voices in two tiers (HD for delivery quality, Turbo for faster drafts), making it competitive with ElevenLabs while offering superior CJK language support.
