2 Models Available

How to Generate AI Dialogue

An animation team can script a four-character scene, with natural turn-taking, distinct voices, and emotion tags, without booking voice actors. On Martini's canvas, set up a script node with speaker turns, route it through ElevenLabs Dialogue v3 (the dedicated multi-speaker endpoint of Eleven v3), Fish Audio S2-Pro in multi-speaker mode, or Minimax Speech, and use inline tags such as [whispers], [laughs], and [excited] to direct emotional delivery. The output is dialogue ready for a multi-character animated short, audio drama, or interactive prototype. Pick a model below to walk through the multi-speaker production flow.

Choose a Model to Get Started

ElevenLabs

ElevenLabs Dialogue v3

ElevenLabs Dialogue v3 is the multi-speaker endpoint of Eleven v3, built for natural turn-taking between distinct character voices, with inline emotion tags ([whispers], [laughs], [excited], [sighs]) directing per-line delivery. Where standard Eleven v3 has one voice reading a paragraph, Dialogue v3 lets you assign a different voice to each speaker and have them perform a scripted scene with natural pacing, breath, and emotional response. On Martini, you build dialogue scenes as Audio nodes on the canvas: one node per character if you want fine-grained control, or a single Dialogue v3 node for the full multi-speaker generation. The 21-voice library covers a wide range of character archetypes, and cloned-voice support lets you bring in custom characters when the prebuilt voices don't match.
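To make the "different voices for different speakers" idea concrete, here is a minimal Python sketch of a scripted scene as data: an ordered list of speaker turns, each paired with a voice ID and carrying its inline emotion tags in the text. This is an illustration under stated assumptions, not Martini's or ElevenLabs' confirmed schema: the `Turn` class, the `build_inputs` helper, the `{voice_id, text}` entry shape, and the voice IDs are all hypothetical.

```python
import re
from dataclasses import dataclass

# Tags quoted in the paragraph above; illustrative, not an exhaustive list.
EXAMPLE_TAGS = {"whispers", "laughs", "excited", "sighs"}


@dataclass
class Turn:
    speaker: str  # character name used in the script
    text: str     # spoken line, optionally containing inline tags


def inline_tags(text: str) -> list[str]:
    """Return the contents of every [bracketed] tag in a line."""
    return re.findall(r"\[([^\[\]]+)\]", text)


def build_inputs(turns: list[Turn], voice_for: dict[str, str]) -> list[dict[str, str]]:
    """Pair each turn with its speaker's voice ID, preserving script order."""
    missing = {t.speaker for t in turns} - set(voice_for)
    if missing:
        raise KeyError(f"no voice assigned for: {sorted(missing)}")
    return [{"voice_id": voice_for[t.speaker], "text": t.text} for t in turns]


# Hypothetical voice IDs; substitute the IDs of the voices you actually picked.
VOICES = {"Mira": "voice_mira_placeholder", "Jun": "voice_jun_placeholder"}

scene = [
    Turn("Mira", "[whispers] Did you hear that?"),
    Turn("Jun", "[laughs] It's just the wind."),
    Turn("Mira", "[sighs] I hope you're right."),
]

inputs = build_inputs(scene, VOICES)

# Flag tags outside the example set so typos are caught before generation.
unrecognized = [
    tag for t in scene for tag in inline_tags(t.text) if tag not in EXAMPLE_TAGS
]
```

Keeping the tag inside the line's text, rather than in a separate field, mirrors how the inline tags are described above: the tag travels with the line it directs.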

Fish Audio

Fish Audio S2-Pro

Within the Fish Audio family, multi-speaker dialogue mode is exclusive to S2-Pro; the older S1 doesn't support it. Use [Speaker:Name] syntax to assign a voice to each speaker, and natural-language bracket cues such as [whispering], [laughing nervously], or [pause two seconds] to direct per-line delivery. S2-Pro covers 80+ languages with automatic detection on the same voice IDs, which makes Fish Audio the strongest pick for multilingual dialogue scenes (an audio drama shipping in English, Mandarin, and Japanese, for example) or for scenes that need a wider expressive range than ElevenLabs' fixed inline tag set. Because serving is open source, you can also self-host dialogue generation outside Martini for sensitive or pre-release content.
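As a rough sketch of the [Speaker:Name] convention described above, the Python helper below renders a list of turns into a tagged script. The layout is an assumption made for illustration (one turn per line, speaker tag first, optional delivery cue second); the `render_script` function is hypothetical, and the model's own guide should be treated as the authority on exact formatting.

```python
from typing import Optional

# Each turn is (speaker name, optional delivery cue, spoken text).
Turn = tuple[str, Optional[str], str]


def render_script(turns: list[Turn]) -> str:
    """Render turns as '[Speaker:Name] [cue] text', one turn per line."""
    lines = []
    for speaker, cue, text in turns:
        cue_part = f"[{cue}] " if cue else ""
        lines.append(f"[Speaker:{speaker}] {cue_part}{text}")
    return "\n".join(lines)


script = render_script([
    ("Mira", "whispering", "Did you hear that?"),
    ("Jun", "laughing nervously", "It's just the wind."),
    ("Mira", None, "Then why is the door open?"),
])
print(script)
# [Speaker:Mira] [whispering] Did you hear that?
# [Speaker:Jun] [laughing nervously] It's just the wind.
# [Speaker:Mira] Then why is the door open?
```

Because the cues are natural-language phrases rather than a fixed tag set, the helper passes them through unchanged instead of validating them against a list.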

More How-To Guides

How to Generate AI Background Music

How to Create AI Voiceovers

How to Clone a Voice With AI

How to Create an AI Podcast Intro
