AI Multi-Shot Video Prompts
Multi-shot AI video lives or dies on what feeds into the next shot. These recipes are not 8 disconnected prompts: each one is a two- or three-shot sequence with explicit anchor handoffs (last-frame chain, character anchor, style anchor) so the cuts feel like a scene rather than separate generations. Use Sora 2 for long-take coherence, Kling 3 for cinematic camera language across cuts, Vidu for character-locked sequences, Seedance 2 for reference-locked products, and Kling O3 for character-aware dialogue.
When to use this prompt
- Prototyping an 8-to-12-shot narrative short over a weekend without booking talent or locations.
- Producing a 30-second branded multi-cut spot with a single recurring spokesperson.
- Locking a recurring lead across cuts for an episodic AI series.
- Building a director's animatic from a script before committing to a live shoot.
- Stitching together a coverage sequence (wide → medium → close-up) from one anchor still.
Required inputs
- A character anchor image (or product / location anchor) that persists across shots.
- A short shot list — at minimum, the cut intent for each shot (master, OTS, close-up, insert).
- A style anchor image when you want a consistent cinematic look across cuts.
- A script or beat sheet for dialogue and reaction shots.
- Optional: previous-shot last-frame image when chaining continuous action.
Prompt recipes
Wide → medium → close-up coverage
Standard coverage sequence on a single anchor. Each cut inherits the character node, so identity holds.
Shot 1 (wide, 4s): Subject from anchor image stands in environment, full-body framing, slow ambient parallax. Shot 2 (medium, 4s): Same subject framed waist-up, slight handheld breathe, soft side light. Shot 3 (close-up, 3s): Same subject framed shoulders-up, locked focus on face, identical wardrobe and lighting as previous shots.
Variations
- Reverse the order for an "introduction" cut.
- Insert a cutaway between medium and close-up.
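The recipe text follows a regular pattern: shot number, framing and duration in parentheses, then the description with its handoff language. Below is a minimal Python sketch of assembling that pattern from a shot list. The `Shot` class and `format_sequence` function are hypothetical names used for illustration; they are not part of any Martini or model API.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """One cut in a sequence (hypothetical structure, for illustration only)."""
    framing: str       # e.g. "wide", "medium", "close-up"
    seconds: int       # target duration of the cut
    description: str   # what happens, including the handoff language


def format_sequence(shots: list[Shot]) -> str:
    """Join shots into the 'Shot N (framing, Ns): ...' pattern used in the recipes."""
    parts = [
        f"Shot {i} ({shot.framing}, {shot.seconds}s): {shot.description}"
        for i, shot in enumerate(shots, start=1)
    ]
    return " ".join(parts)


coverage = [
    Shot("wide", 4, "Subject from anchor image stands in environment, full-body framing."),
    Shot("medium", 4, "Same subject framed waist-up, soft side light."),
    Shot("close-up", 3, "Same subject framed shoulders-up, identical wardrobe and lighting."),
]
prompt = format_sequence(coverage)
print(prompt)
```

Keeping the shots as data makes the variations cheap: reversing the list gives the "introduction" cut, and inserting one more `Shot` gives the cutaway.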
Establishing exterior → interior reveal
Two-location continuity using two scene anchors plus a character anchor. Last-frame extraction chains exterior to interior.
Shot 1 (4s): Slow push-in toward referenced building exterior, golden-hour light. Shot 2 (5s): Camera transitions through doorway into interior referenced from second image, matched warm light, motion continues forward. Shot 3 (3s): Subject from anchor revealed inside seated at table, soft window light from camera left.
Use frame-extraction on Shot 1 to seed Shot 2 for clean continuity.
Reverse-angle dialogue between anchored characters
Two-character dialogue with locked identities. Vidu accepts up to 7 character refs for cross-shot consistency.
Shot 1 (5s): Character A from anchor 1 framed three-quarter, speaking, soft front-key. Shot 2 (5s): Character B from anchor 2 in reverse three-quarter framing, listening then reacting, matched lighting and color script. Shot 3 (3s): Cut back to Character A reacting, same lighting, locked wardrobe.
Variations
- Add a wide two-shot at the start for context.
Cutaway insert that returns to master
Master-insert-master pattern. The detail cutaway gives the editor a cut point, then the master returns with the same identity intact.
Shot 1 (5s): Subject from anchor performs action in master frame, soft natural light. Shot 2 (3s): Cutaway to detail insert (hands, object, texture) from referenced still, matched lighting. Shot 3 (5s): Return to master frame on subject, action continues from same beat.
Wire the same character anchor into both master shots so identity is identical.
Action-to-reaction (last-frame chain)
Continuous-action sequence chained via last-frame extraction. The hand-off feels like one event, not three takes.
Shot 1 (4s): Subject from anchor begins action — opens door, picks up object, turns to camera. Shot 2 (4s, seeded from Shot 1 final frame): Subject in same lighting and wardrobe reacts, expression shifts, hands settle, ends in held composition. Shot 3 (3s): Tighter close-up reaction frame, same anchor, holds final beat.
Tracking shot break to static close-up
Energetic-to-quiet transition. The contrast in motion gives the editor a strong cut point.
Shot 1 (6s): Camera tracks alongside subject from anchor walking through environment, locked subject, smooth dolly motion, environmental parallax. Shot 2 (4s): Static close-up of subject in same wardrobe and lighting, breath visible, no camera motion, same color script.
Time-jump within same location
Compressed-time montage on a locked location. Wardrobe swaps come from Flux Kontext upstream; the video nodes inherit the character anchor.
Shot 1 (4s): Subject from anchor in environment, morning light from camera left, locked composition. Shot 2 (4s): Same composition, evening warm tungsten replacing morning daylight, subject in alternate wardrobe (anchor + outfit-swap reference), continuity in pose. Shot 3 (3s): Same composition, night, single warm interior practical, subject in third wardrobe variant.
Establishing → product hero → talent reaction
Branded multi-cut spot pattern — environment, product, talent reaction. The brand-color script ties all three shots visually.
Shot 1 (4s): Wide establishing of brand environment, soft daylight, no characters. Shot 2 (4s): Hero spin on referenced product, label-locked, brand color rim. Shot 3 (5s): Talent from anchor reacts holding product, soft window light, matched color script across all three shots.
Use the brand-color script as a separate reference node wired into all three video nodes.
Martini canvas workflow
Drop your script or beat sheet into a text node, then pin character and style anchor images on the canvas. The character anchor is the single most important node — it feeds every video node downstream so identity holds across cuts.
Lay out one video node per shot, left to right in cut order. Each video node inherits the character anchor; per-shot anchors (location reference, prop reference) wire in alongside. The canvas mirrors the cut, which is how you avoid writing 8 disconnected prompts.
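The wiring described here can be pictured as plain data: one node per shot, in cut order, with the character anchor as an input to every node. The sketch below is an assumption about how you might model that on paper or in a planning script. `wire_canvas`, the node names, and the file names are all hypothetical and do not reflect Martini's actual canvas format.

```python
def wire_canvas(character_anchor: str, per_shot_anchors: list[list[str]]) -> list[dict]:
    """Return one video node per shot, in cut order, each fed by the character anchor."""
    nodes = []
    for index, extras in enumerate(per_shot_anchors, start=1):
        nodes.append({
            "node": f"video_{index}",
            # The character anchor comes first on every node; per-shot anchors follow.
            "inputs": [character_anchor, *extras],
        })
    return nodes


canvas = wire_canvas(
    "lead_character.png",                                   # placeholder file name
    [["street_exterior.png"], ["cafe_interior.png"], []],   # per-shot location anchors
)
for node in canvas:
    print(node["node"], "<-", ", ".join(node["inputs"]))
```

Because the anchor is passed once and fanned out, no shot can silently end up without it, which is the failure the canvas layout is meant to prevent.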
Pick the model per shot intent: Sora 2 when you want long-take coherence and within-shot multi-action, Kling 3 for cinematic camera moves between cuts, Vidu when you have multiple characters to lock, Seedance 2 for reference-locked products, Kling O3 for character-aware reaction shots. Mixing models per shot is normal and recommended.
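The per-shot model choice above amounts to a lookup from shot intent to model. A small sketch of that lookup follows. The model names come from the text; the intent labels (`long_take`, `camera_move`, and so on) and the `pick_model` helper are made up for this example.

```python
# Intent labels are hypothetical; model names are the ones listed in the text.
MODEL_BY_INTENT = {
    "long_take": "Sora 2",         # long-take coherence, within-shot multi-action
    "camera_move": "Kling 3",      # cinematic camera moves between cuts
    "multi_character": "Vidu",     # several characters to lock
    "product": "Seedance 2",       # reference-locked products
    "reaction": "Kling O3",        # character-aware reaction shots
}


def pick_model(intent: str) -> str:
    """Return the model suggested for a shot intent, or raise on an unknown intent."""
    try:
        return MODEL_BY_INTENT[intent]
    except KeyError:
        raise ValueError(f"No model mapped for intent {intent!r}") from None


plan = [
    ("camera_move", "push-in on building exterior"),
    ("product", "hero spin on product"),
    ("reaction", "talent reacts holding product"),
]
assigned = [(description, pick_model(intent)) for intent, description in plan]
for description, model in assigned:
    print(f"{model}: {description}")
```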
For continuous action across cuts (Shot 1 ends → Shot 2 begins), use the frame-extraction tool to pull the last frame of Shot 1 and seed it as the input image for Shot 2. The hand-off is seamless because Shot 2 inherits Shot 1's final composition exactly.
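If you are working with exported clips outside the canvas, the same last-frame handoff can be approximated with ffmpeg. The sketch below only builds the command. It assumes ffmpeg is installed and the clips are local files; the file names are placeholders and `last_frame_command` is a hypothetical helper. It says nothing about how the canvas frame-extraction tool works internally.

```python
import shlex


def last_frame_command(clip_path: str, frame_path: str, tail_seconds: float = 1.0) -> list[str]:
    """Build an ffmpeg command that saves the final frame of clip_path as an image.

    -sseof seeks relative to the end of the input, and -update 1 keeps
    overwriting the single output image, so the last decoded frame remains.
    """
    return [
        "ffmpeg", "-y",
        "-sseof", f"-{tail_seconds}",   # input option, so it must precede -i
        "-i", clip_path,
        "-update", "1",
        "-q:v", "1",                    # highest JPEG quality
        frame_path,
    ]


cmd = last_frame_command("shot_01.mp4", "shot_01_last.jpg")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

The extracted image is then used as the input image for Shot 2, alongside the continuity prompt.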
Arrange the shots in cut order on the canvas. Review for identity drift before exporting, and re-run only the weak shots, not the whole sequence. Export the bundle to an NLE such as Premiere or DaVinci for finishing. Save the canvas as a template so the next short film can reuse the entire chain.
Variations
3-shot mini-sequence
Tightest pattern — typically wide → medium → close-up. Use for paid-social spots and ad-creative variants.
5-shot scene
Standard scene coverage — establishing, master, two reverse-angles, insert. Use for branded shorts.
8+ shot sequence
Full short-film coverage. Heavy on character anchors and last-frame chaining. Use for narrative shorts.
Commercial pacing
Quick cuts (2–4 seconds each), high beat density, hook in first cut.
Narrative pacing
Slower cuts (5–8 seconds each), longer holds on reactions, room for dialogue.
Explainer pacing
Steady mid-length cuts (4–6 seconds), each cut introduces or reinforces one idea, voiceover-friendly pacing.
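The three pacing ranges above are easy to check against a planned shot list before generating anything. This is a minimal sketch; `PACING_RANGES` and `check_pacing` are hypothetical names, and the ranges are copied from the pacing descriptions.

```python
# (min_seconds, max_seconds) per cut, taken from the pacing descriptions.
PACING_RANGES = {
    "commercial": (2, 4),
    "narrative": (5, 8),
    "explainer": (4, 6),
}


def check_pacing(durations: list[int], pacing: str) -> tuple[int, list[tuple[int, int]]]:
    """Return total runtime and the (shot number, seconds) pairs outside the range."""
    low, high = PACING_RANGES[pacing]
    out_of_range = [
        (number, seconds)
        for number, seconds in enumerate(durations, start=1)
        if not low <= seconds <= high
    ]
    return sum(durations), out_of_range


total, flagged = check_pacing([4, 4, 3], "commercial")
print(total, flagged)  # 11 []
```

The wide → medium → close-up recipe (4s, 4s, 3s) totals 11 seconds and fits commercial pacing. Checked against narrative pacing, all three cuts would be flagged as too short.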
Frequently asked questions
- Why do my multi-shot sequences feel disconnected?
- Most likely you wrote 8 prompts without anchor handoffs. Every cut needs to inherit something — the character anchor, the style anchor, or the previous shot's last frame. The recipes here include explicit handoff language; copy that pattern into your own sequences.
- When should I use Sora 2 versus chaining shorter clips?
- Sora 2 can deliver multiple shot changes within a single generation, which is great for tight 8-to-12-second sequences with one or two cuts. For greater control over per-shot model choice and more cuts, chain across the canvas — one video node per shot, one anchor per character.
- How do I keep a character looking the same across cuts?
- Wire one canonical character image into every video node as a reference. Do not re-prompt the character per shot; that is the recipe for drift. For the full identity workflow, see /prompts/image/consistent-character-prompts upstream and the ai-character-consistency feature page for cross-modal coverage.
- Can I mix models across shots in one sequence?
- Yes — that is the differentiator vs single-tool workflows. Mixing models per shot intent is a feature, not a bug. The character anchor is the same node feeding all of them, so identity stays consistent even though the engines differ.
- How do I chain action between two shots?
- Use the frame-extraction tool on the canvas to pull the last frame of Shot 1, then wire that frame as the seed image for Shot 2 along with a continuity prompt. The hand-off is seamless because Shot 2 starts from Shot 1's exact final composition.
- How many shots should I plan?
- 3 for a paid-social spot, 5 for a branded short, 8–12 for a narrative scene. Past 12, identity drift and pacing fatigue become likely; consider breaking into multiple scenes that each have their own anchor set.
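The 12-shot ceiling in the last answer can be turned into a quick planning step: when a shot list runs longer, divide it into scenes that each stay at or under the limit. A small sketch of that arithmetic follows; `split_into_scenes` is a hypothetical helper, and splitting evenly is one assumption among several reasonable ones.

```python
def split_into_scenes(shot_count: int, max_per_scene: int = 12) -> list[int]:
    """Split a shot count into scenes of at most max_per_scene shots, as evenly as possible."""
    if shot_count <= 0:
        return []
    scenes = -(-shot_count // max_per_scene)   # ceiling division
    base, extra = divmod(shot_count, scenes)
    # The first `extra` scenes take one additional shot each.
    return [base + 1 if i < extra else base for i in range(scenes)]


print(split_into_scenes(12))  # [12]
print(split_into_scenes(20))  # [10, 10]
print(split_into_scenes(25))  # [9, 8, 8]
```

Each resulting scene would get its own anchor set, as the answer recommends.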
Try this prompt on Martini
Copy a recipe above, drop it into a node, and run it inside a full canvas workflow.