Character Consistency Workflow
This workflow keeps one character locked across image refinement and multi-shot video on Martini's canvas. Generate a canonical portrait, freeze it as the upstream identity anchor, refine outfit and expression variants from the same source, then fan out to video shots that all inherit the locked face. The key lesson: refine the still to a clean state before feeding it to video, because raw portraits drift far more downstream than refined stills.
When to use this workflow
- Producing a 12-week AI influencer drop where the same face appears in every weekly cut
- Building a brand mascot or recurring spokesperson the team uses across the next campaign
- Locking the protagonist across an episodic AI series so every episode opens on the same character
- Generating shot coverage for a narrative scene with two anchored leads in dialogue
- Producing recurring-protagonist artwork for a Substack serial, novel cover, or webtoon
- Replacing the per-shot re-prompt habit with a single anchor that holds across a full bundle
Required inputs
- A clear character description with non-negotiable identity traits (age, build, hair, signature features)
- A canonical reference image — one master portrait, not multiple "ideal" versions
- Wardrobe and expression variants the script requires (default, alt outfit, hero pose, three reactions)
- Per-shot motion intent and dialogue beats so prompts stay action-led, not identity-led
- A scene list with which shots are master coverage versus reverse / cutaway
Steps
1. Generate the canonical portrait on Nano Banana 2
Open the canvas, drop a Nano Banana 2 image node, and write a portrait prompt with every non-negotiable identity trait spelled out. Generate four to six candidates, pick the strongest single image, and lock it as the canonical reference. Resist the temptation to keep two close runner-ups; the model averages multiple anchors and the face drifts. One canonical portrait is the foundation of every step that follows. Label this node "char-anchor" so it surfaces clearly in the bin after export.
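The node itself is configured on the canvas, but the prompt discipline is easy to sketch in code. A minimal Python illustration with hypothetical trait values and file names (Martini exposes no Python API that this is drawn from): every non-negotiable trait appears verbatim in the prompt, and exactly one candidate survives as the anchor.

```python
# Hypothetical sketch; nothing here is a Martini API.
IDENTITY = {
    "age": "late 20s",
    "build": "slim",
    "hair": "short copper hair, side part",
    "signature": "small scar above left eyebrow, silver hoop earring",
}

def portrait_prompt(traits):
    """Spell out every non-negotiable trait; never rely on the model to infer one."""
    locked = ", ".join(traits.values())
    return f"studio portrait, front facing, neutral expression, {locked}, soft key light"

# Run the same prompt four to six times, then keep exactly ONE winner.
candidates = [f"candidate_{i}.png" for i in range(1, 6)]
canonical = candidates[2]  # picked by eye; never keep two close runner-ups

print(portrait_prompt(IDENTITY))
print("anchor:", canonical)
```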
2. Freeze the anchor and refine the still
Pin the canonical portrait as the upstream image-anchor node and do not regenerate from scratch. If the still has soft edges, low-resolution artifacts, or color noise, run a refinement pass on Nano Banana 2 or Flux Kontext to clean it up. Refine in-place — the goal is the cleanest single source. Raw portraits drift in video far more than refined stills, so this step is non-skippable for any pipeline that ends in a moving image. The refined version becomes the master reference everything downstream inherits.
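The in-place rule is worth making concrete: the refinement prompt names only pixel-quality fixes, never identity traits. A hedged sketch with illustrative prompts, not Martini syntax:

```python
# Refinement edits the existing image; regeneration rolls a new face.
# The refine prompt names only quality fixes, never identity traits.
refine_prompt = (
    "same image, same face, same framing: sharpen soft edges, "
    "remove color noise, recover detail at higher resolution"
)
# Anti-pattern: any prompt that re-describes the character invites a new face.
regen_prompt = "portrait of a woman in her late 20s with copper hair"

print("refine in place:", refine_prompt)
print("never this     :", regen_prompt)
```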
3. Generate turnaround and expression sheets on Flux Kontext
Wire the master portrait into a Flux Kontext node and produce the views the script requires: three-quarter turn, profile, expression sheet (happy, focused, surprised, sad, determined), and outfit alts. Flux Kontext is edit-aware — it preserves the locked face while changing what surrounds it. Avoid subjective edits like "more handsome" or "softer face" — they cause regression. Stick to concrete changes: outfit, pose, environment, expression. Generate the full sheet, then curate the strongest still per shot intent.
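The discipline here is a list of concrete edits run against one fixed reference. A hypothetical Python sketch (the edit strings and file name are illustrative, not Flux Kontext syntax) that also guards against the subjective edits called out above:

```python
# Hypothetical edit list wired into the Flux Kontext node.
# Concrete, physical changes only; subjective edits regress the locked face.
MASTER = "char-anchor.png"  # the refined canonical portrait from step 2

SHEET = [
    "turn to a three-quarter view, same outfit, same lighting",
    "turn to a full profile view, same outfit, same lighting",
    *[f"same framing, {e} expression"
      for e in ("happy", "focused", "surprised", "sad", "determined")],
    "same pose, swap outfit to the alt wardrobe",
]

BANNED = ("more handsome", "softer face", "prettier", "more beautiful")

for edit in SHEET:
    assert not any(b in edit for b in BANNED), f"subjective edit: {edit!r}"
    print(f"[{MASTER}] -> {edit}")  # each line is one Kontext generation
```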
4. Curate the strongest still per shot type
Lay the curated stills left to right on the canvas in cut order: master, three-quarter, profile, reverse, reaction. This becomes a visual storyboard the team can scan in one pass before committing video credits. Mark the hero stills the script depends on — the close-up beat, the hero pose, the reaction shot. Re-run only the weak frames, never the whole sheet. The cleaner this still bench is before video, the cheaper and more consistent the downstream generation runs.
5. Wire each still into a video node
Connect each curated still to its own video node, choosing the model by shot intent. Use Vidu when you need to feed up to seven character image references for cross-shot identity. Use Kling O3 for character-aware motion in dialogue and reaction beats. Use Kling 3 for cinematic motion with strong identity anchoring. The same canonical portrait threads through every node — never re-prompt the character in a video node, only describe action. The image is the identity contract; the video prompt is the motion contract.
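The identity-contract/motion-contract split is easiest to see as data. A hypothetical shot list in Python, with illustrative model ids and file names (not Martini's actual node schema): the reference image is constant, and only the motion prompt varies per shot.

```python
# Hypothetical shot list; model ids and file names are illustrative only.
# The image is the identity contract; the prompt carries ONLY motion.
from dataclasses import dataclass

ANCHOR = "char-anchor.png"  # the same canonical portrait feeds every node

@dataclass
class Shot:
    name: str
    model: str   # chosen by shot intent, per the guidance above
    motion: str  # action only; the character is never re-described here

shots = [
    Shot("master", "kling-3", "slow push-in as she looks up from the desk"),
    Shot("reaction", "kling-o3", "she hears the door and turns her head sharply"),
    Shot("reverse", "vidu", "handheld reverse angle, she nods twice"),
]

for s in shots:
    print(f"{s.name}: model={s.model} ref={ANCHOR} motion={s.motion!r}")
```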
6. Run multi-shot fan-out and check cross-shot continuity
Trigger every video node in parallel from the canvas. Each shot inherits the same anchor, so identity should hold across the bundle. Once the first pass finishes, scrub through the cuts in order on the canvas and watch for drift — eye color shift, hair length change, jawline softening between shot two and shot four. Drift past five to ten seconds is a model limit, not a workflow failure; for longer takes, plan to chain shorter clips with a last-frame hand-off rather than asking one generation to hold for fifteen seconds.
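The parallel fan-out is an ordinary fan-out pattern. A minimal Python sketch, with a stand-in `render` function in place of a real video node, showing why wall time tracks the slowest shot rather than the sum:

```python
# Hypothetical sketch of the fan-out; `render` stands in for a video node.
import random
import time
from concurrent.futures import ThreadPoolExecutor

def render(shot: str) -> str:
    time.sleep(random.uniform(0.1, 0.5))  # stand-in for generation latency
    return f"{shot}.mp4"

shots = ["master", "three-quarter", "profile", "reverse", "reaction"]
with ThreadPoolExecutor() as pool:
    clips = list(pool.map(render, shots))  # every node in flight at once

print(clips)  # scrub these in cut order and flag any drift
```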
7. Re-run only the weak shots
The point of fan-out is that you keep the strong cuts and only regenerate the weak ones. Identify the shots where identity slipped, leave the good ones untouched, and re-trigger only those nodes. Tweak the per-shot motion prompt, never the character description. If a shot keeps drifting after two retries, swap the reference still on that node to a closer view of the face — sometimes the issue is that a profile shot is too lossy a reference for a tight close-up, and the master portrait holds better.
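The selective re-run reduces to a small retry policy. A hypothetical sketch (shot names, drift states, and the two-retry threshold are illustrative):

```python
# Hypothetical retry policy; strong cuts are never touched, and a shot
# that drifts twice gets a tighter reference, not a new description.
MASTER = "char-anchor.png"

status = {"master": "ok", "reaction": "drift", "profile": "drift", "reverse": "ok"}
retries = {"reaction": 2, "profile": 0}  # drift retries so far, per shot

for shot, state in status.items():
    if state == "ok":
        continue  # leaving good shots untouched is the point of fan-out
    if retries[shot] >= 2:
        print(f"{shot}: swap the reference still to {MASTER} (closer face view)")
    else:
        print(f"{shot}: re-trigger with a tweaked MOTION prompt only")
```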
8. Sequence the cuts and export to NLE
Drop a sequence-builder node and wire the approved video outputs into it in cut order. Add audio (ElevenLabs voiceover, ambience) where the script needs it. Then add an NLE export node and ship the labeled bundle to Premiere, DaVinci, or Final Cut. Premiere takes the bundle straight to the bin; DaVinci accepts it with cut order intact for color; Final Cut imports as a connected sequence. Save the canvas as a character template — week two of the AI influencer drop reuses the wiring with a fresh script.
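What the NLE receives is, in effect, a labeled bundle in cut order. A hypothetical sketch of such a manifest in Python (the schema and file names are illustrative, not Martini's actual export format):

```python
# Hypothetical bundle manifest; schema and file names are illustrative.
import json

bundle = {
    "sequence": [
        {"cut": 1, "clip": "master.mp4"},
        {"cut": 2, "clip": "reaction.mp4"},
        {"cut": 3, "clip": "reverse.mp4"},
    ],
    "audio": [
        {"track": "vo", "file": "elevenlabs_vo.wav", "at_cut": 1},
    ],
}

with open("bundle_manifest.json", "w") as f:
    json.dump(bundle, f, indent=2)  # labeled bundle, cut order intact
```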
Recommended models
- Nano Banana 2: generate and refine the canonical portrait (steps 1 and 2)
- Flux Kontext: edit-aware turnarounds, expression sheets, and outfit variants
- Kling O3: character-aware motion for dialogue and reaction beats
- Kling 3: cinematic motion with strong identity anchoring
- Vidu: up to seven character reference images for cross-shot identity
- ElevenLabs: voiceover and ambience at the sequencing stage
Martini canvas notes
- The canonical portrait is one image node with many downstream connections — drop once, anchor everywhere; never duplicate the source.
- Flux Kontext takes the locked portrait as a reference slot and edits around it, so outfit and expression changes do not regenerate the face.
- Vidu accepts up to seven character reference images on its node, so a turnaround sheet plus the master portrait gives the strongest cross-shot identity.
- Multi-shot fan-out runs in parallel; wall time is the slowest video, not the sum, so fanning eight shots takes roughly as long as fanning two.
- The sequence builder previews continuity on the canvas; identity drift is visible in the preview before you commit to NLE export.
Variations
AI influencer weekly drop
One canvas template, one canonical portrait, one weekly script. Swap script and outfit anchor each week; the face stays locked across twelve cuts.
Two-character dialogue scene
Two canonical portraits, two anchor nodes, both wired into Kling O3 for shot-reverse-shot coverage. Each shot inherits its own character anchor.
Brand mascot pose bible
Generate the master portrait, fan out 12 outfit and expression variations on Flux Kontext, export as a static bible the design team references for years.
Episodic AI series intro
Lock the protagonist once at season start. Each episode reuses the same anchor; the canvas template scales across thirteen episodes without per-episode re-prompting.
Related features
AI Character Consistency Across Images and Video
Keep a subject consistent across image and video generations on Martini using reference workflows.
Consistent Character AI Video — Reference-Driven Video on Martini
Preserve character identity through reference-driven video models on Martini.
AI Character Reference — Reference-Image Workflows on Martini
Use reference images to guide AI model outputs on Martini's canvas.
AI Character Design — Game and Story Characters on Martini
Design original characters for games, stories, and animations on Martini's canvas.
Multi-Shot AI Video — Build Connected Scenes, Not Isolated Clips
Plan, generate, and sequence multi-shot AI video on Martini — keep characters, style, and motion consistent across shots.
Frequently asked questions
Why one canonical reference instead of multiple?
Models average across multiple anchors. If you feed two "ideal" portraits, the output ends up halfway between them, which means the face drifts away from either source. One canonical portrait — the cleanest, most representative single image — is the strongest anchor. Refine that one image to a hyper-clean state before chaining downstream.
When do I use Nano Banana 2 versus Flux Kontext?
Nano Banana 2 is the canonical face-locker — use it to generate and refine the master portrait. Flux Kontext is edit-aware — use it to swap outfits, change expressions, and place the locked face into new scenes without losing identity. They complement each other: Nano Banana 2 builds the anchor, Flux Kontext varies the surroundings.
How do I handle identity drift past five seconds?
Drift past five to ten seconds is a model limit on most video generators, not a workflow failure. Plan to chain shorter clips with last-frame hand-offs rather than asking one generation to hold for fifteen seconds. The canvas makes this cheap: the last frame of clip A becomes the first frame anchor of clip B, identity stays locked, and the cut feels continuous in the NLE.
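A minimal sketch of that hand-off, with a stand-in `last_frame` helper in place of a real frame-extract node (nothing here is Martini API):

```python
# Hypothetical chaining sketch: three 5-second generations replace
# one 15-second take that would drift.
def last_frame(clip: str) -> str:
    return f"{clip}.last.png"  # stand-in for a frame-extract node

anchor = "char-anchor.png"  # clip 1 starts from the canonical portrait
beats = ["walks in", "sits down at the desk", "looks up at the window"]

for i, beat in enumerate(beats, start=1):
    clip = f"clip_{i}.mp4"  # generated from `anchor` plus this beat's motion
    print(f"ref={anchor} motion={beat!r} -> {clip}")
    anchor = last_frame(clip)  # hand the identity forward to the next clip
```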
Can I use this workflow for a real-person likeness?
Only with consent. Generating likenesses of real people without explicit permission is a legal and ethical issue, and several platforms have policies against it. For original characters, this workflow is the right tool. For real-person spokespeople, license the likeness, generate with permission, and disclose AI use when client policy or platform terms require.
What if a shot still drifts after I anchor the canonical portrait?
Two fixes — first, make sure the per-shot reference still is appropriate for the framing. A profile reference is too lossy for a tight close-up; switch to the master portrait for tight shots. Second, tweak the motion prompt rather than the character description. If those do not resolve it, swap to a different video model (Kling O3 or Vidu) that has stronger character-aware behavior for that shot type.
Does style transfer of a known IP affect commercial release?
Yes — style mimicry of named IP (Pixar, Ghibli, Marvel, Disney) or living artists is risky for commercial release even when the character is original. Style guidance through art-direction language (color palette, line weight, lighting) is safer than naming the studio. For commercial campaigns, brief original style and avoid style-transfer prompts that lean on protected IP.
Build it on the canvas
Open Martini and wire this workflow up in minutes. Free to start — no card required.