Black Forest Labs
Edit your locked character into new outfits, scenes, and poses on Martini using FLUX Kontext — built specifically for instructed image edits that preserve subject identity. Where Nano Banana 2 generates the canonical character sheet, FLUX Kontext is the wardrobe-and-scene editor that takes that locked still and modifies it without losing the face. The two-model chain (Nano Banana 2 to lock identity, FLUX Kontext to vary wardrobe) is the cleanest character-consistency pipeline on the canvas.
FLUX Kontext is an instruction-edit model — it takes an existing image and modifies it per a text instruction. Drop your locked character still (typically a Nano Banana 2 output) onto the canvas as the input image. The face from this image is preserved through the edit; the prompt directs what changes around it.
FLUX Kontext expects instruction-style prompts: "Change the outfit to a black leather jacket and dark jeans," "Move the character to a Tokyo neon-lit street at night," "Change the expression to a confident smirk." Skip generation-style prompts ("a person wearing a leather jacket") — Kontext reads instructions, not descriptions. The instruction format is what preserves the rest of the image.
For complex transformations, chain multiple FLUX Kontext nodes — each makes one focused edit. Node 1: change outfit. Node 2: change background. Node 3: change pose. The output of each node becomes the input of the next. Single-edit nodes preserve identity better than multi-edit prompts; the model has fewer competing directives per pass.
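The chain above can be sketched in code. Martini is a visual canvas, so the helper below is a hypothetical model of the node chain, not a real API: the `edit` callable stands in for one FLUX Kontext pass, and a dict stands in for the image.

```python
from typing import Callable

# Hypothetical stand-in for one FLUX Kontext pass:
# (image, instruction) -> edited image.
EditFn = Callable[[dict, str], dict]

def chain_edits(image: dict, instructions: list[str], edit: EditFn) -> dict:
    """Feed the output of each single-edit node into the next,
    one focused instruction per pass."""
    current = image
    for instruction in instructions:
        current = edit(current, instruction)
    return current

# Mock edit that just records what each pass was asked to change.
def mock_edit(image: dict, instruction: str) -> dict:
    return {**image, "edits": image.get("edits", []) + [instruction]}
```

Running `chain_edits` with three single-edit instructions mirrors the three-node canvas chain: the locked attributes carry through untouched while the edit history accumulates one instruction per pass.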
FLUX Kontext ships in Standard, Max, and Pro variants. Standard handles light edits (outfit swaps, background changes) cleanly. Pro is the safer pick when the edit is close to the face (hair color, makeup, expression) — Pro's identity preservation is measurably tighter. For full-body edits where the face stays small in frame, Standard is sufficient.
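The tier choice reduces to a simple rule: Pro when the edit is close to the face, Standard otherwise. A minimal sketch, where the edit-type labels are assumptions for illustration, not a Martini setting:

```python
# Edits near the face, where Pro's tighter identity preservation pays off.
# Labels are illustrative, not a real configuration key.
FACE_CLOSE_EDITS = {"hair color", "makeup", "expression"}

def pick_tier(edit_type: str) -> str:
    """Pro for face-close edits; Standard for full-body and scene edits."""
    return "Pro" if edit_type in FACE_CLOSE_EDITS else "Standard"
```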
Duplicate the FLUX Kontext node 12 times — one per wardrobe/scene combination (gym workout, business attire, beach sunset, evening event, athletic streetwear, etc.). Each duplicate takes the SAME Nano Banana 2 still as input; only the instruction prompt changes. The fan-out ships a 12-piece content batch from one canonical face.
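The fan-out amounts to one job spec per instruction, all pointing at the same reference still. A minimal sketch with illustrative field names (the canvas handles this visually; nothing here is a real Martini API):

```python
def build_batch(reference_still: str, instructions: list[str]) -> list[dict]:
    """One Kontext job per instruction; every job shares the SAME
    locked reference image, so only the instruction varies."""
    return [
        {"input_image": reference_still, "instruction": text}
        for text in instructions
    ]

# Illustrative wardrobe/scene instructions; extend to 12 for a full batch.
WARDROBE_SCENES = [
    "Change the outfit to gym workout gear. Move to a fitness studio.",
    "Change the outfit to tailored business attire. Move to an office lobby.",
    "Move the character to a beach at sunset. Change to casual linen.",
]
```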
Identity drift is the failure mode. After every batch, drop each FLUX Kontext output next to the original Nano Banana 2 reference and compare face geometry — eye spacing, jaw shape, hairline. If drift is visible, switch the affected node to Pro tier or chain back through Nano Banana 2 to re-lock. Catching drift early beats reshooting the whole batch.
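The geometry comparison can be automated once you extract landmark measurements (eye spacing, jaw width, hairline height) with whatever face-landmark tool you already use. A minimal sketch using cosine similarity; the threshold is illustrative and should be tuned per pipeline:

```python
import math

# Illustrative cutoff: similarity below this flags visible drift.
DRIFT_THRESHOLD = 0.98

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def has_drifted(reference: list[float], output: list[float]) -> bool:
    """Compare face-geometry vectors between the Nano Banana 2
    reference and a FLUX Kontext output."""
    return cosine_similarity(reference, output) < DRIFT_THRESHOLD
```

Run this against every output in the batch; any flagged node gets switched to Pro tier or re-locked through Nano Banana 2.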
Single-edit instruction. The "Preserve the face exactly" reminder is belt-and-suspenders — Kontext usually does this, but the explicit instruction adds margin.
[Input image: Nano Banana 2 locked character still] + Instruction: Change the outfit to a black leather jacket over a white t-shirt, dark slim jeans, and white sneakers. Keep the same pose, same lighting, same background. Preserve the face exactly.
Background swap. The face and outfit stay; only the scene changes. Single-instruction edits preserve identity best.
[Input: same locked character still] + Instruction: Move the character to a Tokyo neon-lit street at night. Keep the outfit, pose, and face exactly the same. Add subtle reflections from neon signs in the foreground.
Expression edit. Use FLUX Kontext Pro for this — close to the face, tighter identity preservation needed.
[Input: same locked character still] + Instruction: Change the expression to a slight confident smile, eyes looking three-quarter to camera left. Keep everything else identical including outfit, lighting, background.
Multi-element wardrobe + scene change in one instruction. Test on Standard first; if face drifts, re-run on Pro.
[Input: same locked character still] + Instruction: Change to athletic apparel - dark gym leggings, fitted athletic top, white sneakers. Move to a clean photography studio backdrop with soft three-point lighting. Keep the face and the same three-quarter angle.
Use instruction-style prompts ("Change the outfit to..."), not generation-style ("a person wearing..."). Kontext is built for the former.
Chain single-edit nodes for complex transformations. One change per node preserves identity better than multi-change prompts.
Pin Pro tier for face-close edits (hair, makeup, expression). Standard is fine for full-body wardrobe and background swaps.
Always include a preservation instruction ("Keep the face exactly the same"). Belt-and-suspenders, costs nothing, adds margin.
For animated content, run the FLUX Kontext output through a Sora 2 or Kling 3 video node — the locked face survives the image-to-video transition cleanly.
QA every batch against the original Nano Banana 2 reference. Drift compounds; catching it on output 5 beats catching it on output 50.
FLUX Kontext returns instruction-edited outputs at the input image's resolution (typically 1K-2K from Nano Banana 2). Identity preservation runs ~92% on Standard, ~96% on Pro. Generation time: Standard 15-30s, Pro 30-60s. Output drops onto the canvas as the edited still — chain into more Kontext nodes for sequential edits, or into Sora 2 / Kling 3 video nodes for animated character content. The two-model character pipeline (Nano Banana 2 lock → FLUX Kontext edit) is what makes 12-piece content batches feasible from one canonical reference.
Connect FLUX Kontext with other AI models on Martini's infinite canvas. No GPU required — start free.
Build an AI persona once on Nano Banana 2 and ship a character sheet that holds across pose, outfit, and scene shifts on the Martini canvas. Nano Banana 2 is the strongest face-locker in the stack: it accepts up to 10 reference images and outputs at 1K, 2K, or 4K, with face consistency that survives 50+ generations from the same canonical reference. For AI influencer producers who keep one persona identical across a 12-week content series, this is the load-bearing model — every other model in the chain inherits the lock from here.
View guide
Vidu
Generate consistent character stills on Martini using Vidu Reference-to-Image — accepts 1-7 reference images per generation and outputs character stills that flow directly into Vidu video nodes for matched motion. Vidu's reference workflow is optimized for the image-to-video character pipeline: the same model family that locks identity on the still also handles the motion, eliminating cross-model identity drift at the modality boundary. For producers who plan to ship character video content downstream, Vidu Reference-to-Image is the cleanest single-vendor path.
View guide