Alibaba
Wan VACE Video Edit is the open-weight V2V (video-to-video) editor in Alibaba's Wan family. It supports up to three reference images for guided style and content changes, which makes it a good fit when a brand team needs reference-driven edits at high volume without premium-model pricing. For batch reskinning of campaign clips (a templated brand pivot across 20+ assets), Wan VACE's open-weight architecture keeps credit cost low while delivering reference-faithful edits. Pair with Wan Animate Mix when the swap target is a character.
Wan VACE Video Edit accepts a source video plus up to three reference images. On Martini's canvas, route the source into a video reference node, then attach one to three image references. Three is the maximum, typically one for palette/look, one for environment, and one for stylistic detail; spreading guidance across more images than that dilutes each one's influence.
For a winter-pivot edit, use distinct reference roles: ref 1 = winter palette moodboard, ref 2 = snow-environment image, ref 3 = winter-fabric texture detail. Each reference then contributes a different dimension of the look. Three overlapping winter moodboards add no information; three role-distinct references guide the model precisely.
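The role-distinct rule can be checked before anything is wired onto the canvas. The following is a minimal sketch, not part of Martini or Wan: the `Reference` class, the `validate_reference_set` helper, and the file names are all hypothetical, and only the three-image limit comes from this guide.

```python
from dataclasses import dataclass

MAX_REFERENCES = 3  # limit stated in this guide


@dataclass(frozen=True)
class Reference:
    role: str   # e.g. "palette", "environment", "texture" (hypothetical labels)
    image: str  # hypothetical local file name


def validate_reference_set(refs):
    """Return a list of problems; an empty list means the set looks usable."""
    problems = []
    if not 1 <= len(refs) <= MAX_REFERENCES:
        problems.append(f"expected 1-{MAX_REFERENCES} references, got {len(refs)}")
    roles = [ref.role for ref in refs]
    duplicates = sorted({role for role in roles if roles.count(role) > 1})
    if duplicates:
        problems.append("overlapping roles: " + ", ".join(duplicates))
    return problems


winter_refs = [
    Reference("palette", "winter_palette_moodboard.png"),
    Reference("environment", "snow_environment.png"),
    Reference("texture", "winter_fabric_detail.png"),
]
```

Calling `validate_reference_set(winter_refs)` returns an empty list, while three references that all carry the role "palette" would be flagged as overlapping.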
Wan VACE responds best when edits are layered one reference at a time. First pass: source + palette reference only. Review. Second pass: first-pass output + environment reference. Review. Third pass: second-pass output + stylistic detail reference. This incremental approach keeps changes controlled and lets you stop at the right depth of restyle.
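The layering order can be written down as a simple plan in which each pass consumes the previous pass's output. This is an illustrative sketch only: `plan_passes`, the file names, and the output naming scheme are assumptions, and the code does not call any Martini or Wan API.

```python
def plan_passes(source, references):
    """Chain passes so each one takes the previous output plus one new reference."""
    passes = []
    current_input = source
    for index, reference in enumerate(references, start=1):
        output = f"pass_{index}_output.mp4"  # hypothetical output name
        passes.append(
            {
                "pass": index,
                "input": current_input,
                "reference": reference,
                "output": output,
                "review_before_next": True,  # the guide calls for a review after each pass
            }
        )
        current_input = output
    return passes


plan = plan_passes(
    "source.mp4",
    ["palette.png", "environment.png", "stylistic_detail.png"],
)
for step in plan:
    print(step["pass"], step["input"], "+", step["reference"], "->", step["output"])
```

Stopping early is just a matter of truncating the plan: if the second-pass output already reads as winter, the third pass is never rendered.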
Wan VACE reads both prompt and references. The prompt should match the references' direction without restating them, for example: "Restyle to winter morning, preserve character motion and timing, palette and environment shift gradually." The references handle the visual specifics; the prompt guides timing and preservation guardrails.
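One way to keep prompts consistent across a campaign is to assemble them from a direction plus a list of guardrails, leaving visual specifics to the references. The `build_prompt` helper below is a hypothetical convenience, not a Martini feature; it only joins strings.

```python
def build_prompt(direction, guardrails):
    """Join one directional phrase with preservation guardrails, comma-separated."""
    parts = [direction.strip()] + [g.strip() for g in guardrails if g.strip()]
    return ", ".join(parts)


winter_prompt = build_prompt(
    "Restyle to winter morning",
    [
        "preserve character motion and timing",
        "palette and environment shift gradually",
    ],
)
print(winter_prompt)
```

Keeping guardrails such as "keep brand logo unchanged" in a shared list makes it harder to forget them when a template is reused.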
Wan's open-weight architecture is the cost win: VACE's credit cost per render is materially below Aleph or Kling O3. For a campaign reskinning 20+ clips with the same reference set, build the canvas once with the Wan VACE node and the three references, then duplicate the source clip across the canvas to batch-process. Total credits are often about half what Aleph would charge.
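The batch pattern, one reference set and one prompt reused across many source clips, can be sketched as a job list with a simple credit estimate. Everything here is hypothetical: `build_batch`, `estimate_credits`, and the file names are placeholders, and `credits_per_render` must come from your own plan's pricing, since no actual rates are assumed.

```python
def build_batch(clips, references, prompt):
    """Create one job per source clip, all sharing the same references and prompt."""
    return [
        {"source": clip, "references": list(references), "prompt": prompt}
        for clip in clips
    ]


def estimate_credits(job_count, credits_per_render, passes_per_clip=1):
    """Total credits = clips x passes per clip x credits per render."""
    return job_count * passes_per_clip * credits_per_render


clips = [f"campaign_clip_{i:02d}.mp4" for i in range(1, 21)]  # 20 hypothetical clips
references = ["palette.png", "environment.png", "texture.png"]
jobs = build_batch(clips, references, "Restyle to winter morning")
print(len(jobs), "jobs queued")
```

Running the estimate with each model's real per-render rate gives a direct comparison for the whole campaign before any credits are spent.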
VACE Video Edit is built for style/content edits but is less specialized for character swaps. If the brief is "replace this character with that one," route to Wan Animate Mix instead: it takes a source video and a replacement character image and swaps the subject while keeping motion. VACE is the right tool for environment/style/object edits; Animate Mix is the right tool for character swaps.
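The routing rule reduces to a lookup on the kind of edit the brief asks for. The sketch below is illustrative: the `pick_model` function and the edit-type labels are assumptions, while the model names and the routing itself follow this guide.

```python
# Edit-type labels are hypothetical; the routing follows the guide.
ROUTES = {
    "character": "Wan Animate Mix",
    "environment": "Wan VACE Video Edit",
    "style": "Wan VACE Video Edit",
    "object": "Wan VACE Video Edit",
}


def pick_model(edit_type):
    """Return the Wan model this guide recommends for a given kind of edit."""
    key = edit_type.strip().lower()
    if key not in ROUTES:
        raise ValueError(f"unknown edit type: {edit_type!r}")
    return ROUTES[key]


print(pick_model("character"))
print(pick_model("style"))
```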
Companion prompt to a 3-reference setup: palette + environment + texture refs. The prompt is the steering wheel.
Restyle to winter morning, preserve character motion and timing, palette and environment shift gradually
Aesthetic shift with motion preservation. Pair with one painting reference image.
Apply oil-painting aesthetic across all surfaces, preserve original action and camera move, gradual brushstroke buildup
Time-of-day pivot with brand identity guard. The "keep brand logo unchanged" lock matters in template campaigns.
Shift to dusk blue hour with warm street light glow, preserve all character action, keep brand logo unchanged
Wan VACE accepts up to 3 reference images — assign each a distinct role (palette, environment, texture) rather than overlapping.
Apply edits incrementally — one reference at a time — to keep changes controlled.
Open-weight architecture means lower credit cost, making it the best pick for batch volume work (20+ clips with the same reference set).
For character swap specifically, use Wan Animate Mix instead of VACE Video Edit.
Pair the references with a directional prompt; the prompt is the steering wheel, the references are the engine.
Wan VACE Video Edit outputs at source timing and resolution (typically 720p-1080p, depending on tier). Render times: 90-180 seconds per pass, materially faster and cheaper than Aleph or Kling O3. Best for batch work where the same reference set is applied across many clips. Trade-offs: less suited for fine-detail edits than Kling O3 Video Edit, less suited for creative tone pivots than Runway Aleph. The right tool when budget and volume win over premium fidelity.
Runway
Runway Aleph is the V2V model that preserves camera move and timing exactly while restyling the look. For a brand team that has source footage needing a seasonal reskin (a summer campaign repurposed for winter, a daytime spot pushed to dusk), Aleph is the cleanest path: feed in the original clip plus a look reference image, and the output reads as the same shot in a new world. No re-prompting, no re-shoot.
Kling
Kling O3 Video Edit (Omni Edit) is the V2V variant in Kling's O3 family that takes existing footage and swaps characters, environments, or specific elements while preserving original motion and timing. It shares the Kling 3.0 backbone — native 4K up to 60fps, 16-bit HDR, and Omni Native Audio. For a brand team running a Kling-native pipeline already, O3 Video Edit is the in-family edit step; for jobs that need element-level swaps (logo on a car, color of a wardrobe), it is the most surgical of the three editing models on this page.