Kling
Kling O3 Video Edit (Omni Edit) is the V2V variant in Kling's O3 family that takes existing footage and swaps characters, environments, or specific elements while preserving original motion and timing. It shares the Kling 3.0 backbone — native 4K up to 60fps, 16-bit HDR, and Omni Native Audio. For a brand team running a Kling-native pipeline already, O3 Video Edit is the in-family edit step; for jobs that need element-level swaps (logo on a car, color of a wardrobe), it is the most surgical of the three editing models on this page.
O3 Video Edit is V2V-only and requires existing video as input. On Martini's canvas, route the campaign clip into a video reference node. The model preserves the source's motion and timing while applying the prompt-driven edit.
O3 Video Edit reads element-level prompts well. Be specific: "replace the red logo on the car door with the new brand mark in white", "swap the character's blue jacket for a green leather one", "change the daytime sky to overcast evening". Specificity beats general instructions like "make it look like winter" — the model edits per-element when told to.
Pure character swap (different person, same scene and action) reads cleaner on Kling O3 Video Ref — the dedicated V2V + reference variant. O3 Video Edit is better for non-character swaps (objects, backgrounds, lighting). For character work, route the source + character reference into O3 Video Ref, not Video Edit.
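The routing rule above can be written down as a small lookup, which may help when a batch of edit requests needs sorting before anything is placed on the canvas. This is only a sketch: `SwapKind` and `pick_variant` are hypothetical names invented for illustration, not part of any Kling or Martini API.

```python
from enum import Enum


class SwapKind(Enum):
    """Kinds of swap discussed in this guide (hypothetical labels)."""
    CHARACTER = "character"
    OBJECT = "object"
    BACKGROUND = "background"
    LIGHTING = "lighting"


def pick_variant(kind: SwapKind) -> str:
    """Return the Kling O3 variant this guide suggests for a swap kind."""
    # Character swaps go to the reference-driven variant.
    if kind is SwapKind.CHARACTER:
        return "O3 Video Ref"
    # Objects, backgrounds, and lighting stay on Video Edit.
    return "O3 Video Edit"


if __name__ == "__main__":
    for kind in SwapKind:
        print(f"{kind.value}: {pick_variant(kind)}")
```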
Both tiers, Standard and Pro, preserve motion and timing exactly. Pro renders at native 4K with 16-bit HDR; Standard outputs at lower resolution. Use Pro for final-deliverable hero edits; for internal review or social cutdown variants, Standard saves credits. Render times: Standard 2-3 min, Pro 4-6 min for a 10s edit.
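For planning a batch, the render-time figures above turn into simple arithmetic. The sketch below reads both ranges as applying to one 10s edit and assumes renders run one after another with no queueing or parallelism; those assumptions, and the names `RENDER_MINUTES_PER_10S_EDIT` and `batch_render_minutes`, are illustrative rather than anything Kling or Martini publishes.

```python
from typing import Dict, Tuple

# (low, high) render minutes for one 10s edit, taken from the figures above.
RENDER_MINUTES_PER_10S_EDIT: Dict[str, Tuple[int, int]] = {
    "standard": (2, 3),
    "pro": (4, 6),
}


def batch_render_minutes(tier: str, edits: int) -> Tuple[int, int]:
    """Return (best, worst) total minutes for `edits` sequential 10s edits."""
    if edits < 0:
        raise ValueError("edits must be non-negative")
    low, high = RENDER_MINUTES_PER_10S_EDIT[tier.lower()]
    # Assumption: renders are strictly sequential, so totals scale linearly.
    return low * edits, high * edits


if __name__ == "__main__":
    print(batch_render_minutes("pro", 5))        # (20, 30)
    print(batch_render_minutes("standard", 12))  # (24, 36)
```

Under these assumptions, five Pro hero edits take roughly 20 to 30 minutes, while a dozen Standard review variants take roughly 24 to 36 minutes.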
O3 Video Edit handles one major element swap per render most cleanly. For multi-swap edits (logo + wardrobe + sky), chain O3 Video Edit nodes in sequence, one swap per node: the first node swaps the logo, the second the wardrobe, the third the sky. Each pass preserves prior changes, which is cleaner than asking for three swaps in a single prompt.
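The one-swap-per-node chain described above can be sketched as a plan in which each pass takes the previous pass's output as its input. `EditPass` and `plan_chain` are hypothetical names used only to illustrate the ordering; the actual wiring happens between nodes on the canvas, not through this code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EditPass:
    order: int       # 1-based position in the chain
    prompt: str      # the single swap this node performs
    input_from: str  # "source" for the first pass, else the previous pass


def plan_chain(swaps: List[str]) -> List[EditPass]:
    """Turn a list of swaps into sequential passes, one swap per pass."""
    passes: List[EditPass] = []
    for order, prompt in enumerate(swaps, start=1):
        # The first pass reads the source clip; later passes read the
        # output of the pass before them, so earlier edits carry forward.
        input_from = "source" if order == 1 else f"pass {order - 1}"
        passes.append(EditPass(order, prompt, input_from))
    return passes


if __name__ == "__main__":
    chain = plan_chain([
        "Replace the red logo on the car door with the new brand mark in white",
        "Swap the character's blue jacket for a green leather one",
        "Change the daytime sky to overcast evening",
    ])
    for p in chain:
        print(f"pass {p.order} (input: {p.input_from}): {p.prompt}")
```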
Kling O3 supports Omni Native Audio. If the source's audio still works for the edited footage (timing is preserved, so dialogue stays in sync), keep the original audio in the NLE. If the swap changes ambient context (daytime to evening), regenerate the audio bake by enabling Omni Native Audio in the prompt.
Surgical element swap — exactly what O3 Video Edit is built for. The "preserve all other elements" guard prevents drift.
Replace the red logo on the car door with the new brand mark in white, preserve all other elements and motion
Wardrobe swap. The "keep face and pose identical" instruction is critical — without it the model can drift identity.
Swap the character's blue jacket for a green leather one, keep face and pose identical, same lighting
Environment swap with consistency guard. O3 Video Edit handles sky and ground reflection coherently.
Change the daytime sky to overcast evening, ground reflections shift accordingly, keep all other elements original
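The three prompts above share a shape: one specific edit followed by one or more guard clauses. A minimal sketch of a helper that assembles that shape is shown below; `build_edit_prompt` is a hypothetical function for keeping prompts consistent across a team, and it only produces a string to paste into the prompt field.

```python
from typing import List


def build_edit_prompt(edit: str, guards: List[str]) -> str:
    """Join one element-level edit with its guard clauses, comma-separated."""
    # Drop trailing punctuation on the edit so the joined prompt reads cleanly.
    parts = [edit.strip().rstrip(",.")]
    # Skip empty guards so stray blanks do not leave dangling commas.
    parts += [g.strip() for g in guards if g.strip()]
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_edit_prompt(
        "Swap the character's blue jacket for a green leather one",
        ["keep face and pose identical", "same lighting"],
    ))
```

Called as shown, the helper reproduces the wardrobe-swap prompt above word for word.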
O3 Video Edit is V2V-only — must feed source video plus a precise edit prompt.
For character swaps specifically, switch to O3 Video Ref — Video Edit is for non-character element swaps.
Be element-specific in the prompt; "make it look like winter" reads loose, "change the daytime sky to overcast evening" reads tight.
Layer multi-element edits across multiple chained O3 Video Edit nodes, not in a single prompt.
Use "preserve" guards in the prompt to lock the elements that should not change.
Pro tier renders native 4K; Standard for cheaper internal review variants.
Kling O3 Video Edit outputs at source timing with native 4K + 16-bit HDR on Pro tier (lower on Standard). Render times: Standard 2-3 min, Pro 4-6 min for a 10s edit. Generating Omni Native Audio in the same pass is unique among the three editing models: Aleph and Wan VACE both require separate audio chains. Best for surgical element swaps inside a Kling-native pipeline; for creative tone pivots, use Aleph; for full character swap with low cost, use Wan Animate Mix.
Runway
Runway Aleph is the V2V model that preserves camera move and timing exactly while restyling the look. For a brand team that has source footage needing a seasonal reskin (a summer campaign repurposed for winter, a daytime spot pushed to dusk), Aleph is the cleanest path: feed in the original clip plus a look reference image, and the output reads as the same shot in a new world. No re-prompting, no re-shoot.
Alibaba
Wan VACE Video Edit is the open-weight V2V editor in Alibaba's Wan family. It supports up to three reference images for guided style and content changes, which suits a brand team that needs reference-driven edits at high volume without premium-model pricing. For batch reskinning of campaign clips (a templated brand pivot across 20+ assets), Wan VACE's open-weight architecture keeps cost low while delivering reference-faithful edits. Pair with Wan Animate Mix when the swap target is a character.