Kling
Drive Kling 3.0 multi-shot sequences on Martini using captured stills from a navigable 3D world. Kling supports 2-6 scenes per video with explicit per-scene descriptions, which makes it the strongest single-pass multi-shot pick when paired with locked location backplates. The 3D world is a canvas-internal reference, not an exportable .obj/.fbx/.glb/USD file. Kling reads the captured stills as starting frames and renders multi-shot videos that all share the locked location, with 3D Spacetime Joint Attention handling parallax and occlusion across cuts.
Source the world from a Nano Banana 2 or FLUX.2 image routed into the World Labs / Image-to-3D-World node, or from a text prompt routed through the Marble 3D node. Generation runs ~5 minutes. The output is a navigable canvas-internal scene preview that you can orbit, pan, and screenshot; it cannot be exported as .obj/.fbx/.glb/USD from Martini. The world is the spine of the multi-shot sequence: Kling derives every shot from it.
Kling 3.0 multi-shot caps at 6 cuts in a single 15-second pass. Capture 4-6 matched-angle stills from the navigable world: front, three-quarter left, three-quarter right, back/over-shoulder, plus a tight close and a closing wide if you want a 6-cut sequence. Each capture lands as an image node. Plan all captures from one world generation, because re-running the generation produces a different scene.
Drop a Kling 3.0 multi-shot node onto the canvas (rather than the standard image-to-video node). Wire each captured still into a per-scene reference slot (slots 1-6). Write the per-scene prompt for each cut: scene 1 (wide establishing), scene 2 (medium orbit), scene 3 (close-up static), etc. Kling renders all 4-6 cuts in a single 15-second pass, with shared 3D spacetime attention holding the location across cuts.
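The slot wiring can be modeled as plain data before anything is placed on the canvas. The sketch below is only a planning aid: `Shot`, `build_slots`, and the file names are hypothetical, and nothing here calls Martini or Kling.

```python
from dataclasses import dataclass

MIN_CUTS, MAX_CUTS = 2, 6  # Kling 3.0 multi-shot range as stated in this guide


@dataclass(frozen=True)
class Shot:
    still: str   # label or path of the captured still (hypothetical)
    prompt: str  # per-scene prompt for the same cut


def build_slots(shots: list[Shot]) -> dict[int, Shot]:
    """Map slot N to scene N, preserving the order the shots were planned in."""
    if not MIN_CUTS <= len(shots) <= MAX_CUTS:
        raise ValueError(f"expected {MIN_CUTS}-{MAX_CUTS} shots, got {len(shots)}")
    return {slot: shot for slot, shot in enumerate(shots, start=1)}


slots = build_slots([
    Shot("wide.png", "wide establishing"),
    Shot("medium.png", "medium orbit"),
    Shot("close.png", "close-up static"),
])
for slot, shot in slots.items():
    print(f"slot {slot}: {shot.still} -> scene {slot}: {shot.prompt}")
```

Keeping the still and its prompt in one record means a reorder moves both together, so slot N and scene N cannot drift apart.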
Kling 3.0 reads cinematographic language. Per scene: "Scene 1: slow camera push forward through the room. Scene 2: gentle orbit clockwise around the armchair. Scene 3: static camera, dust motes drifting. Scene 4: dolly forward with parallax. Scene 5: slow camera pull back revealing the full space." Match each scene's prompt to the corresponding captured still in slot N.
Kling 3.0 supports targeted scene editing: re-rendering just one cut inside a multi-shot sequence without re-rendering the others. After the first multi-shot pass, if scene 3 is off, route it into Kling 3.0's targeted edit endpoint with an updated prompt. The other cuts stay locked. This saves credits on iteration and preserves the cuts that landed correctly the first time.
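The bookkeeping behind a targeted edit can be sketched as follows. This is an illustration of the idea (one scene unlocked, the rest untouched), not the Kling endpoint itself: `Scene`, `locked`, and `mark_for_rerender` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scene:
    index: int
    prompt: str
    locked: bool = True  # True means the rendered cut is kept as-is


def mark_for_rerender(scenes: list[Scene], index: int, new_prompt: str) -> list[Scene]:
    """Return a new plan where only scene `index` is unlocked with an updated prompt."""
    if index not in {s.index for s in scenes}:
        raise ValueError(f"no scene {index} in this sequence")
    return [
        replace(s, prompt=new_prompt, locked=False) if s.index == index else s
        for s in scenes
    ]


first_pass = [
    Scene(1, "wide establishing, slow push"),
    Scene(2, "medium orbit"),
    Scene(3, "close-up static"),
]
second_pass = mark_for_rerender(
    first_pass, 3, "static camera, slower zoom, more atmospheric haze"
)
to_render = [s.index for s in second_pass if not s.locked]
print(to_render)  # [3]
```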
Drop the Kling multi-shot output into Martini's sequence builder. Cut markers are preserved between scenes. Layer audio (ElevenLabs Eleven v3 dialogue plus a Minimax Music ambient bed). Export as a native sequence to Premiere, DaVinci Resolve, or Final Cut. The locked 3D world makes the multi-shot read as one set; the NLE export is the final delivery.
Four-cut multi-shot in one Kling pass. The slot stack matches scene order; the per-scene prompts specify the camera moves.
[Slot 1: wide establishing capture] + [Slot 2: three-quarter left] + [Slot 3: tight close on fireplace] + [Slot 4: reverse over-shoulder] + Kling 3.0 multi-shot prompt: Scene 1 (4s): slow camera push forward through the room toward the fireplace, soft afternoon light. Scene 2 (4s): gentle orbit clockwise around the armchair, lighting unchanged. Scene 3 (4s): static camera on the fireplace mantelpiece, embers glow softly. Scene 4 (3s): dolly forward with parallax revealing dust motes. 16:9, 15s total.
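A prompt in this "Scene N (Xs): ..." shape can be assembled from a list of durations and camera directions. The helper below is a minimal sketch; `format_multishot_prompt` is a hypothetical name, and the output format simply mirrors the example above rather than any documented Kling syntax.

```python
def format_multishot_prompt(scenes: list[tuple[int, str]], aspect: str = "16:9") -> str:
    """Join (seconds, direction) pairs into one 'Scene N (Xs): ...' prompt string."""
    if not scenes:
        raise ValueError("at least one scene is required")
    parts = [
        f"Scene {n} ({seconds}s): {direction.strip().rstrip('.')}."
        for n, (seconds, direction) in enumerate(scenes, start=1)
    ]
    total = sum(seconds for seconds, _ in scenes)
    return f"{' '.join(parts)} {aspect}, {total}s total."


prompt = format_multishot_prompt([
    (4, "slow camera push forward through the room toward the fireplace, soft afternoon light"),
    (4, "gentle orbit clockwise around the armchair, lighting unchanged"),
    (4, "static camera on the fireplace mantelpiece, embers glow softly"),
    (3, "dolly forward with parallax revealing dust motes"),
])
print(prompt)
```

Computing the total from the per-scene durations keeps the closing "15s total" in step with the scenes when a duration is changed.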
Six-cut maximum sequence. Kling 3.0 multi-shot caps at 6 cuts in 15s; this fills the cap.
[Slots 1-6: six captures from the same world covering wide → medium → close → detail → reverse → wide] + Kling 3.0 multi-shot prompt: 6 scenes, 15s total. Scene 1 (3s): wide establishing slow push. Scene 2 (3s): medium orbit. Scene 3 (3s): tight close-up static. Scene 4 (2s): detail insert static. Scene 5 (2s): reverse parallax. Scene 6 (2s): closing wide pull-back. 16:9.
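The per-scene durations in the four-cut and six-cut examples follow a simple pattern: divide the pass evenly in whole seconds and give the spare seconds to the earliest scenes. A minimal sketch of that arithmetic, with `split_duration` as a hypothetical helper name:

```python
def split_duration(total_seconds: int, cuts: int) -> list[int]:
    """Split a pass into whole-second scenes, giving earlier scenes the spare seconds."""
    if cuts < 1:
        raise ValueError("cuts must be at least 1")
    base, extra = divmod(total_seconds, cuts)
    return [base + 1 if i < extra else base for i in range(cuts)]


print(split_duration(15, 6))  # [3, 3, 3, 2, 2, 2]
print(split_duration(15, 4))  # [4, 4, 4, 3]
```

An even split is only a starting point; weighting a scene by hand, as in the three-cut example, is equally valid as long as the total stays within the pass length.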
Three-cut sequence for shorter narrative beats. Longer per-scene durations give Kling more time to develop the camera move.
[Slots 1-3: three captures from the world] + Kling 3.0 multi-shot prompt: 3 scenes, 12s total. Scene 1 (5s): slow camera push through the alley. Scene 2 (4s): gentle orbit around the lantern. Scene 3 (3s): static camera on neon sign in foreground. 16:9.
Targeted scene edit. Use this when one cut needs revision without re-rendering the whole 15s sequence.
Targeted edit on scene 3 of an existing multi-shot output: re-render scene 3 only with an updated prompt — "static camera, slower zoom, more atmospheric haze." Other scenes stay locked.
Capture all needed angles from ONE world generation BEFORE running Kling. Re-running the world produces a different scene; capture once, route to Kling multi-shot.
Plan the cut count before generating: 2-3 cuts for short narrative beats (longer per-scene), 4-6 cuts for full sequences (tighter per-scene). Kling caps at 6 cuts in 15s.
Match each captured still to its corresponding scene slot (slot N → scene N). Reordering breaks the spatial continuity Kling's 3D Spacetime Joint Attention is built for.
Use cinematographic verbs (push, orbit, dolly, static, parallax) in per-scene prompts. Generic verbs produce inconsistent shot-to-shot behavior.
For iteration, use targeted scene editing — re-render one scene without re-rendering the whole 15s pass. Saves credits and preserves correct cuts.
The 3D world is canvas-internal: Kling uses captured stills, not the navigable world directly. Export from Martini produces NLE-ready video, not a 3D file.
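Two of these tips (stay within the 15s pass, use cinematographic verbs) are easy to check mechanically before spending credits. The sketch below is a hypothetical pre-flight check: `check_plan` and `CAMERA_TERMS` are illustrative names, and the substring match is deliberately crude, so a prompt that uses another valid camera term (for example "pull back") would be flagged and should be reviewed by eye.

```python
CAMERA_TERMS = ("push", "orbit", "dolly", "static", "parallax")  # verbs named in the tips


def check_plan(prompts: list[str], durations: list[int], max_total: int = 15) -> list[str]:
    """Return human-readable warnings; an empty list means the plan passes these checks."""
    warnings = []
    if len(prompts) != len(durations):
        warnings.append("each scene needs exactly one duration")
    if sum(durations) > max_total:
        warnings.append(f"durations total {sum(durations)}s, over the {max_total}s pass")
    for n, prompt in enumerate(prompts, start=1):
        if not any(term in prompt.lower() for term in CAMERA_TERMS):
            warnings.append(f"scene {n} has no camera term: {prompt!r}")
    return warnings


plan_warnings = check_plan(
    [
        "slow camera push through the alley",
        "gentle orbit around the lantern",
        "the camera moves around the sign",  # generic verb, should be flagged
    ],
    [5, 4, 3],
)
for line in plan_warnings:
    print(line)
```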
Kling 3.0 multi-shot returns 2-6 cuts in a single 15s pass at 1080p. 3D Spacetime Joint Attention holds spatial continuity across cuts when seeded with matched-angle stills from a navigable world. Generation time 90-180s per multi-shot pass. Targeted scene editing supports per-cut iteration without full re-render. The 3D world remains canvas-internal (not exportable as .obj/.fbx/.glb/USD); Kling outputs are exportable video deliverables. Chain via sequence builder for additional shots, NLE export for native Premiere/DaVinci sequences.
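The timing figures quoted in this guide can be combined into a rough wall-clock estimate. The sketch assumes the world generation and the Kling passes run one after another, and it ignores capture time and targeted edits, for which no timing is given; the names are hypothetical.

```python
WORLD_GENERATION_S = 5 * 60   # "~5 minutes" per this guide
PASS_RANGE_S = (90, 180)      # "90-180s per multi-shot pass"


def wall_clock_range(passes: int = 1) -> tuple[int, int]:
    """Estimate (min, max) seconds for one world generation plus `passes` Kling passes."""
    low, high = PASS_RANGE_S
    return WORLD_GENERATION_S + passes * low, WORLD_GENERATION_S + passes * high


print(wall_clock_range())   # (390, 480)
print(wall_clock_range(2))  # (480, 660)
```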
OpenAI
Lock down a location once with a navigable 3D world on Martini, capture matched-angle stills, and feed each as a starting frame into Sora 2 for a five-shot sequence that reads as one set instead of five different rooms. Sora 2 has a deep understanding of 3D space, motion, and scene continuity, so captured stills from a navigable world translate to consistent camera moves with parallax, occlusion, and depth honored. The 3D world is a canvas-internal reference, not an exportable .obj/.fbx/.glb/USD file; Sora 2 reads the captured stills and ships exportable video clips that all share the same spatial anchor.
Runway
Use Runway Gen4 Turbo on Martini for fast iteration on 3D-world-derived shots: captured stills from the navigable world feed into Gen4 image-to-video nodes that ship 5- or 10-second clips in under a minute. The 3D world is a canvas-internal reference, not an exportable .obj/.fbx/.glb/USD file. Gen4 Turbo is the speed pick when the brief lands at 4 PM and a sequence ships at 9: per-clip generation completes faster than Sora 2 or Kling 3.0, which makes it the right tool for fast multi-shot iteration before committing the bigger render budget to the hero clips.