World Labs AI on Martini
World Labs turns a single image or text prompt into a navigable 3D world you can pan, orbit, and explore. On Martini you wire World Labs into a pre-viz canvas, lock down a location once, then feed reference angles into Sora 2, Kling 3, and Nano Banana 2 nodes for video shoots and stills.
What it creates
World Labs creates a navigable 3D world from a single still image or short text prompt. The output is an interactive scene — not a flat render — so you can move the camera, orbit objects, and capture multiple angles from one generation. Inside Martini it shows up as a world node with a preview viewport, and the underlying scene data can be referenced by downstream image and video nodes as a consistent backdrop.
Because the world is navigable rather than baked, you can pull frame captures from any viewpoint without re-running the model. That makes World Labs especially useful as a reference layer: lock a location once, then pull twenty stills from twenty angles to feed into Sora 2 for camera moves or Nano Banana 2 for hero stills. Quality of the navigable region depends heavily on the source image — clean architectural references and well-lit interiors hold together best.
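Because angle capture is just camera repositioning, "twenty stills from twenty angles" reduces to plain orbit math. The sketch below is illustrative only — generic camera geometry, not a Martini or World Labs API:

```python
# Illustrative: pulling many reference angles from a navigable world is
# re-positioning a camera, not re-running the model. This is generic
# orbit math, not a Martini API.
import math

def orbit_positions(radius: float, height: float, count: int):
    """Camera positions evenly spaced on a circle around the scene origin."""
    positions = []
    for i in range(count):
        theta = 2 * math.pi * i / count
        positions.append((radius * math.cos(theta), height, radius * math.sin(theta)))
    return positions

# Twenty stills from twenty angles around one locked location.
angles = orbit_positions(radius=5.0, height=1.6, count=20)
assert len(angles) == 20
```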
Treat the output as pre-visualization, not a final asset. The geometry can be loose at the edges, foreground detail is sharper than far-field, and lighting is baked into the source image. For storyboards, location libraries, virtual sets, and reference frames it is a strong unlock; for finished export-ready 3D you would still hand off to a dedicated DCC tool.
Inputs and outputs
Inputs are a single reference image or a short text prompt — a clean, well-composed reference yields the most coherent navigable region. World Labs returns an interactive 3D scene rendered in a Martini preview viewport, plus the ability to capture stills from any camera angle. Captured frames flow downstream as standard image inputs, so you can route them into Nano Banana 2, Flux, Sora 2, or Kling 3 nodes without leaving the canvas. The world itself is not exported as a glTF or splat file inside Martini — it lives as a referenceable scene the canvas can re-query.
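The key routing property is that a captured frame is interchangeable with any other image input on the canvas. The sketch below models that in plain Python — the node function, field names, and source labels are all illustrative stand-ins, not a real Martini SDK:

```python
# Conceptual sketch: a World Labs capture behaves like any other image
# input, so downstream nodes don't care where a frame came from.
# Names here are illustrative, not a real Martini API.
from dataclasses import dataclass

@dataclass(frozen=True)
class ImageRef:
    source: str   # e.g. "upload", "flux", "worldlabs-capture"
    pixels: bytes

def image_to_video_node(start_frame: ImageRef, model: str) -> str:
    """Stand-in for a Sora 2 / Kling 3 image-to-video node."""
    return f"{model} clip seeded from {start_frame.source}"

capture = ImageRef(source="worldlabs-capture", pixels=b"\x00")
upload = ImageRef(source="upload", pixels=b"\x00")

# Both route into the same port; the capture needs no special handling.
assert image_to_video_node(capture, "sora-2") == "sora-2 clip seeded from worldlabs-capture"
assert image_to_video_node(upload, "kling-3") == "kling-3 clip seeded from upload"
```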
Best workflows
- Virtual sets — generate a fixed location once and shoot every scene of a short film against the same world without rebuilding it for each angle.
- Navigable backgrounds — produce explorable backdrops for product photography, character compositing, and social filter scenes from a single concept frame.
- Immersive pre-visualization — block out a location for a director or client review before committing to a real shoot or a full 3D build.
- World reference for video shoots — capture matched stills from multiple angles and feed them into Sora 2 or Kling 3 as image-to-video starting frames.
- Storyboard locations — pair with Martini storyboard nodes so every panel takes place in the same coherent world rather than drifting between AI hallucinations.
- Concept exploration — quickly test whether a location idea reads as a real space before investing in a finished render or build.
How to use it in Martini
1. Open a Martini canvas and add a World Labs world node from the node palette. Connect a reference image node — usually a Nano Banana 2 or Flux generation, an uploaded photo, or a hand-painted concept frame — into the world input. If you only have a text idea, you can prompt the world node directly, but image-conditioned worlds hold geometry together more reliably.
2. Run the node and wait for the navigable preview to load. Use the in-canvas viewport to orbit, dolly, and pan the camera until you find the angles that read best. Anything in the foreground is typically sharper than far-field geometry, so frame your shots accordingly.
3. Capture stills from each angle you want to use downstream. Each capture becomes a regular image output on the world node and can be wired into any image- or video-input port — feed them into Sora 2 for cinematic camera moves, Kling 3 for character animation, or Nano Banana 2 for cleaned-up hero stills.
4. Lock the world once you are happy with it and treat it as the ground truth for the project. Re-use the same world across multiple downstream nodes so every shot, panel, and reference shares the same lighting and layout — this is what stops AI video sequences from drifting location-to-location.
5. For longer pieces, branch the canvas: one path captures wide establishing frames for storyboarding, another path pulls tight character-level angles for video generation, and a third path exports clean stills for marketing and thumbnails.
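The lock-then-branch workflow above can be modeled with plain data structures. The sketch below is purely conceptual — `World`, `Capture`, and the angle names are illustrative stand-ins, not a real Martini SDK:

```python
# Conceptual sketch of the lock-then-branch workflow, modeled with plain
# Python data structures. "World", "Capture", and the angle names are
# illustrative stand-ins, not a real Martini SDK.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Capture:
    """A still pulled from the navigable world at a given camera angle."""
    world_id: str
    angle: str  # e.g. "wide-establishing", "close-hero"

@dataclass
class World:
    """One locked World Labs world, reused as ground truth for the project."""
    world_id: str
    locked: bool = False
    captures: list = field(default_factory=list)

    def capture(self, angle: str) -> Capture:
        still = Capture(self.world_id, angle)
        self.captures.append(still)
        return still

# One world, three downstream branches that all share the same location.
world = World("loft-interior-01")
world.locked = True

storyboard_frames = [world.capture(a) for a in ("wide-left", "wide-right")]
video_starts = [world.capture(a) for a in ("close-hero", "over-shoulder")]
marketing_stills = [world.capture("three-quarter")]

# Every downstream frame traces back to the same world_id, which is what
# keeps shots from drifting location-to-location.
assert all(c.world_id == "loft-interior-01" for c in world.captures)
print(len(world.captures))  # 5 captures, one shared world
```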
Limitations
- The output is a navigable scene optimized for viewing inside Martini, not an exportable glTF, USD, or Gaussian splat asset you can drop into a game engine.
- Navigation depth is bounded — you can orbit and step into the scene meaningfully, but pushing the camera far past the framed area surfaces stretched geometry and reconstruction artifacts.
- Lighting is essentially baked from the source image. You cannot relight the world after generation, so plan the input frame for the lighting mood you want.
- Far-field detail and very thin geometry (railings, wires, foliage edges) are the weakest parts of the reconstruction; treat the foreground as hero, the mid-ground as usable, and the background as suggestive.
- Coherence drops with very busy or low-quality reference images. Clean, well-composed interiors and architecture hold together far better than chaotic outdoor crowd scenes.
Frequently asked questions
What does World Labs actually output inside Martini?
A navigable 3D world rendered in an in-canvas preview viewport. You can orbit, pan, and capture stills from any angle — those captures become standard image outputs that downstream nodes can use. The world is referenced inside the canvas rather than exported as a 3D file.
Can I export the World Labs scene to Blender, Unreal, or Unity?
Not from Martini. World Labs inside Martini is positioned as a navigable reference and pre-viz layer, not a 3D asset pipeline. If you need a portable mesh or splat, capture the angles you need and rebuild in your DCC tool, or use the dedicated 3D model generator features.
Image input or text input — which works better?
Image input. Text-conditioned worlds are useful for early exploration, but image-conditioned worlds hold geometry, lighting, and style together far more reliably. A clean reference photo or a Nano Banana 2 generation is the strongest starting point.
Why does the background look stretched when I push the camera back?
World Labs reconstructs a navigable region around the framed source image. Foreground is hero, mid-ground is usable, far-field is suggestive. Stay within the comfortable navigation envelope and you will avoid the stretching artifacts.
How do I keep video shots in the same location across an entire scene?
Generate the world once, then capture multiple stills from each angle you need and feed them as starting frames into Sora 2 or Kling 3. The shared world reference is what stops AI video from drifting between subtly different locations shot to shot.
Is World Labs good enough for a final hero render?
Treat it as pre-viz and reference, not a finishing render. Use the captured angles to drive Nano Banana 2 or Flux for hero stills, or feed them into Sora 2 for the final video — World Labs is the location anchor underneath those finishing models.
Ready to build with World Labs?
Open Martini, drop a world node, and chain it into your image and video pipeline. No GPU required.
Get started in dashboard