Black Forest Labs
Generate the source reference image for an Image-to-3D-World workflow on Martini using FLUX.2 — its prompt-fidelity rendering produces literal scene compositions that the world node can reconstruct reliably. The world node's output is a navigable canvas-internal scene preview you can orbit and screenshot, not a portable .obj, .fbx, .glb, or USD mesh file. Concept artists reach for FLUX.2 when they want an alt-look reference (different palette, different lighting, different style) from what Nano Banana 2 produces — same workflow, different aesthetic.
FLUX.2 interprets compositional prompts literally. Write the source scene with explicit depth and composition language: "Wide shot of a Tokyo backstreet at dusk, foreground left: a vending machine glowing pink, mid-ground center: wet cobblestone alley with neon reflections, background right: small ramen shop with warm interior light, atmospheric depth, photorealistic, 4K." The literal staging gives the world node clear depth cues to reconstruct from.
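To make the template concrete, here is a minimal Python sketch of that depth-axis structure. The `depth_axis_prompt` helper and its parameter names are illustrative only, not a Martini or FLUX.2 API; it just assembles the string you would paste into the FLUX.2 node.

```python
# Hypothetical helper -- not a Martini API. A minimal sketch of the
# foreground/mid-ground/background prompt structure described above.

def depth_axis_prompt(shot, foreground, midground, background, style):
    """Assemble a depth-staged source prompt for the FLUX.2 node."""
    parts = [
        shot,
        f"foreground {foreground}",
        f"mid-ground {midground}",
        f"background {background}",
        style,
    ]
    return ", ".join(parts)

prompt = depth_axis_prompt(
    shot="Wide shot of a Tokyo backstreet at dusk",
    foreground="left: a vending machine glowing pink",
    midground="center: wet cobblestone alley with neon reflections",
    background="right: small ramen shop with warm interior light",
    style="atmospheric depth, photorealistic, 4K",
)
print(prompt)
```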
FLUX.2 Pro's prompt adherence is meaningfully tighter on detailed compositions. Pin Pro on the source reference for any Image-to-3D-World workflow — the prompt-literal staging gives the world node more accurate depth signals. Base tier still works for exploratory drafts; switch to Pro before the final source generation.
Save the scene-style language in a Text node: "Photorealistic cinematic photography, atmospheric depth, soft natural light, anamorphic compression, sharp foreground with bokeh on the far-field." Wire it into the FLUX.2 source node as a brand-style prefix. This keeps the look consistent if you generate multiple source candidates before picking one to feed the world node.
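The same prefix pattern, restated in plain Python for clarity. `STYLE_PREFIX` and the candidate list are illustrative; on the canvas, the Text node plays the role of the shared constant.

```python
# Illustrative only: mirrors the Text-node-as-prefix pattern in plain Python.
STYLE_PREFIX = (
    "Photorealistic cinematic photography, atmospheric depth, "
    "soft natural light, anamorphic compression, "
    "sharp foreground with bokeh on the far-field"
)

# Scene bodies for multiple source candidates (from the examples below).
candidates = [
    "A grand library interior with two stories of bookshelves, "
    "central reading table foreground, leather armchairs mid-ground",
    "A coastal cliff at golden hour, foreground a winding path with "
    "tall grass, background mist over the sea horizon",
]

# Every candidate inherits the same look; only the scene body varies.
source_prompts = [f"{STYLE_PREFIX}, {body}" for body in candidates]
```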
Drop a World Labs or Image-to-3D-World node onto the canvas and wire the FLUX.2 output as the input. Generation takes around 5 minutes for a full navigable world. The output is a canvas-internal scene preview — you can orbit, pan, and screenshot inside the canvas, but cannot export the world as a portable mesh or splat file from Martini.
Inside the navigable preview, capture stills from four angles: front view, three-quarter left, three-quarter right, back/over-shoulder. Each capture lands as an image node on the canvas. Capture more angles than you think you need — re-running the world node produces a different scene, so screenshot first, iterate later. These stills become the starting frames for downstream video shots.
Wire each captured still into its own Sora 2 or Kling 3 video node — image-to-video with the captured angle as the starting frame. Add cinematographic motion prompts. Each video clip inherits the world; only the camera move changes. The locked location is what makes the multi-shot sequence read as one place across cuts.
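A sketch of that fan-out, assuming the four captures named above. The angle names and motion prompts are illustrative; the actual wiring happens on the Martini canvas, one image-to-video node per still.

```python
# One video node per captured angle: the still is the starting frame,
# the motion prompt is the camera move. Names here are illustrative.
captures = {
    "front":               "slow dolly-in toward the ramen shop, steady, 4s",
    "three-quarter-left":  "gentle orbit right, parallax on the vending machine",
    "three-quarter-right": "handheld drift forward along the wet cobblestones",
    "back-over-shoulder":  "crane up, then tilt down over the alley",
}

for angle, motion in captures.items():
    print(f"{angle:>19} -> Sora 2 / Kling 3 prompt: {motion}")
```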
Source scene with literal depth-axis composition. The "foreground / mid-ground / background" structure gives the world node clear depth signals.
Wide shot of a Tokyo backstreet at dusk, foreground left a vending machine glowing pink, mid-ground center wet cobblestone alley with neon reflections, background right a small ramen shop with warm interior light, atmospheric depth, photorealistic cinematic style, 4K, FLUX.2 Pro tier.
Interior with strong vertical depth. Two-story composition gives the world node depth signals on multiple axes.
[Scene-style prefix] + A grand library interior with two stories of bookshelves, ladders connecting to the upper level, central reading table foreground, leather armchairs mid-ground, soft warm lamp light, deep wood tones, atmospheric depth, photorealistic, 4K.
Atmospheric exterior with explicit depth-axis staging. The mist gradient supports the world node's depth reconstruction.
[Scene-style prefix] + A coastal cliff at golden hour, foreground a winding path with tall grass, mid-ground a small lighthouse in the upper-right third, background mist over the sea horizon, warm gold palette, photorealistic, 4K.
Clean interior reference. Foreground/mid-ground/background structure + clean composition = strongest reconstruction.
[Scene-style prefix] + An empty mid-century living room, foreground polished wooden floor, mid-ground a single armchair near a stone fireplace, background tall windows with afternoon light, neutral warm palette, photorealistic, 4K.
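All four examples stage the same three depth planes explicitly. A throwaway check like the sketch below (plain Python, nothing Martini-specific) flags a draft scene body that drops a plane before you spend a generation on it.

```python
# Sanity check: does the scene body name all three depth planes?
DEPTH_MARKERS = ("foreground", "mid-ground", "background")

def missing_depth_planes(prompt: str) -> list[str]:
    lowered = prompt.lower()
    return [marker for marker in DEPTH_MARKERS if marker not in lowered]

body = (
    "An empty mid-century living room, foreground polished wooden floor, "
    "mid-ground a single armchair near a stone fireplace, "
    "background tall windows with afternoon light"
)
assert not missing_depth_planes(body)  # all three planes are staged
```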
Use FLUX.2 Pro for source references destined for the world node. Prompt-literal compositions give the world node tighter depth signals.
Write explicit foreground/mid-ground/background staging. The depth-axis structure is what world reconstruction needs most.
Generate at 4K. Lower resolutions still work, but the navigable world will look softer at far-field detail.
Avoid cluttered scenes. Single hero composition with clean depth cues outperforms busy multi-element scenes for world reconstruction.
Capture stills BEFORE iterating the world node. Re-running produces a different scene; screenshot first.
The world output is canvas-internal — you cannot export it as .obj, .fbx, .glb, or USD from Martini. Captured stills are the deliverable.
FLUX.2 returns source references 1024–2048 px wide with literal compositional fidelity. The downstream world node returns a navigable canvas-internal scene preview (not exportable as .obj/.fbx/.glb/USD). Generation time: 30–60 seconds for the FLUX.2 source on Pro tier, ~5 minutes for world reconstruction. Captured stills land on the canvas ready to feed Sora 2 / Kling 3 / Runway Gen4 video nodes. For exportable mesh assets, route through Martini's Tripo3D or Hunyuan3D image-to-3D nodes — those produce GLB/FBX; the world node does not.
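The routing rule in that last sentence, restated as a tiny lookup. The node names match the ones this guide mentions; the `node_for` function itself is a hypothetical convenience, not a Martini API.

```python
# Pick the node by deliverable: navigable preview vs. exportable mesh.
ROUTES = {
    "navigable preview + captured stills": "World Labs / Image-to-3D-World node",
    "exportable mesh (GLB/FBX)": "Tripo3D or Hunyuan3D node",
}

def node_for(deliverable: str) -> str:
    return ROUTES[deliverable]

print(node_for("exportable mesh (GLB/FBX)"))  # -> Tripo3D or Hunyuan3D node
```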
Generate the canonical reference image for an Image-to-3D-World workflow on Martini using Nano Banana 2 — the cleaner the source, the more navigable the resulting scene. The output of the world node is a navigable canvas-internal scene preview you can orbit and screenshot, not a portable .obj, .fbx, .glb, or USD mesh file. Concept artists use this to lock a location once on Nano Banana 2, pass the locked still into the World Labs or Image-to-3D-World node, and capture matched-angle stills that feed downstream Sora 2 or Kling 3 nodes for shots that all share the same world.
OpenAI
Use Sora 2 as the downstream camera-move engine for an Image-to-3D-World workflow on Martini — the captured stills from the navigable world feed directly into Sora 2 video nodes for matched-angle motion shots. The world node's output is a canvas-internal navigable scene preview, not a portable .obj, .fbx, .glb, or USD mesh. Sora 2 takes the captured stills as starting frames and produces video clips that all share the same locked location, with cinematographic camera moves that respect the spatial structure of the source world.