OpenAI
Generate storyboard frames on Martini using GPT Image 2 — strong narrative reasoning makes it the right pick when the script reads like a beat-driven story rather than a literal shot list. GPT Image 2 interprets emotional context and composes scenes that carry meaning across panels, which is exactly what directors need when boarding a 30-60 second commercial or short-form narrative. Pair it with a saved style prompt block and feed selected frames into Sora 2 or Kling 3 for the animatic.
GPT Image 2 reasons over story. List the beats as narrative moments: "Frame 1: a runner pauses on a coastal trail to look at the dawn ahead — convey both exhaustion and resolve." GPT Image 2 will compose the shot including subtle storytelling — the hand on a knee, the breath visible in cool air, the gaze fixed on the horizon. This is its differentiator versus FLUX.2, which renders the literal staging but not the interpretive subtext.
GPT Image 2 has no style-reference image input. Save the visual language in a Text node: "Cinematic photography style, anamorphic compression, soft natural light, muted desaturated palette, shallow depth of field, 16:9 framing." Wire it into every storyboard node; each node concatenates the style block with its narrative beat prompt. Visual coherence flows from the shared prefix.
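To see why the shared prefix yields coherence, here is a minimal Python sketch. On Martini the wiring happens on the canvas, not in code, so everything here (the constant, function, and beat texts) is illustrative only; the style block and Frame 1 beat are taken from this guide's examples:

```python
# Illustrative sketch only: one shared style block prefixes every frame
# prompt, so editing STYLE_BLOCK once propagates to the whole board —
# mirroring a single Text node wired into every storyboard node.
STYLE_BLOCK = (
    "Cinematic photography style, anamorphic compression, soft natural "
    "light, muted desaturated palette, shallow depth of field, 16:9 framing."
)

beats = {
    1: "A runner pauses on a coastal trail to look at the dawn ahead. "
       "Convey both exhaustion and resolve.",
    4: "The runner notices a small detail on the path. Medium close-up, "
       "determination shifting to recognition.",
}

def frame_prompt(frame_no: int) -> str:
    """Concatenate the shared style block with one frame's narrative beat."""
    return f"{STYLE_BLOCK} Frame {frame_no}: {beats[frame_no]}"

print(frame_prompt(1))
```

Because every prompt starts from the identical prefix, the panels inherit one visual language while the beat text carries the story.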
GPT Image 2 handles 5+ named elements per frame without dropping any. For dense beats — "the runner mid-stride, dawn breaking over a distant lighthouse, mist rising off the rocks below, cool blue palette, wide cinematic frame" — use GPT Image 2. Other Martini image models start dropping elements past 3-4 named items.
For storyboards with a recurring protagonist, generate the canonical character once on Nano Banana 2, then describe the locked character in detailed terms in the GPT Image 2 prompt ("a runner in their thirties, dark athletic gear, distinct facial features [describe from Nano Banana 2 reference]"). GPT Image 2 cannot accept reference images directly for character lock — descriptive prompts get you 70-80% identity coherence across the board.
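The same pattern extends to character lock: since GPT Image 2 takes no reference images, the locked description is just another reusable text block. A hedged sketch (names and placeholder wording are hypothetical; the facial details would come from your own Nano Banana 2 reference):

```python
# Illustrative sketch only: reuse one locked character description in every
# frame prompt. The feature list below is a placeholder — substitute the
# details observed in your Nano Banana 2 canonical render.
CHARACTER_BLOCK = (
    "The same runner in their thirties, dark athletic gear, "
    "distinct facial features described from the Nano Banana 2 reference"
)

def character_frame_prompt(style: str, beat: str) -> str:
    """Concatenate style block, locked character description, and beat."""
    return f"{style} {CHARACTER_BLOCK}. {beat}"

p = character_frame_prompt(
    "Cinematic photography style, 16:9 framing.",
    "Tight close-up, dawn light breaking across the face.",
)
```

Repeating the identical description verbatim in every frame is what pushes identity coherence toward the 70-80% range the guide cites.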
On the Martini canvas, place the GPT Image 2 nodes in chronological order — frame 1 establishing at far left, frame 12 closing tag at far right. Each shares the cinematic style block; each has its own narrative beat. The board reads as a story arc: setup → inciting incident → rising action → climax → resolution → tag. The visual language stays consistent because the style prefix is identical.
Each GPT Image 2 frame is also a video keyframe. Once the board is approved, route selected frames into Sora 2 or Kling 3 video nodes with shot-specific motion prompts ("slow camera push," "static frame as the runner moves," "gentle pan left"). The animatic ships from the same canvas, no re-prompting from scratch.
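The frame-to-animatic handoff can be sketched the same way: a mapping from approved frame numbers to shot-specific motion prompts. The motion prompt strings are the ones quoted above; the function name and structure are illustrative, not a Martini API:

```python
# Illustrative sketch only: pair approved board frames with shot-specific
# motion prompts before wiring them into Sora 2 / Kling 3 video nodes.
motion_prompts = {
    1: "slow camera push",
    4: "static frame as the runner moves",
    12: "gentle pan left",
}

def animatic_jobs(approved_frames: list[int]) -> list[tuple[int, str]]:
    """Keep only frames that have a motion prompt assigned."""
    return [(f, motion_prompts[f]) for f in approved_frames
            if f in motion_prompts]
```

Only the frames selected after approval get a motion prompt, which is why animating from the same canvas needs no re-prompting from scratch.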
Establishing beat with multi-element composition. The direction "convey both exhaustion and resolve" supplies the emotional context — GPT Image 2's strength.
[Cinematic style block prefix] + Frame 1 (establishing): A runner pauses on a coastal trail to look at the dawn ahead. Convey both exhaustion and resolve. Wide cinematic shot with the figure small in frame against breaking blue-gold light, mist rising off the rocks below, lighthouse in the distance.
Story turn with subtle emotional beat. GPT Image 2 reasons through the recognition moment.
[Style block] + Frame 4 (story turn): The runner notices a small detail on the path - perhaps a lost glove or a bird taking flight. Medium close-up, the runner's face shifts from determination to recognition. Same coastal palette, soft golden side light breaking through.
Tension peak in tight close-up. The detail "single bead of sweat catching the light" gives GPT Image 2 a specific element to compose around.
[Style block] + Frame 8 (tension peak): Tight close-up of the runner's eyes, full of resolve, dawn light breaking across the face, single bead of sweat catching the light, cinematic anamorphic compression.
Closing tag with tonal shift. The cue "hopeful tonal shift" guides the palette evolution; GPT Image 2 carries the cinematic style across the shift.
[Style block] + Frame 12 (closing tag): Wide reverse over the cliff, the runner silhouetted against the now-fully-broken sun, the path stretching ahead, hopeful tonal shift to warmer palette, sense of arrival but also continuation.
Write narrative beats, not shot directions. GPT Image 2 reasons through story; pure shot lists waste its strength.
Save the cinematic style block once and wire it into every storyboard node. Editing the block once propagates to all panels.
Use 5+ named elements per beat for dense compositions. GPT Image 2 handles density better than FLUX.2 / Midjourney.
For character lock, describe the protagonist in detailed terms each frame ("the same runner, dark gear, distinctive features"). GPT Image 2 cannot accept reference images for face lock; descriptive prompting gets ~70-80% coherence.
Set quality to "high" for hero panels, "standard" for variants and exploration. Quality affects both detail and prompt adherence.
For animatics, board the full 8-12 frames first, get approval, then animate selected frames in Sora 2 / Kling 3 — animating every panel is wasteful.
GPT Image 2 returns 1024-2048 wide outputs with strong narrative coherence and consistent style across 12+ frames when wired through a shared style block. Generation time 30-60s per panel. Best at narrative beats and multi-element composition; pair with Nano Banana 2 for character-lock workflows where face consistency must be near-perfect. Output drops onto the canvas — wire each chosen frame into Sora 2 / Kling 3 for animatic, or chain into Ideogram for any in-frame title cards.
Connect GPT Image 2 with other AI models on Martini's infinite canvas. No GPU required — start free.
Midjourney
Draft a shot list as image nodes laid out left-to-right on the Martini canvas using Midjourney v7 — cinematic frames with editorial mood that read as cohesive storyboard panels for a commercial or short film. Midjourney is the strongest pick when the storyboard needs atmospheric weight and painterly cinematography rather than literal staging. Lock a single style reference, fan into 8-12 frames keyed to script beats, then feed the strongest panels straight into Sora 2 or Kling 3 for animatic motion tests.
Black Forest Labs
Generate storyboard frames on Martini using FLUX.2 — the prompt-fidelity pick for boards where every shot is staged literally and frame composition must match the director's shot list verbatim. FLUX.2 renders explicit camera angles, character positions, and prop placements with literal accuracy, which is exactly what feature-quality pre-viz needs. Pair it with a saved cinematic style prompt, fan into 8-12 frames keyed to the shot list, and feed selected panels into Sora 2 or Kling 3 for animatic.