OpenAI
Sora 2 is OpenAI's video model, and its standout strength is physics simulation: liquids pour realistically, fabrics drape naturally, and objects interact with believable weight and momentum. For video ads, this means product shots look physically convincing without the uncanny "AI float" that plagues other models. On Martini, Sora 2 costs 100 credits for a 10-second clip or 150 credits for 15 seconds, and offers only two aspect ratios: 16:9 (landscape) and 9:16 (portrait). Beyond clip length and aspect ratio there are no quality tiers, speed options, or other knobs to tune, so all your creative energy goes into the prompt and reference image.
Sora 2 supports exactly two aspect ratios: 16:9 for landscape (YouTube pre-rolls, TV commercials, website hero videos, LinkedIn) and 9:16 for portrait (TikTok, Instagram Reels, YouTube Shorts, Snapchat). Apart from clip length, this is the only configuration decision; there are no quality tiers or speed options. Choose your format before writing the prompt, because the composition must match the orientation. A product centered in a 16:9 frame will be awkwardly cropped if you later decide to use the video vertically. If you need both orientations (most brands do), run the same prompt once in each format rather than cropping one into the other, because Sora 2 adapts its composition to the aspect ratio.
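The platform-to-format mapping above can be sketched as a small lookup. This is a planning helper only: the dictionary, the function names, and the shape of the returned plan are illustrative assumptions, not part of any Martini or Sora 2 API.

```python
# Platform-to-format lookup built from the platforms named in the text.
# All names here are hypothetical helpers for planning, not a Martini API.
ASPECT_BY_PLATFORM = {
    "youtube pre-roll": "16:9",
    "tv commercial": "16:9",
    "website hero": "16:9",
    "linkedin": "16:9",
    "tiktok": "9:16",
    "instagram reels": "9:16",
    "youtube shorts": "9:16",
    "snapchat": "9:16",
}


def formats_needed(platforms):
    """Return the sorted list of aspect ratios required to cover the platforms."""
    ratios = set()
    for name in platforms:
        key = name.strip().lower()
        if key not in ASPECT_BY_PLATFORM:
            raise ValueError(f"unknown platform: {name!r}")
        ratios.add(ASPECT_BY_PLATFORM[key])
    return sorted(ratios)


def generation_plan(prompt, platforms):
    """Plan one generation per required aspect ratio, reusing the same prompt."""
    return [
        {"prompt": prompt, "aspect_ratio": ratio}
        for ratio in formats_needed(platforms)
    ]
```

A campaign targeting TikTok and LinkedIn yields two planned generations, one per orientation, which matches the advice to generate each format separately instead of cropping.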
Think of each Sora 2 generation as one continuous camera shot, not a multi-scene video. Describe a single, uninterrupted action with explicit camera movement: "A slow-motion close-up of espresso being pulled, crema forming in rich swirls, steam rising into soft backlight, camera slowly dollying forward." Sora 2 executes smooth camera paths better than most video models — it handles tracking shots, dolly movements, crane rises, and steadicam follows naturally. The key mistake beginners make is describing multiple scenes in one prompt ("first the product appears, then someone picks it up, then the logo fades in"). Sora 2 interprets the entire prompt as a single continuous shot, so describing scene transitions creates confused output. Instead, generate each shot separately on the canvas.
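A rough way to catch multi-scene prompts before spending credits is to scan for sequencing words. The sketch below is a heuristic under stated assumptions: the word list is an illustrative guess rather than a rule Sora 2 documents, it can flag harmless phrases, and the function names are hypothetical.

```python
import re

# Sequencing phrases that suggest a prompt describes more than one shot.
# The word list is an assumption for illustration and is not exhaustive.
_SEQUENCE_PATTERN = re.compile(
    r"\b(first|then|next|after that|afterwards|finally|cut to|fades? in|fades? out)\b",
    re.IGNORECASE,
)


def sequencing_cues(prompt):
    """Return the sequencing phrases found in a prompt, lowercased, in order."""
    return [match.group(0).lower() for match in _SEQUENCE_PATTERN.finditer(prompt)]


def plan_shots(prompts):
    """Assign each single-shot prompt its own numbered slot (one Video node each)."""
    plan = []
    for index, prompt in enumerate(prompts, start=1):
        cues = sequencing_cues(prompt)
        if cues:
            raise ValueError(f"shot {index} looks multi-scene: {cues}")
        plan.append({"shot": index, "prompt": prompt})
    return plan
```

Run against the two prompts quoted above, the espresso prompt passes cleanly, while the "first the product appears, then..." prompt is flagged and should be split into separate shots.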
For any ad where the product must look exactly right, upload your product photo as the starting frame. Connect an Image node's output to the Video node's input on the canvas. Sora 2 will animate from this image, preserving the product's appearance (colors, logo placement, proportions, label text) while adding physically realistic motion. This is essential for brand advertising: text-to-video generation will approximate your product's look, but "approximately right" is unacceptable when a client's brand guidelines specify exact Pantone colors and logo sizing. Image-to-video mode starts from your approved creative as the first frame, and the model's strong physics simulation helps keep the product undistorted as it animates.
Sora 2 generates silent video with no built-in audio. For a complete ad, add audio nodes on the canvas: a Music node (Suno V5 for background music) and/or a TTS node (Minimax Speech or ElevenLabs for voiceover). This separation is an advantage for professional ad production: you control audio independently rather than relying on AI-generated audio that may not match your brand tone, pacing, or music licensing requirements. For a 10-second product ad, a typical canvas pipeline is: Image node (product photo) → Video node (Sora 2, 100 credits) → combined with Music node (Suno V5) and TTS node (voiceover). Total cost per ad variant: 100 credits for the video plus roughly 10-20 credits for audio, or about 110-120 credits in all. That is significantly less than Kling 3.0 Pro at 25 credits/second (250 credits for 10 seconds of video alone).
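The cost arithmetic can be checked with a few lines of Python. The prices are the ones quoted in this guide; the audio range is the guide's rough estimate rather than a fixed price, and the function names are illustrative.

```python
# Credit prices as quoted in this guide; treat them as a snapshot, not a price list.
SORA2_CREDITS = {10: 100, 15: 150}    # flat price per clip length (seconds -> credits)
KLING_PRO_CREDITS_PER_SECOND = 25
AUDIO_CREDITS_RANGE = (10, 20)        # rough estimate for music plus voiceover


def sora2_video_cost(seconds):
    """Credits for one Sora 2 clip; only 10 and 15 second clips are offered."""
    if seconds not in SORA2_CREDITS:
        raise ValueError("Sora 2 clips are 10 or 15 seconds")
    return SORA2_CREDITS[seconds]


def ad_variant_cost_range(seconds):
    """(low, high) credits for one Sora 2 ad variant including estimated audio."""
    video = sora2_video_cost(seconds)
    low, high = AUDIO_CREDITS_RANGE
    return video + low, video + high


def kling_pro_video_cost(seconds):
    """Credits for Kling 3.0 Pro video alone at the quoted per-second rate."""
    return KLING_PRO_CREDITS_PER_SECOND * seconds


print(ad_variant_cost_range(10))   # (110, 120)
print(kling_pro_video_cost(10))    # 250
```

At these prices a full 10-second Sora 2 variant with audio still costs less than half of the Kling 3.0 Pro video alone.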
Food product commercial — this prompt exploits Sora 2's physics engine. The honey stretches, pools, and catches light with realistic viscosity that other models approximate but don't nail. The "slow-motion" cue is key: it forces the model to render the pour at a pace where every physical detail (surface tension, light refraction, fluid dynamics) is visible. "Shallow depth of field" keeps the product sharp against a blurred background, mimicking a real macro lens. This type of physical interaction shot is Sora 2's strongest differentiator — Kling 3.0 and Hailuo 02 can generate similar compositions, but the liquid behavior won't be as physically convincing.
A slow-motion pour of golden honey into a glass jar, each drop catching light in exquisite detail, shallow depth of field, clean white background, product commercial style, 16:9
Automotive ad with complex multi-axis camera movement — "drone swooping" creates a smooth aerial tracking shot that simultaneously moves forward, downward, and rotates to follow the car. Sora 2 handles this kind of multi-axis camera movement naturally because it simulates the physical inertia of a real camera drone, producing smooth acceleration and deceleration rather than robotic linear motion. The "waves crashing on cliffs" adds secondary physics (water, spray) that reinforces the scene's realism.
Aerial drone shot swooping over a coastal highway at sunrise, a sleek electric car navigating the curves, ocean waves crashing on cliffs below, automotive ad cinematic quality, 16:9
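Both example prompts follow the same pattern: a single action, a few visual details, a style cue, and the aspect ratio at the end. A minimal sketch of a prompt builder for that pattern follows; the function and its parameters are hypothetical conveniences, and Sora 2 does not require this exact structure.

```python
VALID_ASPECT_RATIOS = ("16:9", "9:16")


def build_prompt(action, details, style, aspect_ratio):
    """Join one continuous action, its visual details, a style cue, and the format."""
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        raise ValueError(f"aspect ratio must be one of {VALID_ASPECT_RATIOS}")
    parts = [action, *details, style, aspect_ratio]
    # Drop empty pieces and stray whitespace so the result is a clean comma list.
    cleaned = [part.strip() for part in parts if part and part.strip()]
    return ", ".join(cleaned)


honey = build_prompt(
    action="A slow-motion pour of golden honey into a glass jar",
    details=[
        "each drop catching light in exquisite detail",
        "shallow depth of field",
        "clean white background",
    ],
    style="product commercial style",
    aspect_ratio="16:9",
)
print(honey)
```

Keeping the action in a single field makes it harder to slip a second scene into the prompt, and swapping only `aspect_ratio` produces the portrait variant of the same shot.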
Sora 2 has almost nothing to configure: you pick an aspect ratio (16:9 or 9:16) and a clip length (10 or 15 seconds), then write your prompt. This simplicity is a feature, not a limitation: you never waste time tweaking quality tiers or speed settings. All iteration happens at the prompt level, which is where it matters most for creative work.
At 100 credits for 10 seconds, Sora 2 is the most cost-effective model for 10-second ad clips. Compare: Kling 3.0 Pro charges 25 credits/second (250 credits for 10s), and Hailuo 02 Pro is similarly priced per second. If your ad concept works at 10 seconds, Sora 2 is hard to beat on value.
Describe slow motion explicitly ("slow-motion pour", "slow-motion fabric flutter") — Sora 2's physics simulation shines most when motion is slowed enough to reveal detail. The model renders surface tension, light refraction, and material deformation that only become visible at reduced speed.
For multi-shot ads, generate each shot separately on the canvas (one Video node per shot) and assemble them in post. Never try to describe scene transitions in a single prompt — Sora 2 treats everything as one continuous shot, so "first X, then Y" produces confused output.
Sora 2 generates 1080p video at 100 credits for 10 seconds or 150 credits for 15 seconds, a flat 10 credits per second that makes it one of the best-value options for 10- to 15-second ad clips on Martini. The model's core strength is physical plausibility: liquids, fabrics, smoke, and rigid objects behave with realistic weight, momentum, and surface interaction. This makes Sora 2 the top choice for product commercials where physical realism sells: food pours, fabric drapes, cosmetic textures, automotive motion. Its weakness is human faces at close range, where Kling 3.0 Pro is noticeably better at micro-expressions, natural blinking, and lip movement. The decision framework: if the product is the star (food, beauty, automotive, technology), choose Sora 2 for physics. If a person is the star (testimonials, lifestyle, fashion), choose Kling 3.0 for human motion. If color consistency and commercial polish matter most (brand campaigns with strict color guidelines), choose Hailuo 02, which occupies a middle ground.
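The decision framework reduces to a three-way lookup. The sketch below encodes it as stated in this guide; the priority labels and function name are illustrative, and real briefs that mix priorities will need judgment the lookup does not capture.

```python
# The guide's decision framework as a lookup. The keys are hypothetical labels
# for "what is the star of the ad"; the model names come from the guide.
RECOMMENDATIONS = {
    "product": "Sora 2",     # food, beauty, automotive, technology: physics matters
    "person": "Kling 3.0",   # testimonials, lifestyle, fashion: human motion matters
    "color": "Hailuo 02",    # strict brand color guidelines and commercial polish
}


def recommend_model(priority):
    """Return the guide's suggested model for the ad's main priority."""
    key = priority.strip().lower()
    if key not in RECOMMENDATIONS:
        raise ValueError(f"priority must be one of {sorted(RECOMMENDATIONS)}")
    return RECOMMENDATIONS[key]
```

For example, `recommend_model("product")` returns `"Sora 2"` and `recommend_model("person")` returns `"Kling 3.0"`.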
Kling
Kling 3.0 is the best video model for ads featuring people. It generates the most natural human motion, facial expressions, and lip movement of any model on Martini. With Standard and Pro quality tiers, it scales from quick storyboarding to final ad-quality output. If your video ad shows a person — drinking coffee, unboxing a product, giving a testimonial — Kling 3.0 Pro should be your first choice.
Google's Veo 3 is the only video model on Martini that generates synchronized audio alongside video. Every other model produces silent video that requires separate audio work. For ads, this is transformative: you get ambient sound, sound effects, and even music in a single generation step. The latest version (Veo 3.1) offers Standard and Fast tiers with support for reference images.
Minimax
Hailuo 02 by Minimax is the workhorse for video ad production — reliably generating clean, well-composed product commercials with consistent color accuracy. Where Sora 2 excels at physics and Kling 3.0 at people, Hailuo 02 excels at commercial polish: product reveals, beauty shots, and food content with the kind of clean, controlled composition that clients expect from ad agencies. Its Standard and Pro tiers let you iterate cheaply and deliver expensively.