Vidu
Vidu Reference-to-Image generates new images from up to 4 reference inputs, preserving subject identity while following a new text prompt. It supports 1:1, 16:9, and 9:16 aspect ratios and accepts editing instructions.
Vidu Reference-to-Image is a specialized model built for reference-driven creative workflows. Upload up to 4 reference images, and the model extracts and preserves the subject's identity, clothing, pose cues, and visual style while generating entirely new compositions based on your text prompt. Unlike general-purpose generators, Vidu is optimized specifically for this reference-to-output pipeline: you always start with an existing image and transform it.
The model operates in image-to-image mode by default and supports editing instructions that modify specific elements while preserving the overall composition. It outputs in three aspect ratios (1:1, 16:9, 9:16), covering square social posts, widescreen content, and vertical stories. At 13 credits per image, it is positioned as a premium reference tool.
On Martini, Vidu Reference-to-Image forms the first stage of a two-stage pipeline: generate a reference-consistent still, then animate it with the Vidu Q1, Q2, or Q3 video models. The same character identity carries through from still to motion.
Connect Vidu Reference-to-Image with other AI models on Martini's infinite canvas. No GPU required — start free.
Vidu Reference-to-Image does not support text-only generation. It is designed specifically for reference-based generation and always requires at least one input image. For text-only image generation, use FLUX, Imagen 4, or Midjourney instead.
Vidu Reference-to-Image accepts up to 4 reference images. Providing multiple angles of the same subject (frontal, three-quarter, profile) gives the model better understanding for identity preservation.
Vidu Reference-to-Image pairs directly with the Vidu Q1, Q2, and Q3 video models. Generate a reference-consistent still, then feed it into a Vidu video node; the character identity carries through to the animation.
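The limits described above (one to four reference images, three aspect ratios, 13 credits per image) can be checked before a job is submitted. The following is only a minimal sketch under stated assumptions: `ReferenceRequest`, `validate`, `estimate_credits`, and the `num_outputs` field are hypothetical names made up for illustration and are not part of any Martini or Vidu API. Only the numeric limits come from the text.

```python
from dataclasses import dataclass

# Limits quoted in the page text above.
MAX_REFERENCE_IMAGES = 4
ALLOWED_ASPECT_RATIOS = ("1:1", "16:9", "9:16")
CREDITS_PER_IMAGE = 13


@dataclass
class ReferenceRequest:
    """Hypothetical request shape, for illustration only (not a real API)."""
    prompt: str
    reference_images: list[str]
    aspect_ratio: str = "1:1"
    num_outputs: int = 1  # assumption: cost scales linearly with image count


def validate(request: ReferenceRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits the limits."""
    errors = []
    count = len(request.reference_images)
    if count < 1:
        errors.append("at least one reference image is required")
    elif count > MAX_REFERENCE_IMAGES:
        errors.append(f"at most {MAX_REFERENCE_IMAGES} reference images are accepted")
    if request.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        errors.append(f"aspect ratio must be one of {ALLOWED_ASPECT_RATIOS}")
    if request.num_outputs < 1:
        errors.append("num_outputs must be at least 1")
    return errors


def estimate_credits(request: ReferenceRequest) -> int:
    """Estimate the credit cost at the quoted 13 credits per image."""
    return CREDITS_PER_IMAGE * request.num_outputs


if __name__ == "__main__":
    request = ReferenceRequest(
        prompt="the same character walking through a rainy street at night",
        reference_images=["front.png", "three_quarter.png", "profile.png"],
        aspect_ratio="9:16",
        num_outputs=3,
    )
    print(validate(request))          # []
    print(estimate_credits(request))  # 39
```

Validating locally like this catches a missing reference image or an unsupported aspect ratio before any credits are spent. The real service may enforce additional rules that are not documented here.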
Midjourney
Midjourney v7 delivers artistic, highly detailed images with an iconic aesthetic style. It excels at creative illustration, concept art, and photorealistic renders with strong prompt adherence and built-in Niji mode for anime styles.
Black Forest Labs
FLUX by Black Forest Labs is a fast, high-quality image generation family known for photorealistic output and excellent prompt adherence. Variants span free-tier dev models to ultra-resolution Pro outputs.
Black Forest Labs
FLUX Kontext is a context-aware image generation and editing model that uses reference images to maintain character and style consistency across outputs. Available in Pro and Max tiers.