Vidu
Vidu is a reference-driven video generation family with five specialized variants. It is designed to keep characters consistent across clips by drawing on multiple reference images and, in one variant, a reference video.
The Vidu family is built around reference-guided generation. Q1 generates video from multiple reference images, combining their visual cues into a coherent clip. Q3 is the general-purpose model, supporting text-to-video (T2V) and image-to-video (I2V) across five aspect ratios at 1080p. Q2 Subject Reference accepts 1-7 input images to lock a subject's identity. Q2 Image Reference takes multi-image reference sets for broader visual guidance. Q2 Pro Video Reference operates in video-to-video (V2V) mode, using an existing video plus reference images to produce a new clip that merges both inputs. This layered approach makes Vidu a strong fit when character consistency across a series of clips is the top priority.
| Variant | Description |
|---|---|
| Vidu Q1 | Reference-to-video from multiple images, combining visual cues. |
| Vidu Q3 | General T2V and I2V across 5 aspect ratios at 1080p. |
| Vidu Q2 Subject Ref | Subject reference with 1-7 input images for strong identity locking. |
| Vidu Q2 Image Ref | Multi-image reference sets for broader visual guidance. |
| Vidu Q2 Pro Video Ref | V2V with reference images; merges an existing video with new visual direction. |
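
To make the table easier to apply, here is a minimal Python sketch that maps the inputs you have on hand to the variants described above. It is illustrative only: `ClipRequest`, its fields, and `candidate_variants` are hypothetical names invented for this example, not part of any Vidu or Martini API. The routing rules are a reading of the descriptions in the table, and the requirement that video reference mode needs at least one reference image is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ClipRequest:
    """Hypothetical description of the inputs available for one clip."""
    reference_images: int = 0       # number of reference images supplied
    has_source_video: bool = False  # True when an existing clip should be reworked
    lock_identity: bool = False     # True when one subject must stay consistent


def candidate_variants(req: ClipRequest) -> list[str]:
    """Return the variant names from the table that fit the request."""
    if req.reference_images < 0:
        raise ValueError("reference_images must be non-negative")

    # V2V: an existing video plus reference images (image requirement assumed).
    if req.has_source_video:
        if req.reference_images == 0:
            raise ValueError("video reference mode expects reference images too")
        return ["Vidu Q2 Pro Video Ref"]

    # Identity locking: the table lists 1-7 input images for subject reference.
    if req.lock_identity:
        if not 1 <= req.reference_images <= 7:
            raise ValueError("subject reference takes 1-7 input images")
        return ["Vidu Q2 Subject Ref"]

    # Several images without identity locking: both multi-image variants fit,
    # and the descriptions above do not say which to prefer.
    if req.reference_images >= 2:
        return ["Vidu Q1", "Vidu Q2 Image Ref"]

    # Text only, or a single image: the general-purpose T2V/I2V model.
    return ["Vidu Q3"]
```

For example, `candidate_variants(ClipRequest(reference_images=3, lock_identity=True))` returns `["Vidu Q2 Subject Ref"]`, while a request with no images and no video falls through to `["Vidu Q3"]`.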
Connect Vidu with other AI models on Martini's infinite canvas. No GPU required — start free.