Alibaba
HappyHorse 1.0 is Alibaba's flagship multimodal AI video model — a unified 15B-parameter Transformer that generates 1080p video with native synchronized audio from a single text or image prompt. It topped public benchmarks at launch with an Elo score of 1381, leading the second-place model by 107 points.
HappyHorse 1.0 supports three generation modes: Text-to-Video (T2V), Image-to-Video (I2V) and Subject-to-Video (S2V). These let you generate from a prompt, animate a still image, or insert a reference subject into a generated video while preserving its identity. Output runs up to 15 seconds at 1080p, with multiple shots and synchronized audio that includes lip-synced dialogue, ambient soundscapes and emotionally expressive vocal performances.
For editing, Video-to-Video (V2V) restyles existing footage while preserving structure and motion, and Subject-and-Video-to-Video (SV2V) replaces or inserts subjects from a reference image while keeping the original motion, composition and unaffected regions intact.
In Martini, HappyHorse 1.0 is available through the Geneasy provider in text-to-video and first-frame image-to-video modes.
Connect HappyHorse 1.0 with other AI models on Martini's infinite canvas. No GPU required — start free.