AI Avatar Generator on Martini
Generate the avatar portrait that becomes a talking head on the same canvas. Most avatar tools are stock-face pickers; this one generates a custom portrait, locks it as a reference node, and wires it into Kling Avatar or OmniHuman for lip-sync without re-uploading. Built for SaaS founders who can't record weekly, course creators producing twelve modules, and brands building a recurring AI spokesperson for daily social drops. The portrait is the upstream asset — the talking head is the deliverable.
What you can generate
- Custom brand spokesperson portraits anchored to a recurring identity
- Course host portraits that survive twelve weeks of weekly module drops
- Daily-Reels avatar personas with locked face and shifting wardrobe
- Multilingual avatar portraits — same face, different language voice downstream
- Executive-comms avatars for internal product and policy updates
- AI influencer base portraits ready for the multi-week character pipeline
- Podcast cover host portraits that match the show audio identity
- Expression and pose variants from a single canonical portrait
Best Martini workflow
Why this is more than a one-shot generator on Martini.
- The portrait is the anchor. Generate it once on Nano Banana 2, lock it as a reference node, and reuse it across every talking-head, expression sheet, and outfit variant on the canvas.
- Wire the locked portrait into Kling Avatar for the talking-head video — the lip-sync model reads the still and produces the motion without a separate avatar import step.
- Pair with ElevenLabs voice-clone audio plus the script node and the canvas now drives the full talking-head pipeline end-to-end.
- High-resolution portrait equals better lip-sync. Soft or low-res portraits produce smudged mouth shapes; clean the still on Flux Kontext or gpt-image-2 before the video step.
- Save the canvas as a spokesperson template — every weekly drop reuses the same portrait, voice, and node chain, only the script changes.
Recommended models
nano-banana-2 (image)
Reference-faithful portrait generation and refinement — the canonical face-locker for recurring spokesperson identity.
flux (image)
High-fidelity portrait rendering for studio-grade hero shots and editorial frames.
kling-avatar (video)
Downstream talking-avatar lip-sync — reads the locked portrait and animates the mouth to the audio track.
omnihuman (video)
Downstream full-body avatar motion when the brief calls for gesture and posture, not just a talking head.
flux-kontext (image)
Outfit and expression variations from the canonical portrait without breaking the face anchor.
Prompt examples
Approachable founder, mid-thirties, navy crew-neck, soft front key light, neutral studio backdrop, calm direct gaze, 4:5 portrait framing.
SaaS-founder spokesperson — clean studio anchor that survives across daily product-update talking heads.
Course host, warm grandmother type, late 60s, glasses, soft smile, kitchen background with soft daylight, eye-level camera, 4:5 framing.
Cooking-course recurring host; the kitchen backdrop pre-sells the show.
Athletic brand spokesperson, late 20s, energetic, gym lighting with rim accent, slight forward lean, focused expression, 9:16 vertical framing.
Vertical-first fitness avatar; ready for Reels and Shorts talking-head cuts.
Tech presenter, mid-30s, sharp navy suit, futuristic backdrop with subtle gradient, confident pose, three-quarter angle, 16:9 framing.
Keynote-style avatar for product launches; landscape framing matches webinar layouts.
Educational channel host, early 40s, soft cardigan, library backdrop with warm tungsten, kind expression, eye-level direct gaze, 4:5 framing.
Education-channel host; the library backdrop signals trustworthy long-form content.
AI influencer base portrait, mid-20s, natural makeup, beach late-afternoon golden light, three-quarter turn, soft smile, 4:5 framing.
Influencer baseline — week-one portrait for the recurring weekly drop pipeline.
Podcast host portrait, early 30s, headphones around the neck, dark studio backdrop with rim light, half-smile, 1:1 cover framing.
Podcast cover host; same anchor reuses across episode art and talking-head cuts.
Executive comms portrait, late 40s, charcoal blazer, neutral office backdrop with shallow depth of field, calm direct gaze, 16:9 framing.
Internal-comms avatar; ready for a weekly leadership message lip-synced to a voice clone.
Turn this output into a workflow
Generation is the first node — here's where to take it next.
Open /features/ai-avatar-video-generator for the deep-dive feature explainer — the workflow companion to this portrait page.
Wire the locked portrait into /features/ai-lip-sync to drive the talking head off the script and voice clone.
Pair with /features/ai-voiceover-generator and /prompts/audio/voiceover-script-prompts so the script delivery matches the portrait persona.
Run the recurring identity through /workflows/character-consistency to keep the face on-model across weekly drops.
Chain into /features/ai-talking-head-video for the full spokesperson cut, then ship to /features/ai-video-nle-export for finishing.
Frequently asked questions
How is this different from /features/ai-avatar-video-generator?
This generator page is the image-stage portrait — the avatar still you generate before lip-sync. The feature page is the talking-head video pipeline. They are upstream-downstream companions: generate the portrait here, turn it into a talking head there.
Can I generate a real person's avatar?
Real-person likeness without consent is a no-go. Generate original avatars or clone with explicit written permission. Disclose AI when client policy requires it (for example, LinkedIn corporate communications).
Why does my lip-sync look smudged?
Low-resolution or soft portraits produce smudged mouth shapes. Generate the portrait at high fidelity, run a cleanup pass on gpt-image-2 or Flux Kontext, then feed the clean still into Kling Avatar or OmniHuman.
How do I keep the face on-model across weekly drops?
Anchor a single canonical portrait and never re-prompt the face per drop. Use Flux Kontext for outfit and expression edits from the same anchor — the face survives every variation.
Can I run a multilingual talking head off the same portrait?
Yes — the portrait is locale-independent. Generate a separate ElevenLabs voice per locale, feed both into Kling Avatar with the same portrait anchor, and the face stays consistent across language drops.
How does voice cloning fit in?
Voice cloning needs explicit consent. Use ElevenLabs to clone the brand voice (or your own) and pair it with the portrait through the lip-sync node. See /prompts/audio/voiceover-script-prompts for delivery patterns.
Generate it on the canvas
Open Martini, drop this generator on the canvas, and wire it into the workflow you actually need. Free to start — no card required.