I’m building an end-to-end solution that lets campaign teams greet every voter with a short, personalized video. Each video features a realistic-looking avatar who calls the voter by name, references local issues, and directs them to take action.

The core requirements are:

• One pipeline that first creates a high-fidelity, realistic avatar from a supplied reference photo, then automatically generates a short script and renders a video unique to each voter.
• Output must be web-ready (MP4 or WebM) with thumbnails and embed code so my team can drop the clips straight into our existing websites.
• A simple API or webhook so we can trigger creation from our CRM and retrieve the finished asset as soon as it is rendered.

I’m already collecting voter data; what’s missing is the AI glue that turns those fields into engaging media. This project should feel familiar if you’ve worked with tools such as Stable Diffusion, D-ID, Synthesia, ElevenLabs, or similar, and can stitch them together with custom code for dynamic text-to-speech, lip-sync, and video compositing.

Please outline:

1. The tech stack you’d use for avatar generation, TTS, and video assembly.
2. How long a 15-second clip will take to render on the GPU hardware you recommend.
3. A brief plan for QA so I can be sure each clip shows the correct name and issue reference before it goes live.

Successful delivery is a working prototype I can deploy on my server (Docker container or Node/Python repo), plus clear documentation so my devs can maintain it. After that, we can expand into social or email distribution, but the initial focus is seamless website integration.
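
To illustrate the QA step in item 3, here is a minimal sketch of a check that could run on each generated script before it is sent to rendering. It is only an assumption about how such a check might look: the field names `first_name` and `local_issue` and the function `check_script` are hypothetical and not tied to any existing CRM or tool. It validates the script text only, not the rendered audio or video.

```python
import re

# Hypothetical CRM field names; adjust to match the real export.
REQUIRED_FIELDS = ("first_name", "local_issue")

# Matches leftover template placeholders such as {first_name} or {{issue}}.
PLACEHOLDER_RE = re.compile(r"\{\{?\s*\w+\s*\}?\}")


def check_script(script: str, voter: dict) -> list[str]:
    """Return a list of QA problems; an empty list means the script passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = str(voter.get(field, "")).strip()
        if not value:
            problems.append(f"missing voter field: {field}")
            continue
        # Whole-word, case-insensitive match so "Ann" does not pass on "Annual".
        pattern = r"(?<!\w)" + re.escape(value) + r"(?!\w)"
        if not re.search(pattern, script, flags=re.IGNORECASE):
            problems.append(f"{field} {value!r} not found in script")
    if PLACEHOLDER_RE.search(script):
        problems.append("unfilled template placeholder in script")
    return problems
```

A script would go on to rendering only when `check_script` returns an empty list; anything else could be held back for manual review.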