Offline AI Voice Assistant Build

Customer: AI | Published: 08.01.2026

I’m putting together an entirely offline voice assistant named “Aizen.” Everything must run locally on Windows and ship as a single-file .exe produced with PyInstaller, so no cloud calls are allowed at any point.

Core stack
• Vosk handles real-time speech-to-text.
• LLaMA 3 (quantized) interprets intent and decides what to do next.
• pyttsx3 provides the reply voice, and I’ll want room to fine-tune rate, pitch, and voice persona.

What it has to do on day one
• Wake, listen continuously, and transcribe in near real time.
• Recognise intents such as “open browser,” “search for…,” and simple typing instructions.
• Open Google Chrome automatically when requested, then pass the search or typed text straight into the active field so the user sees immediate results.

Architecture expectations
I need clear, well-commented modules for STT, NLU, TTS, and action execution so I can bolt on new skills later without touching core logic. Dependency management, model downloads, and configuration should all be scripted to make first-time setup painless.

Deliverables
1. Source code with a README that explains installation, model paths, and how to add a new intent.
2. PyInstaller spec and the resulting Windows .exe that runs fully offline.
3. A basic test script or instructions proving the assistant opens Chrome and performs a spoken search query end-to-end.

If everything works smoothly I’ll roll the same framework out to additional apps and browsers, so writing clean, extendable code will be valued just as highly as the immediate features above.
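To make the intent requirements concrete, here is a minimal rule-based sketch of the NLU layer's deterministic fallback path. In the full build the quantized LLaMA 3 model would do this mapping; the function and intent names below are illustrative placeholders, not part of the brief:

```python
import re

def parse_intent(utterance: str) -> tuple[str, str]:
    """Map a transcribed utterance to an (intent, argument) pair.

    Rule-based fallback for the three day-one intents; the quantized
    LLaMA 3 model would normally handle this. Intent names are
    placeholders chosen for this sketch.
    """
    text = utterance.lower().strip()
    if match := re.match(r"search for (.+)", text):
        return ("search", match.group(1))
    if match := re.match(r"type (.+)", text):
        return ("type_text", match.group(1))
    if "open browser" in text:
        return ("open_browser", "")
    return ("unknown", text)
```

A deterministic layer like this is also useful as a safety net: if the LLM returns something unparseable, the assistant can still handle the core commands.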
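One way to satisfy the "bolt on new skills without touching core logic" expectation is a small intent registry, where each skill registers itself against an intent name and the dispatcher never changes. This is only a sketch of that pattern, with assumed names:

```python
from typing import Callable, Dict

# Maps an intent name to its handler; populated by the @skill decorator.
SKILLS: Dict[str, Callable[[str], str]] = {}

def skill(intent: str):
    """Decorator that registers a handler for an intent name, so adding
    a new skill means adding one decorated function, nothing more."""
    def register(handler: Callable[[str], str]):
        SKILLS[intent] = handler
        return handler
    return register

@skill("search")
def do_search(argument: str) -> str:
    # Placeholder: the real handler would drive Chrome with the query.
    return f"Searching for {argument}"

def dispatch(intent: str, argument: str) -> str:
    """Route a parsed (intent, argument) pair to its registered skill."""
    handler = SKILLS.get(intent)
    if handler is None:
        return f"Sorry, I don't know how to handle '{intent}' yet."
    return handler(argument)
```

New apps and browsers would then become new `@skill("...")` functions, which matches the extensibility goal stated above.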
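The Chrome-plus-spoken-search deliverable is easiest to test end-to-end if URL construction is kept separate from the side-effecting process launch. A sketch under stated assumptions: the Chrome path below is the common default install location, and the Google search URL format is assumed rather than specified in the brief:

```python
import subprocess
from urllib.parse import quote_plus

# Assumed default install path; a real build should read this from config.
CHROME_PATH = r"C:\Program Files\Google\Chrome\Application\chrome.exe"

def build_search_url(query: str) -> str:
    """URL-encode a spoken query into a Google search URL (pure, testable)."""
    return f"https://www.google.com/search?q={quote_plus(query)}"

def open_chrome_with_query(query: str) -> None:
    """Launch Chrome directly on the search results for the query.

    Kept separate from build_search_url so the encoding logic can be
    unit-tested without actually spawning a browser.
    """
    subprocess.Popen([CHROME_PATH, build_search_url(query)])
```

Passing the URL on Chrome's command line avoids having to focus a window and simulate keystrokes for the search case; synthetic typing would still be needed for the "type into the active field" intent.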