Aillex is a fully local talking AI assistant: her voice, brain, memory and animated 3D avatar all run on a single consumer GPU. No cloud, no subscription, no data leaving your machine. We build her in the open and publish every step so you can build your own. Start with the architecture overview, or watch her in action on YouTube.
Build a Local Talking AI Avatar: The Complete Architecture
The complete blueprint for a fully local AI companion (speech in, animated talking character out) running on one consumer GPU. This is the map; every stage links to a hands-on guide.
Give Your AI a Custom Voice — Cloned Locally, Running on CPU
Design a voice once, clone it forever: how we gave Aillex a warm, consistent voice with NeuTTS Air — zero GPU cost, zero per-word fees, fully offline.
Real-Time Lip-Sync with MuseTalk on an RTX 5090 (Blackwell Survival Guide)
MuseTalk generates lip-synced video faster than real time on a 5090 — but getting it to BUILD on Blackwell is dependency hell. Here’s the exact recipe that works.
Run a 26B AI Brain Locally — Warm, Multimodal, and With Memory
The brain is the biggest VRAM line item and the biggest latency trap. How we run a 26B multimodal LLM via Ollama with sub-second warm responses, persistent memory, and free screen vision.
Turn One AI Image into a Rigged, Animated 3D Character
From a single character image to a fully rigged, animated 3D model you can pose, dress and drive in the browser — an autonomous cloud pipeline with zero local GPU time.