Guides

Build a Local Talking AI Avatar: The Complete Architecture

The blueprint for a fully local AI companion: speech in, animated talking character out, all running on one consumer GPU. This is the map; every stage links to a hands-on guide.

Give Your AI a Custom Voice — Cloned Locally, Running on CPU

Design a voice once, clone it forever: how we gave Aillex a warm, consistent voice with NeuTTS Air — zero GPU cost, zero per-word fees, fully offline.

Real-Time Lip-Sync with MuseTalk on an RTX 5090 (Blackwell Survival Guide)

MuseTalk generates lip-synced video faster than real time on an RTX 5090, but getting it to build on Blackwell is dependency hell. Here’s the exact recipe that works.

Run a 26B AI Brain Locally — Warm, Multimodal, and With Memory

The brain is the biggest VRAM line item and the biggest latency trap. How we run a 26B multimodal LLM via Ollama with sub-second warm responses, persistent memory, and free screen vision.

Turn One AI Image into a Rigged, Animated 3D Character

From a single character image to a fully rigged, animated 3D model you can pose, dress and drive in the browser — an autonomous cloud pipeline with zero local GPU time.