local-llm-butler

Run the right open model
on the hardware you own.

Local LLM Butler scans your machine, picks the exact model and quant that fit your GPU, and hands you a working Ollama or llama.cpp setup. The tribal knowledge of r/LocalLLaMA, as a wizard.

Scan my hardware (free)
See pricing

curl -fsSL https://butler.aiskillhub.info/scan.sh | sh

Hardware scan, zero guesswork

One command reads your GPU, VRAM, RAM and CPU. No more digging through Reddit threads to find out whether Q4_K_M fits in 12 GB.
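The "does it fit" question comes down to simple arithmetic, which can be sketched as follows. This is a rough illustration, not Butler's actual logic: the bits-per-weight figures are approximate averages for llama.cpp quant types, and the `overhead_gb` allowance for KV cache and runtime buffers is an assumed placeholder that depends heavily on context size.

```python
# Approximate average bits per weight for a few GGUF quant levels.
# Rough figures assumed for illustration; real files vary by model.
APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
}


def weights_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the model weights in GB (1 GB = 1e9 bytes)."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    # billions of params * bits per param / 8 bits per byte = billions of bytes
    return params_billion * bits / 8


def fits_in_vram(params_billion: float, quant: str, vram_gb: float,
                 overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed overhead allowance fit in VRAM."""
    return weights_gb(params_billion, quant) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for size in (7, 13, 34):
        need = weights_gb(size, "Q4_K_M")
        verdict = "fits" if fits_in_vram(size, "Q4_K_M", 12) else "does not fit"
        print(f"{size}B at Q4_K_M: ~{need:.1f} GB of weights, {verdict} in 12 GB")
```

Under these assumptions, a 13B model at Q4_K_M needs roughly 7.9 GB for weights and fits in 12 GB, while a 34B model at the same quant does not.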

Exact model + quant match

We map your hardware against a curated database of open models and GGUF quant levels, and rank what actually runs well — not just what loads.
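One way to picture the difference between "loads" and "runs well" is a ranking that drops candidates leaving too little spare VRAM. The sketch below is a hypothetical illustration: the model names are placeholders, the `min_headroom_gb` threshold is an assumption, and the real ranking may weigh very different factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str               # placeholder label, not a real model
    params_billion: float
    bits_per_weight: float  # approximate average for the quant level

    @property
    def weights_gb(self) -> float:
        return self.params_billion * self.bits_per_weight / 8


def rank_candidates(candidates, vram_gb, min_headroom_gb=2.0):
    """Keep candidates that leave spare VRAM, then prefer larger models
    and, within the same size, higher-precision quants."""
    usable = [
        c for c in candidates
        if vram_gb - c.weights_gb >= min_headroom_gb
    ]
    return sorted(
        usable,
        key=lambda c: (c.params_billion, c.bits_per_weight),
        reverse=True,
    )


if __name__ == "__main__":
    catalog = [
        Candidate("model-7b-q4", 7, 4.85),
        Candidate("model-7b-q8", 7, 8.5),
        Candidate("model-13b-q4", 13, 4.85),
        Candidate("model-13b-q6", 13, 6.56),  # loads in 12 GB, but little headroom
        Candidate("model-34b-q4", 34, 4.85),  # does not load in 12 GB
    ]
    for c in rank_candidates(catalog, vram_gb=12):
        print(f"{c.name}: ~{c.weights_gb:.1f} GB")
```

Here the 13B model at the higher-precision quant would technically load into 12 GB, but it is excluded because it leaves less than the assumed 2 GB for context and buffers.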

Ready-to-run config

Get a copy-pasteable Ollama or llama.cpp setup (context size, offload layers, flags), plus a clean local chat and an OpenAI-compatible API endpoint.
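As a rough idea of what such a setup might contain, the sketch below renders a `llama-server` command line and an Ollama Modelfile from three settings. The helper names, the model path and the numeric values are illustrative assumptions, and the output Butler actually generates may look different. The flags used (`-m`, `-c`, `-ngl`, `--host`, `--port`) and the Modelfile parameters (`num_ctx`, `num_gpu`) are existing llama.cpp and Ollama options.

```python
import shlex


def llama_server_command(model_path, ctx_size, gpu_layers,
                         host="127.0.0.1", port=8080):
    """Build an argv list for llama.cpp's llama-server."""
    return [
        "llama-server",
        "-m", model_path,         # path to the GGUF file
        "-c", str(ctx_size),      # context size in tokens
        "-ngl", str(gpu_layers),  # number of layers offloaded to the GPU
        "--host", host,
        "--port", str(port),
    ]


def ollama_modelfile(model_path, ctx_size, gpu_layers):
    """Render a minimal Ollama Modelfile with the same settings."""
    lines = [
        f"FROM {model_path}",
        f"PARAMETER num_ctx {ctx_size}",
        f"PARAMETER num_gpu {gpu_layers}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values; a real scan would choose these per machine.
    print(shlex.join(llama_server_command("./model.gguf", 4096, 33)))
    print(ollama_modelfile("./model.gguf", 4096, 33), end="")
```

With `llama-server` running, OpenAI-style clients can typically be pointed at `http://127.0.0.1:8080/v1` (the host and port chosen above).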

Pricing

Free

₹0 / $0

· Full hardware scan
· One recommended model + quant
· Ollama / llama.cpp config generator

Scan my hardware

Pro

₹1,499/mo / $19/mo

· Auto-updates when better models drop
· Multi-model routing (fast + smart)
· Private RAG over your documents
· Priority support

Join the waitlist