Run the right open model
on the hardware you own.
Local LLM Butler scans your machine, picks the exact model and quant that fits your GPU, and hands you a working Ollama or llama.cpp setup. The tribal knowledge of r/LocalLLaMA, as a wizard.
curl -fsSL https://butler.aiskillhub.info/scan.sh | sh
Hardware scan, zero guesswork
One command reads your GPU, VRAM, RAM, and CPU. No more Reddit threads about whether Q4_K_M fits in 12 GB.
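As a rough illustration of the fit question the scan answers, the sketch below estimates quantized weight size from parameter count and bits per weight. The 4.8 bits-per-weight figure for Q4_K_M and the 1.5 GiB allowance for KV cache and buffers are approximations assumed here, not values taken from Local LLM Butler.

```python
def estimate_weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(
    params_billion: float,
    bits_per_weight: float,
    vram_gib: float,
    overhead_gib: float = 1.5,  # assumed allowance for KV cache and buffers
) -> bool:
    """True if weights plus the assumed overhead fit in the given VRAM."""
    needed = estimate_weights_gib(params_billion, bits_per_weight) + overhead_gib
    return needed <= vram_gib


Q4_K_M_BPW = 4.8  # approximate; the real figure varies by model architecture

print(fits_in_vram(13, Q4_K_M_BPW, vram_gib=12))  # True: about 7.3 GiB of weights
print(fits_in_vram(34, Q4_K_M_BPW, vram_gib=12))  # False: about 19 GiB of weights
```

Real memory use also grows with context length, so treat this as a first-pass filter only.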
Exact model + quant match
We map your hardware against a curated database of open models and GGUF quant levels, then rank what actually runs well, not just what loads.
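One way to picture "runs well, not just loads" is to require spare VRAM beyond the weights and then sort the survivors by quality. This is a minimal sketch under that assumption: the `Candidate` type, the placeholder model names, sizes, and quality scores are all hypothetical and do not reflect the actual database or ranking logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str        # placeholder model name (hypothetical)
    quant: str       # GGUF quant level, e.g. "Q4_K_M"
    size_gib: float  # size of the quantized weights
    quality: float   # hypothetical score, higher is better


def rank(candidates, vram_gib: float, headroom_gib: float = 2.0):
    """Keep candidates that leave headroom in VRAM, best quality first."""
    usable = [c for c in candidates if c.size_gib + headroom_gib <= vram_gib]
    return sorted(usable, key=lambda c: c.quality, reverse=True)


# Illustrative entries only; not real benchmark data.
candidates = [
    Candidate("model-a", "Q4_K_M", 7.3, 0.70),
    Candidate("model-b", "Q4_K_M", 19.0, 0.85),
    Candidate("model-c", "Q5_K_M", 9.5, 0.75),
    Candidate("model-d", "Q8_0", 11.5, 0.80),
]

for c in rank(candidates, vram_gib=12.0):
    print(c.name, c.quant)
```

Here `model-d` would load in 12 GB but is filtered out because it leaves no room for context, which is the distinction the ranking is meant to capture.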
Ready-to-run config
Get a copy-pasteable Ollama or llama.cpp setup (context size, offload layers, flags), plus a clean local chat and an OpenAI-compatible API endpoint.
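To make the pieces of such a config concrete, the sketch below assembles a `llama-server` command line from a context size and a GPU layer count, and builds the URL of the OpenAI-compatible chat endpoint that server exposes. The model path and the numbers are placeholders, and the output format Local LLM Butler actually generates is not specified here.

```python
import shlex


def llama_server_command(model_path: str, ctx_size: int, gpu_layers: int,
                         port: int = 8080) -> list[str]:
    """Build a llama.cpp server command as an argument list."""
    return [
        "llama-server",
        "-m", model_path,         # path to the GGUF file
        "-c", str(ctx_size),      # context size in tokens
        "-ngl", str(gpu_layers),  # number of layers offloaded to the GPU
        "--port", str(port),
    ]


def chat_completions_url(port: int = 8080) -> str:
    """URL of the OpenAI-compatible chat endpoint on the local server."""
    return f"http://localhost:{port}/v1/chat/completions"


# Placeholder values for illustration.
cmd = llama_server_command("model.gguf", ctx_size=4096, gpu_layers=35)
print(shlex.join(cmd))
print(chat_completions_url())
```

With Ollama the same OpenAI-style path is served under its own default port, 11434, so only the base URL changes for client code.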
Pricing
Free
₹0 / $0
- Full hardware scan
- One recommended model + quant
- Ollama / llama.cpp config generator
Pro
₹1,499/mo / $19/mo
- Auto-updates when better models drop
- Multi-model routing (fast + smart)
- Private RAG over your documents
- Priority support