local-llm-butler

Run the right open model
on the hardware you own.

Local LLM Butler scans your machine, picks the exact model and quant that fits your GPU, and hands you a working Ollama or llama.cpp setup. The tribal knowledge of r/LocalLLaMA, as a wizard.

curl -fsSL https://butler.aiskillhub.info/scan.sh | sh

Hardware scan, zero guesswork

One command reads your GPU, VRAM, RAM and CPU. No more Reddit threads about whether Q4_K_M fits in 12 GB.
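For a feel of the arithmetic behind "does it fit in 12 GB", here is a minimal sketch. It is not the Butler's actual logic. The function names, the 1.5 GB overhead allowance and the figure of roughly 4.85 bits per weight for Q4_K_M are all illustrative assumptions.

```python
# Rough VRAM-fit arithmetic. All constants here are illustrative assumptions.

def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed KV-cache/runtime overhead fit in VRAM."""
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


# Q4_K_M averages roughly 4.85 bits per weight (approximate, varies by model).
print(fits_in_vram(13, 4.85, vram_gb=12))  # True: about 7.9 GB of weights
print(fits_in_vram(34, 4.85, vram_gb=12))  # False: about 20.6 GB of weights
```

Real usage also depends on context length, since the KV cache grows with it, so treat the fixed overhead as a placeholder.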

Exact model + quant match

We map your hardware against a curated database of open models and GGUF quant levels, and rank what actually runs well — not just what loads.
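One way to picture "runs well, not just loads" is to filter a catalog by memory headroom and then sort by quality. The sketch below assumes a hypothetical `Candidate` record and a made-up catalog. The names, sizes, scores and the 2 GB headroom are placeholders, not entries from the real database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    quant: str
    size_gb: float  # approximate size of the GGUF file
    quality: int    # hypothetical score, higher is better


# Hypothetical catalog entries; names, sizes and scores are made up.
CATALOG = [
    Candidate("model-7b", "Q8_0", 7.2, 60),
    Candidate("model-13b", "Q4_K_M", 7.9, 70),
    Candidate("model-13b", "Q8_0", 13.8, 75),
    Candidate("model-34b", "Q4_K_M", 20.6, 85),
]


def rank(catalog, vram_gb, headroom_gb=2.0):
    """Keep candidates that leave headroom free, best quality first."""
    usable = vram_gb - headroom_gb
    runnable = [c for c in catalog if c.size_gb <= usable]
    return sorted(runnable, key=lambda c: c.quality, reverse=True)


best = rank(CATALOG, vram_gb=12)[0]
print(best.name, best.quant)  # model-13b Q4_K_M
```

Reserving headroom is the design choice that matters here: a model that barely loads leaves nothing for the KV cache and tends to run poorly.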

Ready-to-run config

Get a copy-pasteable Ollama or llama.cpp setup (context size, offload layers, flags), plus a local chat interface and an OpenAI-compatible API endpoint.
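As an illustration of what a generated llama.cpp setup might contain, the sketch below builds a `llama-server` command line from a context size and an offload-layer count. The helper name, the model path and the default values are assumptions. The flags `-m`, `-c`, `-ngl` and `--port` are standard llama.cpp options.

```python
import shlex


def llama_server_command(model_path, ctx_size=4096, gpu_layers=99, port=8080):
    """Build an argv list for llama.cpp's llama-server binary."""
    return [
        "llama-server",
        "-m", model_path,         # path to the GGUF file
        "-c", str(ctx_size),      # context size in tokens
        "-ngl", str(gpu_layers),  # number of layers offloaded to the GPU
        "--port", str(port),      # HTTP port for the local server
    ]


# Hypothetical model path, used only for illustration.
cmd = llama_server_command("models/example.Q4_K_M.gguf", ctx_size=8192, gpu_layers=35)
print(shlex.join(cmd))
# llama-server -m models/example.Q4_K_M.gguf -c 8192 -ngl 35 --port 8080
```

Once running, `llama-server` serves an OpenAI-compatible chat endpoint under `/v1/chat/completions` on the chosen port, so existing OpenAI client code can usually be pointed at it by changing the base URL.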

Pricing

Free

₹0 / $0

  • Full hardware scan
  • One recommended model + quant
  • Ollama / llama.cpp config generator
Scan my hardware

Pro

₹1,499/mo / $19/mo

  • Auto-updates when better models drop
  • Multi-model routing (fast + smart)
  • Private RAG over your documents
  • Priority support
Join the waitlist