{ "title": "Local models", "content": "Running models locally is doable, but OpenClaw expects a large context window plus strong defenses against prompt injection. Small cards force context truncation and weaken safety behavior. Aim high: **≥2 maxed-out Mac Studios or an equivalent GPU rig (\\~\\$30k+)**. A single **24 GB** GPU handles only lighter prompts, and with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).\n\n## Recommended: LM Studio + MiniMax M2.1 (Responses API, full-size)\n\nThis is the best current local stack. Load MiniMax M2.1 in LM Studio, enable the local server (default `http://127.0.0.1:1234`), and use the Responses API to keep reasoning separate from final text.\n\n* Install LM Studio: [https://lmstudio.ai](https://lmstudio.ai)\n* In LM Studio, download the **largest MiniMax M2.1 build available** (avoid “small”/heavily quantized variants), start the server, and confirm that `http://127.0.0.1:1234/v1/models` lists it.\n* Keep the model loaded; a cold load adds startup latency.\n* Adjust `contextWindow`/`maxTokens` if your LM Studio build differs.\n* For WhatsApp, stick to the Responses API so only final text is sent.\n\nKeep hosted models configured even when running local; use `models.mode: \"merge\"` so fallbacks stay available.\n\n### Hybrid config: hosted primary, local fallback\n\nList the hosted provider as primary and add the local LM Studio endpoint as the fallback; with `models.mode: \"merge\"`, both pools stay available.\n\n### Local-first with hosted safety net\n\nSwap the primary and fallback order; keep the same providers block and `models.mode: \"merge\"` so you can fall back to Sonnet or Opus when the local box is down.\n\n### Regional hosting / data routing\n\n* Hosted MiniMax/Kimi/GLM variants also exist on OpenRouter with region-pinned endpoints (e.g., US-hosted). 
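\n\nWhichever route you choose (local LM Studio or a region-pinned hosted variant), the merged setup can be sketched roughly as follows; the field names and the `minimax-m2.1` model ID are illustrative assumptions rather than the canonical OpenClaw schema, so check your config reference and adjust `contextWindow`/`maxTokens` to your build:\n\n```json5\n{\n  // keep hosted fallbacks available alongside the local provider\n  models: { mode: \"merge\" },\n  providers: {\n    // hosted primary (key elided)\n    anthropic: { apiKey: \"…\" },\n    // LM Studio's OpenAI-compatible local server\n    lmstudio: {\n      baseUrl: \"http://127.0.0.1:1234/v1\",\n      api: \"responses\", // Responses API keeps reasoning out of the final text\n      models: [{ id: \"minimax-m2.1\", contextWindow: 131072, maxTokens: 8192 }]\n    }\n  }\n}\n```\n\nListing the local provider first instead gives the local-first variant described above.\n\n* 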
Pick the regional variant there to keep traffic in your chosen jurisdiction while still using `models.mode: \"merge\"` for Anthropic/OpenAI fallbacks.\n* Local-only remains the strongest privacy path; hosted regional routing is the middle ground when you need provider features but still want control over data flow.\n\n## Other OpenAI-compatible local proxies\n\nvLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style `/v1` endpoint. Replace the provider block above with your endpoint and model ID, and keep `models.mode: \"merge\"` so hosted models stay available as fallbacks.\n\n## Troubleshooting\n\n* Can the Gateway reach the proxy? Check with `curl http://127.0.0.1:1234/v1/models`.\n* LM Studio model unloaded? Reload it; a cold start is a common cause of apparent hangs.\n* Context errors? Lower `contextWindow` or raise your server limit.\n* Safety: local models skip provider-side filters, so keep agents narrow and compaction on to limit the prompt-injection blast radius.", "code_samples": [ { "code": "### Local-first with hosted safety net\n\nSwap the primary and fallback order; keep the same providers block and `models.mode: \"merge\"` so you can fall back to Sonnet or Opus when the local box is down.\n\n### Regional hosting / data routing\n\n* Hosted MiniMax/Kimi/GLM variants also exist on OpenRouter with region-pinned endpoints (e.g., 
US-hosted). Pick the regional variant there to keep traffic in your chosen jurisdiction while still using `models.mode: \"merge\"` for Anthropic/OpenAI fallbacks.\n* Local-only remains the strongest privacy path; hosted regional routing is the middle ground when you need provider features but want control over data flow.\n\n## Other OpenAI-compatible local proxies\n\nvLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style `/v1` endpoint. Replace the provider block above with your endpoint and model ID:", "language": "unknown" } ], "headings": [ { "level": "h2", "text": "Recommended: LM Studio + MiniMax M2.1 (Responses API, full-size)", "id": "recommended:-lm-studio-+-minimax-m2.1-(responses-api,-full-size)" }, { "level": "h3", "text": "Hybrid config: hosted primary, local fallback", "id": "hybrid-config:-hosted-primary,-local-fallback" }, { "level": "h3", "text": "Local-first with hosted safety net", "id": "local-first-with-hosted-safety-net" }, { "level": "h3", "text": "Regional hosting / data routing", "id": "regional-hosting-/-data-routing" }, { "level": "h2", "text": "Other OpenAI-compatible local proxies", "id": "other-openai-compatible-local-proxies" }, { "level": "h2", "text": "Troubleshooting", "id": "troubleshooting" } ], "url": "llms-txt#local-models", "links": [] }