forked from Selig/openclaw-skill
6 custom skills (assign-task, dispatch-webhook, daily-briefing, task-capture, qmd-brain, tts-voice) with technical documentation. Compatible with Claude Code, OpenClaw, Codex CLI, and OpenCode.
48 lines
5.2 KiB
JSON
Executable File
{
  "title": "Local models",
  "content": "Local is doable, but OpenClaw expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (\\~\\$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).\n\n## Recommended: LM Studio + MiniMax M2.1 (Responses API, full-size)\n\nBest current local stack. Load MiniMax M2.1 in LM Studio, enable the local server (default `http://127.0.0.1:1234`), and use the Responses API to keep reasoning separate from final text.\n\n* Install LM Studio: [https://lmstudio.ai](https://lmstudio.ai)\n* In LM Studio, download the **largest MiniMax M2.1 build available** (avoid “small”/heavily quantized variants), start the server, and confirm `http://127.0.0.1:1234/v1/models` lists it.\n* Keep the model loaded; cold-load adds startup latency.\n* Adjust `contextWindow`/`maxTokens` if your LM Studio build differs.\n* For WhatsApp, stick to the Responses API so only final text is sent.\n\nKeep hosted models configured even when running local; use `models.mode: \"merge\"` so fallbacks stay available.\n\n### Hybrid config: hosted primary, local fallback\n\n### Local-first with hosted safety net\n\nSwap the primary and fallback order; keep the same providers block and `models.mode: \"merge\"` so you can fall back to Sonnet or Opus when the local box is down.\n\n### Regional hosting / data routing\n\n* Hosted MiniMax/Kimi/GLM variants also exist on OpenRouter with region-pinned endpoints (e.g., US-hosted). Pick the regional variant there to keep traffic in your chosen jurisdiction while still using `models.mode: \"merge\"` for Anthropic/OpenAI fallbacks.\n* Local-only remains the strongest privacy path; hosted regional routing is the middle ground when you need provider features but want control over data flow.\n\n## Other OpenAI-compatible local proxies\n\nvLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style `/v1` endpoint. Replace the provider block above with your endpoint and model ID.\n\nKeep `models.mode: \"merge\"` so hosted models stay available as fallbacks.\n\n## Troubleshooting\n\n* Gateway can't reach the proxy? Test with `curl http://127.0.0.1:1234/v1/models`.\n* LM Studio model unloaded? Reload; cold start is a common “hanging” cause.\n* Context errors? Lower `contextWindow` or raise your server limit.\n* Safety: local models skip provider-side filters; keep agents narrow and compaction on to limit prompt-injection blast radius.",
  "code_samples": [
    {
      "code": "**Setup checklist**\n\n* Install LM Studio: [https://lmstudio.ai](https://lmstudio.ai)\n* In LM Studio, download the **largest MiniMax M2.1 build available** (avoid “small”/heavily quantized variants), start the server, and confirm `http://127.0.0.1:1234/v1/models` lists it.\n* Keep the model loaded; cold-load adds startup latency.\n* Adjust `contextWindow`/`maxTokens` if your LM Studio build differs.\n* For WhatsApp, stick to the Responses API so only final text is sent.\n\nKeep hosted models configured even when running local; use `models.mode: \"merge\"` so fallbacks stay available.\n\n### Hybrid config: hosted primary, local fallback",
      "language": "unknown"
    },
    {
      "code": "### Local-first with hosted safety net\n\nSwap the primary and fallback order; keep the same providers block and `models.mode: \"merge\"` so you can fall back to Sonnet or Opus when the local box is down.\n\n### Regional hosting / data routing\n\n* Hosted MiniMax/Kimi/GLM variants also exist on OpenRouter with region-pinned endpoints (e.g., US-hosted). Pick the regional variant there to keep traffic in your chosen jurisdiction while still using `models.mode: \"merge\"` for Anthropic/OpenAI fallbacks.\n* Local-only remains the strongest privacy path; hosted regional routing is the middle ground when you need provider features but want control over data flow.\n\n## Other OpenAI-compatible local proxies\n\nvLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style `/v1` endpoint. Replace the provider block above with your endpoint and model ID:",
      "language": "unknown"
    }
  ],
  "headings": [
    {
      "level": "h2",
      "text": "Recommended: LM Studio + MiniMax M2.1 (Responses API, full-size)",
      "id": "recommended:-lm-studio-+-minimax-m2.1-(responses-api,-full-size)"
    },
    {
      "level": "h3",
      "text": "Hybrid config: hosted primary, local fallback",
      "id": "hybrid-config:-hosted-primary,-local-fallback"
    },
    {
      "level": "h3",
      "text": "Local-first with hosted safety net",
      "id": "local-first-with-hosted-safety-net"
    },
    {
      "level": "h3",
      "text": "Regional hosting / data routing",
      "id": "regional-hosting-/-data-routing"
    },
    {
      "level": "h2",
      "text": "Other OpenAI-compatible local proxies",
      "id": "other-openai-compatible-local-proxies"
    },
    {
      "level": "h2",
      "text": "Troubleshooting",
      "id": "troubleshooting"
    }
  ],
  "url": "llms-txt#local-models",
  "links": []
}
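
The page's "Hybrid config" and proxy sections refer to a providers block that did not survive extraction (both code samples carry only markdown headings). A minimal sketch of the hybrid setup the text describes, assuming an OpenAI-compatible provider entry and the `models.mode` / `contextWindow` / `maxTokens` keys named in the content; the exact key names and model IDs are assumptions, not OpenClaw's confirmed schema:

```json5
// Hypothetical sketch — field names and model IDs are illustrative only.
{
  "models": {
    "mode": "merge",                          // keep hosted fallbacks available (per the text)
    "primary": "anthropic/claude-sonnet",     // hosted primary (assumed ID)
    "fallbacks": ["lmstudio/minimax-m2.1"]    // local fallback via LM Studio
  },
  "providers": {
    "lmstudio": {
      "baseUrl": "http://127.0.0.1:1234/v1",  // LM Studio's default local server
      "api": "responses",                     // Responses API, so only final text is sent
      "contextWindow": 131072,                // adjust if your LM Studio build differs
      "maxTokens": 8192
    }
  }
}
```

For the local-first variant, swap `primary` and `fallbacks`; for vLLM, LiteLLM, or another proxy, point `baseUrl` at that gateway's `/v1` endpoint and change the model ID.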