Initial commit: OpenClaw Skill Collection

6 custom skills (assign-task, dispatch-webhook, daily-briefing, task-capture, qmd-brain, tts-voice) with technical documentation. Compatible with Claude Code, OpenClaw, Codex CLI, and OpenCode.
2026-03-13 10:58:30 +08:00
commit 4c966a3ad2
884 changed files with 140761 additions and 0 deletions
--- a/openclaw-knowhow-skill/docs/infrastructure/nodes/audio.md
+++ b/openclaw-knowhow-skill/docs/infrastructure/nodes/audio.md
@@ -0,0 +1,34 @@
+# Audio and Voice Notes Documentation
+
+## Overview
+
+OpenClaw supports audio transcription with flexible configuration options. The system either auto-detects available transcription tools or uses an explicitly configured provider or CLI.
+
+## Key Capabilities
+
+When audio understanding is enabled, OpenClaw locates the first audio attachment (local path or URL), downloads it if needed, and then tries the configured models in order until one succeeds.
+
+## Auto-Detection Hierarchy
+
+Without custom configuration, the system attempts transcription in this order:
+- Local CLI tools (sherpa-onnx-offline, whisper-cli, whisper Python CLI)
+- Gemini CLI
+- Provider APIs (OpenAI, Groq, Deepgram, Google)
+
+## Configuration Options
+
+Three configuration patterns are provided:
+
+1. **Provider with CLI fallback** – Uses OpenAI with Whisper CLI as backup
+2. **Provider-only with scope gating** – Restricts to specific chat contexts (e.g., denying group chats)
+3. **Single provider** – Deepgram example for dedicated service use
+
+## Important Constraints
+
+The default size cap is 20 MB (`tools.media.audio.maxBytes`). Audio over the cap is skipped for that model, and the next entry is tried.
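+
+The skip-and-fall-through behaviour can be pictured with a small shell sketch. It is illustrative only: `try_models`, the `name:maxBytes` entry format, and the `transcribe_with` stub are hypothetical names, not OpenClaw's implementation, and 20 MB is assumed here to mean 20 * 1024 * 1024 bytes.
+
+```bash
+# Hypothetical stub standing in for a real transcription call:
+# in this sketch only the entry named "whisper-cli" succeeds.
+transcribe_with() {
+  [ "$1" = "whisper-cli" ]
+}
+
+# try_models <audio-bytes> <name:maxBytes>...
+# Walks the entries in order and prints the first one that both
+# accepts the file size and transcribes successfully.
+try_models() {
+  local size=$1; shift
+  local entry name cap
+  for entry in "$@"; do
+    name=${entry%%:*}
+    cap=${entry##*:}
+    if [ "$size" -gt "$cap" ]; then
+      continue              # oversize for this entry: try the next one
+    fi
+    if transcribe_with "$name"; then
+      echo "$name"
+      return 0
+    fi
+  done
+  return 1                  # no entry could handle the file
+}
+
+# A 15 MB file skips the first entry (10 MB cap) and lands on the second.
+try_models 15000000 "openai:10000000" "whisper-cli:20971520"
+```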
+
+Authentication follows standard model auth patterns. The transcript output is available as `{{Transcript}}` for downstream processing, with optional character trimming via `maxChars`.
+
+## Notable Gotchas
+
+Scope rules use first-match evaluation; CLI commands must exit cleanly and print plain text; and timeouts should be kept short enough that a slow transcription does not block the reply queue.
--- a/openclaw-knowhow-skill/docs/infrastructure/nodes/camera.md
+++ b/openclaw-knowhow-skill/docs/infrastructure/nodes/camera.md
@@ -0,0 +1,29 @@
+# Camera Capture Documentation
+
+## Overview
+
+OpenClaw enables camera functionality across multiple platforms through agent workflows. The feature supports photo capture (JPG) and video clips (MP4 with optional audio) on iOS, Android, and macOS devices.
+
+## Key Features by Platform
+
+**iOS & Android nodes** offer identical capabilities:
+- Photo capture via `camera.snap` command
+- Video recording via `camera.clip` command
+- User-controlled setting (enabled by default)
+- Foreground-only operation
+- Payload guard (base64 payloads are kept under 5 MB)
+
+**macOS app** includes:
+- Same camera commands as mobile platforms
+- Camera disabled by default in settings
+- Additional screen recording capability (separate from camera)
+
+## Important Constraints
+
+Video clips are capped (currently `<= 60s`) to avoid oversized node payloads. Photos are automatically recompressed to maintain payload limits.
+
+Camera and microphone access require standard OS permission prompts. Android requires explicit runtime permissions for `CAMERA` and `RECORD_AUDIO` (when applicable).
+
+## Usage
+
+CLI helpers simplify media capture, automatically writing decoded files to temporary locations and printing `MEDIA:<path>` for agent integration.
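+
+A script that consumes this output only needs the `MEDIA:` lines. The sketch below shows one way to pull the paths out; the `media_paths` helper and the sample file names are hypothetical, and the sample input stands in for real CLI output.
+
+```bash
+# Print the path from every `MEDIA:<path>` line on stdin and
+# ignore any other output the helper may have printed.
+media_paths() {
+  sed -n 's/^MEDIA://p'
+}
+
+# Simulated helper output (hypothetical paths).
+printf 'capturing...\nMEDIA:/tmp/snap-front.jpg\nMEDIA:/tmp/snap-back.jpg\n' | media_paths
+```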
--- a/openclaw-knowhow-skill/docs/infrastructure/nodes/images.md
+++ b/openclaw-knowhow-skill/docs/infrastructure/nodes/images.md
@@ -0,0 +1,29 @@
+# Image and Media Support
+
+## Overview
+
+The WhatsApp channel via Baileys Web supports media handling with specific rules for sending, gateway processing, and agent replies.
+
+## Key Features
+
+**CLI Command Structure**
+Send media with `openclaw message send --media <path-or-url> [--message <caption>]`; the caption text is optional.
+
+**Media Processing Pipeline**
+The system handles various file types differently:
+- Images undergo resizing and recompression to JPEG format with a maximum dimension of 2048 pixels
+- Audio files are converted to voice notes with the `ptt` flag enabled
+- Documents preserve filenames and support larger file sizes
+- MP4 files can enable looped playback on mobile clients using the `gifPlayback` parameter
+
+## Size Constraints
+
+Outbound limits vary by media category:
+- Images are capped at approximately 6 MB following recompression
+- Audio and video files max out at 16 MB
+- Documents can reach up to 100 MB
+- Media understanding operations have separate thresholds (10 MB for images, 20 MB for audio, 50 MB for video)
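+
+As a quick illustration of the outbound limits above, the following sketch checks a file size against its category cap. The function names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption; the image cap is approximate in any case.
+
+```bash
+# outbound_limit_mb <kind>: outbound cap in MB for a media category.
+outbound_limit_mb() {
+  case "$1" in
+    image)       echo 6 ;;     # approximate, after recompression
+    audio|video) echo 16 ;;
+    document)    echo 100 ;;
+    *)           return 1 ;;   # unknown category
+  esac
+}
+
+# fits_outbound <kind> <size-in-bytes>: succeeds if the file is within the cap.
+fits_outbound() {
+  local limit
+  limit=$(outbound_limit_mb "$1") || return 2
+  [ "$2" -le $(( limit * 1024 * 1024 )) ]
+}
+
+fits_outbound audio 12000000 && echo "audio ok"
+fits_outbound image 8000000 || echo "image too large"
+```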
+
+## Inbound Processing
+
+When messages arrive with attachments, the system downloads the media to temporary storage and exposes templating variables for command processing. Transcribing audio lets slash commands work in voice notes, while image and video descriptions preserve the caption text so it can still be parsed.
--- a/openclaw-knowhow-skill/docs/infrastructure/nodes/index.md
+++ b/openclaw-knowhow-skill/docs/infrastructure/nodes/index.md
@@ -0,0 +1,332 @@
+# Nodes
+
+A **node** is a companion device (macOS/iOS/Android/headless) that connects to the Gateway **WebSocket** (same port as operators) with `role: "node"` and exposes a command surface (e.g. `canvas.*`, `camera.*`, `system.*`) via `node.invoke`. Protocol details: [Gateway protocol](/gateway/protocol).
+
+Legacy transport: [Bridge protocol](/gateway/bridge-protocol) (TCP JSONL; deprecated/removed for current nodes).
+
+macOS can also run in **node mode**: the menubar app connects to the Gateway's WS server and exposes its local canvas/camera commands as a node (so `openclaw nodes …` works against this Mac).
+
+Notes:
+
+* Nodes are **peripherals**, not gateways. They don't run the gateway service.
+* Telegram/WhatsApp/etc. messages land on the **gateway**, not on nodes.
+
+## Pairing + status
+
+**WS nodes use device pairing.** Nodes present a device identity during `connect`; the Gateway
+creates a device pairing request for `role: node`. Approve via the devices CLI (or UI).
+
+Quick CLI:
+
+```bash
+openclaw devices list
+openclaw devices approve <requestId>
+openclaw devices reject <requestId>
+openclaw nodes status
+openclaw nodes describe --node <idOrNameOrIp>
+```
+
+Notes:
+
+* `nodes status` marks a node as **paired** when its device pairing role includes `node`.
+* `node.pair.*` (CLI: `openclaw nodes pending/approve/reject`) is a separate gateway-owned
+  node pairing store; it does **not** gate the WS `connect` handshake.
+
+## Remote node host (system.run)
+
+Use a **node host** when your Gateway runs on one machine and you want commands
+to execute on another. The model still talks to the **gateway**; the gateway
+forwards `exec` calls to the **node host** when `host=node` is selected.
+
+### What runs where
+
+* **Gateway host**: receives messages, runs the model, routes tool calls.
+* **Node host**: executes `system.run`/`system.which` on the node machine.
+* **Approvals**: enforced on the node host via `~/.openclaw/exec-approvals.json`.
+
+### Start a node host (foreground)
+
+On the node machine:
+
+```bash
+openclaw node run --host <gateway-host> --port 18789 --display-name "Build Node"
+```
+
+### Remote gateway via SSH tunnel (loopback bind)
+
+If the Gateway binds to loopback (`gateway.bind=loopback`, default in local mode),
+remote node hosts cannot connect directly. Create an SSH tunnel and point the
+node host at the local end of the tunnel.
+
+Example (node host -> gateway host):
+
+```bash
+# Terminal A (keep running): forward local 18790 -> gateway 127.0.0.1:18789
+ssh -N -L 18790:127.0.0.1:18789 user@gateway-host
+
+# Terminal B: export the gateway token and connect through the tunnel
+export OPENCLAW_GATEWAY_TOKEN="<gateway-token>"
+openclaw node run --host 127.0.0.1 --port 18790 --display-name "Build Node"
+```
+
+Notes:
+
+* The token is `gateway.auth.token` from the gateway config (`~/.openclaw/openclaw.json` on the gateway host).
+* `openclaw node run` reads `OPENCLAW_GATEWAY_TOKEN` for auth.
+
+### Start a node host (service)
+
+```bash
+openclaw node install --host <gateway-host> --port 18789 --display-name "Build Node"
+openclaw node restart
+```
+
+### Pair + name
+
+On the gateway host:
+
+```bash
+openclaw nodes pending
+openclaw nodes approve <requestId>
+openclaw nodes list
+```
+
+Naming options:
+
+* `--display-name` on `openclaw node run` / `openclaw node install` (persists in `~/.openclaw/node.json` on the node).
+* `openclaw nodes rename --node <id|name|ip> --name "Build Node"` (gateway override).
+
+### Allowlist the commands
+
+Exec approvals are **per node host**. Add allowlist entries from the gateway:
+
+```bash
+openclaw approvals allowlist add --node <id|name|ip> "/usr/bin/uname"
+openclaw approvals allowlist add --node <id|name|ip> "/usr/bin/sw_vers"
+```
+
+Approvals live on the node host at `~/.openclaw/exec-approvals.json`.
+
+### Point exec at the node
+
+Configure defaults (gateway config):
+
+```bash
+openclaw config set tools.exec.host node
+openclaw config set tools.exec.security allowlist
+openclaw config set tools.exec.node "<id-or-name>"
+```
+
+Or per session:
+
+```
+/exec host=node security=allowlist node=<id-or-name>
+```
+
+Once set, any `exec` call with `host=node` runs on the node host (subject to the
+node allowlist/approvals).
+
+Related:
+
+* [Node host CLI](/cli/node)
+* [Exec tool](/tools/exec)
+* [Exec approvals](/tools/exec-approvals)
+
+## Invoking commands
+
+Low-level (raw RPC):
+
+```bash
+openclaw nodes invoke --node <idOrNameOrIp> --command canvas.eval --params '{"javaScript":"location.href"}'
+```
+
+Higher-level helpers exist for the common "give the agent a MEDIA attachment" workflows.
+
+## Screenshots (canvas snapshots)
+
+If the node is showing the Canvas (WebView), `canvas.snapshot` returns `{ format, base64 }`.
+
+CLI helper (writes to a temp file and prints `MEDIA:<path>`):
+
+```bash
+openclaw nodes canvas snapshot --node <idOrNameOrIp> --format png
+openclaw nodes canvas snapshot --node <idOrNameOrIp> --format jpg --max-width 1200 --quality 0.9
+```
+
+### Canvas controls
+
+```bash
+openclaw nodes canvas present --node <idOrNameOrIp> --target https://example.com
+openclaw nodes canvas hide --node <idOrNameOrIp>
+openclaw nodes canvas navigate https://example.com --node <idOrNameOrIp>
+openclaw nodes canvas eval --node <idOrNameOrIp> --js "document.title"
+```
+
+Notes:
+
+* `canvas present` accepts URLs or local file paths (`--target`), plus optional `--x/--y/--width/--height` for positioning.
+* `canvas eval` accepts inline JS (`--js`) or a positional arg.
+
+### A2UI (Canvas)
+
+```bash
+openclaw nodes canvas a2ui push --node <idOrNameOrIp> --text "Hello"
+openclaw nodes canvas a2ui push --node <idOrNameOrIp> --jsonl ./payload.jsonl
+openclaw nodes canvas a2ui reset --node <idOrNameOrIp>
+```
+
+Notes:
+
+* Only A2UI v0.8 JSONL is supported (v0.9/createSurface is rejected).
+
+## Photos + videos (node camera)
+
+Photos (`jpg`):
+
+```bash
+openclaw nodes camera list --node <idOrNameOrIp>
+openclaw nodes camera snap --node <idOrNameOrIp>            # default: both facings (2 MEDIA lines)
+openclaw nodes camera snap --node <idOrNameOrIp> --facing front
+```
+
+Video clips (`mp4`):
+
+```bash
+openclaw nodes camera clip --node <idOrNameOrIp> --duration 10s
+openclaw nodes camera clip --node <idOrNameOrIp> --duration 3000 --no-audio
+```
+
+Notes:
+
+* The node must be **foregrounded** for `canvas.*` and `camera.*` (background calls return `NODE_BACKGROUND_UNAVAILABLE`).
+* Clip duration is clamped (currently `<= 60s`) to avoid oversized base64 payloads.
+* Android will prompt for `CAMERA`/`RECORD_AUDIO` permissions when possible; denied permissions fail with `*_PERMISSION_REQUIRED`.
+
+## Screen recordings (nodes)
+
+Nodes expose `screen.record` (mp4). Example:
+
+```bash
+openclaw nodes screen record --node <idOrNameOrIp> --duration 10s --fps 10
+openclaw nodes screen record --node <idOrNameOrIp> --duration 10s --fps 10 --no-audio
+```
+
+Notes:
+
+* `screen.record` requires the node app to be foregrounded.
+* Android will show the system screen-capture prompt before recording.
+* Screen recordings are clamped to `<= 60s`.
+* `--no-audio` disables microphone capture (supported on iOS/Android; macOS uses system capture audio).
+* Use `--screen <index>` to select a display when multiple screens are available.
+
+## Location (nodes)
+
+Nodes expose `location.get` when Location is enabled in settings.
+
+CLI helper:
+
+```bash
+openclaw nodes location get --node <idOrNameOrIp>
+openclaw nodes location get --node <idOrNameOrIp> --accuracy precise --max-age 15000 --location-timeout 10000
+```
+
+Notes:
+
+* Location is **off by default**.
+* "Always" requires system permission; background fetch is best-effort.
+* The response includes lat/lon, accuracy (meters), and timestamp.
+
+## SMS (Android nodes)
+
+Android nodes can expose `sms.send` when the user grants **SMS** permission and the device supports telephony.
+
+Low-level invoke:
+
+```bash
+openclaw nodes invoke --node <idOrNameOrIp> --command sms.send --params '{"to":"+15555550123","message":"Hello from OpenClaw"}'
+```
+
+Notes:
+
+* The permission prompt must be accepted on the Android device before the capability is advertised.
+* Wi-Fi-only devices without telephony will not advertise `sms.send`.
+
+## System commands (node host / mac node)
+
+The macOS node exposes `system.run`, `system.notify`, and `system.execApprovals.get/set`.
+The headless node host exposes `system.run`, `system.which`, and `system.execApprovals.get/set`.
+
+Examples:
+
+```bash
+openclaw nodes run --node <idOrNameOrIp> -- echo "Hello from mac node"
+openclaw nodes notify --node <idOrNameOrIp> --title "Ping" --body "Gateway ready"
+```
+
+Notes:
+
+* `system.run` returns stdout/stderr/exit code in the payload.
+* `system.notify` respects notification permission state on the macOS app.
+* `system.run` supports `--cwd`, `--env KEY=VAL`, `--command-timeout`, and `--needs-screen-recording`.
+* `system.notify` supports `--priority <passive|active|timeSensitive>` and `--delivery <system|overlay|auto>`.
+* macOS nodes drop `PATH` overrides; headless node hosts only accept `PATH` when it prepends the node host PATH.
+* On macOS node mode, `system.run` is gated by exec approvals in the macOS app (Settings → Exec approvals).
+  Ask/allowlist/full behave the same as the headless node host; denied prompts return `SYSTEM_RUN_DENIED`.
+* On headless node host, `system.run` is gated by exec approvals (`~/.openclaw/exec-approvals.json`).
+
+## Exec node binding
+
+When multiple nodes are available, you can bind exec to a specific node.
+This sets the default node for `exec host=node` (and can be overridden per agent).
+
+Global default:
+
+```bash
+openclaw config set tools.exec.node "node-id-or-name"
+```
+
+Per-agent override:
+
+```bash
+openclaw config get agents.list
+openclaw config set agents.list[0].tools.exec.node "node-id-or-name"
+```
+
+Unset to allow any node:
+
+```bash
+openclaw config unset tools.exec.node
+openclaw config unset agents.list[0].tools.exec.node
+```
+
+## Permissions map
+
+Nodes may include a `permissions` map in `node.list` / `node.describe`, keyed by permission name (e.g. `screenRecording`, `accessibility`) with boolean values (`true` = granted).
+
+## Headless node host (cross-platform)
+
+OpenClaw can run a **headless node host** (no UI) that connects to the Gateway
+WebSocket and exposes `system.run` / `system.which`. This is useful on Linux/Windows
+or for running a minimal node alongside a server.
+
+Start it:
+
+```bash
+openclaw node run --host <gateway-host> --port 18789
+```
+
+Notes:
+
+* Pairing is still required (the Gateway will show a node approval prompt).
+* The node host stores its node id, token, display name, and gateway connection info in `~/.openclaw/node.json`.
+* Exec approvals are enforced locally via `~/.openclaw/exec-approvals.json`
+  (see [Exec approvals](/tools/exec-approvals)).
+* On macOS, the headless node host prefers the companion app exec host when reachable and falls
+  back to local execution if the app is unavailable. Set `OPENCLAW_NODE_EXEC_HOST=app` to require
+  the app, or `OPENCLAW_NODE_EXEC_FALLBACK=0` to disable fallback.
+* Add `--tls` / `--tls-fingerprint` when the Gateway WS uses TLS.
+
+## Mac node mode
+
+* The macOS menubar app connects to the Gateway WS server as a node (so `openclaw nodes …` works against this Mac).
+* In remote mode, the app opens an SSH tunnel for the Gateway port and connects to `localhost`.
--- a/openclaw-knowhow-skill/docs/infrastructure/nodes/location-command.md
+++ b/openclaw-knowhow-skill/docs/infrastructure/nodes/location-command.md
@@ -0,0 +1,28 @@
+# Location Command Documentation
+
+## Core Functionality
+
+The `location.get` node command retrieves device location data. It operates through a three-tier permission model rather than a simple on/off switch, reflecting how modern operating systems handle location access.
+
+## Permission Levels
+
+The system uses three modes: disabled, foreground-only ("While Using"), and background-enabled ("Always"). Because OS permissions are themselves multi-level, the app can expose a selector, but the OS still decides the actual grant.
+
+## Command Parameters & Response
+
+When invoked, the command accepts timeout, cache age, and accuracy preferences. The response includes latitude, longitude, accuracy in meters, altitude, speed, heading, timestamp, precision status, and location source (GPS, WiFi, cellular, or unknown).
+
+## Error Handling
+
+The implementation provides five stable error codes: `LOCATION_DISABLED`, `LOCATION_PERMISSION_REQUIRED`, `LOCATION_BACKGROUND_UNAVAILABLE`, `LOCATION_TIMEOUT`, and `LOCATION_UNAVAILABLE`.
+
+## Implementation Details
+
+- Precise location is a separate toggle from the enablement mode
+- iOS/macOS users configure through system settings
+- Android distinguishes between standard and background location permissions
+- Future background support requires push-triggered workflows
+
+## Integration
+
+The feature integrates via the `nodes` tool (`location_get` action) and CLI command (`openclaw nodes location get`), with recommended UX copy provided for each permission level.
--- a/openclaw-knowhow-skill/docs/infrastructure/nodes/talk.md
+++ b/openclaw-knowhow-skill/docs/infrastructure/nodes/talk.md
@@ -0,0 +1,31 @@
+# Talk Mode Documentation
+
+## Overview
+
+Talk Mode enables continuous voice conversations through a cycle of listening, transcription, model processing, and text-to-speech playback.
+
+## Core Functionality
+
+The system operates in three phases: Listening, Thinking, and Speaking. When a brief silence is detected, the transcript is sent to the model via the main session; the response is displayed in WebChat and spoken aloud using ElevenLabs.
+
+## Voice Control
+
+Responses can include a JSON directive as the first line to customize voice settings:
+
+```json
+{ "voice": "<voice-id>", "once": true }
+```
+
+Supported parameters include voice selection, model specification, speed, stability, and various ElevenLabs-specific options. The `once` flag limits changes to the current reply only.
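+
+To make the "first line" convention concrete, here is a minimal sketch of separating the directive from the spoken text. `split_reply`, `DIRECTIVE`, and `SPOKEN` are hypothetical names; the check is a simple brace match, not JSON validation, and the actual parsing in OpenClaw may differ.
+
+```bash
+# split_reply <reply>: sets DIRECTIVE (empty if absent) and SPOKEN.
+split_reply() {
+  first=$(printf '%s\n' "$1" | head -n 1)
+  case $first in
+    \{*\})
+      # First line looks like a JSON object: treat it as the directive.
+      DIRECTIVE=$first
+      SPOKEN=$(printf '%s\n' "$1" | tail -n +2)
+      ;;
+    *)
+      DIRECTIVE=""
+      SPOKEN=$1
+      ;;
+  esac
+}
+
+reply=$(printf '%s\n%s' '{ "voice": "<voice-id>", "once": true }' 'Hello there.')
+split_reply "$reply"
+echo "directive: $DIRECTIVE"
+echo "spoken: $SPOKEN"
+```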
+
+## Configuration
+
+Settings are managed in `~/.openclaw/openclaw.json` with options for voice ID, model selection, output format, and API credentials. The system defaults to the `eleven_v3` model with `interruptOnSpeech` enabled.
+
+## Platform-Specific Behavior
+
+**macOS** displays an always-on overlay with visual indicators for each phase and allows interruption when users speak during assistant responses. The UI includes a menu bar toggle, a configuration tab, and cloud icon controls.
+
+## Technical Requirements
+
+The feature requires Speech and Microphone permissions and supports various PCM and MP3 output formats across macOS, iOS, and Android platforms for optimized latency.
--- a/openclaw-knowhow-skill/docs/infrastructure/nodes/voicewake.md
+++ b/openclaw-knowhow-skill/docs/infrastructure/nodes/voicewake.md
@@ -0,0 +1,28 @@
+# Voice Wake Documentation
+
+## Overview
+OpenClaw implements a centralized approach to voice wake words. The system treats wake words as a single global list managed by the Gateway rather than allowing per-node customization.
+
+## Key Architecture Details
+
+**Storage Location:** Wake word configurations are maintained on the gateway host at `~/.openclaw/settings/voicewake.json`
+
+**Data Structure:** The JSON file stores the active trigger words together with a timestamp of when they were last modified.
+
+## API Protocol
+
+The implementation provides two primary methods:
+- Retrieval: `voicewake.get` returns the current trigger list
+- Updates: `voicewake.set` modifies triggers with validation and broadcasts changes
+
+A `voicewake.changed` event notifies all connected clients (WebSocket connections, iOS/Android nodes) whenever modifications occur.
+
+## Client Implementations
+
+**macOS:** The native app integrates the global trigger list with voice recognition and allows settings-based editing.
+
+**iOS/Android:** Both mobile platforms expose wake word editors in their settings interfaces. Changes propagate through the Gateway's WebSocket connection to maintain consistency across all devices.
+
+## Operational Constraints
+
+The system normalizes input by trimming whitespace and removing empty values. Empty lists default to system presets, and safety limits enforce caps on trigger count and length.
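+
+The normalization rules can be sketched as a small filter. Everything specific here is an assumption made for illustration: the cap values, the default trigger, and the choice to truncate rather than reject over-limit input are not taken from OpenClaw.
+
+```bash
+MAX_TRIGGERS=5                # hypothetical cap on trigger count
+MAX_LENGTH=32                 # hypothetical cap on trigger length
+DEFAULT_TRIGGERS="openclaw"   # hypothetical system preset
+
+# normalize_triggers: reads one trigger per line on stdin, trims
+# whitespace, drops empty entries, applies the caps, and falls back
+# to the preset when nothing is left.
+normalize_triggers() {
+  out=$(sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' \
+        | cut -c "1-$MAX_LENGTH" | head -n "$MAX_TRIGGERS")
+  if [ -z "$out" ]; then
+    printf '%s\n' "$DEFAULT_TRIGGERS"
+  else
+    printf '%s\n' "$out"
+  fi
+}
+
+printf '  hey claw  \n\n   \ncomputer\n' | normalize_triggers
+```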