Initial commit: OpenClaw Skill Collection

6 custom skills (assign-task, dispatch-webhook, daily-briefing, task-capture, qmd-brain, tts-voice) with technical documentation. Compatible with Claude Code, OpenClaw, Codex CLI, and OpenCode.
2026-03-13 10:58:30 +08:00
commit 4c966a3ad2
884 changed files with 140761 additions and 0 deletions
--- a/openclaw-knowhow-skill/docs/infrastructure/nodes/audio.md
+++ b/openclaw-knowhow-skill/docs/infrastructure/nodes/audio.md
@@ -0,0 +1,34 @@
+# Audio and Voice Notes Documentation
+
+## Overview
+
+OpenClaw supports audio transcription with flexible configuration options. The system automatically detects available transcription tools or allows explicit provider/CLI setup.
+
+## Key Capabilities
+
+When audio understanding is enabled, OpenClaw locates the first audio attachment (local path or URL) and downloads it if needed before processing through configured models in sequence until one succeeds.
+
+## Auto-Detection Hierarchy
+
+Without custom configuration, the system attempts transcription in this order:
+- Local CLI tools (sherpa-onnx-offline, whisper-cli, whisper Python CLI)
+- Gemini CLI
+- Provider APIs (OpenAI, Groq, Deepgram, Google)
+
+## Configuration Options
+
+Three configuration patterns are provided:
+
+1. **Provider with CLI fallback** – Uses OpenAI with Whisper CLI as backup
+2. **Provider-only with scope gating** – Restricts to specific chat contexts (e.g., denying group chats)
+3. **Single provider** – Deepgram example for dedicated service use
+
+## Important Constraints
+
+Default size cap is 20MB (`tools.media.audio.maxBytes`). Oversize audio is skipped for that model and the next entry is tried.
+
+Authentication follows standard model auth patterns. The transcript output is available as `{{Transcript}}` for downstream processing, with optional character trimming via `maxChars`.
+
+## Notable Gotchas
+
+Scope rules use first-match evaluation, CLI commands must exit cleanly with plain text output, and timeouts should be reasonable to prevent blocking the reply queue.