forked from Selig/openclaw-skill
6 custom skills (assign-task, dispatch-webhook, daily-briefing, task-capture, qmd-brain, tts-voice) with technical documentation. Compatible with Claude Code, OpenClaw, Codex CLI, and OpenCode.
39 lines
3.6 KiB
JSON
Executable File
{
"title": "Talk Mode",
"content": "Talk mode is a continuous voice conversation loop:\n\n1. Listen for speech\n2. Send transcript to the model (main session, chat.send)\n3. Wait for the response\n4. Speak it via ElevenLabs (streaming playback)\n\n* **Always-on overlay** while Talk mode is enabled.\n* **Listening → Thinking → Speaking** phase transitions.\n* On a **short pause** (silence window), the current transcript is sent.\n* Replies are **written to WebChat** (same as typing).\n* **Interrupt on speech** (default on): if the user starts talking while the assistant is speaking, we stop playback and note the interruption timestamp for the next prompt.\n\n## Voice directives in replies\n\nThe assistant may prefix its reply with a **single JSON line** to control voice:\n\n* First non-empty line only.\n* Unknown keys are ignored.\n* `once: true` applies to the current reply only.\n* Without `once`, the voice becomes the new default for Talk mode.\n* The JSON line is stripped before TTS playback.\n\n* `voice` / `voice_id` / `voiceId`\n* `model` / `model_id` / `modelId`\n* `speed`, `rate` (WPM), `stability`, `similarity`, `style`, `speakerBoost`\n* `seed`, `normalize`, `lang`, `output_format`, `latency_tier`\n* `once`\n\n## Config (`~/.openclaw/openclaw.json`)\n\n* `interruptOnSpeech`: true\n* `voiceId`: falls back to `ELEVENLABS_VOICE_ID` / `SAG_VOICE_ID` (or first ElevenLabs voice when API key is available)\n* `modelId`: defaults to `eleven_v3` when unset\n* `apiKey`: falls back to `ELEVENLABS_API_KEY` (or gateway shell profile if available)\n* `outputFormat`: defaults to `pcm_44100` on macOS/iOS and `pcm_24000` on Android (set `mp3_*` to force MP3 streaming)\n\n* Menu bar toggle: **Talk**\n* Config tab: **Talk Mode** group (voice id + interrupt toggle)\n* Overlay:\n * **Listening**: cloud pulses with mic level\n * **Thinking**: sinking animation\n * **Speaking**: radiating rings\n * Click cloud: stop speaking\n * Click X: exit Talk mode\n\n* Requires Speech + Microphone permissions.\n* 
Uses `chat.send` against session key `main`.\n* TTS uses ElevenLabs streaming API with `ELEVENLABS_API_KEY` and incremental playback on macOS/iOS/Android for lower latency.\n* `stability` for `eleven_v3` is validated to `0.0`, `0.5`, or `1.0`; other models accept `0..1`.\n* `latency_tier` is validated to `0..4` when set.\n* Android supports `pcm_16000`, `pcm_22050`, `pcm_24000`, and `pcm_44100` output formats for low-latency AudioTrack streaming.",
|
|
"code_samples": [
{
"code": "Rules:\n\n* First non-empty line only.\n* Unknown keys are ignored.\n* `once: true` applies to the current reply only.\n* Without `once`, the voice becomes the new default for Talk mode.\n* The JSON line is stripped before TTS playback.\n\nSupported keys:\n\n* `voice` / `voice_id` / `voiceId`\n* `model` / `model_id` / `modelId`\n* `speed`, `rate` (WPM), `stability`, `similarity`, `style`, `speakerBoost`\n* `seed`, `normalize`, `lang`, `output_format`, `latency_tier`\n* `once`\n\n## Config (`~/.openclaw/openclaw.json`)",
"language": "markdown"
}
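,
{
"code": "Illustrative directive (hypothetical voice name and values; keys from the supported list above):\n\n{\"voice\": \"Rachel\", \"speed\": 1.1, \"once\": true}\nHere is the spoken reply text.\n\nThe JSON line is stripped before TTS playback, so only the text after it is spoken; `once: true` keeps the Talk mode default voice unchanged.",
"language": "json"
}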
],
"headings": [
{
"level": "h2",
"text": "Behavior (macOS)",
"id": "behavior-(macos)"
},
{
"level": "h2",
"text": "Voice directives in replies",
"id": "voice-directives-in-replies"
},
{
"level": "h2",
"text": "Config (`~/.openclaw/openclaw.json`)",
"id": "config-(`~/.openclaw/openclaw.json`)"
},
{
"level": "h2",
"text": "macOS UI",
"id": "macos-ui"
},
{
"level": "h2",
"text": "Notes",
"id": "notes"
}
],
"url": "llms-txt#talk-mode",
"links": []
}