{ "title": "Talk Mode", "content": "Talk mode is a continuous voice conversation loop:\n\n1. Listen for speech\n2. Send transcript to the model (main session, `chat.send`)\n3. Wait for the response\n4. Speak it via ElevenLabs (streaming playback)\n\n## Behavior (macOS)\n\n* **Always-on overlay** while Talk mode is enabled.\n* **Listening → Thinking → Speaking** phase transitions.\n* On a **short pause** (silence window), the current transcript is sent.\n* Replies are **written to WebChat** (same as typing).\n* **Interrupt on speech** (default on): if the user starts talking while the assistant is speaking, we stop playback and note the interruption timestamp for the next prompt.\n\n## Voice directives in replies\n\nThe assistant may prefix its reply with a **single JSON line** to control voice.\n\nRules:\n\n* First non-empty line only.\n* Unknown keys are ignored.\n* `once: true` applies to the current reply only.\n* Without `once`, the voice becomes the new default for Talk mode.\n* The JSON line is stripped before TTS playback.\n\nSupported keys:\n\n* `voice` / `voice_id` / `voiceId`\n* `model` / `model_id` / `modelId`\n* `speed`, `rate` (WPM), `stability`, `similarity`, `style`, `speakerBoost`\n* `seed`, `normalize`, `lang`, `output_format`, `latency_tier`\n* `once`\n\n## Config (`~/.openclaw/openclaw.json`)\n\n* `interruptOnSpeech`: true\n* `voiceId`: falls back to `ELEVENLABS_VOICE_ID` / `SAG_VOICE_ID` (or first ElevenLabs voice when API key is available)\n* `modelId`: defaults to `eleven_v3` when unset\n* `apiKey`: falls back to `ELEVENLABS_API_KEY` (or gateway shell profile if available)\n* `outputFormat`: defaults to `pcm_44100` on macOS/iOS and `pcm_24000` on Android (set `mp3_*` to force MP3 streaming)\n\n## macOS UI\n\n* Menu bar toggle: **Talk**\n* Config tab: **Talk Mode** group (voice id + interrupt toggle)\n* Overlay:\n  * **Listening**: cloud pulses with mic level\n  * **Thinking**: sinking animation\n  * **Speaking**: radiating rings\n  * Click cloud: stop speaking\n  * Click X: exit Talk mode\n\n## Notes\n\n* Requires Speech + Microphone permissions.\n* Uses `chat.send` against session key `main`.\n* TTS uses ElevenLabs streaming API with `ELEVENLABS_API_KEY` and incremental playback on macOS/iOS/Android for lower latency.\n* `stability` for `eleven_v3` is validated to `0.0`, `0.5`, or `1.0`; other models accept `0..1`.\n* `latency_tier` is validated to `0..4` when set.\n* Android supports `pcm_16000`, `pcm_22050`, `pcm_24000`, and `pcm_44100` output formats for low-latency AudioTrack streaming.", "code_samples": [ { "code": "Rules:\n\n* First non-empty line only.\n* Unknown keys are ignored.\n* `once: true` applies to the current reply only.\n* Without `once`, the voice becomes the new default for Talk mode.\n* The JSON line is stripped before TTS playback.\n\nSupported keys:\n\n* `voice` / `voice_id` / `voiceId`\n* `model` / `model_id` / `modelId`\n* `speed`, `rate` (WPM), `stability`, `similarity`, `style`, `speakerBoost`\n* `seed`, `normalize`, `lang`, `output_format`, `latency_tier`\n* `once`\n\n## Config (`~/.openclaw/openclaw.json`)", "language": "unknown" } ], "headings": [ { "level": "h2", "text": "Behavior (macOS)", "id": "behavior-(macos)" }, { "level": "h2", "text": "Voice directives in replies", "id": "voice-directives-in-replies" }, { "level": "h2", "text": "Config (`~/.openclaw/openclaw.json`)", "id": "config-(`~/.openclaw/openclaw.json`)" }, { "level": "h2", "text": "macOS UI", "id": "macos-ui" }, { "level": "h2", "text": "Notes", "id": "notes" } ], "url": "llms-txt#talk-mode", "links": [] }