Selig 4c966a3ad2 Initial commit: OpenClaw Skill Collection
6 custom skills (assign-task, dispatch-webhook, daily-briefing,
task-capture, qmd-brain, tts-voice) with technical documentation.
Compatible with Claude Code, OpenClaw, Codex CLI, and OpenCode.
2026-03-13 10:58:30 +08:00


# Talk Mode Documentation
## Overview
Talk Mode enables continuous voice conversations through a cycle of listening, transcription, model processing, and text-to-speech playback.
## Core Functionality
The system operates in three phases: Listening, Thinking, and Speaking. When it detects a brief silence, it sends the transcript to the model via the main session. The response is both displayed in WebChat and spoken aloud using ElevenLabs.
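The phase cycle can be pictured as a small state machine. The following is a minimal sketch, not the actual implementation: the event names (`silence_detected`, `reply_ready`, `playback_finished`, `user_spoke`) are hypothetical labels chosen for illustration, and the `user_spoke` transition assumes interruption is enabled.

```python
from enum import Enum


class Phase(Enum):
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"


# Hypothetical event names; the source only names the three phases.
TRANSITIONS = {
    (Phase.LISTENING, "silence_detected"): Phase.THINKING,
    (Phase.THINKING, "reply_ready"): Phase.SPEAKING,
    (Phase.SPEAKING, "playback_finished"): Phase.LISTENING,
    # Assumed behavior when interruption is enabled: speech cuts playback short.
    (Phase.SPEAKING, "user_spoke"): Phase.LISTENING,
}


def next_phase(current: Phase, event: str) -> Phase:
    """Return the phase that follows `event`, or stay put if the event does not apply."""
    return TRANSITIONS.get((current, event), current)
```

Events that do not apply to the current phase (for example `reply_ready` while Listening) leave the phase unchanged, which keeps the loop tolerant of late or duplicate signals.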
## Voice Control
Responses can include a JSON directive as the first line to customize voice settings:
```json
{ "voice": "<voice-id>", "once": true }
```
Supported parameters include voice selection, model specification, speed, stability, and various ElevenLabs-specific options. The `once` flag limits changes to the current reply only.
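To illustrate how a client might handle such a directive, here is a minimal sketch that splits the first line from the reply and applies the `once` semantics. The function names `split_directive` and `apply_directive` are hypothetical, and the sketch assumes that a directive without `once` persists for later replies, which the text implies but does not state outright.

```python
import json


def split_directive(reply: str):
    """Return (directive, spoken_text). The directive is {} when the first line is not a JSON object."""
    first, _, rest = reply.partition("\n")
    candidate = first.strip()
    if candidate.startswith("{") and candidate.endswith("}"):
        try:
            directive = json.loads(candidate)
        except json.JSONDecodeError:
            return {}, reply
        if isinstance(directive, dict):
            return directive, rest
    return {}, reply


def apply_directive(session_voice: dict, directive: dict):
    """Return (settings for this reply, settings to keep for later replies)."""
    changes = {k: v for k, v in directive.items() if k != "once"}
    this_reply = {**session_voice, **changes}
    # Assumption: with "once" the session settings stay untouched; without it they are updated.
    persisted = session_voice if directive.get("once") else this_reply
    return this_reply, persisted
```

For example, `split_directive('{ "voice": "abc", "once": true }\nHello there.')` yields the parsed directive and the text `Hello there.`, so the directive line itself is never spoken.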
## Configuration
Settings are managed in `~/.openclaw/openclaw.json`, with options for voice ID, model selection, output format, and API credentials. By default, the system uses the `eleven_v3` model and enables `interruptOnSpeech`.
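As a rough sketch of how these defaults could be merged with user settings, the snippet below reads the config file and overlays it on the two documented defaults. Everything about the file's shape is an assumption: the `talk` section name and the `modelId` and `voiceId` keys are hypothetical, and the sketch treats the file as plain JSON. Only the file path, the `eleven_v3` default, and the `interruptOnSpeech` default come from the text above.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".openclaw" / "openclaw.json"

# "modelId" is a hypothetical key name; the default values are the documented ones.
DEFAULTS = {"modelId": "eleven_v3", "interruptOnSpeech": True}


def load_talk_settings(path=CONFIG_PATH) -> dict:
    """Merge the (assumed) "talk" section of the config file over the defaults."""
    try:
        raw = json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        raw = {}
    talk = raw.get("talk", {}) if isinstance(raw, dict) else {}
    return {**DEFAULTS, **talk}
```

A missing file simply yields the defaults, so Talk Mode settings stay well defined before any configuration has been written.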
## Platform-Specific Behavior
**macOS** displays an always-on overlay with a visual indicator for each phase and lets users interrupt the assistant by speaking while it responds. The UI includes a menu bar toggle, a configuration tab, and cloud icon controls.
## Technical Requirements
Talk Mode requires Speech and Microphone permissions. It supports several PCM and MP3 output formats on macOS, iOS, and Android, so the format can be chosen to reduce latency on each platform.