# Voice Overlay

## Overview

This document describes the lifecycle of the macOS voice overlay, which is designed to coordinate wake-word detection and push-to-talk.

## Key Design Principle

The system keeps behavior predictable when wake-word and push-to-talk overlap. If the overlay is already visible because of a wake-word trigger and the user presses the hotkey, the hotkey session *adopts* the existing text instead of resetting it.

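
The adoption rule can be illustrated with a small pure function. This is a minimal sketch, not the actual implementation: `OverlaySource`, `OverlayState`, and `stateAfterHotkeyPress` are hypothetical names introduced only for this example.

```swift
// Hypothetical types for illustration; none of these names come from the real code.
enum OverlaySource { case wakeWord, pushToTalk }

struct OverlayState {
    var isVisible: Bool
    var source: OverlaySource?
    var text: String
}

/// Returns the overlay state after the push-to-talk hotkey is pressed.
func stateAfterHotkeyPress(_ current: OverlayState) -> OverlayState {
    if current.isVisible, current.source == .wakeWord {
        // Adopt: keep the text the wake-word session has already captured.
        return OverlayState(isVisible: true, source: .pushToTalk, text: current.text)
    }
    // Nothing to adopt: the push-to-talk session starts with empty text.
    return OverlayState(isVisible: true, source: .pushToTalk, text: "")
}
```

With a visible wake-word overlay showing "hello", pressing the hotkey yields a push-to-talk state that still shows "hello"; with a hidden overlay the text starts empty.
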
## Core Implementation (as of Dec 9, 2025)

The architecture uses three main components:

### 1. VoiceSessionCoordinator

Owns the single active session and exposes a token-based API:

- `beginWakeCapture`
- `beginPushToTalk`
- `endCapture`

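
One way to read "token-based" is that every `begin…` call hands out a fresh token and `endCapture` only succeeds for the token that currently owns the session. The sketch below works under that assumption. Only the three method names come from this document; the class name `SketchCoordinator`, the signatures, the `UUID` token type, and the return values are hypothetical.

```swift
import Foundation

// Illustrative sketch: signatures and the UUID token type are assumptions.
final class SketchCoordinator {
    /// Token of the session that currently owns the overlay, if any.
    private(set) var activeToken: UUID?

    /// Starts a wake-word capture and returns the token that owns it.
    func beginWakeCapture() -> UUID { start() }

    /// Starts push-to-talk; any earlier token becomes stale.
    func beginPushToTalk() -> UUID { start() }

    /// Ends the capture only if `token` still owns the session.
    @discardableResult
    func endCapture(token: UUID) -> Bool {
        guard token == activeToken else { return false } // stale caller, ignored
        activeToken = nil
        return true
    }

    private func start() -> UUID {
        let token = UUID()
        activeToken = token
        return token
    }
}
```

Under this reading, a late wake-word callback that still holds the old token cannot end a push-to-talk session that started after it.
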
### 2. VoiceSession

Model carrying session metadata, including:

- Token
- Source (wakeWord | pushToTalk)
- Committed/volatile text
- Chime flags
- Timers (auto-send, idle)

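
The field list above could map onto a value type roughly like the following. This is a sketch under assumptions: the type name `VoiceSessionSketch`, the property names, the two specific chime flags, and the choice to represent timers as deadlines are all hypothetical; only the field categories come from this document.

```swift
import Foundation

// Hypothetical model; property names and types are assumptions.
struct VoiceSessionSketch {
    enum Source { case wakeWord, pushToTalk }

    let token: UUID
    var source: Source
    var committedText = ""          // finalized transcript segments
    var volatileText = ""           // in-progress hypothesis that may still change
    var playedStartChime = false    // chime flags (assumed pair)
    var playedSendChime = false
    var autoSendDeadline: Date?     // timers modeled as deadlines (assumption)
    var idleDeadline: Date?

    /// Text an overlay would show: committed text followed by the volatile tail.
    var displayText: String {
        [committedText, volatileText].filter { !$0.isEmpty }.joined(separator: " ")
    }
}
```

Keeping committed and volatile text separate lets the volatile tail be replaced on every partial result without touching text that is already final.
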
### 3. VoiceSessionPublisher

Mirrors the active session into SwiftUI so that views observe session state without mutating a singleton directly.

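
The mirroring idea can be sketched without any UI framework: the session owner pushes a read-only snapshot, and observers are notified when it changes. Everything here is hypothetical (`OverlaySnapshot`, `SketchSessionPublisher`, `mirror(activeText:)`, `onChange`). In a SwiftUI app the snapshot would more likely be exposed through an `ObservableObject` with a `@Published` property; the plain callback below is a stand-in for that.

```swift
// Hypothetical snapshot of what the overlay needs to render.
struct OverlaySnapshot: Equatable {
    var text = ""
    var isVisible = false
}

final class SketchSessionPublisher {
    /// Observers can read the snapshot but cannot write it.
    private(set) var snapshot = OverlaySnapshot() {
        didSet { if snapshot != oldValue { onChange?(snapshot) } }
    }

    /// Stand-in for SwiftUI observation; called only when the snapshot changes.
    var onChange: ((OverlaySnapshot) -> Void)?

    /// Called by the session owner. `nil` means there is no active session.
    func mirror(activeText: String?) {
        snapshot = OverlaySnapshot(text: activeText ?? "", isVisible: activeText != nil)
    }
}
```

Because `snapshot` is `private(set)`, the only write path is `mirror(activeText:)`, which keeps state changes flowing in one direction from the session owner to the views.
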
## Behavior Details

- **Wake-word alone**: Auto-sends on silence
- **Push-to-talk**: Sends on release, waiting up to 1.5 s for a final transcript before falling back to the current text

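
The push-to-talk fallback can be expressed as a small decision function. This is a sketch, not the real code: `finalTranscriptGrace`, `textToSend`, and the tuple parameter are hypothetical, and the real implementation presumably waits asynchronously rather than receiving the arrival delay as an argument. Only the 1.5 s limit comes from this document.

```swift
import Foundation

/// Maximum wait for a final transcript after the hotkey is released.
let finalTranscriptGrace: TimeInterval = 1.5

/// Picks the text to send after release.
/// - Parameters:
///   - currentText: text visible in the overlay at release time.
///   - final: the final transcript and how many seconds after release it arrived,
///            or `nil` if it never arrived.
func textToSend(currentText: String,
                final: (text: String, delay: TimeInterval)?) -> String {
    if let final = final, final.delay <= finalTranscriptGrace {
        return final.text   // final transcript arrived within the grace period
    }
    return currentText      // too late or missing: fall back to the current text
}
```

A final transcript that arrives 0.4 s after release is sent as is; one that arrives after 2 s, or never, is ignored and the text already in the overlay is sent instead.
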
## Debugging Support

Stream logs using:

```bash
sudo log stream --predicate 'subsystem == "bot.molt" AND category CONTAINS "voicewake"'
```

## Migration Path

Implementation follows five sequential steps:

1. Add core components
2. Wire VoiceSessionCoordinator
3. Integrate VoiceSession model
4. Connect VoiceSessionPublisher to SwiftUI
5. Run integration tests for session adoption and cooldown behavior