# Voice Wake & Push-to-Talk

## Overview

The Voice Wake feature operates in two modes: wake-word (always-on with trigger detection) and push-to-talk (immediate capture via right Option key hold).

## Key Operating Modes

### Wake-word Mode

This is the default mode. The speech recognizer listens continuously for the configured trigger tokens. Upon detection, it:

1. Initiates capture
2. Displays an overlay with the partial transcription
3. Automatically sends the transcript after detecting silence

### Push-to-talk Mode

Capture starts immediately when the user holds the right Option key; no trigger word is necessary. The overlay remains visible during the hold, and the audio is processed after release.

## Technical Architecture

### VoiceWakeRuntime

Manages the speech recognizer. It requires approximately 0.55 seconds of meaningful pause between the trigger word and the command.

### Silence Detection

- 2.0-second silence window during active speech
- 5.0-second silence window if only the trigger was detected
- Hard 120-second limit per session

### Overlay Implementation

Uses `VoiceWakeOverlayController` with committed and volatile text states. A critical improvement prevents the "sticky overlay" failure mode, in which manually dismissing the overlay could halt listening. The runtime no longer blocks on overlay visibility, and closing the overlay triggers an automatic restart.

## User Configuration

Available settings include:

- Toggle Voice Wake on/off
- Enable push-to-talk (Cmd+Fn hold, macOS 26+)
- Language selection
- Microphone selection with persistent preferences
- Customizable audio cues (Glass sound by default, or any NSSound-compatible file)

## Message Routing

Transcripts are forwarded to the active gateway using the app's configured local or remote mode. Replies are delivered to the previously used primary provider:

- WhatsApp
- Telegram
- Discord
- WebChat
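
To make the Silence Detection timings concrete, here is a minimal sketch of how the three thresholds could combine into a single "finalize now?" decision. It is illustrative only and is not the app's actual code: `CaptureState`, `silence_window`, and `should_finalize` are hypothetical names, and only the 2.0 s, 5.0 s, and 120 s values come from this document.

```python
from dataclasses import dataclass

# Values taken from the Silence Detection section.
SPEECH_SILENCE_S = 2.0        # silence window once command speech was heard
TRIGGER_ONLY_SILENCE_S = 5.0  # silence window if only the trigger was heard
HARD_CAP_S = 120.0            # hard limit per capture session


@dataclass
class CaptureState:
    """Hypothetical bookkeeping for one capture session (times in seconds)."""
    started_at: float
    last_speech_at: float
    heard_command: bool = False  # True once speech beyond the trigger arrived


def silence_window(state: CaptureState) -> float:
    """Pick the silence window that applies to the current session state."""
    return SPEECH_SILENCE_S if state.heard_command else TRIGGER_ONLY_SILENCE_S


def should_finalize(state: CaptureState, now: float) -> bool:
    """Return True when the session should stop capturing."""
    # The hard cap wins regardless of how recently speech was heard.
    if now - state.started_at >= HARD_CAP_S:
        return True
    # Otherwise finalize once silence has lasted a full window.
    return now - state.last_speech_at >= silence_window(state)
```

With this shape, a session that only ever heard the trigger gives the user 5 seconds to start speaking, while a session with command speech ends 2 seconds after the last audio, and neither can run past 120 seconds.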