What is VAD?
Voice Activity Detection (VAD) analyzes audio to determine:
- START_OF_SPEECH - User has started speaking
- END_OF_SPEECH - User has stopped speaking
- CONTINUING - Speech is ongoing
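As a rough illustration of how these three events relate, the sketch below turns a sequence of per-frame speech/non-speech decisions into events. The `VADEvent` enum and `vad_events` function are hypothetical names used only for this example, not an actual provider API, and a real VAD would typically also smooth decisions over several frames before changing state.

```python
from enum import Enum
from typing import Iterable, Iterator


class VADEvent(Enum):
    """Hypothetical event type mirroring the three states listed above."""
    START_OF_SPEECH = "start_of_speech"
    CONTINUING = "continuing"
    END_OF_SPEECH = "end_of_speech"


def vad_events(frame_is_speech: Iterable[bool]) -> Iterator[VADEvent]:
    """Turn per-frame speech/non-speech decisions into VAD events."""
    speaking = False
    for is_speech in frame_is_speech:
        if is_speech and not speaking:
            # Transition from silence to speech.
            speaking = True
            yield VADEvent.START_OF_SPEECH
        elif is_speech and speaking:
            # Speech was already in progress.
            yield VADEvent.CONTINUING
        elif not is_speech and speaking:
            # Transition from speech back to silence.
            speaking = False
            yield VADEvent.END_OF_SPEECH
        # Silence while not speaking produces no event.


decisions = [False, True, True, True, False, False]
events = [event.name for event in vad_events(decisions)]
print(events)
# ['START_OF_SPEECH', 'CONTINUING', 'CONTINUING', 'END_OF_SPEECH']
```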
How VAD Works
VAD processes audio in frames.
Configuration
Basic Configuration
Sample Rates
VAD requires specific sample rates:
- 8000 Hz - Telephone quality
- 16000 Hz - Standard quality (recommended)
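One way to guard against unsupported input is to validate the sample rate before audio reaches the VAD. The following is a minimal sketch; `check_sample_rate` and `frame_duration_ms` are hypothetical helpers, and the 512- and 256-sample frame sizes are illustrative values rather than requirements stated by this page.

```python
# Sample rates taken from the list above.
SUPPORTED_SAMPLE_RATES = (8000, 16000)


def check_sample_rate(sample_rate: int) -> int:
    """Return the sample rate unchanged, or raise if it is not supported."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(
            f"Unsupported sample rate {sample_rate} Hz; "
            f"expected one of {SUPPORTED_SAMPLE_RATES}"
        )
    return sample_rate


def frame_duration_ms(num_samples: int, sample_rate: int) -> float:
    """Duration in milliseconds of a frame with the given number of samples."""
    return 1000.0 * num_samples / check_sample_rate(sample_rate)


print(frame_duration_ms(512, 16000))  # 32.0
print(frame_duration_ms(256, 8000))   # 32.0
```

Audio captured at another rate (for example 44100 Hz) would need to be resampled to one of the supported rates first.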
Activation Threshold
Adjust sensitivity:
- Lower (0.3-0.4) - More sensitive, detects quieter speech
- Default (0.5) - Balanced
- Higher (0.6-0.7) - Less sensitive, reduces false positives
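To see the effect of the threshold, the sketch below applies three different values to the same set of per-frame speech probabilities. The probabilities are made up for illustration, `speech_frames` is a hypothetical helper, and treating a frame as speech when its probability is greater than or equal to the threshold is an assumption about how the comparison is done.

```python
from typing import List, Sequence


def speech_frames(
    probabilities: Sequence[float], activation_threshold: float = 0.5
) -> List[bool]:
    """Mark each frame as speech if its probability meets the threshold.

    Assumption: the comparison is `>=`; a real provider may differ.
    """
    return [p >= activation_threshold for p in probabilities]


# Made-up per-frame speech probabilities, for illustration only.
probs = [0.10, 0.35, 0.55, 0.65, 0.80]

counts = {}
for threshold in (0.3, 0.5, 0.7):
    counts[threshold] = sum(speech_frames(probs, threshold))
    print(f"threshold={threshold}: {counts[threshold]} of {len(probs)} frames flagged as speech")
# threshold=0.3: 4 of 5 frames flagged as speech
# threshold=0.5: 3 of 5 frames flagged as speech
# threshold=0.7: 1 of 5 frames flagged as speech
```

Lowering the threshold flags more borderline frames as speech, which helps with quiet speakers but also lets more background noise through.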
Available Providers
- Silero - On-device, no API keys needed, works offline
Next Steps
- VAD Providers - Choose and configure VAD
- STT - Speech-to-Text
- Turn Detection - Turn detection

