What is Voice Streaming?
Voice streaming enables your AI Voice Agent to:
- Receive voice input - Process audio in real-time
- Convert speech to text - Use STT to transcribe audio
- Detect speech activity - Use VAD to know when users are speaking
- Detect conversation turns - Know when users have finished speaking
- Respond naturally - Enable natural turn-taking in conversations
Core Components
Speech-to-Text (STT)
Convert speech to text in real-time
Voice Activity Detection (VAD)
Detect when users are speaking
Turn Detection
Determine when users finish speaking
Audio Streaming
Stream audio in real-time
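To make the VAD and turn detection components more concrete, here is a minimal sketch using a simple energy threshold and a silence timeout. This is an illustrative assumption, not the method used by any particular provider: production VAD and turn detection are typically model-based. The names `rms`, `is_speech`, and `find_turn_end`, along with the threshold and timing values, are hypothetical.

```python
import math
from typing import List, Optional, Sequence

Frame = List[float]  # one frame of audio samples, assumed to be in [-1.0, 1.0]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def is_speech(frame: Frame, threshold: float = 0.02) -> bool:
    """Hypothetical VAD: a frame counts as speech when its energy exceeds the threshold."""
    return rms(frame) > threshold


def find_turn_end(
    frames: Sequence[Frame],
    frame_ms: int = 20,
    silence_ms: int = 500,
    threshold: float = 0.02,
) -> Optional[int]:
    """Return the index of the frame at which the turn is considered finished.

    A turn ends once speech has been heard and is then followed by
    `silence_ms` of continuous silence. Returns None if that never happens.
    """
    needed = silence_ms // frame_ms  # consecutive silent frames required
    heard_speech = False
    silent_run = 0
    for i, frame in enumerate(frames):
        if is_speech(frame, threshold):
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= needed:
                return i
    return None
```

A fixed silence timeout is the simplest possible turn detector. It cannot tell a mid-sentence pause from a finished thought, which is why turn detection is listed as a component separate from VAD.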
Audio Pipeline
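The complete audio processing flow combines the components above: audio is streamed in, VAD detects speech, STT transcribes it, and turn detection signals when the user has finished speaking.
Quick Example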
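As a rough sketch of how these stages might be wired together, the example below passes audio frames through a pluggable VAD callable, buffers the speech, and hands the buffered audio to an STT callable once the user has been silent long enough. The `VoicePipeline` class, its parameters, and the stub callables are all hypothetical and stand in for whichever providers you choose; this is not the API of any specific SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Optional

Frame = List[float]  # one frame of audio samples


@dataclass
class VoicePipeline:
    """Hypothetical pipeline: frames -> VAD -> buffer -> turn end -> STT."""

    vad: Callable[[Frame], bool]        # returns True when a frame contains speech
    stt: Callable[[List[Frame]], str]   # transcribes the buffered speech frames
    silence_frames: int = 25            # silent frames that mark the end of a turn
    _buffer: List[Frame] = field(default_factory=list, init=False)
    _silent_run: int = field(default=0, init=False)

    def feed(self, frame: Frame) -> Optional[str]:
        """Process one frame; return a transcript when a turn completes."""
        if self.vad(frame):
            self._buffer.append(frame)
            self._silent_run = 0
            return None
        if not self._buffer:
            return None  # silence before any speech: nothing to do
        self._silent_run += 1
        if self._silent_run < self.silence_frames:
            return None  # pause is still too short to end the turn
        transcript = self.stt(self._buffer)
        self._buffer = []
        self._silent_run = 0
        return transcript

    def run(self, frames: Iterable[Frame]) -> Iterator[str]:
        """Yield one transcript per completed user turn."""
        for frame in frames:
            text = self.feed(frame)
            if text is not None:
                yield text


if __name__ == "__main__":
    # Stub components for illustration only; swap in real VAD and STT providers.
    pipeline = VoicePipeline(
        vad=lambda f: max(f, default=0.0) > 0.1,
        stt=lambda speech: f"<transcript of {len(speech)} frames>",
        silence_frames=3,
    )
    loud, quiet = [0.5] * 4, [0.0] * 4
    for transcript in pipeline.run([loud, loud, quiet, quiet, quiet]):
        print(transcript)
```

Keeping the VAD and STT as injected callables mirrors the idea of choosing providers separately: the pipeline logic stays the same while the components behind it change.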
Next Steps
- STT - Speech-to-Text
- VAD - Voice Activity Detection
- Turn Detection - Determine when users finish speaking
- Audio Streaming - Real-time streaming
- Integrations - Choose providers

