Build an IVR System with the Speeko API
Interactive Voice Response (IVR) systems used to sound robotic. In 2026, there's no excuse.
The Modern IVR Stack
- Telephony: Twilio, Vonage, or Plivo for call routing
- Voice generation: Speeko for natural greetings and prompts
- Speech recognition: interprets caller input
- Backend logic: routes calls based on caller responses
Sample Twilio Integration
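    const express = require('express');
    const twilio = require('twilio');
    const axios = require('axios');

    const app = express();

    app.post('/voice', async (req, res) => {
      const twiml = new twilio.twiml.VoiceResponse();

      // Generate the greeting audio with Speeko.
      const audio = await axios.post(
        'https://api.speekoapp.com/v1/tts',
        {
          text: 'Welcome to Acme Corp. Press 1 for sales, 2 for support.',
          voice: 'af_heart',
          format: 'mp3'
        },
        { headers: { Authorization: 'Bearer YOUR_KEY' }, responseType: 'arraybuffer' }
      );

      // uploadToCDN is not defined here; substitute your own helper that
      // stores the audio and returns a public URL Twilio can fetch.
      const audioUrl = await uploadToCDN(audio.data);

      // Play the prompt, then collect a single keypress.
      twiml.play(audioUrl);
      twiml.gather({ numDigits: 1, action: '/handle-menu' });
      res.type('text/xml').send(twiml.toString());
    });

    app.listen(3000); // port is arbitrary

Caching Strategy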
IVR prompts repeat thousands of times. Pre-generate and cache them:
    // Pre-generated audio files, keyed by prompt name.
    const PROMPTS = {
      welcome: 'ivr/welcome.mp3',
      menu: 'ivr/menu.mp3',
      invalid: 'ivr/invalid.mp3'
    };

This drops your TTS costs to near zero for static prompts.
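One way to make sure each prompt is synthesized only once is to key the cached audio on a hash of the text, voice, and format. The sketch below is a minimal illustration, not a production cache: `promptCacheKey` and `createPromptCache` are hypothetical names, `synthesize` stands in for whatever function wraps your Speeko request, and the in-memory `Map` would more likely be a CDN or object store in a real deployment.

```javascript
const crypto = require('crypto');

// Derive a stable key from everything that affects the generated audio.
function promptCacheKey(text, voice, format) {
  const hash = crypto
    .createHash('sha256')
    .update(JSON.stringify([text, voice, format]))
    .digest('hex');
  return `ivr/${hash.slice(0, 16)}.${format}`;
}

// Wrap a TTS function so each distinct prompt is synthesized only once.
// `synthesize` is assumed to be an async (text, voice, format) => Buffer.
function createPromptCache(synthesize, store = new Map()) {
  return async function getPrompt(text, voice = 'af_heart', format = 'mp3') {
    const key = promptCacheKey(text, voice, format);
    if (!store.has(key)) {
      // Cache the promise so concurrent callers share a single request,
      // and evict it on failure so a later call can retry.
      const pending = Promise.resolve()
        .then(() => synthesize(text, voice, format))
        .catch((err) => {
          store.delete(key);
          throw err;
        });
      store.set(key, pending);
    }
    return store.get(key);
  };
}
```

Because the key depends only on the inputs, changing the wording of a prompt produces a new key automatically, and the stale audio simply stops being requested.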
Dynamic Personalization
For personalized messages, generate on-demand:
    // generateTTS is assumed to be a helper that wraps the Speeko request shown earlier.
    const greeting = `Hello ${customer.firstName}, your order will arrive tomorrow.`;
    const audio = await generateTTS(greeting);

Call Flow Design
- Keep menus short: four options at most
- Always offer a "speak to a human" option
- Confirm caller input before acting
- Use natural phrasing, not "press 1 for the first option"
- Respect regional norms (formality, pacing)
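Several of these rules can be enforced in code rather than by convention. The following is a rough sketch under stated assumptions: the option labels, routes, and function names are all hypothetical, the human-agent option is mapped to 0 and counted toward the four-option limit, and the confirmation wording is only an example.

```javascript
const MAX_OPTIONS = 4; // includes the human-agent option (an assumption)
const HUMAN_OPTION = { label: 'an agent', route: '/queue/agent' };

// Hypothetical menu entries; replace with your own departments and routes.
const SAMPLE_OPTIONS = {
  '1': { label: 'sales', route: '/queue/sales' },
  '2': { label: 'support', route: '/queue/support' },
  '3': { label: 'billing', route: '/queue/billing' },
};

// Build a menu that always includes "0" for a human and never grows too long.
function buildMenu(options) {
  if (Object.keys(options).length + 1 > MAX_OPTIONS) {
    throw new Error(`Menu exceeds ${MAX_OPTIONS} options`);
  }
  return { ...options, '0': HUMAN_OPTION };
}

// Turn a keypress into the next step: confirm a valid choice, reprompt otherwise.
function resolveChoice(menu, digit) {
  if (!Object.prototype.hasOwnProperty.call(menu, digit)) {
    return { action: 'reprompt', say: "Sorry, that isn't one of the options." };
  }
  const choice = menu[digit];
  return {
    action: 'confirm',
    say: `You'd like ${choice.label}. Press 1 to continue, or 2 to go back.`,
    route: choice.route,
  };
}
```

A `/handle-menu` handler could call `resolveChoice(buildMenu(SAMPLE_OPTIONS), req.body.Digits)` and render the result as TwiML, keeping the routing rules testable without a live call.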
Monitoring
Track drop-off at each menu level. If 40% of callers abandon the first menu, redesign it.
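As an illustration of that check, drop-off can be computed from a log of menu visits. This sketch assumes a hypothetical event shape of `{ level, outcome }`, where `outcome` is `'abandoned'` when the caller hung up at that menu; the function names and the 0.4 default threshold are illustrative only.

```javascript
// Aggregate visits and abandonments per menu level.
function dropOffByLevel(events) {
  const stats = new Map();
  for (const { level, outcome } of events) {
    const entry = stats.get(level) || { entered: 0, abandoned: 0 };
    entry.entered += 1;
    if (outcome === 'abandoned') entry.abandoned += 1;
    stats.set(level, entry);
  }
  return [...stats].map(([level, { entered, abandoned }]) => ({
    level,
    entered,
    abandoned,
    rate: abandoned / entered,
  }));
}

// List the menu levels whose abandonment rate meets or exceeds the threshold.
function levelsToRedesign(events, threshold = 0.4) {
  return dropOffByLevel(events)
    .filter((row) => row.rate >= threshold)
    .map((row) => row.level);
}
```

Running `levelsToRedesign` on a day's worth of events gives a short list of menus to revisit first.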