AI Audiobook Creation: From Manuscript to Audible
A human-narrated audiobook costs $2,000-$5,000 and takes weeks. AI narration costs under $100 and takes hours. Here's the production pipeline.
The Reality of AI Audiobooks
In 2026, Audible officially supports AI-narrated audiobooks with disclosure. Spotify, Google Play Books, and Kobo all accept them. The stigma is fading, but quality requirements are rising.
Pipeline Overview
- Prepare manuscript — Clean EPUB or formatted text
- Segment by chapter — One audio file per chapter
- Choose voice — Narrator voice matching genre
- Handle dialogue — Different voices for characters (optional)
- Generate audio — Batch processing through TTS API
- Post-production — Normalize loudness to ACX standard
- Master — Export as MP3 192kbps or higher
- Distribute — ACX, Findaway Voices, Spotify for Authors
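The optional dialogue step in the list above depends on separating quoted speech from narration. Here is a minimal sketch of that split, assuming dialogue is wrapped in straight or curly double quotes. The name split_dialogue is made up for this example, and the function does not try to work out who is speaking, which would need extra logic or manual tagging.

```python
import re

# Assumption: speech is delimited by straight or curly double quotes.
# Manuscripts that use single quotes or dashes for dialogue need another pattern.
QUOTED = re.compile(r'["\u201c]([^"\u201c\u201d]+)["\u201d]')

def split_dialogue(text):
    """Return (kind, text) pairs where kind is 'narration' or 'dialogue'."""
    segments = []
    pos = 0
    for match in QUOTED.finditer(text):
        # Everything between the previous quote and this one is narration.
        before = text[pos:match.start()].strip()
        if before:
            segments.append(("narration", before))
        segments.append(("dialogue", match.group(1).strip()))
        pos = match.end()
    # Trailing narration after the last quoted span.
    tail = text[pos:].strip()
    if tail:
        segments.append(("narration", tail))
    return segments
```

Each pair can then be sent to the narrator voice or a character voice, depending on its kind.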
Voice Selection Guide
- Literary fiction — Mature, measured delivery
- Thrillers — Tense, lower-register male voices work well
- Romance — Warm female voices, genre-expected
- Non-fiction/business — Clear, authoritative
- Children's books — Animated, varied pacing
ACX Audio Standards
If publishing on Audible via ACX, audio must meet:
- RMS between -23dB and -18dB
- Peak values below -3dB
- Noise floor below -60dB
- MP3 192kbps constant bitrate, 44.1kHz
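The RMS and peak limits above can be checked in code before uploading. The sketch below assumes the audio is already decoded into float samples between -1.0 and 1.0. The function names are illustrative, and the noise floor check is left out because it requires isolating a silent passage first.

```python
import math

def db(value):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(value) if value > 0 else float("-inf")

def measure(samples):
    """Return (rms_db, peak_db) for float samples in the range -1.0..1.0."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    return db(rms), db(peak)

def meets_acx_levels(samples):
    """Check the RMS window (-23 to -18 dB) and the -3 dB peak ceiling."""
    rms_db, peak_db = measure(samples)
    return -23.0 <= rms_db <= -18.0 and peak_db < -3.0
```

Running this per chapter file catches level problems early, but it is only a rough pre-check and not a substitute for the platform's own validation.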
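ffmpeg's loudnorm filter can automate the loudness step. It targets integrated loudness (LUFS) rather than RMS, so re-measure the output against the limits above before submitting.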
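One way to script the ffmpeg step is to build the command in Python and run it per chapter. This is a sketch under assumptions: the -20 LUFS target and -3.5 dB true-peak ceiling are starting values chosen to land inside the ACX window, not figures from ACX, and the function names are illustrative.

```python
import subprocess

def build_ffmpeg_command(src, dst, target_lufs=-20.0, true_peak=-3.5):
    """Build an ffmpeg command that normalizes loudness and encodes a 192 kbps MP3."""
    # loudnorm takes an integrated loudness target (I) and a true-peak ceiling (TP).
    audio_filter = f"loudnorm=I={target_lufs}:TP={true_peak}"
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-af", audio_filter,
        "-ar", "44100",            # resample to 44.1 kHz
        "-codec:a", "libmp3lame",
        "-b:a", "192k",            # fixed bitrate of 192 kbps
        dst,
    ]

def normalize_file(src, dst):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Keeping the command builder separate from the subprocess call makes it easy to print and inspect the command before processing a whole book.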
Chapter Handling
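# parse_epub, generate_tts, and normalize_acx are placeholders for your own helpers.
chapters = parse_epub('book.epub')
for i, chapter in enumerate(chapters):
    # One TTS call and one normalized MP3 per chapter.
    audio = generate_tts(chapter.text, voice='narrator_01')
    normalize_acx(audio, f'chapter_{i:02d}.mp3')
Multi-Voice Dialogue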
Parse dialogue and switch voices per character:
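# character_voices maps speaker names to voice IDs; parse_dialogue is a placeholder helper.
for segment in parse_dialogue(chapter):
    # Fall back to the narrator voice for speakers without an assigned voice.
    voice = character_voices.get(segment.speaker, 'narrator_01')
    audio = generate_tts(segment.text, voice=voice)
Legal Note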
Disclose AI narration in metadata. Some platforms require it; all recommend it.