AI Audiobook Creation: From Manuscript to Audible

A human-narrated audiobook typically costs $2,000-$5,000 and takes weeks to produce. AI narration can cost under $100 and take hours. Here is the production pipeline.

The Reality of AI Audiobooks

In 2026, Audible officially supports AI-narrated audiobooks with disclosure. Spotify, Google Play Books, and Kobo all accept them. The stigma is fading, but quality requirements are rising.

Pipeline Overview

Prepare manuscript — Clean EPUB or formatted text
Segment by chapter — One audio file per chapter
Choose voice — Narrator voice matching genre
Handle dialogue — Different voices for characters (optional)
Generate audio — Batch processing through TTS API
Post-production — Normalize loudness to ACX standard
Master — Export as MP3 192kbps or higher
Distribute — ACX, Findaway Voices, Spotify for Authors
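
As a rough sketch of the segmentation step, assuming the manuscript has already been exported to plain text and every chapter opens with a line such as "Chapter 1" (the heading pattern and the `split_chapters` name are illustrative assumptions, not part of any library):

```python
import re

# Assumed convention: a chapter starts with a line like "Chapter 12" or "Chapter 3: Title".
CHAPTER_HEADING = re.compile(r"^chapter[ \t]+\d+.*$", re.IGNORECASE | re.MULTILINE)

def split_chapters(manuscript: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs in reading order.

    Text before the first heading (front matter) is dropped.
    """
    headings = list(CHAPTER_HEADING.finditer(manuscript))
    chapters = []
    for index, match in enumerate(headings):
        body_start = match.end()
        # The body runs to the next heading, or to the end of the text.
        if index + 1 < len(headings):
            body_end = headings[index + 1].start()
        else:
            body_end = len(manuscript)
        heading = match.group().strip()
        body = manuscript[body_start:body_end].strip()
        chapters.append((heading, body))
    return chapters
```

Books that use other heading styles ("Part One", bare numerals, named chapters) would need a different pattern, and an EPUB's own table of contents is usually a more reliable source of chapter boundaries when it is available.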

Voice Selection Guide

Literary fiction — Mature, measured delivery
Thrillers — Tense, lower-register male voices work well
Romance — Warm female voices, genre-expected
Non-fiction/business — Clear, authoritative
Children's books — Animated, varied pacing

ACX Audio Standards

If you publish on Audible via ACX, each audio file must meet:

RMS between -23dB and -18dB
Peak values below -3dB
Noise floor below -60dB
MP3 192kbps constant bitrate, 44.1kHz
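
To make the level targets above concrete, here is a minimal sketch that measures RMS and peak in dB from decoded samples and compares them with those thresholds. It assumes the samples are already floats in the range -1.0 to 1.0 and that the noise floor was measured separately on a silent passage; the function names are illustrative.

```python
import math

def to_dbfs(amplitude: float) -> float:
    """Convert a linear amplitude (1.0 = full scale) to dB relative to full scale."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")

def measure_levels(samples: list[float]) -> tuple[float, float]:
    """Return (rms_db, peak_db) for samples normalized to -1.0..1.0."""
    if not samples:
        raise ValueError("no samples to measure")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    return to_dbfs(rms), to_dbfs(peak)

def meets_acx_levels(rms_db: float, peak_db: float, noise_floor_db: float) -> bool:
    """Check measured levels against the RMS, peak, and noise floor targets."""
    return (
        -23.0 <= rms_db <= -18.0
        and peak_db < -3.0
        and noise_floor_db < -60.0
    )
```

For example, `meets_acx_levels(*measure_levels(samples), noise_floor_db=-65.0)` gives a quick pass or fail before upload. It does not check bitrate or sample rate, and it is no substitute for the platform's own validation.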

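Use ffmpeg's loudness normalization filters to automate these adjustments, then re-measure the result, because the loudnorm filter targets integrated loudness in LUFS rather than RMS.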
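
One possible way to script the ffmpeg step is sketched below. It assumes ffmpeg is installed with the libmp3lame encoder. The -20 LUFS target is only a starting point chosen to land near the middle of the RMS window, not an ACX-specified value. The function names are illustrative.

```python
import subprocess

def build_normalize_command(src: str, dst: str, target_lufs: float = -20.0) -> list[str]:
    """Build an ffmpeg command that normalizes loudness and encodes a 192 kbps MP3."""
    return [
        "ffmpeg", "-y", "-i", src,
        # loudnorm implements EBU R128 (LUFS), not RMS; TP caps the true peak.
        # The I value is an assumed starting point, so re-measure the output.
        "-af", f"loudnorm=I={target_lufs}:TP=-3:LRA=11",
        "-ar", "44100",            # 44.1 kHz sample rate
        "-codec:a", "libmp3lame",  # MP3 encoder
        "-b:a", "192k",            # fixed bitrate (CBR unless VBR flags are added)
        dst,
    ]

def normalize(src: str, dst: str) -> None:
    """Run the command; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_normalize_command(src, dst), check=True)
```

Calling `normalize("chapter_01.wav", "chapter_01.mp3")` would produce the encoded file. If the measured RMS falls outside the window, adjusting `target_lufs` by a decibel or two and re-running is usually simpler than hand-editing.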

Chapter Handling

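# parse_epub, generate_tts, and normalize_acx are placeholder helpers you supply
chapters = parse_epub('book.epub')

# Number files from 1 so chapter_01.mp3 is the first chapter
for i, chapter in enumerate(chapters, start=1):
    audio = generate_tts(chapter.text, voice='narrator_01')
    normalize_acx(audio, f'chapter_{i:02d}.mp3')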
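
The loop above depends on helpers the article does not define. As a hedged sketch of the batch step on its own, the version below takes the TTS call as a parameter (`synthesize` is a hypothetical wrapper around whichever provider you use, assumed here to return WAV bytes) and skips chapters that were already rendered:

```python
from pathlib import Path
from typing import Callable, Iterable

def render_chapters(
    chapter_texts: Iterable[str],
    synthesize: Callable[[str], bytes],  # hypothetical: wraps your TTS provider's API
    out_dir: Path,
) -> list[Path]:
    """Write one raw audio file per chapter and return the file paths in order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for number, text in enumerate(chapter_texts, start=1):
        target = out_dir / f"chapter_{number:02d}.wav"
        # Skipping existing files lets an interrupted batch resume
        # without paying to regenerate finished chapters.
        if not target.exists():
            target.write_bytes(synthesize(text))
        written.append(target)
    return written
```

Keeping the provider call behind a plain function also makes the loop easy to test with a stub before spending API credits.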

Multi-Voice Dialogue

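Parse dialogue and switch voices per character, collecting the clips in order so they can be joined into one chapter file:

clips = []
for segment in parse_dialogue(chapter):
    # Fall back to the narrator voice for speakers without an assigned voice
    voice = character_voices.get(segment.speaker, 'narrator_01')
    clips.append(generate_tts(segment.text, voice=voice))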
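
What `parse_dialogue` does is left open above. A deliberately naive sketch follows. It takes the chapter text as a string, assumes straight double quotes, and attributes a quote only when it is immediately followed by "said Name" or "Name said". The `Segment` type and the regular expressions are illustrative assumptions, and real prose will need something more robust.

```python
import re
from typing import NamedTuple

class Segment(NamedTuple):
    speaker: str
    text: str

# Assumes straight double quotes; normalize curly quotes before calling.
QUOTED = re.compile(r'"([^"]+)"')
# Naive attribution directly after the closing quote: 'said Mara' or 'Mara said'.
ATTRIBUTION = re.compile(r"^\s*(?:said\s+([A-Z]\w+)|([A-Z]\w+)\s+said)")

def parse_dialogue(text: str, narrator: str = "narrator") -> list[Segment]:
    """Split text into narration and quoted speech, in reading order."""
    segments = []
    cursor = 0
    for match in QUOTED.finditer(text):
        narration = text[cursor:match.start()].strip()
        if narration:
            segments.append(Segment(narrator, narration))
        tag = ATTRIBUTION.match(text[match.end():])
        # Quotes without a recognizable tag fall back to the narrator voice.
        speaker = (tag.group(1) or tag.group(2)) if tag else narrator
        segments.append(Segment(speaker, match.group(1)))
        cursor = match.end()
    tail = text[cursor:].strip()
    if tail:
        segments.append(Segment(narrator, tail))
    return segments
```

Pronoun tags such as "He said" would be treated as a speaker named "He", so a manual review pass over the detected speakers is worth the time before generating audio.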

Legal Note

Disclose AI narration in metadata. Some platforms require it, and it is good practice on the rest.

Start producing your audiobook.