Scripts to Audiobook: An AI Coding Journey 🎧

I’m a movie buff and an avid audiobook consumer. For a while, I’d been thinking: what if I could automatically generate personalized audiobooks from AI-generated scripts?

That personal itch led to my latest project: Scripts to Audiobook — a tool to convert scripts into multi-voice audiobooks with configurable TTS providers.

I started building on March 15th and shipped by the early hours of March 20th. In those ~5 days (including a weekend), I cycled through three AI coding assistants. Here’s what happened.

The Starting Point: Perplexity Computer Use

I started with Perplexity’s Computer Use feature, drawn by its promise of “high completion rates” and its impressive real-time preview. As a Pro subscriber (4,500 credits), I gave it a serious shot, running my balance from 3,400+ credits down to 2,200+ before pausing.

My experience was mixed:

What worked:

Real-time preview — Seeing changes instantly was incredibly valuable
Interactive development — The conversational flow felt natural and efficient
Quick prototyping — Great for testing ideas and rapid iteration

What didn’t:

Credits burn fast — 1,200+ credits for partial completion adds up
Completion gap — The AI tended to “brand” things aggressively (naming them after itself), which required manual cleanup
Production readiness — While impressive for demos, the output needed significant polish for real use

I stopped not because it couldn’t do the job, but because the credit cost of finishing the project wasn’t worth it given how much polish the output still needed. It’s a fantastic tool for exploration and demos, but I’d save it for smaller projects or validation phases.

Phase Two: Kimi Code & Token Burn

I switched to Kimi Code, hoping its focused approach would be more effective. It worked — and I burned through two weekly quotas in rapid succession.

Here’s what happened: I’d exhaust my quota, it would reset, and I’d burn through it again. That was partly because I was working intensely over a weekend, but mostly because of three things:

Scope creep — Requirements evolved mid-build (new TTS providers, audio format options, chapter support)
Refactor cycles — Fixing bugs led to more bugs, leading to repeated rewrites
Context loss — Starting fresh sessions meant re‑explaining the project each time

The tool itself was capable, but the workflow was inefficient. I was trading time for tokens.

Phase Three: Claude Code & The Finish Line

Finally, I switched to Claude Code (Anthropic’s terminal-based coding agent). Three things made it click:

Rich project context — It could read the entire codebase and understand the architecture
Iterative refinement — Small, targeted changes instead of wholesale rewrites
Stability — What worked yesterday kept working today

In one focused session, I:

Implemented the remaining features (chapter support, TTS switching)
Fixed all outstanding bugs
Polished the UI and documentation
Wrote comprehensive tests

The difference? It felt like working with a senior engineer who understood the big picture.

The Output: What I Built

After all that, here’s what I built:

Scripts to Audiobook converts structured scripts into multi-voice audiobooks using your choice of TTS provider.

Key Features

🎭 Multiple voices — Assign different voices to each character
📖 Chapter support — Organize long content into sections
🔧 TTS flexibility — Switch between providers (Edge TTS, OpenAI, ElevenLabs, etc.)
🎛️ Configurable output — Control audio format, bitrate, and speed
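
To make the first two features concrete, here is a minimal sketch of how a structured script could be split into chapters and character lines, with a voice attached to each line. The script format is an assumption for illustration only (a line starting with "#" opens a chapter, "NAME: text" is a spoken line), and the names parseScript, Chapter, SpokenLine, and the voice IDs are hypothetical, not the project's actual API.

```typescript
// Hypothetical script format (an assumption, not the project's real one):
//   "# Chapter title" opens a chapter, "NAME: text" is a spoken line.
interface SpokenLine {
  character: string;
  text: string;
  voice: string;
}

interface Chapter {
  title: string;
  lines: SpokenLine[];
}

// Splits a script into chapters and tags every spoken line with a voice.
// Characters without an entry in `voices` fall back to `defaultVoice`.
function parseScript(
  script: string,
  voices: Map<string, string>,
  defaultVoice: string,
): Chapter[] {
  const chapters: Chapter[] = [];
  let current: Chapter | undefined;

  for (const raw of script.split("\n")) {
    const line = raw.trim();
    if (line === "") continue;

    if (line.startsWith("#")) {
      // A heading line starts a new chapter.
      current = { title: line.replace(/^#+\s*/, ""), lines: [] };
      chapters.push(current);
      continue;
    }

    // Split on the first colon only, so dialogue may contain colons.
    const match = /^([^:]+):\s*(.+)$/.exec(line);
    if (!match) continue; // ignore anything that is not "NAME: text"

    if (!current) {
      // Lines before the first heading go into an implicit chapter.
      current = { title: "Untitled", lines: [] };
      chapters.push(current);
    }

    const character = match[1].trim();
    current.lines.push({
      character,
      text: match[2].trim(),
      voice: voices.get(character) ?? defaultVoice,
    });
  }
  return chapters;
}
```

Each SpokenLine can then be sent to the selected TTS provider with its own voice, and the resulting clips joined per chapter.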

What Makes It Special

Free to use — Edge TTS requires no API key; some other providers offer free tiers
AI-agnostic — Works with scripts from any AI (Claude, GPT, Gemini, etc.)
Beginner-friendly UI — Simple configuration interface, no coding required
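
The "no API key for Edge TTS" point suggests one simple way to model provider switching: a small registry that records whether each provider needs a key, plus a check that runs before any audio is generated. This is only a sketch under that assumption. The names ProviderId, PROVIDERS, TtsConfig, and validateConfig are hypothetical and may not match the project's real configuration.

```typescript
// All names below are illustrative assumptions, not the project's actual config.
type ProviderId = "edge" | "openai" | "elevenlabs";

interface ProviderInfo {
  label: string;
  requiresApiKey: boolean;
}

const PROVIDERS: Record<ProviderId, ProviderInfo> = {
  edge: { label: "Edge TTS", requiresApiKey: false },
  openai: { label: "OpenAI", requiresApiKey: true },
  elevenlabs: { label: "ElevenLabs", requiresApiKey: true },
};

interface TtsConfig {
  provider: ProviderId;
  apiKey?: string;
}

// Returns a list of human-readable problems; an empty list means the config is usable.
function validateConfig(config: TtsConfig): string[] {
  const info = PROVIDERS[config.provider];
  const problems: string[] = [];
  if (info.requiresApiKey && !config.apiKey) {
    problems.push(`${info.label} needs an API key`);
  }
  return problems;
}
```

With a registry like this, switching providers is a one-field change in the config, and a missing key is reported up front instead of failing halfway through a long script.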

Tech Stack

TypeScript — Type safety and better DX
Node.js — Cross‑platform runtime
Edge TTS / OpenAI / ElevenLabs — TTS providers
FFmpeg — Audio processing and format conversion
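
As an illustration of the FFmpeg step, the sketch below turns the configurable output settings (format, bitrate, speed) into an FFmpeg argument list. The flags themselves are standard FFmpeg options (-y, -i, -b:a, -filter:a with atempo), but the function name buildFfmpegArgs, the OutputOptions shape, and the supported formats are assumptions. The speed range is deliberately limited to 0.5 to 2.0 as a conservative choice.

```typescript
// Hypothetical option shape; the real project may expose different settings.
interface OutputOptions {
  format: "mp3" | "m4a" | "wav"; // selected through the output file extension
  bitrateKbps: number;           // ignored for uncompressed wav
  speed: number;                 // 1.0 leaves the tempo unchanged
}

// Builds the argument list for an `ffmpeg` invocation (without the binary name).
function buildFfmpegArgs(
  input: string,
  outputBase: string,
  opts: OutputOptions,
): string[] {
  if (opts.speed < 0.5 || opts.speed > 2.0) {
    throw new RangeError("speed must be between 0.5 and 2.0");
  }

  // -y overwrites an existing output file without prompting.
  const args = ["-y", "-i", input];

  // atempo changes playback speed without changing pitch.
  if (opts.speed !== 1) {
    args.push("-filter:a", `atempo=${opts.speed}`);
  }

  // A target bitrate only makes sense for compressed formats.
  if (opts.format !== "wav") {
    args.push("-b:a", `${opts.bitrateKbps}k`);
  }

  args.push(`${outputBase}.${opts.format}`);
  return args;
}
```

The returned array can be passed to Node's child_process spawn as the arguments for "ffmpeg". Keeping the builder as a pure function makes it easy to test without running FFmpeg.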

Key Learnings

Tool choice matters — Different AI coding assistants have different strengths. Match the tool to the phase.
Token efficiency — Rich context and iterative refinement beat brute‑force rewriting.
Ship fast, iterate — The first version doesn’t need to be perfect. It just needs to work.

The Beauty of AI-Augmented Coding

This entire project, from idea to shipped product, took about 5 days. And I’m not a professional developer.

The beauty isn’t in getting it perfect the first time — it’s in the ability to iterate rapidly, learn from mistakes, and keep moving forward.

AI coding assistants aren’t replacements for human judgment. They’re force multipliers. They help you move faster, think bigger, and ship smarter.

What’s Next?

I’m already thinking about v2:

Web UI for easier configuration
More TTS providers (Azure, Google, Amazon)
Voice cloning support
Batch processing for multiple scripts

But first, I’m taking a break. Shipping feels good. 😊

Check it out on GitHub: scripts-to-audiobook

Made with 💙 by Nicky & AI