Show HN: I generated 70k audiobooks with OpenAI Text-to-Speech
Implementation & Open Sourcing
- Code is currently closed source; some commenters ask for it to be open-sourced so they can self-host or contribute.
- Author describes it as a straightforward wrapper around OpenAI TTS + Google OAuth + a payment provider.
- Others note that open-sourcing offers the author little upside beyond portfolio value or visibility.
User Experience & Feature Requests
- Requests: search across the large catalog, more login options beyond Google, light theme, and mobile apps.
- Multiple voices and voice selection (by author or protagonist gender, or per-character) are highly requested but deprioritized so far.
- Other ideas: multi-narrator “audio play” style, audio generated natively at 1.5x speed (rather than sped up at playback), Apple Pay support.
Audio Quality, Languages & Use Cases
- Several listeners praise the natural cadence; some say it’s the best TTS they’ve heard, especially for essays and non-fiction.
- Others find it still slightly unnatural, with odd pauses/emphasis, and say it fails badly on poetry/dramatic works (e.g., Shakespeare’s meter).
- Consensus in the thread: TTS is currently much better for history/philosophy/science/non-fiction than for fiction and dialogue-heavy texts.
- OpenAI TTS is reported as weak for non‑English text; some note that other models convey emotion better but have worse voice quality or hallucinate more.
Generation Strategy & Scalability
- The system splits books into ~4k-character chunks because of API limits and generates audio on demand.
- It pre-generates the next chunk near the end of the current one to keep playback seamless.
- Full-book pre-generation and chapter MP3s are planned but not finished.
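The chunk-and-prefetch approach described above can be sketched roughly as follows. This is a minimal illustration, not the author's code: the 4096-character limit, the sentence-boundary splitting, the 15-second prefetch lead, and the names `split_into_chunks` and `should_prefetch` are all assumptions made for the example.

```python
import re

# Assumed per-request input limit; the thread only says "~4k characters".
MAX_CHARS = 4096


def split_into_chunks(text, max_chars=MAX_CHARS):
    """Greedily pack whole sentences into chunks no longer than max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single over-long sentence is hard-split so no chunk exceeds the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def should_prefetch(position_s, duration_s, lead_s=15.0):
    """True once playback is within lead_s seconds of the end of the current chunk."""
    return duration_s - position_s <= lead_s
```

A player loop would call `should_prefetch` periodically and, when it returns true, request audio for the next chunk so it is ready before the current one ends. Splitting on sentence boundaries is a design choice here to avoid cutting words or sentences mid-utterance; the thread does not say how the real system picks its split points.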
Monetization, Pricing & Caching
- Current model: one-time purchase of listening “hours,” with pricing set around 50% of raw API costs; profit appears only after multiple purchases of the same book.
- Revenue so far is very low; author hopes to reach modest MRR.
- Ideas from commenters:
  - Monthly subscription and mobile app for recurring revenue.
  - Crowd-funding per book (many small contributors unlock free public audio).
  - First buyer funds generation; others pay less, or listen free.
  - Using the project as a free/donation-based “lead magnet” for other products.
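The break-even arithmetic behind the current pricing can be made concrete with a small sketch. It assumes generated audio is cached so the API cost is paid once per book, and it ignores storage and payment fees; the function names and the $10 example cost are hypothetical.

```python
import math


def net_after(purchases, api_cost, price_fraction=0.5):
    """Net result after a number of purchases, paying the API cost once (assumed caching)."""
    return purchases * price_fraction * api_cost - api_cost


def purchases_to_profit(price_fraction=0.5):
    """Smallest number of purchases whose revenue strictly exceeds the one-time cost."""
    return math.floor(1 / price_fraction) + 1
```

With a price of about 50% of the raw API cost, two purchases of the same book only recover the generation cost, and the third is the first to yield a profit, which matches the thread's point that profit appears only after multiple purchases.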
Ethics & Value of Charging for Public Domain
- Some see charging for public-domain audiobooks as unethical or “gross”; others reply that API/storage costs must be covered and point out that many businesses charge for public-domain content.
- A compromise suggested: charge only at cost, or let users “donate” generated audio to the public.
Comparisons to Existing Projects & Tools
- Mention of Microsoft’s prior Gutenberg TTS effort; some say its voices are worse than OpenAI’s.
- Librivox is cited as a human-read alternative; some prefer human narration, others find many Librivox readings lower quality than the AI.
- Various TTS engines are discussed (ElevenLabs, Piper, Bark, xTTS, Voicebox); consensus is that OpenAI TTS is currently among the most pleasant but not perfect.
Marketing & Title Controversy
- A subthread disputes the HN post title's claim of having “generated 70k audiobooks”, since books are generated on demand rather than precomputed.
- Critics call this misleading or a “lie”; supporters say it’s a reasonable shorthand since all 70k are playable via the system and the on-demand detail is disclosed in the post.