Pushing the frontiers of audio generation
Overall impression of the tech
- Many commenters find the audio technically impressive and plausibly human; non-native speakers in particular find it convincing.
- Several commenters describe a “holy shit” moment where their brain briefly accepted it as real conversation.
- Others emphasize it’s good but “not yet great,” especially around disfluencies (“um,” “uh”) and pacing.
Uncanny valley & “fake personality”
- A dominant reaction is discomfort: voices feel like over-enthusiastic podcasters, ad reads, or awkward people reading a script.
- Listeners dislike the exaggerated friendliness, faux excitement, and constant back-channeling (“oh yeah,” etc.), calling it grating and shallow.
- People say they’d find this style annoying even from humans; the issue is tone and persona, not just artificiality.
- Some report no uncanny valley, especially non-native speakers, but still don’t like the “talking over each other” podcast format.
Identity, style, and training data
- Commenters note the voices lack a coherent “person” behind them: mannerisms and vocabulary feel averaged from training data, not tied to a distinct identity.
- Accents come up (e.g., “British accent”), with commenters acknowledging that such a label lumps many regional accents together and is imprecise.
- Several suspect training was skewed toward “professional audio” (ads, podcasts, audiobooks), leading to overfitted “podcaster banter.”
- The fake disfluencies feel mistimed and mechanical, which heightens the uncanny effect.
Use cases, tools, and adoption
- Proposed uses: low-budget voice acting, YouTube narration, “reaction-style” commentary, reading articles or documents.
- Some already use similar TTS tools (browser/OS features, commercial apps, cloud TTS APIs) to listen to blogs and papers.
- NotebookLM’s podcast-style summaries draw mixed reactions: some find them engaging, while others find them depressing because they replace careful reading with chatty overviews.
Societal and creative impact
- Concerns that AI-generated audio/music will flood platforms with low-effort content and “AI elevator music.”
- Worries that automating commercial creative work “eats the seed corn,” undermining the ecosystem that trains future human creatives.
- Others argue creative fields may eventually regrow as human-made work becomes a premium differentiator.