GenAI, the snake eating its own tail

Training data, sustainability, and synthetic content

  • One camp worries LLMs are killing “organic” venues (e.g., Q&A sites), reducing future training data and leading to eventual stagnation or “model collapse.”
  • Others argue we already have more than enough high-quality data; big gains now come from better architectures, curation, and synthetic data, which is proving effective in later training phases.
  • Several comments note that prompts, chats, and user uploads themselves form a huge new dataset, though critics say this lacks the peer review, voting/consensus signals, and explicit correctness checks that sites like Stack Overflow provided.
  • There’s some optimism that reinforcement mechanisms (users re-asking when answers fail, deleting when satisfied, grounded facts, tests/compilers) can approximate truth signals.
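The last point, that tests and compilers can serve as truth signals, can be sketched as a simple verification loop. This is a minimal illustration, not anyone's production pipeline: the candidate snippet and tests here are made-up stand-ins for model output and user-supplied checks.

```python
import subprocess
import sys
import tempfile
import textwrap


def passes_tests(candidate_code: str, test_code: str) -> bool:
    """Run candidate code plus its tests in a subprocess; exit code 0 counts as a truth signal."""
    source = candidate_code + "\n\n" + test_code
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True, timeout=30)
    return result.returncode == 0


# A model-produced snippet is accepted only if the tests pass.
candidate = textwrap.dedent("""
    def slugify(title):
        return "-".join(title.lower().split())
""")
tests = textwrap.dedent("""
    assert slugify("Hello World") == "hello-world"
    assert slugify("  GenAI  feedback  loop ") == "genai-feedback-loop"
""")

print(passes_tests(candidate, tests))  # True: the snippet satisfies its tests
```

The point of the sketch: unlike upvotes or accepted answers, this signal is mechanical and needs no human reviewer, which is why commenters see it as a partial substitute for Q&A-site consensus.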

Paying creators, incentives, and attribution

  • Many are skeptical of pay-per-crawl/revenue-share schemes: economically circular, administratively messy, and likely to yield trivial payouts.
  • Some think AI firms “should” pay, but a fair global payment system would be hard to build, easy to game with spam, and could end up worsening incentives.
  • A recurring view is that the real currency will remain attention, not micro-licensing; creators will differentiate by providing value beyond what an LLM can give.
  • The article’s proposal to “list the sources used for each answer” is widely judged infeasible for current transformers: trained weights retain no per-token provenance, so any citation the model produces on its own is fabricated. External search/RAG can only approximate attribution.
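The distinction in the last bullet can be made concrete with a toy retrieval-augmented loop: the citations are trustworthy only because the sources are fetched at answer time, not recovered from the model's weights. Everything below (the corpus, the keyword-overlap scoring, the function names) is a hypothetical sketch, not a real system's API.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    source: str
    text: str


# Toy corpus standing in for an external search index (contents are illustrative).
CORPUS = [
    Doc("stackoverflow.com/q/1", "In Python list.sort() sorts in place and returns None."),
    Doc("docs.python.org/sorted", "sorted() returns a new sorted list from any iterable."),
]


def retrieve(query: str, corpus: list[Doc], k: int = 1) -> list[Doc]:
    """Naive keyword-overlap ranking; real systems use BM25 or embeddings."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(terms & set(d.text.lower().split())),
        reverse=True,
    )
    return scored[:k]


def answer_with_citations(query: str) -> tuple[str, list[str]]:
    """Attribution comes from the retrieval step, not from model weights."""
    docs = retrieve(query, CORPUS)
    context = " ".join(d.text for d in docs)
    # A real system would hand `context` to the LLM; here we just echo it.
    return context, [d.source for d in docs]


answer, sources = answer_with_citations("does sort return a new list in Python")
print(sources)
```

Note the asymmetry: the retriever can name its sources because it touched them moments ago; asking the bare model for its training-time sources has no such grounding, which is the commenters' objection.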

Stack Overflow, Q&A, and UX

  • Several argue Stack Overflow’s decline started long before LLMs: core questions got answered; moderation culture and corporate metrics alienated both askers and expert answerers.
  • Others recall it as transformational compared to earlier tech-help sites, and see LLMs as parasitic on that corpus.
  • Many users say they prefer LLMs because they are non-judgmental and conversational, in contrast to SO’s reputation for hostility toward beginners.

Societal impacts, regulation, and power

  • Pessimistic commenters foresee AI deepening inequality, with elites using it to entrench power; some call for more socialist policies, UBI, or “automation taxes.”
  • Others dismiss doom narratives, comparing AI to prior disruptive technologies that were net positive but had serious externalities; they favor regulation that balances risk and benefit rather than bans.
  • There’s anticipation that GenAI will move toward ad-supported models, raising worries that AI “slop” plus monetization will further degrade information quality and incentives for authentic human work.