Show HN: Chat with 19 years of HN

Overall Reaction to the Tool

  • Many commenters find the HN chat interface technically impressive, fun to play with, and surprisingly insightful about users, topics, and trends.
  • People enjoyed examples like: “best database according to HN”, “best time to post Show HN”, language popularity stats, retirement-number analysis, and user-behavior summaries.
  • Some hit the free usage limit quickly and asked for more ways to explore, such as pre-generated, browsable analyses or blog-style writeups of interesting queries.

UX, Access, and Pricing

  • Several complaints about friction: mandatory email/login, confusing redirects, wrong URL in the submission, and difficulty seeing the HN dataset at first.
  • Multiple users say they won’t give their email “just to try it”; suggestions include captchas instead of logins and showing value before signup.
  • Common request: let users plug in their own OpenAI/Claude keys or some “Login with ChatGPT”–style billing to avoid another subscription.
  • The creator notes that LLM costs and the need to avoid abuse drive the login wall and pricing, and that the app is roughly break-even.

Use of HN Data, Copyright, and Rights

  • Ongoing debate about whether it’s appropriate to monetize analyses of HN comments:
    • One side: HN is public; the data is available in a public BigQuery dataset and through the API; anything public can be analyzed.
    • Other side: commenters retain copyright; HN only has a license; third parties don’t automatically gain commercial rights just because there’s an API or dataset.
  • A linked prior discussion clarifies that the BigQuery listing, which appears “official”, is a third-party project, not something HN/Y Combinator publishes directly.
  • Some find it especially distasteful that their own contributions are turned into a paid product “sold back” to them.

Privacy, Anonymity, and Doxing Concerns

  • Strong unease about prompts like “What do you think about user X?” and how easily the tool (or other LLMs) can:
    • Aggregate a user’s entire history,
    • Infer real-world identity or other accounts,
    • “Dox” people or link throwaways via writing style.
  • Several say this makes them reconsider posting at all; others argue that the damage is already done because of past scraping and datasets.
  • A distinction is drawn between public records and impersonation: commenters broadly see AI (or humans) role-playing as real individuals as ethically unacceptable.
  • Some propose a convention like “NoAI/NoIndex” in profiles as a soft opt-out signal, while acknowledging it wouldn’t be enforceable.

Technical Aspects and Safety

  • People praise the multi-tool setup (SQL runner, Python transform, charting, search) and how well it orchestrates queries and visualizations.
  • There’s curiosity about safeguards that prevent destructive SQL (e.g., DELETE), with speculation that the database is read-only plus prompt- or tool-level restrictions.
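The speculated safeguards (a read-only database plus a tool-level restriction) can be sketched as follows. This is a minimal illustration, not the tool's actual implementation, which the thread does not describe: it assumes SQLite as a stand-in database, and the `items` table and the helper names `make_demo_db`, `open_read_only`, and `run_query` are hypothetical.

```python
import os
import sqlite3
import tempfile


def make_demo_db(path):
    """Create a tiny stand-in database (hypothetical schema, not the real HN data)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO items (title) VALUES ('Show HN: demo')")
    conn.commit()
    conn.close()


def open_read_only(path):
    """Layer 1: open the database read-only, so writes fail at the engine level."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def run_query(conn, sql):
    """Layer 2: tool-level check that accepts only a single SELECT statement.

    The check is deliberately conservative: it also rejects CTEs (WITH ...)
    and any query containing a semicolon inside a string literal.
    """
    stripped = sql.strip().rstrip(";")
    if ";" in stripped or not stripped.lower().startswith("select"):
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(stripped).fetchall()


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "hn_demo.db")
    make_demo_db(db_path)
    conn = open_read_only(db_path)

    print(run_query(conn, "SELECT id, title FROM items"))

    try:
        run_query(conn, "DELETE FROM items")
    except ValueError as exc:
        print("blocked by tool-level check:", exc)

    try:
        # Bypass the tool-level check to show the engine-level safeguard.
        conn.execute("DELETE FROM items")
    except sqlite3.OperationalError as exc:
        print("blocked by read-only connection:", exc)
```

The two layers are independent: even if a generated query slipped past the string check, the read-only connection would still refuse the write.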

Language Popularity & HN Bias

  • The tool’s outputs suggest Rust and Go dominate by story count and karma, while Lua/Erlang have high per-story scores.
  • Follow-up queries on Show HN titles show Python, JavaScript, Go, and Rust leading by project count.
  • Commenters note:
    • Title-bias (Rust/Go often mentioned in titles),
    • Possible undercounting (e.g., TypeScript, Lisp) due to regex-based detection,
    • A gap between HN “attention” and real-world usage.
  • Some perceive a systemic Rust bias on HN and speculate that YC/startup culture amplifies it.
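The undercounting and title-bias points can be illustrated with a small sketch of regex-based detection over titles. The patterns and sample titles below are assumptions made up for illustration; the thread does not show the tool's actual queries.

```python
import re
from collections import Counter

# Hypothetical detection patterns; the tool's real ones are not shown in the thread.
PATTERNS = {
    "Rust": re.compile(r"\bRust\b"),
    "Go": re.compile(r"\b(?:Go|Golang)\b"),
    "TypeScript": re.compile(r"\bTypeScript\b", re.IGNORECASE),
    "Lisp": re.compile(r"\bLisp\b", re.IGNORECASE),
}

# Made-up titles chosen to expose the failure modes commenters describe.
TITLES = [
    "Show HN: A Redis clone written in Rust",     # counted for Rust
    "Show HN: Go library for parsing PDFs",       # counted for Go
    "Show HN: My TS toolkit for form validation", # TypeScript project, missed
    "Show HN: A Clojure REPL in the browser",     # Lisp dialect, missed
    "Go to the moon: a history of Apollo",        # false positive for Go
]


def count_languages(titles):
    """Count titles that mention each language at least once."""
    counts = Counter()
    for title in titles:
        for lang, pattern in PATTERNS.items():
            if pattern.search(title):
                counts[lang] += 1
    return counts


if __name__ == "__main__":
    counts = count_languages(TITLES)
    for lang in PATTERNS:
        print(f"{lang}: {counts[lang]}")
    # Rust: 1
    # Go: 2
    # TypeScript: 0
    # Lisp: 0
```

Under these assumed patterns, languages that authors name explicitly in titles are counted, abbreviations and dialects are missed, and a common English word such as "Go" picks up a false positive, so the counts measure title wording more than actual usage.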

Ethical Discomfort with AI Over Social Data

  • Multiple users express a general “gross” or “icky” feeling about:
    • AI systems mining social conversations for fine-grained judgment of individuals,
    • Normalizing surveillance-like analysis of casual, in-the-moment discussion.
  • Others counter that public forums are inherently public, but even they acknowledge the emotional shock of seeing an LLM instantly surface and summarize one’s entire online persona.