New accounts on HN more likely to use em-dashes

Evidence about em-dashes and “green” accounts

  • OP’s data show new/“green” accounts use em-dashes ~10× more often than older accounts; initial eyeballing of /newcomments vs /noobcomments gave ~32:1 for em-dash presence.
  • A shared SQLite dataset lets others confirm that top em-dash users look legitimate, but nearly all extreme outliers are green accounts.
  • Additional word-frequency analysis: new accounts disproportionately use terms like “ai”, “actually”, “real”, “built”, “tools”, “agents”, etc. with very low p-values; some commenters note this is suggestive but warn about p‑hacking and correlation vs causation.
  • Even with em-dashes excluded, new accounts are still ~6× more likely to use other formatting tells (lists, arrows).
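The prevalence comparison above is straightforward to reproduce against the shared SQLite dataset. A minimal sketch, using a hypothetical schema (`comments(author, account_age_days, text)`; the real dataset's layout may differ) and an in-memory database with toy rows:

```python
import sqlite3

EM_DASH = "\u2014"

def emdash_rate(rows):
    """Fraction of comments containing at least one em-dash."""
    if not rows:
        return 0.0
    return sum(EM_DASH in text for (text,) in rows) / len(rows)

# Hypothetical schema and toy data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (author TEXT, account_age_days INT, text TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [
        ("green1", 3,    f"This is huge {EM_DASH} not just hype."),
        ("green2", 7,    f"Agents are real {EM_DASH} actually."),
        ("green3", 5,    "Nice article."),
        ("old1",   900,  "I disagree with the premise."),
        ("old2",   1200, "Link to the paper?"),
        ("old3",   2000, f"Typography aside {EM_DASH} good point."),
    ],
)

# "Green" here is an arbitrary cutoff of <14 days of account age.
new = conn.execute("SELECT text FROM comments WHERE account_age_days < 14").fetchall()
old = conn.execute("SELECT text FROM comments WHERE account_age_days >= 14").fetchall()

ratio = emdash_rate(new) / emdash_rate(old)
print(f"new: {emdash_rate(new):.2f}, old: {emdash_rate(old):.2f}, ratio: {ratio:.1f}x")
```

On the real dataset, the same ratio is what the thread reports as ~10× (and ~32:1 for the /newcomments vs /noobcomments eyeballing).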

Alternative explanations and skepticism

  • Several point out em-dashes are auto-inserted by iOS/macOS and some non-English keyboard tools; typography fans have long used them, so false positives are inevitable.
  • Others argue that if that were the main cause, it wouldn’t explain such a large differential specifically in new accounts.
  • Some stress that focusing on a single stylometric signal is fragile; bots can trivially avoid em-dashes, or post-process text to strip “tells”.
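The fragility argument is easy to see concretely: any of the cited tells can be stripped in a one-pass rewrite. A naive sketch (the specific substitutions are my choices, not from the thread):

```python
import re

def strip_tells(text: str) -> str:
    """Rewrite a few well-known stylometric 'tells' (illustrative only)."""
    text = text.replace("\u2014", "-")                          # em-dash -> hyphen
    text = re.sub(r"^\s*[-*\u2022]\s+", "", text, flags=re.M)   # bullet markers
    text = text.replace("\u2192", "->")                         # arrow glyph
    return text

sample = "Agents are real \u2014 actually.\n\u2022 fast\n\u2022 cheap"
print(strip_tells(sample))
```

Because the transformation is this cheap, any detector built on surface punctuation only catches operators who have not bothered to post-process.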

Perceived bot presence and behavior on HN

  • Many report a strong subjective sense that HN and /noobcomments are recently flooded with AI-written posts: bland, formulaic, slightly pro‑AI, often summarizing the article without adding new insight.
  • Common patterns cited: “this is X, not just Y” structures, sanitized PR tone, over-explained lists, conclusion paragraphs to short comments, and phrases like “is real”.
  • Users link specific accounts that posted long, similar comments seconds apart across threads, or amassed high karma from “paragraphs that say nothing”.
  • Others emphasize that the distinction is hard to draw: humans using AI to “polish” their writing and fully automated posting are effectively indistinguishable from the outside.
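Some of the cited phrasing patterns can be checked mechanically. A naive sketch matching the “this is X, not just Y” construction (the regex is my own illustration, not a detector anyone in the thread published):

```python
import re

# Naive matcher for the "this is X, not just Y" construction.
NOT_JUST = re.compile(r"\bis\b[^.?!]*,\s*not (?:just|merely)\b", re.IGNORECASE)

examples = [
    "This is a paradigm shift, not just a tool.",
    "I think the article overstates things.",
]
print([bool(NOT_JUST.search(s)) for s in examples])
```

Like the em-dash signal, this produces false positives on ordinary rhetorical contrast, so it is at best one weak feature among many.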

Reactions to AI slop and impact on writing norms

  • Strong dislike of AI “slop”: verbose, uninteresting, agenda‑pushing, and contributing to a perceived drop in HN comment quality.
  • Several now shun em-dashes, bullets, or “too clean” grammar so as not to be accused of using AI; others deliberately keep typos as a “human signal”.
  • Typography and language nerds resent having to self-censor good punctuation; some vow to “reclaim” the em-dash and ignore accusations.

Motives and risks

  • Proposed motives for LLM bots:
    – Build aged, high‑karma accounts for later shilling or coordinated voting.
    – Product marketing and growth-hacking.
    – Political/ideological astroturfing and narrative control.
    – Simple experimentation or desire for engagement.
  • Some see this as an existential threat to anonymous forums: manufactured consensus becomes cheap, while trust and authenticity erode.

Proposed defenses

  • Suggestions include: invite‑tree systems, stronger rate limits or proof‑of‑work to comment, better automated bot detection (e.g., posting speed, history consistency), clique detection, or views that hide young accounts (/classic, account-age filters).
  • Identity verification is floated but widely criticized as harmful to anonymity, vulnerable to black‑market IDs, and socially undesirable.
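Of the automated-detection ideas floated (posting speed, history consistency), posting speed is the simplest to sketch. A toy burst score counting consecutive comments posted within a short window (the 30-second threshold is an arbitrary assumption, and fast human typists will trigger false positives):

```python
from datetime import datetime, timedelta

def burst_score(timestamps, window=timedelta(seconds=30)):
    """Count pairs of consecutive posts closer together than `window`.

    A persistently high score across threads is one weak bot signal;
    on its own it proves nothing.
    """
    ts = sorted(timestamps)
    return sum(1 for a, b in zip(ts, ts[1:]) if b - a <= window)

t0 = datetime(2024, 1, 1, 12, 0, 0)
human = [t0, t0 + timedelta(minutes=9), t0 + timedelta(hours=2)]
bot   = [t0, t0 + timedelta(seconds=5), t0 + timedelta(seconds=11)]
print(burst_score(human), burst_score(bot))
```

This matches the behavior users reported above: accounts posting long, similar comments seconds apart across threads would score high, while ordinary commenters score near zero.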