New accounts on HN more likely to use em-dashes
Evidence about em-dashes and “green” accounts
- OP’s data show new/“green” accounts use em-dashes ~10× more often than older accounts; initial eyeballing of /newcomments vs /noobcomments gave ~32:1 for em-dash presence.
- A shared SQLite dataset lets others confirm that top em-dash users look legitimate, but nearly all extreme outliers are green accounts.
- Additional word-frequency analysis: new accounts disproportionately use terms like “ai”, “actually”, “real”, “built”, “tools”, “agents”, etc. with very low p-values; some commenters note this is suggestive but warn about p‑hacking and correlation vs causation.
- Even with em-dashes excluded, new accounts are still ~6× more likely to use other formatting tells (lists, arrows).
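
The kind of comparison described above can be sketched in a few lines. This is a minimal illustration, not the OP's actual method: the function names and the sample comments are made up, and it measures em-dash presence per comment (the fraction of comments containing at least one em-dash), which is only one of several reasonable definitions.

```python
EM_DASH = "\u2014"  # the em-dash character, written as an escape


def em_dash_rate(comments):
    """Fraction of comments that contain at least one em-dash."""
    if not comments:
        return 0.0
    hits = sum(1 for text in comments if EM_DASH in text)
    return hits / len(comments)


def rate_ratio(new_comments, old_comments):
    """How many times more often new-account comments contain an em-dash."""
    old_rate = em_dash_rate(old_comments)
    if old_rate == 0:
        # No baseline to compare against; report infinity rather than divide by zero.
        return float("inf")
    return em_dash_rate(new_comments) / old_rate


# Hypothetical sample data, for illustration only.
new_sample = [
    "Great point \u2014 agreed.",
    "This is real \u2014 not hype.",
    "ok",
    "nice",
]
old_sample = [
    "I disagree.", "Source?", "Works for me \u2014 mostly.", "lol",
    "no", "yes", "hm", "eh",
]

print(rate_ratio(new_sample, old_sample))  # 2/4 divided by 1/8 -> 4.0
```

A ratio like this says nothing about why the rates differ, which is the point the skeptical commenters below raise.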
Alternative explanations and skepticism
- Several point out em-dashes are auto-inserted by iOS/macOS and some non-English keyboard tools; typography fans have long used them, so false positives are inevitable.
- Others argue that if that were the main cause, it wouldn’t explain such a large differential specifically in new accounts.
- Some stress that focusing on a single stylometric signal is fragile; bots can trivially avoid em-dashes, or post-process text to strip “tells”.
Perceived bot presence and behavior on HN
- Many report a strong subjective sense that HN and /noobcomments have recently been flooded with AI-written posts: bland, formulaic, slightly pro‑AI, often summarizing the article without adding new insight.
- Common patterns cited: “this is X, not just Y” structures, sanitized PR tone, over-explained lists, concluding paragraphs tacked onto short comments, and phrases like “is real”.
- Users link specific accounts that posted long, similar comments seconds apart across threads, or amassed high karma from “paragraphs that say nothing”.
- Others emphasize how hard the distinction is: humans using AI to “polish” their writing are effectively indistinguishable from full automation.
Reactions to AI slop and impact on writing norms
- Strong dislike of AI “slop”: verbose, uninteresting, agenda‑pushing, and contributing to a perceived drop in HN comment quality.
- Several now avoid em-dashes, bullets, or “too clean” grammar to avoid being accused of using AI; others deliberately keep typos as a “human signal”.
- Typography and language nerds resent having to self-censor good punctuation; some vow to “reclaim” the em-dash and ignore accusations.
Motives and risks
- Proposed motives for LLM bots:
– Build aged, high‑karma accounts for later shilling or coordinated voting.
– Product marketing and growth-hacking.
– Political/ideological astroturfing and narrative control.
– Simple experimentation or desire for engagement.
- Some see this as an existential threat to anonymous forums: manufactured consensus becomes cheap, while trust and authenticity erode.
Proposed defenses
- Suggestions include: invite‑tree systems, stronger rate limits or proof‑of‑work to comment, better automated bot detection (e.g., posting speed, history consistency), clique detection, or views that hide young accounts (/classic, account-age filters).
- Identity verification is floated but widely criticized as harmful to anonymity, vulnerable to black‑market IDs, and socially undesirable.
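
As a rough illustration of the "posting speed" signal mentioned above, the sketch below flags accounts that post two long comments within a short interval. Everything here is an assumption for illustration: the function name, the thresholds (30 seconds, 500 characters), and the (account, timestamp, text) input shape are hypothetical, and nothing in the thread suggests HN uses such a check. As the skeptics note, any single signal like this is easy for a bot to evade.

```python
def flag_fast_posters(comments, max_gap_seconds=30, min_length=500):
    """Return accounts that posted two long comments within max_gap_seconds.

    comments: iterable of (account, unix_timestamp, text) tuples.
    Thresholds are arbitrary illustrative defaults, not tuned values.
    """
    # Collect timestamps of long comments per account.
    long_stamps = {}
    for account, timestamp, text in comments:
        if len(text) >= min_length:
            long_stamps.setdefault(account, []).append(timestamp)

    flagged = set()
    for account, stamps in long_stamps.items():
        stamps.sort()
        # Compare each long comment with the next one in time order.
        for earlier, later in zip(stamps, stamps[1:]):
            if later - earlier <= max_gap_seconds:
                flagged.add(account)
                break
    return flagged


# Hypothetical sample data, for illustration only.
sample = [
    ("a", 1000, "x" * 600),
    ("a", 1010, "y" * 600),   # long comment 10 s after the previous one
    ("b", 1000, "x" * 600),
    ("b", 5000, "y" * 600),   # long comments, but far apart
    ("c", 1000, "short"),
    ("c", 1005, "short"),     # fast, but short comments are ignored
]

print(flag_fast_posters(sample))  # {'a'}
```

A fast human typist pasting a prepared reply would also trip this check, so it would at most feed a review queue rather than decide anything on its own.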