Show HN: Are You in the Weights?

Overall reception

  • Many find the site fun, ego-boosting, and visually appealing (retro/8‑bit aesthetic, clever concept).
  • Others view it as a privacy trap or an instructive but worrying demo of LLM behavior.

Accuracy, hallucinations, and identity collisions

  • Experiences range from eerily accurate summaries (especially for unique surnames, open‑source contributors, academics, or long‑used handles) to complete fabrications.
  • Even people with globally unique names are often misdescribed as politicians, security researchers, or, most commonly, professional athletes and entertainers.
  • Pseudonyms and long-lived online handles are sometimes recognized more accurately than legal names.
  • The “hallucinations” section is imperfect: some spot‑on descriptions are labeled hallucinations, while many wrong ones appear in the “main” section.

Dangerous and defamatory misidentifications

  • Several users are incorrectly labeled as terrorists, murderers, extremists, or crime victims.
  • Some note that the tool often confuses people who have Arabic or uncommon names with sanctioned individuals or bombers.
  • Commenters find these false positives “scary,” especially given reports that LLMs may be used in military or security decision-making.

Scoring, models, and clustering

  • “Strength” is explained as a linear combination of model self‑reported confidence plus bonuses for cross‑model agreement; commenters note LLM confidence is poorly calibrated.
  • The percentile (“Top N%”) is relative to all queries so far, not the broader population.
  • A clusterer merges model outputs into entities and decides what is a hallucination, optimized for recall over precision, leading to many misclassifications.
  • Prompting details are shared: all models receive the same JSON‑only “Who is [name]?” prompt, and clustering runs on a cheaper model.
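
The scoring described above can be illustrated with a minimal sketch. It is not the site's actual code: the weights, the function names, and the choice to average confidences are all assumptions made for illustration, since the thread gives only the general shape (a linear combination of self‑reported confidence plus cross‑model agreement bonuses, and a percentile taken over queries so far).

```python
from bisect import bisect_left

# Hypothetical weights; the thread does not state the real coefficients.
CONFIDENCE_WEIGHT = 0.7
AGREEMENT_BONUS = 0.1  # added once per additional model that names the same entity


def strength(confidences: list[float]) -> float:
    """Score one entity from the self-reported confidences (0..1) of the models naming it."""
    if not confidences:
        return 0.0
    mean_conf = sum(confidences) / len(confidences)
    bonus = AGREEMENT_BONUS * (len(confidences) - 1)
    # Linear combination, capped at 1.0 so the score stays in a fixed range.
    return min(1.0, CONFIDENCE_WEIGHT * mean_conf + bonus)


def top_percent(score: float, all_scores: list[float]) -> float:
    """Return N for "Top N%": the share of queries so far scoring at least `score`.

    `all_scores` is assumed to hold every query seen so far, including this one.
    """
    if not all_scores:
        return 100.0
    ordered = sorted(all_scores)
    below = bisect_left(ordered, score)  # number of scores strictly lower
    return 100.0 * (len(ordered) - below) / len(ordered)
```

Two points from the thread show up directly in this shape: the percentile only ranks a name against other submitted queries, not against the general population, and any miscalibration in the models' self‑reported confidence passes straight through into the strength value.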

Privacy and data handling

  • Commenters strongly criticize the fact that all queries (including real names) were publicly listed on a “latest” leaderboard and exposed via an API; this was later partially mitigated, but the data remain broadly accessible.
  • Lack of a clear privacy policy and presence of tracking/Cloudflare checks raise suspicion about IP/name harvesting.
  • Several argue one should assume any text submitted to random sites will be stored and reused, possibly in future training sets.

UX, cost, and design

  • Praised for design and portraits (generated via an image model), but some report input bugs, intrusive sounds, and rate‑limit errors under load.
  • Running many frontier and smaller models per query is acknowledged as costly; the author describes the site as a non‑commercial “fun hack and science experiment.”

Broader reflections

  • The thread touches on “being in the weights” as a form of weird digital immortality and raises questions about the right to be forgotten.
  • Some are relieved not to appear at all; others note how hard it is to keep real‑life identity separate from online traces once models ingest public data.