4.5M Suspected Fake Stars in GitHub

Role and Meaning of GitHub Stars

  • Many commenters say stars are essentially bookmarks (“I might want to look at this later”), not an endorsement of quality or even evidence of usage.
  • Others treat star count as a rough popularity signal, e.g., when choosing between libraries (“10 stars vs 30k stars”).
  • Several note that GitHub’s own docs present stars as a way to save repos, but an ecosystem has grown that treats them as clout, traction, or credibility.

Incentives and Star-Gaming

  • Stars matter for CVs, perceived legitimacy, VC pitches, open-core traction, and funding; that creates strong incentives to buy or otherwise game them.
  • Some see this as a classic prisoner’s dilemma: if gaming is allowed, not gaming becomes a disadvantage.
  • Hackathon sponsorships that require participants to star a repo, and ads promoting repos, are cited as manipulative, if not strictly fraudulent.
  • One commenter notes the paper’s numbers: millions of suspected fake stars but far fewer unique accounts after de-duplication.

Usefulness of Stars as a Metric

  • Many argue stars are a poor quality metric: trivially easy to click, heavily influenced by age, hype, and personality/brand.
  • Others still find them useful as a first-pass filter or for sorting search results, especially when entering a new ecosystem.
  • There’s skepticism that “N strangers clicked an icon” should ever be treated as a safety or security signal.

Alternative and Composite Signals

  • Indicators commonly suggested as better than star count:
    • Recent commit activity and total commits.
    • Open vs. closed issues and PRs; issue resolution patterns.
    • Number of contributors and dependency usage / reverse dependencies.
    • Clone/download counts or imports (with caveats that these can also be gamed).
  • Several propose multi-factor or third‑party “repo quality scores,” but doubt anyone would pay for such a service.
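
To make the multi-factor idea concrete, here is a minimal sketch of how such a score might blend the signals listed above. It is not something proposed in the thread: the `RepoSignals` fields, the saturation caps, and the weights are all hypothetical choices made for illustration, and a real service would have to tune and defend them.

```python
import math
from dataclasses import dataclass


@dataclass
class RepoSignals:
    """Hypothetical bundle of per-repo signals; field names are illustrative."""
    stars: int
    commits_last_90_days: int
    closed_issues: int
    open_issues: int
    contributors: int
    reverse_dependencies: int


def log_scale(value: int, cap: int) -> float:
    """Map a count to [0, 1] on a log scale, saturating at `cap`."""
    if value <= 0:
        return 0.0
    return min(1.0, math.log1p(value) / math.log1p(cap))


def composite_score(r: RepoSignals) -> float:
    """Weighted blend of signals in [0, 1]. Weights and caps are assumptions."""
    total_issues = r.closed_issues + r.open_issues
    # Share of issues that were closed; 0.0 when the repo has no issues at all.
    issue_resolution = r.closed_issues / total_issues if total_issues else 0.0
    parts = [
        (0.10, log_scale(r.stars, 50_000)),            # stars count, but only a little
        (0.30, log_scale(r.commits_last_90_days, 500)),
        (0.20, issue_resolution),
        (0.15, log_scale(r.contributors, 200)),
        (0.25, log_scale(r.reverse_dependencies, 10_000)),
    ]
    return sum(weight * value for weight, value in parts)


# Made-up example data: a heavily starred but idle repo vs. a quiet, active one.
hyped = RepoSignals(stars=30_000, commits_last_90_days=0, closed_issues=0,
                    open_issues=50, contributors=1, reverse_dependencies=0)
quiet = RepoSignals(stars=10, commits_last_90_days=120, closed_issues=80,
                    open_issues=20, contributors=8, reverse_dependencies=300)
```

With these made-up inputs, `composite_score(quiet)` comes out higher than `composite_score(hyped)`, because stars carry only a tenth of the weight. Every input here can itself be gamed, as commenters point out for clone and download counts, so a blend raises the cost of manipulation rather than removing it.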

Detection, Defense, and Social/Trust Models

  • Some think GitHub should detect and discount fake stars (e.g., by counting only stars from “active developer” accounts); others argue that any rule set is easily automated around.
  • Web‑of‑trust ideas (prioritizing stars from people you follow or friends‑of‑friends) are discussed but criticized as gameable, low-signal, or misaligned with how developers actually use GitHub.
  • Broader point: all metrics become targets (Goodhart’s law), so users must treat any single number with caution.
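
The two filtering ideas discussed in this section can be sketched in a few lines. This is an illustration under stated assumptions, not GitHub's behavior or API: the `Stargazer` record, the age and commit thresholds that define an “active developer,” and the `following` graph are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stargazer:
    """Hypothetical account record; fields are illustrative, not GitHub's API."""
    login: str
    account_age_days: int
    public_commits: int


def is_active_developer(s: Stargazer, min_age_days: int = 90,
                        min_commits: int = 5) -> bool:
    """Assumed definition of 'active': old enough and with some public commits."""
    return s.account_age_days >= min_age_days and s.public_commits >= min_commits


def discounted_star_count(stargazers: list[Stargazer]) -> int:
    """Count stars once per account, keeping only 'active developer' accounts."""
    unique = {s.login: s for s in stargazers}  # de-duplicate by login
    return sum(1 for s in unique.values() if is_active_developer(s))


def trusted_star_count(stargazers: list[Stargazer],
                       following: dict[str, set[str]], me: str) -> int:
    """Web-of-trust variant: count stars from people I follow or they follow."""
    direct = following.get(me, set())
    friends_of_friends: set[str] = set()
    for friend in direct:
        friends_of_friends |= following.get(friend, set())
    trusted = (direct | friends_of_friends) - {me}
    return len({s.login for s in stargazers} & trusted)


# Made-up example data: two plausible developers, two throwaway accounts,
# and one duplicate entry.
stargazers = [
    Stargazer("alice", 2000, 340),
    Stargazer("bot1", 3, 0),
    Stargazer("bot2", 5, 0),
    Stargazer("carol", 400, 12),
    Stargazer("alice", 2000, 340),
]
following = {"me": {"alice"}, "alice": {"carol", "dave"}}
```

Here the raw list has five entries, while `discounted_star_count(stargazers)` and `trusted_star_count(stargazers, following, "me")` both return 2. The sketch also shows why critics are unconvinced: fixed thresholds like these are exactly what a star farm can automate around by aging accounts and pushing trivial commits, and the trust-graph count is empty for anyone who does not use GitHub's follow feature.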