2024-12-29

4.5M Suspected Fake Stars in GitHub

Role and Meaning of GitHub Stars

Many commenters say stars are essentially bookmarks: “I might want to look at this later,” not an endorsement of quality or even usage.
Others treat star count as a rough popularity signal, e.g., when choosing between libraries (“10 stars vs 30k stars”).
Several note that GitHub’s own docs present stars as a way to save repos, but an ecosystem has grown that treats them as clout, traction, or credibility.

Incentives and Star-Gaming

Stars matter for CVs, perceived legitimacy, VC pitches, open-core traction, and funding; that creates strong incentives to buy or otherwise game them.
Some see this as a classic prisoner’s dilemma: if gaming is allowed, not gaming becomes a disadvantage.
Hackathon sponsors that require participants to star a repo, and ads that push repos, are cited as manipulative, if not strictly fraudulent.
One commenter notes the paper’s numbers: millions of suspected fake stars but far fewer unique accounts after de-duplication.
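The gap between star counts and account counts comes from one account starring many repos. A minimal sketch of that de-duplication, using made-up account and repo names rather than any data from the paper:

```python
# Hypothetical (account, repo) star events; names are illustrative only.
suspected_stars = [
    ("acct1", "repoA"),
    ("acct1", "repoB"),
    ("acct1", "repoC"),
    ("acct2", "repoA"),
    ("acct2", "repoA"),  # duplicate event for the same account and repo
]

# A star is one (account, repo) pair, so repeated events collapse to one.
unique_stars = set(suspected_stars)

# The same account behind several stars counts once.
unique_accounts = {account for account, _repo in unique_stars}

star_count = len(unique_stars)
account_count = len(unique_accounts)
```

Here four distinct stars come from only two accounts, which is why the two totals can differ widely.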

Usefulness of Stars as a Metric

Many argue stars are a poor metric of quality: they are trivially easy to give and heavily influenced by a repo's age, hype, and personality or brand.
Others still find them useful as a first-pass filter or for sorting search results, especially when entering a new ecosystem.
There’s skepticism that “N strangers clicked an icon” should ever be treated as a safety or security signal.

Alternative and Composite Signals

Commonly suggested better indicators:
- Recent commit activity and total commits.
- Open vs. closed issues and PRs; issue resolution patterns.
- Number of contributors and dependency usage / reverse dependencies.
- Clone/download counts or imports (with caveats that these can also be gamed).
Several propose multi-factor or third‑party “repo quality scores,” but doubt anyone would pay for such a service.
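One way such a multi-factor score could be assembled is sketched below. The field names, caps, and weights are all assumptions chosen for illustration; nobody in the thread proposed these specific values, and every input can still be gamed.

```python
import math
from dataclasses import dataclass


@dataclass
class RepoSignals:
    """Hypothetical per-repo inputs; how they are collected is out of scope."""
    stars: int
    commits_last_90_days: int
    contributors: int
    closed_issues: int
    open_issues: int
    reverse_dependencies: int


def log_scale(value: int, cap: int) -> float:
    """Map a non-negative count to [0, 1] on a log scale, saturating at cap."""
    if value <= 0:
        return 0.0
    return min(1.0, math.log1p(value) / math.log1p(cap))


def composite_score(s: RepoSignals) -> float:
    """Weighted sum of normalized signals; the weights are arbitrary and sum to 1."""
    total_issues = s.closed_issues + s.open_issues
    closed_ratio = s.closed_issues / total_issues if total_issues else 0.0
    weighted = [
        (0.10, log_scale(s.stars, 50_000)),
        (0.25, log_scale(s.commits_last_90_days, 500)),
        (0.20, log_scale(s.contributors, 200)),
        (0.20, closed_ratio),
        (0.25, log_scale(s.reverse_dependencies, 10_000)),
    ]
    return sum(weight * value for weight, value in weighted)


# Two made-up repos: one with many stars and no activity, one modest but active.
hyped = RepoSignals(30_000, 0, 1, 0, 40, 0)
active = RepoSignals(300, 120, 25, 400, 50, 200)
```

With these assumed weights, the active repo scores higher than the heavily starred but dormant one, because stars contribute only a tenth of the total.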

Detection, Defense, and Social/Trust Models

Some think GitHub should detect and discount fake stars (e.g., only count “active developer” accounts); others argue that any rule set can be easily automated around.
Web‑of‑trust ideas (prioritizing stars from people you follow or friends‑of‑friends) are discussed but criticized as gameable, low-signal, or misaligned with how developers actually use GitHub.
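The web-of-trust idea can be sketched as a per-viewer star count in which only stars from people you follow, or people they follow, carry weight. The `follows` mapping, the function name, and the 0.5 weight for friends-of-friends are assumptions for illustration, not an existing GitHub feature.

```python
def trusted_star_score(viewer, stargazers, follows, fof_weight=0.5):
    """Count stars from the viewer's network; strangers contribute nothing.

    follows maps each user to the set of users they follow (hypothetical data).
    """
    direct = set(follows.get(viewer, set()))

    # Friends-of-friends: followed by someone the viewer follows,
    # excluding direct follows and the viewer themselves.
    second = set()
    for user in direct:
        second |= follows.get(user, set())
    second -= direct
    second.discard(viewer)

    score = 0.0
    for user in set(stargazers):  # each account counts at most once
        if user in direct:
            score += 1.0
        elif user in second:
            score += fof_weight
    return score


# Made-up graph: "me" follows a and b; a and b both follow c.
follows = {"me": {"a", "b"}, "a": {"c"}, "b": {"c", "me"}}
stargazers = {"a", "c", "bot1", "bot2"}
score = trusted_star_score("me", stargazers, follows)
```

In this example the two unknown accounts add nothing, which also shows the low-signal criticism: a viewer who follows few people sees a score near zero for almost every repo.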
Broader point: all metrics become targets (Goodhart’s law), so users must treat any single number with caution.
