Opus 4.6 uncovers 500 zero-day flaws in open-source code

Verification and evidence

  • Many commenters ask whether the “500 zero-day” figure has been independently verified or published as CVEs with CVSS scores.
  • Anthropic’s public writeup is seen as marketing-heavy and example-light; people want a detailed paper with methodology, false-positive rates, and responses from affected projects.
  • Prior Anthropic claims (e.g., “Chinese APT” use of Claude) are cited as reasons to be extra skeptical of their security PR.

Value and limits of LLM-based vulnerability discovery

  • Some security practitioners in the thread say they’re not surprised: LLM-assisted vulnerability research has been progressing for years, and the pattern-heavy, closed-loop nature of the work suits these tools well.
  • Others argue it’s unclear whether these are genuinely “hard to find” bugs or merely low-hanging fruit in very old, complex codebases.
  • A technical example from Ghostscript is noted: the model missed issues under general code analysis but found one by reasoning over the commit history and an earlier, incomplete fix.
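The commit-history angle can be illustrated with a minimal sketch. This is not Anthropic’s pipeline; the throwaway repo, commit messages, and keywords below are invented for illustration, showing one way an analyst (or a model’s tooling) might surface “partial fix” commits worth a second look:

```shell
# Hedged sketch: mine a project's git history for commits that smell like
# incomplete security fixes. Everything below is a toy example.
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo && cd demo
# Fabricate two commits: one security-flavored, one not.
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Fix buffer overflow in PDF parser (partial fix)"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Refactor build scripts"
# Multiple --grep patterns are OR'd together; -i matches case-insensitively.
hits=$(git log --oneline -i --grep='overflow' --grep='CVE' --grep='partial fix')
echo "$hits"
```

Only the security-flavored commit matches, giving a shortlist of diffs to re-examine for whatever the original patch failed to cover.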

OpenClaw tangent

  • There’s disagreement over whether finding ~100 bugs in OpenClaw is “enormous economic value.”
  • Several people hadn’t heard of OpenClaw and doubt its “massive adoption”; others argue that even unpopular or frivolous software is worth hardening if widely installed.

Terminology and bug-bounty “slop”

  • Multiple commenters complain that “zero-day” is used loosely; here it effectively means “previously unknown vulnerability in shipping software,” not a flaw already being exploited in the wild.
  • Maintainers’ experience with AI-generated bug reports (e.g., on curl’s bug bounty) is raised: low-quality “slop” from automated tools and LLMs swamps the real findings.
  • Others counter that serious teams using AI analyzers have already submitted valid, high-impact bugs; the problem is amateurs, not the technique.

Trust, authority, and conflicts of interest

  • A long subthread debates whether to trust claims from well-known security experts who say these results are plausible.
  • Some see this as an argument from authority or note potential conflicts of interest (researchers employed by an LLM vendor); others argue expertise and track record are relevant evidence.

Broader implications and criticisms

  • Concerns include LLMs introducing new vulnerabilities into code, availability/uptime risks if models are embedded in continuous security workflows, and attackers turning the same tools to offense.
  • Some lament that Anthropic is touting defensive uses while restricting access for defenders, effectively keeping the offensive advantage in-house.
  • A few note this may slow “rewrite in a safer language” efforts if LLMs can keep legacy C codebases limping along more safely.