Opus 4.6 uncovers 500 zero-day flaws in open-source code
Verification and evidence
- Many commenters ask whether the “500 zero-day” figure has been independently verified or published as CVEs with CVSS scores.
- Anthropic’s public writeup is seen as marketing-heavy and example-light; people want a detailed paper with methodology, false-positive rates, and responses from affected projects.
- Prior Anthropic claims (e.g., “Chinese APT” use of Claude) are cited as reasons to be extra skeptical of their security PR.
Value and limits of LLM-based vulnerability discovery
- Some security practitioners in the thread say they’re not surprised: LLM-assisted vulnerability research has been progressing for years, and the work suits LLMs because it is pattern-heavy and closed-loop (a candidate bug can be checked immediately against the code).
- Others argue it’s unclear if these are genuinely “hard to find” bugs versus low-hanging fruit in very old, complex codebases.
- A technical example from Ghostscript is noted: the model missed the issue under general code analysis but found it by reasoning over the commit history and spotting an incomplete (partial) fix.
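The commit-history approach described above can be sketched roughly as follows. This is a minimal illustration, not Anthropic's actual pipeline: it assumes commit data already extracted from `git log --name-only`, and the keyword lists (`FIX_HINTS`, `PARTIAL_HINTS`) are hypothetical heuristics. The idea is that files patched repeatedly by fix commits, and commits whose messages hint at incomplete repairs, are candidates for a closer look.

```python
from collections import Counter

# Hypothetical keyword heuristics, for illustration only.
FIX_HINTS = ("fix", "cve", "overflow", "sanitize")
PARTIAL_HINTS = ("partial", "follow-up", "followup", "incomplete", "regression")

def flag_revisit_candidates(commits):
    """commits: list of (message, [touched files]) pairs,
    e.g. parsed from `git log --name-only`.

    Returns (files patched by more than one fix commit,
             messages of commits that look like partial fixes)."""
    fix_counts = Counter()
    partial_fix_messages = []
    for message, files in commits:
        msg = message.lower()
        if any(h in msg for h in FIX_HINTS):
            fix_counts.update(files)
            if any(h in msg for h in PARTIAL_HINTS):
                partial_fix_messages.append(message)
    # A file fixed more than once is likelier to carry an unfinished repair.
    hot_files = [f for f, n in fix_counts.items() if n > 1]
    return hot_files, partial_fix_messages

history = [
    ("Fix buffer overflow in pdf_parse", ["pdf.c"]),
    ("Partial fix for CVE-2023-0000", ["pdf.c"]),
    ("Update documentation", ["README"]),
]
hot, partial = flag_revisit_candidates(history)
# hot → ["pdf.c"]; partial → ["Partial fix for CVE-2023-0000"]
```

An LLM doing the same thing in natural language can go further than keyword matching, of course; the point is only that repeated or hedged fixes in the history are a strong signal of where to audit.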
OpenClaw tangent
- There’s disagreement over whether finding ~100 bugs in OpenClaw is “enormous economic value.”
- Several people hadn’t heard of OpenClaw and doubt its “massive adoption”; others argue that even unpopular or frivolous software is worth hardening if widely installed.
Terminology and bug-bounty “slop”
- Multiple commenters complain that “zero-day” is used loosely; here it really just means “previously unknown vulnerability in shipping software.”
- Maintainers’ experience with AI-generated bug reports (e.g., in curl) is raised: lots of low-quality “slop” from automated tools and LLMs swamping real findings.
- Others counter that serious teams using AI analyzers have already submitted valid, high-impact bugs; the problem is amateurs, not the technique.
Trust, authority, and conflicts of interest
- A long subthread debates whether to trust claims from well-known security experts who say these results are plausible.
- Some see this as an argument from authority or note potential conflicts of interest (researchers employed by an LLM vendor); others argue expertise and track record are relevant evidence.
Broader implications and criticisms
- Concerns include: LLMs introducing new vulnerabilities into code they touch, dependence on vendor API uptime/availability if models are wired into continuous security workflows, and the risk that attackers will use the same tools.
- Some lament that Anthropic is touting defensive uses while restricting access for defenders, effectively keeping the offensive advantage in-house.
- A few note this may slow “rewrite in safer language” efforts if LLMs can keep legacy C codebases limping along more safely.