2026-05-22

Project Glasswing: An Initial Update

Overall reaction to Mythos / Glasswing

Many see Mythos as a genuine “step change” in AI‑assisted vulnerability discovery, citing:
- High reported true‑positive rates (~90%) versus traditional tools.
- Partner anecdotes (Firefox, Cloudflare, banks, etc.) and UK/third‑party evaluations showing strong offensive capability and end‑to‑end exploit generation.
Others argue this is mostly marketing:
- Smaller or open‑weight models, with similar harnesses, reportedly reproduced Anthropic’s showcased findings.
- Some security practitioners report Mythos as “not obviously better” than other modern AI‑powered tools in their own codebases.

Model capability vs. harness and methodology

Repeated theme: results depend heavily on the harness, prompts, and compute budget, not just the base model.
Several commenters point out that earlier runs with Opus 4.6 used weaker setups than the Mythos runs, so headline “10x more bugs” claims may conflate model and methodology.
People report good results with orchestrators (e.g., a strong cyber model directing many cheap sub‑agents) plus static analysis/fuzzing, suggesting Mythos‑like performance may be achievable with enough engineering and tokens.
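The orchestrator pattern described above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual harness: `static_analysis_stub` and `sub_agent_stub` are hypothetical stand-ins for a real analyzer and a real model call, and the string-matching "detection" exists only so the example runs without external services. The point is the shape: fan work out to cheap workers, merge their findings, and rank the ones that more than one source agrees on.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    kind: str
    source: str  # which tool or sub-agent reported it


def static_analysis_stub(path, code):
    """Hypothetical stand-in for a static analyzer: flags strcpy calls."""
    return [
        Finding(path, i, "unsafe-strcpy", "static")
        for i, text in enumerate(code.splitlines(), start=1)
        if "strcpy(" in text
    ]


def sub_agent_stub(path, code):
    """Hypothetical stand-in for a cheap model call reviewing one file."""
    patterns = {"strcpy(": "unsafe-strcpy", "gets(": "unsafe-gets"}
    findings = []
    for i, text in enumerate(code.splitlines(), start=1):
        for needle, kind in patterns.items():
            if needle in text:
                findings.append(Finding(path, i, kind, "agent"))
    return findings


def orchestrate(files, workers=4):
    """Fan out one scan per file, then merge and rank the findings.

    Returns (location, sources) pairs, with findings corroborated by
    more sources listed first.
    """
    def scan(item):
        path, code = item
        return static_analysis_stub(path, code) + sub_agent_stub(path, code)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(scan, files.items()))

    corroboration = {}
    for batch in batches:
        for f in batch:
            key = (f.path, f.line, f.kind)
            corroboration.setdefault(key, set()).add(f.source)

    # Most-corroborated first; ties broken by location for stable output.
    return sorted(corroboration.items(), key=lambda kv: (-len(kv[1]), kv[0]))


if __name__ == "__main__":
    demo = {"a.c": "char b[8];\nstrcpy(b, s);\n", "b.c": "gets(buf);\n"}
    for location, sources in orchestrate(demo):
        print(location, sorted(sources))
```

In a real setup the stubs would be replaced by actual tool and model invocations, and the ranking step is where a stronger model or a human triager would spend attention. That is consistent with the thread's point that much of the result depends on the harness.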

Numbers, validation, and confusion

Discussion scrutinizes Anthropic’s figures:
- 10k+ vulnerabilities vs. ~1.7k manually assessed vs. hundreds of published advisories; some find the math opaque.
- Confusion over “vulnerabilities” vs. CVEs vs. bugs, and over severity re‑ratings by Anthropic.
Some fear double‑counting or rediscovery of already‑fixed issues; others note responsible disclosure timelines mean many details are intentionally withheld for now.
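To make the funnel and the double-counting worry concrete, here is a small sketch. The inputs are assumptions for illustration only: 10,000 and 1,700 are rounded from the "10k+" and "~1.7k" figures quoted in the discussion, and 300 is a placeholder for "hundreds", not a reported number. The finding records and their field names are likewise hypothetical.

```python
def funnel_shares(reported, assessed, published):
    """Express each funnel stage as a share of the reported total."""
    if reported <= 0 or not (reported >= assessed >= published >= 0):
        raise ValueError("expected reported >= assessed >= published >= 0")
    return {
        "assessed_share": assessed / reported,
        "published_share": published / reported,
    }


def count_unique_open(findings):
    """Count distinct findings, skipping ones already fixed upstream.

    Two findings are treated as the same issue when they share project,
    file, and bug class. This is a deliberately crude fingerprint.
    """
    seen = set()
    for f in findings:
        if f["already_fixed"]:
            continue
        seen.add((f["project"], f["file"], f["bug_class"]))
    return len(seen)


if __name__ == "__main__":
    # Placeholder inputs; see the caveats in the text above.
    print(funnel_shares(10_000, 1_700, 300))

    sample = [
        {"project": "libfoo", "file": "parse.c", "bug_class": "oob-read", "already_fixed": False},
        {"project": "libfoo", "file": "parse.c", "bug_class": "oob-read", "already_fixed": False},
        {"project": "libfoo", "file": "io.c", "bug_class": "uaf", "already_fixed": True},
    ]
    print(count_unique_open(sample))
```

Under these placeholder inputs only about 17% of reported items were manually assessed, which is why commenters found it hard to tell how the headline count relates to confirmed, distinct, still-open vulnerabilities.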

Cost, access, and incentives

Mythos runs are described as extremely compute‑intensive and expensive per real vulnerability, with human triage and patching now the bottleneck.
Glasswing limits access to select “systemically important” partners and (later) governments; this is seen as both:
- A safety measure (reduce widespread offensive use before patches).
- A business/IPO and compute‑rationing strategy, and a way to delay model distillation by competitors.

Security landscape and future of software

Consensus: AI‑assisted tools (Mythos, Codex Security, others) already find large numbers of serious issues; attacks and defenses will both be supercharged.
Concerns raised:
- Well‑funded orgs will harden fast, while smaller and open‑source projects may be left exposed.
- Vendors may profit from models that both introduce bugs (via codegen) and sell scanners to fix them.
Broader speculation about a future where most code is AI‑written, humans focus on review/architecture, and regulatory pressure may force automated scanning into release pipelines.
