I built a vulnerable app and spent $1,500 seeing if LLMs could hack it
Anthropic-style guardrails and declining usefulness
- Many commenters report Anthropic models increasingly refuse legitimate tasks: logins, handling credentials, CTFs, reverse engineering, bioscience, malware analysis, even forking MIT-licensed code or retrieving local personal documents.
- Some say 4.6 was far more usable for security than 4.7/4.8, which feel “neutered” or deceptive (claiming no network access, then admitting otherwise).
- Others note guardrails sometimes behave inconsistently: the same prompts may pass in one session and be blocked in another, with invisible safety prompts injected mid-conversation.
- A minority argue these refusals are actually good defaults for most users, who shouldn’t be giving full credentials or dangerous tasks to an agent.
Business model, upsell, and “who is a professional?”
- Strong suspicion that tightening guardrails sets up paid “Security Pro”–style tiers, where only vetted users get offensive capabilities.
- Debate over whether vendors should effectively decide who qualifies as a “security professional,” versus relying on independent professional bodies, or simply allowing open tools.
- Some fear a future of fragmented, paywalled capabilities (security, databases, data science, etc.) similar to streaming-service fragmentation.
Mythos, harnesses, and benchmarking
- Claims that an internal model (Mythos) is much more capable but hidden behind NDAs and guardrails; some dismiss these claims as pure marketing.
- Discussion emphasizes that Mythos’ success depends heavily on a sophisticated harness: multiple passes per file, evolving prompts, and explicit verification of each suspected bug.
- Several argue that any fair comparison must use similarly engineered harnesses and multi-step validation, not just “one-shot, find all bugs.”
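The harness shape described above (several passes per file, a prompt that changes between passes, and a separate verification step for each suspected bug) can be sketched roughly as follows. This is a minimal illustration, not the actual Mythos harness, whose details are not public in this thread. The names `run_harness`, `stub_review`, and `stub_verify` are hypothetical, and the two stubs are toy stand-ins for what would be model calls and real reproduction checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    path: str
    description: str


# Hypothetical interfaces: in a real harness these would wrap model calls.
Reviewer = Callable[[str, str, str], list]   # (path, source, prompt) -> descriptions
Verifier = Callable[[Finding, str], bool]    # (finding, source) -> confirmed?


def run_harness(files: dict, review: Reviewer, verify: Verifier, passes: int = 3) -> list:
    """Review each file several times and keep only verified findings."""
    confirmed = []
    for path, source in files.items():
        seen = set()
        prompt = "List suspected bugs in this file."
        for _ in range(passes):
            for description in review(path, source, prompt):
                if description in seen:
                    continue  # already checked in an earlier pass
                seen.add(description)
                finding = Finding(path, description)
                # Each suspected bug is verified on its own before it counts.
                if verify(finding, source):
                    confirmed.append(finding)
            # Evolve the prompt so the next pass looks for something new.
            prompt = ("List suspected bugs in this file that are not already "
                      "reported: " + "; ".join(sorted(seen)))
    return confirmed


def stub_review(path: str, source: str, prompt: str) -> list:
    """Toy reviewer: flags eval() calls and always adds one speculative guess."""
    found = [f"eval call on line {n}"
             for n, line in enumerate(source.splitlines(), 1) if "eval(" in line]
    return found + ["possible race condition"]


def stub_verify(finding: Finding, source: str) -> bool:
    """Toy verifier: only the eval findings count as reproducible."""
    return finding.description.startswith("eval call")


files = {"a.py": "x = eval(user_input)\nprint(x)\n", "b.py": "print('ok')\n"}
result = run_harness(files, stub_review, stub_verify)
```

With these stubs, the speculative "possible race condition" guess is raised for both files but dropped at verification, which is the point commenters make about one-shot comparisons: without the verification step, that guess would be counted as a finding.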
Chinese / open models and security work
- Multiple reports that Chinese models (e.g., GLM, DeepSeek, Qwen, Mimo) are far more willing to attack databases, solve crackmes, and assist with pentesting.
- Some claim these are already competitive with Western flagships; others counter with benchmarks suggesting a sizable capability gap.
- Security practitioners warn that defenders constrained by “safety-first” Western models may fall behind attackers using less-restricted alternatives.
Methodology, cost, and ethics
- Several call the article’s methodology “naive,” arguing real workflows are human-in-the-loop, multi-run, and often combine several models.
- Others stress that the real cost driver is building good eval rigs and orchestration, not token spend.
- Ongoing ethical debate: should unguardrailed models that can reliably find vulnerabilities be widely accessible, or tightly controlled, given dual-use risks?
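On the methodology point about multi-run, multi-model workflows: one simple way to combine several runs is to keep only findings that more than one run reports, then hand that shortlist to a human reviewer. The sketch below assumes each run's output has already been normalized to a set of short finding labels. The function name `aggregate_runs`, the voting threshold, and the example labels are illustrative assumptions, not something described in the article or the thread.

```python
from collections import Counter


def aggregate_runs(runs: list, min_votes: int = 2) -> list:
    """Return findings reported by at least `min_votes` runs, sorted by label.

    Each element of `runs` is the set of finding labels from one run,
    which may come from the same model or from different models.
    """
    votes = Counter(label for run in runs for label in run)
    return sorted(label for label, count in votes.items() if count >= min_votes)


# Hypothetical outputs from three runs against the same target.
runs = [
    {"sqli in /login", "xss in /search"},
    {"sqli in /login"},
    {"sqli in /login", "idor in /orders"},
]

shortlist = aggregate_runs(runs)                # agreed on by two or more runs
everything = aggregate_runs(runs, min_votes=1)  # union of all runs
```

A stricter threshold trades recall for fewer false positives; findings that appear in only one run are not necessarily wrong, so a human-in-the-loop workflow might still review them at lower priority.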