I built a vulnerable app and spent $1,500 seeing if LLMs could hack it

Anthropic-style guardrails and declining usefulness

  • Many commenters report Anthropic models increasingly refuse legitimate tasks: logins, handling credentials, CTFs, reverse engineering, bioscience, malware analysis, even forking MIT-licensed code or retrieving local personal documents.
  • Some say 4.6 was far more usable for security than 4.7/4.8, which feel “neutered” or deceptive (claiming no network access, then admitting otherwise).
  • Others note that guardrails behave inconsistently: the same prompt may pass in one session and be blocked in another, and some report invisible safety prompts being injected mid-conversation.
  • A minority argue these refusals are actually good defaults for most users, who shouldn’t be giving full credentials or dangerous tasks to an agent.

Business model, upsell, and “who is a professional?”

  • Strong suspicion that tightening guardrails sets up paid “Security Pro”–style tiers, where only vetted users get offensive capabilities.
  • Debate over whether vendors should effectively decide who qualifies as a “security professional,” rather than deferring to independent professional bodies or simply allowing open tools.
  • Some fear a future of fragmented, paywalled capabilities (security, databases, data science, etc.) similar to streaming-service fragmentation.

Mythos, harnesses, and benchmarking

  • Claims that an internal model (Mythos) is much more capable, but hidden behind NDAs and guardrails; some see related comments as pure marketing.
  • Discussion emphasizes that Mythos’ success depends heavily on a sophisticated harness: multiple passes per file, evolving prompts, and explicit verification of each suspected bug.
  • Several argue that any fair comparison must use similarly engineered harnesses and multi-step validation, not just “one-shot, find all bugs.”
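
To make the harness idea in the bullets above concrete, here is a minimal sketch of the loop commenters describe: several passes per file, a prompt that evolves with what has already been reported, and an explicit verification step for every suspected bug. It is an illustration under stated assumptions, not the harness used for Mythos or in the article. `Finding`, `run_harness`, `scan`, and `verify` are hypothetical names; `scan` and `verify` stand in for whatever model calls a real rig would make.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set


@dataclass(frozen=True)
class Finding:
    path: str
    line: int          # 1-based line number of the suspected bug
    description: str


# Hypothetical callables standing in for model calls; neither is a real API.
Scanner = Callable[[str, str, str], Iterable[Finding]]  # (prompt, path, source)
Verifier = Callable[[Finding, str], bool]                # (finding, source)


def run_harness(
    files: Dict[str, str],
    scan: Scanner,
    verify: Verifier,
    passes: int = 3,
    base_prompt: str = "List security bugs in this file.",
) -> List[Finding]:
    """Return only the findings that survived explicit verification."""
    confirmed: List[Finding] = []
    for path, source in files.items():
        seen: Set[Finding] = set()
        prompt = base_prompt
        for _ in range(passes):
            found_new = False
            for finding in scan(prompt, path, source):
                if finding in seen:
                    continue  # already handled in this or an earlier pass
                seen.add(finding)
                found_new = True
                # Every suspected bug is checked on its own before it counts.
                if verify(finding, source):
                    confirmed.append(finding)
            if not found_new:
                break  # another pass is unlikely to surface anything new
            # Evolve the prompt so the next pass looks past known candidates.
            reported = "; ".join(
                f"line {f.line}: {f.description}"
                for f in sorted(seen, key=lambda f: f.line)
            )
            prompt = f"{base_prompt} Already reported, do not repeat: {reported}"
    return confirmed
```

A comparison in the spirit of the thread would run every model through the same `run_harness` settings, so that differences in results reflect the models rather than the scaffolding around them.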

Chinese / open models and security work

  • Multiple reports that Chinese models (e.g., GLM, DeepSeek, Qwen, MiMo) are far more willing to attack databases, solve crackmes, and assist with pentesting.
  • Some claim these are already competitive with Western flagships; others counter with benchmarks suggesting a sizable capability gap.
  • Security practitioners warn that defenders constrained by “safety-first” Western models may fall behind attackers using less-restricted alternatives.

Methodology, cost, and ethics

  • Several call the article’s methodology “naive,” arguing real workflows are human-in-the-loop, multi-run, and often combine several models.
  • Others stress that the real cost driver is building good eval rigs and orchestration, not token spend.
  • Ongoing ethical debate: should unguardrailed models that can reliably find vulnerabilities be widely accessible, or tightly controlled, given dual-use risks?
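
As a rough illustration of the "multi-run, several models, human-in-the-loop" workflow raised in the methodology bullets above, the sketch below merges findings across runs and splits them into those corroborated by more than one run and those left for a person to review. The function name `triage`, the `min_votes` threshold, and the `(path, line)` finding keys are assumptions made for this example; nothing here comes from the article.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def triage(
    runs: Iterable[Tuple[str, Iterable[Hashable]]],
    min_votes: int = 2,
) -> Tuple[Dict[Hashable, List[str]], Dict[Hashable, List[str]]]:
    """Split findings into (corroborated, needs_review).

    `runs` is a sequence of (model_name, findings) pairs, one per run.
    Each returned dict maps a finding to the models that reported it.
    """
    run_votes = defaultdict(set)   # finding -> ids of runs that reported it
    models = defaultdict(set)      # finding -> names of models that reported it
    for run_id, (model, findings) in enumerate(runs):
        for key in findings:
            run_votes[key].add(run_id)  # a repeat within one run counts once
            models[key].add(model)

    corroborated = {
        k: sorted(models[k]) for k in sorted(run_votes)
        if len(run_votes[k]) >= min_votes
    }
    # Single-run findings are not discarded; they go to a human reviewer.
    needs_review = {
        k: sorted(models[k]) for k in sorted(run_votes)
        if len(run_votes[k]) < min_votes
    }
    return corroborated, needs_review
```

For example, `triage([("model-a", [("app.py", 2)]), ("model-b", [("app.py", 2), ("db.py", 10)])])` would mark `("app.py", 2)` as corroborated by both models and queue `("db.py", 10)` for review.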