The hacker sent by Anthropic to calm the government's nerves about AI safety
Perceived Backfire of Anthropic’s AI-Safety Messaging
- Many argue Anthropic and other labs hyped AI as unprecedentedly powerful and dangerous, then acted surprised when strong regulation and export controls appeared.
- Several see this as “fearmongering” and PR that backfired: talking up existential risk, calling for restrictions on others, then getting hit first when government needed a high-profile example.
- Others respond that distinguishing useful from harmful regulation is legitimate, and that criticizing an overreaction doesn’t contradict earlier safety concerns.
Government Motives and Process
- One camp sees the action as 99% political drama: a corrupt, authoritarian-leaning administration using a rare, extreme national-security mechanism to punish a non-cooperative firm.
- Another camp says Anthropic’s own claims about Mythos/Fable being uniquely dangerous gave regulators all the justification they needed.
- Some connect this to earlier conflict over Anthropic’s terms banning lethal combat planning and mass surveillance, claiming the DoD was angered and the White House coordinated a punitive response.
- Complaints include lack of due process, unprecedented “no non-U.S. citizens” restrictions, and a 90-minute compliance window.
Mythos/Fable Capabilities and Jailbreaks
- Posters note that Fable (Mythos with a strong safety classifier) was marketed as exceptionally dangerous yet well-guarded.
- Critics highlight that an early “jailbreak” appeared quickly, undermining claims of safety testing and external bounties.
- Others counter that the cited “jailbreak” was just normal bug-fixing plus test generation, not a serious exploit-creation event.
Regulation, Open-Weights, and Liability
- Debate centers on claims that Anthropic pushed for bans on open-weight models:
  - One side says the combination of “full shutdown” obligations and liability for downstream misuse in laws like California’s SB 1047 effectively bans open-source.
  - Another points to later amendments clarifying that the shutdown capability can live in hardware and excludes derivatives beyond the developer’s control, arguing that reading the law as an outright ban is an overreading.
- Broader worry: making developers liable for all downstream harms could chill hosting and investment.
Lobbying, Power, and Industry Naivety
- Multiple commenters say Anthropic was naïve: in the current environment, you need lobbyists and “administration insiders,” not technical experts, to manage risk.
- Some refuse to accept this as normal, calling it extortion and market manipulation; others frame it as “Business 101” when facing a state with ultimate power.
Broader AI Impact and Hype
- Views on AI risk are split:
  - Some emphasize real harms already visible (CAPTCHAs broken, astroturfing, degraded trust in essays and open source).
  - Others say early models like GPT‑2 were overhyped “gibberish generators,” and current frontier risk narratives may still be exaggerated.
- Several expect the current AI bubble to resemble the dot-com era: lots of hype, real but limited long-term utility once the bubble pops.
Meta and Community Dynamics
- There is brief discussion of moderation quality, flagged comments, and the difficulty of fairly enforcing guidelines at HN’s scale.