It seems like the AI crawlers learned how to solve the Anubis challenges

Role and Limits of Anubis / PoW

  • Commenters stress Anubis was never a “bot detector” so much as a rate/cost limiter for abusive traffic, especially from rotating residential IPs that defeat IP-based throttling.
  • It works by requiring a SHA-256 proof-of-work once per client/session and issuing a JWT; scrapers can then amortize that one-time cost over many requests, so large crawlers are only mildly inconvenienced (a simplified sketch of the flow follows this list).
  • Several note that if a normal browser can run the JS, a headless browser can too. The move from curl/Go clients to full Chromium was seen as inevitable.
  • Some argue PoW is “security theater”: the cost per page is orders of magnitude too low relative to AI companies’ compute, especially given optimization and batching.
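
As a rough illustration of that flow (a simplified sketch, not Anubis’s actual implementation; the difficulty target, token format, and signing scheme below are assumptions): the client brute-forces a nonce until the SHA-256 digest of challenge-plus-nonce has enough leading zero bits, and the server then hands back a signed token it accepts for the rest of the session.

```python
import hashlib
import hmac
import os
import time

DIFFICULTY_BITS = 16          # assumed difficulty target, not Anubis's real setting
SERVER_KEY = os.urandom(32)   # stand-in for the server's token-signing key


def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: bytes, difficulty: int = DIFFICULTY_BITS) -> int:
    """Client side: brute-force a nonce whose digest meets the difficulty target."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1


def issue_token(client_id: str, ttl: int = 3600) -> str:
    """Server side: HMAC-signed session token, a stand-in for the JWT Anubis issues."""
    payload = f"{client_id}:{int(time.time()) + ttl}"
    sig = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"


def verify_and_issue(challenge: bytes, nonce: int, client_id: str) -> str | None:
    """Server side: check the proof of work once, then hand out a long-lived token."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    if leading_zero_bits(digest) < DIFFICULTY_BITS:
        return None
    return issue_token(client_id)


challenge = os.urandom(16)   # delivered to the client with the challenge page
token = verify_and_issue(challenge, solve(challenge), "client-123")
print(token)                 # subsequent requests just present this token
```

The amortization argument is visible in the numbers: at an assumed 16-bit target the client does about 2^16 ≈ 65,000 hashes exactly once, which takes milliseconds to a fraction of a second even in browser JavaScript, while the resulting token then covers every request made during its lifetime.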

Economics and Alternatives (402, Micropayments, “Useful Work”)

  • Many propose “402 Payment Required”–style schemes or Cloudflare-like pay-per-crawl/x402 to charge AI crawlers directly and shift costs back onto them (a minimal 402 flow is sketched after this list); concerns include fees, taxes, exclusion of low-income users, and stronger DRM/copyright incentives.
  • Ideas include memory-hard PoW (Argon2, scrypt), per-resource hashes, and tying challenges to limited request quotas (see the scrypt sketch after this list), but there’s skepticism that any tuning can meaningfully burden data centers without punishing ordinary users.
  • Some suggest embedding “useful work” (cryptomining, protein folding) in PoW; others strongly oppose normalizing web cryptominers and note that making work simultaneously useful, verifiable, and low-latency is unsolved.
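
A minimal sketch of what a 402-based flow could look like at the HTTP level, assuming made-up header names and a stubbed receipt check (the real x402 and Cloudflare pay-per-crawl proposals define their own headers and settlement mechanisms): unpaid requests get a price quote and a 402; requests carrying a valid receipt get the content.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

PRICE_USD = "0.001"                       # hypothetical per-request price
VALID_RECEIPTS = {"demo-receipt-token"}   # stub; a real scheme would verify payment proofs


class PayPerCrawlHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        receipt = self.headers.get("X-Payment-Receipt")   # hypothetical header name
        if receipt not in VALID_RECEIPTS:
            # No valid payment attached: quote a price and withhold the content.
            self.send_response(402)
            self.send_header("X-Payment-Price-USD", PRICE_USD)                      # hypothetical
            self.send_header("X-Payment-Endpoint", "https://pay.example/checkout")  # hypothetical
            self.end_headers()
            self.wfile.write(b"Payment required to crawl this resource.\n")
            return
        # Paid request: serve the page as usual.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Hello, paying crawler.\n")


if __name__ == "__main__":
    HTTPServer(("localhost", 8402), PayPerCrawlHandler).serve_forever()
```

The hard part of these proposals is not the handler but the settlement layer behind it, which is exactly where the thread’s concerns about fees, taxes, and excluded users come in.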

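A sketch of the memory-hard variant using scrypt from Python’s standard library; the cost parameters and difficulty target below are illustrative assumptions, not a recommendation.

```python
import hashlib
import os

# Assumed scrypt parameters: n=2**14, r=8, p=1 costs roughly 16 MB of memory per
# evaluation, which is what makes the work memory-hard rather than purely CPU-bound.
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1
DIFFICULTY_BITS = 6   # assumed target, kept low so the demo finishes in a few seconds


def scrypt_pow(challenge: bytes, nonce: int) -> bytes:
    """One memory-hard evaluation over challenge and nonce."""
    return hashlib.scrypt(
        nonce.to_bytes(8, "big"),
        salt=challenge,
        n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P,
        dklen=32,
    )


def meets_target(digest: bytes, bits: int = DIFFICULTY_BITS) -> bool:
    """True if the digest starts with the required number of zero bits."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def solve(challenge: bytes) -> int:
    """Client side: roughly 2**DIFFICULTY_BITS expensive evaluations on average."""
    nonce = 0
    while not meets_target(scrypt_pow(challenge, nonce)):
        nonce += 1
    return nonce


challenge = os.urandom(16)
nonce = solve(challenge)
assert meets_target(scrypt_pow(challenge, nonce))
print("solved with nonce", nonce)
```

Each evaluation at these parameters touches roughly 16 MB of memory, which is what is supposed to blunt GPU- and ASIC-style batching; the counterargument in the thread is that phones and older laptops pay the same memory cost, so the tuning dilemma does not go away.
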
Impact of AI Crawlers on the Open Web

  • Several operators of forges and personal sites report massive, robots.txt-ignoring scraping that hammers expensive endpoints (e.g., git blame, logs) and drives up bandwidth/CDN bills or causes slowdowns/DoS.
  • Others say they see little such traffic and suspect this is mainly a problem for highly visible or code-heavy sites.
  • There is worry that non-commercial sites will disappear or retreat behind private/overlay networks, geoblocks, or paywalls, contributing to the “balkanization” of the web.

Legal, Ethical, and Normative Arguments

  • One camp: public web content is fair game for crawling unless it causes clear harm (e.g., takes sites down); mandatory robots.txt compliance or anti-crawling laws risk DRM-like regimes.
  • The other camp: ignoring robots.txt and overwhelming small hosts is abusive, and there should be legal penalties (e.g., treating circumvention of systems like Anubis as bypassing “digital locks” under DMCA-style statutes).
  • Debate hinges on whether publishing for humans implies consent to large-scale machine reuse and on the difficulty of cross-border enforcement.

Critiques of Anubis and Broader Arms Race

  • Criticisms: Anubis hurts UX (JS dependence, added delays), breaks archiving and indexing unless carefully configured, and doesn’t stop determined AI crawlers, only the “dumbest” bots.
  • Supporters counter that even partial filtering and higher marginal costs are valuable for donation-funded services that just want to avoid being overrun.
  • Some prefer alternative tactics: serving LLM-generated junk or honeypot link mazes to waste crawler resources or poison training data (a minimal maze handler is sketched below); others experiment with IPv6-only sites, with mixed reports on effectiveness.
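
A minimal sketch of the link-maze idea (hypothetical paths and page layout): every maze URL deterministically yields a page of links to further maze URLs, so a crawler that ignores robots.txt and follows them burns requests indefinitely, while regular visitors never encounter the maze.

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

MAZE_PREFIX = "/maze/"   # hypothetical path; it would also be listed under Disallow in robots.txt
LINKS_PER_PAGE = 10


def maze_page(path: str) -> bytes:
    """Deterministically generate a page of links to further maze pages."""
    seed = hashlib.sha256(path.encode()).hexdigest()
    links = []
    for i in range(LINKS_PER_PAGE):
        # Child URLs are derived from the current URL, so the tree of maze pages
        # is effectively endless while staying cheap to generate on the fly.
        child = hashlib.sha256(f"{seed}:{i}".encode()).hexdigest()[:16]
        links.append(f'<a href="{MAZE_PREFIX}{child}">{child}</a>')
    return ("<html><body>" + "<br>".join(links) + "</body></html>").encode()


class MazeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not self.path.startswith(MAZE_PREFIX):
            self.send_response(404)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(maze_page(self.path))


if __name__ == "__main__":
    HTTPServer(("localhost", 8080), MazeHandler).serve_forever()
```

Because pages are generated purely from the URL, the host keeps no state and serves the maze cheaply, which is the main design appeal of this tactic.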