Cursor's latest “browser experiment” implied success without evidence

Whether the “AI-built browser” ever worked

  • Multiple people cloned the repo and found that none of the last ~100 commits passed cargo check; the codebase generally didn’t compile or run (a rough reproduction sketch follows this list).
  • After public scrutiny, a later commit finally got cargo check to pass and the browser to run (barely), but the git history suggests manual human intervention rather than purely autonomous agents.
  • Even when compiled, reports describe it as “tragically broken”: extremely slow page loads, basic sites failing, and JavaScript apparently not executing (e.g., the Acid3 test page asks the user to enable JS).
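
  As a rough, hedged sketch of the kind of check people ran (not anyone’s exact procedure), a small Rust helper could shell out to git and cargo, walk the last ~100 commits of a local clone, and count how many fail cargo check. The commit count and flags are assumptions; it needs git and cargo on PATH and must be run from inside the cloned repo:

    // Sketch only: walks recent commits and runs `cargo check` on each.
    // Assumes a local clone; leaves the repo on a detached HEAD, so restore
    // your branch (e.g. `git checkout main`) when finished.
    use std::process::Command;

    fn main() {
        // List the last 100 commit hashes on the current branch, newest first.
        let rev_list = Command::new("git")
            .args(["rev-list", "--max-count=100", "HEAD"])
            .output()
            .expect("failed to run git rev-list");
        let hashes = String::from_utf8_lossy(&rev_list.stdout);

        let mut failures = 0;
        for hash in hashes.lines() {
            // Check out the commit, then type-check without building binaries.
            let checked_out = Command::new("git")
                .args(["checkout", "--quiet", hash])
                .status()
                .expect("failed to run git checkout");
            if !checked_out.success() {
                continue;
            }
            let check = Command::new("cargo")
                .args(["check", "--quiet"])
                .status()
                .expect("failed to run cargo check");
            if !check.success() {
                failures += 1;
                println!("FAIL {hash}");
            }
        }
        println!("{failures} of the sampled commits failed cargo check");
    }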

“From scratch” vs reuse of existing browser components

  • The CEO’s public messaging emphasized “from-scratch” Rust rendering and a custom JS VM.
  • Commenters inspecting Cargo.toml and source files found heavy use of existing Servo-related crates (HTML/CSS parsers, selectors, the Taffy layout library) and QuickJS / vendored JS parser code (an illustrative Cargo.toml excerpt follows this list).
  • There’s disagreement on how much is original: critics see mostly glued-together third‑party code and even near-copy-paste segments; defenders argue substantial components (DOM, layout, paint, JS VM scaffolding) were still agent-authored.
  • The “from scratch” phrasing is widely viewed as misleading given the dependency footprint and nonfunctional state.
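
  To make “Servo-related crates” concrete, a hypothetical Cargo.toml dependency excerpt might look like the following. The crate names are real, publicly available Rust crates; whether the project pins these exact crates and versions is an assumption for illustration, not a reading of its actual manifest:

    # Hypothetical excerpt for illustration; not copied from the project.
    [dependencies]
    html5ever = "*"   # HTML parsing, from the Servo project
    cssparser = "*"   # CSS tokenizing/parsing, from the Servo project
    selectors = "*"   # CSS selector matching, from the Servo project
    taffy = "*"       # flexbox/grid layout engine
    rquickjs = "*"    # one possible set of Rust bindings to the QuickJS JS engine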

Autonomous agents vs human steering

  • Cursor framed this as “hundreds of agents” autonomously working for a week; critics note later fixes, changing git identities, and EC2-authored commits as evidence of human cleanup.
  • Some argue that what was actually demonstrated is that agents can generate millions of lines of interdependent slop that humans must later untangle.
  • A parallel Excel-clone experiment shows 160k+ CI runs with the vast majority failing, suggesting agents happily burn compute without regard for cost or convergence.

Broader reactions: hype, skepticism, and real utility

  • Many see this as emblematic of AI marketing: grand claims amplified on social media, thin technical evidence, and investors or non-engineers as the real target audience.
  • Some heavy LLM users in the thread emphasize that tools like Codex/Claude Code genuinely help experienced developers, but don’t autonomously build complex systems.
  • Others push back on the “you’re holding it wrong” defense, arguing that non-compiling, test-disabling, or fake‑data‑returning code is a serious quality problem, not nitpicking.
  • There’s a split between those who view this as an impressive early milestone (“agents almost created a working browser”) and those who see it as straightforwardly deceptive, even bordering on fraud.