Cursor's latest “browser experiment” implied success without evidence

Whether the “AI-built browser” ever worked

  • Multiple people cloned the repo and found that none of the last ~100 commits passed cargo check; the codebase generally didn’t compile or run (a rough reproduction sketch follows this list).
  • After public scrutiny, a later commit finally got cargo check to pass and the browser to run (barely), but the git history suggests manual human intervention rather than purely autonomous agents.
  • Even when compiled, reports describe it as “tragically broken”: extremely slow page loads, basic sites failing, and JavaScript apparently not executing (e.g., the Acid3 test page asks the user to enable JS).
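
  As a rough, hedged sketch of the kind of check people ran (not anyone’s exact procedure), a small Rust helper could shell out to git and cargo, walk the last ~100 commits of a local clone, and count how many fail cargo check. The commit count and flags are assumptions; it needs git and cargo on PATH and must be run from inside the cloned repo:

    // Sketch only: walks recent commits and runs `cargo check` on each.
    // Assumes a local clone; leaves the repo on a detached HEAD, so restore
    // your branch (e.g. `git checkout main`) when finished.
    use std::process::Command;

    fn main() {
        // List the last 100 commit hashes on the current branch, newest first.
        let rev_list = Command::new("git")
            .args(["rev-list", "--max-count=100", "HEAD"])
            .output()
            .expect("failed to run git rev-list");
        let hashes = String::from_utf8_lossy(&rev_list.stdout);

        let mut failures = 0;
        for hash in hashes.lines() {
            // Check out the commit, then type-check without building binaries.
            let checked_out = Command::new("git")
                .args(["checkout", "--quiet", hash])
                .status()
                .expect("failed to run git checkout");
            if !checked_out.success() {
                continue;
            }
            let check = Command::new("cargo")
                .args(["check", "--quiet"])
                .status()
                .expect("failed to run cargo check");
            if !check.success() {
                failures += 1;
                println!("FAIL {hash}");
            }
        }
        println!("{failures} of the sampled commits failed cargo check");
    }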

“From scratch” vs reuse of existing browser components

  • The CEO’s public messaging emphasized “from-scratch” Rust rendering and a custom JS VM.
  • Commenters inspecting Cargo.toml and source files found heavy use of existing Servo-related crates (HTML/CSS parsers, selectors, the Taffy layout library) and QuickJS / vendored JS parser code (an illustrative Cargo.toml excerpt follows this list).
  • There’s disagreement on how much is original: critics see mostly glued-together third‑party code and even near-copy-paste segments; defenders argue substantial components (DOM, layout, paint, JS VM scaffolding) were still agent-authored.
  • The “from scratch” phrasing is widely viewed as misleading given the dependency footprint and nonfunctional state.
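
  To make “Servo-related crates” concrete, a hypothetical Cargo.toml dependency excerpt might look like the following. The crate names are real, publicly available Rust crates; whether the project pins these exact crates and versions is an assumption for illustration, not a reading of its actual manifest:

    # Hypothetical excerpt for illustration; not copied from the project.
    [dependencies]
    html5ever = "*"   # HTML parsing, from the Servo project
    cssparser = "*"   # CSS tokenizing/parsing, from the Servo project
    selectors = "*"   # CSS selector matching, from the Servo project
    taffy = "*"       # flexbox/grid layout engine
    rquickjs = "*"    # one possible set of Rust bindings to the QuickJS JS engine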

Autonomous agents vs human steering

  • Cursor framed this as “hundreds of agents” autonomously working for a week; critics note later fixes, changing git identities, and EC2-authored commits as evidence of human cleanup.
  • Some argue that what was actually demonstrated is that agents can generate millions of lines of interdependent slop that humans must later untangle.
  • A parallel Excel-clone experiment shows 160k+ CI runs with the vast majority failing, suggesting agents happily burn compute without regard for cost or convergence.

Broader reactions: hype, skepticism, and real utility

  • Many see this as emblematic of AI marketing: grand claims amplified on social media, thin technical evidence, and investors or non-engineers as the real target audience.
  • Some heavy LLM users in the thread emphasize that tools like Codex/Claude Code genuinely help experienced developers, but don’t autonomously build complex systems.
  • Others push back on the “you’re holding it wrong” defense, arguing that non-compiling, test-disabling, or fake‑data‑returning code is a serious quality problem, not nitpicking.
  • There’s a split between those who view this as an impressive early milestone (“agents almost created a working browser”) and those who see it as straightforwardly deceptive, even bordering on fraud.