When AI 'builds a browser,' check the repo before believing the hype

What the demo actually was

  • Many readers initially assumed “AI built a browser” meant an original, production‑grade engine; those who cloned the repo instead found a brittle, partially working experiment.
  • The codebase is messy, slow, glitchy, and far from real‑world browser parity; some called it “app‑shaped” or “engine‑shaped” rather than a usable browser.
  • An engineer involved said the goal was to stress‑test agents on a large, open‑ended task, not to ship a product.

Compilation, dependencies, and “from scratch”

  • There is dispute over whether the project even compiled: some reported broken builds and CI, while others clarified that it compiled intermittently but neither reliably nor in GitHub Actions.
  • The engine pulls in Servo components (cssparser, html5ever) and the Taffy layout crate, plus typical libraries such as HarfBuzz; see the dependency‑usage sketch after this list.
  • Critics argue this contradicts “from scratch”; defenders say using standard libraries is normal and it is not a mere “Servo wrapper.”
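
The dependencies named above are ordinary, published Rust crates from the Servo ecosystem, which is what the “from scratch” debate turns on. Below is a minimal sketch of what using one of them looks like, assuming the html5ever and markup5ever_rcdom crates as documented on crates.io; it is illustrative only and not code from the project’s repo:

```rust
// Parse an HTML string into a DOM using html5ever + markup5ever_rcdom,
// Servo-ecosystem crates of the kind the project depends on (illustrative only).
use html5ever::parse_document;
use html5ever::tendril::TendrilSink;
use markup5ever_rcdom::{Handle, NodeData, RcDom};

// Recursively print element names to show the parsed tree structure.
fn walk(node: &Handle, depth: usize) {
    if let NodeData::Element { ref name, .. } = node.data {
        println!("{}{}", "  ".repeat(depth), name.local);
    }
    for child in node.children.borrow().iter() {
        walk(child, depth + 1);
    }
}

fn main() {
    let html = "<html><body><h1>Hello</h1><p>from html5ever</p></body></html>";
    let dom = parse_document(RcDom::default(), Default::default())
        .from_utf8()
        .read_from(&mut html.as_bytes())
        .expect("parse failed");
    walk(&dom.document, 0);
}
```

The point of the sketch is only that standards‑grade parsing comes from existing crates; whether assembling such crates still counts as building a browser “from scratch” is exactly what commenters disagree about.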

Marketing, hype, and ethics

  • Strong disagreement over whether the company’s claims amounted to mild startup puffery or to actively misleading “fraudulent misrepresentation.”
  • Concern that management and investors only see the headline “AI built a browser,” not the caveats or the repo, yet will form expectations and make staffing decisions on that basis.
  • Some see the entire exercise as hype for subscriptions and funding; others say it’s a standard tech hype cycle, not a unique scandal.

Lines of code and bogus productivity metrics

  • Heavy criticism of touting “3M+ LOC” as an achievement; many emphasize code is a liability, not an asset.
  • Long‑standing arguments against LOC as a productivity metric are rehearsed again; even so, people note that LOC‑style KPIs and “% of code written by AI” are resurging as management metrics.
  • One engineer reports a similar browser‑level result in ~20k LOC, underscoring that sheer volume mostly reflects bloat and “slop.”

What this says about current LLM capabilities

  • Broad agreement: LLMs are genuinely useful for small, well‑scoped coding tasks, autocomplete, and refactoring.
  • Many say they still cannot autonomously deliver large, coherent systems without heavy human steering; agents tend to increase “entropy” and tech debt.
  • Optimists see the week‑long autonomous run as a real milestone in handling longer tasks and expect rapid improvement; skeptics say every high‑profile “AI built X” demo collapses on inspection.

Costs, scale, and token usage

  • The reported “trillions of tokens” and multi‑million‑dollar cost are questioned as numerically implausible given model latency and the claimed 2,000‑agent concurrency; a back‑of‑envelope sketch follows this list.
  • Commenters criticize secondary sources that estimate costs via another chatbot without transparent methodology.
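
One way commenters sanity‑check those figures is simple arithmetic: agents × wall‑clock time × per‑agent throughput bounds the tokens that 2,000 concurrent agents could generate in roughly a week. The sketch below uses hypothetical round numbers (the 100 tokens/second throughput and the per‑million‑token price are assumptions, not reported values):

```rust
// Back-of-envelope bound on token volume and spend. Every input is a
// hypothetical round number chosen for illustration, not a figure from
// the project or its vendor; substitute whatever is actually claimed.
fn main() {
    let agents: f64 = 2_000.0;               // claimed concurrent agents
    let seconds: f64 = 7.0 * 24.0 * 3_600.0; // roughly one week of wall-clock time
    let tokens_per_sec: f64 = 100.0;         // assumed per-agent generation rate

    // Upper bound on generated tokens if every agent streams output nonstop.
    let total_tokens = agents * seconds * tokens_per_sec;
    println!("tokens at full utilization: {:.2e}", total_tokens); // ~1.2e11

    // Implied spend at an assumed price per million generated tokens.
    let price_per_million_usd: f64 = 5.0;
    let cost_usd = total_tokens / 1.0e6 * price_per_million_usd;
    println!("implied spend at ${}/M tokens: ${:.0}", price_per_million_usd, cost_usd);

    // Under these assumptions the ceiling is on the order of 10^11 tokens and
    // well under a million dollars, which is why commenters ask how
    // "trillions of tokens" and a multi-million-dollar bill were reached
    // unless re-read context tokens are counted on every call.
}
```

Different throughput or pricing assumptions move the numbers, but the gap is wide enough that commenters want the methodology spelled out rather than estimated second‑hand by another chatbot.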