When AI 'builds a browser,' check the repo before believing the hype

What the demo actually was

  • Many readers initially assumed “AI built a browser” meant an original, production‑grade engine; those who cloned the repo instead found a brittle, partially working experiment.
  • The codebase is messy, slow, glitchy, and far from real‑world browser parity; some called it “app‑shaped” or “engine‑shaped” rather than a usable browser.
  • An engineer involved said the goal was to stress‑test agents on a large, open‑ended task, not to ship a product.

Compilation, dependencies, and “from scratch”

  • There is dispute over whether the project even compiled: some reported broken builds and CI, while others clarified that it compiled intermittently but neither reliably nor in GitHub Actions.
  • The engine pulls in Servo components (cssparser, html5ever) and the Taffy layout crate, plus typical libraries such as HarfBuzz; see the dependency‑usage sketch after this list.
  • Critics argue this contradicts “from scratch”; defenders say using standard libraries is normal and it is not a mere “Servo wrapper.”
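
The dependencies named above are ordinary, published Rust crates from the Servo ecosystem, which is what the “from scratch” debate turns on. Below is a minimal sketch of what using one of them looks like, assuming the html5ever and markup5ever_rcdom crates as documented on crates.io; it is illustrative only and not code from the project’s repo:

```rust
// Parse an HTML string into a DOM using html5ever + markup5ever_rcdom,
// Servo-ecosystem crates of the kind the project depends on (illustrative only).
use html5ever::parse_document;
use html5ever::tendril::TendrilSink;
use markup5ever_rcdom::{Handle, NodeData, RcDom};

// Recursively print element names to show the parsed tree structure.
fn walk(node: &Handle, depth: usize) {
    if let NodeData::Element { ref name, .. } = node.data {
        println!("{}{}", "  ".repeat(depth), name.local);
    }
    for child in node.children.borrow().iter() {
        walk(child, depth + 1);
    }
}

fn main() {
    let html = "<html><body><h1>Hello</h1><p>from html5ever</p></body></html>";
    let dom = parse_document(RcDom::default(), Default::default())
        .from_utf8()
        .read_from(&mut html.as_bytes())
        .expect("parse failed");
    walk(&dom.document, 0);
}
```

The point of the sketch is only that standards‑grade parsing comes from existing crates; whether assembling such crates still counts as building a browser “from scratch” is exactly what commenters disagree about.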

Marketing, hype, and ethics

  • Strong disagreement over whether the company’s claims amounted to mild startup puffery or to actively misleading “fraudulent misrepresentation.”
  • Concern that management and investors only see the headline “AI built a browser,” not the caveats or the repo, yet will form expectations and make staffing decisions on that basis.
  • Some see the entire exercise as hype for subscriptions and funding; others say it’s a standard tech hype cycle, not a unique scandal.

Lines of code and bogus productivity metrics

  • Heavy criticism of touting “3M+ LOC” as an achievement; many emphasize code is a liability, not an asset.
  • Long‑standing arguments against LOC as a productivity metric are rehearsed again; even so, people note that LOC‑style KPIs and “% of code written by AI” are resurging as management metrics.
  • One engineer reports a similar browser‑level result in ~20k LOC, underscoring that sheer volume mostly reflects bloat and “slop.”

What this says about current LLM capabilities

  • Broad agreement: LLMs are genuinely useful for small, well‑scoped coding tasks, autocomplete, and refactoring.
  • Many say they still cannot autonomously deliver large, coherent systems without heavy human steering; agents tend to increase “entropy” and tech debt.
  • Optimists see the week‑long autonomous run as a real milestone in handling longer tasks and expect rapid improvement; skeptics say every high‑profile “AI built X” demo collapses on inspection.

Costs, scale, and token usage

  • The reported “trillions of tokens” and multi‑million‑dollar cost are questioned as numerically implausible given model latency and the claimed 2,000‑agent concurrency; a back‑of‑envelope sketch follows this list.
  • Commenters criticize secondary sources that estimate costs via another chatbot without transparent methodology.
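
One way commenters sanity‑check those figures is simple arithmetic: agents × wall‑clock time × per‑agent throughput bounds the tokens that 2,000 concurrent agents could generate in roughly a week. The sketch below uses hypothetical round numbers (the 100 tokens/second throughput and the per‑million‑token price are assumptions, not reported values):

```rust
// Back-of-envelope bound on token volume and spend. Every input is a
// hypothetical round number chosen for illustration, not a figure from
// the project or its vendor; substitute whatever is actually claimed.
fn main() {
    let agents: f64 = 2_000.0;               // claimed concurrent agents
    let seconds: f64 = 7.0 * 24.0 * 3_600.0; // roughly one week of wall-clock time
    let tokens_per_sec: f64 = 100.0;         // assumed per-agent generation rate

    // Upper bound on generated tokens if every agent streams output nonstop.
    let total_tokens = agents * seconds * tokens_per_sec;
    println!("tokens at full utilization: {:.2e}", total_tokens); // ~1.2e11

    // Implied spend at an assumed price per million generated tokens.
    let price_per_million_usd: f64 = 5.0;
    let cost_usd = total_tokens / 1.0e6 * price_per_million_usd;
    println!("implied spend at ${}/M tokens: ${:.0}", price_per_million_usd, cost_usd);

    // Under these assumptions the ceiling is on the order of 10^11 tokens and
    // well under a million dollars, which is why commenters ask how
    // "trillions of tokens" and a multi-million-dollar bill were reached
    // unless re-read context tokens are counted on every call.
}
```

Different throughput or pricing assumptions move the numbers, but the gap is wide enough that commenters want the methodology spelled out rather than estimated second‑hand by another chatbot.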