When AI 'builds a browser,' check the repo before believing the hype
What the demo actually was
- Many readers initially assumed “AI built a browser” meant an original, production‑grade engine; cloning the repo showed a brittle, partially working experiment.
- The codebase is messy, slow, glitchy, and far from real‑world browser parity; some called it “app‑shaped” or “engine‑shaped” rather than a usable browser.
- An engineer involved said the goal was to stress‑test agents on a large, open‑ended task, not to ship a product.
Compilation, dependencies, and “from scratch”
- Dispute over whether the project even compiled: some pointed to broken builds and CI, while others clarified that it compiled intermittently, just not reliably and not in GitHub Actions.
- The engine uses Servo components (cssparser, html5ever) and Taffy, plus typical libraries like HarfBuzz.
- Critics argue this contradicts “from scratch”; defenders counter that pulling in standard libraries is normal practice and that the result is not a mere “Servo wrapper” (see the illustrative sketch after this list).
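To make the dependency point concrete, the sketch below shows how a toy Rust engine typically wires these crates together: html5ever parses HTML into a DOM-like tree, cssparser tokenizes a style declaration, and Taffy computes a flexbox layout. This is an illustration, not the project’s actual code; it assumes the crates html5ever, markup5ever_rcdom, cssparser, and taffy (recent versions) as dependencies, and exact APIs vary slightly between releases.

```rust
// Illustrative only: a toy pipeline showing how the crates named in the thread
// are commonly used. This is NOT the project's code.
use cssparser::{Parser, ParserInput};
use html5ever::parse_document;
use html5ever::tendril::TendrilSink;
use markup5ever_rcdom::RcDom;
use taffy::prelude::*;

fn main() {
    // 1. Parse HTML into a DOM-like tree with html5ever's RcDom sink.
    let html = "<html><body><p>hello</p></body></html>";
    let dom = parse_document(RcDom::default(), Default::default())
        .from_utf8()
        .read_from(&mut html.as_bytes())
        .expect("parse failed");
    println!("document children: {}", dom.document.children.borrow().len());

    // 2. Tokenize a CSS declaration with cssparser (real style resolution is
    //    far more involved; this only shows the tokenizer entry point).
    let mut input = ParserInput::new("color: red; margin: 4px");
    let mut css = Parser::new(&mut input);
    while let Ok(token) = css.next() {
        println!("css token: {:?}", token);
    }

    // 3. Compute a trivial flexbox layout with Taffy.
    let mut tree: TaffyTree<()> = TaffyTree::new();
    let child = tree
        .new_leaf(Style {
            size: Size { width: length(100.0), height: length(20.0) },
            ..Default::default()
        })
        .unwrap();
    let root = tree.new_with_children(Style::default(), &[child]).unwrap();
    tree.compute_layout(root, Size::MAX_CONTENT).unwrap();
    println!("child layout: {:?}", tree.layout(child).unwrap());
}
```

Even with these crates handling parsing and layout math, the style cascade, rendering, networking, and scripting still have to be built on top of them, which is why commenters end up arguing both “not from scratch” and “not a mere Servo wrapper” at once.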
Marketing, hype, and ethics
- Strong disagreement over whether the company’s claims were mild startup puffery or actively misleading “fraudulent misrepresentation.”
- Concern that management and investors will see only the headline “AI built a browser,” not the caveats or the repo, and will nonetheless form expectations and make staffing decisions on that basis.
- Some see the entire exercise as hype for subscriptions and funding; others say it’s a standard tech hype cycle, not a unique scandal.
Lines of code and bogus productivity metrics
- Heavy criticism of touting “3M+ LOC” as an achievement; many emphasize code is a liability, not an asset.
- The long‑standing arguments against LOC as a productivity metric get restated; even so, commenters note that KPIs such as “% of code written by AI” are resurging as management metrics.
- One engineer reports achieving a comparable browser‑level result in roughly 20k LOC, underscoring that sheer volume mostly reflects bloat and “slop.”
What this says about current LLM capabilities
- Broad agreement: LLMs are genuinely useful for small, well‑scoped coding tasks, autocomplete, and refactoring.
- Many say they still cannot autonomously deliver large, coherent systems without heavy human steering; agents tend to increase “entropy” and tech debt.
- Optimists see the week‑long autonomous run as a real milestone in handling longer tasks and expect rapid improvement; skeptics say every high‑profile “AI built X” demo collapses on inspection.
Costs, scale, and token usage
- The reported “trillions of tokens” and multi‑million‑dollar cost figures are questioned as numerically implausible given generation latency and the claimed 2,000‑agent concurrency (a back‑of‑envelope sketch follows this list).
- Commenters also criticize secondary sources that estimated the cost by asking another chatbot, without a transparent methodology.
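As a rough sanity check on the plausibility argument, here is a back-of-envelope sketch. Every input is an illustrative assumption, not a confirmed figure: a flat 1 trillion tokens as the lower bound of “trillions,” the reported 2,000 agents, and a one-week wall-clock run.

```rust
// Back-of-envelope only: every input below is an illustrative assumption,
// not a confirmed figure from the project or the article.
fn main() {
    let total_tokens: f64 = 1.0e12;          // lower bound of "trillions" (assumed)
    let agents: f64 = 2_000.0;               // reported concurrency
    let seconds: f64 = 7.0 * 24.0 * 3600.0;  // one week of wall-clock time

    let per_agent_tokens_per_sec = total_tokens / (agents * seconds);
    println!(
        "each agent would need ~{:.0} tokens/sec, sustained around the clock",
        per_agent_tokens_per_sec
    );
}
```

Under these assumptions, each agent would need to sustain roughly 827 tokens per second for the entire week, well above typical single-stream generation speeds, which is the core of the latency objection; if the total also counts re-read prompt and context tokens rather than generated output alone, the per-agent rate becomes less decisive.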