Google CEO says more than a quarter of the company's new code is created by AI
What “25% of code from AI” likely means
- Many argue this mostly reflects advanced autocomplete and boilerplate generation, not autonomous feature work.
- Google’s monorepo and heavy boilerplate (protos, configs, API surface changes, tests) are seen as ideal for AI-assisted large-scale refactors and rote edits.
- Some suspect the metric may also include long‑existing automated refactoring tools and codegen now rebranded as “AI.”
- Several question how the 25% was measured (keystrokes, characters, PRs, lines?) and view it as investor‑oriented marketing.
Productivity and workflow effects
- Enthusiastic users say LLMs are a major help for:
  - Boilerplate, glue code, simple scripts, SQL, Terraform, config files.
  - Unit test scaffolding and repetitive test variants.
  - Quickly recalling APIs or patterns in unfamiliar stacks.
- Others report marginal or negative net gains: time saved typing is lost debugging subtle errors or hallucinated APIs.
- AI is often compared to “supercharged snippets” or “tab completion on steroids,” most useful when the human already understands the solution.
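The "test scaffolding and repetitive variants" use case above can be sketched concretely. This is a minimal illustration, not anything from the thread: `slugify` is a hypothetical helper, and the table of cases stands in for the kind of variant list an assistant expands from one or two human-written examples.

```python
import re

def slugify(title: str) -> str:
    """Hypothetical helper: lowercase, collapse non-alphanumeric runs to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# Repetitive test variants -- the kind of scaffolding LLM assistants
# generate quickly once the first case is written by hand.
CASES = [
    ("Hello World", "hello-world"),
    ("  Leading spaces", "leading-spaces"),
    ("Symbols & punctuation!", "symbols-punctuation"),
    ("already-a-slug", "already-a-slug"),
    ("", ""),
]

def test_slugify():
    for title, expected in CASES:
        assert slugify(title) == expected, (title, expected)

test_slugify()
```

The human still has to judge which variants matter; the assistant mostly saves the typing.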
Code quality, complexity, and technical debt
- Strong concern that fast generation of “leaf” or trivial code will worsen bloat and tech debt, especially when repeated logic should be abstracted instead.
- Critics note LLMs confidently produce subtly wrong code; without strong tests and review this can accumulate hidden bugs.
- Some counter that humans already write lots of bad code; if AI output is always supervised and tested, it can still be a net win.
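The "confidently produce subtly wrong code" concern can be made concrete with a classic Python pitfall. This is a hypothetical illustration (not an example from the thread): the buggy version reads plausibly in casual review, and a single-call test would pass both versions; only a test spanning two calls exposes the shared mutable default.

```python
# Hypothetical "subtly wrong" suggestion: looks fine in review,
# but the default list is shared across calls.

def dedupe_buggy(items, seen=[]):        # mutable default: state leaks between calls
    out = []
    for item in items:
        if item not in seen:
            seen.append(item)
            out.append(item)
    return out

def dedupe_fixed(items, seen=None):      # fresh state per call
    seen = set() if seen is None else set(seen)
    out = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out

assert dedupe_fixed([1, 1, 2]) == [1, 2]
assert dedupe_fixed([1, 3]) == [1, 3]

dedupe_buggy([1, 1, 2])                  # first call "poisons" the default list
assert dedupe_buggy([1, 3]) == [3]       # 1 is wrongly dropped on the second call
```

This is the shape of bug the critics worry about: it survives review and simple tests, then accumulates silently.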
Measurement and metrics debates
- Commenters question whether Google can "accurately and meaningfully" measure software productivity, citing Goodhart's law and metric gaming.
- Suggested aggregate metrics: time from log inspection to first change, DORA metrics, business outcomes (revenue, reliability) rather than lines of code.
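Two of the suggested DORA metrics are simple to compute given deploy records. This sketch assumes a hypothetical log of (commit time, deploy time) pairs; the record format and function names are illustrative, not any real pipeline's API.

```python
from datetime import datetime
from statistics import median

# Hypothetical deploy log: (commit_time, deployed_time) per change.
DEPLOYS = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 15, 0)),   # 6h
    (datetime(2024, 1, 2, 10, 0), datetime(2024, 1, 3, 10, 0)),  # 24h
    (datetime(2024, 1, 5, 8, 0), datetime(2024, 1, 5, 9, 0)),    # 1h
]

def median_lead_time_hours(deploys):
    """DORA 'lead time for changes': time from commit to running in production."""
    return median((dep - com).total_seconds() / 3600 for com, dep in deploys)

def deployment_frequency(deploys, period_days):
    """DORA 'deployment frequency': deploys per day over a reporting period."""
    return len(deploys) / period_days

print(median_lead_time_hours(DEPLOYS))   # 6.0 hours
print(deployment_frequency(DEPLOYS, 7))  # ~0.43 deploys/day
```

Unlike lines of code, these track outcomes of the whole delivery pipeline, which is why commenters prefer them for judging any AI-driven productivity claim.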
Inside Google’s tooling (per commenters claiming to work there)
- Internal AI assistance is integrated into IDEs and backed by Gemini models adapted to the monorepo; commenters describe it as high-quality autocomplete more than autonomous coding.
- Claims of safety processes: monitoring, provenance tracking, adversarial testing, A/B experiments showing productivity gains across languages and seniority levels.
Broader concerns and sentiment
- Worries about: erosion of junior roles and career ladders, long‑term training data pollution, slowing innovation in languages/frameworks, and Google product “enshittification.”
- Others see LLMs as a real but bounded step change, similar in impact to past tools (compilers, refactoring IDEs), not imminent AGI replacing senior engineers.