Google CEO says more than a quarter of the company's new code is created by AI
What “25% of code from AI” likely means
- Many argue this mostly reflects advanced autocomplete and boilerplate generation, not autonomous feature work.
- Google’s monorepo and heavy boilerplate (protos, configs, API surface changes, tests) are seen as ideal for AI-assisted large-scale refactors and rote edits.
- Some suspect the metric may also include long‑existing automated refactoring tools and codegen now rebranded as “AI.”
- Several question how the 25% was measured (keystrokes, characters, PRs, lines?) and view it as investor‑oriented marketing.
Productivity and workflow effects
- Enthusiastic users say LLMs are a major help for:
  - Boilerplate, glue code, simple scripts, SQL, Terraform, config files.
  - Unit test scaffolding and repetitive test variants.
  - Quickly recalling APIs or patterns in unfamiliar stacks.
- Others report marginal or negative net gains: time saved typing is lost debugging subtle errors or hallucinated APIs.
- AI is often compared to “supercharged snippets” or “tab completion on steroids,” most useful when the human already understands the solution.
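The "test scaffolding and repetitive variants" use case above can be sketched concretely. This is a minimal illustration, not anything from the thread: `slugify` is a hypothetical helper, and the table of cases stands in for the kind of variant list an assistant expands from one or two human-written examples.

```python
import re

def slugify(title: str) -> str:
    """Hypothetical helper: lowercase, collapse non-alphanumeric runs to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# Repetitive test variants -- the kind of scaffolding LLM assistants
# generate quickly once the first case is written by hand.
CASES = [
    ("Hello World", "hello-world"),
    ("  Leading spaces", "leading-spaces"),
    ("Symbols & punctuation!", "symbols-punctuation"),
    ("already-a-slug", "already-a-slug"),
    ("", ""),
]

def test_slugify():
    for title, expected in CASES:
        assert slugify(title) == expected, (title, expected)

test_slugify()
```

The human still has to judge which variants matter; the assistant mostly saves the typing.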
Code quality, complexity, and technical debt
- Strong concern that fast generation of “leaf” or trivial code will worsen bloat and tech debt, especially when repeated logic should be abstracted instead.
- Critics note LLMs confidently produce subtly wrong code; without strong tests and review this can accumulate hidden bugs.
- Some counter that humans already write lots of bad code; if AI output is always supervised and tested, it can still be a net win.
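The "confidently produce subtly wrong code" concern can be made concrete with a classic Python pitfall. This is a hypothetical illustration (not an example from the thread): the buggy version reads plausibly in casual review, and a single-call test would pass both versions; only a test spanning two calls exposes the shared mutable default.

```python
# Hypothetical "subtly wrong" suggestion: looks fine in review,
# but the default list is shared across calls.

def dedupe_buggy(items, seen=[]):        # mutable default: state leaks between calls
    out = []
    for item in items:
        if item not in seen:
            seen.append(item)
            out.append(item)
    return out

def dedupe_fixed(items, seen=None):      # fresh state per call
    seen = set() if seen is None else set(seen)
    out = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out

assert dedupe_fixed([1, 1, 2]) == [1, 2]
assert dedupe_fixed([1, 3]) == [1, 3]

dedupe_buggy([1, 1, 2])                  # first call "poisons" the default list
assert dedupe_buggy([1, 3]) == [3]       # 1 is wrongly dropped on the second call
```

This is the shape of bug the critics worry about: it survives review and simple tests, then accumulates silently.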
Measurement and metrics debates
- Commenters question whether Google can "accurately and meaningfully" measure software productivity, citing Goodhart's law and metric gaming.
- Suggested aggregate metrics: time from log inspection to first change, DORA metrics, business outcomes (revenue, reliability) rather than lines of code.
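Two of the suggested DORA metrics are simple to compute given deploy records. This sketch assumes a hypothetical log of (commit time, deploy time) pairs; the record format and function names are illustrative, not any real pipeline's API.

```python
from datetime import datetime
from statistics import median

# Hypothetical deploy log: (commit_time, deployed_time) per change.
DEPLOYS = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 15, 0)),   # 6h
    (datetime(2024, 1, 2, 10, 0), datetime(2024, 1, 3, 10, 0)),  # 24h
    (datetime(2024, 1, 5, 8, 0), datetime(2024, 1, 5, 9, 0)),    # 1h
]

def median_lead_time_hours(deploys):
    """DORA 'lead time for changes': time from commit to running in production."""
    return median((dep - com).total_seconds() / 3600 for com, dep in deploys)

def deployment_frequency(deploys, period_days):
    """DORA 'deployment frequency': deploys per day over a reporting period."""
    return len(deploys) / period_days

print(median_lead_time_hours(DEPLOYS))   # 6.0 hours
print(deployment_frequency(DEPLOYS, 7))  # ~0.43 deploys/day
```

Unlike lines of code, these track outcomes of the whole delivery pipeline, which is why commenters prefer them for judging any AI-driven productivity claim.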
Inside Google’s tooling (per commenters claiming to work there)
- Internal AI assistance is integrated into IDEs and backed by Gemini models adapted to the monorepo; commenters describe it as high-quality autocomplete more than autonomous coding.
- Claims of safety processes: monitoring, provenance tracking, adversarial testing, A/B experiments showing productivity gains across languages and seniority levels.
Broader concerns and sentiment
- Worries about: erosion of junior roles and career ladders, long‑term training data pollution, slowing innovation in languages/frameworks, and Google product “enshittification.”
- Others see LLMs as a real but bounded step change, similar in impact to past tools (compilers, refactoring IDEs), not imminent AGI replacing senior engineers.