AI in software engineering at Google: Progress and the path ahead

Shift from Authoring to Reviewing

  • Several commenters echo Google’s observation: with AI suggestions, developers increasingly review and edit rather than write from scratch.
  • Some find this empowering, especially when working outside their specialty (e.g., backend devs producing React UIs).
  • Others argue reviewers rarely achieve the same depth of understanding as original authors, risking shallow comprehension of complex systems.

Learning, Expertise, and Gatekeeping

  • Strong debate on whether AI-assisted coding harms or helps learning.
  • One side: deep understanding comes from struggling through solutions; AI short‑circuits this and can feed Dunning–Kruger dynamics.
  • Other side: copying from LLMs is analogous to learning from Stack Overflow or tutorials; over time people rely less on it as they gain skill.
  • Some push back on “gatekeeping” attitudes that demand low‑level knowledge (e.g., transistors, CPU internals) for everyday coding.

Code Quality, Correctness, and Maintainability

  • Concern that syntactically correct but logically wrong or edge‑case‑fragile code will proliferate.
  • Review fatigue and “looks fine” acceptance are seen as risks, especially late in the day or among inexperienced reviewers.
  • Boilerplate generation is widely seen as a good fit, but there’s worry it may encourage bloated, repetitive code and weaker abstractions.
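
As a hypothetical illustration of the "syntactically correct but edge‑case‑fragile" concern above, the sketch below shows two list‑chunking helpers. Both function names and the scenario are invented for this example and are not taken from the discussion. The first version reads naturally and works whenever the list length is a multiple of the chunk size, so a tired reviewer could plausibly approve it; it silently drops the trailing partial chunk otherwise.

```python
def chunk_plausible(items, size):
    """Looks reasonable, but silently drops a trailing partial chunk."""
    # len(items) // size counts only the *full* chunks.
    return [items[i * size:(i + 1) * size] for i in range(len(items) // size)]


def chunk_correct(items, size):
    """Steps through the list by `size`, keeping the final partial chunk."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# The two versions agree on the "happy path" input...
even = [1, 2, 3, 4]
same_on_even = chunk_plausible(even, 2) == chunk_correct(even, 2)

# ...and diverge only on the edge case a quick review may not try.
odd = [1, 2, 3, 4, 5]
plausible_result = chunk_plausible(odd, 2)  # the element 5 is lost
correct_result = chunk_correct(odd, 2)      # the element 5 is kept
```

A test that only exercises evenly divisible input would pass for both versions, which is roughly the failure mode commenters worry about with "looks fine" reviews.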

Metrics and Productivity Claims

  • Google’s “fraction of characters written by AI” (~50% of new code) and similar Copilot stats draw skepticism.
  • Critics say character share is a poor proxy for productivity or quality and fails to distinguish trivial boilerplate from hard logic.
  • Some note that even “accepted” suggestions may require heavy modification.
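
To make the critics' objection concrete, here is a minimal sketch of a character‑share metric. The definition used (AI‑accepted characters divided by all new characters), the `Change` type, and the sample numbers are assumptions for illustration only; they are not Google's actual measurement method or data.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """One code change. All values below are made-up illustration data."""
    description: str
    ai_chars: int     # characters accepted from AI suggestions
    typed_chars: int  # characters typed by hand


def ai_char_share(changes):
    """Fraction of new characters that came from accepted AI suggestions."""
    ai = sum(c.ai_chars for c in changes)
    total = sum(c.ai_chars + c.typed_chars for c in changes)
    return ai / total if total else 0.0


changes = [
    Change("generated getters and setters", ai_chars=900, typed_chars=100),
    Change("hand-written concurrency fix", ai_chars=0, typed_chars=200),
]

share = ai_char_share(changes)
print(f"AI character share: {share:.0%}")
```

In this toy data the share is high because of one boilerplate change, while the change that likely took the most thought contributes nothing to the numerator. The metric also cannot tell whether accepted characters were later rewritten.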

Google Internal Tools and Culture

  • Multiple Googlers/ex‑Googlers describe internal AI tools as powerful but uneven (good autocomplete, weak review suggestions).
  • Disagreement over whether AI usage is “force‑fed” or optional; some complain certain AI affordances can’t be fully disabled.
  • There is internal concern about overemphasizing AI metrics, alongside acknowledgement that ML‑based autocomplete was already in use before LLMs.

Use Cases, UX, and Limits

  • The most positive experiences involve code completion, boilerplate, schema/unit‑test generation, refactors, and “design sounding board” chats.
  • Poor experiences include constant low‑quality suggestions in IDEs, hallucinated patterns, and lack of domain‑specific preferences.
  • Many see future gains coming more from better IDE integration, context awareness, and workflow design than from raw model gains.

Broader Concerns

  • Fears about IP contamination (e.g., AGPL snippets), privacy leaks, and over‑reliance on non‑deterministic tools.
  • Long‑term speculation ranges from “bulldozer‑style productivity boost” to potential job displacement and even autonomous corporations.