AI in software engineering at Google: Progress and the path ahead
Shift from Authoring to Reviewing
- Several commenters echo Google’s observation: with AI suggestions, developers increasingly review and edit rather than write from scratch.
- Some find this empowering, especially when working outside their specialty (e.g., backend devs producing React UIs).
- Others argue reviewers rarely achieve the same depth of understanding as original authors, risking shallow comprehension of complex systems.
Learning, Expertise, and Gatekeeping
- Strong debate on whether AI-assisted coding harms or helps learning.
- One side: deep understanding comes from struggling through solutions; AI short‑circuits this and can feed Dunning–Kruger dynamics.
- Other side: copying from LLMs is analogous to learning from Stack Overflow or tutorials; people lean on such sources less as they gain skill.
- Some push back on “gatekeeping” attitudes that demand low‑level knowledge (e.g., transistors, CPU internals) for everyday coding.
Code Quality, Correctness, and Maintainability
- Concern that syntactically correct but logically wrong or edge‑case‑fragile code will proliferate.
- Review fatigue and “looks fine” acceptance are seen as risks, especially late in the day or among inexperienced reviewers.
- Boilerplate generation is widely seen as a good fit, but there’s worry it may encourage bloated, repetitive code and weaker abstractions.
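
To make the first concern concrete, here is a minimal, hypothetical sketch (not taken from the discussion) of code that is syntactically valid and easy to wave through review, yet wrong on an edge case. The function names paginate_naive and paginate are invented for illustration.

```python
def paginate_naive(items, page_size):
    # Looks reasonable at a glance, but integer division silently
    # drops the final partial page when len(items) is not a multiple
    # of page_size.
    page_count = len(items) // page_size
    return [items[i * page_size:(i + 1) * page_size] for i in range(page_count)]


def paginate(items, page_size):
    # Stepping through the list by page_size keeps the trailing
    # partial page and returns an empty list for empty input.
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(paginate_naive(data, 2))  # the element 5 is missing
    print(paginate(data, 2))        # the last page holds the element 5
```

Both versions agree whenever the input length is an exact multiple of the page size, which is why a quick test or a tired reviewer can miss the difference.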
Metrics and Productivity Claims
- Google’s “fraction of characters written by AI” metric (~50% of new code) and similar Copilot stats draw skepticism.
- Critics say character share is a poor proxy for productivity or quality and fails to distinguish trivial boilerplate from hard logic.
- Some note that even “accepted” suggestions may require heavy modification.
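
The critics' point about character share can be sketched in a few lines. This assumes the metric is computed as accepted AI characters divided by all characters entered (AI plus manually typed); the Edit record, its kind label, and the sample numbers are hypothetical and not drawn from Google's data.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    chars: int      # number of characters added by this edit
    from_ai: bool   # True if the text came from an accepted AI suggestion
    kind: str       # hypothetical label: "boilerplate" or "logic"


def ai_character_share(edits):
    # Assumed definition: AI-originated characters over all characters.
    total = sum(e.chars for e in edits)
    if total == 0:
        return 0.0
    return sum(e.chars for e in edits if e.from_ai) / total


# Invented sample: AI wrote all the boilerplate, a human wrote all the logic.
edits = [
    Edit(chars=400, from_ai=True, kind="boilerplate"),
    Edit(chars=100, from_ai=True, kind="boilerplate"),
    Edit(chars=500, from_ai=False, kind="logic"),
]

overall = ai_character_share(edits)
logic_only = ai_character_share([e for e in edits if e.kind == "logic"])

if __name__ == "__main__":
    print(f"overall AI share: {overall:.0%}")
    print(f"AI share of logic: {logic_only:.0%}")
```

In this made-up sample the headline figure is 50% even though none of the hard logic came from the AI, which is the distinction the critics say the metric hides.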
Google Internal Tools and Culture
- Multiple Googlers/ex‑Googlers describe internal AI tools as powerful but uneven (good autocomplete, weak review suggestions).
- Disagreement over whether AI usage is “force‑fed” or optional; some complain certain AI affordances can’t be fully disabled.
- Some voice internal concern about overemphasizing AI metrics, while others point out that ML autocomplete predates LLMs.
Use Cases, UX, and Limits
- The most positive experiences involve code completion, boilerplate, schema/unit‑test generation, refactors, and “design sounding board” chats.
- Poor experiences include constant low‑quality suggestions in IDEs, hallucinated patterns, and lack of domain‑specific preferences.
- Many see future gains coming more from better IDE integration, context awareness, and workflow design than from raw model gains.
Broader Concerns
- Fears about IP contamination (e.g., AGPL snippets), privacy leaks, and over‑reliance on non‑deterministic tools.
- Long‑term speculation ranges from “bulldozer‑style productivity boost” to potential job displacement and even autonomous corporations.