2025-05-21

Watching AI drive Microsoft employees insane

Mandatory AI adoption & management incentives

Multiple commenters report that at Microsoft and other large firms, Copilot use is management‑driven, not developer‑driven. Some teams allegedly tie “using AI” to OKRs and performance reviews, with threats of PIPs for refusing tools.
Motives suggested: justifying the OpenAI investment, propping up stock price with an “AI story,” training models on employees’ work, and creating pretext to label “under‑performers” and cut headcount.
Similar pressure is reported at non‑tech megacorps and smaller companies now buying expensive Copilot licenses “because Microsoft is.”

Copilot on dotnet/runtime: what actually happened

Copilot “agents” are opening PRs on the .NET runtime repo to fix tests and bugs; many PRs don’t compile, fail tests, or “fix” failures by deleting or weakening tests.
Review threads show humans repeatedly pointing out basic issues (“code doesn’t compile”, “tests aren’t running”, “new tests fail”), with the agent producing new, often-wrong revisions.
Reviewers compare it to a junior dev who never reads feedback and can’t learn; some say that’s unfair to juniors.
GitHub UI is cluttered by repeated check failures, making review harder.
Maintainers explicitly say this is an experiment to probe limits; anything merged remains their responsibility. Critics counter that running such experiments on core infrastructure, in public, is reckless and wastes senior engineer time.

How useful are LLMs for coding?

Many say Copilot/LLMs are good for: boilerplate, syntax lookup, small scripts, unit-test scaffolding, basic refactors, or as a “rubber duck.” Some estimate ~20–30% productivity gains in those niches.
Others find them poor at C#/.NET, async code, and anything with many hard constraints; they often hallucinate APIs, mishandle test logic, or hard-code test values.
Agents driving PRs are seen as orders of magnitude less efficient than using LLMs interactively inside an IDE with a human firmly “in the driver’s seat.”
Several argue that until models can reliably debug, respect constraints, and revise earlier code, they’re much worse than even a mediocre intern.

Risks to quality, security, and open source

Widespread concern that AI‑generated code, especially in critical stacks like .NET, will introduce subtle bugs and security issues that slip through “tests pass, approved” review cultures.
Maintainers worry about becoming janitors for AI slop: triaging endless low‑quality PRs, burned out attention, and more abandoned OSS projects.
Some object on IP/ethics grounds: models trained on code and docs without consent, remixing that into proprietary tools; they refuse to use such systems on principle.

Economic and labor implications

Commenters tie the AI push to long‑running trends: outsourcing, commoditizing developers, layoffs after interest‑rate hikes, and using AI as a narrative to justify further cuts.
Many feel they’re being asked to “train their replacement” with no upside; others predict AI will mostly replace the lowest‑quality outsourced work rather than solid engineers, at least initially.
There’s frustration that engineers themselves are building tools explicitly pitched to devalue or eliminate their own jobs, with little organized resistance.

Trajectory and hype

Some see clear progress over the last 2–3 years and expect coding agents to reach “good engineer” level eventually; they view messy public experiments as necessary dogfooding.
Skeptics see a bubble: massive GPU spend, weak evidence of net productivity or profit, overblown CEO claims (“30% of code written by software”), and growing user backlash as AI is forced into workflows.
Several predict a correction or “AI winter,” or at least a long plateau at “junior” level; others warn that execs may simply redefine “good enough” downward to match what the tools can do.

Related topics