2026-06-21

When I reject AI code even if it works

When “working” code is still rejected

Many argue that “it runs and tests are green” is only a minimum bar.
Common rejection reasons: the author can’t explain the approach, the diff is larger than the problem, abstractions are premature, the code is harder to reason about or maintain, or the author trusts AI output more than their own understanding.
Several note this is identical to how good reviewers should treat human-written code (e.g., from juniors or contractors).

Risk, accountability, and domains

Higher scrutiny is demanded for safety‑critical or high‑value systems (payments, infrastructure, core backends) than for internal tools, hobby projects, or simple static sites.
On‑call and support expectations: if you can’t debug or explain the code at 3am, you shouldn’t ship it.
Some find it unethical to merge AI code they haven’t truly reviewed for critical systems.

Quality of AI-generated code

Frequent complaints: excessive abstractions, over‑engineering, duplicated utilities, unnecessary columns/fields, front‑end data munging instead of proper backend/DB use, and “code salad” in complex domains.
LLMs are described as increasing entropy: without strong constraints, their output diverges rather than converges.
Subtle bugs (e.g., ML data leakage, accounting corner cases) are easy for AI to miss and hard for novices to detect.
Others counter that average agents already outperform many “enterprise” programmers on basic structure and cleanliness.
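
To make the "subtle bug" complaint concrete, here is a minimal sketch of one common form of ML data leakage: scaling features with statistics computed over all rows, including the held-out ones. The function names and the toy data are illustrative assumptions, not taken from the discussion. Both versions run without error, which is why this kind of bug can pass a "tests are green" bar.

```python
from statistics import mean, pstdev


def standardize(values, mu, sigma):
    """Scale values to zero mean / unit variance using the given statistics."""
    return [(v - mu) / sigma for v in values]


def leaky_split_scale(values, n_train):
    # BUG: mean and std are computed over ALL rows, so information about
    # the held-out rows leaks into the training features.
    mu, sigma = mean(values), pstdev(values)
    scaled = standardize(values, mu, sigma)
    return scaled[:n_train], scaled[n_train:]


def clean_split_scale(values, n_train):
    # Split first, then fit the scaling statistics on the training rows only.
    train, test = values[:n_train], values[n_train:]
    mu, sigma = mean(train), pstdev(train)
    return standardize(train, mu, sigma), standardize(test, mu, sigma)


# Toy data (hypothetical): the last row is the held-out "test" point.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
leaky_train, leaky_test = leaky_split_scale(data, 4)
clean_train, clean_test = clean_split_scale(data, 4)

print("leaky train mean:", mean(leaky_train))  # shifted by the held-out outlier
print("clean train mean:", mean(clean_train))  # approximately zero
```

The two functions differ by a single reordering of "split" and "fit", which is the kind of difference a reviewer has to understand the domain to catch.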

Workflows and guardrails

Some build elaborate harnesses: custom linters for “AI-isms,” strict pre‑commit checks, TDD enforced by scripts, multi‑model cross‑review, and periodic AI‑assisted audits.
Others use AI mainly as a pair‑programmer: plans, small snippets, syntax help, porting between languages/frameworks, or expanding comment-level pseudocode.
A few openly merge AI code they don’t fully understand, relying on tests and rapid reaction to failures; others label this as risky or irresponsible.
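
As a rough illustration of the "custom linter plus pre-commit check" idea, the sketch below scans the added lines of a unified diff for a few suspect patterns and enforces a size budget. The pattern list, the `check_diff` name, and the 200-line default are all hypothetical choices for illustration; the discussion does not specify what any particular team checks for.

```python
import re

# Hypothetical "AI-ism" patterns; a real team would curate its own list.
SUSPECT_PATTERNS = {
    "placeholder comment": re.compile(r"in a real implementation", re.IGNORECASE),
    "swallowed exception": re.compile(r"except\s+Exception\s*:\s*pass"),
    "unfinished stub": re.compile(r"TODO:\s*implement", re.IGNORECASE),
}


def added_lines(diff_text):
    """Yield (line number within the diff, text) for each added line.

    Simplification: any line starting with '+++' is treated as a file header,
    so added content that itself begins with '++' would be skipped.
    """
    for i, line in enumerate(diff_text.splitlines(), start=1):
        if line.startswith("+") and not line.startswith("+++"):
            yield i, line[1:]


def check_diff(diff_text, max_added_lines=200):
    """Return a list of human-readable findings; an empty list means pass."""
    findings = []
    added = list(added_lines(diff_text))
    # "Diff is larger than the problem": flag changes over the line budget.
    if len(added) > max_added_lines:
        findings.append(f"diff adds {len(added)} lines (budget {max_added_lines})")
    for lineno, text in added:
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(text):
                findings.append(f"diff line {lineno}: {label}")
    return findings
```

One way to wire this up, assuming a Git repository, would be a pre-commit hook that feeds the output of `git diff --cached` to `check_diff` and aborts the commit when the returned list is non-empty. A check like this only catches surface patterns; it does not replace a reviewer who understands the change.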

Organizational dynamics and incentives

Existing bad incentives—rewarding big, fast commits and cowboy behavior—are seen as amplified by AI, leading to faster tech debt accumulation.
Some foresee “software bankruptcy”–style failures, or at least slower delivery, high senior churn, and rewrite attempts that may also fail.

Middle ground and long-term concerns

Debate over whether a stable “middle ground” exists between no AI and full agentic development.
Worries about deskilling, comprehension debt, and future systems no one understands.
Others see AI as a “faster keyboard” and argue the real constraint remains human understanding and architecture, which must not be outsourced.
