A ChatGPT mistake cost us $10k

Role of ChatGPT vs. Engineering Process

  • Many argue the real problem was not ChatGPT but poor engineering practice: no proper review of generated code, weak testing, and missing monitoring/alerts.
  • Others see ChatGPT as a “single point of failure”: it produced ORM code no one on the team really understood, creating a codebase the team didn’t “own” mentally.
  • Some object to the title as misleading or clickbait; they reframe it as “we blindly trusted ChatGPT and lacked safeguards.”

The Actual Bug and Python/SQLAlchemy Footgun

  • Core issue: default=str(uuid.uuid4()) in a SQLAlchemy Column is evaluated once, when the model class is defined at import time, so every insert from a given process reused the same UUID and triggered duplicate key violations.
  • Several note this is analogous to Python’s “mutable default argument” trap, where a default expression is likewise evaluated only once at definition time, and that human developers commonly make the same mistake.
  • Others point out that SQLAlchemy’s API (the same default parameter accepts either a static value or a callable) makes this error easy; suggestions include separate default and default_factory parameters, or lint rules that reject static defaults on unique/PK columns.
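
The footgun described above can be reproduced without a database. The sketch below is only an illustration: the Column class is a hypothetical stand-in that mimics the "value or callable" behavior of an ORM default, not SQLAlchemy's real class, and has_static_pk_default is an assumed name for the kind of lint check commenters suggested.

```python
import uuid


class Column:
    """Hypothetical stand-in for an ORM column; NOT SQLAlchemy's real class."""

    def __init__(self, default=None, primary_key=False):
        self.default = default
        self.primary_key = primary_key

    def next_default(self):
        # Accept either a plain value or a callable, as the thread describes.
        return self.default() if callable(self.default) else self.default


# Buggy: uuid4() runs once, right here, and the resulting string is stored.
buggy_id = Column(default=str(uuid.uuid4()), primary_key=True)

# Fixed: pass a callable so a new UUID is generated for every row.
fixed_id = Column(default=lambda: str(uuid.uuid4()), primary_key=True)


def has_static_pk_default(column):
    """Lint-style check: flag a primary-key column whose default is a fixed value."""
    return (
        column.primary_key
        and column.default is not None
        and not callable(column.default)
    )


# Simulate three inserts per column and count the distinct IDs produced.
buggy_values = {buggy_id.next_default() for _ in range(3)}
fixed_values = {fixed_id.next_default() for _ in range(3)}
print(len(buggy_values), len(fixed_values))  # prints: 1 3
```

In SQLAlchemy itself, the corresponding fix is to pass a callable, for example default=lambda: str(uuid.uuid4()), instead of the already-evaluated string.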

Testing, Logging, and Monitoring Failures

  • Repeated criticisms:
    • No tests created multiple rows in the same table in one run.
    • Logs and alerts for DB constraint errors were absent or unused; this should have been a 5‑minute diagnosis from “duplicate key” errors.
    • Deploying directly to production, at night, with 10–20 commits/day and no observability is called reckless.
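
The kind of test commenters found missing can be small. The following is a minimal sketch, assuming an in-memory SQLite table in place of the real database; the users table, the helper names, and the two ID generators are all hypothetical and only illustrate the idea of inserting more than one row in a single run.

```python
import sqlite3
import uuid


def new_db():
    # In-memory database with a primary-key constraint on the id column.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
    return conn


def can_insert_rows(make_id, count=2):
    """Insert `count` rows in one run; return False on a constraint violation."""
    conn = new_db()
    try:
        for i in range(count):
            conn.execute(
                "INSERT INTO users (id, name) VALUES (?, ?)",
                (make_id(), f"user-{i}"),
            )
    except sqlite3.IntegrityError as exc:
        # In production this is the error that should be logged and alerted on.
        print(f"constraint error: {exc}")
        return False
    finally:
        conn.close()
    return True


# Mimics the bug: the UUID is computed once and reused for every row.
frozen_id = str(uuid.uuid4())


def buggy_make_id():
    return frozen_id


def fixed_make_id():
    # A new UUID per call, as a callable default would provide.
    return str(uuid.uuid4())


buggy_ok = can_insert_rows(buggy_make_id)
fixed_ok = can_insert_rows(fixed_make_id)
print(buggy_ok, fixed_ok)  # prints: False True
```

A test that inserts only one row passes with either generator, which is consistent with how the bug slipped through; the second insert is what exposes it.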

Architecture and Stack Choices

  • Many question rewriting a working NextJS/TS backend in Python/FastAPI before having real traction, especially since the team was weaker in the new stack.
  • Overprovisioning (“8 ECS tasks × 5 instances” for tiny traffic) is seen as symptomatic of credit-fueled, wasteful startup culture.

LLMs in Production Code

  • Some consider LLMs useful, but only if their output is reviewed like code from a junior developer or intern and thoroughly understood.
  • Others are more pessimistic: LLMs are inherently non-deterministic “word generators,” so relying on them for business logic is seen as irresponsible.
  • A minority notes that the same bug could have been written by humans; the key is process (tests, review, observability), not the tool.

Meta: Postmortem and Reputation

  • Mixed reactions to publishing the story: some praise the honesty and see it as a useful cautionary tale; others say it harms the company’s credibility without offering deep or actionable takeaways.