A ChatGPT mistake cost us $10k
Role of ChatGPT vs. Engineering Process
- Many argue the real problem was not ChatGPT but poor engineering practice: no proper review of generated code, weak testing, and missing monitoring/alerts.
- Others see ChatGPT as a “single point of failure”: it produced ORM code no one on the team really understood, creating a codebase the team didn’t “own” mentally.
- Some object to the title as misleading or clickbait; they reframe it as “we blindly trusted ChatGPT and lacked safeguards.”
The Actual Bug and Python/SQLAlchemy Footgun
- Core issue: `default=str(uuid.uuid4())` in a SQLAlchemy `Column` evaluates once at class definition, so each process reused the same UUID, triggering duplicate key violations.
- Several note this is analogous to Python’s “mutable default argument” trap and is a common mistake even among humans.
- Others point out SQLAlchemy’s API (same parameter for static value or callable) makes this error easy; suggestions include separate `default` vs `default_factory` or lints that reject static defaults on unique/PK columns.
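The footgun discussed above can be reproduced without a database. The sketch below is a minimal illustration, not SQLAlchemy itself: `Column` is a hypothetical stand-in that, like the API commenters describe, accepts either a static value or a callable through the same `default` parameter. `lint_unique_defaults` is likewise a hypothetical check in the spirit of the lint suggestion, not an existing tool.

```python
import uuid


class Column:
    """Hypothetical stand-in for an ORM column; NOT SQLAlchemy's real class."""

    def __init__(self, default=None, unique=False):
        self.default = default
        self.unique = unique

    def resolve_default(self):
        # One parameter serves both purposes: call it if callable,
        # otherwise return the stored value unchanged.
        return self.default() if callable(self.default) else self.default


# Buggy: uuid.uuid4() runs once, when this line executes, and the
# resulting string is then reused for every row in this process.
buggy_id = Column(default=str(uuid.uuid4()), unique=True)

# Fixed: the callable is invoked each time a default is needed.
fixed_id = Column(default=lambda: str(uuid.uuid4()), unique=True)

buggy_ids = [buggy_id.resolve_default() for _ in range(3)]
fixed_ids = [fixed_id.resolve_default() for _ in range(3)]


def lint_unique_defaults(columns):
    """Return names of unique columns whose default is a static value."""
    return [
        name
        for name, col in columns.items()
        if col.unique and col.default is not None and not callable(col.default)
    ]


flagged = lint_unique_defaults({"buggy_id": buggy_id, "fixed_id": fixed_id})
```

Here `buggy_ids` contains the same string three times while `fixed_ids` contains three distinct values, and `flagged` names only the buggy column. In SQLAlchemy the corresponding fix is to pass a callable, for example `default=lambda: str(uuid.uuid4())`, instead of the already-evaluated string.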
Testing, Logging, and Monitoring Failures
- Repeated criticism that:
  - No tests created multiple rows in the same table in one run.
  - Logs and alerts for DB constraint errors were absent or unused; this should have been a 5‑minute diagnosis from “duplicate key” errors.
- Deploying directly to production, at night, with 10–20 commits/day and no observability is called reckless.
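As a rough sketch of the kind of test commenters had in mind, the example below inserts two rows in one run and logs any constraint violation. It uses the standard-library `sqlite3` module rather than the team's actual database, and the `items` table, the helper names, and the logger name are all assumptions made for illustration.

```python
import logging
import sqlite3
import uuid

logger = logging.getLogger("db")  # hypothetical logger name

# Evaluated once at import time, mirroring the buggy static default.
STATIC_ID = str(uuid.uuid4())


def buggy_default():
    return STATIC_ID


def fixed_default():
    return str(uuid.uuid4())


def two_inserts_succeed(make_id):
    """Insert two rows into a fresh table; report whether both succeed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY)")
    try:
        for _ in range(2):
            conn.execute("INSERT INTO items (id) VALUES (?)", (make_id(),))
        return True
    except sqlite3.IntegrityError as exc:
        # A logged (and alerted-on) constraint error points straight
        # at the duplicate key.
        logger.error("constraint violation on items.id: %s", exc)
        return False
    finally:
        conn.close()
```

With `fixed_default` both inserts succeed; with `buggy_default` the second insert raises `sqlite3.IntegrityError`, which is the failure a single-row test never exercises.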
Architecture and Stack Choices
- Many question rewriting a working NextJS/TS backend to Python/FastAPI before having real traction, especially in a stack the team was weak in.
- Overprovisioning (“8 ECS tasks × 5 instances” for tiny traffic) is seen as symptomatic of credit-fueled, wasteful startup culture.
LLMs in Production Code
- Some consider LLMs useful, but only if their output is reviewed like code from a junior/intern and thoroughly understood.
- Others are more pessimistic: LLMs are inherently non-deterministic “word generators,” so relying on them for business logic is seen as irresponsible.
- A minority notes that the same bug could have been written by humans; the key is process (tests, review, observability), not the tool.
Meta: Postmortem and Reputation
- Mixed reactions to publishing the story: some praise the honesty and see it as a useful cautionary tale; others say it harms the company’s credibility without offering deep or actionable takeaways.