YAGRI: You are gonna read it

YAGRI vs YAGNI and scope of the advice

  • Many agree that “you are gonna read it” fits operational metadata (timestamps, user IDs) because these are repeatedly needed for debugging and support.
  • Others warn this can be misread as “index/store everything just in case”, leading to over-collection of user data and increased liability.
  • Some see the principle as too generic: useful in many systems, overkill or misapplied in tiny B2B apps.
  • Observers note the irony that the same field can be argued as YAGRI by one person and YAGNI by another.

Timestamps, booleans, and schema design habits

  • Widespread support for always having created_at / updated_at, sometimes updated_by, and for standard interfaces or mixins that keep these fields maintained automatically.
  • Counterpoint: at billions of rows, multiple 8‑byte timestamps per row materially affect disk and RAM.
  • Several recommend using nullable timestamps instead of booleans (deleted_at instead of is_deleted), and in general treat booleans as a “code smell” that often wants to become enums, timestamps, or separate tables.
  • Some highlight subtle pitfalls (e.g., a zero/epoch timestamp coercing to a falsy value) but see them as implementation details, not reasons to avoid the pattern.
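The timestamps-over-booleans pattern the thread keeps returning to can be sketched briefly. This is a minimal illustration using Python's built-in sqlite3; the `users` table and its columns are invented for the example, not taken from the discussion:

```python
import sqlite3
from datetime import datetime, timezone

# A deleted_at timestamp replaces an is_deleted boolean: it records
# *when* something happened, not just *that* it did.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        email      TEXT NOT NULL,
        created_at TEXT NOT NULL,   -- ISO 8601, UTC
        updated_at TEXT NOT NULL,
        deleted_at TEXT             -- NULL means "not deleted"
    )
""")

now = datetime.now(timezone.utc).isoformat()
conn.execute(
    "INSERT INTO users (email, created_at, updated_at) VALUES (?, ?, ?)",
    ("a@example.com", now, now),
)

# "Soft delete": set the timestamp instead of flipping a boolean.
conn.execute(
    "UPDATE users SET deleted_at = ?, updated_at = ? WHERE id = 1",
    (now, now),
)

# The NULL check plays the role of the boolean, and the stored value
# answers the follow-up question "when was it deleted?" for free.
active = conn.execute(
    "SELECT COUNT(*) FROM users WHERE deleted_at IS NULL"
).fetchone()[0]
deleted = conn.execute(
    "SELECT email, deleted_at FROM users WHERE deleted_at IS NOT NULL"
).fetchall()
print(active, deleted)
```

The same column upgrades cleanly later: a boolean can only say yes/no, while the timestamp already supports retention windows ("purge rows deleted more than 30 days ago") without a migration.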

Soft deletes, archival, and legal concerns

  • Opinions strongly diverge:
    • Pro-soft-delete: avoids breaking references, supports history and undo, often combined with later archival.
    • Anti-soft-delete: adds query complexity, performance overhead, and risks forgetting to filter deleted rows; archive tables or separate “deleted” tables are preferred by some.
  • Others advocate hard deletes plus a robust audit log, or temporal tables/event-sourced models instead of deleted_at.
  • Legal and compliance constraints (e.g., GDPR “right to be forgotten” vs retention requirements) make soft delete a product/legal decision, not just a technical one.
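One mitigation for the "forgetting to filter deleted rows" complaint is to route reads through a view of live rows. A minimal sketch, again with sqlite3 and invented names (`orders`, `live_orders`):

```python
import sqlite3

# Expose a view of live rows so ordinary queries cannot forget the
# deleted_at filter; application code reads the view, not the table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id         INTEGER PRIMARY KEY,
        item       TEXT NOT NULL,
        deleted_at TEXT              -- NULL = live
    );
    CREATE VIEW live_orders AS
        SELECT id, item FROM orders WHERE deleted_at IS NULL;

    INSERT INTO orders (item) VALUES ('widget'), ('gadget');
    UPDATE orders SET deleted_at = '2024-01-01T00:00:00Z'
        WHERE item = 'gadget';
""")

# Soft-deleted rows disappear from the view automatically.
rows = conn.execute("SELECT item FROM live_orders").fetchall()
print(rows)
```

ORMs offer the same idea as "default scopes"; either way, the filter lives in one place instead of in every query.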

Migrations, performance, and reliability

  • One camp views schema migrations (especially adding columns) as a solved, routine problem in mature frameworks.
  • Another recounts “war stories”: migrations failing only in prod, noisy long‑running DDL impacting other workloads, and complex down/rollback logic corrupting data.
  • This fuels the argument that getting core metadata fields right up front reduces risky schema churn later.
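For the "routine" end of the spectrum, an additive column migration can be made safe to retry, which defuses several of the war stories above. A hypothetical sketch (the helper name and tables are invented; real migration frameworks track applied versions instead of probing the catalog):

```python
import sqlite3

def add_column_if_missing(conn, table, column, ddl_type):
    """Idempotent additive migration: check the catalog before issuing
    DDL, so rerunning after a partial failure is a no-op."""
    cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in cols:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY)")

add_column_if_missing(conn, "invoices", "updated_at", "TEXT")
add_column_if_missing(conn, "invoices", "updated_at", "TEXT")  # rerun: no-op

cols = [row[1] for row in conn.execute("PRAGMA table_info(invoices)")]
print(cols)
```

Note this only covers the easy case: adding a nullable column. The prod-only failures in the thread typically involve table rewrites, long locks, or backfills, which is exactly why starting with the right metadata columns is cheaper than retrofitting them.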

Audit logging, event sourcing, and alternatives

  • Many argue a well‑designed audit log (who/what/when/optionally why, and the ability to undo) is more powerful than sprinkling metadata on every table.
  • Event sourcing is presented as a stronger, but cognitively expensive, version of this: great in finance or highly audited domains, overkill and operationally tricky elsewhere (slow projections, schema evolution pain).
  • Some favor hybrid approaches: main tables as current state; logs, CDC streams, or temporal tables for history.
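The who/what/when(/why) audit-log shape described above can be sketched as a single append-only table beside the main tables. All names here are invented for illustration:

```python
import json
import sqlite3
from datetime import datetime, timezone

# One append-only audit table instead of metadata sprinkled everywhere.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE audit_log (
        id        INTEGER PRIMARY KEY,
        actor     TEXT NOT NULL,     -- who
        action    TEXT NOT NULL,     -- what
        target    TEXT NOT NULL,
        at        TEXT NOT NULL,     -- when
        reason    TEXT,              -- optionally why
        old_state TEXT               -- JSON snapshot, enables undo
    )
""")

def record(actor, action, target, old_state, reason=None):
    conn.execute(
        "INSERT INTO audit_log (actor, action, target, at, reason, old_state) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (actor, action, target,
         datetime.now(timezone.utc).isoformat(),
         reason, json.dumps(old_state)),
    )

record("alice", "delete", "user:42",
       {"email": "bob@example.com"}, reason="support request")

# "Undo" means reading back the snapshot, not reverse-engineering
# state from per-table metadata columns.
row = conn.execute(
    "SELECT actor, action, old_state FROM audit_log"
).fetchone()
print(row[0], row[1], json.loads(row[2]))
```

This is the middle ground between per-table columns and full event sourcing: current state stays in the main tables, while history lives in one place with a uniform shape.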

Ownership and process

  • Disagreement over who decides:
    • Some say engineers should “own their craft” and always add basic timestamps without asking.
    • Others insist that storing extra data, soft deletes, and retention behavior must be explicitly specified by product/legal, since they carry complexity and regulatory implications.