YAGRI: You are gonna read it

YAGRI vs YAGNI and scope of the advice

  • Many agree that “you are gonna read it” fits operational metadata (timestamps, user IDs) because these are repeatedly needed for debugging and support.
  • Others warn this can be misread as “index/store everything just in case”, leading to over-collection of user data and increased liability.
  • Some see the principle as too generic: useful in many systems, overkill or misapplied in tiny B2B apps.
  • Observers note the irony that the same field can be argued as YAGRI by one person and YAGNI by another.

Timestamps, booleans, and schema design habits

  • Widespread support for always having created_at / updated_at, sometimes updated_by, and for standard interfaces or mixins that keep these fields maintained automatically.
  • Counterpoint: at billions of rows, multiple 8‑byte timestamps per row materially affect disk and RAM.
  • Several recommend using nullable timestamps instead of booleans (deleted_at instead of is_deleted), and in general treat booleans as a “code smell” that often wants to become enums, timestamps, or separate tables.
  • Some highlight subtle pitfalls (e.g., a zero/epoch timestamp coercing to a falsy value) but see them as implementation details, not reasons to avoid the pattern.
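The timestamps-over-booleans pattern the thread keeps returning to can be sketched briefly. This is a minimal illustration using Python's built-in sqlite3; the `users` table and its columns are invented for the example, not taken from the discussion:

```python
import sqlite3
from datetime import datetime, timezone

# A deleted_at timestamp replaces an is_deleted boolean: it records
# *when* something happened, not just *that* it did.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        email      TEXT NOT NULL,
        created_at TEXT NOT NULL,   -- ISO 8601, UTC
        updated_at TEXT NOT NULL,
        deleted_at TEXT             -- NULL means "not deleted"
    )
""")

now = datetime.now(timezone.utc).isoformat()
conn.execute(
    "INSERT INTO users (email, created_at, updated_at) VALUES (?, ?, ?)",
    ("a@example.com", now, now),
)

# "Soft delete": set the timestamp instead of flipping a boolean.
conn.execute(
    "UPDATE users SET deleted_at = ?, updated_at = ? WHERE id = 1",
    (now, now),
)

# The NULL check plays the role of the boolean, and the stored value
# answers the follow-up question "when was it deleted?" for free.
active = conn.execute(
    "SELECT COUNT(*) FROM users WHERE deleted_at IS NULL"
).fetchone()[0]
deleted = conn.execute(
    "SELECT email, deleted_at FROM users WHERE deleted_at IS NOT NULL"
).fetchall()
print(active, deleted)
```

The same column upgrades cleanly later: a boolean can only say yes/no, while the timestamp already supports retention windows ("purge rows deleted more than 30 days ago") without a migration.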

Soft deletes, archival, and legal concerns

  • Opinions strongly diverge:
    • Pro-soft-delete: avoids breaking references, supports history and undo, often combined with later archival.
    • Anti-soft-delete: adds query complexity, performance overhead, and risks forgetting to filter deleted rows; archive tables or separate “deleted” tables are preferred by some.
  • Others advocate hard deletes plus a robust audit log, or temporal tables/event-sourced models instead of deleted_at.
  • Legal and compliance constraints (e.g., GDPR “right to be forgotten” vs retention requirements) make soft delete a product/legal decision, not just a technical one.
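One mitigation for the "forgetting to filter deleted rows" complaint is to route reads through a view of live rows. A minimal sketch, again with sqlite3 and invented names (`orders`, `live_orders`):

```python
import sqlite3

# Expose a view of live rows so ordinary queries cannot forget the
# deleted_at filter; application code reads the view, not the table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id         INTEGER PRIMARY KEY,
        item       TEXT NOT NULL,
        deleted_at TEXT              -- NULL = live
    );
    CREATE VIEW live_orders AS
        SELECT id, item FROM orders WHERE deleted_at IS NULL;

    INSERT INTO orders (item) VALUES ('widget'), ('gadget');
    UPDATE orders SET deleted_at = '2024-01-01T00:00:00Z'
        WHERE item = 'gadget';
""")

# Soft-deleted rows disappear from the view automatically.
rows = conn.execute("SELECT item FROM live_orders").fetchall()
print(rows)
```

ORMs offer the same idea as "default scopes"; either way, the filter lives in one place instead of in every query.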

Migrations, performance, and reliability

  • One camp views schema migrations (especially adding columns) as a solved, routine problem in mature frameworks.
  • Another recounts “war stories”: migrations failing only in prod, noisy long‑running DDL impacting other workloads, and complex down/rollback logic corrupting data.
  • This fuels the argument that getting core metadata fields right up front reduces risky schema churn later.
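For the "routine" end of the spectrum, an additive column migration can be made safe to retry, which defuses several of the war stories above. A hypothetical sketch (the helper name and tables are invented; real migration frameworks track applied versions instead of probing the catalog):

```python
import sqlite3

def add_column_if_missing(conn, table, column, ddl_type):
    """Idempotent additive migration: check the catalog before issuing
    DDL, so rerunning after a partial failure is a no-op."""
    cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in cols:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY)")

add_column_if_missing(conn, "invoices", "updated_at", "TEXT")
add_column_if_missing(conn, "invoices", "updated_at", "TEXT")  # rerun: no-op

cols = [row[1] for row in conn.execute("PRAGMA table_info(invoices)")]
print(cols)
```

Note this only covers the easy case: adding a nullable column. The prod-only failures in the thread typically involve table rewrites, long locks, or backfills, which is exactly why starting with the right metadata columns is cheaper than retrofitting them.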

Audit logging, event sourcing, and alternatives

  • Many argue a well‑designed audit log (who/what/when/optionally why, and the ability to undo) is more powerful than sprinkling metadata on every table.
  • Event sourcing is presented as a stronger, but cognitively expensive, version of this: great in finance or highly audited domains, overkill and operationally tricky elsewhere (slow projections, schema evolution pain).
  • Some favor hybrid approaches: main tables as current state; logs, CDC streams, or temporal tables for history.
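The who/what/when(/why) audit-log shape described above can be sketched as a single append-only table beside the main tables. All names here are invented for illustration:

```python
import json
import sqlite3
from datetime import datetime, timezone

# One append-only audit table instead of metadata sprinkled everywhere.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE audit_log (
        id        INTEGER PRIMARY KEY,
        actor     TEXT NOT NULL,     -- who
        action    TEXT NOT NULL,     -- what
        target    TEXT NOT NULL,
        at        TEXT NOT NULL,     -- when
        reason    TEXT,              -- optionally why
        old_state TEXT               -- JSON snapshot, enables undo
    )
""")

def record(actor, action, target, old_state, reason=None):
    conn.execute(
        "INSERT INTO audit_log (actor, action, target, at, reason, old_state) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (actor, action, target,
         datetime.now(timezone.utc).isoformat(),
         reason, json.dumps(old_state)),
    )

record("alice", "delete", "user:42",
       {"email": "bob@example.com"}, reason="support request")

# "Undo" means reading back the snapshot, not reverse-engineering
# state from per-table metadata columns.
row = conn.execute(
    "SELECT actor, action, old_state FROM audit_log"
).fetchone()
print(row[0], row[1], json.loads(row[2]))
```

This is the middle ground between per-table columns and full event sourcing: current state stays in the main tables, while history lives in one place with a uniform shape.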

Ownership and process

  • Disagreement over who decides:
    • Some say engineers should “own their craft” and always add basic timestamps without asking.
    • Others insist that storing extra data, soft deletes, and retention behavior must be explicitly specified by product/legal, since they carry complexity and regulatory implications.