YAGRI: You are gonna read it
YAGRI vs YAGNI and scope of the advice
- Many agree that “you are gonna read it” fits operational metadata (timestamps, user IDs) because these are repeatedly needed for debugging and support.
- Others warn this can be misread as “index/store everything just in case”, leading to over-collection of user data and liability.
- Some see the principle as too generic: useful in many systems, overkill or misapplied in tiny B2B apps.
- Observers note the irony that the same field can be argued as YAGRI by one person and YAGNI by another.
Timestamps, booleans, and schema design habits
- Widespread support for always having `created_at`/`updated_at`, sometimes `updated_by`, and standard interfaces or mixins to maintain them.
- Counterpoint: at billions of rows, multiple 8‑byte timestamps per row materially affect disk and RAM.
- Several recommend using nullable timestamps instead of booleans (`deleted_at` instead of `is_deleted`), and in general treat booleans as a “code smell” that often wants to become enums, timestamps, or separate tables.
- Some highlight subtle pitfalls (e.g., zero timestamps mapping to falsey values) but see them as implementation details, not reasons to avoid the pattern.
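The two habits above can be shown in one table definition. This is a minimal sketch, not a recommended production schema: the `orders` table and its columns are hypothetical, and SQLite is used only because it is easy to run inline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        status      TEXT NOT NULL,
        created_at  TEXT NOT NULL DEFAULT (datetime('now')),
        updated_at  TEXT NOT NULL DEFAULT (datetime('now')),
        updated_by  TEXT,   -- who last touched the row
        deleted_at  TEXT    -- NULL = live; a timestamp replaces is_deleted
    )
""")
conn.execute("INSERT INTO orders (status) VALUES ('open')")

# "Deleting" records *when* the row was removed, not just *that* it was.
conn.execute("UPDATE orders SET deleted_at = datetime('now') WHERE id = 1")

row = conn.execute("SELECT deleted_at FROM orders WHERE id = 1").fetchone()
print(row[0] is not None)
```

The point of the `deleted_at` pattern: the boolean (`deleted_at IS NOT NULL`) is always derivable from the timestamp, but the timestamp can never be recovered from the boolean.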
Soft deletes, archival, and legal concerns
- Opinions strongly diverge:
  - Pro-soft-delete: avoids breaking references, supports history and undo, often combined with later archival.
  - Anti-soft-delete: adds query complexity, performance overhead, and risks forgetting to filter deleted rows; archive tables or separate “deleted” tables are preferred by some.
- Others advocate hard deletes plus a robust audit log, or temporal tables/event-sourced models instead of `deleted_at`.
- Legal and compliance constraints (e.g., GDPR “right to be forgotten” vs retention requirements) make soft delete a product/legal decision, not just a technical one.
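The “risks forgetting to filter” objection, and one common mitigation, can be sketched concretely. The table names here are illustrative; the idea is that a database view centralizes the `deleted_at IS NULL` filter so individual queries cannot forget it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, deleted_at TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [("a@example.com",), ("b@example.com",)])
conn.execute("UPDATE users SET deleted_at = datetime('now') WHERE id = 2")

# Naive query: silently includes the soft-deleted row.
naive = conn.execute("SELECT count(*) FROM users").fetchone()[0]       # 2

# A view bakes the filter in, so callers see only live rows.
conn.execute("CREATE VIEW live_users AS SELECT * FROM users WHERE deleted_at IS NULL")
live = conn.execute("SELECT count(*) FROM live_users").fetchone()[0]   # 1
print(naive, live)
```

Note this sketch does nothing about the GDPR concern: a soft-deleted row still stores the user's data, which is why the thread treats retention as a product/legal decision.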
Migrations, performance, and reliability
- One camp views schema migrations (especially adding columns) as a solved, routine problem in mature frameworks.
- Another recounts “war stories”: migrations failing only in prod, noisy long‑running DDL impacting other workloads, and complex down/rollback logic corrupting data.
- This fuels the argument that getting core metadata fields right up front reduces risky schema churn later.
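The “adding columns is routine” position usually assumes the migration is purely additive and backward compatible. A minimal sketch of that happy path, using SQLite and a hypothetical `invoices` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, total INTEGER)")
conn.execute("INSERT INTO invoices (total) VALUES (100)")

# The routine case: an additive, nullable column with no table rewrite.
# Old code keeps working because it never references the new column.
conn.execute("ALTER TABLE invoices ADD COLUMN updated_by TEXT")

# Existing rows get NULL; new writes can populate the field immediately.
existing = conn.execute("SELECT updated_by FROM invoices WHERE id = 1").fetchone()[0]
conn.execute("UPDATE invoices SET updated_by = 'worker-1' WHERE id = 1")
```

The war stories in the thread typically involve the cases this sketch avoids: backfilling defaults across huge tables, long-running DDL that locks writers, or destructive down-migrations.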
Audit logging, event sourcing, and alternatives
- Many argue a well‑designed audit log (who/what/when/optionally why, and the ability to undo) is more powerful than sprinkling metadata on every table.
- Event sourcing is presented as a stronger, but cognitively expensive, version of this: great in finance or highly audited domains, overkill and operationally tricky elsewhere (slow projections, schema evolution pain).
- Some favor hybrid approaches: main tables as current state; logs, CDC streams, or temporal tables for history.
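The who/what/when/why-plus-undo shape of an audit log can be sketched as a separate append-only table that snapshots the prior state alongside each write. Everything here (the `accounts` table, the `update_balance` helper) is hypothetical; a real system might use triggers or CDC instead of application code.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("""
    CREATE TABLE audit_log (
        id        INTEGER PRIMARY KEY,
        actor     TEXT NOT NULL,                            -- who
        action    TEXT NOT NULL,                            -- what
        at        TEXT NOT NULL DEFAULT (datetime('now')),  -- when
        reason    TEXT,                                     -- optionally why
        old_state TEXT                                      -- enough to undo
    )
""")

def update_balance(actor, account_id, new_balance, reason=None):
    # Snapshot the old state first, so the log entry can support undo.
    old = conn.execute("SELECT balance FROM accounts WHERE id = ?",
                       (account_id,)).fetchone()
    conn.execute(
        "INSERT INTO audit_log (actor, action, reason, old_state) VALUES (?, ?, ?, ?)",
        (actor, f"update_balance:{account_id}", reason,
         json.dumps({"balance": old[0] if old else None})))
    conn.execute(
        "INSERT INTO accounts (id, balance) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET balance = excluded.balance",
        (account_id, new_balance))

update_balance("alice", 1, 500, reason="initial load")
update_balance("bob", 1, 450, reason="refund")
```

The main table stays current-state only (the hybrid approach above); history lives entirely in `audit_log`, rather than being sprinkled across every table as extra metadata columns.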
Ownership and process
- Disagreement over who decides:
  - Some say engineers should “own their craft” and always add basic timestamps without asking.
  - Others insist storing extra data, soft deletions, and retention behavior must be explicitly product/legally specified, since they carry complexity and regulatory implications.