Logging sucks
Single-purpose site and marketing angle
- Some readers like the interactive, “modern” presentation; others dislike the standalone domain and worry it will disappear, unlike a stable personal blog.
- Several conclude it functions as content marketing / lead generation: the “personalized report” form and tie-in to an observability SaaS (and indirectly Cloudflare) are noted.
- A few argue this is fine as long as the content is genuinely useful.
Wide events, structured logging, and observability
- Many see the core idea as “structured logs with rich context per request,” often already practiced with correlation/request IDs and JSON (a minimal sketch follows this list).
- Supporters say wide events make it easy to answer product and incident questions: who is impacted, which flows, which customers, which experiments, etc.
- Others argue this is not new; similar ideas exist in observability tools, structured logging libraries, tracing systems, and event sourcing.
- Some emphasize schema discipline (e.g., standard field names, “related.*” fields) to avoid chaos in wide logs.
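A minimal sketch of what a wide event could look like, assuming Python’s standard logging module and illustrative field names (user.id, experiment.*, related.order_id); this is one reading of the pattern discussed above, not the article’s own implementation:

```python
import json
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("wide_events")

def handle_request(request, user):
    # Accumulate one wide, schema-disciplined event per request instead of
    # scattering context across many small log lines.
    event = {
        "request_id": str(uuid.uuid4()),              # correlation id, propagated downstream
        "http.method": request["method"],
        "http.path": request["path"],
        "user.id": user["id"],                        # who is impacted
        "user.plan": user["plan"],                    # which customers/tiers
        "experiment.checkout_flow": "variant_b",      # which experiments
        "related.order_id": request.get("order_id"),  # "related.*" naming convention
    }
    start = time.monotonic()
    try:
        do_work(request)
        event["outcome"] = "success"
    except Exception as exc:
        event["outcome"] = "error"
        event["error.type"] = type(exc).__name__
        raise
    finally:
        event["duration_ms"] = round((time.monotonic() - start) * 1000, 2)
        logger.info(json.dumps(event))                # exactly one structured line per request

def do_work(request):
    pass  # stand-in for the actual business logic

handle_request({"method": "POST", "path": "/checkout", "order_id": "o-123"},
               {"id": "u-42", "plan": "pro"})
```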
System architecture and “15 services per request”
- The claim that a single request may touch many services sparks a microservices vs monolith debate.
- One camp calls such architectures “deeply broken” and driven by fashion or incentives rather than need; monoliths are said to be sufficient for most apps.
- Another camp provides detailed real-world examples (e.g., mobility/bike‑sharing) where many distinct concerns and teams justify multiple services, or at least clear internal boundaries.
- Discussion highlights that service boundaries are a trade-off: flexibility, independent deployment and scaling vs added latency, complexity, and logging/tracing difficulty.
Logs vs metrics, traces, and audits
- Several stress that logs, metrics, traces, and audits serve different purposes, differentiated mainly by how much loss of fidelity each can tolerate.
- Others propose a “grand theory of observability”: treat all signals as events that can be transformed into each other, with different storage and durability policies (a toy illustration follows this list).
- Large subthread debates audit logging: some insist audit streams must be transactionally durable across fault domains; others say regulators only require “reasonable” durability and availability.
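A toy illustration of the “everything is an event” reading above: the same event stream can be reduced to metrics by lossy aggregation (cheap to retain) while the raw events stay available under a stricter durability policy. The function names and bucket bounds here are assumptions for illustration, not anything proposed in the thread:

```python
from collections import Counter, defaultdict

# A handful of wide events; in the "signals are all events" view, metrics are
# just lossy aggregations of this same stream kept under a cheaper retention policy.
events = [
    {"path": "/checkout", "outcome": "error",   "duration_ms": 812},
    {"path": "/checkout", "outcome": "success", "duration_ms": 95},
    {"path": "/search",   "outcome": "success", "duration_ms": 40},
]

def to_counters(stream):
    """Derive a request-count metric keyed by (path, outcome)."""
    return Counter((e["path"], e["outcome"]) for e in stream)

def to_latency_buckets(stream, bounds=(50, 100, 250, 500, 1000)):
    """Derive a crude latency histogram; fidelity is deliberately thrown away here."""
    hist = defaultdict(int)
    for e in stream:
        bucket = next((b for b in bounds if e["duration_ms"] <= b), float("inf"))
        hist[bucket] += 1
    return dict(hist)

print(to_counters(events))         # e.g. {('/checkout', 'error'): 1, ...}
print(to_latency_buckets(events))  # e.g. {1000: 1, 100: 1, 50: 1}
```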
Performance, sampling, and logging infrastructure
- The recommendation to log every error/slow request while sampling healthy ones is criticized: in degraded states, log volume spikes can overload systems.
- Counter-argument: a production service should be designed so log ingestion scales, with buffering, backpressure, and best-effort dropping where acceptable.
- There’s debate on realistic throughput: some claim properly designed systems can handle enormous event rates; others note typical end‑to‑end stacks fall over at much lower volumes.
- Adaptive and bucketed sampling strategies, or sampling primarily at the log backend, are mentioned as practical mitigations (one possible shape is sketched below).
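A hedged sketch of the sampling decision being debated: keep errors and slow requests, sample healthy ones, and fall back to best-effort dropping behind a fixed per-second budget so a degraded system cannot flood the pipeline. The thresholds, the budget size, and the class name are illustrative assumptions, not prescriptions from the thread:

```python
import random
import time

class LogSampler:
    """Keep every error and slow request, sample healthy ones, and apply a
    fixed per-second budget as a crude guard against the flood scenario:
    when everything is erroring, "log all errors" would otherwise spike volume."""

    def __init__(self, healthy_rate=0.01, slow_ms=500, max_events_per_sec=1000):
        self.healthy_rate = healthy_rate
        self.slow_ms = slow_ms
        self.max_events_per_sec = max_events_per_sec
        self._window_start = time.monotonic()
        self._window_count = 0

    def _under_budget(self):
        now = time.monotonic()
        if now - self._window_start >= 1.0:
            self._window_start, self._window_count = now, 0
        if self._window_count < self.max_events_per_sec:
            self._window_count += 1
            return True
        return False  # best-effort drop once the budget for this second is spent

    def should_keep(self, event):
        interesting = event["outcome"] != "success" or event["duration_ms"] > self.slow_ms
        if interesting:
            return self._under_budget()
        return random.random() < self.healthy_rate and self._under_budget()

sampler = LogSampler()
print(sampler.should_keep({"outcome": "error", "duration_ms": 20}))    # True while under budget
print(sampler.should_keep({"outcome": "success", "duration_ms": 30}))  # usually False
```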
Tooling and implementation practices
- Many say correlation/request IDs plus structured logs (often JSON) and tools like Splunk, Kibana, Loki, Tempo, or ClickHouse already solve most issues.
- Several argue that OpenTelemetry tracing with “wide spans” effectively implements what the article describes (see the sketch after this list), but contest the article’s criticism of OTLP.
- Suggestions include: consistent schemas, user IDs on every log line, separate “audit” message types, log-based events feeding metrics systems, and using logs as an explicit product for internal consumers.
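For the “wide spans” point above, a minimal sketch using the OpenTelemetry Python API (requires the `opentelemetry-api` package); SDK/exporter setup is omitted, so the calls are no-ops but the sketch stays runnable, and the attribute names are illustrative rather than a fixed schema:

```python
from opentelemetry import trace

# Without SDK/exporter configuration the API returns a no-op tracer.
tracer = trace.get_tracer("checkout-service")

def handle_checkout(request, user):
    # One "wide" span per request: attach the same rich context a wide event
    # would carry, then let the tracing backend do the slicing and querying.
    with tracer.start_as_current_span("handle_checkout") as span:
        span.set_attribute("user.id", user["id"])
        span.set_attribute("user.plan", user["plan"])
        span.set_attribute("http.request.method", request["method"])
        span.set_attribute("order.id", request.get("order_id", ""))
        try:
            process(request)
            span.set_attribute("outcome", "success")
        except Exception as exc:
            span.record_exception(exc)
            span.set_attribute("outcome", "error")
            raise

def process(request):
    pass  # business logic goes here

handle_checkout({"method": "POST", "order_id": "o-123"}, {"id": "u-42", "plan": "pro"})
```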
Critiques of the article (tone, examples, AI)
- Some praise the substance and interactivity but find the “logging sucks” framing overdramatic, strawman-y, or condescending.
- Multiple commenters feel the prose is verbose and “AI‑ish,” though there’s disagreement on this; others think calling out AI assistance is becoming unhelpful.
- The initial bad‑log example is seen as unrealistic by some (because many already use correlation IDs), while others say they still see logs that bad in practice.
Miscellaneous insights
- Concerns are raised about logging sensitive or business-critical fields (like lifetime value) versus their usefulness for prioritization (a hypothetical redaction sketch follows this list).
- Several emphasize that logging, tracing, and metrics must be intentionally designed; tools alone don’t fix poor conventions.
- A recurring theme: monolith vs microservices and “wide events vs traditional logs” are both trade‑offs; success depends more on discipline, consistency, and clear goals than on any single pattern.
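On the sensitive-fields concern above, one hypothetical mitigation is to redact or coarsen such fields before an event is emitted; the field names and bucket bounds below are made up for illustration:

```python
# Hypothetical redaction pass: drop raw sensitive fields and coarsen
# business-critical ones (e.g. lifetime value) into buckets that stay
# useful for prioritization without exposing the exact number.
SENSITIVE_FIELDS = {"user.email", "payment.card_number"}

def ltv_bucket(value):
    for bound in (100, 1_000, 10_000):
        if value < bound:
            return f"<{bound}"
    return ">=10000"

def redact(event):
    clean = {k: v for k, v in event.items() if k not in SENSITIVE_FIELDS}
    if "user.lifetime_value" in clean:
        clean["user.ltv_bucket"] = ltv_bucket(clean.pop("user.lifetime_value"))
    return clean

print(redact({
    "user.email": "a@example.com",
    "user.lifetime_value": 2500,
    "http.path": "/checkout",
}))  # {'http.path': '/checkout', 'user.ltv_bucket': '<10000'}
```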