Log by time, not by count

Logs vs Metrics: Definitions and Roles

  • Many commenters say the post is really about metrics, not logs: “logging by time” is essentially emitting metrics at a fixed interval.
  • Common framing:
    • Logs = discrete, human-readable events for diagnostics and postmortem analysis.
    • Metrics = quantitative measurements over time, usually aggregated, used for dashboards, alerting, capacity planning.
  • Several note that at scale logs should be structured (JSON/logfmt) so they can be filtered and partially treated like metrics, but the conceptual goals differ.
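
To make the structured-logs point concrete, here is a minimal sketch of how JSON log lines can be filtered and counted after the fact. The helper names `log_event` and `count_by` and the `status` and `path` fields are hypothetical, chosen only for illustration; they do not come from the post or the thread.

```python
import json
from collections import Counter
from typing import Iterable


def log_event(event: str, **fields: object) -> str:
    """Serialize one discrete event as a single JSON line."""
    return json.dumps({"event": event, **fields}, sort_keys=True)


def count_by(lines: Iterable[str], key: str) -> Counter:
    """Derive a metric-like count from structured log lines by grouping on one field."""
    counts: Counter = Counter()
    for line in lines:
        record = json.loads(line)
        if key in record:
            counts[record[key]] += 1
    return counts


# Three discrete events; each line is still readable on its own.
lines = [
    log_event("request", status=200, path="/a"),
    log_event("request", status=500, path="/b"),
    log_event("request", status=200, path="/a"),
]
status_counts = count_by(lines, "status")
```

The count is derived from the events rather than emitted by the application, which is why the thread treats this as only partially metric-like: each line still exists to explain one event.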

Time-Based vs Count-Based Logging

  • Support for the post’s intuition: count-based “every N items” logs can overwhelm readers and backends; time-based summaries are often what humans actually want.
  • Critiques: if you want periodic summaries, that’s a metric; use a metrics system instead of repurposing logs.
  • Some point out a subtle bug: if the processing loop blocks while waiting for work, the elapsed-time check runs only when an item arrives, so “log every T seconds” does not guarantee a consistent log rate.
  • Others argue time-based throttling is useful in multithreaded code because it avoids contended global counters.
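
As a rough illustration of the time-based approach discussed above, the sketch below emits at most one summary per interval regardless of how many items arrive. The class name `TimedProgressLogger`, its parameters, and the message format are assumptions made for this example, not the post's implementation. The clock is injectable so the behavior can be checked without sleeping.

```python
import time
from typing import Callable


class TimedProgressLogger:
    """Emit at most one summary per `interval` seconds, however many items arrive."""

    def __init__(self, interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 emit: Callable[[str], None] = print) -> None:
        self.interval = interval
        self.clock = clock
        self.emit = emit
        self.last_emit = clock()
        self.count_since_emit = 0
        self.total = 0

    def record(self, n: int = 1) -> None:
        """Count n processed items, then log a summary if the interval has elapsed."""
        self.count_since_emit += n
        self.total += n
        self.tick()

    def tick(self) -> None:
        """Check the clock without recording work; call this on idle wake-ups too."""
        now = self.clock()
        elapsed = now - self.last_emit
        if elapsed >= self.interval:
            rate = self.count_since_emit / elapsed
            self.emit(f"processed={self.total} recent={self.count_since_emit} "
                      f"rate={rate:.1f}/s")
            self.last_emit = now
            self.count_since_emit = 0
```

The separate `tick()` method is one way to address the blocking concern: a loop that waits on `queue.Queue.get(timeout=...)` can call `tick()` when `queue.Empty` is raised, so a summary is still produced while no work arrives. This sketch is not thread-safe; sharing one instance across threads would need a lock or per-thread instances.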

Observability Practices and Tooling

  • Strong SRE/ops sentiment:
    • Logs are for “why,” metrics are for “is it healthy,” and tracing is for following a request across services.
    • Do not rely on logs for health checks or alerting; use dedicated metrics (Prometheus, Datadog, etc.) and health endpoints.
  • Modern observability stacks ingest structured events, then derive metrics and traces later (OpenTelemetry, columnar backends, “wide events”).

Volume, Sampling, and Aggregation

  • At high volume you cannot log everything:
    • Metrics aggregate (counts, sums, max, etc.).
    • Logs are sampled or throttled (by time or probability).
    • Traces are sampled at the “request/span” level.
  • Several emphasize “filter and aggregate after ingestion, not in application code,” if storage allows.
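
The contrast between aggregating metrics and sampling logs can be sketched as follows. `Aggregate` and `SampledLog` are hypothetical names for this example, and the 1% sample rate and the synthetic latency values are arbitrary assumptions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Aggregate:
    """Running count, sum, and max: the metric side sees every observation."""
    count: int = 0
    total: float = 0.0
    maximum: float = float("-inf")

    def observe(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.maximum = max(self.maximum, value)


class SampledLog:
    """Keep each log line with probability `rate`: the log side drops most lines."""

    def __init__(self, rate: float, rng: Optional[random.Random] = None) -> None:
        self.rate = rate
        self.rng = rng or random.Random()
        self.kept: List[str] = []

    def maybe_log(self, message: str) -> None:
        # random() returns a float in [0.0, 1.0), so rate=1.0 keeps everything
        # and rate=0.0 keeps nothing.
        if self.rng.random() < self.rate:
            self.kept.append(message)


latency = Aggregate()
sampled = SampledLog(rate=0.01, rng=random.Random(42))  # seeded for repeatability
for i in range(10_000):
    ms = float(i % 250)  # synthetic latency values, for illustration only
    latency.observe(ms)
    sampled.maybe_log(f"request {i} took {ms} ms")
```

The aggregate stays exact at constant memory, while the sampled log keeps only a small fraction of lines, each with full detail.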

Logging Best Practices and Pitfalls

  • Recommended patterns: per-request IDs, log important branches and errors, dynamically adjustable log levels (even per-user), structured logs.
  • Warnings against: using log search as a metrics system, unbounded verbose logging, and treating the format and behavior of the log stream as a production interface, which makes it hard to change later.
  • The distinction between program logs, audit logs (e.g. flight data recorders), write-ahead logs, and event-sourcing streams is noted as often overlooked.
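
Some of the recommended patterns above (per-request IDs, structured logs, adjustable levels) can be combined using only Python's standard `logging` module, as in the sketch below. `JsonFormatter`, `make_logger`, and `handle_request` are hypothetical names for this example, and the `request_id` field name is an assumption.

```python
import io
import json
import logging
import uuid
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying the request ID."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(payload, sort_keys=True)


def make_logger(stream: TextIO, level: int = logging.INFO) -> logging.Logger:
    # A unique logger name avoids stacking handlers if this is called twice.
    logger = logging.getLogger(f"example.{uuid.uuid4().hex}")
    logger.propagate = False
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def handle_request(logger: logging.Logger, request_id: str, ok: bool) -> None:
    # LoggerAdapter attaches request_id to every record logged through it.
    log = logging.LoggerAdapter(logger, {"request_id": request_id})
    log.debug("starting request")
    if ok:
        log.info("request succeeded")
    else:
        log.error("request failed")


stream = io.StringIO()
logger = make_logger(stream)
handle_request(logger, "req-1", ok=False)   # DEBUG line is suppressed at INFO
logger.setLevel(logging.DEBUG)              # raise verbosity at runtime
handle_request(logger, "req-2", ok=True)    # DEBUG line now appears
```

Changing the level at runtime here is process-wide; the per-user adjustment mentioned in the thread would need extra filtering logic that this sketch does not attempt.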