2024-06-14

The problem with OpenTelemetry

Complexity and Developer Experience

Many commenters describe OpenTelemetry (OTel) as conceptually heavy and hard to get started with, especially from the docs.
Python and JavaScript SDKs are called out for confusing global state, “god objects,” surprising behavior (e.g., header encoding), and silent failures.
Collector configuration is seen as powerful but hard to understand; multiple transport options (HTTP, protobuf-over-HTTP, gRPC) add to confusion.
Others report smooth experiences, particularly in Go, .NET, and on the JVM, saying basic tracing took a few hours at most to set up and that the developer experience (DX) is improving over time.

Traces vs Metrics vs Logs

Commenters disagree strongly on whether logs and metrics should be first-class alongside traces.
One camp argues logs are just events and metrics are aggregations over span data; rich tracing plus aggregation should be enough for most debugging and performance analysis.
Another camp insists logs, metrics, and traces are fundamentally different primitives with distinct semantics, performance characteristics, and regulatory constraints; collapsing them into a single “event” abstraction is seen as naive.
Metrics are praised for cheap, continuous visibility and for surfacing “missing” behavior (e.g., requests that never happen), while traces are praised for detailed root-cause analysis but criticized for cost and sampling issues.
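
The first camp's claim that metrics are aggregations over span data can be illustrated with a minimal sketch. It uses only the Python standard library and not the OTel SDK; the `Span` class, its fields, and the `aggregate` function are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical, simplified span: real spans carry far more context.
    name: str
    duration_ms: float
    error: bool = False


def aggregate(spans):
    """Roll finished spans up into per-operation count, error count, and mean duration."""
    totals = defaultdict(lambda: {"count": 0, "errors": 0, "total_ms": 0.0})
    for span in spans:
        entry = totals[span.name]
        entry["count"] += 1
        entry["errors"] += int(span.error)
        entry["total_ms"] += span.duration_ms
    return {
        name: {
            "count": e["count"],
            "errors": e["errors"],
            "mean_ms": e["total_ms"] / e["count"],
        }
        for name, e in totals.items()
    }


spans = [
    Span("GET /users", 12.0),
    Span("GET /users", 18.0, error=True),
    Span("POST /orders", 40.0),
]
metrics = aggregate(spans)
```

The sketch also hints at the second camp's objection: the aggregation only sees spans that exist, so an operation that never ran, or spans dropped by sampling, leaves no entry to count.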

Scope, API/SDK Design, and “Bundling”

Several people think OTel tries to solve too many problems (all signals, all languages, transport, collectors, semantics), leading to bloated SDKs and a steep learning curve.
Others counter that the API/SDK split already exists, language SDKs are allowed to differ, and a unified project improves interoperability and cross-signal correlation.
There is debate over whether logs/metrics should live in the same project as tracing or as separate but related efforts.

Vendor Neutrality and Lock‑in

Many see OTel’s main value as breaking vendor lock‑in: standard APIs, one agent/collector per host, and the ability to route data to different commercial or open-source backends.
Some observers suspect that commercial vendors whose products overlap with tracing may be biased against OTel; others argue vendors should contribute more to the standard they benefit from.
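
The routing idea behind the lock-in argument can be sketched roughly as follows: instrumented code emits to one interface, and the set of backends behind it is a configuration detail. This is a standard-library illustration, not the Collector or any OTel API; `Exporter`, `InMemoryExporter`, and `Router` are hypothetical names.

```python
from typing import Protocol


class Exporter(Protocol):
    # Hypothetical backend interface: anything with an export() method qualifies.
    def export(self, record: dict) -> None: ...


class InMemoryExporter:
    """Stand-in for a commercial or open-source backend; it just stores records."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: list[dict] = []

    def export(self, record: dict) -> None:
        self.received.append(record)


class Router:
    """Fans each telemetry record out to every configured backend."""

    def __init__(self, exporters: list[Exporter]) -> None:
        self.exporters = exporters

    def emit(self, record: dict) -> None:
        for exporter in self.exporters:
            exporter.export(record)


vendor_a = InMemoryExporter("vendor-a")
self_hosted = InMemoryExporter("self-hosted")
router = Router([vendor_a, self_hosted])

# Application code only ever talks to the router.
router.emit({"name": "GET /users", "duration_ms": 12.0})
```

Swapping or adding a backend changes only the list passed to `Router`, which is the property commenters value when they talk about escaping lock-in.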

Adoption Patterns and Gaps

Success stories include small projects using single-binary backends and large orgs standardizing on OTel to escape opaque pricing.
Pain points include short‑lived processes, span size limits, unclear semantics (e.g., trace events vs log records), and the feeling that tracing-specific goals are slowed by the broader “everything telemetry” ambition.
