Back to basics: Why we chose long-polling over websockets

SSE vs Long‑Polling

  • Several commenters ask why not use Server-Sent Events (SSE) instead of long‑polling.
  • SSE advantages cited: simple one-way streaming, works over plain HTTP, good fit for streaming updates (e.g., job monitoring).
  • Drawbacks mentioned: responses need separate text/event-stream handling rather than application/json; some proxies and load balancers buffer or break SSE; browsers cap concurrent connections per domain under HTTP/1.1 (typically six); and mobile Safari has quirks.
  • Workarounds: BroadcastChannel or SharedWorker to share a single SSE connection across tabs; using visibility APIs to close/reopen; pings and reconnection logic; domain sharding.
  • Some report SSE being flaky on mobile, leading them back to long‑polling.
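
To make the text/event-stream vs application/json point concrete, here is a minimal sketch of how a server might serialize the same update both ways. The helper names (format_sse_event, format_sse_ping) and the job-status payload are hypothetical and do not come from the thread; the field layout (id:, event:, data:, a blank line as terminator, and ":" comment lines for pings) follows the standard event-stream format.

```python
import json
from typing import Optional


def format_sse_event(data: dict, event: Optional[str] = None,
                     event_id: Optional[str] = None) -> str:
    """Serialize one update as a text/event-stream message."""
    lines = []
    if event_id is not None:
        # Browsers send this back as Last-Event-ID when EventSource reconnects.
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload needs its own "data:" field.
    for chunk in json.dumps(data).splitlines():
        lines.append(f"data: {chunk}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


def format_sse_ping() -> str:
    """Comment line used as a keepalive; clients ignore it."""
    return ": ping\n\n"


update = {"job": 42, "status": "running"}  # hypothetical payload
sse_body = format_sse_event(update, event="job", event_id="7")
json_body = json.dumps(update)  # what a long-poll response might carry
```

The JSON body can go through whatever response handling an API already has, while the SSE body needs a streaming response with the text/event-stream content type, which is the extra handling commenters point to.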

WebSockets: Pros, Cons, and Complexity

  • One side argues that WebSockets are conceptually simple and efficient (no per-message HTTP headers), support binary data, guarantee in-order delivery, and can run over HTTP/2 (via RFC 8441).
  • Others stress real-world complexity: reconnections, missed events, load balancing, DDoS concerns, corporate firewalls blocking the Upgrade handshake, and substantially different observability and auth patterns.
  • Some say these problems are overstated or solved by mature libraries, GraphQL subscriptions, or frameworks; others insist the operational and monitoring shift at scale is non-trivial.
  • There is mention of a patent troll targeting WebSocket use.
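
Reconnection handling comes up on both sides of this debate. As one illustration, a client can space out reconnect attempts with capped exponential backoff plus jitter, so that many clients do not reconnect in lockstep after an outage. The function name, the default values, and the choice of the "full jitter" variant are assumptions made for this sketch; the thread does not prescribe a specific scheme.

```python
import random
from typing import Optional


def reconnect_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                    rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before reconnect attempt number `attempt` (0-based)."""
    rng = rng or random.Random()
    # Exponential growth, capped so long outages do not produce huge waits.
    # The exponent is clamped to avoid overflow for very large attempt counts.
    ceiling = min(cap, base * (2 ** min(attempt, 16)))
    # Full jitter: pick uniformly in [0, ceiling] to spread clients out.
    return rng.uniform(0.0, ceiling)


rng = random.Random(0)  # seeded only to make the demo repeatable
delays = [reconnect_delay(n, rng=rng) for n in range(8)]
```

Backoff only decides when to reconnect; recovering events missed while disconnected still needs some form of replay or resync.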

Long‑Polling Pros, Cons, and Design Gotchas

  • Many like long‑polling because it fits existing request/response auth, logging, and infrastructure; it is easier to reason about with standard HTTP tooling, and it is a robust fallback when WebSockets fail.
  • Critics highlight message ordering races, reconnection and timeout handling, need for sequence IDs and ACKs, and complexity that can start to resemble reimplementing TCP.
  • Timeouts across clients, proxies, web servers (e.g., aggressive keepalive limits), and CDNs must be tuned consistently; otherwise connections drop, and without careful queueing and resync logic messages can be lost.
  • Some argue these issues exist for all streaming mechanisms (WS, SSE, long‑poll), so you need logs, retransmit queues, and resync strategies regardless.
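
The sequence-ID and resync ideas above can be sketched as a small in-memory event log. This is an illustrative model only: the EventLog class, its method names, and the shape of the returned dictionary are hypothetical, and a real deployment would sit behind an HTTP handler and likely use shared storage. The client sends the last sequence number it has seen; the server either replays newer events, waits until the timeout, or tells the client to resync because the events it needs have already been discarded.

```python
import threading
from collections import deque
from typing import Any, Deque, Tuple


class EventLog:
    """In-memory, bounded log of (seq, payload) pairs for long-polling."""

    def __init__(self, max_events: int = 1000) -> None:
        self._events: Deque[Tuple[int, Any]] = deque(maxlen=max_events)
        self._next_seq = 1
        self._cond = threading.Condition()

    def append(self, payload: Any) -> int:
        with self._cond:
            seq = self._next_seq
            self._next_seq += 1
            self._events.append((seq, payload))  # oldest entry drops when full
            self._cond.notify_all()  # wake any waiting polls
            return seq

    def poll(self, after_seq: int, timeout: float) -> dict:
        """Return events newer than after_seq, waiting up to timeout seconds."""
        with self._cond:
            # Block until something newer exists or the timeout expires.
            self._cond.wait_for(lambda: self._next_seq - 1 > after_seq, timeout)
            oldest = self._events[0][0] if self._events else self._next_seq
            if after_seq + 1 < oldest:
                # The client fell behind the retained window: events it needs
                # are gone, so it must refetch full state instead of replaying.
                return {"resync": True, "events": [],
                        "last_seq": self._next_seq - 1}
            events = [(s, p) for s, p in self._events if s > after_seq]
            last_seq = events[-1][0] if events else after_seq
            return {"resync": False, "events": events, "last_seq": last_seq}


log = EventLog(max_events=3)
log.append("a")
log.append("b")
first = log.poll(after_seq=0, timeout=0.01)   # replays both events
idle = log.poll(after_seq=first["last_seq"], timeout=0.01)  # times out, empty
```

Because the client always acknowledges by sending back last_seq on its next request, a dropped response costs a replay rather than a lost message, as long as the log retains enough history.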

HTTP/2, gRPC, and Alternatives

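  • HTTP/2 multiplexing helps with connection limits but does not by itself give pages a way to receive unsolicited messages from the server; HTTP/2 server push delivers resources to the browser rather than events to page scripts, and browser support for it is minimal.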
  • gRPC over HTTP/2 enables streaming in controlled environments but is not directly usable from browsers without an intermediary layer such as gRPC-Web.
  • Some note that, in practice, resource usage mostly scales with “one connection per client” regardless of transport.

Backend & DB Considerations

  • Several suggest centralizing change detection and fan-out (e.g., message brokers, Redis, Postgres LISTEN/NOTIFY) instead of having many workers/job pollers hammer the database.
  • Others caution about resource costs and queue limits for DB-based pub/sub, and about the remaining need to map backend events to the correct client connection.
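
The centralized fan-out idea, together with the two cautions about queue limits and routing events to the right connection, can be sketched as follows. This is a hypothetical in-process model: FanoutHub, its methods, and the "job:a" style topic keys are invented for illustration. In practice the publish calls would be driven by a single listener on whatever broker or notification channel is used, which is not shown here.

```python
import asyncio
from collections import defaultdict
from typing import Any, Dict, Set


class FanoutHub:
    """Routes backend events to per-connection queues, keyed by topic."""

    def __init__(self, max_queue: int = 100) -> None:
        self._max_queue = max_queue
        self._subs: Dict[str, Set[asyncio.Queue]] = defaultdict(set)

    def subscribe(self, topic: str) -> asyncio.Queue:
        # One bounded queue per client connection.
        q: asyncio.Queue = asyncio.Queue(maxsize=self._max_queue)
        self._subs[topic].add(q)
        return q

    def unsubscribe(self, topic: str, q: asyncio.Queue) -> None:
        self._subs[topic].discard(q)
        if not self._subs[topic]:
            del self._subs[topic]

    def publish(self, topic: str, event: Any) -> int:
        """Deliver to every subscriber of topic; return how many got it."""
        delivered = 0
        for q in list(self._subs.get(topic, ())):
            try:
                q.put_nowait(event)
                delivered += 1
            except asyncio.QueueFull:
                # Slow consumer: drop the subscription so the client has to
                # reconnect and resync instead of growing memory unbounded.
                self.unsubscribe(topic, q)
        return delivered


async def demo() -> list:
    hub = FanoutHub(max_queue=2)
    job_a = hub.subscribe("job:a")
    job_b = hub.subscribe("job:b")
    hub.publish("job:a", {"status": "done"})
    return [job_a.qsize(), job_b.qsize(), await job_a.get()]


result = asyncio.run(demo())
```

Only the connection subscribed to "job:a" receives the event, and the database sees one listener rather than one poller per client.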