Feed readers which don't take "no" for an answer
HTTP status codes and API semantics
- Debate over whether HTTP status codes are good design for app-level errors.
- Some argue app-specific error payloads should dominate, with HTTP codes only indicating transport-level success/failure.
- Others insist layered design makes sense: HTTP handles resource/transport status (e.g., 404, 429), app errors go in the body.
- Disagreement over using 404 for “resource not found in DB” vs “endpoint doesn’t exist”; some see both as 404, others prefer 200 with an empty/“no results” payload.
Feed reader behavior & conditional requests
- Central complaint: many RSS/Atom readers poll too frequently with unconditional GETs of large feeds.
- Proper behavior cited: send If-Modified-Since / If-None-Match and respect 304 Not Modified.
- Some readers do this correctly; others hammer feeds every few minutes and ignore caching semantics, effectively wasting bandwidth.
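The conditional-request behavior described above can be sketched as a pair of small helpers (function names are hypothetical; the header semantics follow RFC 9110): the client replays the ETag and Last-Modified validators from its previous fetch, and a 304 response means the cached copy is still current.

```python
def conditional_headers(etag=None, last_modified=None):
    """Build request headers for a conditional GET from cached validators."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def handle_response(status, response_headers, cache):
    """Update the validator cache; returns True only if new content arrived."""
    if status == 304:
        return False  # unchanged; reuse the cached feed body, nothing downloaded
    if status == 200:
        # Remember the server's validators for the next poll.
        cache["etag"] = response_headers.get("ETag")
        cache["last_modified"] = response_headers.get("Last-Modified")
        return True
    return False
```

A well-behaved reader pays for the full feed body only when it actually changed; every other poll costs a few hundred bytes of headers.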
Aggressive rate limiting and 429 responses
- The blog in question returns 429 and advises a 24‑hour retry for clients that repeatedly fetch unconditionally.
- Supporters: servers owe clients neither unlimited requests nor special treatment; 429 + Retry-After is a clear signal, and misbehaving clients should fix caching.
- Critics: blocking after 2 hits in 20 minutes for a 500KB RSS feed is “hostile” and punishes end users, especially behind shared IPs or when testing new readers.
- Semantic dispute over whether 429 is “rate limiting” vs “blocking,” but practical effect is the same: no content during the window.
Bandwidth, feed design, and caching
- The feed contains 100 full posts (500KB). Some say that's excessive and should be trimmed (e.g., fewer items, summaries only).
- Others defend full-content, long-history feeds; the real waste is clients re-downloading unchanged content instead of using conditional requests.
- Examples given where individual readers account for noticeable percentages of a site’s yearly egress.
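A back-of-the-envelope calculation shows how a single misbehaving reader reaches those percentages. The feed size is from the discussion; the 5-minute poll interval is an assumption for illustration:

```python
def yearly_egress_bytes(feed_bytes, poll_minutes):
    """Bytes per year for one client fetching the full feed unconditionally
    at a fixed interval (i.e., never receiving a 304)."""
    polls_per_day = 24 * 60 // poll_minutes
    return feed_bytes * polls_per_day * 365


# One reader pulling a 500 KB feed every 5 minutes (interval assumed):
total = yearly_egress_bytes(500 * 1024, 5)
print(f"{total / 1e9:.1f} GB/year")  # roughly 54 GB/year for one client
```

With conditional requests, the same polling schedule would cost only header-sized 304 responses except on the handful of days the feed actually changes.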
Bots, LLM scrapers, and infrastructure
- Several report big increases in bot and LLM-related traffic, often ignoring robots.txt and faking user agents.
- Approaches mentioned: blocking datacenter IPs, “bot motels” (trapping crawlers in junk content), poisoning indexes.
- Some suggest CDNs, WebSub/pubsubhubbub, or third-party hubs to offload polling; others resist CDNs as corrosive to an open, independently hosted web.
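The WebSub alternative mentioned above replaces polling with push: the subscriber POSTs a form-encoded request to the hub advertised in the feed, and the hub delivers updates to a callback URL. A minimal sketch of the subscription body (helper name is hypothetical; the `hub.*` parameters are from the W3C WebSub spec):

```python
from urllib.parse import urlencode


def websub_subscribe_body(callback, topic, mode="subscribe"):
    """Form-encoded body for a WebSub subscription request.

    POST this to the hub URL advertised by the feed (e.g., via a
    rel="hub" link); the hub then pushes updates to `callback`.
    """
    return urlencode({
        "hub.mode": mode,          # "subscribe" or "unsubscribe"
        "hub.callback": callback,  # subscriber endpoint that receives pushes
        "hub.topic": topic,        # the feed URL being subscribed to
    })
```

After a successful subscription, clients stop polling entirely, which sidesteps both the bandwidth and the rate-limiting disputes, at the cost of depending on a hub.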
Miscellaneous tangents
- Grammar digression on “which” vs “that.”
- Reflections on falling traffic for small sites, search downranking, paywalls, and monopoly/antitrust politics.