IO Devices and Latency

Interactive Visuals and Accessibility

  • Commenters widely praise the animations as some of the best latency explanations they’ve seen; many say they forgot it was effectively an ad.
  • The visuals are built with d3.js; GSAP and SVG.js are mentioned as alternative libraries.
  • Some users browse with JavaScript disabled and see no visuals, requesting static images as a fallback.
  • Others report breakage from browser extensions (dark mode, ad blockers, user styles) and some browser-specific issues (Safari, Chrome/Firefox mismatches).

Durability, Replication, and Probability

  • The article’s “1 in a million” durability remark is viewed as too pessimistic: commenters note that failures are only dangerous during the short window before a replica is replaced.
  • One commenter provides a back-of-the-envelope recalculation showing far lower failure probability if failures are independent and replacement happens in ~30 minutes, but another cautions that failures are often correlated.
  • The product uses semi-synchronous replication: the primary waits for at least one replica ACK before commit, introducing a network hop on writes but favoring read-heavy workloads.
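The back-of-the-envelope recalculation can be sketched as follows. The numbers here (1% annual failure rate per drive, three replicas) are illustrative assumptions, not figures from the thread; only the ~30-minute replacement window and the independence assumption come from the discussion above.

```python
# Illustrative durability estimate (assumed numbers): independent drive
# failures, 3 replicas, and a ~30-minute window to replace a failed replica.
AFR = 0.01                       # assumed per-drive annual failure rate (1%)
replicas = 3                     # assumed replica count
window_years = 0.5 / (24 * 365)  # 30 minutes expressed in years

# Probability that one given drive fails during a single replacement window.
p_window = AFR * window_years

# Data loss requires: some drive fails (rate ~ replicas * AFR per year),
# and then BOTH surviving replicas also fail inside the replacement window.
annual_loss = replicas * AFR * p_window ** (replicas - 1)

print(f"p(failure per drive, per window) ~ {p_window:.2e}")
print(f"annual data-loss probability     ~ {annual_loss:.2e}")
```

Under these assumptions the annual loss probability lands many orders of magnitude below "1 in a million", which is the thrust of the recalculation; the caveat in the thread is that correlated failures (same batch, same rack, same power event) can invalidate the independence assumption entirely.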

Local NVMe vs Networked Storage and “Unlimited IOPS”

  • Strong support for using local NVMe instead of network-attached cloud volumes (such as EBS), citing latency, IOPS limits, and cloud storage being “unusually slow.”
  • Some nuance: network-attached storage makes maintenance/drains and durability easier, especially for systems that don’t implement replication themselves.
  • “Unlimited IOPS” is defended as “practically unlimited” for MySQL: CPU becomes the bottleneck long before the physical NVMe IOPS limit is hit.

IOPS Limits, SSD Latency, and Hardware Differences

  • Several fio benchmarks are shared comparing raw random writes vs fsync’d writes, O_DIRECT vs buffered IO, and consumer vs enterprise NVMe.
  • Key observations:
    • Raw random writes can be tens of microseconds; durable sync writes are often ~250–300µs on consumer drives and much faster on enterprise drives with power-loss protection.
    • Enterprise SSDs may acknowledge fsync before flushing to flash, relying on capacitors to guarantee durability on power loss.
    • NVMe performance varies widely by device class and power-saving state; numbers in the article are broadly plausible but depend heavily on hardware and configuration.
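The buffered-vs-durable gap described above is easy to reproduce without fio. The sketch below times plain writes against write+fsync on a temporary file; absolute numbers depend entirely on the drive, filesystem, and power-loss protection, so treat the output as illustrative only.

```python
# Minimal sketch: buffered write latency vs durable (fsync'd) write latency.
import os
import tempfile
import time

N = 200
buf = os.urandom(4096)  # one 4 KiB block, a typical random-write size

fd, path = tempfile.mkstemp()
try:
    t0 = time.perf_counter()
    for _ in range(N):
        os.pwrite(fd, buf, 0)          # buffered write, no durability
    t_write = (time.perf_counter() - t0) / N

    t0 = time.perf_counter()
    for _ in range(N):
        os.pwrite(fd, buf, 0)
        os.fsync(fd)                   # force the data to stable storage
    t_fsync = (time.perf_counter() - t0) / N
finally:
    os.close(fd)
    os.unlink(path)

print(f"buffered write: {t_write * 1e6:8.1f} us/op")
print(f"write + fsync:  {t_fsync * 1e6:8.1f} us/op")
```

On a consumer NVMe drive the fsync path is typically the one landing in the ~250–300µs range mentioned above; enterprise drives with power-loss protection can acknowledge the flush much earlier.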

SQLite + NVMe vs Client-Server Databases

  • One subthread promotes SQLite-on-NVMe as a pattern: avoid the network hop, get microsecond-scale operations, and rely on a single writer.
  • Counterarguments:
    • Multi-writer scenarios and multiple webservers rapidly complicate SQLite usage; Postgres/MySQL are easier once you need a shared database.
    • Local Postgres on the same host, using Unix sockets, is common and often “fast enough” while preserving scaling options.
    • Some argue SQLite’s single-writer constraint is manageable for mostly-read workloads; others say you’ll hit that limit earlier than you think.
  • There is back-and-forth on whether IPC/network overhead is negligible compared to query execution; opinions differ on how much optimization this really buys in web apps.
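The SQLite-on-local-disk pattern from this subthread can be sketched with the standard library alone: a single writer, WAL mode, and no network hop. The pragmas and timings below are illustrative choices, not a recommendation from the thread.

```python
# Sketch of the single-writer SQLite pattern: local file, WAL mode, no network.
import os
import sqlite3
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "app.db")
db = sqlite3.connect(path)
db.execute("PRAGMA journal_mode=WAL")    # readers don't block the one writer
db.execute("PRAGMA synchronous=NORMAL")  # in WAL mode, fsync only at checkpoints
db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

N = 1000
t0 = time.perf_counter()
with db:  # one transaction: the single writer batches its commits
    db.executemany("INSERT INTO kv VALUES (?, ?)",
                   ((str(i), "x" * 32) for i in range(N)))
per_insert_us = (time.perf_counter() - t0) / N * 1e6

t0 = time.perf_counter()
row = db.execute("SELECT v FROM kv WHERE k = ?", ("500",)).fetchone()
read_us = (time.perf_counter() - t0) * 1e6

print(f"insert: {per_insert_us:.1f} us/row, point read: {read_us:.1f} us")
db.close()
```

On fast local storage both numbers land in the microsecond range, which is the appeal; the counterarguments above kick in the moment a second writer or a second webserver needs the same database.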

Cloud Operations, Local SSD Reliability, and Drains

  • Prior bad experiences with GCP Local SSD (bad blocks) are contrasted with more recent reports of no such issues in testing.
  • Local SSD setups rely on higher-level replication (e.g., MySQL semi-sync across AZs) plus orchestration to rapidly detect and replace failing nodes.
  • Commenters highlight cloud “events”/drains (e.g., EC2 termination for maintenance) as a major operational risk for local-only storage: miss the event and local data disappears.
  • Some note that for many orgs, the complexity of scripting automatic rebuilds on wiped local disks makes network-attached storage (EBS, etc.) more attractive.

Cloud IOPS Throttling and Economics

  • IOPS limits on EBS-type volumes are explained as packet/operation rate limits, distinct from raw bandwidth, with both volume-level and instance-level caps.
  • Moving to local NVMe removes artificial IOPS caps but trades off the elasticity of EBS and its ability to survive instance resizes or failures transparently.
  • There’s curiosity about whether local NVMe is not only a latency win but also a throughput-per-dollar win; consensus is that it depends on workload and scaling patterns.
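The distinction between an operation-rate cap and a bandwidth cap can be made concrete with a bit of arithmetic. The caps below are assumptions loosely modeled on common gp3-style defaults, not figures from the thread; the point is only that which limit binds depends on I/O size.

```python
# Illustrative arithmetic (assumed caps): IOPS limits vs bandwidth limits
# bind at different I/O sizes on a throttled network volume.
iops_cap = 3000            # assumed volume IOPS limit
bw_cap = 125 * 2**20       # assumed bandwidth limit: 125 MiB/s, in bytes/s

for io_size in (4 * 2**10, 64 * 2**10, 256 * 2**10):   # 4K, 64K, 256K ops
    by_iops = iops_cap * io_size   # achievable throughput if IOPS-limited
    limiter = "IOPS" if by_iops < bw_cap else "bandwidth"
    print(f"{io_size // 1024:4d} KiB ops -> "
          f"{min(by_iops, bw_cap) / 2**20:6.1f} MiB/s ({limiter}-limited)")
```

Small random I/O hits the IOPS cap long before the bandwidth cap, which is why databases (lots of small random reads and writes) feel these limits much sooner than sequential workloads do.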

Educational, Historical, and Corrective Notes

  • Many see the article as ideal teaching material for high school/university courses on storage and latency; several plan to link it in classes or to family.
  • Old mainframe/tape and COBOL anecdotes underline how physical device behavior (e.g., tape overshoot, drum memories) shaped algorithms and access patterns.
  • One commenter challenges specific HDD numbers (e.g., average rotational latency) and offers more detailed track-count estimates, pointing to an in-depth HDD performance paper.
  • Some minor nitpicks appear (e.g., missing intermediate technologies between tape and HDD), but they don’t detract from broad praise for clarity and visuals.
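The rotational-latency point in that correction is straightforward to verify: on average the desired sector is half a revolution away, so average rotational latency is half the rotation time.

```python
# Average HDD rotational latency = half a revolution, for common spindle speeds.
def avg_rotational_latency_ms(rpm: float) -> float:
    seconds_per_rev = 60.0 / rpm
    return seconds_per_rev / 2 * 1000  # half a revolution, in milliseconds

for rpm in (5400, 7200, 15000):
    print(f"{rpm:5d} RPM -> {avg_rotational_latency_ms(rpm):.2f} ms average")
```

A 7200 RPM drive averages about 4.17 ms of rotational latency alone, before any seek time, which is the kind of figure the more detailed HDD performance discussion hinges on.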