IO Devices and Latency
Interactive Visuals and Accessibility
- Commenters widely praise the animations as some of the best latency explanations they’ve seen; many say they forgot it was effectively an ad.
- Visuals are implemented with heavy use of d3.js; other libraries like GSAP and SVG.js are mentioned as alternatives.
- Some users who browse with JavaScript disabled see no visuals at all and request static images as a fallback.
- Others report breakage from browser extensions (dark mode, ad blockers, user styles) and some browser-specific issues (Safari, Chrome/Firefox mismatches).
Durability, Replication, and Probability
- The article’s “1 in a million” durability remark is viewed as too pessimistic: commenters note that failures are only dangerous during the short window before a replica is replaced.
- One commenter provides a back-of-the-envelope recalculation showing far lower failure probability if failures are independent and replacement happens in ~30 minutes, but another cautions that failures are often correlated.
- The product uses semi-synchronous replication: the primary waits for at least one replica ACK before commit, introducing a network hop on writes but favoring read-heavy workloads.
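The independent-failures recalculation mentioned above can be sketched as simple arithmetic. This is a hypothetical back-of-envelope, not the commenter's actual numbers: it assumes a 1% annualized failure rate per node, a primary plus one replica, a ~30-minute replacement window, and independent failures (which, as noted, real-world correlated failures violate).

```python
# Back-of-envelope durability estimate (all inputs assumed, for illustration):
# two copies of the data; losing both requires a second failure during the
# short window before the first failed node is replaced.

afr = 0.01                      # annualized failure rate per node (assumed)
window_hours = 0.5              # ~30-minute replacement window (assumed)
hours_per_year = 24 * 365

# Expected node failures per year across the primary + replica pair.
failures_per_year = 2 * afr

# Given one node is down, probability the surviving copy also fails
# within the replacement window.
p_second_failure = afr * (window_hours / hours_per_year)

# Annual probability of losing both copies, assuming independence.
p_data_loss = failures_per_year * p_second_failure
print(f"~{p_data_loss:.2e} per year")  # → ~1.14e-08 per year
```

Under these assumptions the annual loss probability is on the order of 1e-8, orders of magnitude below "1 in a million"; correlated failures (shared rack, power, or firmware) would push it back up.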
Local NVMe vs Networked Storage and “Unlimited IOPS”
- Strong support for using local NVMe instead of cloud network volumes (EBS and similar) due to latency, IOPS caps, and network-attached cloud storage being “unusually slow.”
- Some nuance: network-attached storage makes maintenance/drains and durability easier, especially for systems that don’t implement replication themselves.
- “Unlimited IOPS” is defended as “practically unlimited” for MySQL: CPU becomes the bottleneck long before the physical NVMe IOPS limit is hit.
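The "CPU becomes the bottleneck first" defense can be made concrete with rough numbers. All figures below are assumptions for illustration, not measurements from the thread: a data-center NVMe drive sustaining on the order of a million random IOPS, versus a CPU-bound MySQL instance pushing tens of thousands of queries per second, each touching a few pages.

```python
# Rough illustration of "practically unlimited" IOPS (all numbers assumed):
# the database's CPU-bound query rate generates far fewer IOs than the
# device can serve, so the physical IOPS ceiling is never reached.

nvme_iops = 1_000_000        # assumed sustained random IOPS for the device
queries_per_sec = 50_000     # assumed CPU-bound MySQL query throughput
ios_per_query = 4            # assumed page reads per query (cache misses)

demand = queries_per_sec * ios_per_query
print(f"demand {demand} IOPS vs device {nvme_iops} IOPS "
      f"({demand / nvme_iops:.0%} utilized)")
```

Even with generous per-query IO counts, demand lands well under the device ceiling, which is the sense in which "unlimited" is defended.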
IOPS Limits, SSD Latency, and Hardware Differences
- Several fio benchmarks are shared comparing random writes vs fsync, O_DIRECT vs buffered IO, consumer vs enterprise NVMe.
- Key observations:
  - Raw random writes can be tens of microseconds; durable sync writes are often ~250–300µs on consumer drives and much faster on enterprise drives with power-loss protection.
  - Enterprise SSDs may acknowledge fsync before flushing to flash, relying on capacitors to guarantee durability on power loss.
  - NVMe performance varies widely by device class and power-saving state; numbers in the article are broadly plausible but depend heavily on hardware and configuration.
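The buffered-vs-durable gap the benchmarks explore can be reproduced with a few lines of Python instead of fio. This is a micro-benchmark sketch only; as the thread stresses, absolute numbers depend entirely on the drive, filesystem, and whether the device has power-loss protection.

```python
import os
import tempfile
import time

# Time a buffered 4 KiB write vs. a write followed by fsync.
# Buffered writes land in the page cache; fsync forces them to stable
# storage, which is where the ~250-300µs consumer-drive cost shows up.

def avg_write_latency(do_fsync: bool, iterations: int = 200) -> float:
    fd, path = tempfile.mkstemp()
    buf = b"\0" * 4096
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, buf)
            if do_fsync:
                os.fsync(fd)  # force data (and metadata) to stable storage
        return (time.perf_counter() - start) / iterations
    finally:
        os.close(fd)
        os.unlink(path)

buffered = avg_write_latency(do_fsync=False)
durable = avg_write_latency(do_fsync=True)
print(f"buffered: {buffered * 1e6:.1f}µs  fsync: {durable * 1e6:.1f}µs")
```

On an enterprise drive with capacitor-backed power-loss protection, the fsync column shrinks dramatically because the drive can acknowledge before the data reaches flash.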
SQLite + NVMe vs Client-Server Databases
- One subthread promotes SQLite-on-NVMe as a pattern: avoid the network hop, get microsecond-scale operations, and rely on a single writer.
- Counterarguments:
  - Multi-writer scenarios and multiple webservers rapidly complicate SQLite usage; Postgres/MySQL are easier once you need a shared database.
  - Local Postgres on the same host, using Unix sockets, is common and often “fast enough” while preserving scaling options.
  - Some argue SQLite’s single-writer constraint is manageable for mostly-read workloads; others say you’ll hit that limit earlier than you think.
- There is back-and-forth on whether IPC/network overhead is negligible compared to query execution; opinions differ on how much optimization this really buys in web apps.
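The SQLite-on-local-disk pattern from this subthread is easy to sketch with Python's stdlib `sqlite3` module. Table and column names below are made up for illustration; the key points are WAL mode (so readers don't block the single writer) and batching writes into one transaction to amortize the durable-commit cost.

```python
import os
import sqlite3
import tempfile

# Single-writer SQLite on a local disk: no network hop, one process owns
# the database file. WAL mode lets concurrent readers proceed while the
# one writer commits.

path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)
conn.execute("PRAGMA journal_mode=WAL")    # concurrent readers, one writer
conn.execute("PRAGMA synchronous=NORMAL")  # sync at checkpoints, not every commit
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

with conn:  # one transaction: 1000 inserts, one commit
    conn.executemany(
        "INSERT INTO events (payload) VALUES (?)",
        [(f"event-{i}",) for i in range(1000)],
    )

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # → 1000
conn.close()
```

The counterarguments in the list above kick in exactly where this sketch stops: a second webserver or a second writer has no clean way into this file, which is when Postgres/MySQL start looking simpler.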
Cloud Operations, Local SSD Reliability, and Drains
- Prior bad experiences with GCP Local SSD (bad blocks) are contrasted with more recent reports of no such issues in testing.
- Local SSD setups rely on higher-level replication (e.g., MySQL semi-sync across AZs) plus orchestration to rapidly detect and replace failing nodes.
- Commenters highlight cloud “events”/drains (e.g., EC2 termination for maintenance) as a major operational risk for local-only storage: miss the event and local data disappears.
- Some note that for many orgs, the complexity of scripting automatic rebuilds on wiped local disks makes network-attached storage (EBS, etc.) more attractive.
Cloud IOPS Throttling and Economics
- IOPS limits on EBS-type volumes are explained as packet/operation rate limits, distinct from raw bandwidth, with both volume-level and instance-level caps.
- Moving to local NVMe removes artificial IOPS caps but trades off the elasticity of EBS and its ability to survive instance resizes or failures transparently.
- There’s curiosity about whether local NVMe is not only a latency win but also a throughput-per-dollar win; consensus is that it depends on workload and scaling patterns.
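The distinction between an operation-rate cap and a bandwidth cap is clearest with small random IOs. A minimal arithmetic sketch, using AWS gp3's published baseline figures (3,000 IOPS, 125 MB/s) and an assumed database-page-sized IO:

```python
# Why an IOPS cap and a bandwidth cap are separate limits: with small
# random IOs, the op-rate cap binds long before bandwidth does.
# 3,000 IOPS / 125 MB/s are gp3 baseline figures; IO size is assumed.

iops_cap = 3_000
bandwidth_cap_mb = 125
io_size_kb = 4                  # typical database page-sized random IO

throughput_mb = iops_cap * io_size_kb / 1024
print(f"{throughput_mb:.1f} MB/s at the IOPS cap "
      f"vs {bandwidth_cap_mb} MB/s bandwidth cap")
```

At 4 KiB per operation the volume saturates its IOPS cap at roughly 12 MB/s, a tenth of its bandwidth cap; large sequential IOs flip which limit binds. Local NVMe removes the op-rate throttle entirely, which is the trade discussed above.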
Educational, Historical, and Corrective Notes
- Many see the article as ideal teaching material for high school/university courses on storage and latency; several plan to link it in classes or to family.
- Old mainframe/tape and COBOL anecdotes underline how physical device behavior (e.g., tape overshoot, drum memories) shaped algorithms and access patterns.
- One commenter challenges specific HDD numbers (e.g., average rotational latency) and offers more detailed track-count estimates, pointing to an in-depth HDD performance paper.
- Some minor nitpicks appear (e.g., missing intermediate technologies between tape and HDD), but they don’t detract from broad praise for clarity and visuals.