Memory access is O(N^{1/3})
Physical and Geometric Limits
- Several comments tie memory latency to physics: the finite speed of light plus entropy bounds such as the Bekenstein/holographic limit, which cap information by enclosing surface area rather than volume. That implies a per-access lower bound of Ω(N^{1/2}) for area-limited packing or Ω(N^{1/3}) for volume-limited packing, depending on dimensional assumptions (see the sketch after this list).
- Others argue the “near-black-hole computer” reasoning is too speculative: getting from entropy bounds to concrete latency crosses several layers of abstraction (classical vs quantum information, where the data must physically arrive, supporting mass, time dilation, heat removal).
- There’s debate whether modern hardware is effectively 2D (PCB/IC layouts → ~√N scaling) versus partially 3D-stacked memory, and whether heat dissipation (area-limited) dominates.
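A back-of-the-envelope version of the geometric argument above (speed-of-light round trip only; all constants, error correction, and cooling ignored) might look like this:

```latex
% Round-trip signal time to the farthest of N bits packed at fixed density,
% counting nothing but the speed of light c.
\[
  t_{\mathrm{access}}(N) \;\gtrsim\; \frac{2\,r(N)}{c},
  \qquad
  r(N) \;\propto\;
  \begin{cases}
    N^{1/3} & \text{volume-limited packing (3D)} \\[2pt]
    N^{1/2} & \text{area-limited packing (2D layouts, holographic bound)}
  \end{cases}
\]
% The Bekenstein/holographic bound caps information by enclosing area,
% which is why the asymptotic lower bound is argued to be \Omega(N^{1/2})
% rather than \Omega(N^{1/3}) once N is pushed far enough.
```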
What Big‑O and “N” Actually Mean
- A long subthread clarifies that Big‑O is an upper bound on a function, not inherently “worst case” and not itself a runtime; you have to say which function is being bounded (best-, average-, or worst-case running time).
- Disagreement over whether a single random memory access should be modeled as O(1) or O(M^{1/3}), where M is addressable memory.
- Some say algorithmic complexity should include this factor (e.g., linear scan becomes O(N·M^{1/3})); others argue M^{1/3} is just another hardware constant, like CPU frequency, for most algorithm analysis.
- Multiple people stress that N (the input/data size) and M (the memory pool size) are distinct variables; adding 1/3 to every exponent of N is usually conflating the two (a toy cost model follows this list).
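A toy cost model (illustrative only; the function names and constants below are assumptions, not from the thread) makes the N-vs-M distinction concrete: N counts the elements an algorithm touches, M is the size of the pool each random access has to reach into.

```python
# Toy cost model: N = number of elements the algorithm touches,
# M = bytes of addressable memory the accesses are spread over.
# Constants are deliberately omitted; only the scaling shape matters.

def access_cost(M: int) -> float:
    """Cost of one random access under the geometric model: ~M^(1/3)."""
    return M ** (1 / 3)

def linear_scan_cost(N: int, M: int) -> float:
    """N accesses, each charged the per-access cost.

    Flat RAM model:   N * O(1)      -> O(N)
    Geometric model:  N * M^(1/3)   -> O(N * M^(1/3))
    The 1/3 exponent attaches to M (pool size), not to N (input size);
    rewriting every O(N^k) as O(N^(k + 1/3)) conflates the two.
    """
    return N * access_cost(M)

if __name__ == "__main__":
    N = 10**6
    for M in (2**20, 2**30, 2**40):   # 1 MiB, 1 GiB, 1 TiB pools
        print(f"M = 2^{M.bit_length() - 1:2d}: scan cost ~ {linear_scan_cost(N, M):.3e}")
```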
Empirical Evidence and the Article’s Argument
- Several note prior work (“Myth of RAM”) where measured latencies over many systems fit ≈√N better than N^{1/3}.
- Many object that the article’s “empirical argument” is a ChatGPT-generated table rather than real measurements; relying on such data draws heavy criticism.
- Some read the article as being about the latency of a single random access as a function of total memory size, not the time to scan all of memory; ambiguous wording like “access 8× as much memory” feeds the confusion.
- One commenter points out that full scans can remain O(N) if bandwidth scales with capacity and latency is overlapped with useful work (a simplified timing model follows this list).
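A rough model of that overlap argument (the latency and bandwidth figures are assumed round numbers, not measurements): when requests are independent and can be pipelined, a full scan is bandwidth-bound and stays O(N); when each access depends on the previous one, per-access latency adds up instead.

```python
# Simplified timing model for touching n cache lines of memory.
# LATENCY_S and BANDWIDTH_BPS are placeholder values for illustration.

LINE_BYTES = 64
LATENCY_S = 100e-9       # assumed time for one isolated random access
BANDWIDTH_BPS = 20e9     # assumed sustained sequential bandwidth

def scan_time_overlapped(n_lines: int) -> float:
    """Independent, prefetchable accesses: one latency, then bandwidth-bound."""
    return LATENCY_S + n_lines * LINE_BYTES / BANDWIDTH_BPS

def scan_time_dependent(n_lines: int) -> float:
    """Pointer chasing: the next address is unknown until the load completes."""
    return n_lines * LATENCY_S

for n in (10**6, 10**7, 10**8):
    print(f"{n:>10} lines: overlapped {scan_time_overlapped(n):.4f} s, "
          f"dependent {scan_time_dependent(n):.4f} s")
```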
Architecture, Caches, and NUMA
- Several argue the result is largely an artifact of cache-hierarchy design rather than a pure geometric law; in practice latency behaves more like a step function as the working set crosses L1/L2/L3/RAM/disk boundaries (a pointer-chasing sketch follows this list).
- NUMA, multi-socket systems, and distributed/cloud setups are discussed as real manifestations of non‑uniform memory access; others note NUMA cost/latency overheads and why many large services still prefer single-socket nodes.
- Examples include precomputation tables that become slower when they no longer fit in cache, and processing-in-memory / GPU-style local computation that sidesteps global O(M^{1/3}) costs.
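A pointer-chasing microbenchmark is the usual way to see that step behaviour; the sketch below is a Python approximation (interpreter overhead inflates the absolute numbers, and a C version is what people actually use), but the jumps as the working set spills out of L1/L2/L3 into DRAM are the point.

```python
# Pointer-chasing sketch: average time per dependent random access as the
# working set grows past each cache level. Sizes and step count are arbitrary.
import random
import time
from array import array

def ns_per_access(n_elems: int, steps: int = 1_000_000) -> float:
    perm = list(range(n_elems))
    random.shuffle(perm)            # random jumps defeat hardware prefetching
    table = array("q", perm)        # contiguous 8-byte elements
    i = 0
    t0 = time.perf_counter()
    for _ in range(steps):
        i = table[i]                # each load depends on the previous result
    return (time.perf_counter() - t0) / steps * 1e9

for n in (2**12, 2**16, 2**20, 2**24):   # 32 KiB .. 128 MiB working sets
    print(f"{n * 8 // 1024:>8} KiB: {ns_per_access(n):7.1f} ns/access")
```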
Broader Theoretical and Practical Takeaways
- Multiple commenters emphasize that assuming O(1) RAM access is the key abstraction gap between classic models (RAM, Turing machines) and physical machines, especially at data-center scale.
- Cache-aware and cache-oblivious models already treat the memory hierarchy explicitly; some see O(M^{1/3}) as a helpful intuition for why locality matters rather than something that fundamentally changes algorithm design (a blocked-transpose sketch follows this list).
- Others think the claim is overfitted, mathematically sloppy (Big‑O where Big‑Ω is meant, the wrong N), or trivial once restated precisely, though the core message, that large and distant memory is inherently slower to reach, is widely accepted.
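As a concrete instance of the cache-aware models mentioned above (the block size B is a tuning assumption, not a value from the thread), a tiled transpose keeps both its reads and writes inside a cache-resident B×B tile instead of striding across a full row at a time:

```python
# Naive vs. blocked (cache-aware) transpose of an n x n matrix stored
# as a flat list in row-major order.

def transpose_naive(a, n):
    out = [0] * (n * n)
    for i in range(n):
        for j in range(n):
            out[j * n + i] = a[i * n + j]   # writes stride by n: poor locality
    return out

def transpose_blocked(a, n, B=64):          # B chosen so a tile stays cache-resident
    out = [0] * (n * n)
    for ii in range(0, n, B):
        for jj in range(0, n, B):
            for i in range(ii, min(ii + B, n)):
                for j in range(jj, min(jj + B, n)):
                    out[j * n + i] = a[i * n + j]
    return out

n = 256
a = list(range(n * n))
assert transpose_naive(a, n) == transpose_blocked(a, n)
```

In CPython the interpreter overhead hides most of the benefit; the access pattern itself is what compiled, cache-aware or cache-oblivious implementations exploit.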