Memory access is O(N^{1/3})
Physical and Geometric Limits
- Several comments tie memory latency to physics: the finite speed of light plus entropy bounds such as the Bekenstein/holographic limit, which cap information by enclosing surface area rather than volume. That implies a per-access lower bound of Ω(N^{1/2}) for area-limited packing or Ω(N^{1/3}) for volume-limited packing, depending on dimensional assumptions (see the sketch after this list).
- Others argue the “near-black-hole computer” reasoning is too speculative: getting from entropy bounds to concrete latency crosses several layers of abstraction (classical vs quantum information, where the data must physically arrive, supporting mass, time dilation, heat removal).
- There’s debate whether modern hardware is effectively 2D (PCB/IC layouts → ~√N scaling) versus partially 3D-stacked memory, and whether heat dissipation (area-limited) dominates.
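A back-of-the-envelope version of the geometric argument above (speed-of-light round trip only; all constants, error correction, and cooling ignored) might look like this:

```latex
% Round-trip signal time to the farthest of N bits packed at fixed density,
% counting nothing but the speed of light c.
\[
  t_{\mathrm{access}}(N) \;\gtrsim\; \frac{2\,r(N)}{c},
  \qquad
  r(N) \;\propto\;
  \begin{cases}
    N^{1/3} & \text{volume-limited packing (3D)} \\[2pt]
    N^{1/2} & \text{area-limited packing (2D layouts, holographic bound)}
  \end{cases}
\]
% The Bekenstein/holographic bound caps information by enclosing area,
% which is why the asymptotic lower bound is argued to be \Omega(N^{1/2})
% rather than \Omega(N^{1/3}) once N is pushed far enough.
```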
What Big‑O and “N” Actually Mean
- A long subthread clarifies that Big‑O is an upper bound on a function, not inherently “worst case” and not itself a runtime; you have to say which function is being bounded (best-, average-, or worst-case running time).
- Disagreement over whether a single random memory access should be modeled as O(1) or O(M^{1/3}), where M is addressable memory.
- Some say algorithmic complexity should include this factor (e.g., linear scan becomes O(N·M^{1/3})); others argue M^{1/3} is just another hardware constant, like CPU frequency, for most algorithm analysis.
- Multiple people stress that N (the input/data size) and M (the memory pool size) are distinct variables; adding 1/3 to every exponent of N is usually conflating the two (a toy cost model follows this list).
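A toy cost model (illustrative only; the function names and constants below are assumptions, not from the thread) makes the N-vs-M distinction concrete: N counts the elements an algorithm touches, M is the size of the pool each random access has to reach into.

```python
# Toy cost model: N = number of elements the algorithm touches,
# M = bytes of addressable memory the accesses are spread over.
# Constants are deliberately omitted; only the scaling shape matters.

def access_cost(M: int) -> float:
    """Cost of one random access under the geometric model: ~M^(1/3)."""
    return M ** (1 / 3)

def linear_scan_cost(N: int, M: int) -> float:
    """N accesses, each charged the per-access cost.

    Flat RAM model:   N * O(1)      -> O(N)
    Geometric model:  N * M^(1/3)   -> O(N * M^(1/3))
    The 1/3 exponent attaches to M (pool size), not to N (input size);
    rewriting every O(N^k) as O(N^(k + 1/3)) conflates the two.
    """
    return N * access_cost(M)

if __name__ == "__main__":
    N = 10**6
    for M in (2**20, 2**30, 2**40):   # 1 MiB, 1 GiB, 1 TiB pools
        print(f"M = 2^{M.bit_length() - 1:2d}: scan cost ~ {linear_scan_cost(N, M):.3e}")
```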
Empirical Evidence and the Article’s Argument
- Several note prior work (“Myth of RAM”) where measured latencies over many systems fit ≈√N better than N^{1/3}.
- Many object that the article’s “empirical argument” is a ChatGPT-generated table rather than real measurements; relying on such data draws heavy criticism.
- Some read the article as being about the latency of a single random access as a function of total memory size, not the time to scan all of memory; ambiguous wording like “access 8× as much memory” feeds the confusion.
- One commenter points out that full scans can remain O(N) if bandwidth scales with capacity and latency is overlapped with useful work (a simplified timing model follows this list).
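A rough model of that overlap argument (the latency and bandwidth figures are assumed round numbers, not measurements): when requests are independent and can be pipelined, a full scan is bandwidth-bound and stays O(N); when each access depends on the previous one, per-access latency adds up instead.

```python
# Simplified timing model for touching n cache lines of memory.
# LATENCY_S and BANDWIDTH_BPS are placeholder values for illustration.

LINE_BYTES = 64
LATENCY_S = 100e-9       # assumed time for one isolated random access
BANDWIDTH_BPS = 20e9     # assumed sustained sequential bandwidth

def scan_time_overlapped(n_lines: int) -> float:
    """Independent, prefetchable accesses: one latency, then bandwidth-bound."""
    return LATENCY_S + n_lines * LINE_BYTES / BANDWIDTH_BPS

def scan_time_dependent(n_lines: int) -> float:
    """Pointer chasing: the next address is unknown until the load completes."""
    return n_lines * LATENCY_S

for n in (10**6, 10**7, 10**8):
    print(f"{n:>10} lines: overlapped {scan_time_overlapped(n):.4f} s, "
          f"dependent {scan_time_dependent(n):.4f} s")
```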
Architecture, Caches, and NUMA
- Several argue the result is largely an artifact of cache-hierarchy design rather than a pure geometric law; in practice latency behaves more like a step function as the working set crosses L1/L2/L3/RAM/disk boundaries (a pointer-chasing sketch follows this list).
- NUMA, multi-socket systems, and distributed/cloud setups are discussed as real manifestations of non‑uniform memory access; others note NUMA cost/latency overheads and why many large services still prefer single-socket nodes.
- Examples include precomputation tables that become slower when they no longer fit in cache, and processing-in-memory / GPU-style local computation that sidesteps global O(M^{1/3}) costs.
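A pointer-chasing microbenchmark is the usual way to see that step behaviour; the sketch below is a Python approximation (interpreter overhead inflates the absolute numbers, and a C version is what people actually use), but the jumps as the working set spills out of L1/L2/L3 into DRAM are the point.

```python
# Pointer-chasing sketch: average time per dependent random access as the
# working set grows past each cache level. Sizes and step count are arbitrary.
import random
import time
from array import array

def ns_per_access(n_elems: int, steps: int = 1_000_000) -> float:
    perm = list(range(n_elems))
    random.shuffle(perm)            # random jumps defeat hardware prefetching
    table = array("q", perm)        # contiguous 8-byte elements
    i = 0
    t0 = time.perf_counter()
    for _ in range(steps):
        i = table[i]                # each load depends on the previous result
    return (time.perf_counter() - t0) / steps * 1e9

for n in (2**12, 2**16, 2**20, 2**24):   # 32 KiB .. 128 MiB working sets
    print(f"{n * 8 // 1024:>8} KiB: {ns_per_access(n):7.1f} ns/access")
```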
Broader Theoretical and Practical Takeaways
- Multiple commenters emphasize that assuming O(1) RAM access is the key abstraction gap between classic models (RAM, Turing machines) and physical machines, especially at data-center scale.
- Cache-aware and cache-oblivious models already treat the memory hierarchy explicitly; some see O(M^{1/3}) as a helpful intuition for why locality matters rather than something that fundamentally changes algorithm design (a blocked-transpose sketch follows this list).
- Others think the claim is overfitted, mathematically sloppy (Big‑O where Big‑Ω is meant, the wrong N), or trivial once restated precisely, though the core message, that large and distant memory is inherently slower to reach, is widely accepted.
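As a concrete instance of the cache-aware models mentioned above (the block size B is a tuning assumption, not a value from the thread), a tiled transpose keeps both its reads and writes inside a cache-resident B×B tile instead of striding across a full row at a time:

```python
# Naive vs. blocked (cache-aware) transpose of an n x n matrix stored
# as a flat list in row-major order.

def transpose_naive(a, n):
    out = [0] * (n * n)
    for i in range(n):
        for j in range(n):
            out[j * n + i] = a[i * n + j]   # writes stride by n: poor locality
    return out

def transpose_blocked(a, n, B=64):          # B chosen so a tile stays cache-resident
    out = [0] * (n * n)
    for ii in range(0, n, B):
        for jj in range(0, n, B):
            for i in range(ii, min(ii + B, n)):
                for j in range(jj, min(jj + B, n)):
                    out[j * n + i] = a[i * n + j]
    return out

n = 256
a = list(range(n * n))
assert transpose_naive(a, n) == transpose_blocked(a, n)
```

In CPython the interpreter overhead hides most of the benefit; the access pattern itself is what compiled, cache-aware or cache-oblivious implementations exploit.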