Python numbers every programmer should know

Scope, Title, and Intent

  • Many readers take the title literally and push back: they argue “every programmer should know” overstates the case, and that a handful of relative costs is enough.
  • Others note it’s clearly modeled on the classic “Latency Numbers Every Programmer Should Know”; some think the homage works, others say the list is too long and specific to be memorable.
  • The author clarifies in-thread: the goal is a mental model and to show when micro-optimizations don’t matter, not to encourage shaving nanoseconds.

Usefulness vs. Overkill

  • One camp: if you’re in a domain where these per-op nanoseconds and bytes matter, Python is probably the wrong tool; focus on algorithmic complexity, IO, and profiling instead.
  • Counterpoint: performance is a leaky abstraction; rough constants help you sanity-check expectations (“should a million adds take ~tens of ms or seconds?”) and choose data structures wisely.
  • Several experienced Python users say they’ve never needed such numbers in 10–20 years of work; they rely on profiling and higher-performance libraries (NumPy, DuckDB, Cython, etc.).
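The sanity-check point above can be sketched with `timeit` (a hypothetical example, not from the article; absolute times depend on hardware and CPython version):

```python
import timeit

# Hypothetical sanity check: if one integer add costs tens of
# nanoseconds, a million adds in a pure-Python loop should land in the
# tens of milliseconds, not whole seconds.
def million_adds():
    total = 0
    for i in range(1_000_000):
        total += i
    return total

elapsed = timeit.timeit(million_adds, number=1)
print(f"1M adds: {elapsed * 1000:.1f} ms")
```

If the measured time is orders of magnitude off from the back-of-the-envelope estimate, that is the cue to profile rather than guess.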

Python Performance Strategy

  • Recurrent advice:
    • Use Python where performance is “good enough”; push hot paths into C/Rust/NumPy/Numba/JAX/etc. when needed.
    • Prefer algorithmic/data-structure fixes (e.g., set vs list membership, bulk IO vs tiny writes) over micro-tuning.
    • Profile real workloads; don’t pre-optimize based on tables.
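The set-vs-list point is the canonical example of an algorithmic fix beating micro-tuning. A small illustrative sketch (sizes and numbers are assumptions, not the article's):

```python
import timeit

# List membership scans linearly (O(n)); set membership hashes (O(1)
# on average), so the gap grows with collection size.
data = list(range(100_000))
as_set = set(data)
needle = 99_999  # worst case for the list: the last element

list_time = timeit.timeit(lambda: needle in data, number=100)
set_time = timeit.timeit(lambda: needle in as_set, number=100)
print(f"list: {list_time:.4f} s   set: {set_time:.6f} s")
```

No table of per-op nanoseconds is needed to predict the outcome; the complexity class dominates.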

Benchmark Quality and Variability

  • Multiple commenters stress that the numbers are highly hardware-, OS-, build-, and version-dependent; an Apple M4 Pro Mac isn’t representative of typical server hardware.
  • Critiques of missing/weak stats: lack of standard deviations or confidence intervals; only medians are shown.
  • Some measurements and explanations are called out as misleading or incomplete:
    • String memory example ignores Unicode representations.
    • Constant-time claims for concatenation hinge on “small” sizes.
    • Object sizes (ints, lists of ints/floats, empty set/dict) were initially misinterpreted; the distinction between container size and element sizes matters.
    • Async benchmarks (e.g., asyncio.sleep(0), gather) conflate event-loop spin cost with task/future construction overhead.
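The string-representation and container-size critiques can both be seen with `sys.getsizeof` (a sketch; exact byte counts vary by CPython version and build):

```python
import sys

# CPython stores a string in 1-, 2-, or 4-bytes-per-character form
# depending on the widest code point present, so "bytes per character"
# is not a single number.
print(sys.getsizeof("a" * 10))           # Latin-1: 1 byte per char
print(sys.getsizeof("\u20ac" * 10))      # UCS-2:   2 bytes per char
print(sys.getsizeof("\U0001f600" * 10))  # UCS-4:   4 bytes per char

# getsizeof reports only the container itself, not the objects it
# references, so a list's "size" omits its elements:
nums = list(range(1_000))
shallow = sys.getsizeof(nums)
deep = shallow + sum(sys.getsizeof(n) for n in nums)
print(shallow, deep)  # pointer array vs. array plus the int objects
```

This is why a single "size of a string" or "size of a list of ints" row can mislead without stating which representation and which level of accounting is meant.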
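The asyncio critique, roughly: an end-to-end measurement of `asyncio.sleep(0)` also pays for loop creation, task construction, and teardown. A small sketch (an assumed harness, not the article's) separates the one-off cost from the amortized per-yield cost:

```python
import asyncio
import time

async def yield_n(n):
    # Each await asyncio.sleep(0) yields control to the event loop once.
    for _ in range(n):
        await asyncio.sleep(0)

# A single end-to-end run includes loop startup and teardown...
t0 = time.perf_counter()
asyncio.run(yield_n(1))
single = time.perf_counter() - t0

# ...while spreading many yields across one loop isolates the per-yield cost.
t0 = time.perf_counter()
asyncio.run(yield_n(10_000))
per_yield = (time.perf_counter() - t0) / 10_000

print(f"one-off run: {single * 1e6:.0f} µs, amortized per yield: {per_yield * 1e9:.0f} ns")
```

Conflating the two makes the event loop's steady-state yield cost look far more expensive than it is.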

Broader Reflections

  • Several see the page as a fun, educational reference and a way to update intuition that “Python is always slow” (many basic ops are tens of ns).
  • Others label it “AI slop” or “premature optimization bait,” arguing that without solid methodology and context (e.g., C baselines, stdlib vs third-party libs), such tables can mislead more than they help.