Recent Performance Improvements in Function Calls in CPython

Function Calls, Locals, and Micro‑Optimizations

  • Several comments benchmark global vs local references to min.
    • Binding local_min = min inside a function and using that alias in the loop often yields a ~5–10% speedup: a local is read from a compact array slot, whereas a bare min is looked up in the module's globals dict and then in the builtins dict.
    • However, one counter‑benchmark showed a local alias being slightly slower, illustrating that results depend on exact code and version.
  • Putting hot code in a function can roughly halve runtime compared with running it at module scope, because module‑level variables are globals (dict lookups) while variables inside a function are fast locals.
  • Dynamic features (min can be rebound at any time; locals() and exec can inspect or write namespaces) constrain what the interpreter may optimize and can cause surprising behavior around local variable visibility.
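
A minimal sketch of the kind of micro‑benchmark these comments describe, assuming a descending list of 1,000 integers as input. The function names (smallest_global, smallest_local, exec_cannot_rebind_local, compare) are hypothetical and not taken from the thread. Timings depend on the machine and the CPython version, so none are shown here.

```python
import timeit


def smallest_global(values):
    # `min` is not a local here: each iteration resolves it through
    # the module globals and then the builtins namespace.
    best = values[0]
    for v in values:
        best = min(best, v)
    return best


def smallest_local(values):
    # One global/builtin lookup up front; the loop then reads a local slot.
    local_min = min
    best = values[0]
    for v in values:
        best = local_min(best, v)
    return best


def exec_cannot_rebind_local():
    # exec() inside a function writes to a separate namespace mapping,
    # not to the function's fast local slot, so `x` is unchanged.
    x = 1
    exec("x = 2")
    return x


def compare(repeats=5, number=50):
    # Assumed workload: 1,000 integers in descending order.
    data = list(range(1000, 0, -1))
    results = {}
    for fn in (smallest_global, smallest_local):
        # Keep the best of several runs to reduce noise.
        results[fn.__name__] = min(
            timeit.repeat(lambda: fn(data), repeat=repeats, number=number)
        )
    return results


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f} s")
    print("x after exec:", exec_cannot_rebind_local())
```

The last function illustrates the locals()/exec caveat: the assignment made by exec is not visible to the enclosing function, so it returns 1. Whether the local alias wins in the timing comparison is exactly what the thread disagrees about, so it is worth running on the target interpreter.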

Builtins, Loops, and Algorithm Choices

  • Using min(heights) on a list is ~5x faster than a manual while loop that scans for the minimum; a for loop is faster than a while loop, but still much slower than the builtin.
  • Using min(a, b) inside a loop was ~3x slower than a simple if comparison.
  • General advice: let optimized builtins and C‑backed libraries do the looping; avoid tight Python loops when possible.
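
The variants compared above can be sketched as follows. This is an illustrative reconstruction, not the code from the thread: the function names, the list size, and the synthetic heights data are all assumptions. The ratios quoted in the discussion (~5x, ~3x) should be re‑measured rather than assumed.

```python
import timeit


def min_while(heights):
    # Manual index-based scan with a while loop.
    i, best = 1, heights[0]
    while i < len(heights):
        if heights[i] < best:
            best = heights[i]
        i += 1
    return best


def min_for(heights):
    # Same scan, but iteration is driven by the for loop.
    best = heights[0]
    for h in heights:
        if h < best:
            best = h
    return best


def min_pairwise(heights):
    # Calls min(a, b) once per element instead of a plain comparison.
    best = heights[0]
    for h in heights:
        best = min(best, h)
    return best


def min_builtin(heights):
    # The whole loop runs inside the C implementation of min().
    return min(heights)


def benchmark(size=10_000, number=50):
    # Assumed synthetic data: a deterministic shuffle-like sequence.
    heights = [(i * 7919) % size for i in range(size)]
    timings = {}
    for fn in (min_while, min_for, min_pairwise, min_builtin):
        timings[fn.__name__] = timeit.timeit(lambda: fn(heights), number=number)
    return timings


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

All four functions return the same value for a non‑empty list; only the amount of work done in Python bytecode differs.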

CPython vs Other Runtimes and Languages

  • CPython remains much slower than PHP, Go, and Java in simple loop benchmarks (often several times slower), while PyPy can be significantly faster than both CPython and PHP on the same code.
  • Some argue this confirms Python should mainly be “glue” around fast C/Fortran/Rust code; others point out that for many apps, developer time and ecosystem outweigh raw speed.
  • Rewriting Python prototypes in Rust/Go/Java often yields 10–20x speedups, but at higher implementation cost.

Interpreter Internals and Optimization Strategy

  • Python 3.11+ executes many Python‑to‑Python calls entirely inside the bytecode interpreter loop (without a recursive C‑level call into the evaluation function), improving call performance.
  • The discussion compares this to Lua and LuaJIT, which experiment with opposite strategies (using or avoiding C‑level calls) for performance.
  • Superinstructions were mostly removed because they inhibited newer optimizations.
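
The interpreter change itself is not observable from Python code, but the bytecode that the loop dispatches on is. A small sketch using the standard dis module is given here. Note that add, caller, and call_opnames are illustrative names, and opcode names differ between versions (for example CALL_FUNCTION in 3.10 and CALL in later releases), so no output is shown.

```python
import dis


def add(a, b):
    return a + b


def caller():
    # A plain Python-to-Python call: the case the 3.11+ change targets.
    return add(1, 2)


def call_opnames(fn):
    # Collect the names of call-related opcodes in fn's bytecode.
    return [
        ins.opname
        for ins in dis.get_instructions(fn)
        if "CALL" in ins.opname
    ]


if __name__ == "__main__":
    print(call_opnames(caller))
```

This only shows which instruction represents the call; how the interpreter executes it (pushing a new frame within the same loop or not) is an implementation detail of the running CPython version.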

Ecosystem, Use Cases, and Tooling

  • Python is defended as a high‑level DSL over fast native libraries, especially in ML/AI and scientific computing where SciPy, NumPy, and friends dominate.
  • Alternatives like Go, Rust, Nim, Julia, Common Lisp, Fortran, and others are discussed; ecosystem maturity and hiring pool heavily favor Python.
  • Heavy imports (NumPy, pandas, xgboost, etc.) can take seconds, which matters for many short‑lived processes but is a one‑time cost for long‑running workloads.
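
To check whether import time matters for a given script, the cost can be measured directly. The helper below is a hypothetical sketch (time_import is not from the thread) and uses a standard‑library module so that it runs anywhere. Substitute the heavy package of interest.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return the seconds spent importing module_name.

    Returns None if the module is already in sys.modules, because a
    cached import would not reflect the real first-import cost.
    """
    if module_name in sys.modules:
        return None
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = time_import("decimal")
    if elapsed is None:
        print("decimal was already imported")
    else:
        print(f"import decimal: {elapsed * 1000:.2f} ms")
```

For a per‑module breakdown, CPython also accepts the -X importtime option (for example, python -X importtime -c "import numpy"), which writes timing for each imported module to stderr.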

Meta‑Takeaways

  • Participants stress benchmarking actual code rather than relying on old rules of thumb.
  • Consensus: CPython deliberately trades speed for simplicity, flexibility, and C‑extension friendliness; performance work is welcome but won’t change that core trade‑off.