Python performance myths and fairy tales
Python’s Slowness and What the Talk Actually Says
- Many readers note the talk confirms that Python is slow, not the opposite.
- The article’s “myths” are clarified as:
  - “Python is not slow” → debunked: it is slow.
  - “It’s just glue, rewrite hot parts in C/Rust” → debunked as non-trivial because of boxing/unboxing and FFI overhead.
  - “It’s slow because it’s interpreted” → debunked: interpretation is a minor part; dynamic semantics dominate.
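The "dynamic semantics dominate" point can be illustrated with a small sketch. The `Meters` class and `add` function below are hypothetical names made up for this example, not something from the talk. The same one-line function has to work out at runtime what `+` means for each pair of operands, and that meaning can change while the program runs.

```python
import dis


class Meters:
    """Hypothetical wrapper type, used only to illustrate dispatch."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Meters(self.value + other.value)


def add(a, b):
    # One source line, but the operation is resolved from the operand
    # types on every call.
    return a + b


int_result = add(2, 3)                      # integer addition
str_result = add("py", "thon")              # string concatenation
obj_result = add(Meters(2), Meters(3)).value  # user-defined __add__

# Rebinding __add__ at runtime changes what the unchanged add() does.
Meters.__add__ = lambda self, other: Meters(self.value * other.value)
patched_result = add(Meters(2), Meters(3)).value

if __name__ == "__main__":
    dis.dis(add)  # the bytecode is identical for all of the calls above
    print(int_result, str_result, obj_result, patched_result)
```

Removing the interpreter loop alone would not remove this lookup work, which is roughly the argument the thread attributes to the talk.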
Ergonomics vs Performance Trade‑off
- One side argues Python’s “99% performance cost for 1% niceness” is a bad deal and reflects poor language design (too dynamic, too many ways to do things).
- The other side says productivity and correctness matter more:
  - Code is faster to write, easier to read, and often more correct.
  - Many workloads are I/O or network bound; CPU time is not the main bottleneck.
  - Popular libraries (pytest, Pydantic, Pandas, Matplotlib) and the wider ecosystem make Python “second best at everything.”
JITs, Dynamic Semantics, and CPUs
- Commenters discuss how JITs rely on speculation and guards, and how CPUs hide some of the guard overhead through branch prediction and out-of-order execution.
- Critics stress that Python’s semantics (dynamic attribute lookup, descriptors, big-int arithmetic, monkey patching) impose unavoidable runtime checks; JITs can reduce but not eliminate them.
- Comparisons are drawn to JavaScript (V8), Smalltalk, Common Lisp, and PyPy; some argue these show dynamic languages can be fast, others say Python’s specific semantics and C API make it harder.
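The speculation-and-guard idea discussed above can be modeled in plain Python. This is only a conceptual sketch: `make_specialized_add` and `generic_add` are hypothetical names, and a real JIT emits machine code rather than Python closures, so this version shows the control flow of a guard, not any speedup.

```python
def generic_add(a, b):
    """Slow path: full dynamic dispatch on the operand types."""
    return a + b


def make_specialized_add():
    """Return an add function that speculates both operands are exact ints.

    Also returns a counter dict so callers can see which path was taken.
    """
    stats = {"fast": 0, "slow": 0}

    def add(a, b):
        # Guard: exact type check. A subclass of int could override
        # __add__, so isinstance() would not be a safe guard here.
        if type(a) is int and type(b) is int:
            stats["fast"] += 1
            return int.__add__(a, b)  # specialized path, no lookup needed
        # Guard failed: fall back to the generic path ("deoptimize").
        stats["slow"] += 1
        return generic_add(a, b)

    return add, stats


if __name__ == "__main__":
    add, stats = make_specialized_add()
    print(add(1, 2), add(1.5, 2), add("a", "b"), stats)
```

The guard itself is the cost critics point to: it can be made cheap and well predicted, but as long as the language allows the operand types to change, it has to stay.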
C Extensions, FFI, and “Glue Language” Reality
- Crossing Python–C boundaries is costly; best practice is to move entire hot loops into compiled code, not call C per element.
- FFI adds complexity in build, debugging, and distribution; some argue it’s often simpler to just write everything in Rust/C++ if performance is central.
- Counterpoint: in many domains (NumPy, PyTorch, Arrow, GNU Radio) Python serves effectively as a DSL/config layer over optimized native cores.
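The advice to move whole loops into compiled code can be sketched with the standard library alone, assuming CPython, where `sum`, `map`, and `operator.mul` are implemented in C. The function names are hypothetical, and no timings are claimed here since they vary by machine and Python version.

```python
import operator
import timeit


def sum_squares_loop(values):
    """Per-element version: every iteration runs in the interpreter."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_builtin(values):
    """Whole-loop version: iteration, multiply, and add all happen
    inside C-implemented builtins (on CPython)."""
    return sum(map(operator.mul, values, values))


if __name__ == "__main__":
    data = list(range(100_000))
    assert sum_squares_loop(data) == sum_squares_builtin(data)
    for fn in (sum_squares_loop, sum_squares_builtin):
        seconds = timeit.timeit(lambda: fn(data), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Libraries such as NumPy apply the same idea at a larger scale: the Python layer describes the operation once and the native core runs the loop.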
Concurrency and the GIL
- Questions about multi-core use: multiprocessing has been around for years; recent versions add a “free-threaded” build, and GIL work is ongoing.
- Some doubt multithreading will close the gap with compiled languages due to memory-bandwidth waste and object-heavy layouts.
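The "object-heavy layouts" concern can be made concrete with a rough size comparison, assuming CPython. The helper names are hypothetical, and `sys.getsizeof` gives only an approximation of the real footprint.

```python
import sys
from array import array


def boxed_bytes(values):
    """Approximate bytes for a list of floats: the list's pointer array
    plus one separately allocated float object per element."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


def packed_bytes(values):
    """Bytes of payload when the same numbers are stored unboxed as
    contiguous C doubles."""
    packed = array("d", values)
    return packed.itemsize * len(packed)


if __name__ == "__main__":
    numbers = [float(i) for i in range(1000)]
    print("boxed :", boxed_bytes(numbers))
    print("packed:", packed_bytes(numbers))
```

More threads do not change this ratio: each core still has to pull the larger, pointer-chasing representation through the memory system.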
Alternatives, Subsets, and “Python 4”
- Ideas raised: static or “final” qualifiers, restricted subsets (SPy, Numba-style), or moving to a different language, whether Python-like but stricter (Mojo, Nim) or not (Julia, Go, Rust, Scala).
- PyPy praised for speed but criticized for compatibility and behavioral differences (GC, C-extensions).
- Several think a truly fast “Python 4” would effectively be a new, incompatible language; others suggest opt‑in restrictions or DSL compilers as a more practical path.
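Python already has a few opt-in restrictions in the spirit of the ideas above, shown here as a minimal sketch. `MAX_ITEMS`, `Point`, and `try_add_attribute` are hypothetical names. Note that `typing.Final` is checked only by static type checkers, while `__slots__` is enforced at runtime.

```python
from typing import Final

# Final is a hint for static type checkers; the interpreter itself
# does not prevent reassignment.
MAX_ITEMS: Final = 10


class Point:
    # __slots__ fixes the attribute set: instances get no __dict__,
    # so new attributes cannot be added at runtime.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def try_add_attribute(point):
    """Return True if an undeclared attribute could be attached."""
    try:
        point.z = 0
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    p = Point(1, 2)
    print("can add attribute:", try_add_attribute(p))
```

Neither feature makes CPython faster by itself in any dramatic way; they only show what "opt-in restriction" looks like in today's language, which is the kind of information a stricter subset or DSL compiler could rely on.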