I don't like NumPy
Array semantics, broadcasting, and >2D pain
- Many agree the real difficulty starts with 3D+ arrays: slicing, reshaping, and broadcasting become hard to reason about.
- Advanced indexing is seen as especially opaque: shapes change in non‑intuitive ways, and scalar vs array indices interact with broadcasting in confusing, poorly documented ways.
- Some argue this is partly that humans are bad at higher dimensions; others think better array languages (APL/J/BQN, Julia) show the problem is NumPy’s design, not the domain.
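The advanced-indexing complaint above can be made concrete with a small sketch. The array `a` and its shape are made up for illustration. The example shows how a scalar index combined with a list index, separated by a slice, moves the indexed dimension to the front of the result.

```python
import numpy as np

# Hypothetical 3D array; only the shape matters here.
a = np.zeros((2, 3, 4))

# Basic indexing: the scalar drops axis 0, leaving (3, 4).
basic = a[0]

# Indexing in two steps keeps the axes in their original order.
two_step = a[0][:, [0, 1]]      # shape (3, 2)

# One step: the scalar 0 and the list [0, 1] are both treated as
# advanced indices. Because a slice separates them, the broadcast
# index dimension is placed first in the result.
one_step = a[0, :, [0, 1]]      # shape (2, 3)

print(basic.shape, two_step.shape, one_step.shape)
```

The two expressions select the same elements, but the result axes come out in a different order. This is the kind of shape change commenters describe as non-intuitive.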
Loops, vectorization, and performance hierarchy
- Debate over “you can’t use loops”: some say NumPy’s point is performance and falling back to Python loops defeats the purpose, especially at pixel‑ or element‑level.
- Others use NumPy “like MATLAB” where developer time matters more than runtime, and occasional loops are fine.
- Several posts outline a performance ladder (GPU > vectorized CPU > statically compiled scalar code > dynamic Python), emphasizing how easy it is to accidentally fall to the bottom by writing innocent‑looking loops.
- Concrete examples (e.g., sieve of Eratosthenes) show that many algorithms cannot be cleanly vectorized; in those cases NumPy doesn’t solve Python’s slowness.
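A minimal sketch of the sieve example, assuming the usual formulation (this is not code from the thread, and the function names `sieve_python` and `sieve_numpy` are hypothetical). The inner "cross off multiples" step maps onto a strided slice assignment, but the outer loop depends on the results of earlier iterations and stays a Python loop.

```python
import numpy as np

def sieve_python(n):
    """Primes below n using plain Python loops."""
    if n < 2:
        return []
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Element-level loop: this is the bottom of the ladder.
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def sieve_numpy(n):
    """Primes below n; only the inner loop is vectorized."""
    if n < 2:
        return np.array([], dtype=np.intp)
    is_prime = np.ones(n, dtype=bool)
    is_prime[:2] = False
    # The outer loop reads state written by earlier iterations,
    # so it cannot be replaced by a single array expression.
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            is_prime[p * p::p] = False  # strided slice assignment
    return np.flatnonzero(is_prime)

print(sieve_python(30))
print(sieve_numpy(30).tolist())
```

Whether the second version counts as "cleanly vectorized" is the point of the debate: part of the algorithm still runs at Python speed.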
Comparisons: MATLAB, Julia, R, array languages
- MATLAB and Julia are praised for more consistent, math‑like array syntax; vectorized code often “just works” with minor tweaks.
- R/tidyverse is liked for data manipulation but criticized as a DSL with painful general‑purpose programming and deployment.
- Several see NumPy as “not a true array language” but a vectorization library bolted onto Python. Others prefer its broadcasting over MATLAB’s memory‑heavy style.
Workarounds and alternative tools
- For multidimensional work, xarray (named dimensions) is heavily recommended and reportedly eliminates many of the author’s complaints.
- Other suggestions: JAX (especially vmap and jit), Numba, CuPy, Torch, einops, named tensors, array-api-compat, and niche projects that turn NumPy into a more complete array language.
API inconsistencies, gotchas, and ecosystem issues
- Complaints about inconsistent axis arguments, surprising return types, verbose syntax, implicit broadcasting bugs, and legacy warts (poly1d, indexing rules).
- Some argue this reflects broader Python problems: dynamic, underspecified APIs; difficulty standardizing across libraries; heavy dependency/import overhead.
- Others defend NumPy as a crucial lingua franca and reference implementation that enabled most of the scientific Python stack despite its rough edges.
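As one illustration of the "implicit broadcasting bugs" mentioned above, here is a small sketch with made-up data. The names `y_true` and `y_pred` are hypothetical. Subtracting a shape (3,) array from a shape (3, 1) array raises no error and silently produces a (3, 3) result.

```python
import numpy as np

# Hypothetical targets and predictions with identical values.
y_true = np.array([1.0, 2.0, 3.0])        # shape (3,)
y_pred = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1), e.g. a column output

# (3,) minus (3, 1) broadcasts to (3, 3): every pair is compared,
# not just matching positions, and nothing warns about it.
diff = y_true - y_pred
mse_wrong = np.mean(diff ** 2)

# Flattening the column first gives the intended elementwise difference.
mse_right = np.mean((y_true - y_pred.ravel()) ** 2)

print(diff.shape, mse_wrong, mse_right)
```

The predictions match the targets exactly, so the intended error is zero, yet the broadcast version reports a nonzero value.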