2025-01-23

C is not suited to SIMD (2019)

Scope of the Argument

Discussion centers on auto-vectorization and high-level SIMD, not on whether C can use intrinsics or inline assembly.
Several commenters say the article’s title is misleading: C is fine for manual SIMD; the hard part is getting compilers to turn generic scalar C into good SIMD automatically.

Auto‑Vectorization Limits in C

Key issue: C’s pointer and function model obscures aliasing and higher-level structure, making automatic SIMD harder.
Aliasing: compilers often can’t prove that pointers don’t overlap, especially in libraries. restrict can help, but misusing it is undefined behavior, and a library author often cannot know at compile time whether callers’ pointers overlap.
Some compilers can vectorize loops such as exp/sigmoid using vector math libraries, but typically need non‑default flags (fast-math or similar). Many consider such flags unacceptable for serious code because they relax floating-point semantics.
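
The aliasing point can be illustrated with a minimal sketch. The function names scale_may_alias and scale_no_alias are hypothetical, chosen only for this example; whether a given compiler vectorizes either loop depends on the compiler, target, and optimization flags.

```c
#include <stddef.h>

/* dst and src may overlap, so the compiler has to assume that a store to
   dst[i] could change a later src[j]. To vectorize it must add a runtime
   overlap check or fall back to scalar code. */
void scale_may_alias(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* restrict is a promise from the caller that dst and src do not overlap.
   The compiler may rely on it without checking; calling this function
   with overlapping buffers is undefined behavior. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

Both functions compute the same result for non-overlapping buffers. The difference is only in what the compiler is allowed to assume, which is why a library that cannot see its callers has trouble choosing between them.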

Math Functions and Modularity

One line of argument: math functions like exp are library calls (or intrinsics) with scalar signatures (e.g., double exp(double)), which blocks fusion with surrounding loops and thus SIMD opportunities.
Others counter that modern compilers can treat standard math functions specially and that exp is almost always implemented in software anyway, not as a single hardware instruction.
General theme: modularity and separate compilation of functions obstruct global optimization and fusion needed for aggressive SIMD.
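
A rough sketch of the shape of code under discussion follows. approx_exp is a hypothetical stand-in for a library exp, using a crude approximation that is only reasonable for small inputs; it is not how any real math library implements exp.

```c
#include <stddef.h>

/* Hypothetical stand-in for a library exp (an assumption for this sketch).
   Uses exp(x) ~= (1 + x/1024)^1024, computed with ten squarings.
   Crude, and only meant for small |x|. */
float approx_exp(float x)
{
    float y = 1.0f + x / 1024.0f;
    for (int i = 0; i < 10; i++)
        y *= y;
    return y;
}

/* The loop body calls a function with a scalar signature once per
   element. If the definition is visible, the compiler can inline it and
   may vectorize the whole loop. If only the prototype is visible (a
   separate translation unit without link-time optimization), the call is
   opaque and the loop stays scalar unless the compiler knows a vector
   variant of the callee. */
void sigmoid(float *restrict out, const float *restrict in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = 1.0f / (1.0f + approx_exp(-in[i]));
}
```

Here both functions sit in one file, so inlining is possible. Moving approx_exp into a separately compiled library reproduces the modularity problem the commenters describe.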

Manual SIMD and Libraries

Many point out that C/C++ with intrinsics or inline assembly is widely and successfully used for sophisticated SIMD (UTF‑8 decoding, compression, sorting, exponents).
There is debate over “portable intrinsics” across SSE/NEON/AVX/AVX‑512; some claim decent portability, others say ISA differences (e.g., missing mask/bit-extract instructions) force ISA‑specific code.
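
One way the “portable intrinsics” idea is commonly approached is a thin wrapper over each ISA's intrinsics, sketched below. The vec4f type and the vec4f_* and add_arrays names are hypothetical, invented for this example; the SSE and NEON intrinsics they wrap are real.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>
typedef __m128 vec4f;
static inline vec4f vec4f_load(const float *p) { return _mm_loadu_ps(p); }
static inline void vec4f_store(float *p, vec4f v) { _mm_storeu_ps(p, v); }
static inline vec4f vec4f_add(vec4f a, vec4f b) { return _mm_add_ps(a, b); }
#elif defined(__ARM_NEON)
#include <arm_neon.h>
typedef float32x4_t vec4f;
static inline vec4f vec4f_load(const float *p) { return vld1q_f32(p); }
static inline void vec4f_store(float *p, vec4f v) { vst1q_f32(p, v); }
static inline vec4f vec4f_add(vec4f a, vec4f b) { return vaddq_f32(a, b); }
#else
/* Scalar fallback so the same source still builds on other targets. */
typedef struct { float lane[4]; } vec4f;
static inline vec4f vec4f_load(const float *p)
{
    vec4f v;
    for (int i = 0; i < 4; i++) v.lane[i] = p[i];
    return v;
}
static inline void vec4f_store(float *p, vec4f v)
{
    for (int i = 0; i < 4; i++) p[i] = v.lane[i];
}
static inline vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}
#endif

/* Adds two float arrays four lanes at a time, then handles the tail. */
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        vec4f_store(dst + i, vec4f_add(vec4f_load(a + i), vec4f_load(b + i)));
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Element-wise arithmetic like this maps cleanly onto every ISA. The disagreement in the thread is about operations that do not: SSE's _mm_movemask_ps, for instance, has no single-instruction NEON counterpart, so a wrapper has to emulate it or expose ISA-specific paths.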

Type Systems, Arrays, and Other Languages

Fortran is cited as easier to auto‑vectorize due to non‑aliasing array semantics and array‑centric design.
C’s array/pointer model and relatively weak expressive power for shapes/dimensions are criticized for scientific computing; others argue it’s still effective with discipline.
Array languages and modern array‑centric compilers are cited as better matches for pervasive SIMD, with discussion of “fusion vs fission” in array pipelines.
Python is described as “good with vectorization” via C/Fortran-backed libraries; Rust, Zig, C#, CUDA, and C++ SIMD libraries are mentioned as alternative ecosystems with varying SIMD ergonomics.
