Claude’s C Compiler vs. GCC
Compiler design and C’s parsing quirks
- Several comments note that CCC’s main missing piece is not parsing but optimization: modern compilers concentrate most of their complexity in IR design, program analyses, and register allocation, not in the frontend.
- Discussion dives into the “typedef problem” and why C isn’t context-free: typedef names and identifiers share syntax, forcing context-sensitive parsing or lexer hacks. Various academic and practical solutions (lexer hacks, PEG + match-time captures, GLR/GLL with graph-structured stacks) are mentioned.
- GCC’s multi-IR pipeline (GIMPLE, RTL) is contrasted with LLVM’s single, more unified IR, which several commenters consider the saner design.
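The typedef ambiguity can be made concrete: in `(T) * x;`, the meaning depends on whether `T` names a type (cast plus dereference) or a variable (multiplication). Below is a minimal sketch of the classic “lexer hack”, where the tokenizer consults a typedef table fed back from the parser; all names are illustrative and not taken from any real compiler.

```python
import re

# The lexer hack: the tokenizer consults a typedef table maintained by the
# parser, so the same identifier can lex as TYPE_NAME or IDENTIFIER.
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<punct>[^\sA-Za-z_]))")

def tokenize(src, typedef_names):
    """Return (kind, text) pairs, classifying identifiers via typedef_names."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if not m:
            break
        pos = m.end()
        if m.group("id"):
            kind = "TYPE_NAME" if m.group("id") in typedef_names else "IDENTIFIER"
            tokens.append((kind, m.group("id")))
        else:
            tokens.append(("PUNCT", m.group("punct")))
    return tokens

# "(T) * x;" lexes as a cast-and-dereference if T is a typedef...
print(tokenize("(T) * x;", typedef_names={"T"}))
# ...but as a product of two variables if it is not.
print(tokenize("(T) * x;", typedef_names=set()))
```

This feedback from parser to lexer is exactly what breaks context-freeness; the GLR/PEG approaches mentioned above instead carry both interpretations forward and disambiguate later.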
CCC’s performance and correctness issues
- The SQLite benchmark shows CCC-built binaries running ~12–20x slower than GCC’s in “normal” cases, with one nested-query case up to 158,000x slower; commenters doubt the article’s explanation (a uniform per-iteration slowdown) and suspect miscompilation or pathological spilling/cache behavior.
- CCC is described as worse than GCC -O0 and slower than fast non-optimizing compilers like TCC, which surprises some who see -O0 as an easy baseline.
- Multiple reports say CCC happily compiles blatantly invalid C (wrong argument counts, dereferencing non-pointers, ignoring const, type redefinitions), suggesting it optimizes for “no errors + passes some tests” rather than semantic correctness.
- Assembly output is likened to an undergraduate compiler’s: heavy register spilling, likely dead code, and SSA optimization passes that are ineffective or simply don’t work.
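To make “semantic correctness” concrete: a conforming frontend must, among other things, check every call site against the callee’s declared arity, a check a tests-pass-oriented system can skip entirely while still producing code that runs. A toy sketch of such a check over a hypothetical declaration/call table (this illustrates the concept and does not reflect CCC’s internals):

```python
# Toy semantic check: verify call-site argument counts against declarations.
# A permissive compiler that omits checks like this will happily accept
# a call such as add(1) when add was declared to take two parameters.

def check_arg_counts(decls, calls):
    """decls: {name: param_count}; calls: [(name, arg_count)].
    Returns a list of error strings (empty means the check passes)."""
    errors = []
    for name, nargs in calls:
        expected = decls.get(name)
        if expected is None:
            errors.append(f"call to undeclared function '{name}'")
        elif nargs != expected:
            errors.append(f"'{name}' expects {expected} argument(s), got {nargs}")
    return errors

# Models: int add(int a, int b);  add(1);  add(1, 2);
print(check_arg_counts({"add": 2}, [("add", 1), ("add", 2)]))
# ["'add' expects 2 argument(s), got 1"]
```

Const-correctness, pointer/non-pointer distinctions, and redefinition checks are analogous table-driven diagnostics; a compiler graded only on “compiles and passes tests” has no incentive to implement any of them.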
How “real” is Anthropic’s Linux-boot claim?
- Anthropic’s blog said CCC could build a bootable Linux 6.9 for x86, ARM, and RISC-V; the article verifies only RISC-V, and the x86 build fails at link time.
- Commenters question whether the kernel really booted on all three architectures, and note the repo only documents RISC‑V boot tests.
- Others stress that “0 compiler errors on all kernel C files” doesn’t imply correctness: CCC may just be silently accepting bad code.
What CCC actually demonstrates about LLMs
- Many see CCC as a research demo of agentic LLMs plus a strong harness (GCC-as-oracle, tests), not a serious GCC competitor.
- Key takeaway for supporters: an autonomous (but heavily orchestrated) system can produce a 100k+ LOC, multi-arch C compiler that compiles the kernel and SQLite at all, which would have been implausible a few years ago.
- Critics counter that:
- Compilers and their documentation are heavily present in training data, so this is recombination, not novel design.
- The result is huge, fragile, under-optimized, and hard to evolve—exactly the “second 90% / third 90%” of software work that LLMs struggle with.
- Without robust specs and test oracles, the same techniques tend to produce slop that only “looks correct.”
Pro vs. anti LLM coding agents
- Pro side themes:
- CCC proves agents can handle very complex, highly verifiable tasks; next iterations could close performance gaps dramatically.
- Even a flawed compiler at this scale shows how much routine engineering can be automated; used with human oversight, this augments productivity.
- It’s unfair to compare a few weeks and $20k of tokens to decades of GCC; the right comparison is against what a small human team could do in similar time.
- Anti/skeptical side themes:
- Anthropic’s marketing overstated reality (“bootable Linux on 3 archs”; “working compiler”), breeding distrust and comparisons to vaporware hype.
- Agents still fail badly on smaller, real-world tasks (e.g., nontrivial refactors) and generate unmaintainable, license-risky code; humans remain on the hook for understanding and maintenance.
- Claims that “the next generation will fix it” resemble autonomous-vehicle timelines: the last few percent of reliability may be extremely hard to reach.
Economic, ethical, and societal concerns
- Several comments focus less on CCC itself and more on:
- Concentration of power: whoever controls the top models controls effective “means of software production”; users lose deep understanding and agency.
- Employment and inequality: AI boosters simultaneously ask for massive capital and forecast wide programmer unemployment, unsurprisingly provoking backlash.
- Data pollution: models trained increasingly on AI-generated code may degrade over time; “AI feeding on its own slop” is a recurring worry.
- Licensing: strong suspicion that training on GPL’d compilers and then emitting proprietary-ish code skirts both the spirit and perhaps letter of open-source licenses.
Methodology, orchestration, and alternatives
- Many view the most interesting part as the harness/orchestration design: iterative agents with GCC as oracle, profilers, and tests driving code evolution.
- Several argue human-in-the-loop use (small, reviewed contributions guided by experts) is more practical and cheaper than fully autonomous multi-agent “vibe coding.”
- Some suggest more telling benchmarks would be:
- A minimal C compiler that can compile SQLite with good performance and a small, clear codebase.
- LLM-built compilers for entirely new ISAs or languages, where memorization is impossible and design choices must be made from specs alone.
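The GCC-as-oracle harness described above amounts to differential testing: run the same input through a trusted reference and the candidate, and flag any divergence as a bug. A minimal, compiler-agnostic sketch of that loop (the two callables stand in for “compile with GCC and run” versus “compile with the candidate and run”; all names here are illustrative):

```python
import random

def differential_test(oracle, candidate, gen_input, trials=1000, seed=0):
    """Feed random inputs to both implementations; return the first input
    on which their outputs diverge, or None if all trials agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = gen_input(rng)
        if oracle(x) != candidate(x):
            return x  # a minimized version of this input becomes a bug report
    return None

# Stand-ins: the oracle computes x*x exactly; the buggy candidate truncates
# to 16 bits, mimicking a miscompiled arithmetic routine.
oracle = lambda x: x * x
candidate = lambda x: (x * x) & 0xFFFF  # wrong once x*x >= 2**16

failing = differential_test(oracle, candidate, lambda rng: rng.randint(0, 1000))
print(failing)  # some x >= 256, where the truncation first bites
```

In the real harness the divergence signal (differing program output, crashes, or profiler data) is what drives the agents’ next round of edits, which is why commenters call the orchestration, rather than the compiler itself, the interesting artifact.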