My 2.5 year old laptop can write Space Invaders in JavaScript now (GLM-4.5 Air)

Training data, cloning, and originality

  • Many argue the model likely saw numerous Space Invaders clones in training, so the result may be sophisticated “copy-paste with extra steps” rather than invention.
  • Others counter that humans also recombine prior knowledge, and that models demonstrably handle entirely new requirements when given detailed specs.
  • Debate centers on whether LLMs are “just recall”:
    • Critics say output is mostly lossy compression of training data with limited true reasoning.
    • Supporters point to compression itself as a powerful form of understanding, plus hallucinations as evidence it’s not literal memorization.
  • Some small-scale code comparisons show similarity in structure and idioms but not verbatim copying, suggesting reuse of patterns rather than wholesale plagiarism (a rough sketch of such a comparison follows this list).
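A rough sketch of that kind of comparison, under illustrative assumptions: token 5-grams with Jaccard overlap as the similarity measure, and placeholder file names. High overlap without long verbatim matches points at shared idioms rather than copying.

```python
import re

def ngrams(code: str, n: int = 5) -> set[tuple[str, ...]]:
    # Crude lexer: words and individual punctuation marks (illustrative choice).
    tokens = re.findall(r"\w+|[^\w\s]", code)
    return {tuple(tokens[i:i + n]) for i in range(max(len(tokens) - n + 1, 0))}

def similarity(a: str, b: str) -> float:
    # Jaccard overlap of token 5-grams: 0.0 = disjoint, 1.0 = identical streams.
    ga, gb = ngrams(a), ngrams(b)
    return len(ga & gb) / len(ga | gb) if (ga | gb) else 0.0

# Placeholder file names; two idiomatic game loops typically score well
# above unrelated code but far below 1.0.
print(similarity(open("invaders_llm.js").read(), open("invaders_clone.js").read()))
```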

Benchmarks, pelicans, and artist concerns

  • The long‑running “SVG pelican on a bicycle” prompt is discussed as a benchmark that models may now be overfitting to, especially since the prompt went viral (a trivial example of the output format follows this list).
  • This leads to a broader point: public benchmarks get “burned” once labs can train on or game them, which motivates keeping private test sets.
  • Artists worry that anything put online becomes training data and is commoditized; suggestions include physical exhibitions or DRM’d portfolios, but the consensus is that DRM would be brittle and easily bypassed.
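For readers who haven’t seen the benchmark: the prompt asks a text-only model to emit raw SVG markup, which is then rendered and judged by eye. A trivial hand-written illustration of that output format (nowhere near a pelican; all shapes, coordinates, and the file name are placeholders):

```python
# Minimal illustration of the format the benchmark elicits: plain SVG
# produced as text. Everything here is arbitrary example data.
svg = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="120">
  <ellipse cx="80" cy="60" rx="35" ry="25" fill="lightgrey"/>  <!-- body -->
  <circle cx="120" cy="40" r="12" fill="white"/>               <!-- head -->
  <polygon points="130,40 165,46 130,52" fill="orange"/>       <!-- beak -->
</svg>"""

with open("not_a_pelican.svg", "w") as f:
    f.write(svg)
```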

Local models and hardware (Apple vs others)

  • A big theme is how impressive it is that an M2/M4 Mac with 64–128GB unified memory can run MoE models in the ~100–200B‑parameter class locally (GLM-4.5 Air itself is ~106B) and generate full games.
  • Disagreement over how “exceptional” that hardware is: common for high‑end Macs, but far above typical consumer laptops.
  • On PCs, running comparable models usually requires 24–48GB+ of GPU VRAM or slow CPU inference; unified memory gives Macs an advantage for large models (see the back-of-the-envelope arithmetic after this list).
  • Alternatives include multi‑GPU rigs, high‑RAM EPYC servers, new AMD Strix Halo / Framework Desktop, or simply renting GPUs from cloud providers.
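Some back-of-the-envelope arithmetic shows why unified memory matters, assuming the quantized weights dominate memory use (KV cache and activations add several more GB on top). The parameter counts for GLM-4.5 Air (~106B total, ~12B active per token) are from its model card; the rest is plain arithmetic.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    # GB needed just for the quantized weights; ignores KV cache/activations.
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# GLM-4.5 Air: ~106B total parameters (MoE, ~12B active per token).
print(weight_gb(106, 3))  # ~40 GB at 3-bit  -> fits on a 64GB Mac
print(weight_gb(106, 4))  # ~53 GB at 4-bit  -> tight on 64GB, comfortable on 128GB
print(weight_gb(106, 8))  # ~106 GB at 8-bit -> needs a 128GB machine
```

The same arithmetic explains the PC gap: a 24–48GB GPU cannot hold those weights at any common quantization, so layers spill into system RAM and inference slows sharply.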

Capabilities and limits of LLM coding

  • Commenters note that LLMs excel at well‑trodden tasks (classic tutorials, boilerplate, UI patterns) but often struggle with novel, idiosyncratic problems and unfamiliar platforms.
  • Some find “agentic coding” magical yet fragile: great for simple greenfield projects, frustrating for evolving real codebases without tests.
  • Others describe large productivity gains for glue code, obscure tools (e.g., ffmpeg, jq, AppleScript), quick throwaway utilities, and educational explanations.
  • Several emphasize disciplined workflows: small iterative prompts, unit tests, and line‑by‑line review; otherwise quality, performance, and security can suffer (a sketch of that workflow’s shape follows this list).
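A minimal sketch of what that workflow shape can look like in practice: extract LLM-generated logic into small, pure functions and pin behavior with tests before iterating. The function and test cases below are hypothetical, picked to match the Space Invaders theme.

```python
def rects_overlap(ax: float, ay: float, aw: float, ah: float,
                  bx: float, by: float, bw: float, bh: float) -> bool:
    # Axis-aligned bounding-box collision (e.g., shot vs. invader):
    # the kind of small, reviewable unit worth pinning with tests.
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def test_rects_overlap():
    assert rects_overlap(0, 0, 10, 10, 5, 5, 10, 10)      # overlapping boxes
    assert not rects_overlap(0, 0, 10, 10, 20, 20, 5, 5)  # disjoint boxes
    assert not rects_overlap(0, 0, 10, 10, 10, 0, 5, 5)   # touching edges don't count
```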

Open vs closed models, fine‑tuning, and economics

  • Open models are seen as astonishingly strong and only ~6 months behind the top proprietary labs, with rapid progress ever since the original LLaMA leak.
  • Some speculate this erodes moats of providers like Anthropic/OpenAI, but others note:
    • High‑end cloud models still outperform local ones and are cheaper than buying/operating powerful hardware for most users.
    • Many expect a database‑like landscape: a mix of strong open models and premium proprietary ones.
  • Fine‑tuning/LoRA: tools like peft, Unsloth, Axolotl, and MLX are recommended, but multiple comments warn that naïve fine‑tuning can degrade general capabilities; it is best reserved for narrow tasks or for distilling down to small specialized models (a minimal LoRA sketch follows this list).
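A minimal LoRA setup sketch using peft (one of the tools named above), assuming a Hugging Face causal LM. The base model and hyperparameters are illustrative placeholders, not recommendations from the thread.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model; any small Hugging Face causal LM works the same way.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")

config = LoraConfig(
    r=8,                                  # adapter rank (illustrative)
    lora_alpha=16,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

# Only the low-rank adapters train; the base weights stay frozen, which is
# why narrow-task fine-tuning is cheap and why it can still skew behavior.
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```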

Use cases, local adoption, and “real engineering”

  • Some argue a Space Invaders clone isn’t representative of “real engineering” because the requirements are fully known and heavily represented in training data. Others respond that implementing it still exercises genuine engineering skills and patterns.
  • Local LLMs are compared to Linux: valuable to enthusiasts, students, and developers who want privacy, low latency, or offline use, while most people will likely stay on SaaS.
  • There is ongoing concern about overhyping capabilities, but also recognition that even “merely remixing” models are already changing workflows and expanding what individuals can build.