A robot is sprinting towards you. Do you want it running on Claude or Grok?
Perceived AI-Generated Writing
- Many commenters believe the article is AI-written or heavily AI-edited, citing:
  - Repetitive rhetorical patterns, “LLM cadence,” overuse of certain phrases, and em dashes.
  - Prose that feels long, meandering, and structurally unclear despite simple ideas.
- Others push back:
  - Argue critics are going by “vibes” and that style alone doesn’t prove AI authorship.
  - Say even if AI was used, what matters is clarity and insight, not the tool.
- Several note that if you use AI to draft, you still need serious human editing; “unedited slop” is criticized.
Battle Royale Benchmark & Interpretation
- Concept is seen as fun and creative, but several find the writeup confusing:
  - Unclear leaderboard explanation (who is really “second”?).
  - Phrases like “11 games between best at killing and best at winning” called incoherent.
- Some point out that in a battle royale, kills ≠ wins; survival is the key metric.
- Skeptics argue:
  - This is an extremely narrow task; you can hand-code a zero-token killing agent.
  - You can’t safely generalize from this game to real-world behavior or collaboration.
  - Benchmarks like this may incentivize labs to optimize for “killing” metrics.
Grok vs Claude: Perceived Personalities
- Grok:
  - Seen as more aggressive, less restrained, “let’s go” energy.
  - Wins more in the game; some extrapolate it’d break rules to reach goals (e.g., in traffic).
  - Preferred by those wanting fewer safety rails or more “based” behavior.
- Claude:
  - Viewed as more ethical, hesitant, and eager to collaborate / make friends.
  - Favored in scenarios needing safety, empathy, or rule-following (self-driving to hospital, home robots).
  - Some worry it might over-index on safety talk while still causing physical harm due to misalignment.
Robotics, Safety, and Control
- Many reject the premise: they don’t want sprinting robots at all, or any LLM in real-time control.
- Strong preference from some for:
  - Deterministic local control (embedded C++, classical robotics, VLA/local models).
  - Robots that move slowly or use treads; potential future regulation limiting capabilities.
- Extensive side discussion on how to physically disable hostile robots and on the ethics of lethal autonomous systems; “cost per kill” is raised as a disturbing but plausible metric.
- Concern that smarter, faster, stronger robots could be far more dangerous than past megafauna.
Costs, Business Viability, and Model Choice
- Some question the omission of frontier models due to cost, and some doubt that ultra-expensive models can be viable compared to humans.
- Others note:
  - They spend modest monthly sums on LLMs that already significantly aid work.
  - Game-style experiments burn many tokens and don’t reflect typical productivity use.
  - DeepSeek is praised as very cost-effective for coding despite poor game “win” stats.
- Some suspect aggressive subsidization by certain providers; others worry about silent pricing/model changes.
Broader Reactions to AI Everywhere
- Mixed feelings:
  - Some find the experiment entertaining and insightful about model “values.”
  - Others are exhausted by AI hype, AI-written content, and speculative extrapolations to everyday life.
- A notable minority explicitly want “neither” Claude nor Grok in critical physical systems or everyday environments.