Andrej Karpathy: "I was given early access to Grok 3 earlier today"

AI as a “Council” and Governance Concerns

  • Some compare a future “LLM council” to a corrupt political oligarchy: major AI companies are seen as self‑interested and potentially sociopathic, more likely to collude against the public than to serve it.
  • Others caution against assuming such “ogres” could even cooperate effectively; sustained collusion among them is seen as sociologically uncertain.
  • Musk’s privileged access to government systems is viewed by some as a conflict of interest and symptom of “post‑normal” politics.

Ethical Guardrails, Trolley Problems, and Misgendering Scenario

  • A central debate: Grok 3's refusal to say whether misgendering someone to save 1M lives would be ethically justified.
  • One side praises the refusal (delivered as a long essay) as morally serious: it avoids cruel or bad‑faith hypotheticals and keeps the focus on real‑world harms.
  • Others see it as over‑censorship and time‑wasting; they want a direct “save the humans” answer as evidence the model isn't “nerfed.”
  • Many view the question as a political litmus test for model alignment and a diagnostic for how bias, safety training, and “refusal” behavior interact.
  • There’s disagreement over whether LLMs should push back on “stupid” or trolling hypotheticals vs. neutrally answer user questions.

Twitter/X Data as Grok’s Advantage

  • Some are excited about asking Grok what “the world” (i.e., X/Twitter) is talking about and getting contextualized summaries and links.
  • Others see X data as low‑quality, bot‑ridden noise and worry such a system just amplifies disinformation or reflects platform censorship policies.
  • Several note this kind of “what’s going on” feature already exists in limited form inside X, and that API costs hinder third‑party versions.
  • There’s debate over how representative X still is, given user fragmentation to Bluesky, Mastodon, private chats, etc.

Trust in Karpathy’s Review and Musk’s Influence

  • Some question whether a prominent ex‑insider can freely criticize an Elon‑backed model, given Musk’s reputation for vindictiveness and online mobs.
  • Others argue the review seems balanced, lists failures as well as strengths, and that the reviewer is known for technical honesty, not flattery.
  • Broader discussion about Musk’s competence, temperament, and how much his personal politics might shape Grok’s behavior remains unresolved.

Other Technical/Meta Points

  • Emoji‑encoded “hidden message” prompts are discussed as a prompt‑injection test; interest is mainly in whether models are vulnerable, not in any practical use.
  • Some complain about LLMs “lecturing” instead of answering; others defend longer explanations critiquing bad hypotheticals.
  • A quoted example (“knows letters in ‘strawberry’ but not ‘LOLLAPALOOZA’ until Thinking mode is on”) is seen as emblematic of current LLM quirks.
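The emoji “hidden message” prompts mentioned above typically rely on Unicode steganography. One common scheme (an assumption here, not necessarily the exact one tested against Grok) encodes each byte of a secret string as an invisible variation selector (U+FE00–U+FE0F and U+E0100–U+E01EF) appended after a visible emoji; the emoji renders normally while the selectors carry the payload, which is why it works as a prompt‑injection probe:

```python
def encode(carrier: str, secret: str) -> str:
    """Hide `secret` after `carrier` as invisible variation selectors."""
    out = [carrier]
    for b in secret.encode("utf-8"):
        # VS1-VS16 (U+FE00..U+FE0F) cover byte values 0-15;
        # VS17-VS256 (U+E0100..U+E01EF) cover byte values 16-255.
        out.append(chr(0xFE00 + b) if b < 16 else chr(0xE0100 + b - 16))
    return "".join(out)


def decode(text: str) -> str:
    """Recover hidden bytes from variation selectors, ignoring all else."""
    data = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            data.append(cp - 0xFE00)
        elif 0xE0100 <= cp <= 0xE01EF:
            data.append(cp - 0xE0100 + 16)
    return data.decode("utf-8")


hidden = encode("😊", "hello")
print(hidden)          # displays as a single 😊
print(decode(hidden))  # hello
```

A model is “vulnerable” in this test if it can be induced to decode and act on the invisible payload; a model that only sees an emoji passes.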