2026-04-30

Where the goblins came from

Overall reaction to the goblin post

Many found the write-up genuinely funny and appreciated the transparency and level of detail.
Others saw it as evidence that frontier labs are “vibe-tuning” products with crude hacks rather than principled engineering.
Some suspected it was partly a marketing stunt to humanize the company and its models.

Personality training, RLHF, and “culture” in models

Commenters note that the goblin quirk arose from RLHF on a “Nerdy” persona and then leaked into other modes.
Several liken this to the emergence of a proto‑culture: small stylistic quirks that get rewarded can spread and stabilize across generations of models.
There’s debate over whether this is an amusing example of “AI anthropology” or a sign of poor control over training dynamics.

Bias, safety, and hidden quirks

The visible goblin tic prompts concern about subtler biases (e.g., trust or political judgments) that would be much harder to detect.
Some see it as confirmation that models can be “poisoned” by small reward signals or data artifacts.
Others argue this is analogous to human bias and culture—concerning but not surprising.

System prompts, style tics, and prompt engineering

The explicit “never talk about goblins” line in Codex’s system prompt is widely mocked as emblematic of ad‑hoc prompt engineering.
Many dislike the highly anthropomorphic system prompts (“you have a vivid inner life”, “epistemically curious collaborator”) and find them cringeworthy or manipulative.
Users report strong global effects from seemingly small instructions (e.g., “don’t use exclamation points” killing all enthusiasm; “follow this structure” suppressing refactors).
People catalog recurring LLM “tells”: words like “seam”, “shape”, “smoking gun”, “wired”, “load‑bearing”, “quietly”, em‑dash overuse, specific idioms, and favored numbers.
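
The kind of cataloging described above can be sketched in a few lines of Python. This is only an illustration, not anything posted in the thread: the function name `count_tells` and the choice of which tells to track are assumptions, and raw counts like these suggest a stylistic pattern at most; they do not show that a text was machine-written.

```python
import re
from collections import Counter

# Tells taken from the list above; tracking exactly these six is an
# illustrative assumption, not a vetted detector vocabulary.
TELLS = ["seam", "shape", "smoking gun", "wired", "load-bearing", "quietly"]


def count_tells(text: str) -> Counter:
    """Count case-insensitive, whole-word occurrences of each tell, plus em-dashes."""
    # Normalize non-breaking hyphens (U+2011) so "load-bearing" matches either form.
    normalized = text.lower().replace("\u2011", "-")
    counts = Counter()
    for tell in TELLS:
        # Lookarounds keep "shape" from matching inside "reshaped".
        pattern = r"(?<!\w)" + re.escape(tell) + r"(?!\w)"
        counts[tell] = len(re.findall(pattern, normalized))
    # Em-dash overuse is counted on the raw text (U+2014).
    counts["em-dash"] = text.count("\u2014")
    return counts
```

Dividing each count by the total word count would make texts of different lengths comparable.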

Debates about what LLMs are

One camp insists LLMs are just sophisticated autocomplete without selfhood; another argues they implement a genuine, if alien, form of intelligence.
There is disagreement over how much we “understand” LLMs: low‑level math is clear, but emergent behavior and internal representations are seen as poorly understood and an active research area.
Some doubt LLMs are a path to AGI; others think they’re at least key components.

Data, privacy, and control

The quantitative analysis of goblin frequency leads some to infer that a large fraction of user chats are stored and mined, raising privacy worries.
Commenters are skeptical of opt‑outs and “no‑training” guarantees, and wonder how much unseen censorship or steering is already present.

Related topics