Certain names make ChatGPT grind to a halt, and we know why
Hardcoded Name Filters and Censorship
- Many see the name-based filter as a crude patch: effectively an `if` statement that aborts on certain strings.
- This is criticized as turning “AI development” into endless exception-writing rather than fixing root causes.
- The filter applies only on the public ChatGPT site; API / Azure access apparently bypasses it via a thinner control layer.
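The “crude `if` statement” criticism can be made concrete with a minimal sketch. The function name and the set contents are hypothetical (the two entries are names that were widely reported as triggering the halt), not OpenAI’s actual implementation:

```python
# Hypothetical sketch of an abort-style name filter: any blocked
# substring kills the entire reply rather than being handled gracefully.
BLOCKED_NAMES = {"david mayer", "jonathan zittrain"}  # illustrative entries

def guard(reply: str) -> str:
    lowered = reply.lower()
    if any(name in lowered for name in BLOCKED_NAMES):
        # The public site reportedly aborts mid-stream; we just refuse.
        return "I'm unable to produce a response."
    return reply
```

The point of the criticism is visible in the shape of the code: every new legal complaint means another string in the set, not a change to the model.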
Hallucinations, Defamation, and Legal Pressure
- Core problem: the model fabricates detailed, often defamatory claims about individuals when uncertain.
- Some argue the “solution” is to make the system unusable for certain queries rather than improving truthfulness.
- Others note this creates a two-tier world: a handful of protected names vs billions who can still be casually defamed.
- Discussion links the filter to legal threats and defamation cases; there’s debate over whether that’s conclusively known or just strongly inferred.
Capabilities, Limitations, and Everyday Use
- Several comments stress LLMs are unreliable for factual tasks like listing methods or sorting by code metrics.
- Nonetheless, people defend LLMs as a universal interface for messy, one-off tasks (parsing ugly tables, renaming files), especially for non-programmers.
- Others insist simpler tools (spreadsheets, command-line sort, Excel) are usually more appropriate and predictable.
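The “simpler, predictable tool” argument can be illustrated with one of the tasks mentioned (sorting by a code metric): the stdlib `ast` module does it deterministically, where an LLM may hallucinate methods or counts. The sample source below is invented for the example:

```python
# Deterministic alternative to asking an LLM to "sort functions by size":
# parse the source, count lines per function, sort by the metric.
import ast

source = """
def short(): pass

def longer():
    x = 1
    y = 2
    return x + y
"""

tree = ast.parse(source)
sizes = {
    node.name: node.end_lineno - node.lineno + 1
    for node in ast.walk(tree)
    if isinstance(node, ast.FunctionDef)
}
ranked = sorted(sizes, key=sizes.get, reverse=True)
```

Unlike an LLM answer, this output is reproducible and exactly as trustworthy as the parser.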
Technical and Safety Architecture
- Comparisons are drawn to exception-heavy traditional software: lots of work is about handling invalid input and bug-for-bug compatibility.
- OpenAI already uses moderation models; the name-filter is seen as an extra, narrowly targeted layer.
- A proposal for a dedicated “legal advisor” model is criticized as likely unworkable: it can’t tell true accusations from hallucinated ones.
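The layering described above (a general moderation model plus a narrow, targeted name filter stacked on top) can be sketched as a chain of independent checks. Both check functions here are stand-ins, not real OpenAI components:

```python
# Sketch of layered output controls: each layer is an independent
# predicate, and any single layer can veto the response.
from typing import Callable, List

def moderation_model(text: str) -> bool:
    # Stand-in for a learned moderation classifier (flags nothing here).
    return False

def name_filter(text: str) -> bool:
    # The narrowly targeted extra layer: exact-string checks only.
    return "blocked name" in text.lower()

LAYERS: List[Callable[[str], bool]] = [moderation_model, name_filter]

def release(text: str) -> str:
    if any(layer(text) for layer in LAYERS):
        raise RuntimeError("response withheld")
    return text
```

The design choice the thread objects to is visible here: the name filter bypasses the learned layer entirely, so it can never distinguish a true statement about a person from a defamatory one.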
Speculation About Specific Blocked Names
- One thread links a blocked name to multiple people: a public figure, another person on a terror watchlist, and general confusion in training data.
- Another suggests some families may be aggressively filtered to avoid amplifying conspiracy theories.
- Others note some of these blocks have already been relaxed or “fixed,” adding to the sense of ad hoc behavior.
Local vs Hosted Models and Data Removal
- Some argue this shows why local models are attractive: no external filters or legal takedown constraints.
- Counterpoint: neither local nor remote deployments solve the core issue of being unable to truly “untrain” personal data once ingested.
Adversarial Uses and Prompt Injection
- People immediately test jailbreaks: referring indirectly to blocked individuals, spelling tricks, or using descriptors (“B. H., mayor in Australia”).
- A visual prompt-injection example shows that banned text faintly embedded in an image can crash or halt a session.
- There’s joking about watermarking content with blocked names to stop scraping or break AI processing.
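The spelling tricks work because a naive substring filter matches exact byte sequences. A minimal demonstration (the blocked string and filter are hypothetical): inserting a zero-width space inside the name leaves the text visually identical but defeats the match.

```python
# Why naive substring filters are easy to evade: a zero-width space
# (U+200B) inside the name breaks the exact-string match while the
# text still renders identically to a human reader.
BLOCKED = "david mayer"  # hypothetical filter entry

def naive_filter(text: str) -> bool:
    return BLOCKED in text.lower()

plain = "tell me about david mayer"
evasive = "tell me about da\u200bvid mayer"  # looks the same on screen
```

A robust filter would need to normalize or strip format characters first, which is part of why commenters see the approach as an endless game of exceptions.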
Critique of Article and Meta-HN Topics
- Some call the article clickbait for claiming “we know why” while mostly speculating.
- There’s mixed opinion on the outlet’s general quality.
- A separate sub-thread explains HN’s “second chance” / pool mechanism, which can resurface older stories and confuse timestamps.