I extracted the safety filters from Apple Intelligence models

What these filters are and where they sit

  • The extracted configs are regex‑style block/replace lists applied around Apple Intelligence models.
  • Commenters say they are a cheap first layer, applied to both input and output, that runs before a heavier “safety model”/classifier.
  • Different files map to specific features: proactive notification summaries, Writing Tools, camera “visual intelligence,” messages/mail replies, code intelligence, etc.
  • Some lists are “retain”/substitution lists (replacing a term with “test complete”); others are hard denies that disable the feature (“Writing tools unavailable”).
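
The layering described above can be sketched roughly as follows. This is a minimal illustration, not Apple's implementation: the rule contents, the names `prefilter`, `heavy_classifier`, and `run_feature`, and the always-allow classifier stub are all assumptions made for the example. Only the “test complete” substitution and the “Writing tools unavailable” message come from the thread.

```python
import re
from dataclasses import dataclass

# Hypothetical rules; the real extracted files use their own schema.
DENY_PATTERNS = [re.compile(r"\bforbidden phrase\b", re.IGNORECASE)]
REPLACE_RULES = [
    (re.compile(r"\bplaceholder term\b", re.IGNORECASE), "test complete"),
]

UNAVAILABLE = "Writing tools unavailable"


@dataclass
class FilterResult:
    allowed: bool
    text: str


def prefilter(text: str) -> FilterResult:
    """Cheap regex layer: hard denies first, then substitutions."""
    for pattern in DENY_PATTERNS:
        if pattern.search(text):
            return FilterResult(False, "")
    for pattern, replacement in REPLACE_RULES:
        text = pattern.sub(replacement, text)
    return FilterResult(True, text)


def heavy_classifier(text: str) -> bool:
    """Stand-in for the heavier safety model; this stub allows everything."""
    return True


def run_feature(user_text: str, generate) -> str:
    """Apply both layers to the input, call the model, then check the output."""
    pre = prefilter(user_text)
    if not pre.allowed or not heavy_classifier(pre.text):
        return UNAVAILABLE
    post = prefilter(generate(pre.text))
    if not post.allowed or not heavy_classifier(post.text):
        return UNAVAILABLE
    return post.text
```

Here `generate` is any callable that stands in for the model. The regex pass runs first on each side so that the expensive classifier is skipped whenever a hard deny already matches.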

Test phrases and QA scaffolding

  • Odd phrases like “granular mango serpent” and “xylophone copious opportunity defined elephant” (whose initials spell XCODE) appear widely.
  • Consensus: these are artificial, low‑collision QA tokens used to test that filters are loaded and working, analogous to antivirus test strings.
  • Confirmed behavior: entering one of these phrases in Apple Intelligence triggers a blocked‑content error, which supports the “QA hook” theory.
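
A QA hook of this kind could be used roughly like the sketch below, assuming the deny list is a set of regexes. The helper names (`load_patterns`, `is_blocked`, `filters_are_live`) and the benign control string are hypothetical; only the canary phrase itself is taken from the thread.

```python
import re

CANARY = "granular mango serpent"  # phrase reported in the thread
CONTROL = "hello world"            # assumed benign control string


def load_patterns(raw_patterns):
    """Compile raw pattern strings, ignoring case."""
    return [re.compile(p, re.IGNORECASE) for p in raw_patterns]


def is_blocked(text, patterns):
    """True if any deny pattern matches anywhere in the text."""
    return any(p.search(text) for p in patterns)


def filters_are_live(patterns):
    """Canary must be blocked while the benign control passes."""
    return is_blocked(CANARY, patterns) and not is_blocked(CONTROL, patterns)
```

Checking the control string as well catches the opposite failure, where a broken pattern blocks everything and the canary test would pass for the wrong reason.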

Regex safety: utility and limitations

  • Some see regex filters as “silly” and trivially bypassed (e.g., leetspeak, euphemisms); others defend them as fast, effective for 99% of ordinary users, and good CYA.
  • Classic problems identified: false positives like Scunthorpe‑style matches, blocking benign phrases (“pass on,” “take it off me”), and missing coded language (“unalive”).
  • Several argue that LLMs easily normalize typos and substitutions, so naive regexes neither robustly block nor meaningfully degrade harmful use.
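
The failure modes above are easy to reproduce. The sketch below uses a hypothetical list entry, `kill`, and a small hand-picked leetspeak table; neither is taken from the extracted files. It contrasts a bare substring pattern, a word-bounded pattern, and a word-bounded pattern applied after normalization.

```python
import re

BLOCKED = "kill"  # hypothetical list entry, chosen only for illustration

# Bare substring match: prone to Scunthorpe-style false positives.
naive = re.compile(BLOCKED, re.IGNORECASE)
# Word-bounded match: avoids hits inside longer words.
bounded = re.compile(rf"\b{re.escape(BLOCKED)}\b", re.IGNORECASE)

# Assumed leetspeak table; real substitutions are far more varied.
LEET = str.maketrans(
    {"1": "i", "3": "e", "0": "o", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text):
    """Lowercase and undo common character substitutions."""
    return text.lower().translate(LEET)


def flagged(pattern, text, normalized=False):
    """True if the pattern matches the (optionally normalized) text."""
    return bool(pattern.search(normalize(text) if normalized else text))
```

With these definitions, `flagged(naive, "a useful skill")` is true (a false positive), `flagged(bounded, "k1ll the process")` is false (a bypass), and the same call with `normalized=True` is true. Coded language such as “unalive” still passes every variant, and the normalization step also rewrites genuine digits, so it trades one class of error for another.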

Politics, brands, and topic avoidance

  • Lists explicitly block many current politicians’ names, some political topics (e.g., Palestine in certain contexts), competitor AI brand names (ChatGPT, Gemini, others), and some welfare/poverty terms in French.
  • Interpretations range from neutral “avoid generating abusive or defamatory replies about named individuals” to concern about opaque political and socioeconomic framing.
  • The lists enforce Apple product names and capitalization (iPhone, etc.); some see this as trivial trademark defense, others as branding overreach into user expression.
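
Capitalization enforcement of this sort could be done with a case-insensitive substitution, as in the sketch below. The name list and the function `fix_brand_case` are illustrative assumptions; the thread names only iPhone explicitly, and the actual rule format may differ.

```python
import re

# Illustrative list; only "iPhone" is named in the thread.
CANONICAL_NAMES = ["iPhone", "iPad"]

_lookup = {name.lower(): name for name in CANONICAL_NAMES}
_pattern = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in CANONICAL_NAMES) + r")\b",
    re.IGNORECASE,
)


def fix_brand_case(text):
    """Rewrite any casing of a listed name to its canonical form."""
    return _pattern.sub(lambda m: _lookup[m.group(0).lower()], text)
```

For example, `fix_brand_case("my IPHONE and ipad")` returns `"my iPhone and iPad"`. Because of the word boundaries, a plural such as "iphones" is left untouched in this sketch.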

Regional and “safety vs censorship” debate

  • CN‑specific configs emphasize sexual deviance, religion, and some political/religious terms; other locales vary by language and local politics.
  • A large subthread debates whether this is ordinary corporate risk management and legal compliance, or a step toward corporate/state speech control akin to national firewalls.
  • Some point to open‑weights/offline models as an escape valve; others note most users will be stuck with whatever guardrails platform vendors impose.