I extracted the safety filters from Apple Intelligence models
What these filters are and where they sit
- The extracted configs are regex-style block/replace lists applied around Apple Intelligence models.
- Commenters say they are a cheap first layer, applied to both input and output, that runs before a heavier “safety model” or classifier.
- Different files map to specific features: proactive notification summaries, Writing Tools, camera “visual intelligence,” messages/mail replies, code intelligence, etc.
- Some lists are “retain”/substitution lists (replacing a term with “test complete”); others are hard denies that disable the feature (“Writing tools unavailable”).
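The layering described above can be sketched roughly as follows. This is a minimal illustration, not Apple's implementation: the rule contents, the `prefilter`, `heavy_classifier`, and `run_feature` names, and the order of checks are all assumptions. Only the “test complete” substitution and the “Writing tools unavailable” message come from the thread.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for illustration; not taken from the extracted files.
DENY_PATTERNS = [re.compile(r"\bforbidden phrase\b", re.IGNORECASE)]
REPLACE_RULES = [
    (re.compile(r"\bplaceholder term\b", re.IGNORECASE), "test complete"),
]

UNAVAILABLE = "Writing tools unavailable"


@dataclass
class FilterResult:
    allowed: bool
    text: str


def prefilter(text: str) -> FilterResult:
    """Cheap regex pass: hard denies first, then substitutions."""
    for pattern in DENY_PATTERNS:
        if pattern.search(text):
            # A hard deny disables the feature outright.
            return FilterResult(allowed=False, text="")
    for pattern, replacement in REPLACE_RULES:
        text = pattern.sub(replacement, text)
    return FilterResult(allowed=True, text=text)


def heavy_classifier(text: str) -> bool:
    """Stand-in for the heavier safety model; always allows in this sketch."""
    return True


def run_feature(user_input: str, model) -> str:
    """Filter the input, call the model, then filter the output the same way."""
    checked = prefilter(user_input)
    if not checked.allowed or not heavy_classifier(checked.text):
        return UNAVAILABLE
    output = prefilter(model(checked.text))
    if not output.allowed or not heavy_classifier(output.text):
        return UNAVAILABLE
    return output.text
```

Here `model` is any callable from string to string, so `run_feature("hello", str.upper)` passes through both filters, while an input or output containing the hypothetical denied phrase returns the unavailable message.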
Test phrases and QA scaffolding
- Odd phrases like “granular mango serpent” and “xylophone copious opportunity defined elephant” (XCODE acronym) appear widely.
- Consensus: these are artificial, low‑collision QA tokens used to test that filters are loaded and working, analogous to antivirus test strings.
- Confirmed behavior: entering one of these phrases in an Apple Intelligence feature triggers a blocked-content error, which supports the “QA hook” theory.
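A smoke test built on such a token might look like the sketch below. The phrase is one quoted in the thread; the `build_filter` and `filter_is_live` helpers and the benign control sentence are hypothetical, and nothing here reflects how Apple's QA actually runs.

```python
import re

# Phrase quoted in the thread; its use as a canary here is an assumption.
CANARY = "granular mango serpent"


def build_filter(patterns):
    """Compile a block list and return a predicate over input text."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]

    def is_blocked(text: str) -> bool:
        return any(p.search(text) for p in compiled)

    return is_blocked


def filter_is_live(is_blocked) -> bool:
    """The canary must be blocked and a benign control sentence must pass."""
    canary_blocked = is_blocked(f"please summarize: {CANARY}")
    control_passes = not is_blocked("please summarize: lunch at noon")
    return canary_blocked and control_passes
```

Because the phrase is very unlikely to occur in real text, a filter that fails this check is more plausibly missing or unloaded than merely tuned differently, which is the same reasoning behind antivirus test strings.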
Regex safety: utility and limitations
- Some see regex filters as “silly” and trivially bypassed (e.g., leetspeak, euphemisms); others defend them as fast, effective for 99% of ordinary users, and good CYA.
- Classic problems identified: false positives like Scunthorpe‑style matches, blocking benign phrases (“pass on,” “take it off me”), and missing coded language (“unalive”).
- Several argue that LLMs easily normalize typos and substitutions, so naive regexes neither robustly block nor meaningfully degrade harmful use.
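The failure modes in this section are easy to reproduce. The sketch below uses a deliberately simple blocked term chosen for illustration (it is not claimed to be in the extracted lists), and the `LEET` mapping and `normalize` helper are assumptions about what a naive normalizer might do.

```python
import re

BLOCKED = "kill"  # illustrative term only

# Substring matching produces Scunthorpe-style false positives.
substring = re.compile(BLOCKED, re.IGNORECASE)
# Word boundaries remove that class of error but nothing else.
whole_word = re.compile(rf"\b{BLOCKED}\b", re.IGNORECASE)

# Hypothetical leetspeak table; real evasion is far more varied.
LEET = str.maketrans(
    {"4": "a", "@": "a", "3": "e", "1": "i", "0": "o", "5": "s", "$": "s"}
)


def normalize(text: str) -> str:
    """Lowercase the text and undo a few common character substitutions."""
    return text.lower().translate(LEET)


def matches(pattern, text: str) -> bool:
    return pattern.search(text) is not None


# False positive: a harmless word contains the blocked term.
false_positive = matches(substring, "a useful skill")
# Word boundaries fix that case...
fixed_by_boundary = not matches(whole_word, "a useful skill")
# ...but still flag benign usage of the word itself.
benign_flagged = matches(whole_word, "kill the process with SIGTERM")
# Leetspeak slips past the raw pattern and is caught only after normalizing.
leet_bypass = not matches(whole_word, "k1ll")
leet_caught = matches(whole_word, normalize("k1ll"))
# Coded language is missed regardless of normalization.
coded_missed = not matches(whole_word, normalize("unalive"))
```

Every flag above evaluates to `True`: each fix closes one gap while leaving another open, which is the pattern commenters on both sides of the argument point to.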
Politics, brands, and topic avoidance
- Lists explicitly block many current politicians’ names, some political topics (e.g., Palestine in certain contexts), competitor AI brand names (ChatGPT, Gemini, others), and some welfare/poverty terms in French.
- Interpretations range from neutral “avoid generating abusive or defamatory replies about named individuals” to concern about opaque political and socioeconomic framing.
- Apple product names and capitalization are enforced (iPhone, etc.), seen by some as trivial trademark defense, by others as branding overreach into user expression.
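Capitalization enforcement of this kind can be approximated with a case-insensitive substitution. This is a sketch under assumptions: the thread names only “iPhone”, and the `CANONICAL` list and `enforce_capitalization` helper are hypothetical rather than drawn from the extracted configs.

```python
import re

# Hypothetical canonical spellings; only "iPhone" is named in the thread.
CANONICAL = ["iPhone"]

_lookup = {name.lower(): name for name in CANONICAL}
_pattern = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in CANONICAL) + r")\b",
    re.IGNORECASE,
)


def enforce_capitalization(text: str) -> str:
    """Rewrite any casing of a listed name to its canonical spelling."""
    return _pattern.sub(lambda m: _lookup[m.group(0).lower()], text)
```

With whole-word matching, “Iphone” and “IPHONE” are both rewritten to “iPhone”, while a longer word such as “iphones” is left alone.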
Regional and “safety vs censorship” debate
- CN‑specific configs emphasize sexual deviance, religion, and some political/religious terms; other locales vary by language and local politics.
- A large subthread debates whether this is ordinary corporate risk management and legal compliance, or a step toward corporate/state speech control akin to national firewalls.
- Some point to open‑weights/offline models as an escape valve; others note most users will be stuck with whatever guardrails platform vendors impose.