Claude's system prompt is over 24k tokens with tools

Prompt size and purpose

  • The leaked Claude 3.7 Sonnet system message is ~24k tokens plus additional automated reminders, far larger than most commenters expected.
  • Many see it as a sprawling “rule file” layered with safety, UX, and product-specific behaviors (artifacts, tools, UI details) rather than pure model “intelligence”.
  • Some feel “cheated” that language‑ and library‑specific behaviors are hand‑specified instead of emerging from training; others are impressed the model can absorb and follow such a long natural‑language spec.

Claude app vs API

  • Multiple commenters note that this giant prompt appears to be for claude.ai (the app), not the raw API.
  • The API uses different, shorter system prompts, and users can define their own; people report noticeably different behavior between app and API for the same query (a sketch of passing a custom system prompt follows this list).
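
A minimal sketch of supplying one's own system prompt through the Anthropic Python SDK; the model alias and prompt text are illustrative placeholders, not what claude.ai actually uses:

```python
# Sketch: a custom system prompt via the raw API, which does not include the
# long claude.ai prompt. Model alias and prompt text are placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=1024,
    # Unlike the claude.ai app, the API applies only the system text you pass in.
    system="You are a terse assistant. Answer plainly, with no preamble.",
    messages=[{"role": "user", "content": "Explain prefix caching in two sentences."}],
)
print(response.content[0].text)
```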

Tools, MCP, and agents

  • Discussion covers tool definitions (read/write/diff/browse/command/ask/think) and Model Context Protocol (MCP).
  • LLMs often infer tool behavior from English names and argument labels with minimal extra description, helped by function‑calling fine‑tuning (see the sketch after this list).
  • IDEs like Cursor likely layer their own prompts and logic on top of Claude to get robust diff/apply behavior.
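
As a rough illustration of how sparse a tool definition can be, here is a hedged sketch of a single tool passed to the Anthropic Messages API; the tool name, schema, and model alias are made up for this example and are not taken from the leaked prompt:

```python
# Sketch of a sparsely described tool definition. The tool ("read_file") and its
# schema are hypothetical; the point is that the model often infers the intended
# behavior from the name and argument labels alone.
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "read_file",
        "description": "Read a file from the workspace.",  # one short sentence is often enough
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},        # no per-field docs: the label carries the meaning
                "start_line": {"type": "integer"},
                "end_line": {"type": "integer"},
            },
            "required": ["path"],
        },
    }
]

response = client.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Show me lines 10-40 of src/main.py"}],
)

# If the model decides to call the tool, the response contains a tool_use block
# with the arguments it inferred from the schema.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)
```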

Privacy and “our documents”

  • In one example, Claude reads a user’s Gmail profile and Drive docs to answer an investment question about “our” strategy; some commenters find this creepy.
  • Defenders argue it’s a reasonable way to resolve an underspecified “our”, but critics say this stretches implied consent and highlights ambiguous language risks.

Copyright, safety rules, and legal overhead

  • Large blocks of inline “automated reminders” govern politics, hallucinations, citations, finance/medical/legal disclaimers, and, notably, a strict ban on reproducing song lyrics.
  • Some argue this legal/safety layer “dumbs down” the model and distracts it from user tasks; others see it as necessary liability protection.
  • Jailbreaks are demonstrated: carefully framed “supplemental system” text can get Claude (and other models) to output banned song lyrics, illustrating policy fragility.

Prompt engineering vs training

  • Debate over whether so much behavior should live in prompts rather than in weights via fine‑tuning or RLHF.
  • One view: prompts are fast to iterate; long prompts act as a living bug list and behavior spec to be gradually internalized in future training.
  • Another view: ever‑growing prompts are like a messy, un-debuggable codebase that doesn’t scale.

Personality, identity, and “next-token” arguments

  • The prompt defines Claude’s persona (kind, wise, politically balanced, etc.), and even includes post‑cutoff facts like the 2024 US election result; this seems to give the app version extra “knowledge” versus the base model.
  • Users note the prompt sometimes refers to “Claude” in third person, sometimes “you”; speculation that providers empirically chose whichever phrasing worked best.
  • Ongoing argument about whether LLMs are “just next-token predictors” versus doing limited planning; some reference Anthropic research on “planning ahead”, others say this is still compatible with next‑token prediction.

Efficiency, caching, and context

  • Concerns about burning 24k tokens per query are met with explanations of KV/prefix caching: the long system prefix is processed once and reused, greatly cutting cost (see the caching sketch after this list).
  • Even with caching, some suspect long prompts can degrade performance or cause the model to ignore user instructions, especially in multi‑step coding sessions.
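
A sketch of the prefix-caching idea using Anthropic's prompt-caching API surface, where a long, stable system block is marked with cache_control so later requests reuse the cached prefix; the prompt text and model alias are placeholders, and claude.ai's internal caching setup is not shown here:

```python
# Sketch of explicit prompt caching: the long, stable system prefix is marked
# with cache_control so repeated requests hit the cache instead of reprocessing
# it. The prompt text below is a placeholder (real caching also requires the
# prefix to exceed a minimum token length).
import anthropic

client = anthropic.Anthropic()

LONG_SYSTEM_PROMPT = "..."  # imagine the ~24k-token rule file here

response = client.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},  # cache everything up to this block
        }
    ],
    messages=[{"role": "user", "content": "What's new in this diff?"}],
)

# Usage stats report how much of the prompt was written to or read from cache.
print(response.usage)
```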

Security, leakage, and reverse‑engineering

  • Many note that system prompts leak easily, whether through jailbreaks, corruption bugs, or prompts that ask a model to “hypothetically” describe its own rules.
  • Long, inconsistent prompts are seen as increasing attack surface; examples show that spoofed XML/“supplemental system messages” can override key restrictions.
  • Extracting and cataloging system prompts for commercial tools is framed as the new form of reverse‑engineering.