Skills Officially Comes to Codex

What Skills Are and Why People Like Them

  • Seen as bundled “workflow recipes”: instructions + resources + optional scripts, usable on demand by agents.
  • Key value: reusable, shareable, and context-efficient. Only the short YAML front-matter is always loaded; the full markdown and scripts are pulled in lazily when needed.
  • Several commenters argue skills matter more in the long term than MCPs: they are simpler, easier to author (just .md files), and compose across tasks without purpose-specific agents.
  • Others emphasize that the real innovation is bundling prompt + code with an assumed sandboxed execution environment.
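
The lazy-loading idea in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not how Codex or any other agent actually implements it: the `---` delimiters, the `name`/`description` keys, and the `Skill` class are assumptions made for the example, and the front-matter parser handles only flat `key: value` pairs where a real loader would use a proper YAML parser.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (metadata, body).

    Assumes front-matter is delimited by '---' lines and contains only
    flat 'key: value' pairs (a deliberate simplification of YAML).
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing delimiter found: treat the whole file as body.
    return {}, text


@dataclass
class Skill:
    """Hypothetical skill record: metadata is eager, the body is lazy."""
    name: str
    description: str
    path: Path
    _body: Optional[str] = field(default=None, repr=False)

    @classmethod
    def from_file(cls, path: Path) -> "Skill":
        # Only the metadata is kept; the body is discarded until requested.
        meta, _ = split_front_matter(path.read_text(encoding="utf-8"))
        return cls(meta.get("name", path.parent.name),
                   meta.get("description", ""), path)

    def body(self) -> str:
        # Load the full instructions only when the agent decides to use them.
        if self._body is None:
            _, self._body = split_front_matter(
                self.path.read_text(encoding="utf-8"))
        return self._body
```

The saving here is in prompt context rather than disk I/O: only `name` and `description` would be placed in the prompt up front, and `body()` is called once a skill is selected.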

Practical Use Cases and Workflows

  • Many examples: project-specific frameworks, niche tools (e.g. marimo, Metabase, Sentry, Playwright), database querying, Django templates, auth setups, testing workflows, PR conventions, analytics queries, and periodic data updates.
  • Common pattern: “instructions I might need but don’t want in AGENTS.md” – skills keep rare or complex workflows out of the main context.
  • Skills are seen as especially useful for cross‑team standards and for codifying the outcome of long debugging/build sessions; some use “meta skills” to let the agent update its own skills.

Comparison to MCPs, Tools, and Plain Prompts

  • One camp: skills are “just prompts/tools in folders,” not fundamentally new; you could already script “read front-matters and decide what to load.”
  • Others: skills effectively replace many MCPs, avoid constant MCP instructions in context, and can even describe how to use MCP servers themselves.
  • Some frameworks expose skills via MCP anyway; skills can be thought of as a catalog over tools/functions.

Verification, Evaluation, and Context Concerns

  • Debate over free-form markdown vs structured formats (YAML/JSON): structure would help external evaluation and iteration, but LLM behavior remains non‑verifiable.
  • Suggestions: traditional “evals”, unit‑testing MCP tools, multiple independent agents for consensus, DSPy+GEPA, and RL to learn which skills are actually useful.
  • Important implementation detail: only the skill names and descriptions from the front-matter go into the prompt index, so anything in the body that the description does not advertise may never be used.
  • That index is both a discovery aid and a liability: every skill description is effectively a prompt injection and eats tokens each turn.
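
To make the token cost of that index concrete, here is a rough sketch. The index format (`- name: description`) and the four-characters-per-token heuristic are assumptions for illustration only; real agents format the index differently and real tokenizers will give different counts.

```python
import math


def build_index(skills: list[tuple[str, str]]) -> str:
    """Render one line per skill from (name, description) pairs.

    Only these two fields are visible to the model before a skill is loaded.
    """
    return "\n".join(f"- {name}: {desc}" for name, desc in skills)


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude token estimate; the 4 chars/token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def index_overhead(skills: list[tuple[str, str]], turns: int) -> int:
    """Total estimated tokens spent on the index over a conversation,
    assuming the index is resent on every turn."""
    return estimate_tokens(build_index(skills)) * turns
```

For example, `index_overhead(skills, turns=50)` gives a ballpark for how much a growing skill catalog costs over a long session, which is one argument for keeping descriptions short but still specific enough to trigger the skill.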

Sharing, Standards, Secrets, and Monetization

  • Teams share skills via Git repos; Codex can discover Claude skills automatically in some setups.
  • Official public skill repos exist; some wish for ranked marketplaces, but others expect spam, security headaches, and little revenue.
  • Handling secrets is unresolved: people hack around with .env files or local storage; a first‑class, user‑prompted secret store is desired.
  • Monetizing skills is questioned; DRM on markdown is seen as unrealistic.
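
The .env workaround mentioned above, combined with the user-prompted fallback commenters ask for, might look something like the following sketch. The function names, the lookup order (environment, then .env, then prompt), and the simplified .env parsing are all assumptions for illustration; this is not a first-class secret store and it does not reflect any existing Codex feature.

```python
import os
from getpass import getpass
from pathlib import Path
from typing import Callable, Mapping, Optional


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; comments and malformed lines are skipped."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_secret(
    name: str,
    env: Optional[Mapping[str, str]] = None,
    dotenv_path: Path = Path(".env"),
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Look up a secret: process environment, then .env file, then ask the user."""
    env = os.environ if env is None else env
    if name in env:
        return env[name]
    if dotenv_path.is_file():
        values = parse_dotenv(dotenv_path.read_text(encoding="utf-8"))
        if name in values:
            return values[name]
    # Last resort: prompt interactively without echoing the input.
    return prompt(f"Enter value for {name}: ")
```

A skill script could call `resolve_secret("SENTRY_TOKEN")` (a hypothetical variable name) instead of hard-coding credentials in the markdown, which keeps the secret out of both the repo and the prompt.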