Skills-hub Spec · Draft RFC

Stateful Skills

A backward-compatible extension of the SKILL.md format that adds three sibling files so skills can carry the operational context they accumulate across runs.

By tinh2 · May 19, 2026 · Draft RFC

§1 Abstract

A stateful skill is a skill whose published artifact carries not just instructions but also the operational context accumulated by practitioners who have run it: what worked, what broke, the examples that anchored the right behavior, and the domain-specific calibration that experienced users converge on. Where a static skill starts every new user from zero, a stateful skill compounds.

This document defines a minimal, backward-compatible extension of the existing Anthropic SKILL.md spec (Dec 2025) by introducing three optional sibling files. A skill MAY include any subset of them. Runtimes that do not understand the extension MUST continue to function as before.

§2 File layout

A skill directory MUST contain SKILL.md (per the existing spec). It MAY contain up to three siblings, each at the same directory depth:

my-skill/
├── SKILL.md          # required (existing spec)
├── MEMORY.md         # optional: what worked, what didn't
├── EXAMPLES.md       # optional: curated input/output pairs
└── CALIBRATION.md    # optional: domain-specific tweaks

The three siblings are plain Markdown. Filenames are case-sensitive and MUST appear exactly as shown. Implementations MUST NOT recurse into subdirectories looking for them.
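The lookup rules above can be sketched as a small discovery helper. This is a minimal illustration, not part of the spec: the function name `discover_skill_files` and its return shape are assumptions made for the example.

```python
import os
from pathlib import Path

SIBLINGS = ("MEMORY.md", "EXAMPLES.md", "CALIBRATION.md")

def discover_skill_files(skill_dir):
    """Return {filename: Path} for SKILL.md plus any siblings at the top level."""
    root = Path(skill_dir)
    # os.listdir returns names as stored on disk, so the membership tests below
    # are exact, case-sensitive matches even on case-insensitive file systems.
    names = set(os.listdir(root))
    if "SKILL.md" not in names or not (root / "SKILL.md").is_file():
        raise FileNotFoundError(f"{root} is not a skill: SKILL.md is missing")
    found = {"SKILL.md": root / "SKILL.md"}
    for name in SIBLINGS:
        # Only the skill directory itself is inspected; subdirectories are never walked.
        if name in names and (root / name).is_file():
            found[name] = root / name
    return found
```

A file named `calibration.md`, or an `EXAMPLES.md` nested in a subdirectory, is simply not reported by this helper.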

§3 MEMORY.md

MEMORY.md captures what was learned from real runs of the skill: patterns that worked, failure modes that surfaced, and edge cases discovered in production. It is the ledger of the skill's operational experience.

The file MUST begin with a YAML frontmatter block carrying:

---
version: 0.4.2          # MUST match SKILL.md version at last write
contributors: 12         # count of distinct authors
lastUpdated: 2026-05-19  # ISO-8601 date
---
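As a rough sketch, a parser might read this block as follows. It deliberately handles only flat `key: value` lines with trailing `#` comments rather than full YAML, and the function name `parse_memory_frontmatter` is hypothetical. The MAJOR.MINOR.PATCH check is an assumption inferred from the versioning rules in §8.

```python
import re
from datetime import date

def parse_memory_frontmatter(text):
    """Split MEMORY.md into (metadata dict, body) and validate the three fields."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("MEMORY.md must begin with a YAML frontmatter block")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter block is not closed") from None
    fields = {}
    for line in lines[1:end]:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    missing = {"version", "contributors", "lastUpdated"} - fields.keys()
    if missing:
        raise ValueError(f"missing frontmatter fields: {sorted(missing)}")
    if not re.fullmatch(r"\d+\.\d+\.\d+", fields["version"]):
        raise ValueError("version must look like MAJOR.MINOR.PATCH")
    meta = {
        "version": fields["version"],
        "contributors": int(fields["contributors"]),
        "lastUpdated": date.fromisoformat(fields["lastUpdated"]),
    }
    return meta, "\n".join(lines[end + 1:])
```

A production parser would more likely hand the block to a real YAML library; the hand-rolled version only keeps the sketch dependency-free.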

The body is free-form Markdown grouped under H2 headings. Implementations SHOULD treat the following H2 sections as canonical when present, but MUST tolerate arbitrary additional headings:

## What worked
## What didn't
## Edge cases
## Open questions

Entries within each section SHOULD be dated and signed (e.g. @username, 2026-05-12) to support provenance verification (§7).
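One way to group the body and flag unsigned entries is sketched below. The spec gives the signature only as an example, so the regular expression is an assumed shape (`@username, YYYY-MM-DD` at the end of a list item), and both function names are hypothetical.

```python
import re

# Assumed signature shape: "@username, YYYY-MM-DD", optionally closed by ")".
SIGNATURE = re.compile(r"@(?P<user>[\w-]+),\s*(?P<date>\d{4}-\d{2}-\d{2})\)?\s*$")

def group_sections(body):
    """Map each H2 heading to its non-blank lines; unknown headings are kept."""
    sections, current = {}, None
    for line in body.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections.setdefault(current, [])
        elif current is not None and line.strip():
            sections[current].append(line)
    return sections

def unsigned_entries(sections):
    """Return (heading, line) pairs for list items that lack a signature."""
    return [
        (heading, line)
        for heading, lines in sections.items()
        for line in lines
        if line.lstrip().startswith("- ") and not SIGNATURE.search(line)
    ]
```

Because signing is a SHOULD rather than a MUST, the sketch reports unsigned entries instead of rejecting the file.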

§4 EXAMPLES.md

EXAMPLES.md is a curated list of input/output pairs that anchor the skill's expected behavior. Each example MUST be a top-level H2 block. Each block MUST contain an ### Input and an ### Output subheading and MAY contain a ### Why this example commentary block.

## Example: refactor a god class

### Input
```ts
class OrderService { /* 800 lines */ }
```

### Output
Split into OrderRepository, PricingService,
and OrderNotifier following SRP.

### Why this example
Most users arrive with a god class but ask for
"clean up this file"; the skill must recognize
the SRP violation before applying the fix.

Implementations MAY surface the example count in UI as a quality signal. The order of examples is significant: earlier examples SHOULD be canonical, later ones edge cases.
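The block structure described above could be checked with a parser along these lines. It is a sketch under assumptions: the name `parse_examples` and the returned dict layout are illustrative, and headings inside fenced code are treated as content rather than structure.

```python
FENCE = "`" * 3  # Markdown code-fence marker

def parse_examples(text):
    """Return examples in file order as dicts keyed by 'title' and subheading."""
    examples, current, sub, in_fence = [], None, None, False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # headings inside code fences are content
        if not in_fence and line.startswith("## "):
            current = {"title": line[3:].strip()}
            examples.append(current)
            sub = None
        elif not in_fence and line.startswith("### ") and current is not None:
            sub = line[4:].strip()
            current[sub] = []
        elif current is not None and sub is not None:
            current[sub].append(line)
    for ex in examples:
        for required in ("Input", "Output"):
            if required not in ex:
                raise ValueError(f"{ex['title']!r} is missing ### {required}")
    return examples
```

Since order is significant, the list is returned in file order, and `len(parse_examples(text))` gives the example count a UI might surface.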

§5 CALIBRATION.md

CALIBRATION.md captures domain-specific tweaks that experienced users converge on: preferred terminology, output format conventions, things the skill must avoid in this domain. It is the file that turns a generic skill into a domain-aware one.

The body is free-form Markdown. Implementations MAY parse the following conventional sections:

## Terminology: preferred terms vs. avoid list
## Output format: domain conventions (e.g. SOAP notes, RFC-style citations)
## Avoid: anti-patterns specific to the domain
## Compliance: regulatory or policy constraints (HIPAA, FCRA, FDA, etc.)

§6 Runtime load order

When a stateful skill is invoked, a conforming runtime MUST concatenate the available files into the agent's context in the following order:

1. SKILL.md body (instructions: authoritative behavior)
2. CALIBRATION.md (domain adjustments: overlay on instructions)
3. EXAMPLES.md (anchored input/output pairs: disambiguate intent)
4. MEMORY.md (most recent entries first: what we know about past runs)

Runtimes SHOULD apply context-window budgeting in reverse order: if truncation is required, drop oldest MEMORY.md entries first, then trailing EXAMPLES.md blocks. SKILL.md and CALIBRATION.md MUST NOT be truncated; if they exceed the budget, the runtime MUST refuse the invocation rather than load a partial skill.
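The load order and truncation rules can be sketched as a single assembly function. Several details are assumptions made for illustration: the budget is counted in characters rather than tokens, the parts are joined with blank lines, and `assemble_context` and `SkillTooLargeError` are hypothetical names.

```python
class SkillTooLargeError(Exception):
    """Raised when SKILL.md plus CALIBRATION.md alone exceed the budget."""

def assemble_context(skill, calibration, examples, memory_newest_first, budget):
    """Concatenate in load order, trimming MEMORY then EXAMPLES to fit `budget`.

    `examples` and `memory_newest_first` are lists of already-split blocks;
    `budget` is a character count here purely for illustration.
    """
    fixed = [part for part in (skill, calibration) if part]
    examples = list(examples)
    memory = list(memory_newest_first)

    def render():
        return "\n\n".join(fixed + examples + memory)

    if len("\n\n".join(fixed)) > budget:
        # Refuse outright rather than load a partial skill.
        raise SkillTooLargeError("SKILL.md and CALIBRATION.md must not be truncated")
    while len(render()) > budget and memory:
        memory.pop()      # last element is the oldest memory entry
    while len(render()) > budget and examples:
        examples.pop()    # trailing examples are the edge cases
    return render()
```

Trimming whole entries and blocks, rather than cutting text mid-entry, keeps every surviving item intact.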

§7 Provenance

skills-hub.ai already signs every published SkillVersion with cosign and generates a SLSA provenance manifest. The stateful skill extension reuses this infrastructure: a forthcoming MemoryContribution model records each individual entry that lands in MEMORY.md, and each contribution is signed at approval time. The combined digest of all approved contributions is included in the parent SkillVersion attestation.

Verifying clients (e.g. skills-hub install --verify) MUST be able to detect tampering of any sibling file against the published manifest. Runtimes MAY refuse to load a stateful skill whose memory digest does not verify.
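The tamper check can be illustrated with a plain digest comparison. This sketch does not reproduce cosign or SLSA verification. It assumes the manifest's signature has already been verified and that the manifest exposes a hypothetical mapping of filename to hex SHA-256 digest, a shape this RFC does not define.

```python
import hashlib
from pathlib import Path

def file_digest(path):
    """Hex SHA-256 of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def tampered_files(skill_dir, manifest):
    """Return names listed in `manifest` that are missing or whose digest differs.

    `manifest` is a hypothetical {filename: sha256-hex} mapping taken from an
    already-verified attestation.
    """
    root = Path(skill_dir)
    bad = []
    for name, expected in sorted(manifest.items()):
        path = root / name
        if not path.is_file() or file_digest(path) != expected:
            bad.append(name)
    return bad
```

An empty result means every listed file matches. A stricter client might also reject sibling files that exist on disk but are absent from the manifest.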

§8 Versioning

Any modification to MEMORY.md, EXAMPLES.md, or CALIBRATION.md MUST bump the patch component of the version field in SKILL.md. Major and minor bumps remain reserved for breaking and additive changes to the instructions themselves. The version field in MEMORY.md frontmatter MUST be kept in sync at write time.
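A write path that follows this rule might look like the sketch below. It assumes plain MAJOR.MINOR.PATCH version strings and a `version:` line in frontmatter; the helper names are hypothetical.

```python
import re

def bump_patch(version):
    """Increment the patch component of a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"

def set_frontmatter_version(text, new_version):
    """Rewrite the first `version:` line, keeping any trailing comment."""
    pattern = re.compile(r"^(version:\s*)\S+", flags=re.MULTILINE)
    updated, count = pattern.subn(lambda m: m.group(1) + new_version, text, count=1)
    if count == 0:
        raise ValueError("no version field found")
    return updated
```

After editing a sibling file, a tool would compute `bump_patch` once and apply the result to both SKILL.md and the MEMORY.md frontmatter so the two never drift apart.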

§9 Compatibility

Runtimes that do not understand the stateful skill extension MUST ignore the three sibling files and load SKILL.md exactly as before. The extension is strictly additive: no field is renamed, no semantic of the existing spec is altered, and a skill authored against this RFC remains a valid SKILL.md skill on every conforming runtime that predates it.

§10 Status

Draft RFC. Comments, objections, and proposed amendments are welcome at github.com/tinh2/skills-hub/discussions. A parser reference implementation and registry support are shipping in the same sprint as this document.