Question 1

What is Prompt Cache?

Accepted Answer

A provider-side cache that reuses the model's processing of a repeated prompt prefix, cutting the cost of the cached portion by up to roughly 90% and first-token latency by up to roughly 80% when the prefix is identical between calls.

Question 2

What does Prompt Cache mean in AI coding?

Accepted Answer

Prompt caching is the provider-side optimization that reuses the model's processing of a repeated prompt prefix. When you make multiple calls that begin with the same long system prompt and the same long context (e.g., the same codebase and the same skill instructions across an agent's turn loop), the provider caches the prefix's intermediate state. Repeat calls that hit the cache pay as little as roughly 10% of the normal price for the prefix and can start streaming up to roughly 80% faster; the exact discount and speedup depend on the provider and on prompt length. Anthropic, OpenAI, and Google all introduced some form of prompt caching during 2024. In 2026 it is often the largest single cost optimization for production AI agents: Claude Code's sub-agents leverage it heavily, as does Cursor's Composer. It is effective when the prefix structure is consistent (system prompt + skill body + workspace context, in that order, with the changing content last). It breaks down when the prefix varies between calls, because any change early in the prompt invalidates the cached state for everything after it.
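The prefix rule and the cost arithmetic above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any provider's API: `build_prompt`, `prefix_key`, and `estimate_input_cost` are hypothetical helper names, the hash only stands in for the provider's real prefix matching (which works on tokens and is not exposed to callers), and the cost model ignores cache-write surcharges and cache expiry.

```python
import hashlib


def build_prompt(system_prompt: str, skill_body: str,
                 workspace_context: str, user_turn: str) -> tuple[str, str]:
    """Return (stable_prefix, volatile_suffix), with stable content first."""
    prefix = "\n\n".join([system_prompt, skill_body, workspace_context])
    return prefix, user_turn


def prefix_key(prefix: str) -> str:
    """Hash the prefix; byte-identical prefixes map to the same key.

    Stand-in for provider-side matching, used here only to show hit vs. miss.
    """
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()


def estimate_input_cost(prefix_tokens: int, suffix_tokens: int, calls: int,
                        cached_read_ratio: float = 0.10) -> tuple[float, float]:
    """Return (uncached, cached) input cost in token-equivalents.

    Assumes the first call pays full price for the prefix and every later
    call hits the cache. The 0.10 ratio is the rough figure from the text.
    """
    uncached = float(calls * (prefix_tokens + suffix_tokens))
    cached = (
        prefix_tokens                                      # first call: full-price prefix
        + (calls - 1) * prefix_tokens * cached_read_ratio  # later calls: discounted prefix
        + calls * suffix_tokens                            # the changing suffix is never cached
    )
    return uncached, float(cached)


if __name__ == "__main__":
    seen = set()
    for turn in ["List the failing tests.", "Fix the first failure."]:
        prefix, suffix = build_prompt(
            "You are a coding agent.", "SKILL: run pytest", "repo: example/", turn
        )
        key = prefix_key(prefix)
        print("hit" if key in seen else "miss", "-", suffix)
        seen.add(key)

    uncached, cached = estimate_input_cost(20_000, 500, calls=10)
    print(f"uncached={uncached:.0f} cached={cached:.0f} "
          f"saving={1 - cached / uncached:.0%}")
```

With a 20,000-token prefix, a 500-token changing suffix, and ten calls, this model gives an overall input saving of about 79% rather than 90%, because the first call and every suffix are still billed at full price. Moving anything volatile (a timestamp, the user's latest message) into the prefix would change the key on every call and turn each one into a miss.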
