AI coding glossary

Context Window

Also known as: context length, token window, input window

In one sentence

The maximum number of tokens an LLM can process at once, counting the prompt and the model's response together. Larger windows allow longer documents, more conversation history, and bigger codebases.

Full definition

An LLM's context window is the maximum number of tokens it can process in a single inference, covering both input (system prompt, user prompt, conversation history, and retrieved context) and output. In 2026 the flagship coding models all offer at least 1M tokens: Claude Opus 4.7 (1M), GPT-5.5 (1M), and Gemini 3.1 Pro (2M+ in some configs). A 1M-token context fits roughly an 80,000-line codebase, an hour of meeting transcripts, or 50,000 lines of test results. The practical gotcha is twofold: cost scales linearly with input tokens, and quality degrades past the model's effective context, which is often only 50-70% of the advertised maximum. Real-world heuristic: stay under 200K tokens to keep requests cheap and output sharp, and expand only when the task genuinely needs it.
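
To make the budgeting rule concrete, here is a minimal Python sketch that turns the rules of thumb above into a quick check. Everything in it is an assumption for illustration: the tokens-per-line ratio is simply derived from "1M tokens fits roughly 80,000 lines", the effective fraction uses the conservative end of the 50-70% range, and the function names `estimate_tokens` and `classify_budget` are hypothetical. Real token counts depend on the model's tokenizer, so treat the output as a rough guide rather than a measurement.

```python
# Rough context-budget check. All constants are assumptions taken from the
# rules of thumb in this entry, not values measured with a real tokenizer.
TOKENS_PER_CODE_LINE = 1_000_000 / 80_000  # 12.5, from "1M fits ~80,000 lines"
EFFECTIVE_FRACTION = 0.5                   # conservative end of the 50-70% range
CHEAP_AND_SHARP_LIMIT = 200_000            # heuristic ceiling from this entry


def estimate_tokens(code_lines: int) -> int:
    """Estimate how many tokens a given number of source lines will use."""
    return round(code_lines * TOKENS_PER_CODE_LINE)


def classify_budget(input_tokens: int, reserved_output: int, advertised_max: int) -> str:
    """Classify a request by how much of the window it consumes.

    Input and output share the same window, so both are counted.
    """
    total = input_tokens + reserved_output
    if total > advertised_max:
        return "exceeds window"
    if total > advertised_max * EFFECTIVE_FRACTION:
        return "past effective context"
    if total > CHEAP_AND_SHARP_LIMIT:
        return "fits, but costly"
    return "cheap and sharp"
```

For example, under these assumptions a 10,000-line codebase comes to about 125,000 tokens, so `classify_budget(estimate_tokens(10_000), 8_000, 1_000_000)` returns `"cheap and sharp"`, while a 600,000-token prompt against the same 1M window lands in `"past effective context"`.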
