AI coding glossary
Context Window
Also known as: context length, token window, input window
In one sentence
The maximum number of tokens an LLM can process at once, counting the prompt and the model's response together. Larger windows accommodate longer documents, more conversation history, and bigger codebases.
Full definition
An LLM's context window is the maximum number of tokens it can process in a single inference, covering both input (prompt + system + history + retrieved context) and output. In 2026 the flagship coding models sit at 1M tokens or more: Claude Opus 4.7 (1M), GPT-5.5 (1M), Gemini 3.1 Pro (2M+ in some configs). A 1M context fits roughly an 80,000-line codebase, an hour of meeting transcripts, or 50,000 lines of test results. The practical gotcha is twofold: cost scales linearly with input tokens, and quality degrades past the model's effective context (often 50-70% of the advertised max). Real-world heuristic: stay under 200K tokens to keep cost low and output sharp; go larger only when the task genuinely needs it.
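The budgeting arithmetic above can be sketched in a few lines of Python. This is a rough illustration, not any vendor's API: the 4-characters-per-token ratio is a common approximation rather than a real tokenizer, and the names `estimate_tokens`, `ContextBudget`, `input_cost`, and the 8,000-token output reserve are all hypothetical. The 1M window, the 50% effective fraction (low end of the 50-70% range), and the 200K soft limit come from the definition.

```python
from dataclasses import dataclass

# Assumption: ~4 characters per token for English text. Real tokenizers vary,
# so treat this as a ballpark, not a measurement.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


@dataclass
class ContextBudget:
    advertised_max: int = 1_000_000   # advertised window, in tokens
    effective_fraction: float = 0.5   # low end of the 50-70% range
    soft_limit: int = 200_000         # "stay under 200K" heuristic
    reserved_output: int = 8_000      # hypothetical room kept for the response

    def effective_max(self) -> int:
        """Tokens the model is assumed to handle before quality drops."""
        return int(self.advertised_max * self.effective_fraction)

    def check(self, input_tokens: int) -> str:
        """Classify a request; input and reserved output share the window."""
        total = input_tokens + self.reserved_output
        if total > self.advertised_max:
            return "over_limit"   # will not fit at all
        if total > self.effective_max():
            return "degraded"     # fits, but past the effective context
        if total > self.soft_limit:
            return "expensive"    # fits and should work, but costs more
        return "ok"


def input_cost(input_tokens: int, price_per_million: float) -> float:
    """Input cost grows linearly with token count (price is illustrative)."""
    return input_tokens / 1_000_000 * price_per_million
```

Running `ContextBudget().check(estimate_tokens(prompt_text))` before a request shows which band the prompt falls in. Because input and output share one window, the check adds the output reserve to the input estimate before comparing against each threshold.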