
AI coding glossary

Guardrails

Also known as: LLM guardrails, agent guardrails, safety filters

In one sentence

Rules and validation layers that constrain what an AI agent or LLM can do or say: input validation, output filtering, per-tool allowlists, approval gates, and budget limits.

Full definition

Guardrails are the safety and correctness layer wrapped around an LLM or agent. In agentic coding tools they take several forms: per-tool allowlists (this subagent can use only Read and Edit, not Bash), approval gates (require human approval before destructive operations), input validation (reject inputs that look like prompt injection), output filters (block secrets from being printed), budget limits (cancel the run if cost exceeds a set limit), and scope constraints (this agent operates only in this directory). As of 2026, the canonical patterns are Cline's per-step approval UI, Claude Code's hooks (PreToolUse / PostToolUse), MCP's per-tool permissions, and skill-level tool declarations in SKILL.md frontmatter.
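To make these forms concrete, here is a minimal Python sketch of a guardrail layer that combines an allowlist, a scope constraint, an approval gate, a budget limit, and an output filter. Every name in it (`Guardrails`, `check_call`, `filter_output`, `GuardrailViolation`, `SECRET_PATTERN`) is hypothetical and chosen for illustration. It does not reproduce the API of Cline, Claude Code, or MCP, and the secret pattern is deliberately simplistic.

```python
import re
from dataclasses import dataclass
from pathlib import Path


class GuardrailViolation(Exception):
    """Raised when a requested tool call breaks a guardrail."""


# Hypothetical, deliberately narrow pattern for secret-shaped strings.
# A real output filter would use a much broader rule set.
SECRET_PATTERN = re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+")


@dataclass
class Guardrails:
    allowed_tools: set[str]      # per-tool allowlist
    destructive_tools: set[str]  # tools that need human approval
    workspace: Path              # scope constraint: the only directory in play
    max_cost_usd: float          # budget limit for the whole run
    spent_usd: float = 0.0

    def check_call(self, tool: str, path: str, cost_usd: float,
                   approved: bool = False) -> None:
        """Validate one tool call; raise GuardrailViolation if it is not allowed."""
        # Allowlist: the agent may only use tools it was explicitly granted.
        if tool not in self.allowed_tools:
            raise GuardrailViolation(f"tool not allowed: {tool}")

        # Scope: resolve the path so '..' segments cannot escape the workspace.
        root = self.workspace.resolve()
        target = (root / path).resolve()
        if not target.is_relative_to(root):
            raise GuardrailViolation(f"path outside workspace: {path}")

        # Approval gate: destructive tools need an explicit human OK.
        if tool in self.destructive_tools and not approved:
            raise GuardrailViolation(f"approval required for: {tool}")

        # Budget: refuse the call if it would push spend past the limit.
        if self.spent_usd + cost_usd > self.max_cost_usd:
            raise GuardrailViolation("budget exceeded")

        self.spent_usd += cost_usd

    def filter_output(self, text: str) -> str:
        """Output filter: redact secret-shaped substrings before printing."""
        return SECRET_PATTERN.sub("[REDACTED]", text)
```

In this sketch the caller runs `check_call` before executing a tool and passes any tool output through `filter_output` before showing it. Checking everything in one place before the tool runs is a design choice made here for brevity; real tools often split these checks across separate hooks or permission layers.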
