Cursor · Release deep-dive

Cursor 3.6 & 3.7: Auto-Review Mode, Canvas Design, and Smarter Agent Control

Cursor 3.6 ships Auto-review Mode — a three-stage classifier (allowlist → sandbox → LLM judge) that decides which tool calls run, run sandboxed, or need approval. Cursor 3.7 adds Design Mode for visual canvas iteration and interactive context usage reports. Here's how both reshape your daily workflow.

3-stage classifier pipeline in every new Cursor project

By Skills-Hub Team · Cursor ecosystem coverage · June 5, 2026 · 7 min read

Cursor · Agent Safety · Canvas

Every developer who has left a Cursor agent running overnight has returned to the same scene: a terminal full of y presses, shell commands that ran without review, and a diff that touches files you never meant to touch. Cursor 3.6's Auto-review Mode is the systematic answer to that problem. It replaces the binary "ask every time" vs. "run everything" choice with a three-stage classifier pipeline that handles tool calls the same way a good senior engineer would: approve the obvious ones, isolate the risky ones, and only escalate what genuinely needs a human decision.

Six days later, Cursor 3.7 landed with Canvas Design Mode and interactive context usage reports. Together, these two releases mark the biggest quality-of-life jump for heavy Cursor users since Composer 2.5.

3.6: Auto-review Mode, released May 29, 2026

3.7: Canvas Design Mode, released June 4, 2026

3 stages in the classifier: allowlist → sandbox → LLM judge

The y-spamming problem

Before Auto-review, Cursor's agent run modes were effectively binary. "Ask" mode stops the agent before every tool call — safe, but so interruptive that you spend more time approving than coding. "Auto" mode (sometimes called YOLO) runs everything without asking — fast, but every shell command lands without review.

In practice, most developers ended up in a third emergent mode, "y-spamming": accepting every prompt as fast as possible while half-watching the terminal. That's the worst outcome: the cognitive load of Ask mode with none of the safety.

Auto-review is the structured middle ground. It applies specifically to Shell, MCP, and Fetch tool calls — the surface area where agents actually interact with your file system, network, and external services. Regular file reads and edits are not classified; only calls that could cause side effects outside the current repo are routed through the pipeline.

How Auto-review works

Every Shell, MCP, or Fetch call the agent wants to make passes through three stages in order. The first stage that produces a verdict wins; calls never travel further down the pipeline than they have to.

Stage 1: Allowlist

If the call matches a pattern in your terminal or MCP allowlist, it executes immediately. No delay, no LLM call. This is where you put the commands that are unambiguously safe in your project: git status, pnpm test, cat on specific paths.

Settings (JSON)

{
  "cursor.agent.terminalAllowlist": [
    "git status*",
    "git diff*",
    "git log*",
    "pnpm test",
    "pnpm build",
    "ls *",
    "cat src/**"
  ]
}

Allowlist matching uses glob patterns. Anything that matches runs without any classifier overhead. Build your allowlist first — it's the biggest lever for reducing unnecessary LLM calls.
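To get a feel for how glob patterns behave before you commit an allowlist, here is a minimal sketch using Python's standard `fnmatch` module. It assumes the whole command string is matched against each pattern; Cursor's actual matcher may differ, and `ALLOWLIST`, `is_allowlisted`, and `overly_broad` are illustrative names, not part of Cursor.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration; mirror your own project's allowlist.
ALLOWLIST = ["git status", "pnpm test", "ls *", "cat src/**"]


def is_allowlisted(command: str, patterns: list[str] = ALLOWLIST) -> bool:
    """Return True if the whole command string matches any glob pattern."""
    normalized = " ".join(command.split())  # collapse repeated whitespace
    return any(fnmatchcase(normalized, pattern) for pattern in patterns)


def overly_broad(patterns: list[str], risky_commands: list[str]) -> list[tuple[str, str]]:
    """List (pattern, command) pairs where a pattern would admit a risky command."""
    return [
        (pattern, command)
        for pattern in patterns
        for command in risky_commands
        if fnmatchcase(command, pattern)
    ]
```

Running `overly_broad` over a candidate list with a few commands you never want auto-approved (for example `git push origin main`) is a cheap way to catch wildcard patterns such as `git *` that admit more than you intended.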

Stage 2: Sandbox

If a call doesn't match the allowlist but can be sandboxed, Cursor runs it in an isolated environment with network and file system restrictions. The agent can still execute the command and see real output, but side effects that would touch files outside the project or make external network calls are blocked at the OS level.

Sandbox execution is transparent to the agent, which behaves as it normally would. You get the speed of Auto mode with the safety net of isolation for calls that sandbox cleanly.
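The "outside the project" rule can be pictured as a path-containment check. The sketch below is only an illustration of that idea in user-space Python; the article describes Cursor's enforcement as happening at the OS level, and `inside_project` is a hypothetical helper, not a Cursor API.

```python
from pathlib import Path


def inside_project(target: str, project_root: str) -> bool:
    """Return True if target resolves to the project root or a path beneath it."""
    root = Path(project_root).resolve()
    # Joining with an absolute target discards root, so "/etc/passwd" stays absolute.
    candidate = (root / target).resolve()
    return candidate == root or root in candidate.parents
```

Resolving both paths first means `..` segments and symlinked roots are normalized before the comparison, which is the part that naive string-prefix checks tend to get wrong.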

Stage 3: Classifier

Everything that doesn't match the allowlist and can't be sandboxed goes to the classifier: a subagent that evaluates the call against three criteria.

Safety: Does this call have an obvious risk profile? (Deleting files, exfiltrating data, installing global packages without prior instruction)
Intent alignment: Does the call match what you actually asked the agent to do in this session?
Scope: Is the blast radius of this call proportionate to the task?

The classifier returns one of three verdicts: allow (execute), replan (try a different approach without this tool call), or ask (surface to the user). Replan is the interesting one — the agent doesn't just stop, it reroutes. If the classifier blocks a curl call, the agent might switch to a native HTTP library instead.
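The first-verdict-wins flow described above can be sketched in a few lines. This is a toy model under stated assumptions, not Cursor's implementation: `ToolCall`, `Decision`, `review`, and the two stand-in functions are hypothetical names, and the real sandbox check and classifier are far more involved than these placeholder rules.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable


@dataclass(frozen=True)
class ToolCall:
    kind: str      # "shell", "mcp", or "fetch"
    command: str


@dataclass(frozen=True)
class Decision:
    stage: str     # which stage produced the verdict
    action: str    # "run", "run_sandboxed", "replan", or "ask"


def review(
    call: ToolCall,
    allowlist: list[str],
    can_sandbox: Callable[[ToolCall], bool],
    classify: Callable[[ToolCall], str],
) -> Decision:
    """Return the first verdict produced, checking the stages in order."""
    # Stage 1: an allowlist match runs immediately.
    if any(fnmatchcase(call.command, pattern) for pattern in allowlist):
        return Decision("allowlist", "run")
    # Stage 2: calls that can be isolated run inside the sandbox.
    if can_sandbox(call):
        return Decision("sandbox", "run_sandboxed")
    # Stage 3: the classifier answers "allow", "replan", or "ask".
    verdict = classify(call)
    return Decision("classifier", "run" if verdict == "allow" else verdict)


# Toy stand-ins for the two judgment calls (hypothetical rules).
def toy_can_sandbox(call: ToolCall) -> bool:
    return call.kind == "shell"


def toy_classify(call: ToolCall) -> str:
    if "publish" in call.command:
        return "ask"
    if call.command.startswith("curl "):
        return "replan"
    return "allow"
```

Because each stage returns as soon as it has a verdict, the expensive classifier call only happens for the residue that neither the allowlist nor the sandbox can handle.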

60% fewer approval prompts in typical sessions, per Cursor's internal measurement on projects with a tuned allowlist and Auto-review enabled.

Configuring the classifier

Auto-review is off by default in existing projects and on by default in new ones created after 3.6. To switch an existing project:

Open settings with Cmd+, (Mac) or Ctrl+, (Windows/Linux).

Navigate to Cursor Settings > Agents > Run Mode.

Select Auto-review.

The classifier is steerable with custom instructions. You write plain English rules that tell it how to weigh specific patterns for your project. This is where Auto-review gets genuinely powerful.

.cursor/agent-policy.md

# Classifier Instructions

## Always allow
- Any git command that does not push to remote
- Any pnpm/npm command that only installs devDependencies
- Read operations on any file in this repo

## Always block
- Any rm -rf or equivalent recursive delete
- Any command that writes to files outside this repo root
- Network calls to domains not in our allowlist

## Ask me first
- git push to any remote (including origin)
- npm publish or equivalent
- Any Docker command that modifies running containers
- Any command that reads, prints, or sets secrets or credentials in environment variables
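If you keep the policy file in version control, a small lint step can confirm its structure hasn't drifted. The sketch below assumes the file uses the `##` headings and `-` bullets shown above; Cursor itself may treat the file as free-form text, and `parse_policy` is a hypothetical helper rather than anything Cursor ships.

```python
import re


def parse_policy(text: str) -> dict[str, list[str]]:
    """Group '- ' bullet rules under the '## ' heading that precedes them."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        heading = re.match(r"^##\s+(.*\S)\s*$", line)
        if heading:
            current = heading.group(1)
            sections[current] = []
        elif current is not None and line.lstrip().startswith("- "):
            sections[current].append(line.lstrip()[2:].strip())
    return sections


def missing_sections(text: str, required: list[str]) -> list[str]:
    """Return required section names that are absent or have no rules."""
    sections = parse_policy(text)
    return [name for name in required if not sections.get(name)]
```

Calling `missing_sections(policy_text, ["Always allow", "Always block", "Ask me first"])` in CI would flag a policy file where someone deleted or emptied one of the three sections.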

Canvas Design Mode (3.7)

Canvas Design Mode, released June 4, 2026, solves a different problem: the gap between what you see in a UI and what you can accurately describe in text. Before Design Mode, if you wanted Cursor to adjust a component's alignment or color, you wrote a text description of the visual change and hoped the agent interpreted it correctly.

Design Mode makes UI elements in a canvas directly selectable and annotatable. Click a button, drag a margin, annotate a layout section — the annotation becomes the agent's instruction. You're pointing at what to change instead of describing it.

How annotation works

When you open a canvas in Design Mode, every rendered element becomes selectable. Click to select, then annotate with a label ("too small", "wrong color — should match header", "needs 8px gap above"). The agent reads the annotations as structured visual feedback and generates the code change.

The workflow is closest to leaving design comments in Figma — except the comments directly drive code output. For teams where designers review canvases, this means design feedback can feed directly into the agent's task queue without a developer translating comments into prompts.

Context usage reports

Context window management is the hidden tax on long agentic sessions. By the time an agent starts producing degraded output, you're already most of the way through a 200K token window and have no visibility into why.

Cursor 3.7 addresses this with a context usage report that renders as a canvas. Ask Cursor "show me a context usage report" and it produces an interactive breakdown of where tokens are going.

Example context breakdown

Context usage: 142,400 / 200,000 tokens (71%)

System prompt          12,400    8.7%
Tool definitions        8,200    5.8%
Rules (.cursorrules)   14,300   10.0%
Skills loaded          22,100   15.5%
Conversation history   81,400   57.2%
File context            4,000    2.8%

Because the report is a canvas, you can follow up: "Which skills are consuming the most tokens?" or "How do I reduce the rules footprint?" The agent responds inline and can show a revised breakdown after you accept its suggestions.

The embedded "Debug with Agent" button in the report is the practical lever: one click starts a conversation about context optimization without losing the current task thread.
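When reading a breakdown like the example above, it helps to know what each percentage is a share of. In that example the per-category figures are shares of tokens used, while the headline figure is the share of the window. A minimal sketch of the arithmetic, using the example's token counts (`breakdown` is an illustrative helper, not a Cursor API):

```python
WINDOW = 200_000

# Token counts copied from the example breakdown.
USAGE = {
    "System prompt": 12_400,
    "Tool definitions": 8_200,
    "Rules (.cursorrules)": 14_300,
    "Skills loaded": 22_100,
    "Conversation history": 81_400,
    "File context": 4_000,
}


def breakdown(usage: dict[str, int], window: int):
    """Return (tokens used, % of window used, per-category share of used tokens)."""
    used = sum(usage.values())
    window_pct = round(100 * used / window)
    shares = {name: round(100 * tokens / used, 1) for name, tokens in usage.items()}
    return used, window_pct, shares


used, window_pct, shares = breakdown(USAGE, WINDOW)
print(f"Context usage: {used:,} / {WINDOW:,} tokens ({window_pct}%)")
for name, tokens in USAGE.items():
    print(f"{name:<22}{tokens:>8,}{shares[name]:>7.1f}%")
```

Seen this way, conversation history is the obvious first target: it accounts for more than half of the tokens in use, so trimming or summarizing it frees far more room than tuning rules or tool definitions.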

Run modes compared

Cursor now ships four distinct run modes. Here's the practical guide to when each one belongs:

Mode          Behavior                    When to use
Ask           Approve every tool call     Untrusted repos, onboarding
Auto-review   Classify, sandbox, or ask   Default for active development
Auto          Run everything              Trusted solo projects, CI contexts

A fourth mode — "Yolo" in community parlance — is Auto with all safety checks disabled. It exists for benchmarking and CI scenarios where the agent runs in a throwaway VM. Don't use it on your main checkout.

The right default for most developers is Auto-review with a tuned allowlist covering your project's common commands. The first week will surface occasional classifier escalations — use those to refine your allowlist and custom instructions until the interrupt rate drops to near zero.
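One way to make that first-week tuning systematic is to tally which commands keep getting escalated. The sketch below assumes you have collected the escalated command strings yourself, for instance by jotting them down during a session; the article doesn't describe an export for this, and `frequent_escalations` is a hypothetical helper.

```python
from collections import Counter


def frequent_escalations(escalated_commands: list[str], min_count: int = 3) -> list[tuple[str, int]]:
    """Return exact commands escalated at least min_count times, most frequent first."""
    counts = Counter(" ".join(command.split()) for command in escalated_commands)
    return [(command, n) for command, n in counts.most_common() if n >= min_count]
```

Commands that show up repeatedly are candidates for an exact-match allowlist entry or a custom instruction. Review each one by hand before adding it, since frequency says nothing about whether the command is safe.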

Composing with skills-hub.ai

Auto-review's classifier instructions are a form of policy — and policy belongs in version control, not just in Cursor's settings UI. The cursor-agent-safety skill on skills-hub.ai gives you a starting template that audits your current run mode configuration, writes a project-appropriate classifier instruction file, and validates your allowlist against common patterns.

Terminal

# Install the cursor-agent-safety skill
npx @skills-hub-ai/cli install cursor-agent-safety

# The skill creates:
# .cursor/agent-policy.md  — classifier instructions
# .cursor/terminal-allowlist.json  — shell command policy

Once installed, run the skill whenever you add a new tool or integration to your project. It audits whether the new surface area needs allowlist entries or classifier instructions to stay within your project's risk tolerance.

Browse related skills at /browse?category=security or the Kiro skills page if you're running both Cursor and Kiro in parallel.
