Skip to main content

debug-production-incident

Structured triage → mitigation → root-cause flow for live production incidents. Use when a running system is degraded or down (errors, latency, outages) and you need to restore service safely and then find the cause.

v1.0.0New
0

Signing

SignedSLSA L2
Signed by
skills-hub.ai distributor
Method
Distributor-signed by skills-hub.aiCryptographically signed by the skills-hub.ai distributor key at publish time.
Signed

Install this skill

Run this command in your terminal. No account required — it auto-detects your AI tool and installs the skill file.

npx @skills-hub-ai/cli install debug-production-incident
Or download directly:
Browse all CLI commands →

Setup by platform

Claude Code

~/.claude/skills/<skill>/SKILL.md

Setup guide →

Cursor

~/.cursor/skills/<skill>/SKILL.md

Setup guide →

Windsurf

~/.windsurf/skills/<skill>/SKILL.md, or add to .windsurfrules in your project

Setup guide →

GitHub Copilot

.github/copilot-instructions.md (project-level), paste skill content into instructions

Setup guide →

Cline

~/.cline/skills/<skill>/SKILL.md, or add to .clinerules in your project

Setup guide →

Codex CLI

~/.codex/skills/<skill>/SKILL.md, or reference via codex.md instructions file

Setup guide →

Gemini CLI

GEMINI.md (project-level) or ~/.gemini/skills/<skill>/SKILL.md

Setup guide →

Install

One-click setup for your editor

Run in your project root

npx @skills-hub-ai/cli install debug-production-incident --target claude-code

Examples

Prompt

Our user dashboard is showing 500 errors for all users in the US East region since 10:15 AM. The error rate spiked suddenly after a recent deployment. Can you help?

Output

SITUATION: US East users experiencing 500 errors since 10:15 AM, coinciding with recent deployment, indicating a likely deploy-related incident.
TIMELINE: Deployment completed at 10:10 AM; errors began at 10:15 AM; alert triggered at 10:16 AM.
HYPOTHESES: Deploy introduced a breaking change; logs show new error signatures post-deploy; metrics indicate error spike immediately after release.
ACTION NOW: Roll back the recent deployment to the previous stable version. Expect error rate to decrease and errors to subside within 2-3 minutes.
ROLLBACK: Revert the deployment via the deployment system to the last known-good build. Monitor metrics to confirm recovery.

Prompt

Our API service is experiencing high latency and increased error rates in the Europe region, affecting 30% of users. No recent deploys. What should I check next?

Output

SITUATION: Europe region API experiencing high latency and errors affecting 30% of users, without recent deployments, indicating a possible dependency or resource issue.
TIMELINE: Latency and errors increased sharply at 11:45 AM; no recent code changes; traffic volume stable.
HYPOTHESES: Dependency slowdown (e.g., downstream API or database); resource saturation (CPU/memory); network issues.
ACTION NOW: Check dependency health and latency metrics, focusing on downstream services and database performance. If dependency latency is high, enable circuit breakers or serve cached responses.
ROLLBACK: Not applicable unless a dependency fix is identified; focus on mitigating by serving degraded responses or rate limiting until root cause is confirmed.

Instructions

This skill doesn’t include stateful context yet, instructions only. Learn about stateful skills.

Security

Loading security scan...

Reviews (0)

Frequently asked questions about debug-production-incident

What does the debug-production-incident skill do?

Structured triage → mitigation → root-cause flow for live production incidents. Use when a running system is degraded or down (errors, latency, outages) and you need to restore service safely and then find the cause. It's a reusable SKILL.md instruction set that loads into your AI coding assistant on demand, no prompt engineering, no copy-pasting every session.

How do I install the debug-production-incident skill?

Run `npx @skills-hub-ai/cli install debug-production-incident` from your terminal. The CLI writes the SKILL.md to the correct location for your AI tool (e.g. ~/.claude/skills/debug-production-incident/ for Claude Code or ~/.cursor/skills/ for Cursor with --target cursor) and adds it to your project's .skills.json lockfile.

Which AI tools does debug-production-incident work with?

debug-production-incident runs in Claude Code, Cursor, Windsurf, GitHub Copilot, Cline, Codex CLI, Gemini CLI. It follows the open Agent Skills standard (SKILL.md), so the same skill works in every supported tool without modification.

Is the debug-production-incident skill free?

Yes. Every skill on skills-hub.ai is free and open-source. There are no premium tiers, paywalls, or usage limits. You only pay for whatever AI assistant you're already using.

How do I use debug-production-incident after installing it?

In Claude Code, type `/debug-production-incident` (or whatever slash command the skill registers) and the AI follows the skill's instructions immediately. You can also reference it by name in natural language, your AI loads the skill into context when relevant.

Can I share the debug-production-incident skill with my team?

Yes. Commit your project's .skills.json lockfile and teammates run `npx @skills-hub-ai/cli install` (no args) to install every skill at the exact version you pinned. Organization-scoped installs work via skills-hub.ai organizations.