Databricks skills
Official Databricks skills for AI coding assistants working with the Databricks platform skills-hub.ai mirrors 29 skills from Databricks daily, every skill links back to its upstream GitHub source. Install with one command across Claude Code, Cursor, Codex, Windsurf, and any MCP-compatible tool.
Installing a Databricks skill
Pick a skill below, then run the install command for your AI coding tool. The skills-hub CLI writes the SKILL.md to the right directory and tracks the install in .skills.json so your team gets reproducible installs.
# Install a Databricks skill
npx @skills-hub-ai/cli install <skill-slug>
# Browse all Databricks skills via API
curl https://skills-hub.ai/api/v1/skills?source=databricks
# Browse all sources
open https://skills-hub.ai/sourcesTop Databricks skills
See all →The most-installed skills from Databricks, ranked by adoption.
01databricks-app-design
Design the UX of Databricks data apps — dashboards, KPI pages, reports, charts, tables, and Genie/chat data assistants — mapped to concrete AppKit components. Use when BUILDING or reviewing any UI that displays data or answers data questions: choosing genre, layout, charts, KPIs, semantic color, required states (loading/empty/error), IBCS notation, and AI-result trust (showing generated SQL/sources for Genie/chat). NOT for authoring managed AI/BI (Lakeview) dashboards (→ databricks-aibi-dashboards), non-data frontend (forms, settings, auth, marketing), or scaffolding/build/deploy (→ databricks-apps). Complements databricks-apps; use it alongside whenever the app has a dashboard, chart, table, KPI, report, or Genie/chat/AI surface.
Buildfrom Databricks02databricks-unstructured-pdf-generation
Build RAG / unstructured-document evaluation datasets and demo documents (e.g. for Knowledge Assistant) on Databricks: generate synthetic PDFs locally, upload to Unity Catalog volumes, and pair each document with test questions for retrieval evaluation.
Buildfrom Databricks03databricks-pipelines
Develop Lakeflow Spark Declarative Pipelines (formerly Delta Live Tables) on Databricks. Use when building batch or streaming data pipelines with Python or SQL. Invoke BEFORE starting implementation.
Buildfrom Databricks04databricks-apps
Build apps on Databricks Apps platform. Use when asked to create dashboards, data apps, analytics tools, or visualizations. Evaluates data access patterns (analytics vs Lakebase synced tables) before scaffolding. Invoke BEFORE starting implementation.
Buildfrom Databricks05databricks-core
Databricks CLI operations: auth, profiles, data exploration, and bundles. Contains up-to-date guidelines for Databricks-related CLI tasks.
Buildfrom Databricks06databricks-serverless-migration
Migrate Databricks workloads from classic compute to serverless compute. Use when migrating notebooks, jobs, pipelines, or Scala JARs (`spark_jar_task`) from classic clusters to serverless, checking if existing code is serverless-compatible, or writing new serverless-compatible code. Provides concrete fixes for the serverless Spark Connect architecture and guides the full migration. Not for classic DBR version upgrades or cluster configuration changes within classic compute.
Buildfrom Databricks07databricks-docs
Databricks documentation reference via llms.txt index. Use when other skills do not cover a topic, looking up unfamiliar Databricks features, or needing authoritative docs on APIs, configurations, or platform capabilities.
Buildfrom Databricks08databricks-lakebase
Databricks Lakebase Postgres: projects, scaling, connectivity, Lakebase synced tables, and Data API. Use when asked about Lakebase databases, OLTP storage, or connecting apps to Postgres on Databricks.
Buildfrom Databricks09databricks-model-serving
Manage Databricks Model Serving endpoints via CLI. Use when asked to create, configure, query, or manage model serving endpoints for LLM inference, custom models, or external models.
Buildfrom Databricks10databricks-python-sdk
Databricks development guidance including Python SDK, Databricks Connect, CLI, and REST API. Use when working with databricks-sdk, databricks-connect, or Databricks APIs.
Buildfrom Databricks11databricks-dabs
Create, configure, validate, deploy, run, and manage Declarative Automation Bundles (DABs, formerly Databricks Asset Bundles). Use when working with Databricks resources via DABs including dashboards, jobs, pipelines, alerts, volumes, and apps.
Buildfrom Databricks12databricks-vector-search
Databricks Vector Search endpoints and indexes for RAG and semantic search; covers index types, search modes, end-to-end RAG patterns
Buildfrom Databricks13databricks-spark-structured-streaming
Comprehensive guide to Spark Structured Streaming for production workloads. Use when building streaming pipelines, working with Kafka ingestion, implementing Real-Time Mode (RTM), configuring triggers (processingTime, availableNow), handling stateful operations with watermarks, optimizing checkpoints, performing stream-stream or stream-static joins, writing to multiple sinks, or tuning streaming cost and performance.
Buildfrom Databricks14spark-python-data-source
Build custom Python data sources for Apache Spark using the PySpark DataSource API — batch and streaming readers/writers for external systems. Use this skill whenever someone wants to connect Spark to an external system (database, API, message queue, custom protocol), build a Spark connector or plugin in Python, implement a DataSourceReader or DataSourceWriter, pull data from or push data to a system via Spark, or work with the PySpark DataSource API in any way. Even if they just say "read from X in Spark" or "write DataFrame to Y" and there's no native connector, this skill applies.
Buildfrom Databricks15databricks-synthetic-data-gen
Generate realistic synthetic data using Spark + Faker (strongly recommended). Supports serverless execution, multiple output formats (Parquet/JSON/CSV/Delta), and scales from thousands to millions of rows. For small datasets (<10K rows), can optionally generate locally and upload to volumes. Use when user mentions 'synthetic data', 'test data', 'generate data', 'demo dataset', 'Faker', or 'sample data'.
Buildfrom Databricks16databricks-lakeflow-connect
Build managed ingestion pipelines into Databricks using Lakeflow Connect. Use when ingesting from SaaS apps (Salesforce, Workday Reports, ServiceNow, Google Analytics 4, HubSpot, Confluence) or databases (SQL Server cloud and on-prem; PostgreSQL/MySQL CDC in PuPr) into Unity Catalog with serverless pipelines.
Buildfrom Databricks17databricks-dbsql
Databricks SQL (DBSQL) advanced features and SQL warehouse capabilities. This skill MUST be invoked when the user mentions: "DBSQL", "Databricks SQL", "SQL warehouse", "SQL scripting", "stored procedure", "CALL procedure", "materialized view", "CREATE MATERIALIZED VIEW", "pipe syntax", "|>", "geospatial", "H3", "ST_", "spatial SQL", "collation", "COLLATE", "ai_query", "ai_classify", "ai_extract", "ai_gen", "AI function", "http_request", "remote_query", "read_files", "Lakehouse Federation", "recursive CTE", "WITH RECURSIVE", "multi-statement transaction", "temp table", "temporary view", "pipe operator". SHOULD also invoke when the user asks about SQL best practices, data modeling patterns, or advanced SQL features on Databricks.
Buildfrom Databricks18databricks
Databricks CLI operations: auth, profiles, data exploration, and bundles. Contains up-to-date guidelines for Databricks-related CLI tasks.
Buildfrom Databricks19databricks-agent-bricks
Create Agent Bricks: Knowledge Assistants (KA) for document Q&A and Supervisor Agents for multi-agent orchestration (MAS).
Buildfrom Databricks20databricks-jobs
Develop and deploy Lakeflow Jobs on Databricks via DABs, Python SDK, or the CLI. Use when creating data engineering jobs with notebooks, Python wheels, SQL, dbt, or pipelines. Invoke BEFORE starting implementation.
Buildfrom Databricks21databricks-ai-functions
Use Databricks built-in AI Functions (ai_classify, ai_extract, ai_summarize, ai_mask, ai_translate, ai_fix_grammar, ai_gen, ai_analyze_sentiment, ai_similarity, ai_parse_document, ai_query, ai_forecast) to add AI capabilities directly to SQL and PySpark pipelines without managing model endpoints. Also covers document parsing and building custom RAG pipelines (parse → chunk → index → query).
Buildfrom Databricks22databricks-apps-python
Builds Databricks applications. Prefers AppKit (TypeScript + React SDK) for new apps; falls back to Python frameworks (Dash, Streamlit, Gradio, Flask, FastAPI, Reflex) when Python is required. Handles OAuth authorization, app resources, SQL warehouse and Lakebase connectivity, model serving, foundation model APIs, and deployment. Use when building web apps, dashboards, ML demos, or REST APIs for Databricks, or when the user mentions AppKit, Streamlit, Dash, Gradio, Flask, FastAPI, Reflex, or Databricks app.
Buildfrom Databricks23databricks-metric-views
Unity Catalog metric views: define, create, query, and manage governed business metrics in YAML. Use when building standardized KPIs, revenue metrics, order analytics, or any reusable business metrics that need consistent definitions across teams and tools.
Buildfrom Databricks24databricks-mlflow-evaluation
MLflow 3 GenAI agent evaluation. Use when writing mlflow.genai.evaluate() code, creating @scorer functions, using built-in scorers (Guidelines, Correctness, Safety, RetrievalGroundedness), building eval datasets from traces, setting up trace ingestion and production monitoring, aligning judges with MemAlign from domain expert feedback, or running optimize_prompts() with GEPA for automated prompt improvement.
Buildfrom Databricks
About this source
skills-hub.ai mirrors skills from 90+ official GitHub repositories every day. Each imported skill is parsed from a SKILL.md file in the source repo, gets a security scan and quality score on import, and links back to its upstream source of truth.
Last sync: Jun 14, 2026, 4:11 PM (success).
Databricks skills, frequently asked
What are Databricks skills?
Databricks skills are AI coding skills published by Databricks (Official Databricks skills for AI coding assistants working with the Databricks platform) and mirrored daily on skills-hub.ai. They are SKILL.md files that follow the open Agent Skills standard, so they work in Claude Code, Cursor, Codex CLI, Windsurf, Copilot, and any MCP-compatible tool.
How many Databricks skills are available?
skills-hub.ai indexes 29 skills from Databricks, synced daily from the upstream GitHub repository (https://github.com/databricks/databricks-agent-skills).
How do I install a Databricks skill?
Run `npx @skills-hub-ai/cli install <skill-slug>` in your project. The CLI writes the SKILL.md to the right directory for your AI tool and adds it to your `.skills.json` lockfile so your team gets the same skills at the same versions.
Are these official Databricks skills?
Yes. Every skill from this source is mirrored from Databricks's own GitHub repository (https://github.com/databricks/databricks-agent-skills). Each skill page links back to the upstream source of truth, so you can verify the original.