The AI Builder's Guide to Building Skills for Claude Code
This guide covers building custom SKILL.md files from first principles to multi-skill architecture, with essential security patterns.

Claude Code didn't just enter the AI coding market; it took it over. $2.5 billion ARR in under a year. Eighteen percent of developers use it at work, up 6x from early 2025 (JetBrains, 2026). Four percent of all public GitHub commits now flow through it. And the thing that makes Claude Code more than a smart autocomplete? Skills.
Skills are how you teach Claude your workflow, your conventions, your judgment. They're markdown files that turn a general-purpose AI into a specialized teammate. And because SKILL.md is now an open standard adopted by 27+ tools, from GitHub Copilot to Gemini CLI to Cursor, a skill you write today runs across the entire AI tooling ecosystem.
Most guides show you the syntax and stop there. This one covers what actually matters: how Claude reads your instructions, where skills break, how to secure them, and how to compose them into systems.
Key Takeaways
Claude Code commands the #1 spot by revenue in a $7B+ AI coding market, with 91% user satisfaction (JetBrains, 2026)
SKILL.md is an open standard adopted by 27+ AI tools: write once, and your skill runs on Copilot, Gemini CLI, Cursor, and others
36.8% of published skills contain at least one security vulnerability (Snyk ToxicSkills, 2026); allowed-tools is your first line of defense
Good skills use progressive disclosure: metadata at startup, full content on activation, references and scripts only when needed
Why Build Skills for Claude Code?
The AI coding market hit an estimated $5.9 billion to $8.1 billion in 2025 and is projected to more than double by 2030 (MarketsandMarkets, 2025). Three tools (Claude Code, Cursor, and GitHub Copilot) command roughly 70-80% of that revenue, pulling in a combined $7-8.5 billion ARR as of Q1 2026 (AgentMarketCap, April 2026).
Claude Code isn't just participating. It's the largest AI coding agent by revenue and the fastest-growing standalone coding product ever shipped. And it didn't win on features: Copilot shipped years earlier, and Cursor built a whole IDE. Claude Code won on customizability. Skills are the mechanism.
Here's the part most tutorials miss: SKILL.md isn't a Claude Code proprietary format. It's an open standard. Anthropic open-sourced the specification, and 27+ AI tools now support it: GitHub Copilot, OpenAI Codex, Google Gemini CLI, Cursor, Windsurf, and more. The reference implementation on GitHub has 75,000+ stars (AgentPatterns.ai, 2026). You're not writing a Claude Code plugin. You're writing a portable AI capability.
So why does this matter for you as a builder? Because the skills ecosystem is exploding: 283,000+ skills indexed by SkillsMP as of February 2026, over 400,000 across all sources (denser.ai, April 2026), and most of them are mediocre. The gap between a prompt someone tossed into a file and a well-architected skill is the gap between a tool that frustrates and a tool that feels like it reads your mind. This guide is about closing that gap.
According to a 2026 JetBrains survey of 10,000+ developers, Claude Code reached 18% workplace adoption with the highest satisfaction scores in the industry: a 91% customer satisfaction rate and an NPS of 54, which the survey classifies as "excellent" (JetBrains Research, April 2026). Skills are the customization layer that turns high satisfaction into deep integration with how your team actually works.
The Anatomy of a SKILL.md File
Every skill starts as a single markdown file with YAML frontmatter. Claude Code reads the frontmatter at session startup (the name and description determine when your skill activates) and loads the full body and any referenced files only when the skill matches the current task.
Here's the minimal skeleton:
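As a sketch (the name and body text here are illustrative; name and description are the two required fields):

```markdown
---
name: code-reviewer
description: Use when the user asks to review code, review a PR, or check a diff for problems.
---

# Code Reviewer

The instructions Claude follows once this skill activates go here.
```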
The frontmatter fields break down like this:

| Field | Required | What it does |
|---|---|---|
| name | Yes | The skill's identifier; shown when the skill activates ("Using skill: code-reviewer") |
| description | Yes | The trigger text Claude matches against the current task at session start |
| allowed-tools | No | A whitelist of the tools the skill may use; your primary security control |
| model | No | Pins a specific model (e.g. opus) for the skill's work |
Our finding: Skills with descriptions under 30 words that name specific triggers ("when reviewing PRs", "before committing") activate far more reliably than those with generic descriptions. Claude's skill-matching is keyword-and-context based, so treat your description like search engine optimization for an audience of one very literal reader.
Progressive Disclosure: How Claude Reads Skills
Claude Code doesn't load your entire skill folder into memory at startup. It uses a three-tier progressive disclosure system, and understanding these tiers is the difference between a skill that gets used and one that gets ignored.
Tier 1: Metadata at session start. Claude loads the name and description from every skill in .claude/skills/. That's it. A few dozen words per skill. If your description doesn't match the task, Claude never reads another word of your skill file.
Tier 2: Full markdown on activation. When Claude determines a skill matches the current task, it loads the entire SKILL.md body: instructions, examples, rules, decision frameworks. This is your main canvas. Every word here costs context budget, and Claude's context window is finite. A skill body over 500 words starts competing with the user's actual code for attention.
Tier 3: References and scripts on demand. Files in a references/ or scripts/ subdirectory within your skill folder only load when Claude explicitly reads them via a tool call. Put reference material, long examples, and executable scripts here, not in the main SKILL.md body.
This structure exists because context is the scarcest resource in AI-assisted development. Claude Code juggles your codebase, conversation history, system instructions, and skill content in a single context window. A bloated skill body doesn't just waste tokens; it pushes out the code Claude is supposed to be working on.
The rule of thumb: everything in Tier 2 should be instruction. Everything in Tier 3 should be reference. If you find yourself pasting a 200-line code example into your SKILL.md body, move it to references/examples.md and add a note: "See references/examples.md for the full implementation."
The SKILL.md standard's three-tier loading model mirrors how the best agent platforms handle instruction density: compact at invocation, detailed on activation, expansive on demand. When you structure a skill this way, it activates faster, uses less context, and leaves more room for the work that actually matters.
Where Skills Live: The Directory Structure
Skills live in the .claude/skills/ directory, and where that directory sits determines who can use the skill:
.claude/
└── skills/
    ├── code-reviewer/
    │   └── SKILL.md
    └── commit-writer/
        ├── references/
        │   └── examples.md
        ├── scripts/
        │   └── format-commit.sh
        └── SKILL.md
Project-level skills (.claude/skills/ in your repo root) are shared with everyone who clones the project. Use these for team conventions, project-specific workflows, and domain rules. When a new developer clones the repo, they get your skills automatically.
User-level skills (~/.claude/skills/) are personal. They follow you across projects. Use them for your individual preferences: how you like commit messages formatted, your preferred testing workflow, your personal code style rules.
Plugin skills come from third-party packages and live in the plugin cache. They're read-only from your perspective, but understanding their structure helps you debug conflicts.
The decision framework is simple: if the rule should apply to everyone working on this codebase, put it in project-level. If it's your personal taste, keep it in user-level. Skills in both locations activate simultaneously, so don't create two skills with the same trigger condition unless you enjoy debugging phantom behavior.
Writing Your First Skill: A Step-by-Step Walkthrough
Let's build something real: a commit message writer that follows the conventional commits spec. Not because the world needs another commit writer skill, but because it's the right size to demonstrate every concept that matters.
I've built about a dozen skills for my own workflow. The pattern I keep re-learning: start with the trigger, then write the behavior, then add the guardrails. Every time I started with the guardrails first, I over-engineered a skill that never activated reliably.
Step 1: Name the skill and write the description
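A sketch of that frontmatter (the wording is illustrative, packed with the literal phrases users type):

```markdown
---
name: commit-writer
description: Use when the user asks to write a commit message, generate a commit, or format staged changes. Follows the conventional commits spec.
allowed-tools: Read, Bash(git diff:*)
---
```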
Notice the description names specific trigger phrases: "write a commit message", "generate a commit", "format staged changes." This is intentional. Claude matches skills against what the user actually types.
Step 2: Write the instruction body
Keep it under 300 words. State the goal, the steps, and the rules:
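Here's a sketch in that shape; the rules shown are illustrative, not the full conventional commits spec:

```markdown
# Commit Writer

Write a conventional commit message for the currently staged changes.

Steps:
1. Run `git diff --staged` to see what changed.
2. Pick the type: feat, fix, docs, refactor, test, or chore.
3. Write a subject line under 72 characters: `type(scope): summary`.
4. Add a body only when the diff needs explanation.

Rules:
- Use imperative mood ("add", not "added").
- Never run `git commit` yourself; show the message and stop.
```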
Step 3: Test and iterate
Stage a small change, then ask Claude to write a commit message. Watch what happens:
Does Claude activate the skill? (It should say "Using skill: commit-writer")
Does it follow the format? (Check the output against conventional commits)
Does it stop at the right boundary? (It should NOT auto-commit)
If activation fails, your description is wrong: add more trigger phrases. If the output is wrong, your instructions are ambiguous: add specifics. Debug the activation problem first, then the quality problem. A perfectly written skill that never triggers is worse than an imperfect one that runs.
Step 4: Extract to references
Once the instructions feel too long, move examples and edge cases to references/. Your SKILL.md body should be the shortest version that reliably produces correct results. Everything else is reference material.
The iteration loop for every skill I maintain: activate, observe, tighten, repeat. Each pass removes words rather than adding them.
Security and Guardrails
Here's the stat that should keep you up at night: a Snyk audit of the public skills ecosystem found that 36.8% of published skills contain at least one security vulnerability, and 341 malicious skills were identified in the wild as of February 2026 (Snyk ToxicSkills, 2026).
Skills run with access to your filesystem, your shell, and your network. A malicious or poorly-written skill can read secrets, modify code, exfiltrate data, or run arbitrary commands. The allowed-tools field isn't documentation; it's your primary security control.
The allowed-tools whitelist
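A few example whitelists, from narrowest to broadest (the skill pairings are illustrative):

```yaml
# Commit writer: read files, run only git diff
allowed-tools: Read, Bash(git diff:*)

# Code reviewer: read-only, no shell at all
allowed-tools: Read, Grep, Glob

# Avoid this: unrestricted shell access
# allowed-tools: Bash(*)
```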
Start with the narrowest whitelist and expand only when something breaks. A commit-writer skill needs Bash(git diff:*) and Read, not Bash(*). A code reviewer needs Read, Grep, and Glob; it shouldn't need shell access at all.
Source review checklist for third-party skills
Before installing a skill someone else wrote, answer three questions:
Does the allowed-tools list match what the skill claims to do? If a "syntax highlighter" skill requests Bash, something is wrong.
Are there hardcoded URLs, IP addresses, or data exfiltration paths in the body or scripts? Grep for curl, wget, http, and anything that looks like a webhook.
Does the skill reference external scripts or dependencies? Follow every source, ., or import; skills can execute arbitrary code through script references.
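The second check above can be partly automated. A minimal sketch, assuming GNU grep; the pattern is a starting point, not a complete detector, and the skill path in the usage comment is hypothetical:

```shell
# Sketch of a pre-install scan for third-party skills.
# Flags common network-call patterns before you trust a skill folder.
scan_skill() {
  dir="$1"
  # -r recurse, -E extended regex, -n show line numbers
  if grep -rEn '\b(curl|wget|nc)\b|https?://' "$dir"; then
    echo "REVIEW: inspect the matches above before installing"
  else
    echo "OK: no obvious network patterns (still read the SKILL.md yourself)"
  fi
}

# Usage: scan_skill .claude/skills/some-third-party-skill
```

This catches the obvious exfiltration paths; it won't catch obfuscated payloads, which is why the manual read of every referenced script still matters.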
Model overrides deserve a mention too. A skill that pins model: opus for a task Sonnet handles fine is burning money, not adding value. Skills that run frequently (commit hooks, lint checks, quick reviews) should target the fastest model that does the job. Reserve Opus for architecture decisions and complex refactors.
The 36.8% vulnerability rate isn't a reason to avoid skills. It's a reason to be deliberate about them. A skill you write yourself, with a tight allowed-tools whitelist and instructions you've tested, is safer than copying a Stack Overflow answer into your terminal. And it's definitely safer than asking a general-purpose AI to do the same task with unrestricted tool access every single time.
Multi-Skill Architecture Patterns
Once you've written three or four skills, a new problem emerges: they start stepping on each other. Two skills claim to handle code review. A third triggers on every PR-related task. Your commit-writer activates when someone asks about commit history documentation.
Composition is harder than creation. Here are the patterns that work.
One skill, one trigger condition
The single biggest mistake in skill architecture is the "Swiss Army knife" skill: one SKILL.md that handles code review, commit writing, PR description generation, and release note drafting. It activates constantly, consumes enormous context, and does everything mediocrely.
Instead: one skill per distinct trigger condition. code-reviewer activates on "review this." commit-writer activates on "write a commit." pr-description creates PR descriptions. Each is small, focused, and cheap to load. Claude activates the right one based on the task.
Skill-to-skill handoff
When skills do need to chain, use explicit handoff language:
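One illustrative shape for that language inside a SKILL.md body:

```markdown
## After writing the commit message

If the staged changes touch public APIs or security-sensitive code, suggest:
"Consider running the code-reviewer skill on these changes before committing."
Do not invoke other skills yourself. Show the suggestion and stop.
```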
Don't try to make skills call each other automatically. That path leads to infinite loops and context explosions. A skill should do one thing and suggest the next step; let the user drive the transitions.
Subagent skills
Sometimes a task is complex enough that it needs its own agent. Claude Code supports this through agent-type skills:
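A sketch of what this might look like. The exact frontmatter key for spawning a skill as a subagent varies by Claude Code version, so treat the context field below as illustrative and check the current docs:

```markdown
---
name: deep-code-reviewer
description: Use for full reviews of large diffs, entire modules, or migration plans.
allowed-tools: Read, Grep, Glob
# Illustrative: the key that marks a skill as a subagent in your version
context: subagent
---
```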
An agent-type skill spawns as a subagent with its own context window, which means it can handle large reviews without competing for space in the main conversation. Use this pattern when a skill needs to process a lot of information independently: code reviews, test generation, migration analysis. Don't use it for quick checks or simple transformations; the subagent spawn overhead isn't worth it.
Architecture rule of thumb
If you can describe a skill's job in one sentence and that sentence doesn't include the word "and," the scope is right. "Writes conventional commit messages from staged diffs" is correct. "Writes commit messages and reviews code and generates PR descriptions and updates changelogs" means you need four skills.
When NOT to Build a Skill
Skills aren't free. Every skill you add increases startup latency, consumes context budget, and creates another thing to maintain when your workflow changes. A skill is worth building when the task is repeated, structured, and benefits from consistency.
Here's the decision framework I use.
Build a skill when:
| Signal | Example |
|---|---|
| You've done the same task 3+ times and each time you had to re-explain the rules | "No, use conventional commits with the ticket number" |
| The task has a clear input → output shape that instructions can capture | Diff in, formatted commit message out |
| Getting it wrong has real consequences | Security reviews, database migrations, deployment commands |
| You want the same behavior across multiple projects | User-level skill |
Skip the skill when:
| Signal | Better tool |
|---|---|
| It's a one-time task or happens once a month | Just tell Claude what you want in the moment |
| It's a standing rule, not a triggered workflow | Put it in CLAUDE.md |
| It's an automated trigger (on save, on push, etc.) | Use hooks (settings.json) |
| You're not sure what good output looks like yet | Do it manually a few times first, then codify |
The 3x rule: don't build a skill for something you've done fewer than three times. You don't know the edge cases yet. You'll build the wrong abstraction, and wrong abstractions are worse than no abstractions because they're actively misleading.
Skills are the middle layer in Claude Code's customization stack. CLAUDE.md handles ambient context ("we use TypeScript strict mode, we prefer functions over classes"). Hooks handle event-driven automation ("run the linter before every commit"). Skills handle triggered expertise ("when I ask for a code review, here's exactly how to do it"). Pick the right layer and you get reliable behavior. Pick the wrong one and you get a skill that activates when it shouldn't, or a CLAUDE.md that's too long to be useful.
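For contrast, the hooks layer is configured in settings.json rather than in a skill. A sketch (event and matcher names follow Claude Code's hooks format; the lint command is illustrative):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [{ "type": "command", "command": "npx eslint --fix" }]
      }
    ]
  }
}
```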
Frequently Asked Questions
Can I use skills written for other AI tools in Claude Code?
Yes. Since SKILL.md is an open standard, most skills written for GitHub Copilot, Cursor, Gemini CLI, or Windsurf work in Claude Code with minimal changes. The primary difference is the allowed-tools field: Claude Code uses tool-specific names like Bash, Read, and Grep, while other platforms may name them differently. Check the tool names and adjust.
How many skills should I have active at once?
There's no hard limit, but context is finite. Each skill's metadata (~30-50 words) loads at startup. With 10-15 well-scoped skills, the overhead is negligible. Beyond 30, activation accuracy starts degrading because too many descriptions compete for the same trigger conditions. If you're collecting skills like trading cards, you're doing it wrong.
Do skills work with Claude's API, or only in the CLI?
Skills are primarily a Claude Code CLI feature. The Claude API uses system prompts and tool definitions instead. However, the SKILL.md open standard means the same instructional content works across platforms; if you move from the CLI to the API, your skill's instructions translate directly into system prompt content.
What's the difference between a skill and a custom slash command?
A slash command (/review, /deploy) is a user-triggered shortcut. A skill activates automatically based on task context. Slash commands are great for explicit workflows you run deliberately. Skills are better for behaviors you want Claude to adopt without being asked. The two can coexist: a skill handles automatic activation, and a slash command provides an explicit trigger for edge cases.
Do I need to know how to code to build a skill?
No. SKILL.md files are plain markdown with YAML frontmatter. If you can write clear instructions in English, you can build a skill. The advanced patterns (subagent configuration, reference scripts, multi-skill architecture) benefit from engineering experience, but a basic skill that captures your preferences for a repeated task requires nothing more than knowing what you want Claude to do and communicating it clearly.
Conclusion
Skills are the difference between using Claude Code and building on it. A well-crafted skill captures your expertise in a form Claude can apply consistently, across projects, and across the 27+ tools that now speak SKILL.md.
The patterns that separate good skills from the 400,000+ mediocre ones are straightforward: specific triggers, tight scopes, progressive disclosure, allowed-tools whitelists, and ruthless editing. A 200-word skill that activates reliably and produces correct output is better than a 2,000-word skill that never loads.
Start with the commit-writer pattern from this guide. Observe how it activates. Tighten the description. Move examples to references. Then build your second skill something specific to your stack, your conventions, your pain points. By the third one, the patterns will feel obvious.
The AI coding market is consolidating around platforms that support customization. Skills are the customization primitive. The builders who invest in them now are the ones whose tools feel like extensions of their brain six months from now.


