Name: Sensei
Author: Microsoft
Install
$ npx skills add https://github.com/microsoft/github-copilot-for-azure --skill azure-ai
Source file
SKILL.md · 230 lines · markdown
---
name: sensei
description: "**WORKFLOW SKILL** — Iteratively improve skill frontmatter compliance using the Ralph loop pattern. WHEN: \"run sensei\", \"sensei help\", \"improve skill\", \"fix frontmatter\", \"skill compliance\", \"frontmatter audit\", \"score skill\", \"check skill tokens\". INVOKES: token counting tools, test runners, git commands. FOR SINGLE OPERATIONS: use token CLI directly for counts/checks."
license: MIT
metadata:
  author: Microsoft
  version: "1.0.5"
---

# Sensei

> "A true master teaches not by telling, but by refining." - The Skill Sensei

Automates skill frontmatter improvement using the [Ralph loop pattern](https://github.com/soderlind/ralph) - iteratively improving skills until they reach Medium-High compliance with passing tests, then checking token usage and prompting for action.

## Help

When user says "sensei help" or asks how to use sensei, show this:

```
╔══════════════════════════════════════════════════════════════════╗
║  SENSEI - Skill Frontmatter Compliance Improver                  ║
╠══════════════════════════════════════════════════════════════════╣
║                                                                  ║
║  USAGE:                                                          ║
║    Run sensei on <skill-name>              # Single skill        ║
║    Run sensei on <skill-name> --skip-integration  # Fast mode    ║
║    Run sensei on <skill1>, <skill2>, ...   # Multiple skills     ║
║    Run sensei on all Low-adherence skills  # Batch by score      ║
║    Run sensei on all skills                # All skills          ║
║                                                                  ║
║  EXAMPLES:                                                       ║
║    Run sensei on appinsights-instrumentation                     ║
║    Run sensei on azure-security --skip-integration               ║
║    Run sensei on azure-security, azure-observability             ║
║    Run sensei on all Low-adherence skills                        ║
║                                                                  ║
║  WHAT IT DOES:                                                   ║
║    1. READ      - Load skill's SKILL.md, tests, and token count  ║
║    2. SCORE     - Check compliance (Low/Medium/Medium-High/High) ║
║    3. SCAFFOLD  - Create tests from template if missing          ║
║    4. IMPROVE   - Add WHEN: triggers (cross-model optimized)     ║
║    5. TEST      - Run tests, fix if needed                       ║
║    6. REFERENCES- Validate markdown links                        ║
║    7. TOKENS    - Check token budget, gather suggestions         ║
║    8. SUMMARY   - Show before/after with suggestions             ║
║    9. PROMPT    - Ask: Commit, Create Issue, or Skip?            ║
║   10. REPEAT    - Until Medium-High score + tests pass           ║
║                                                                  ║
║  TARGET SCORE: Medium-High                                       ║
║    ✓ Description > 150 chars, ≤ 60 words                         ║
║    ✓ Has "WHEN:" trigger phrases (preferred)                     ║
║    ✓ No "DO NOT USE FOR:" (unless disambiguation-critical)       ║
║    ✓ SKILL.md < 500 tokens (soft limit)                          ║
║                                                                  ║
║  MORE INFO:                                                      ║
║    See .github/skills/sensei/README.md for full documentation    ║
║                                                                  ║
╚══════════════════════════════════════════════════════════════════╝
```

## When to Use

- Improving a skill's frontmatter compliance score
- Adding trigger phrases and anti-triggers to skill descriptions
- Batch-improving multiple skills at once
- Auditing and fixing Low-adherence skills

## Invocation Modes

### Single Skill
```
Run sensei on azure-deploy
```

### Multiple Skills
```
Run sensei on azure-security, azure-observability
```

### By Adherence Level
```
Run sensei on all Low-adherence skills
```

### All Skills
```
Run sensei on all skills
```

### GEPA Mode (Deep Optimization)
```
Run sensei on my-skill --gepa
Run sensei on my-skill --gepa --skip-integration
Run sensei on all skills --gepa
```

When `--gepa` is used, Step 5 (IMPROVE) is replaced with GEPA evolutionary optimization.
Instead of template-based improvements, GEPA parses trigger prompt arrays from the existing
test harness and combines them with content quality heuristics to build a fitness function.
An LLM proposes and evaluates many candidate improvements automatically. Note: GEPA does not
execute Jest tests directly — it uses the test data (prompts) as evaluation inputs.

**GEPA score-only mode** (no LLM calls, just evaluate current quality):
```
Run sensei score my-skill
Run sensei score all skills
```

## The Ralph Loop

For each skill, execute this loop until score >= Medium-High AND tests pass:

1. **READ** - Load `plugin/skills/{skill-name}/SKILL.md`, tests, and token count
2. **SCORE** - Run spec-based compliance check (see [SCORING.md](references/SCORING.md)):
   - Validate `name` per [agentskills.io spec](https://agentskills.io/specification) (no `--`, no start/end `-`, lowercase alphanumeric)
   - Check description length and word count (≤60 words)
   - Check triggers (WHEN: preferred, USE FOR: accepted)
   - Warn on "DO NOT USE FOR:" (risky in multi-skill environments — **exception**: REQUIRED for skills that share trigger overlap with broader skills like `azure-prepare`)
   - Preserve optional spec fields (`license`, `metadata`, `allowed-tools`) if present
3. **CHECK** - If score >= Medium-High AND tests pass → go to TOKENS step
4. **SCAFFOLD** - If `tests/{skill-name}/` doesn't exist, create from `tests/_template/`
5. **IMPROVE FRONTMATTER** - Add WHEN: triggers (stay under 60 words and 1024 chars)
5b. **IMPROVE WITH GEPA** (when `--gepa` flag is set) — Replaces step 5 (IMPROVE FRONTMATTER) with automated optimization; step 6 (IMPROVE TESTS) still runs normally:
   - Auto-discovers `tests/{skill-name}/triggers.test.ts` and extracts prompt arrays
   - Builds a GEPA evaluator scoring content quality + trigger accuracy based on those trigger prompt arrays (not Jest test pass/fail results)
   - Runs `python .github/skills/sensei/scripts/gepa/auto_evaluator.py optimize --skill {skill-name} --skills-dir plugin/skills --tests-dir tests`
   - Shows diff of optimized SKILL.md for user approval
   - GEPA uses existing test trigger definitions as configuration — it does not execute, replace, or modify Jest tests
6. **IMPROVE TESTS** - Update `shouldTriggerPrompts` and `shouldNotTriggerPrompts` to match the finalized frontmatter (including any GEPA changes)
7. **VERIFY** - Run `cd tests && npm test -- --testPathPatterns={skill-name}`
8. **VALIDATE REFERENCES** - Run `cd scripts && npm run references {skill-name}` to check markdown links
9. **TOKENS** - Check token budget and line count (< 500 lines per spec), gather optimization suggestions
10. **SUMMARY** - Display before/after comparison with unimplemented suggestions
11. **PROMPT** - Ask user: Commit, Create Issue, or Skip?
12. **REPEAT** - Go to step 2 (max 5 iterations per skill)

## Scoring Criteria (Quick Reference)

Sensei validates skills against the [agentskills.io specification](https://agentskills.io/specification). See [SCORING.md](references/SCORING.md) for full details.

| Score | Requirements |
|-------|--------------|
| **Invalid** | Name fails spec validation (consecutive hyphens, start/end hyphen, uppercase, etc.) |
| **Low** | Basic description, no explicit triggers |
| **Medium** | Has trigger keywords/phrases, description > 150 chars, >60 words |
| **Medium-High** | Has "WHEN:" (preferred) or "USE FOR:" triggers, ≤60 words |
| **High** | Medium-High + compatibility field |

**Target: Medium-High** (distinctive triggers, concise description)

> ⚠️ "DO NOT USE FOR:" is **risky in multi-skill environments** (15+ overlapping skills) — causes keyword contamination on fast-pattern-matching models. Safe for small, isolated skill sets. Use positive routing with `WHEN:` for cross-model safety.
>
> **Exception — disambiguation-critical skills:** When a skill's `USE FOR` triggers directly overlap with a broader skill (e.g., `azure-prepare` owns "deploy to Azure"), `DO NOT USE FOR:` is **REQUIRED** to prevent the broader skill from capturing prompts that belong to the specialized skill. Removing it causes routing regressions. Integration tests validate this routing -- run them before removing any `DO NOT USE FOR:` clause.

**Strongly recommended** (reported as suggestions if missing):
- `license` — identifies the license applied to the skill
- `metadata.version` — tracks the skill version for consumers

## Frontmatter Template

Per the [agentskills.io spec](https://agentskills.io/specification), required and optional fields:

```yaml
---
name: skill-name
description: "[ACTION VERB] [UNIQUE_DOMAIN]. [One clarifying sentence]. WHEN: \"trigger 1\", \"trigger 2\", \"trigger 3\"."
license: MIT
metadata:
  version: "1.0"
# Other optional spec fields — preserve if already present:
# metadata.author: example-org
# allowed-tools: Bash(git:*) Read
---
```

> **IMPORTANT:** Use inline double-quoted strings for descriptions. Do NOT use `>-` folded scalars (incompatible with skills.sh). Do NOT use `|` literal blocks (preserves newlines). Keep total description under 1024 characters and ≤60 words.

> ⚠️ **"DO NOT USE FOR:" carries context-dependent risk.** In multi-skill environments (10+ skills with overlapping domains), anti-trigger clauses introduce the very keywords that cause wrong-skill activation on Claude Sonnet and fast-pattern-matching models ([evidence](https://gist.github.com/kvenkatrajan/52e6e77f5560ca30640490b4cc65d109)). For small, isolated skill sets (1-5 skills), the risk is low. When in doubt, use positive routing with `WHEN:` and distinctive quoted phrases.
>
> **Exception:** `DO NOT USE FOR:` is **REQUIRED** when a specialized skill's triggers overlap with a broader skill (e.g., `azure-hosted-copilot-sdk` vs. `azure-prepare` on "deploy to Azure"). Without the negative discriminator, the broader skill captures prompts that should route to the specialized one. Always run integration tests before removing a `DO NOT USE FOR:` clause.

## Test Scaffolding

When tests don't exist, scaffold from `tests/_template/`:

```bash
cp -r tests/_template tests/{skill-name}
```

Then update:
1. `SKILL_NAME` constant in all test files
2. `shouldTriggerPrompts` - 5+ prompts matching new frontmatter triggers
3. `shouldNotTriggerPrompts` - 5+ prompts matching anti-triggers

**Commit Messages:**
```
sensei: improve {skill-name} frontmatter
```

## Constraints

- Only modify `plugin/skills/` - these are the Azure skills used by Copilot
- `.github/skills/` contains meta-skills like sensei for developer tooling
- Max 5 iterations per skill before moving on
- Description must stay under 1024 characters
- SKILL.md should stay under 500 tokens (soft limit)
- Tests must pass before prompting for action
- User chooses: Commit, Create Issue, or Skip after each skill

## Flags

| Flag | Description |
|------|-------------|
| `--skip-integration` | Skip integration tests for faster iteration. Only runs unit and trigger tests. |
| `--gepa` | Use GEPA evolutionary optimization instead of template-based improvement. Auto-discovers tests and builds evaluator at runtime. |

> ⚠️ Skipping integration tests speeds up the loop but may miss runtime issues. Consider running full tests before final commit.

## Reference Documentation

- [SCORING.md](references/SCORING.md) - Detailed scoring criteria
- [LOOP.md](references/LOOP.md) - Ralph loop workflow details
- [EXAMPLES.md](references/EXAMPLES.md) - Before/after examples
- [TOKEN-INTEGRATION.md](references/TOKEN-INTEGRATION.md) - Token budget integration

## Related Skills

- [markdown-token-optimizer](/.github/skills/markdown-token-optimizer) - Token analysis and optimization
- [skill-authoring](/.github/skills/skill-authoring) - Skill writing guidelines
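
## Example: Scoring Sketch

The scoring quick-reference table and the SCORE step of the Ralph loop can be illustrated with a minimal Python sketch. This is not Sensei's actual scorer, which is documented in `references/SCORING.md`. The function name `score_frontmatter`, the `has_compatibility` parameter, and the rule that "explicit triggers" means a literal `WHEN:` or `USE FOR:` marker are all assumptions made for illustration. The thresholds (more than 150 characters, at most 60 words, under 1024 characters) come from the text above.

```python
import re

# Assumption: approximates the name rules listed in the SCORE step
# (lowercase alphanumeric, no leading/trailing hyphen, no "--").
NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

# Assumption: explicit triggers are detected by these literal markers.
TRIGGER_MARKERS = ("WHEN:", "USE FOR:")


def score_frontmatter(name, description, has_compatibility=False):
    """Return a rough adherence level for one skill's frontmatter.

    Hypothetical helper: a simplified reading of the quick-reference
    table, not the scorer that Sensei itself runs.
    """
    if not NAME_RE.fullmatch(name):
        return "Invalid"

    word_count = len(description.split())
    has_markers = any(marker in description for marker in TRIGGER_MARKERS)
    long_enough = len(description) > 150
    concise = word_count <= 60 and len(description) < 1024

    if has_markers and long_enough and concise:
        return "High" if has_compatibility else "Medium-High"
    if has_markers and long_enough:
        # Triggers are present, but the description is too wordy or too long.
        return "Medium"
    return "Low"


print(score_frontmatter("my--skill", "Helps with Azure."))  # Invalid
print(score_frontmatter("sensei", "Helps with Azure."))     # Low
```

A loop driver built on this would re-score after each frontmatter edit and stop once the result is `Medium-High` or better and the tests pass, or after five iterations, as the Constraints section requires.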