Install
Terminal · npx
$ npx skills add https://github.com/charon-fan/agent-playbook --skill self-improving-agent
Works with Paperclip
How Self Improving Agent fits into a Paperclip company.

Self Improving Agent drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.
Source file
SKILL.md · 408 lines · markdown
---
name: self-improving-agent
description: A universal self-improving agent that learns from ALL skill experiences. Uses multi-memory architecture (semantic + episodic + working) to continuously evolve the codebase. Auto-triggers on skill completion/error with hooks-based self-correction.
allowed-tools: Read, Write, Edit, Bash, Grep, Glob, WebSearch
metadata:
  hooks:
    before_start:
      - trigger: session-logger
        mode: auto
        context: "Start {skill_name}"
    after_complete:
      - trigger: create-pr
        mode: ask_first
        condition: skills_modified
        reason: "Submit improvements to repository"
      - trigger: session-logger
        mode: auto
        context: "Self-improvement cycle complete"
    # Note: on_error intentionally only logs to session to avoid infinite recursion
    # Self-correction is triggered by other skills (debugger, code-reviewer) completing their work
    on_error:
      - trigger: session-logger
        mode: auto
        context: "Error captured in {skill_name}"
---

# Self-Improving Agent

> "An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." (Based on 2025 lifelong learning research)

## Overview

This is a **universal self-improvement system** that learns from ALL skill experiences, not just PRDs. It implements a complete feedback loop with:

- **Multi-Memory Architecture**: Semantic + Episodic + Working memory
- **Self-Correction**: Detects and fixes skill guidance errors
- **Self-Validation**: Periodically verifies skill accuracy
- **Hooks Integration**: Auto-triggers on skill events (before_start, after_complete, on_error)
- **Evolution Markers**: Traceable changes with source attribution

## Research-Based Design

Based on 2025 research:

| Research | Key Insight | Application |
|----------|-------------|-------------|
| [SimpleMem](https://arxiv.org/html/2601.02553v1) | Efficient lifelong memory | Pattern accumulation system |
| [Multi-Memory Survey](https://dl.acm.org/doi/10.1145/3748302) | Semantic + Episodic memory | World knowledge + experiences |
| [Lifelong Learning](https://arxiv.org/html/2501.07278v1) | Continuous task stream learning | Learn from every skill use |
| [Evo-Memory](https://shothota.medium.com/evo-memory-deepminds-new-benchmark) | Test-time lifelong learning | Real-time adaptation |

## The Self-Improvement Loop

```
┌─────────────────────────────────────────────────────────────────┐
│                    UNIVERSAL SELF-IMPROVEMENT                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Skill Event → Extract Experience → Abstract Pattern → Update  │
│        │                  │                │         │          │
│        ▼                  ▼                ▼         ▼          │
│   ┌─────────────────────────────────────────────────────┐       │
│   │              MULTI-MEMORY SYSTEM                      │       │
│   ├─────────────────────────────────────────────────────┤       │
│   │  Semantic Memory   │  Episodic Memory  │ Working Memory │  │
│   │  (Patterns/Rules)  │  (Experiences)    │  (Current)     │  │
│   │  memory/semantic/  │  memory/episodic/ │  memory/working/│  │
│   └─────────────────────────────────────────────────────┘       │
│                                                                 │
│   ┌─────────────────────────────────────────────────────┐       │
│   │              FEEDBACK LOOP                            │       │
│   │  User Feedback → Confidence Update → Pattern Adapt   │       │
│   └─────────────────────────────────────────────────────┘       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

## When This Activates

### Automatic Triggers (via hooks)

| Event | Trigger | Action |
|-------|---------|--------|
| **before_start** | Any skill starts | Log session start |
| **after_complete** | Any skill completes | Extract patterns, update skills |
| **on_error** | Bash returns non-zero exit | Capture error context, trigger self-correction |

### Manual Triggers

- User says "自我进化" ("self-evolve"), "self-improve", or "从经验中学习" ("learn from experience")
- User says "分析今天的经验" ("analyze today's experience") or "总结教训" ("summarize the lessons learned")
- User asks to improve a specific skill

## Evolution Priority Matrix

Trigger evolution when new reusable knowledge appears:

| Trigger | Target Skill | Priority | Action |
|---------|--------------|----------|--------|
| New PRD pattern discovered | prd-planner | High | Add to quality checklist |
| Architecture tradeoff clarified | architecting-solutions | High | Add to decision patterns |
| API design rule learned | api-designer | High | Update template |
| Debugging fix discovered | debugger | High | Add to anti-patterns |
| Review checklist gap | code-reviewer | High | Add checklist item |
| Perf/security insight | performance-engineer, security-auditor | High | Add to patterns |
| UI/UX spec issue | prd-planner, architecting-solutions | High | Add visual spec requirements |
| React/state pattern | debugger, refactoring-specialist | Medium | Add to patterns |
| Test strategy improvement | test-automator, qa-expert | Medium | Update approach |
| CI/deploy fix | deployment-engineer | Medium | Add to troubleshooting |

## Multi-Memory Architecture

### 1. Semantic Memory (`memory/semantic-patterns.json`)

Stores **abstract patterns and rules** reusable across contexts:

```json
{
  "patterns": {
    "pattern_id": {
      "id": "pat-2025-01-11-001",
      "name": "Pattern Name",
      "source": "user_feedback|implementation_review|retrospective",
      "confidence": 0.95,
      "applications": 5,
      "created": "2025-01-11",
      "category": "prd_structure|react_patterns|async_patterns|...",
      "pattern": "One-line summary",
      "problem": "What problem does this solve?",
      "solution": { ... },
      "quality_rules": [ ... ],
      "target_skills": [ ... ]
    }
  }
}
```

### 2. Episodic Memory (`memory/episodic/`)

Stores **specific experiences and what happened**:

```
memory/episodic/
├── 2025/
│   ├── 2025-01-11-prd-creation.json
│   ├── 2025-01-11-debug-session.json
│   └── 2025-01-12-refactoring.json
```

```json
{
  "id": "ep-2025-01-11-001",
  "timestamp": "2025-01-11T10:30:00Z",
  "skill": "debugger",
  "situation": "User reported data not refreshing after form submission",
  "root_cause": "Empty callback in onRefresh prop",
  "solution": "Implement actual refresh logic in callback",
  "lesson": "Always verify callbacks are not empty functions",
  "related_pattern": "callback_verification",
  "user_feedback": {
    "rating": 8,
    "comments": "This was exactly the issue"
  }
}
```

### 3. Working Memory (`memory/working/`)

Stores **current session context**:

```
memory/working/
├── current_session.json   # Active session data
├── last_error.json        # Error context for self-correction
└── session_end.json       # Session end marker
```

## Self-Improvement Process

### Phase 1: Experience Extraction

After any skill completes, extract:

```yaml
What happened:
  skill_used: {which skill}
  task: {what was being done}
  outcome: {success|partial|failure}

Key Insights:
  what_went_well: [what worked]
  what_went_wrong: [what didn't work]
  root_cause: {underlying issue if applicable}

User Feedback:
  rating: {1-10 if provided}
  comments: {specific feedback}
```

### Phase 2: Pattern Abstraction

Convert experiences to reusable patterns:

| Concrete Experience | Abstract Pattern | Target Skill |
|--------------------|------------------|--------------|
| "User forgot to save PRD notes" | "Always persist thinking to files" | prd-planner |
| "Code review missed SQL injection" | "Add security checklist item" | code-reviewer |
| "Callback was empty, didn't work" | "Verify callback implementations" | debugger |
| "Net APY position ambiguous" | "UI specs need exact relative positions" | prd-planner |

**Abstraction Rules:**

```yaml
If experience_repeats 3+ times:
  pattern_level: critical
  action: Add to skill's "Critical Mistakes" section

If solution_was_effective:
  pattern_level: best_practice
  action: Add to skill's "Best Practices" section

If user_rating >= 7:
  pattern_level: strength
  action: Reinforce this approach

If user_rating <= 4:
  pattern_level: weakness
  action: Add to "What to Avoid" section
```

### Phase 3: Skill Updates

Update the appropriate skill files with **evolution markers**:

```markdown
<!-- Evolution: 2025-01-12 | source: ep-2025-01-12-001 | skill: debugger -->

## Pattern Added (2025-01-12)

**Pattern**: Always verify callbacks are not empty functions

**Source**: Episode ep-2025-01-12-001

**Confidence**: 0.95

### Updated Checklist
- [ ] Verify all callbacks have implementations
- [ ] Test callback execution paths
```

**Correction Markers** (when fixing wrong guidance):

````markdown
<!-- Correction: 2025-01-12 | was: "Use callback chain" | reason: caused stale refresh -->

## Corrected Guidance

Use direct state monitoring instead of callback chains:
```typescript
// ✅ Do: Direct state monitoring
const prevPendingCount = usePrevious(pendingCount);
```
````

### Phase 4: Memory Consolidation

1. **Update semantic memory** (`memory/semantic-patterns.json`)
2. **Store episodic memory** (`memory/episodic/YYYY-MM-DD-{skill}.json`)
3. **Update pattern confidence** based on applications/feedback
4. **Prune outdated patterns** (low confidence, no recent applications)

## Self-Correction (on_error hook)

Triggered when:
- Bash command returns non-zero exit code
- Tests fail after following skill guidance
- User reports the guidance produced incorrect results

**Process:**

```markdown
## Self-Correction Workflow

1. Detect Error
   - Capture error context from working/last_error.json
   - Identify which skill guidance was followed

2. Verify Root Cause
   - Was the skill guidance incorrect?
   - Was the guidance misinterpreted?
   - Was the guidance incomplete?

3. Apply Correction
   - Update skill file with corrected guidance
   - Add correction marker with reason
   - Update related patterns in semantic memory

4. Validate Fix
   - Test the corrected guidance
   - Ask user to verify
```

**Example:**

```markdown
<!-- Correction: 2025-01-12 | was: "useMemo for claimable ids" | reason: stale data at click time -->

## Self-Correction: Click-Time Computation

**Issue**: Using useMemo for claimable IDs caused stale data
**Fix**: Compute at click time for always-fresh data
**Pattern**: click_time_vs_open_time_computation
```

## Self-Validation

Use the validation template in `references/appendix.md` when reviewing updates.

## Hooks Integration

### Wiring Hooks in Claude Code Settings

Add to Claude Code settings (`~/.claude/settings.json`):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\""
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\""
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh"
          }
        ]
      }
    ]
  }
}
```

Replace `${SKILLS_DIR}` with your actual skills path.

## Additional References

See `references/appendix.md` for memory structure, workflow diagrams, metrics, feedback templates, and research links.

## Best Practices

### DO

- ✅ Learn from EVERY skill interaction
- ✅ Extract patterns at the right abstraction level
- ✅ Update multiple related skills
- ✅ Track confidence and application counts
- ✅ Ask for user feedback on improvements
- ✅ Use evolution/correction markers for traceability
- ✅ Validate guidance before applying broadly

### DON'T

- ❌ Over-generalize from single experiences
- ❌ Update skills without confidence tracking
- ❌ Ignore negative feedback
- ❌ Make changes that break existing functionality
- ❌ Create contradictory patterns
- ❌ Update skills without understanding context

## Quick Start

After any skill completes, this agent automatically:

1. **Analyzes** what happened
2. **Extracts** patterns and insights
3. **Updates** relevant skill files
4. **Logs** to memory for future reference
5. **Reports** summary to user

## References

- [SimpleMem: Efficient Lifelong Memory for LLM Agents](https://arxiv.org/html/2601.02553v1)
- [A Survey on the Memory Mechanism of Large Language Model Agents](https://dl.acm.org/doi/10.1145/3748302)
- [Lifelong Learning of LLM based Agents](https://arxiv.org/html/2501.07278v1)
- [Evo-Memory: DeepMind's Benchmark](https://shothota.medium.com/evo-memory-deepminds-new-benchmark)
- [Let's Build a Self-Improving AI Agent](https://medium.com/@nomannayeem/lets-build-a-self-improving-ai-agent-that-learns-from-your-feedback-722d2ce9c2d9)
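
The abstraction rules listed under Phase 2 (Pattern Abstraction) can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the `Experience` dataclass, its field names, and the `classify_experience` function are hypothetical, and the sketch assumes the four rules are evaluated independently, so a single experience may match more than one of them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Experience:
    """Hypothetical summary of one extracted experience (illustrative only)."""
    repeat_count: int = 1              # how many times this experience has occurred
    solution_effective: bool = False   # whether the applied solution worked
    user_rating: Optional[int] = None  # 1-10, or None if the user gave no rating


def classify_experience(exp: Experience) -> List[Tuple[str, str]]:
    """Return a (pattern_level, action) pair for every Phase 2 rule that matches."""
    matches: List[Tuple[str, str]] = []

    # Rule 1: an experience that repeats 3+ times is critical.
    if exp.repeat_count >= 3:
        matches.append(("critical", 'Add to skill\'s "Critical Mistakes" section'))

    # Rule 2: an effective solution becomes a best practice.
    if exp.solution_effective:
        matches.append(("best_practice", 'Add to skill\'s "Best Practices" section'))

    # Rules 3 and 4 only apply when the user actually rated the outcome.
    if exp.user_rating is not None:
        if exp.user_rating >= 7:
            matches.append(("strength", "Reinforce this approach"))
        elif exp.user_rating <= 4:
            matches.append(("weakness", 'Add to "What to Avoid" section'))

    return matches


if __name__ == "__main__":
    example = Experience(repeat_count=3, solution_effective=True, user_rating=8)
    for level, action in classify_experience(example):
        print(f"{level}: {action}")
```

Ratings of 5 or 6 match neither rating rule, which mirrors the gap between the `>= 7` and `<= 4` thresholds in the rules as written. Whether the real agent stops at the first matching rule or applies all of them is not stated in the source.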