Name: Agent Governance
Author: GitHub
Install
Terminal · npx
$ npx skills add https://github.com/github/awesome-copilot --skill agent-governance
Works with Paperclip
How Agent Governance fits into a Paperclip company.

Agent Governance drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.
Source file
SKILL.md · 569 lines · markdown
---
name: agent-governance
description: |
  Patterns and techniques for adding governance, safety, and trust controls to AI agent systems. Use this skill when:
  - Building AI agents that call external tools (APIs, databases, file systems)
  - Implementing policy-based access controls for agent tool usage
  - Adding semantic intent classification to detect dangerous prompts
  - Creating trust scoring systems for multi-agent workflows
  - Building audit trails for agent actions and decisions
  - Enforcing rate limits, content filters, or tool restrictions on agents
  - Working with any agent framework (PydanticAI, CrewAI, OpenAI Agents, LangChain, AutoGen)
---

# Agent Governance Patterns

Patterns for adding safety, trust, and policy enforcement to AI agent systems.

## Overview

Governance patterns keep AI agents within defined boundaries: they control which tools an agent can call, what content it can process, and how much it can do, and they maintain accountability through audit trails.

```
User Request → Intent Classification → Policy Check → Tool Execution → Audit Log
                     ↓                      ↓               ↓
              Threat Detection         Allow/Deny      Trust Update
```

## When to Use

- **Agents with tool access**: Any agent that calls external tools (APIs, databases, shell commands)
- **Multi-agent systems**: Agents delegating to other agents need trust boundaries
- **Production deployments**: Compliance, audit, and safety requirements
- **Sensitive operations**: Financial transactions, data access, infrastructure management

---

## Pattern 1: Governance Policy

Define what an agent is allowed to do as a composable, serializable policy object.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional
import re

class PolicyAction(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REVIEW = "review"  # flag for human review

@dataclass
class GovernancePolicy:
    """Declarative policy controlling agent behavior."""
    name: str
    allowed_tools: list[str] = field(default_factory=list)       # allowlist (empty = no allowlist)
    blocked_tools: list[str] = field(default_factory=list)       # blocklist
    blocked_patterns: list[str] = field(default_factory=list)    # content filters
    max_calls_per_request: int = 100                             # rate limit
    require_human_approval: list[str] = field(default_factory=list)  # tools needing approval

    def check_tool(self, tool_name: str) -> PolicyAction:
        """Check if a tool is allowed by this policy."""
        if tool_name in self.blocked_tools:
            return PolicyAction.DENY
        if tool_name in self.require_human_approval:
            return PolicyAction.REVIEW
        if self.allowed_tools and tool_name not in self.allowed_tools:
            return PolicyAction.DENY
        return PolicyAction.ALLOW

    def check_content(self, content: str) -> Optional[str]:
        """Check content against blocked patterns. Returns matched pattern or None."""
        for pattern in self.blocked_patterns:
            if re.search(pattern, content, re.IGNORECASE):
                return pattern
        return None
```

### Policy Composition

Combine multiple policies (e.g., org-wide + team + agent-specific):

```python
def compose_policies(*policies: GovernancePolicy) -> GovernancePolicy:
    """Merge policies with most-restrictive-wins semantics."""
    combined = GovernancePolicy(name="composed")

    for policy in policies:
        combined.blocked_tools.extend(policy.blocked_tools)
        combined.blocked_patterns.extend(policy.blocked_patterns)
        combined.require_human_approval.extend(policy.require_human_approval)
        combined.max_calls_per_request = min(
            combined.max_calls_per_request,
            policy.max_calls_per_request
        )
        if policy.allowed_tools:
            if combined.allowed_tools:
                # Intersect allowlists: a tool must be allowed by every policy
                combined.allowed_tools = [
                    t for t in combined.allowed_tools if t in policy.allowed_tools
                ]
                if not combined.allowed_tools:
                    # check_tool treats an empty allowlist as "no allowlist",
                    # so fail closed instead of silently allowing everything.
                    raise ValueError("Composed policies share no allowed tools")
            else:
                combined.allowed_tools = list(policy.allowed_tools)

    return combined


# Usage: layer policies from broad to specific
org_policy = GovernancePolicy(
    name="org-wide",
    blocked_tools=["shell_exec", "delete_database"],
    blocked_patterns=[r"(?i)(api[_-]?key|secret|password)\s*[:=]"],
    max_calls_per_request=50
)
team_policy = GovernancePolicy(
    name="data-team",
    allowed_tools=["query_db", "read_file", "write_report"],
    require_human_approval=["write_report"]
)
agent_policy = compose_policies(org_policy, team_policy)
```

### Policy as YAML

Store policies as configuration, not code:

```yaml
# governance-policy.yaml
name: production-agent
allowed_tools:
  - search_documents
  - query_database
  - send_email
blocked_tools:
  - shell_exec
  - delete_record
blocked_patterns:
  - "(?i)(api[_-]?key|secret|password)\\s*[:=]"
  - "(?i)(drop|truncate|delete from)\\s+\\w+"
max_calls_per_request: 25
require_human_approval:
  - send_email
```

```python
import yaml

def load_policy(path: str) -> GovernancePolicy:
    with open(path) as f:
        data = yaml.safe_load(f)
    return GovernancePolicy(**data)
```

---

## Pattern 2: Semantic Intent Classification

Detect dangerous intent in prompts before they reach the agent, using pattern-based signals.

```python
from dataclasses import dataclass
import re

@dataclass
class IntentSignal:
    category: str       # e.g., "data_exfiltration", "privilege_escalation"
    confidence: float   # 0.0 to 1.0
    evidence: str       # what triggered the detection

# Weighted signal patterns for threat detection
THREAT_SIGNALS = [
    # Data exfiltration
    (r"(?i)send\s+(all|every|entire)\s+\w+\s+to\s+", "data_exfiltration", 0.8),
    (r"(?i)export\s+.*\s+to\s+(external|outside|third.?party)", "data_exfiltration", 0.9),
    (r"(?i)curl\s+.*\s+-d\s+", "data_exfiltration", 0.7),

    # Privilege escalation
    (r"(?i)(sudo|as\s+root|admin\s+access)", "privilege_escalation", 0.8),
    (r"(?i)chmod\s+777", "privilege_escalation", 0.9),

    # System destruction
    (r"(?i)(rm\s+-rf|del\s+/[sq]|format\s+c:)", "system_destruction", 0.95),
    (r"(?i)(drop\s+database|truncate\s+table)", "system_destruction", 0.9),

    # Prompt injection
    (r"(?i)ignore\s+(previous|above|all)\s+(instructions?|rules?)", "prompt_injection", 0.9),
    (r"(?i)you\s+are\s+now\s+(a|an)\s+", "prompt_injection", 0.7),
]

def classify_intent(content: str) -> list[IntentSignal]:
    """Classify content for threat signals."""
    signals = []
    for pattern, category, weight in THREAT_SIGNALS:
        match = re.search(pattern, content)
        if match:
            signals.append(IntentSignal(
                category=category,
                confidence=weight,
                evidence=match.group()
            ))
    return signals

def is_safe(content: str, threshold: float = 0.7) -> bool:
    """Return True if no threat signal meets or exceeds the threshold."""
    signals = classify_intent(content)
    return not any(s.confidence >= threshold for s in signals)
```

**Key insight**: Intent classification happens *before* tool execution, acting as a pre-flight safety check. This is fundamentally different from output guardrails, which only check *after* generation.

---

## Pattern 3: Tool-Level Governance Decorator

Wrap individual tool functions with governance checks:

```python
import functools
import time
from collections import defaultdict

# Note: counts accumulate per policy name for the life of the process.
# Reset the counter at the start of each request to get true per-request limits.
_call_counters: dict[str, int] = defaultdict(int)

def govern(policy: GovernancePolicy, audit_trail=None):
    """Decorator that enforces governance policy on an async tool function."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            tool_name = func.__name__

            # 1. Check tool allowlist/blocklist
            action = policy.check_tool(tool_name)
            if action == PolicyAction.DENY:
                raise PermissionError(f"Policy '{policy.name}' blocks tool '{tool_name}'")
            if action == PolicyAction.REVIEW:
                raise PermissionError(f"Tool '{tool_name}' requires human approval")

            # 2. Check rate limit
            _call_counters[policy.name] += 1
            if _call_counters[policy.name] > policy.max_calls_per_request:
                raise PermissionError(f"Rate limit exceeded: {policy.max_calls_per_request} calls")

            # 3. Check content in arguments
            for arg in list(args) + list(kwargs.values()):
                if isinstance(arg, str):
                    matched = policy.check_content(arg)
                    if matched:
                        raise PermissionError(f"Blocked pattern detected: {matched}")

            # 4. Execute and audit
            start = time.monotonic()
            try:
                result = await func(*args, **kwargs)
                if audit_trail is not None:
                    audit_trail.append({
                        "tool": tool_name,
                        "action": "allowed",
                        "duration_ms": (time.monotonic() - start) * 1000,
                        "timestamp": time.time()
                    })
                return result
            except Exception as e:
                if audit_trail is not None:
                    audit_trail.append({
                        "tool": tool_name,
                        "action": "error",
                        "error": str(e),
                        "timestamp": time.time()
                    })
                raise

        return wrapper
    return decorator


# Usage with any agent framework
audit_log = []
policy = GovernancePolicy(
    name="search-agent",
    allowed_tools=["search", "summarize"],
    blocked_patterns=[r"(?i)password"],
    max_calls_per_request=10
)

@govern(policy, audit_trail=audit_log)
async def search(query: str) -> str:
    """Search documents (governed by policy)."""
    return f"Results for: {query}"

# Passes:  await search("latest quarterly report")
# Blocked: await search("show me the admin password")  # raises PermissionError
```

---

## Pattern 4: Trust Scoring

Track agent reliability over time with decay-based trust scores:

```python
from dataclasses import dataclass, field
import math
import time

@dataclass
class TrustScore:
    """Trust score with temporal decay."""
    score: float = 0.5          # 0.0 (untrusted) to 1.0 (fully trusted)
    successes: int = 0
    failures: int = 0
    last_updated: float = field(default_factory=time.time)

    def record_success(self, reward: float = 0.05):
        self.successes += 1
        self.score = min(1.0, self.score + reward * (1 - self.score))
        self.last_updated = time.time()

    def record_failure(self, penalty: float = 0.15):
        self.failures += 1
        self.score = max(0.0, self.score - penalty * self.score)
        self.last_updated = time.time()

    def current(self, decay_rate: float = 0.001) -> float:
        """Get score with temporal decay: trust erodes without activity.

        decay_rate is per second; the default halves the score
        after roughly 11.5 minutes of inactivity.
        """
        elapsed = time.time() - self.last_updated
        decay = math.exp(-decay_rate * elapsed)
        return self.score * decay

    @property
    def reliability(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total > 0 else 0.0


# Usage in multi-agent systems
trust = TrustScore()

# Agent completes tasks successfully
trust.record_success()  # 0.525
trust.record_success()  # 0.549

# Agent makes an error
trust.record_failure()  # 0.466

# Gate sensitive operations on trust
if trust.current() >= 0.7:
    # Allow autonomous operation
    pass
elif trust.current() >= 0.4:
    # Allow with human oversight
    pass
else:
    # Deny or require explicit approval
    pass
```

**Multi-agent trust**: In systems where agents delegate to other agents, each agent maintains trust scores for its delegates:

```python
class AgentTrustRegistry:
    def __init__(self):
        self.scores: dict[str, TrustScore] = {}

    def get_trust(self, agent_id: str) -> TrustScore:
        if agent_id not in self.scores:
            self.scores[agent_id] = TrustScore()
        return self.scores[agent_id]

    def most_trusted(self, agents: list[str]) -> str:
        return max(agents, key=lambda a: self.get_trust(a).current())

    def meets_threshold(self, agent_id: str, threshold: float) -> bool:
        return self.get_trust(agent_id).current() >= threshold
```

---

## Pattern 5: Audit Trail

Append-only audit log for all agent actions, critical for compliance and debugging:

```python
from dataclasses import dataclass, field
import json
import time

@dataclass
class AuditEntry:
    timestamp: float
    agent_id: str
    tool_name: str
    action: str           # "allowed", "denied", "error"
    policy_name: str
    details: dict = field(default_factory=dict)

class AuditTrail:
    """Append-only audit trail for agent governance events."""
    def __init__(self):
        self._entries: list[AuditEntry] = []

    def log(self, agent_id: str, tool_name: str, action: str,
            policy_name: str, **details):
        self._entries.append(AuditEntry(
            timestamp=time.time(),
            agent_id=agent_id,
            tool_name=tool_name,
            action=action,
            policy_name=policy_name,
            details=details
        ))

    def denied(self) -> list[AuditEntry]:
        """Get all denied actions, useful for security review."""
        return [e for e in self._entries if e.action == "denied"]

    def by_agent(self, agent_id: str) -> list[AuditEntry]:
        return [e for e in self._entries if e.agent_id == agent_id]

    def export_jsonl(self, path: str):
        """Export as JSON Lines for log aggregation systems."""
        with open(path, "w") as f:
            for entry in self._entries:
                f.write(json.dumps({
                    "timestamp": entry.timestamp,
                    "agent_id": entry.agent_id,
                    "tool": entry.tool_name,
                    "action": entry.action,
                    "policy": entry.policy_name,
                    **entry.details
                }) + "\n")
```

---

## Pattern 6: Framework Integration

### PydanticAI

```python
from pydantic_ai import Agent, RunContext

policy = GovernancePolicy(
    name="support-bot",
    allowed_tools=["search_docs", "create_ticket"],
    blocked_patterns=[r"(?i)(ssn|social\s+security|credit\s+card)"],
    max_calls_per_request=20
)

agent = Agent("openai:gpt-4o", system_prompt="You are a support assistant.")

# `kb` and `tickets` are placeholders for your own knowledge-base
# and ticketing clients; they are not defined in this snippet.

@agent.tool
@govern(policy)
async def search_docs(ctx: RunContext[None], query: str) -> str:
    """Search knowledge base (governed)."""
    return await kb.search(query)

@agent.tool
@govern(policy)
async def create_ticket(ctx: RunContext[None], title: str, body: str) -> str:
    """Create support ticket (governed)."""
    return await tickets.create(title=title, body=body)
```

### CrewAI

```python
from crewai import Agent, Task, Crew

policy = GovernancePolicy(
    name="research-crew",
    allowed_tools=["search", "analyze"],
    max_calls_per_request=30
)

# Apply governance at the crew level
def governed_crew_run(crew: Crew, policy: GovernancePolicy):
    """Wrap crew execution with governance checks."""
    # govern() appends plain dicts, so the audit trail here is a list
    audit: list[dict] = []
    for agent in crew.agents:
        for tool in agent.tools:
            # Assumes each tool exposes its callable as `tool.func` and that
            # the callable is async, since govern() awaits it
            original = tool.func
            tool.func = govern(policy, audit_trail=audit)(original)
    result = crew.kickoff()
    return result, audit
```

### OpenAI Agents SDK

```python
import os

from agents import Agent, function_tool

policy = GovernancePolicy(
    name="coding-agent",
    allowed_tools=["read_file", "write_file", "run_tests"],
    blocked_tools=["shell_exec"],
    max_calls_per_request=50
)

@function_tool
@govern(policy)
async def read_file(path: str) -> str:
    """Read file contents (governed)."""
    base = os.path.realpath(".")
    safe_path = os.path.realpath(path)
    # commonpath avoids the prefix pitfall where "/app2" passes a startswith("/app") check
    if os.path.commonpath([base, safe_path]) != base:
        raise ValueError("Path traversal blocked by governance")
    with open(safe_path) as f:
        return f.read()
```

---

## Governance Levels

Match governance strictness to risk level:

| Level | Controls | Use Case |
|-------|----------|----------|
| **Open** | Audit only, no restrictions | Internal dev/testing |
| **Standard** | Tool allowlist + content filters | General production agents |
| **Strict** | All controls + human approval for sensitive ops | Financial, healthcare, legal |
| **Locked** | Allowlist only, no dynamic tools, full audit | Compliance-critical systems |

---

## Best Practices

| Practice | Rationale |
|----------|-----------|
| **Policy as configuration** | Store policies in YAML/JSON rather than hardcoding them, so they can change without a deploy |
| **Most-restrictive-wins** | When composing policies, deny always overrides allow |
| **Pre-flight intent check** | Classify intent *before* tool execution, not after |
| **Trust decay** | Trust scores should decay over time, so agents must keep demonstrating good behavior |
| **Append-only audit** | Never modify or delete audit entries; immutability enables compliance |
| **Fail closed** | If a governance check errors, deny the action rather than allowing it |
| **Separate policy from logic** | Governance enforcement should be independent of agent business logic |

---

## Quick Start Checklist

```markdown
## Agent Governance Implementation Checklist

### Setup
- [ ] Define governance policy (allowed tools, blocked patterns, rate limits)
- [ ] Choose governance level (open/standard/strict/locked)
- [ ] Set up audit trail storage

### Implementation
- [ ] Add @govern decorator to all tool functions
- [ ] Add intent classification to user input processing
- [ ] Implement trust scoring for multi-agent interactions
- [ ] Wire up audit trail export

### Validation
- [ ] Test that blocked tools are properly denied
- [ ] Test that content filters catch sensitive patterns
- [ ] Test rate limiting behavior
- [ ] Verify audit trail captures all events
- [ ] Test policy composition (most-restrictive-wins)
```

---

## Related Resources

- [Agent Governance Toolkit](https://github.com/microsoft/agent-governance-toolkit): full governance framework
- [AgentMesh Integrations](https://github.com/microsoft/agent-governance-toolkit/tree/main/packages/agentmesh-integrations): framework-specific packages
- [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/)
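
The "Fail closed" practice from the Best Practices table has no code in the patterns themselves. The following is a minimal sketch of one way to apply it, not part of the skill: `fail_closed` and `flaky_check` are hypothetical names introduced for illustration. It assumes a governance check is any callable that takes content and returns `True` when the content is acceptable.

```python
from typing import Callable


def fail_closed(check: Callable[[str], bool]) -> Callable[[str], bool]:
    """Wrap a governance check so that an error inside it counts as a denial."""
    def guarded(content: str) -> bool:
        try:
            return bool(check(content))
        except Exception:
            # The check itself failed, so nothing vouches for the content: deny.
            return False
    return guarded


def flaky_check(content: str) -> bool:
    """Hypothetical check that raises on empty input to simulate a broken classifier."""
    if not content:
        raise RuntimeError("classifier unavailable")
    return "password" not in content.lower()


safe_check = fail_closed(flaky_check)
```

With this wrapper, `safe_check("")` returns `False` instead of propagating the `RuntimeError`, so a caller that forgets to handle the exception cannot accidentally proceed with the tool call.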