Works with Paperclip

How Benchmark fits into a Paperclip company.

Benchmark drops into any Paperclip agent that handles performance measurement and regression checks. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat, with no prompt engineering or tool wiring.

Source file

SKILL.md (93 lines, markdown)

---
name: benchmark
description: Use this skill to measure performance baselines, detect regressions before/after PRs, and compare stack alternatives.
origin: ECC
---

# Benchmark: Performance Baseline & Regression Detection

## When to Use

- Before and after a PR to measure performance impact
- Setting up performance baselines for a project
- When users report "it feels slow"
- Before a launch, to ensure you meet performance targets
- Comparing your stack against alternatives

## How It Works

### Mode 1: Page Performance

Measures real browser metrics via browser MCP:

```
1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint): target < 2.5s
   - CLS (Cumulative Layout Shift): target < 0.1
   - INP (Interaction to Next Paint): target < 200ms
   - FCP (First Contentful Paint): target < 1.8s
   - TTFB (Time to First Byte): target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources
```

### Mode 2: API Performance

Benchmarks API endpoints:

```
1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
```

### Mode 3: Build Performance

Measures development feedback loop:

```
1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
```

### Mode 4: Before/After Comparison

Run before and after a change to measure impact:

```
/benchmark baseline    # saves current metrics
# ... make changes ...
/benchmark compare     # compares against baseline
```

Output:
```
| Metric | Before | After | Delta | Verdict |
|--------|--------|-------|-------|---------|
| LCP | 1.2s | 1.4s | +200ms | ⚠ WARN |
| Bundle | 180KB | 175KB | -5KB | ✓ BETTER |
| Build | 12s | 14s | +2s | ⚠ WARN |
```

## Output

Stores baselines in `.ecc/benchmarks/` as JSON. Git-tracked so the team shares baselines.

## Integration

- CI: run `/benchmark compare` on every PR
- Pair with `/canary-watch` for post-deploy monitoring
- Pair with `/browser-qa` for full pre-ship checklist
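
The skill file describes the latency percentiles (Mode 2) and the baseline comparison (Mode 4) only in outline. The following is a minimal Python sketch of how those two steps might work, not the skill's actual implementation. Everything named here is an assumption made for illustration: the `baseline.json` file name, the metric keys (`lcp_ms`, `bundle_kb`, `build_s`), the nearest-rank percentile method, and the 5% tolerance band used to decide between `SAME` and `WARN`. It also assumes every metric is "lower is better".

```python
import json
import math
import tempfile
from pathlib import Path


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_latencies(samples_ms):
    """Reduce raw request latencies to the p50/p95/p99 figures of Mode 2."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}


def verdict(before, after, tolerance=0.05):
    """Classify a lower-is-better metric.
    tolerance is a hypothetical noise band, not a value from the skill."""
    if before == 0:
        return "SAME" if after == 0 else "WARN"
    change = (after - before) / before
    if change > tolerance:
        return "WARN"
    if change < 0:
        return "BETTER"
    return "SAME"


def compare(baseline, current):
    """Return (metric, before, after, delta, verdict) rows for every
    metric present in both runs."""
    rows = []
    for name in sorted(baseline.keys() & current.keys()):
        before, after = baseline[name], current[name]
        rows.append((name, before, after, after - before, verdict(before, after)))
    return rows


# Demo: write a baseline as JSON, read it back, compare a later run.
# A temporary directory stands in for the project root here.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".ecc" / "benchmarks" / "baseline.json"  # hypothetical file name
    path.parent.mkdir(parents=True)
    path.write_text(json.dumps({"lcp_ms": 1200, "bundle_kb": 180, "build_s": 12}))
    baseline = json.loads(path.read_text())

current = {"lcp_ms": 1400, "bundle_kb": 175, "build_s": 14}
for metric, before, after, delta, result in compare(baseline, current):
    print(f"| {metric} | {before} | {after} | {delta:+} | {result} |")
```

The sample values mirror the output table in the skill file: LCP and build time regress by more than the assumed tolerance, while the bundle shrinks. A real setup would likely want per-metric thresholds, since a 5% change means something different for bundle size than for p99 latency.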

Related skills

Agent Eval

Install Agent Eval skill for Claude Code from affaan-m/everything-claude-code.

Agent Harness Construction

Install Agent Harness Construction skill for Claude Code from affaan-m/everything-claude-code.

Agent Payment X402

Install Agent Payment X402 skill for Claude Code from affaan-m/everything-claude-code.