---
name: ai-regression-testing
description: Regression testing strategies for AI-assisted development. Sandbox-mode API testing without database dependencies, automated bug-check workflows, and patterns to catch AI blind spots where the same model writes and reviews code.
origin: ECC
---

# AI Regression Testing

Testing patterns specifically designed for AI-assisted development, where the same model writes code and reviews it, creating systematic blind spots that only automated tests can catch.

## When to Activate

- An AI agent (Claude Code, Cursor, Codex) has modified API routes or backend logic
- A bug was found and fixed, and its re-introduction needs to be prevented
- The project has a sandbox/mock mode that can be leveraged for DB-free testing
- Running `/bug-check` or similar review commands after code changes
- Multiple code paths exist (sandbox vs production, feature flags, etc.)

## The Core Problem

When an AI writes code and then reviews its own work, it carries the same assumptions into both steps. This creates a predictable failure pattern:

```
AI writes fix → AI reviews fix → AI says "looks correct" → Bug still exists
```

**Real-world example** (observed in production):

```
Fix 1: Added notification_settings to API response
  → Forgot to add it to the SELECT query
  → AI reviewed and missed it (same blind spot)

Fix 2: Added it to SELECT query
  → TypeScript build error (column not in generated types)
  → AI reviewed Fix 1 but didn't catch the SELECT issue

Fix 3: Changed to SELECT *
  → Fixed production path, forgot sandbox path
  → AI reviewed and missed it AGAIN (4th occurrence)

Fix 4: Test caught it instantly on first run (PASS)
```

The pattern: **sandbox/production path inconsistency** is the #1 AI-introduced regression.

## Sandbox-Mode API Testing

Most projects with AI-friendly architecture have a sandbox/mock mode. This is the key to fast, DB-free API testing.

### Setup (Vitest + Next.js App Router)

```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";
import path from "path";

export default defineConfig({
  test: {
    environment: "node",
    globals: true,
    include: ["__tests__/**/*.test.ts"],
    setupFiles: ["__tests__/setup.ts"],
  },
  resolve: {
    alias: {
      "@": path.resolve(__dirname, "."),
    },
  },
});
```

```typescript
// __tests__/setup.ts
// Force sandbox mode: no database needed
process.env.SANDBOX_MODE = "true";
process.env.NEXT_PUBLIC_SUPABASE_URL = "";
process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY = "";
```

### Test Helper for Next.js API Routes

```typescript
// __tests__/helpers.ts
import { NextRequest } from "next/server";

// Build a NextRequest for calling a route handler directly in tests.
export function createTestRequest(
  url: string,
  options?: {
    method?: string;
    body?: Record<string, unknown>;
    headers?: Record<string, string>;
    sandboxUserId?: string;
  },
): NextRequest {
  const { method = "GET", body, headers = {}, sandboxUserId } = options || {};
  const fullUrl = url.startsWith("http") ? url : `http://localhost:3000${url}`;
  const reqHeaders: Record<string, string> = { ...headers };

  if (sandboxUserId) {
    reqHeaders["x-sandbox-user-id"] = sandboxUserId;
  }

  const init: { method: string; headers: Record<string, string>; body?: string } = {
    method,
    headers: reqHeaders,
  };

  if (body) {
    init.body = JSON.stringify(body);
    reqHeaders["content-type"] = "application/json";
  }

  return new NextRequest(fullUrl, init);
}

// Read the JSON body and pair it with the HTTP status.
export async function parseResponse(response: Response) {
  const json = await response.json();
  return { status: response.status, json };
}
```

### Writing Regression Tests

The key principle: **write tests for bugs that were found, not for code that works**.

```typescript
// __tests__/api/user/profile.test.ts
import { describe, it, expect } from "vitest";
import { createTestRequest, parseResponse } from "../../helpers";
import { GET, PATCH } from "@/app/api/user/profile/route";

// Define the contract: which fields MUST be in the response
const REQUIRED_FIELDS = [
  "id",
  "email",
  "full_name",
  "phone",
  "role",
  "created_at",
  "avatar_url",
  "notification_settings",  // ← Added after a bug found it missing
];

describe("GET /api/user/profile", () => {
  it("returns all required fields", async () => {
    const req = createTestRequest("/api/user/profile");
    const res = await GET(req);
    const { status, json } = await parseResponse(res);

    expect(status).toBe(200);
    for (const field of REQUIRED_FIELDS) {
      expect(json.data).toHaveProperty(field);
    }
  });

  // Regression test: this exact bug was introduced by AI 4 times
  it("notification_settings is not undefined (BUG-R1 regression)", async () => {
    const req = createTestRequest("/api/user/profile");
    const res = await GET(req);
    const { json } = await parseResponse(res);

    expect("notification_settings" in json.data).toBe(true);
    const ns = json.data.notification_settings;
    expect(ns === null || typeof ns === "object").toBe(true);
  });
});
```

### Testing Sandbox/Production Parity

The most common AI regression: fixing the production path but forgetting the sandbox path (or vice versa).

```typescript
// Assumed file location: __tests__/api/user/messages.test.ts
// (the import paths below mirror the profile test and are an assumption)
import { describe, it, expect } from "vitest";
import { createTestRequest, parseResponse } from "../../helpers";
import { GET } from "@/app/api/user/messages/route";

// Test that sandbox responses match the expected contract
describe("GET /api/user/messages (conversation list)", () => {
  it("includes partner_name in sandbox mode", async () => {
    const req = createTestRequest("/api/user/messages", {
      sandboxUserId: "user-001",
    });
    const res = await GET(req);
    const { json } = await parseResponse(res);

    // This caught a bug where partner_name was added
    // to production path but not sandbox path
    if (json.data.length > 0) {
      for (const conv of json.data) {
        expect("partner_name" in conv).toBe(true);
      }
    }
  });
});
```

## Integrating Tests into Bug-Check Workflow

### Custom Command Definition

```markdown
<!-- .claude/commands/bug-check.md -->
# Bug Check

## Step 1: Automated Tests (mandatory, cannot skip)

Run these commands FIRST before any code review:

    npm run test       # Vitest test suite
    npm run build      # TypeScript type check + build

- If tests fail → report as highest priority bug
- If build fails → report type errors as highest priority
- Only proceed to Step 2 if both pass

## Step 2: Code Review (AI review)

1. Sandbox / production path consistency
2. API response shape matches frontend expectations
3. SELECT clause completeness
4. Error handling with rollback
5. Optimistic update race conditions

## Step 3: For each bug fixed, propose a regression test
```

### The Workflow

```
User: "Run a bug check" (or "/bug-check")
  │
  ├─ Step 1: npm run test
  │   ├─ FAIL → Bug found mechanically (no AI judgment needed)
  │   └─ PASS → Continue
  │
  ├─ Step 2: npm run build
  │   ├─ FAIL → Type error found mechanically
  │   └─ PASS → Continue
  │
  ├─ Step 3: AI code review (with known blind spots in mind)
  │   └─ Findings reported
  │
  └─ Step 4: For each fix, write a regression test
      └─ Next bug-check catches if fix breaks
```

## Common AI Regression Patterns

### Pattern 1: Sandbox/Production Path Mismatch

**Frequency**: Most common (observed in 3 out of 4 regressions)

```typescript
// FAIL: AI adds field to production path only
if (isSandboxMode()) {
  return { data: { id, email, name } };  // Missing new field
}
// Production path
return { data: { id, email, name, notification_settings } };

// PASS: Both paths must return the same shape
if (isSandboxMode()) {
  return { data: { id, email, name, notification_settings: null } };
}
return { data: { id, email, name, notification_settings } };
```

**Test to catch it**:

```typescript
// Belongs in the profile test file, reusing its GET import and REQUIRED_FIELDS
it("sandbox and production return same fields", async () => {
  // In test env, sandbox mode is forced ON
  const res = await GET(createTestRequest("/api/user/profile"));
  const { json } = await parseResponse(res);

  for (const field of REQUIRED_FIELDS) {
    expect(json.data).toHaveProperty(field);
  }
});
```

### Pattern 2: SELECT Clause Omission

**Frequency**: Common with Supabase/Prisma when adding new columns

```typescript
// FAIL: New column added to response but not to SELECT
const { data } = await supabase
  .from("users")
  .select("id, email, name")  // notification_settings not here
  .single();

return { data: { ...data, notification_settings: data.notification_settings } };
// → notification_settings is always undefined

// PASS: Use SELECT * or explicitly include new columns
const { data } = await supabase
  .from("users")
  .select("*")
  .single();
```

### Pattern 3: Error State Leakage

**Frequency**: Moderate, typically when adding error handling to existing components

```typescript
// FAIL: Error state set but old data not cleared
catch (err) {
  setError("Failed to load");
  // reservations still shows data from previous tab!
}

// PASS: Clear related state on error
catch (err) {
  setReservations([]);  // Clear stale data
  setError("Failed to load");
}
```

### Pattern 4: Optimistic Update Without Proper Rollback

```typescript
// FAIL: No rollback on failure
const handleRemove = async (id: string) => {
  setItems(prev => prev.filter(i => i.id !== id));
  await fetch(`/api/items/${id}`, { method: "DELETE" });
  // If API fails, item is gone from UI but still in DB
};

// PASS: Capture previous state and rollback on failure
const handleRemove = async (id: string) => {
  const prevItems = [...items];
  setItems(prev => prev.filter(i => i.id !== id));
  try {
    const res = await fetch(`/api/items/${id}`, { method: "DELETE" });
    if (!res.ok) throw new Error("API error");
  } catch {
    setItems(prevItems);  // Rollback
    alert("Failed to delete");
  }
};
```

## Strategy: Test Where Bugs Were Found

Don't aim for 100% coverage. Instead:

```
Bug found in /api/user/profile     → Write test for profile API
Bug found in /api/user/messages    → Write test for messages API
Bug found in /api/user/favorites   → Write test for favorites API
No bug in /api/user/notifications  → Don't write test (yet)
```

**Why this works with AI development:**

1. AI tends to make the **same category of mistake** repeatedly
2. Bugs cluster in complex areas (auth, multi-path logic, state management)
3. Once tested, that exact regression **cannot recur without failing the test suite**
4. Test count grows organically with bug fixes, so no effort is wasted

## Quick Reference

| AI Regression Pattern | Test Strategy | Priority |
|---|---|---|
| Sandbox/production mismatch | Assert same response shape in sandbox mode | High |
| SELECT clause omission | Assert all required fields in response | High |
| Error state leakage | Assert state cleanup on error | Medium |
| Missing rollback | Assert state restored on API failure | Medium |
| Type cast masking null | Assert field is not undefined | Medium |

## DO / DON'T

**DO:**
- Write tests immediately after finding a bug (before fixing it if possible)
- Test the API response shape, not the implementation
- Run tests as the first step of every bug-check
- Keep tests fast (< 1 second total with sandbox mode)
- Name tests after the bug they prevent (e.g., "BUG-R1 regression")

**DON'T:**
- Write tests for code that has never had a bug
- Trust AI self-review as a substitute for automated tests
- Skip sandbox path testing because "it's just mock data"
- Write integration tests when unit tests suffice
- Aim for a coverage percentage; aim for regression prevention instead
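
The Quick Reference lists "Assert field is not undefined" as a test strategy without showing it in isolation. A minimal sketch of that check follows, written as a plain function so it does not depend on Vitest or Next.js. The helper name `findMissingFields` and both sample payloads are hypothetical and not part of the original skill; treating `null` as "present" is an assumption that matches the nullable `notification_settings` field used above.

```typescript
// Hypothetical helper (not part of the original skill): returns the contract
// fields that are absent from a payload or explicitly undefined.
// null counts as present, because a nullable field is still part of the shape.
function findMissingFields(
  payload: Record<string, unknown>,
  requiredFields: readonly string[],
): string[] {
  return requiredFields.filter(
    (field) => !(field in payload) || payload[field] === undefined,
  );
}

// Made-up payloads imitating the two code paths of one endpoint.
const sandboxPayload: Record<string, unknown> = {
  id: "user-001",
  email: "a@example.com",
};
const productionPayload: Record<string, unknown> = {
  id: "user-001",
  email: "a@example.com",
  notification_settings: null,
};

const required = ["id", "email", "notification_settings"] as const;

// The sandbox payload lacks notification_settings, so it is reported.
console.log(findMissingFields(sandboxPayload, required));
// The production payload carries it as null, so nothing is reported.
console.log(findMissingFields(productionPayload, required));
```

Inside a Vitest test, asserting `expect(findMissingFields(json.data, REQUIRED_FIELDS)).toEqual([])` would report every missing field in a single failure message instead of stopping at the first one.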