Name: Test Driven Development
Author: Obra

Install

$ npx skills add https://github.com/obra/superpowers --skill test-driven-development

Works with Paperclip

How Test Driven Development fits into a Paperclip company.

Test Driven Development drops into any Paperclip agent that implements features or fixes bugs. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat, with no prompt engineering or tool wiring.

Source file

SKILL.md · 371 lines · markdown

---
name: test-driven-development
description: Use when implementing any feature or bugfix, before writing implementation code
---

# Test-Driven Development (TDD)

## Overview

Write the test first. Watch it fail. Write minimal code to pass.

**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.

**Violating the letter of the rules is violating the spirit of the rules.**

## When to Use

**Always:**
- New features
- Bug fixes
- Refactoring
- Behavior changes

**Exceptions (ask your human partner):**
- Throwaway prototypes
- Generated code
- Configuration files

Thinking "skip TDD just this once"? Stop. That's rationalization.

## The Iron Law

```
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
```

Write code before the test? Delete it. Start over.

**No exceptions:**
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete

Implement fresh from tests. Period.

## Red-Green-Refactor

```dot
digraph tdd_cycle {
    rankdir=LR;
    red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
    verify_red [label="Verify fails\ncorrectly", shape=diamond];
    green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
    verify_green [label="Verify passes\nAll green", shape=diamond];
    refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
    next [label="Next", shape=ellipse];

    red -> verify_red;
    verify_red -> green [label="yes"];
    verify_red -> red [label="wrong\nfailure"];
    green -> verify_green;
    verify_green -> refactor [label="yes"];
    verify_green -> green [label="no"];
    refactor -> verify_green [label="stay\ngreen"];
    verify_green -> next;
    next -> red;
}
```

### RED - Write Failing Test

Write one minimal test showing what should happen.

<Good>
```typescript
test('retries failed operations 3 times', async () => {
  let attempts = 0;
  const operation = async () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };

  const result = await retryOperation(operation);

  expect(result).toBe('success');
  expect(attempts).toBe(3);
});
```
Clear name, tests real behavior, one thing
</Good>

<Bad>
```typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});
```
Vague name, tests mock not code
</Bad>

**Requirements:**
- One behavior
- Clear name
- Real code (no mocks unless unavoidable)

### Verify RED - Watch It Fail

**MANDATORY. Never skip.**

```bash
npm test path/to/test.test.ts
```

Confirm:
- Test fails (not errors)
- Failure message is expected
- Fails because feature missing (not typos)

**Test passes?** You're testing existing behavior. Fix test.

**Test errors?** Fix error, re-run until it fails correctly.

### GREEN - Minimal Code

Write simplest code to pass the test.

<Good>
```typescript
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === 2) throw e;
    }
  }
  throw new Error('unreachable');
}
```
Just enough to pass
</Good>

<Bad>
```typescript
async function retryOperation<T>(
  fn: () => Promise<T>,
  options?: {
    maxRetries?: number;
    backoff?: 'linear' | 'exponential';
    onRetry?: (attempt: number) => void;
  }
): Promise<T> {
  // YAGNI
}
```
Over-engineered
</Bad>

Don't add features, refactor other code, or "improve" beyond the test.

### Verify GREEN - Watch It Pass

**MANDATORY.**

```bash
npm test path/to/test.test.ts
```

Confirm:
- Test passes
- Other tests still pass
- Output pristine (no errors, warnings)

**Test fails?** Fix code, not test.

**Other tests fail?** Fix now.

### REFACTOR - Clean Up

After green only:
- Remove duplication
- Improve names
- Extract helpers

Keep tests green. Don't add behavior.

### Repeat

Next failing test for next feature.

## Good Tests

| Quality | Good | Bad |
|---------|------|-----|
| **Minimal** | One thing. "and" in name? Split it. | `test('validates email and domain and whitespace')` |
| **Clear** | Name describes behavior | `test('test1')` |
| **Shows intent** | Demonstrates desired API | Obscures what code should do |

## Why Order Matters

**"I'll write tests after to verify it works"**

Tests written after code pass immediately. Passing immediately proves nothing:
- Might test wrong thing
- Might test implementation, not behavior
- Might miss edge cases you forgot
- You never saw it catch the bug

Test-first forces you to see the test fail, proving it actually tests something.

**"I already manually tested all the edge cases"**

Manual testing is ad-hoc. You think you tested everything but:
- No record of what you tested
- Can't re-run when code changes
- Easy to forget cases under pressure
- "It worked when I tried it" ≠ comprehensive

Automated tests are systematic. They run the same way every time.

**"Deleting X hours of work is wasteful"**

Sunk cost fallacy. The time is already gone. Your choice now:
- Delete and rewrite with TDD (X more hours, high confidence)
- Keep it and add tests after (30 min, low confidence, likely bugs)

The "waste" is keeping code you can't trust. Working code without real tests is technical debt.

**"TDD is dogmatic, being pragmatic means adapting"**

TDD IS pragmatic:
- Finds bugs before commit (faster than debugging after)
- Prevents regressions (tests catch breaks immediately)
- Documents behavior (tests show how to use code)
- Enables refactoring (change freely, tests catch breaks)

"Pragmatic" shortcuts = debugging in production = slower.

**"Tests after achieve the same goals - it's spirit not ritual"**

No. Tests-after answer "What does this do?" Tests-first answer "What should this do?"

Tests-after are biased by your implementation. You test what you built, not what's required. You verify remembered edge cases, not discovered ones.

Tests-first force edge case discovery before implementing. Tests-after verify you remembered everything (you didn't).

30 minutes of tests after ≠ TDD. You get coverage, lose proof tests work.

## Common Rationalizations

| Excuse | Reality |
|--------|---------|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. |
| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |
| "Existing code has no tests" | You're improving it. Add tests for existing code. |

## Red Flags - STOP and Start Over

- Code before test
- Test after implementation
- Test passes immediately
- Can't explain why test failed
- Tests added "later"
- Rationalizing "just this once"
- "I already manually tested it"
- "Tests after achieve the same purpose"
- "It's about spirit not ritual"
- "Keep as reference" or "adapt existing code"
- "Already spent X hours, deleting is wasteful"
- "TDD is dogmatic, I'm being pragmatic"
- "This is different because..."

**All of these mean: Delete code. Start over with TDD.**

## Example: Bug Fix

**Bug:** Empty email accepted

**RED**
```typescript
test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});
```

**Verify RED**
```bash
$ npm test
FAIL: expected 'Email required', got undefined
```

**GREEN**
```typescript
function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}
```

**Verify GREEN**
```bash
$ npm test
PASS
```

**REFACTOR**
Extract validation for multiple fields if needed.

## Verification Checklist

Before marking work complete:

- [ ] Every new function/method has a test
- [ ] Watched each test fail before implementing
- [ ] Each test failed for expected reason (feature missing, not typo)
- [ ] Wrote minimal code to pass each test
- [ ] All tests pass
- [ ] Output pristine (no errors, warnings)
- [ ] Tests use real code (mocks only if unavoidable)
- [ ] Edge cases and errors covered

Can't check all boxes? You skipped TDD. Start over.

## When Stuck

| Problem | Solution |
|---------|----------|
| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. |
| Test too complicated | Design too complicated. Simplify interface. |
| Must mock everything | Code too coupled. Use dependency injection. |
| Test setup huge | Extract helpers. Still complex? Simplify design. |

## Debugging Integration

Bug found? Write failing test reproducing it. Follow TDD cycle. Test proves fix and prevents regression.

Never fix bugs without a test.

## Testing Anti-Patterns

When adding mocks or test utilities, read @testing-anti-patterns.md to avoid common pitfalls:
- Testing mock behavior instead of real behavior
- Adding test-only methods to production classes
- Mocking without understanding dependencies

## Final Rule

```
Production code → test exists and failed first
Otherwise → not TDD
```

No exceptions without your human partner's permission.
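
The "Must mock everything" row in the When Stuck table recommends dependency injection. The sketch below shows one way that might look. The names `Clock`, `Token`, and `isExpired` are hypothetical and do not come from the skill; the point is only that a dependency passed in as a parameter can be replaced by a plain function in a test, so the test still exercises real code.

```typescript
// Hypothetical example: Clock, Token, and isExpired are illustrative names,
// not part of the skill or of any library.
type Clock = () => number; // returns epoch milliseconds

interface Token {
  expiresAt: number; // epoch milliseconds
}

// The clock is injected with a real default (Date.now), so production
// callers pass nothing and tests pass a fixed clock instead of mocking Date.
function isExpired(token: Token, now: Clock = Date.now): boolean {
  return now() >= token.expiresAt;
}

// Test-style usage: a plain function stands in for the clock.
const fixedClock: Clock = () => 1000;
const stillValid = isExpired({ expiresAt: 2000 }, fixedClock);
const expired = isExpired({ expiresAt: 1000 }, fixedClock);
```

With the clock injected, each behavior (not yet expired, expired exactly at the deadline) can get its own failing test first without a mocking library.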
