Name: Prompt Engineer
Author: Jeffallan

Install

Terminal · npx

$ npx skills add https://github.com/jeffallan/claude-skills --skill prompt-engineer

Works with Paperclip

How Prompt Engineer fits into a Paperclip company.

Prompt Engineer drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.


Source file

SKILL.md (134 lines, markdown)


---
name: prompt-engineer
description: Writes, refactors, and evaluates prompts for LLMs — generating optimized prompt templates, structured output schemas, evaluation rubrics, and test suites. Use when designing prompts for new LLM applications, refactoring existing prompts for better accuracy or token efficiency, implementing chain-of-thought or few-shot learning, creating system prompts with personas and guardrails, building JSON/function-calling schemas, or developing prompt evaluation frameworks to measure and improve model performance.
license: MIT
metadata:
  author: https://github.com/Jeffallan
  version: "1.2.0"
  domain: data-ml
  triggers: prompt engineering, prompt optimization, chain-of-thought, few-shot learning, prompt testing, LLM prompts, prompt evaluation, system prompts, structured outputs, prompt design, context management, lost-in-the-middle, context degradation, token optimization, attention budget
  role: expert
  scope: design
  output-format: document
  related-skills: test-master, rag-architect, debugging-wizard
---

# Prompt Engineer

Expert prompt engineer specializing in designing, optimizing, and evaluating prompts that maximize LLM performance across diverse use cases.

## When to Use This Skill

- Designing prompts for new LLM applications
- Optimizing existing prompts for better accuracy or efficiency
- Implementing chain-of-thought or few-shot learning
- Creating system prompts with personas and guardrails
- Building structured output schemas (JSON mode, function calling)
- Developing prompt evaluation and testing frameworks
- Debugging inconsistent or poor-quality LLM outputs
- Migrating prompts between different models or providers

## Core Workflow

1. **Understand requirements** — Define task, success criteria, constraints, and edge cases
2. **Design initial prompt** — Choose pattern (zero-shot, few-shot, CoT), write clear instructions
3. **Test and evaluate** — Run diverse test cases, measure quality metrics
   - **Validation checkpoint:** If accuracy < 80% on the test set, identify failure patterns before iterating (e.g., ambiguous instructions, missing examples, edge case gaps)
4. **Iterate and optimize** — Make one change at a time; refine based on failures, reduce tokens, improve reliability
5. **Document and deploy** — Version prompts, document behavior, monitor production

## Reference Guide

Load detailed guidance based on context:

| Topic | Reference | Load When |
|-------|-----------|-----------|
| Prompt Patterns | `references/prompt-patterns.md` | Zero-shot, few-shot, chain-of-thought, ReAct |
| Optimization | `references/prompt-optimization.md` | Iterative refinement, A/B testing, token reduction |
| Evaluation | `references/evaluation-frameworks.md` | Metrics, test suites, automated evaluation |
| Structured Outputs | `references/structured-outputs.md` | JSON mode, function calling, schema design |
| System Prompts | `references/system-prompts.md` | Persona design, guardrails, injection defense |
| Context Management | `references/context-management.md` | Attention budget, degradation patterns, context optimization |

## Prompt Examples

### Zero-shot vs. Few-shot

**Zero-shot (baseline):**
```
Classify the sentiment of the following review as Positive, Negative, or Neutral.

Review: {{review}}
Sentiment:
```

**Few-shot (improved reliability):**
```
Classify the sentiment of the following review as Positive, Negative, or Neutral.

Review: "The battery life is incredible, lasts all day."
Sentiment: Positive

Review: "Stopped working after two weeks. Very disappointed."
Sentiment: Negative

Review: "It arrived on time and matches the description."
Sentiment: Neutral

Review: {{review}}
Sentiment:
```

### Before/After Optimization

**Before (vague, inconsistent outputs):**
```
Summarize this document.

{{document}}
```

**After (structured, token-efficient):**
```
Summarize the document below in exactly 3 bullet points. Each bullet must be one sentence and start with an action verb. Do not include opinions or information not present in the document.

Document:
{{document}}

Summary:
```

## Constraints

### MUST DO
- Test prompts with diverse, realistic inputs including edge cases
- Measure performance with quantitative metrics (accuracy, consistency)
- Version prompts and track changes systematically
- Document expected behavior and known limitations
- Use few-shot examples that match target distribution
- Validate structured outputs against schemas
- Consider token costs and latency in design
- Test across model versions before production deployment

### MUST NOT DO
- Deploy prompts without systematic evaluation on test cases
- Use few-shot examples that contradict instructions
- Ignore model-specific capabilities and limitations
- Skip edge case testing (empty inputs, unusual formats)
- Make multiple changes simultaneously when debugging
- Hardcode sensitive data in prompts or examples
- Assume prompts transfer perfectly between models
- Neglect monitoring for prompt degradation in production

## Output Templates

When delivering prompt work, provide:
1. Final prompt with clear sections (role, task, constraints, format)
2. Test cases and evaluation results
3. Usage instructions (temperature, max tokens, model version)
4. Performance metrics and comparison with baselines
5. Known limitations and edge cases

## Coverage Note

Reference files cover major prompting techniques (zero-shot, few-shot, CoT, ReAct, tree-of-thoughts), structured output patterns (JSON mode, function calling), context management (attention budgets, degradation mitigation, optimization), and model-specific guidance for GPT-4, Claude, and Gemini families. Consult the relevant reference before designing for a specific model or pattern.
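
As an illustration only, the few-shot sentiment prompt and the 80% validation checkpoint described in this skill could be wired together roughly as follows. This is a minimal sketch, not part of the skill itself: the names `build_few_shot_prompt`, `evaluate`, and `needs_failure_analysis` are hypothetical, and `classify` stands in for whatever function sends a prompt to your model and returns its text reply.

```python
from typing import Callable, List, Tuple

# Instruction and examples copied from the skill's few-shot sentiment prompt.
INSTRUCTION = (
    "Classify the sentiment of the following review as "
    "Positive, Negative, or Neutral."
)
FEW_SHOT_EXAMPLES = [
    ("The battery life is incredible, lasts all day.", "Positive"),
    ("Stopped working after two weeks. Very disappointed.", "Negative"),
    ("It arrived on time and matches the description.", "Neutral"),
]

# Validation checkpoint from the Core Workflow: accuracy below 80%
# means failure patterns should be studied before iterating.
ACCURACY_THRESHOLD = 0.80


def build_few_shot_prompt(review: str) -> str:
    """Render the few-shot template with the new review in the final slot."""
    blocks = [INSTRUCTION]
    for text, label in FEW_SHOT_EXAMPLES:
        blocks.append(f'Review: "{text}"\nSentiment: {label}')
    blocks.append(f"Review: {review}\nSentiment:")
    return "\n\n".join(blocks)


def evaluate(
    classify: Callable[[str], str],  # hypothetical model call: prompt -> reply
    test_set: List[Tuple[str, str]],
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Return (accuracy, failures) over labeled (review, expected) pairs."""
    if not test_set:
        raise ValueError("test_set must contain at least one case")
    failures = []
    for review, expected in test_set:
        predicted = classify(build_few_shot_prompt(review)).strip()
        if predicted != expected:
            failures.append((review, expected, predicted))
    accuracy = (len(test_set) - len(failures)) / len(test_set)
    return accuracy, failures


def needs_failure_analysis(accuracy: float) -> bool:
    """True when accuracy falls below the 80% validation checkpoint."""
    return accuracy < ACCURACY_THRESHOLD
```

Collecting the failing cases alongside the accuracy figure keeps the "identify failure patterns before iterating" step concrete: each failure records the input, the expected label, and what the model actually returned.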

Related skills

Angular Architect

Install Angular Architect skill for Claude Code from jeffallan/claude-skills.

Api Designer

Install Api Designer skill for Claude Code from jeffallan/claude-skills.

Architecture Designer

Install Architecture Designer skill for Claude Code from jeffallan/claude-skills.