Claude Agent Skill · by Dean Peters

Context Engineering Advisor

Install Context Engineering Advisor skill for Claude Code from deanpeters/product-manager-skills.

Install
Terminal · npx
$ npx skills add https://github.com/deanpeters/product-manager-skills --skill context-engineering-advisor
Works with Paperclip

How Context Engineering Advisor fits into a Paperclip company.

Context Engineering Advisor drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.

**SaaS Factory** (Paired) — Pre-configured AI company: 18 agents, 18 skills, one-time purchase. $27 (list price $59).
Source file: `SKILL.md` (763 lines)
---
name: context-engineering-advisor
description: Diagnose context stuffing vs. context engineering. Use when an AI workflow feels bloated, brittle, or hard to steer reliably.
intent: >-
  Guide product managers through diagnosing whether they're doing **context stuffing** (jamming volume without intent) or **context engineering** (shaping structure for attention). Use this to identify context boundaries, fix "Context Hoarding Disorder," and implement tactical practices like bounded domains, episodic retrieval, and the Research→Plan→Reset→Implement cycle.
type: interactive
theme: ai-agents
best_for:
  - "Diagnosing context stuffing vs. context engineering in your AI workflows"
  - "Building better memory and retrieval architecture for AI agents"
  - "Improving AI output quality through structured context design"
scenarios:
  - "My AI outputs are mediocre even though I'm giving it lots of information — diagnose what's wrong"
  - "I want to architect context properly for a multi-step AI workflow in my product team"
estimated_time: "15-20 min"
---

## Purpose

Guide product managers through diagnosing whether they're doing **context stuffing** (jamming volume without intent) or **context engineering** (shaping structure for attention). Use this to identify context boundaries, fix "Context Hoarding Disorder," and implement tactical practices like bounded domains, episodic retrieval, and the Research→Plan→Reset→Implement cycle.

**Key Distinction:** Context stuffing assumes volume = quality ("paste the entire PRD"). Context engineering treats AI attention as a scarce resource and allocates it deliberately.

This is not about prompt writing—it's about **designing the information architecture** that grounds AI in reality without overwhelming it with noise.
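The key distinction above can be sketched as a token-budgeted selector: keep only the context that supports one specific decision, in relevance order, until the budget is spent. A minimal illustration; the document list, keyword scoring, and budget are hypothetical, and a real system would use semantic relevance rather than keyword counts.

```python
# Hypothetical sketch of "retrieve with intent": given one specific decision,
# keep only the context whose relevance justifies its token cost, instead of
# pasting everything. Scoring and field names are illustrative.

def select_context(documents, decision_keywords, token_budget):
    """Pick the smallest set of documents that supports one specific decision."""
    scored = []
    for doc in documents:
        # Crude relevance: how many decision keywords appear in the doc.
        hits = sum(1 for kw in decision_keywords if kw in doc["text"].lower())
        if hits > 0:  # anything with zero relevance is context stuffing
            scored.append((hits, doc))
    # Most relevant first; stop once the token budget is spent.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    selected, used = [], 0
    for _, doc in scored:
        if used + doc["tokens"] <= token_budget:
            selected.append(doc["name"])
            used += doc["tokens"]
    return selected
```

The design choice mirrors the file-cabinet metaphor below: the meeting notes with zero relevance never enter the window, regardless of how much budget is left.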
## Key Concepts

### The Paradigm Shift: Parametric → Contextual Intelligence

**The Fundamental Problem:**
- LLMs have **parametric knowledge** (encoded during training) = static, outdated, non-attributable
- When asked about proprietary data, real-time info, or user preferences → forced to hallucinate or admit ignorance
- **Context engineering** bridges the gap between static training and dynamic reality

**PM's Role Shift:** From feature builder → **architect of informational ecosystems** that ground AI in reality

---

### Context Stuffing vs. Context Engineering

| Dimension | Context Stuffing | Context Engineering |
|-----------|------------------|---------------------|
| **Mindset** | Volume = quality | Structure = quality |
| **Approach** | "Add everything just in case" | "What decision am I making?" |
| **Persistence** | Persist all context | Retrieve with intent |
| **Agent Chains** | Share everything between agents | Bounded context per agent |
| **Failure Response** | Retry until it works | Fix the structure |
| **Economic Model** | Context as storage | Context as attention (scarce resource) |

**Critical Metaphor:** Context stuffing is like bringing your entire file cabinet to a meeting. Context engineering is bringing only the 3 documents relevant to today's decision.

---

### The Anti-Pattern: Context Stuffing

**Five Markers of Context Stuffing:**
1. **Reflexively expanding context windows** — "Just add more tokens!"
2. **Persisting everything "just in case"** — No clear retention criteria
3. **Chaining agents without boundaries** — Agent A passes everything to Agent B to Agent C
4. **Adding evaluations to mask inconsistency** — "We'll just retry until it's right"
5.
   **Normalized retries** — "It works if you run it 3 times" becomes acceptable

**Why It Fails:**
- **Reasoning Noise:** Thousands of irrelevant files compete for attention, degrading multi-hop logic
- **Context Rot:** Dead ends, past errors, irrelevant data accumulate → goal drift
- **Lost in the Middle:** Models prioritize beginning (primacy) and end (recency), ignore middle
- **Economic Waste:** Every query becomes expensive without accuracy gains
- **Quantitative Degradation:** Accuracy drops below 20% when context exceeds ~32k tokens

**The Hidden Costs:**
- Escalating token consumption
- Diluted attention across irrelevant material
- Reduced output confidence
- Cascading retries that waste time and money

---

### Real Context Engineering: Core Principles

**Five Foundational Principles:**
1. **Context without shape becomes noise**
2. **Structure > Volume**
3. **Retrieve with intent, not completeness**
4. **Small working contexts** (like short-term memory)
5. **Context Compaction:** Maximize density of relevant information per token

**Quantitative Framework:**
```
Efficiency = (Accuracy × Coherence) / (Tokens × Latency)
```

**Key Finding:** Using RAG with 25% of available tokens preserves 95% accuracy while significantly reducing latency and cost.

---

### The 5 Diagnostic Questions (Detect Context Hoarding Disorder)

Ask these to identify context stuffing:

1. **What specific decision does this support?** — If you can't answer, you don't need it
2. **Can retrieval replace persistence?** — Just-in-time beats always-available
3. **Who owns the context boundary?** — If no one, it'll grow forever
4. **What fails if we exclude this?** — If nothing breaks, delete it
5.
   **Are we fixing structure or avoiding it?** — Stuffing context often masks bad information architecture

---

### Memory Architecture: Two-Layer System

**Short-Term (Conversational) Memory:**
- Immediate interaction history for follow-up questions
- Challenge: Space management → older parts summarized or truncated
- Lifespan: Single session

**Long-Term (Persistent) Memory:**
- User preferences, key facts across sessions → deep personalization
- Implemented via vector database (semantic retrieval)
- Two types:
  - **Declarative Memory:** Facts ("I'm vegan")
  - **Procedural Memory:** Behavioral patterns ("I debug by checking logs first")
- Lifespan: Persistent across sessions

**LLM-Powered ETL:** Models generate their own memories by identifying signals, consolidating with existing data, updating database automatically.

---

### The Research → Plan → Reset → Implement Cycle

**The Context Rot Solution:**

1. **Research:** Agent gathers data → large, chaotic context window (noise + dead ends)
2. **Plan:** Agent synthesizes into high-density SPEC.md or PLAN.md (Source of Truth)
3. **Reset:** **Clear entire context window** (prevents context rot)
4. **Implement:** Fresh session using **only** the high-density plan as context

**Why This Works:** Context rot is eliminated; agent starts clean with compressed, high-signal context.

---

### Anti-Patterns (What This Is NOT)

- **Not about choosing AI tools** — Claude vs.
  ChatGPT doesn't matter; architecture matters
- **Not about writing better prompts** — This is systems design, not copywriting
- **Not about adding more tokens** — "Infinite context" narratives are marketing, not engineering reality
- **Not about replacing human judgment** — Context engineering amplifies judgment, doesn't eliminate it

---

### When to Use This Skill

✅ **Use this when:**
- You're pasting entire PRDs/codebases into AI and getting vague responses
- AI outputs are inconsistent ("works sometimes, not others")
- You're burning tokens without seeing accuracy improvements
- You suspect you're "context stuffing" but don't know how to fix it
- You need to design context architecture for an AI product feature

❌ **Don't use this when:**
- You're just getting started with AI (start with basic prompts first)
- You're looking for tool recommendations (this is about architecture, not tooling)
- Your AI usage is working well (if it ain't broke, don't fix it)

---

### Facilitation Source of Truth

Use [`workshop-facilitation`](../workshop-facilitation/SKILL.md) as the default interaction protocol for this skill. It defines:
- session heads-up + entry mode (Guided, Context dump, Best guess)
- one-question turns with plain-language prompts
- progress labels (for example, Context Qx/8 and Scoring Qx/5)
- interruption handling and pause/resume behavior
- numbered recommendations at decision points
- quick-select numbered response options for regular questions (include `Other (specify)` when useful)

This file defines the domain-specific assessment content. If there is a conflict, follow this file's domain logic.

## Application

This interactive skill uses **adaptive questioning** to diagnose context stuffing, identify boundaries, and provide tactical implementation guidance.

---

### Step 0: Gather Context

**Agent asks:**

Before we diagnose your context practices, let's gather information:

**Current AI Usage:**
- What AI tools/systems do you use?
  (ChatGPT, Claude, custom agents, etc.)
- What PM tasks do you use AI for? (PRD writing, user research synthesis, discovery, etc.)
- How do you provide context? (paste docs, reference files, use projects/memory)

**Symptoms:**
- Are AI outputs inconsistent? (works sometimes, not others)
- Are you retrying prompts multiple times to get good results?
- Are responses vague or hedged despite providing "all the context"?
- Are token costs escalating without accuracy improvements?

**System Architecture (if applicable):**
- Do you have custom AI agents or workflows?
- How is context shared between agents?
- Do you use RAG, vector databases, or memory systems?

**You can describe briefly or paste examples.**

---

### Step 1: Diagnose Context Stuffing Symptoms

**Agent asks:**

Let's assess whether you're experiencing **context stuffing**. Which of these symptoms do you recognize?

**Select all that apply:**

1. **"I paste entire documents into AI"** — Full PRDs, complete user interview transcripts, entire codebases
2. **"AI gives vague, hedged responses despite having 'all the context'"** — Responses like "it depends," "consider these options," non-committal
3. **"I have to retry prompts 3+ times to get usable output"** — Inconsistency is normalized
4. **"Token costs are escalating but accuracy isn't improving"** — Spending more, getting same or worse results
5. **"I keep adding more context hoping it'll help"** — Reflexive expansion without strategy
6. **"My agents pass everything to each other"** — Agent A → Agent B → Agent C with full context chain
7. **"I don't have clear criteria for what to include/exclude"** — No context boundary definitions
8.
   **"None of these—my AI usage is working well"** — Skip to advanced optimization

**User response:** [Select symptoms]

**Agent analyzes:**

Based on your selections:
- **0-1 symptoms:** Healthy context practices; proceed to optimization
- **2-3 symptoms:** Early context stuffing; address before it scales
- **4+ symptoms:** Active Context Hoarding Disorder; immediate intervention needed

**Agent proceeds to diagnostic questions.**

---

### Step 2: Diagnostic Question 1 — What Specific Decision Does This Support?

**Agent asks:**

Let's start with the most critical question: **For each piece of context you're including, what specific decision does it support?**

**Example Context Analysis:**

Imagine you're asking AI to "help with discovery planning." You're providing:
- Entire PRD (20 pages)
- 50 user interview transcripts (full)
- Competitive analysis doc (15 pages)
- Team meeting notes from last 3 months

**Question:** What decision are you making right now?

**Offer 3 scenarios:**

1. **"I'm deciding which user segment to interview first"**
   - **Context needed:** User segments from PRD (2 paragraphs), prior interview themes (1 page synthesis), not full transcripts
   - **Context NOT needed:** Meeting notes, full competitive analysis, full PRD

2. **"I'm deciding which discovery questions to ask in interviews"**
   - **Context needed:** Research objectives (from PRD), past interview insights (synthesis), Jobs-to-be-Done framework
   - **Context NOT needed:** Full competitive analysis, full meeting notes

3. **"I'm not sure what decision I'm making—I just want AI to 'understand my product'"**
   - **Problem:** No specific decision = context stuffing trap
   - **Fix:** Define the decision first, then select context

**Agent recommends:**

**Best Practice:** Before adding context, complete this sentence:
> "I need this context because I'm deciding [specific decision], and without [specific information], I can't make that decision."
If you can't complete that sentence, you don't need the context.

**User response:** [Describe their decision + context]

**Agent validates:** Does the context directly support the stated decision? If not, recommend trimming.

---

### Step 3: Diagnostic Question 2 — Can Retrieval Replace Persistence?

**Agent asks:**

**Second question: Is this information you always need, or something you can retrieve just-in-time?**

**The Distinction:**

**Always-Needed (Persist):**
- Core product constraints (technical, regulatory, strategic)
- User preferences that apply to every interaction
- Critical definitions (operational glossary)
- Non-negotiable rules

**Episodic (Retrieve on-demand):**
- Project-specific details (this epic, this sprint)
- Historical data (past PRDs, old interview transcripts)
- Contextual facts (competitive analysis, market research)
- Temporary decisions

**Key Insight:** Just-in-time retrieval beats always-available. Don't persist what you can retrieve.

**Offer 3 options:**

1. **"Most of my context is always-needed (core constraints, user prefs)"**
   - **Assessment:** Good instinct; verify with Question 4 (what fails if excluded?)
   - **Recommendation:** Build constraints registry and operational glossary (persist these)

2. **"Most of my context is episodic (project details, historical data)"**
   - **Assessment:** Perfect candidate for RAG or retrieval
   - **Recommendation:** Implement semantic search; retrieve only relevant chunks for each query

3. **"I'm not sure which is which—I persist everything to be safe"**
   - **Assessment:** Classic Context Hoarding Disorder symptom
   - **Fix:** Apply Question 4 test to each piece of context

**Agent recommends:**

**Rule of Thumb:**
- **Persist:** Information referenced in 80%+ of interactions
- **Retrieve:** Information referenced in <20% of interactions
- **Gray zone (20-80%):** Depends on retrieval latency vs.
  context window cost

**User response:** [Categorize their context]

**Agent provides:** Specific recommendations on what to persist vs. retrieve.

---

### Step 4: Diagnostic Question 3 — Who Owns the Context Boundary?

**Agent asks:**

**Third question: Who is responsible for defining what belongs in vs. out of your AI's context?**

**The Ownership Problem:**

If **no one** owns the context boundary, it will grow indefinitely. Every PM will add "just one more thing," and six months later, you're stuffing 100k tokens per query.

**Offer 3 options:**

1. **"I own the boundary (solo PM or small team)"**
   - **Assessment:** Good—you can make fast decisions
   - **Recommendation:** Document your boundary criteria (use Questions 1-5 as framework)

2. **"My team shares ownership (collaborative boundary definition)"**
   - **Assessment:** Can work if formalized
   - **Recommendation:** Create a "Context Manifest" doc: what's always included, what's retrieved, what's excluded (and why)

3. **"No one owns it—it's ad-hoc / implicit"**
   - **Assessment:** Critical risk; boundary will expand uncontrollably
   - **Fix:** Assign explicit ownership; schedule quarterly context audits

**Agent recommends:**

**Best Practice: Create a Context Manifest**

```markdown
# Context Manifest: [Product/Feature Name]

## Always Persisted (Core Context)
- Product constraints (technical, regulatory)
- User preferences (role, permissions, preferences)
- Operational glossary (20 key terms)

## Retrieved On-Demand (Episodic Context)
- Historical PRDs (retrieve via semantic search)
- User interview transcripts (retrieve relevant quotes)
- Competitive analysis (retrieve when explicitly needed)

## Excluded (Out of Scope)
- Meeting notes older than 30 days (no longer relevant)
- Full codebase (use code search instead)
- Marketing materials (not decision-relevant)

## Boundary Owner: [Name]
## Last Reviewed: [Date]
## Next Review: [Date + 90 days]
```

**User response:** [Describe current ownership model]

**Agent provides:**
Recommendation on formalizing ownership + template for Context Manifest.

---

### Step 5: Diagnostic Question 4 — What Fails if We Exclude This?

**Agent asks:**

**Fourth question: For each piece of context, what specific failure mode occurs if you exclude it?**

This is the **falsification test**. If you can't identify a concrete failure, you don't need the context.

**Offer 3 scenarios:**

1. **"If I exclude product constraints, AI will recommend infeasible solutions"**
   - **Failure Mode:** Clear and concrete
   - **Assessment:** Valid reason to persist constraints

2. **"If I exclude historical PRDs, AI won't understand our product evolution"**
   - **Failure Mode:** Vague and hypothetical
   - **Assessment:** Historical context rarely needed for current decisions
   - **Fix:** Retrieve PRDs only when explicitly referencing past decisions

3. **"If I exclude this, I'm not sure anything would break—I just include it to be thorough"**
   - **Failure Mode:** None identified
   - **Assessment:** Context stuffing; delete immediately

**Agent recommends:**

**The Falsification Protocol:** For each context element, complete this statement:
> "If I exclude [context element], then [specific failure] will occur in [specific scenario]."

**Examples:**
- ✅ Good: "If I exclude GDPR constraints, AI will recommend features that violate EU privacy law."
- ❌ Bad: "If I exclude this PRD, AI might not fully understand the product." (Vague)

**User response:** [Apply falsification test to their context]

**Agent provides:** List of context elements to delete (no concrete failure identified).

---

### Step 6: Diagnostic Question 5 — Are We Fixing Structure or Avoiding It?

**Agent asks:**

**Fifth question: Is adding more context solving a problem, or masking a deeper structural issue?**

**The Root Cause Question:**

Context stuffing often hides bad information architecture. Instead of fixing messy, ambiguous documents, teams add more documents hoping AI will "figure it out."

**Offer 3 options:**

1.
   **"I'm adding context because our docs are poorly structured/ambiguous"**
   - **Assessment:** You're masking a structural problem
   - **Fix:** Clean up the docs first (remove ambiguity, add constraints, define terms)
   - **Example:** Instead of pasting 5 conflicting PRDs, reconcile them into 1 Source of Truth

2. **"I'm adding context because we don't have a shared operational glossary"**
   - **Assessment:** You're compensating for missing foundations
   - **Fix:** Build the glossary (20-30 key terms); AI can reference it reliably
   - **Example:** Define "active user," "churn," "engagement" unambiguously

3. **"I'm adding context because our constraints aren't documented"**
   - **Assessment:** You're avoiding constraint engineering
   - **Fix:** Create constraints registry (technical, regulatory, strategic)
   - **Example:** Document "We won't build mobile apps" vs. explaining it in every prompt

**Agent recommends:**

**The Structural Health Test:** If you're adding context to compensate for:
- **Ambiguous documentation** → Fix the docs, don't add more
- **Undefined terms** → Build operational glossary
- **Undocumented constraints** → Create constraints registry
- **Conflicting information** → Reconcile into Source of Truth

**User response:** [Identify structural issues]

**Agent provides:** Prioritized list of structural fixes before adding more context.

---

### Step 7: Define Memory Architecture

**Agent asks:**

Based on your context needs, let's design a **two-layer memory architecture**.
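A two-layer split like the one this step designs can be sketched in a few lines. The classes and fields below are hypothetical stand-ins: short-term memory as a bounded session buffer, long-term memory as a persistent store with declarative and procedural entries. A real implementation would summarize evicted turns instead of dropping them and would back the long-term layer with a vector database.

```python
# Illustrative sketch of the two-layer memory architecture: a bounded
# short-term buffer for the current session, and a persistent long-term
# store split into declarative facts and procedural patterns.

class ShortTermMemory:
    """Session-scoped: keeps only the most recent turns."""
    def __init__(self, max_turns=6):
        self.max_turns = max_turns
        self.turns = []

    def add(self, turn):
        self.turns.append(turn)
        # Older turns fall out of the working context
        # (a stand-in for summarization or truncation).
        self.turns = self.turns[-self.max_turns:]

class LongTermMemory:
    """Persistent across sessions: declarative facts and procedural patterns."""
    def __init__(self):
        self.declarative = {}   # facts, e.g. "regulation" -> "HIPAA"
        self.procedural = []    # behavioral patterns

    def remember_fact(self, key, value):
        self.declarative[key] = value

    def remember_pattern(self, pattern):
        self.procedural.append(pattern)
```

The point of the split is lifespan: everything in `ShortTermMemory` is disposable at session end, while `LongTermMemory` entries survive and are retrieved with intent.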
**Your Context Profile (from previous steps):**
- Always-needed context: [Summary from Q2]
- Episodic context: [Summary from Q2]
- Boundary owner: [From Q3]
- Validated essentials: [From Q4]
- Structural fixes needed: [From Q5]

**Recommended Architecture:**

**Short-Term (Conversational) Memory:**
- **What it stores:** Immediate interaction history for follow-up questions
- **Lifespan:** Single session
- **Management:** Summarize or truncate older parts to avoid crowding
- **Your specific needs:** [Agent customizes based on user's workflow]

**Long-Term (Persistent) Memory:**
- **What it stores:** User preferences, core constraints, operational glossary
- **Lifespan:** Persistent across sessions
- **Implementation:** Vector database (semantic retrieval)
- **Two types:**
  - **Declarative Memory:** Facts (e.g., "We follow HIPAA regulations")
  - **Procedural Memory:** Behavioral patterns (e.g., "Always validate feasibility before usability")
- **Your specific needs:** [Agent customizes]

**Retrieval Strategy (Episodic Context):**
- **What it retrieves:** Historical PRDs, user interviews, competitive analysis
- **Method:** Semantic search triggered by query intent
- **Optimization:** Contextual Retrieval (Anthropic) — prepend explanatory context to each chunk before embedding
- **Your specific needs:** [Agent customizes]

**Agent offers:**

Would you like me to:
1. **Generate a Context Architecture Blueprint** for your specific use case?
2. **Provide implementation guidance** (tools, techniques, best practices)?
3. **Design a retrieval strategy** for your episodic context?

**User response:** [Selection]

---

### Step 8: Implement Research → Plan → Reset → Implement Cycle

**Agent asks:**

Now let's implement the **Research → Plan → Reset → Implement** cycle to prevent context rot.

**The Problem:** As agents research, context windows grow chaotic—filled with dead ends, errors, and noise. This dilutes attention and causes goal drift.
**The Solution:** Compress research into a high-density plan, then **clear the context window** before implementing.

**The Four-Phase Cycle:**

**Phase 1: Research (Chaotic Context Allowed)**
- Agent gathers data from multiple sources
- Context window grows large and messy (this is expected)
- Dead ends, failed hypotheses, and noise accumulate
- **Goal:** Comprehensive information gathering

**Phase 2: Plan (Synthesis)**
- Agent synthesizes research into a high-density SPEC.md or PLAN.md
- This becomes the **Source of Truth** for implementation
- **Key elements:**
  - Decision made
  - Evidence supporting decision
  - Constraints applied
  - Next steps (sequenced)
- **Format:** Structured, concise, unambiguous

**Phase 3: Reset (Clear Context Window)**
- **Critical step:** Clear the entire context window
- Delete all research artifacts, dead ends, errors
- This prevents context rot from poisoning implementation

**Phase 4: Implement (Fresh Session with Plan Only)**
- Start a new session with **only the high-density plan** as context
- Agent has clean, focused attention on execution
- No noise from research phase

**Agent offers 3 options:**

1. **"I want a template for the PLAN.md format"**
   - Agent provides structured template for high-density plans

2. **"I want to see an example of this cycle in action"**
   - Agent walks through concrete PM use case (e.g., discovery planning)

3. **"I'm ready to implement this in my workflow"**
   - Agent provides step-by-step implementation guide

**User response:** [Selection]

**Agent provides:** Tailored guidance based on selection.

---

### Step 9: Action Plan & Next Steps

**Agent synthesizes:**

Based on your context engineering assessment, here's your action plan:

**Immediate Fixes (This Week):**
1. [Delete context with no falsifiable failure mode from Q4]
2. [Apply Research→Plan→Reset→Implement to your next AI task]
3. [Document context boundary in Context Manifest]

**Foundation Building (Next 2 Weeks):**
1.
   [Build constraints registry with 20+ entries]
2. [Create operational glossary with 20-30 key terms]
3. [Implement two-layer memory architecture]

**Long-Term Optimization (Next Month):**
1. [Set up semantic retrieval for episodic context]
2. [Assign context boundary owner + quarterly audit schedule]
3. [Implement Contextual Retrieval (Anthropic) for RAG]

**Success Metrics:**
- Token usage down 50%+ (less context stuffing)
- Output consistency up (less retry/regeneration)
- Response quality up (sharper, less hedged answers)
- Context window stable (no unbounded growth)

**Agent offers:**

Would you like me to:
1. **Generate specific implementation docs** (Context Manifest, PLAN.md template, etc.)?
2. **Provide advanced techniques** (Contextual Retrieval, LLM-powered ETL)?
3. **Review your current context setup** (provide feedback on specific prompts/workflows)?

---

## Examples

### Example 1: Solo PM Context Stuffing → Engineering

**Context:**
- Solo PM at early-stage startup
- Using Claude Projects for PRD writing
- Pasting entire PRDs (20 pages) + all user interviews (50 transcripts) every time
- Getting vague, inconsistent responses

**Assessment:**
- Symptoms: Hedged responses, normalized retries (4+ symptoms)
- Q1 (Decision): "I just want AI to understand my product" (no specific decision)
- Q2 (Persist/Retrieve): Persisting everything (no retrieval strategy)
- Q3 (Ownership): No formal owner (solo PM, ad-hoc)
- Q4 (Failure): Can't identify concrete failures for most context
- Q5 (Structure): Avoiding constraint documentation

**Diagnosis:** Active Context Hoarding Disorder

**Intervention:**
1. **Immediate:** Delete all context that fails Q4 test → keeps 20% of original
2. **Week 1:** Build constraints registry (10 technical constraints, 5 strategic)
3. **Week 2:** Create operational glossary (25 terms)
4. **Week 3:** Implement Research→Plan→Reset→Implement for next PRD

**Outcome:** Token usage down 70%, output quality up significantly, responses crisp and actionable.
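The Research → Plan → Reset → Implement cycle applied in the intervention above can be sketched as a pipeline in which only the compressed plan survives the Reset boundary. The three phase functions are placeholder parameters, not real agent calls:

```python
# Sketch of the Research -> Plan -> Reset -> Implement cycle as a
# context-window discipline: only the high-density plan crosses the
# Reset boundary into the implementation phase.

def run_cycle(task, research_fn, plan_fn, implement_fn):
    # Phase 1: Research -- the working context is allowed to get messy.
    context = research_fn(task)      # large, chaotic, includes dead ends

    # Phase 2: Plan -- compress everything into a high-density source of truth.
    plan = plan_fn(context)          # e.g. the contents of PLAN.md

    # Phase 3: Reset -- drop the entire research context to prevent context rot.
    del context

    # Phase 4: Implement -- a fresh session sees only the plan.
    return implement_fn(plan)
```

The structural point is the `del context` line: nothing from the research phase (dead ends, errors, noise) is reachable during implementation, which is what prevents context rot from poisoning execution.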
---

### Example 2: Growth-Stage Team with Agent Chains

**Context:**
- Product team with 5 PMs
- Custom AI agents for discovery synthesis
- Agent A (research) → Agent B (synthesis) → Agent C (recommendations)
- Each agent passes full context to next → context window explodes to 100k+ tokens

**Assessment:**
- Symptoms: Escalating token costs, inconsistent outputs (3 symptoms)
- Q1 (Decision): Each agent has clear decision, but passes unnecessary context
- Q2 (Persist/Retrieve): Mixing persistent and episodic without strategy
- Q3 (Ownership): No explicit owner; each PM adds context
- Q4 (Failure): Agents pass "just in case" context with no falsifiable failure
- Q5 (Structure): Missing Context Manifest

**Diagnosis:** Agent orchestration without boundaries

**Intervention:**
1. **Immediate:** Define bounded context per agent (Agent A outputs only 2-page synthesis to Agent B, not full research)
2. **Week 1:** Assign context boundary owner (Lead PM)
3. **Week 2:** Create Context Manifest (what persists, what's retrieved, what's excluded)
4. **Week 3:** Implement Research→Plan→Reset→Implement between Agent B and Agent C

**Outcome:** Token usage down 60%, agent chain reliability up, costs reduced by 50%.

---

### Example 3: Enterprise with RAG but No Context Engineering

**Context:**
- Large enterprise with vector database RAG system
- "Stuff the whole knowledge base" approach (10,000+ documents)
- Retrieval returns 50+ chunks per query → floods context window
- Accuracy declining as knowledge base grows

**Assessment:**
- Symptoms: Vague responses despite "complete knowledge," normalized retries (2 symptoms)
- Q1 (Decision): Decisions clear, but retrieval has no intent (returns everything)
- Q2 (Persist/Retrieve): Good instinct to retrieve, but no filtering
- Q3 (Ownership): Engineering owns RAG, Product doesn't own context boundaries
- Q4 (Failure): Can't identify why 50 chunks needed vs.
  5
- Q5 (Structure): Knowledge base has no structure (flat documents, no metadata)

**Diagnosis:** Retrieval without intent (RAG as context stuffing)

**Intervention:**
1. **Immediate:** Limit retrieval to top 5 chunks per query (down from 50)
2. **Week 1:** Implement Contextual Retrieval (Anthropic) — prepend explanatory context to each chunk during indexing
3. **Week 2:** Add metadata to documents (category, recency, authority)
4. **Week 3:** Product team defines retrieval intent per query type (discovery = customer insights, feasibility = technical constraints)

**Outcome:** Accuracy up 35% (from Anthropic benchmark), latency down 60%, token usage down 80%.

---

## Common Pitfalls

### 1. "Infinite Context" Marketing vs. Engineering Reality

**Failure Mode:** Believing "1 million token context windows" means you should use all of them.

**Consequence:** Reasoning Noise degrades performance; accuracy drops below 20% past ~32k tokens.

**Fix:** Context windows are not free. Treat tokens as scarce; optimize for density, not volume.

---

### 2. Retrying Instead of Restructuring

**Failure Mode:** "It works if I run it 3 times" → normalizing retries instead of fixing structure.

**Consequence:** Wastes time and money; masks deeper context rot issues.

**Fix:** If retries are common, your context structure is broken. Apply Q5 (fix structure, don't add volume).

---

### 3. No Context Boundary Owner

**Failure Mode:** Ad-hoc, implicit context decisions → unbounded growth.

**Consequence:** Six months later, every query stuffs 100k tokens per interaction.

**Fix:** Assign explicit ownership; create Context Manifest; schedule quarterly audits.

---

### 4. Mixing Always-Needed with Episodic

**Failure Mode:** Persisting historical data that should be retrieved on-demand.

**Consequence:** Context window crowded with irrelevant information; attention diluted.

**Fix:** Apply Q2 test: persist only what's needed in 80%+ of interactions; retrieve the rest.

---

### 5. Skipping the Reset Phase

**Failure Mode:** Never clearing context window during Research→Plan→Reset→Implement cycle.

**Consequence:** Context rot accumulates; goal drift; dead ends poison implementation.

**Fix:** Mandatory Reset phase after Plan; start implementation with only high-density plan as context.

---

## References

### Related Skills
- **[ai-shaped-readiness-advisor](../ai-shaped-readiness-advisor/SKILL.md)** (Interactive) — Context Design is Competency #1 of AI-shaped work
- **[problem-statement](../problem-statement/SKILL.md)** (Component) — Evidence-based framing requires context engineering
- **[epic-hypothesis](../epic-hypothesis/SKILL.md)** (Component) — Testable hypotheses depend on clear constraints (part of context)
- **[pol-probe-advisor](../pol-probe-advisor/SKILL.md)** (Interactive) — Validation experiments benefit from context engineering (define what AI needs to know)

### External Frameworks
- **Dean Peters** — [*Context Stuffing Is Not Context Engineering*](https://deanpeters.substack.com/p/context-stuffing-is-not-context-engineering) (Dean Peters' Substack, 2026)
- **Teresa Torres** — *Continuous Discovery Habits* (Context Engineering as one of 5 new AI PM disciplines)
- **Marty Cagan** — *Empowered* (Feasibility risk in AI era includes understanding "physics of AI")
- **Anthropic** — [Contextual Retrieval whitepaper](https://www.anthropic.com/news/contextual-retrieval) (35% failure rate reduction)
- **Google** — Context engineering whitepaper on LLM-powered memory systems

### Technical References
- **RAG (Retrieval-Augmented Generation)** — Standard technique for episodic context retrieval
- **Vector Databases** — Semantic search for long-term memory (Pinecone, Weaviate, Chroma)
- **Contextual Retrieval (Anthropic)** — Prepend explanatory context to chunks before embedding
- **LLM-as-Judge** — Automated evaluation of context quality
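As a closing illustration, the Contextual Retrieval idea cited above (prepending situating context to each chunk before embedding) can be sketched as follows. This is not Anthropic's implementation: the `describe` step would be performed by an LLM in practice, here it is an injected placeholder, and no real embedding is computed.

```python
# Sketch of the Contextual Retrieval indexing step: each chunk is prefixed
# with a short explanation of where it sits in its source document, so that
# retrieval can match it even when the chunk alone is ambiguous.

def contextualize_chunks(doc_title, chunks, describe):
    """Return (indexed_text, original_chunk) pairs ready for embedding."""
    indexed = []
    for i, chunk in enumerate(chunks):
        # Prepend situating context so the chunk is self-explanatory
        # at query time; describe() stands in for an LLM call.
        prefix = describe(doc_title, i, chunk)
        indexed.append((f"{prefix}\n{chunk}", chunk))
    return indexed
```

At query time the enriched `indexed_text` is what gets embedded and searched, while the original chunk is what gets placed in the context window, keeping retrieval precise without inflating the tokens the model actually reads.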