How Context Window Management fits into a Paperclip company.

Context Window Management drops into any Paperclip agent that manages LLM context: summarizing, trimming, or routing conversation history to stay within token limits. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat, with no prompt engineering and no tool wiring.

Source file

SKILL.md (315 lines, markdown)

---
name: context-window-management
description: Strategies for managing LLM context windows including
  summarization, trimming, routing, and avoiding context rot
risk: unknown
source: vibeship-spawner-skills (Apache 2.0)
date_added: 2026-02-27
---

# Context Window Management

Strategies for managing LLM context windows including summarization, trimming, routing, and avoiding context rot

## Capabilities

- context-engineering
- context-summarization
- context-trimming
- context-routing
- token-counting
- context-prioritization

## Prerequisites

- Knowledge: LLM fundamentals, Tokenization basics, Prompt engineering
- Skills_recommended: prompt-engineering

## Scope

- Does_not_cover: RAG implementation details, Model fine-tuning, Embedding models
- Boundaries: Focus is context optimization, Covers strategies not specific implementations

## Ecosystem

### Primary_tools

- tiktoken - OpenAI's tokenizer for counting tokens
- LangChain - Framework with context management utilities
- Claude API - 200K+ context with caching support

## Patterns

### Tiered Context Strategy

Different strategies based on context size

**When to use**: Building any multi-turn conversation system

```typescript
// Message, PreparedContext, countTokens, summarizeOldMessages, recentMessages
// and retrieveRelevant are assumed to be defined elsewhere.
interface ContextTier {
    maxTokens: number;
    strategy: 'full' | 'summarize' | 'rag';
    model: string;
}

const TIERS: ContextTier[] = [
    { maxTokens: 8000, strategy: 'full', model: 'claude-3-haiku' },
    { maxTokens: 32000, strategy: 'full', model: 'claude-3-5-sonnet' },
    { maxTokens: 100000, strategy: 'summarize', model: 'claude-3-5-sonnet' },
    { maxTokens: Infinity, strategy: 'rag', model: 'claude-3-5-sonnet' }
];

// Pick the first tier whose token ceiling fits the conversation
async function selectStrategy(messages: Message[]): Promise<ContextTier> {
    const tokens = await countTokens(messages);

    for (const tier of TIERS) {
        if (tokens <= tier.maxTokens) {
            return tier;
        }
    }
    return TIERS[TIERS.length - 1];
}

async function prepareContext(messages: Message[]): Promise<PreparedContext> {
    const tier = await selectStrategy(messages);

    switch (tier.strategy) {
        case 'full':
            return { messages, model: tier.model };

        // Braces scope the const declarations to their own case
        case 'summarize': {
            const summary = await summarizeOldMessages(messages);
            return { messages: [summary, ...recentMessages(messages)], model: tier.model };
        }

        case 'rag': {
            const relevant = await retrieveRelevant(messages);
            return { messages: [...relevant, ...recentMessages(messages)], model: tier.model };
        }
    }
}
```

### Serial Position Optimization

Place important content at start and end

**When to use**: Constructing prompts with significant context

```typescript
// LLMs tend to weight the beginning and end of a prompt more heavily
// than the middle. Structure prompts to leverage this.
// summarize, formatMessages and extractKeyConstraints are assumed helpers.

function buildOptimalPrompt(components: {
    systemPrompt: string;
    criticalContext: string;
    conversationHistory: Message[];
    currentQuery: string;
}): string {
    // START: System instructions (always first)
    const parts = [components.systemPrompt];

    // CRITICAL CONTEXT: Right after system (high primacy)
    if (components.criticalContext) {
        parts.push(`## Key Context\n${components.criticalContext}`);
    }

    // MIDDLE: Conversation history (lower weight)
    // Summarize if long, keep recent messages full
    const history = components.conversationHistory;
    if (history.length > 10) {
        const oldSummary = summarize(history.slice(0, -5));
        const recent = history.slice(-5);
        parts.push(`## Earlier Conversation (Summary)\n${oldSummary}`);
        parts.push(`## Recent Messages\n${formatMessages(recent)}`);
    } else {
        parts.push(`## Conversation\n${formatMessages(history)}`);
    }

    // END: Current query (high recency)
    // Restate critical requirements here
    parts.push(`## Current Request\n${components.currentQuery}`);

    // FINAL: Reminder of key constraints
    parts.push(`Remember: ${extractKeyConstraints(components.systemPrompt)}`);

    return parts.join('\n\n');
}
```

### Intelligent Summarization

Summarize by importance, not just recency

**When to use**: Context exceeds optimal size

```typescript
interface MessageWithMetadata extends Message {
    importance: number;  // 0-1 score
    hasCriticalInfo: boolean;  // User preferences, decisions
    referenced: boolean;  // Was this referenced later?
    timestamp: number;  // Needed to restore chronological order
}

// Ranking score: base importance plus bonuses for critical or referenced messages
const score = (m: MessageWithMetadata): number =>
    m.importance + (m.hasCriticalInfo ? 0.5 : 0) + (m.referenced ? 0.3 : 0);

const byTime = (a: MessageWithMetadata, b: MessageWithMetadata): number =>
    a.timestamp - b.timestamp;

async function smartSummarize(
    messages: MessageWithMetadata[],
    targetTokens: number
): Promise<Message[]> {
    // Sort by score, highest first; the sort is stable, so ties keep their order
    const sorted = [...messages].sort((a, b) => score(b) - score(a));

    const keep: MessageWithMetadata[] = [];
    const summarizePool: MessageWithMetadata[] = [];
    let currentTokens = 0;

    // Keep messages verbatim up to 70% of the target; the rest goes to the summary pool
    for (const msg of sorted) {
        const msgTokens = await countTokens([msg]);
        if (currentTokens + msgTokens < targetTokens * 0.7) {
            keep.push(msg);
            currentTokens += msgTokens;
        } else {
            summarizePool.push(msg);
        }
    }

    // Restore original order
    const result: Message[] = keep.sort(byTime);

    // Summarize the low-importance messages
    if (summarizePool.length > 0) {
        const summary = await llm.complete(`
            Summarize these messages, preserving:
            - Any user preferences or decisions
            - Key facts that might be referenced later
            - The overall flow of conversation

            Messages:
            ${formatMessages(summarizePool.sort(byTime))}
        `);

        // The summary has no timestamp, so prepend it after the sort
        result.unshift({ role: 'system', content: `[Earlier context: ${summary}]` });
    }

    return result;
}
```

### Token Budget Allocation

Allocate token budget across context components

**When to use**: Need predictable context management

```typescript
interface TokenBudget {
    system: number;      // System prompt
    criticalContext: number;  // User prefs, key info
    history: number;     // Conversation history
    query: number;       // Current query
    response: number;    // Reserved for response
}

function allocateBudget(totalTokens: number): TokenBudget {
    return {
        system: Math.floor(totalTokens * 0.10),      // 10%
        criticalContext: Math.floor(totalTokens * 0.15),  // 15%
        history: Math.floor(totalTokens * 0.40),     // 40%
        query: Math.floor(totalTokens * 0.10),       // 10%
        response: Math.floor(totalTokens * 0.25),    // 25%
    };
}

// ContextComponents, PreparedContext, truncateToTokens and summarizeToTokens
// are assumed to be defined elsewhere.
async function buildWithBudget(
    components: ContextComponents,
    modelMaxTokens: number
): Promise<PreparedContext> {
    const budget = allocateBudget(modelMaxTokens);

    // Truncate/summarize each component to fit budget
    const prepared = {
        system: truncateToTokens(components.system, budget.system),
        criticalContext: truncateToTokens(
            components.criticalContext, budget.criticalContext
        ),
        history: await summarizeToTokens(components.history, budget.history),
        query: truncateToTokens(components.query, budget.query),
    };

    // Reallocate unused budget
    const used = await countTokens(Object.values(prepared).join('\n'));
    const remaining = modelMaxTokens - used - budget.response;

    if (remaining > 0) {
        // Give extra to history (most valuable for conversation)
        prepared.history = await summarizeToTokens(
            components.history,
            budget.history + remaining
        );
    }

    return prepared;
}
```

## Validation Checks

### No Token Counting

Severity: WARNING

Message: Building context without token counting. May exceed model limits.

Fix action: Count tokens before sending, implement budget allocation

### Naive Message Truncation

Severity: WARNING

Message: Truncating messages without summarization. Critical context may be lost.

Fix action: Summarize old messages instead of simply removing them

### Hardcoded Token Limit

Severity: INFO

Message: Hardcoded token limit. Consider making configurable per model.

Fix action: Use model-specific limits from configuration

### No Context Management Strategy

Severity: WARNING

Message: LLM calls without context management strategy.

Fix action: Implement context management: budgets, summarization, or RAG

## Collaboration

### Delegation Triggers

- retrieval|rag|search -> rag-implementation (Need retrieval system)
- memory|persistence|remember -> conversation-memory (Need memory storage)
- cache|caching -> prompt-caching (Need caching optimization)

### Complete Context System

Skills: context-window-management, rag-implementation, conversation-memory, prompt-caching

Workflow:

```
1. Design context strategy
2. Implement RAG for large corpuses
3. Set up memory persistence
4. Add caching for performance
```

## Related Skills

Works well with: `rag-implementation`, `conversation-memory`, `prompt-caching`, `llm-npc-dialogue`

## When to Use
- User mentions or implies: context window
- User mentions or implies: token limit
- User mentions or implies: context management
- User mentions or implies: context engineering
- User mentions or implies: long context
- User mentions or implies: context overflow

## Limitations
- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.
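The patterns in this skill call helpers such as `countTokens`, `recentMessages`, and `truncateToTokens` without defining them. The block below is a minimal sketch of what they could look like, not the skill author's implementation. The `Message` shape, the `estimateTokens` helper, and the four-characters-per-token ratio are all assumptions made for illustration; a real tokenizer such as tiktoken would give accurate counts.

```typescript
// Hypothetical message shape; the skill file never defines `Message`.
interface Message {
    role: 'system' | 'user' | 'assistant';
    content: string;
}

// Assumption: roughly 4 characters per token. This is a coarse rule of thumb
// for English text, not a real tokenizer. Replace it with a proper tokenizer
// when accuracy matters.
const CHARS_PER_TOKEN = 4;

// Hypothetical helper: estimate the token count of a string.
function estimateTokens(text: string): number {
    return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Async to match how the patterns call it (`await countTokens(...)`).
// Accepts either a raw string or a list of messages.
async function countTokens(input: string | Message[]): Promise<number> {
    if (typeof input === 'string') {
        return estimateTokens(input);
    }
    return input.reduce((sum, m) => sum + estimateTokens(m.content), 0);
}

// Keep the last `count` messages. The default of 5 mirrors the
// `history.slice(-5)` used in the serial-position pattern.
// The guard matters because `slice(-0)` would return the whole array.
function recentMessages(messages: Message[], count = 5): Message[] {
    return count > 0 ? messages.slice(-count) : [];
}

// Cut text down to an estimated token budget, keeping the beginning.
function truncateToTokens(text: string, maxTokens: number): string {
    const maxChars = Math.max(0, maxTokens) * CHARS_PER_TOKEN;
    return text.length <= maxChars ? text : text.slice(0, maxChars);
}
```

Because the estimate is character-based, it can drift noticeably for code, non-English text, or long numbers, so it seems safest to treat it as a budgeting aid and leave a margin below the model's hard limit.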

Related skills

3d Web Experience

Install 3d Web Experience skill for Claude Code from sickn33/antigravity-awesome-skills.

Agent Memory Mcp

Install Agent Memory Mcp skill for Claude Code from sickn33/antigravity-awesome-skills.

Agent Memory Systems

Install Agent Memory Systems skill for Claude Code from sickn33/antigravity-awesome-skills.