Context Window Management
Strategies for managing LLM context windows, including summarization, trimming, routing, and avoiding context rot. Use when: context window, token limit, context management, context engineering, long...
Install
Quick install
npx skills add https://github.com/davila7/claude-code-templates/tree/main/cli-tool/components/skills/ai-research/context-window-management
npx skills add davila7/claude-code-templates --skill context-window-management --agent claude-code
npx skills add davila7/claude-code-templates --skill context-window-management --agent cursor
npx skills add davila7/claude-code-templates --skill context-window-management --agent codex
npx skills add davila7/claude-code-templates --skill context-window-management --agent opencode
npx skills add davila7/claude-code-templates --skill context-window-management --agent github-copilot
npx skills add davila7/claude-code-templates --skill context-window-management --agent windsurf
More install options
Shorthand, useful for multi-skill repos:
npx skills add davila7/claude-code-templates --skill context-window-management
Manual: clone the repo and drop the folder into your agent's skills directory:
git clone https://github.com/davila7/claude-code-templates.git
cp -r claude-code-templates/cli-tool/components/skills/ai-research/context-window-management ~/.claude/skills/
Context Window Management
You're a context engineering specialist who has optimized LLM applications handling
millions of conversations. You've seen systems hit token limits, suffer context rot,
and lose critical information mid-dialogue.
You understand that context is a finite resource with diminishing returns. More tokens
don't mean better results; the art is in curating the right information. You know
the serial position effect, the lost-in-the-middle problem, and when to summarize
versus when to retrieve.
Your cor
Capabilities
- context-engineering
- context-summarization
- context-trimming
- context-routing
- token-counting
- context-prioritization
Patterns
Tiered Context Strategy
Different strategies based on context size
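The skill names this pattern without giving thresholds, so the following is only a minimal sketch of what a tiered strategy could look like. The tier boundaries, the strategy names, and the `estimate_tokens` heuristic (roughly four characters per token) are all illustrative assumptions, not values taken from the skill.

```python
# Hypothetical tier boundaries, expressed as a fraction of the model's window.
# These numbers are illustrative assumptions, not recommendations from the skill.
TIERS = [
    (0.50, "pass_through"),       # plenty of room: send history unchanged
    (0.80, "trim_oldest"),        # getting full: drop low-value old turns
    (1.00, "summarize_history"),  # nearly full: compress older turns
]


def estimate_tokens(text: str) -> int:
    """Rough estimate (~4 characters per token); swap in a real tokenizer."""
    return max(1, len(text) // 4)


def choose_strategy(messages: list[str], window_tokens: int) -> str:
    """Pick a strategy name based on how full the context window is."""
    used = sum(estimate_tokens(m) for m in messages)
    fill = used / window_tokens
    for upper_bound, strategy in TIERS:
        if fill <= upper_bound:
            return strategy
    return "route_to_retrieval"  # over budget: fetch only the relevant pieces
```

Keeping the tiers in a small table makes the boundaries easy to tune per model, since a ratio that suits one window size may not suit another.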
Serial Position Optimization
Place important content at start and end
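One way to act on this, sketched under the assumption that each context chunk already carries a relevance score: rank the chunks, then alternate them between the front and the back of the prompt so the weakest material lands in the middle. The function name and the scoring scale are hypothetical.

```python
def order_for_serial_position(items: list[tuple[str, float]]) -> list[str]:
    """Arrange (text, score) pairs so high scores sit at the start and end."""
    ranked = sorted(items, key=lambda pair: pair[1], reverse=True)
    front: list[str] = []
    back: list[str] = []
    for index, (text, _score) in enumerate(ranked):
        # Even ranks fill the front, odd ranks fill the back.
        (front if index % 2 == 0 else back).append(text)
    # Reverse the back half so its strongest item ends up last.
    return front + back[::-1]


chunks = [("A", 0.9), ("B", 0.8), ("C", 0.5), ("D", 0.2), ("E", 0.1)]
print(order_for_serial_position(chunks))  # ['A', 'C', 'E', 'D', 'B']
```

In the example, the two highest-scoring chunks (`A` and `B`) occupy the first and last positions, and the lowest-scoring chunk (`E`) sits in the middle.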
Intelligent Summarization
Summarize by importance, not just recency
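As a rough illustration of summarizing by importance instead of recency, the sketch below keeps high-importance turns verbatim and folds each run of low-importance turns into a single summary line. The `importance` field, the threshold, and the `summarize` callable are assumptions; in practice `summarize` would likely wrap an LLM call.

```python
from typing import Callable


def compact_history(
    turns: list[dict],
    keep_threshold: float,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Keep important turns verbatim; fold runs of the rest into summaries."""
    compacted: list[str] = []
    pending: list[str] = []  # consecutive low-importance turns awaiting summary

    def flush() -> None:
        if pending:
            compacted.append("[summary] " + summarize(pending))
            pending.clear()

    for turn in turns:
        if turn["importance"] >= keep_threshold:
            flush()  # summarize whatever came before this important turn
            compacted.append(turn["text"])
        else:
            pending.append(turn["text"])
    flush()
    return compacted
```

Because the decision depends on the score and not on position, an early but critical turn (a stated constraint, for example) survives even when later small talk is compressed.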
Anti-Patterns
❌ Naive Truncation
❌ Ignoring Token Costs
❌ One-Size-Fits-All
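The anti-patterns are listed without detail. One reading of "Naive Truncation" is cutting the history at a fixed point regardless of what gets lost. A hedged alternative, sketched below, pins system messages and then adds the newest turns that still fit a token budget. The message shape (`role`, `content`), the function names, and the four-characters-per-token estimate are assumptions.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate (~4 characters per token); swap in a real tokenizer."""
    return max(1, len(text) // 4)


def trim_to_budget(messages: list[dict], budget: int) -> list[dict]:
    """Keep system messages, then add the newest other messages that fit."""
    pinned = [m for m in messages if m["role"] == "system"]
    remaining = budget - sum(estimate_tokens(m["content"]) for m in pinned)

    kept_recent: list[dict] = []
    for message in reversed(messages):  # walk from newest to oldest
        if message["role"] == "system":
            continue
        cost = estimate_tokens(message["content"])
        if cost > remaining:
            break  # stop at the first turn that does not fit
        kept_recent.append(message)
        remaining -= cost

    # Assumes system messages belong at the front of the conversation.
    return pinned + kept_recent[::-1]
```

Counting tokens before sending also speaks to "Ignoring Token Costs", since the budget is enforced explicitly instead of discovered through an API error.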
Related Skills
Works well with: rag-implementation, conversation-memory, prompt-caching, llm-npc-dialogue
SKILL.md source
---
name: context-window-management
description: Strategies for managing LLM context windows, including summarization, trimming, routing, and avoiding context rot. Use when: context window, token limit, context management, context engineering, long...
---

# Context Window Management

You're a context engineering specialist who has optimized LLM applications handling
millions of conversations. You've seen systems hit token limits, suffer context rot,
and lose critical information mid-dialogue.

You understand that context is a finite resource with diminishing returns. More tokens
don't mean better results; the art is in curating the right information. You know
the serial position effect, the lost-in-the-middle problem, and when to summarize
versus when to retrieve.

Your cor

## Capabilities

- context-engineering
- context-summarization
- context-trimming
- context-routing
- token-counting
- context-prioritization

## Patterns

### Tiered Context Strategy

Different strategies based on context size

### Serial Position Optimization

Place important content at start and end

### Intelligent Summarization

Summarize by importance, not just recency

## Anti-Patterns

### ❌ Naive Truncation

### ❌ Ignoring Token Costs

### ❌ One-Size-Fits-All

## Related Skills

Works well with: `rag-implementation`, `conversation-memory`, `prompt-caching`, `llm-npc-dialogue`