ai can read 18,000 tokens of config on every message — wasting tokens before your prompt even starts

memorytune compresses your Claude Code config — CLAUDE.md, memory files, skill descriptions — so you spend tokens on work, not overhead.

a typical session
type: full workday
duration: 4+ hours
model: opus 4.6
plan: max 5
effort: max
workday — max 5 plan, opus 4.6, max effort, 8am to 6pm
[chart: cumulative tokens (0–1.5M) across the day, tuned session vs standard config]
tuned session: server audit → network fix → visual upgrade → layout tuning → deploy + QA → still working at 6pm
standard config: hit the usage limit, max plan locked out 10am–2pm and again 4–6pm
standard config: 4 hours of work, 6 hours locked out. tuned config: full day, never hit the limit — both on max effort
cumulative impact — tokens remaining after each technique
baseline: 100%
+ hidden state: 82% (-18%)
+ mem dedup: 67% (-15%)
+ naked code: 55% (-12%)
+ fork loop: 47% (-8%)
+ code switch: 42% (-5%)
+ url inject: 35% (-7%)
6 techniques applied. 65% fewer tokens. 98.2% fidelity. same output quality.
compression fidelity — 112 question a/b test
instruction following: 28/28 pass
code generation: 23/24 (1 edge case)
architectural decisions: 20/20 pass
debugging + triage: 22/22 pass
context retention: 17/18 (1 edge case)
total: 110/112 — 98.2% fidelity, 2 documented edge cases

compressed notation
what it actually looks like
ai reads tokens, not grammar. every heading, bullet point, and adjective is overhead — charged on every message. these are real config blocks, compressed with 98.2% measured fidelity.
naked code — the only reader is the machine.
before — 440 tokens
## Code Documentation
- Every function needs a docstring with description, args, returns, and examples
- Add inline comments above complex blocks explaining the reasoning, not the what
- README sections for each module with architecture overview and data flow
- Type annotations on all function signatures and class attributes
- Changelog entries for every modification
after — 42 tokens
docs:none—ai reads source directly
types:yes,skip obvious
no readme,no changelog,no docstrings
code IS the context
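a quick way to sanity-check a compression pass before trusting the savings. this sketch uses the common ~4 characters per token heuristic — not Claude's actual tokenizer, so treat the numbers as rough estimates, and re-check against a real token count if the decision matters.

```python
# rough before/after check for a compressed config block.
# assumption: ~4 chars per token — a heuristic, not the real tokenizer.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

before = (
    "## Code Documentation\n"
    "- Every function needs a docstring with description, args, returns, and examples\n"
    "- Type annotations on all function signatures and class attributes\n"
)
after = "docs:none—ai reads source directly\ntypes:yes,skip obvious\n"

saving = 1 - estimate_tokens(after) / estimate_tokens(before)
print(f"estimated reduction: {saving:.0%}")
```

run it on each block as you compress — if the estimated reduction is small, the block probably wasn't the overhead worth attacking.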
mem dedup — save what matters. derive the rest.
before — 1,203 tokens
## Memory System
- Save important user preferences to memory
- Memory files go in the .claude/memory/ dir
- Include frontmatter with name, description, and type fields
- Update MEMORY.md index when saving memories
- Types: user, feedback, project, reference
- Don't save things derivable from code
- Don't save git history or debugging solutions
- Check for existing memory before creating new
after — 156 tokens
mem:save→.claude/memory/ w/ frontmatter(name,desc,type)
types:user|feedback|project|reference
update MEMORY.md index on save
skip:code-derivable,git-history,debug-fixes
dedup:check existing before new
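the dedup rule above, sketched as code to make the behavior concrete. the directory, frontmatter fields, and type list come from the config block; the function names and file layout are illustrative — Claude Code itself manages memories, this just encodes the "check existing before new" rule.

```python
from pathlib import Path

MEMORY_DIR = Path(".claude/memory")  # dir named in the config block above
TYPES = {"user", "feedback", "project", "reference"}

def find_existing(name: str):
    """dedup: check existing memory files before creating a new one."""
    if not MEMORY_DIR.exists():
        return None
    for f in MEMORY_DIR.glob("*.md"):
        head = f.read_text().splitlines()[:6]  # frontmatter only
        if f"name: {name}" in (line.strip() for line in head):
            return f
    return None

def save_memory(name: str, desc: str, mem_type: str, body: str) -> Path:
    assert mem_type in TYPES, f"unknown type: {mem_type}"
    existing = find_existing(name)
    if existing:
        return existing  # reuse, don't duplicate
    MEMORY_DIR.mkdir(parents=True, exist_ok=True)
    path = MEMORY_DIR / f"{name}.md"
    path.write_text(
        f"---\nname: {name}\ndescription: {desc}\ntype: {mem_type}\n---\n{body}\n"
    )
    return path
```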
code switch — write for the reader. the reader is a tokenizer.
before — 380 tokens
## Response Behavior
Please keep your responses concise and focused on the task at hand. Do not include unnecessary preamble, summaries, or pleasantries. When you reference code, always include the file path and line number so the user can navigate directly to the relevant section. If you encounter an error, explain what went wrong and suggest a fix rather than just showing the error message.
after — 38 tokens
resp:concise,task-focused,no filler
code ref→filepath:line always
error→explain+fix,not just dump
fork loop — when stuck, fork. don't loop.
without fork loop
ssh connection refused
→ retry with -v flag → connection refused
→ try port 22 explicitly → connection refused
→ try username@ip instead → connection refused
→ check firewall... same error
→ retry original command → connection refused
→ ask user for help
6 attempts, same wall
fork loop
ssh connection refused ×2 → fork:
agent A keeps ssh debug
agent B checks routing + firewall
→ B finds: no internet forwarding to host
→ B fixes route, ssh connects
solved in 2 steps, not 6
2 attempts trigger fork
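the trigger condition is the whole trick: fork when the same error comes back twice, not after a fixed retry budget. a minimal sketch — `attempt` and `fork` are stand-ins for agent actions, not a real api.

```python
# fork-loop rule as code: retry a step, but once the SAME error
# repeats max_repeats times, stop looping and fork to a second path.
def run_with_fork(attempt, fork, max_repeats=2):
    last, repeats = None, 0
    while True:
        try:
            return attempt()
        except RuntimeError as err:
            repeats = repeats + 1 if str(err) == last else 1
            last = str(err)
            if repeats >= max_repeats:
                return fork(last)  # agent B: check routing + firewall
```

note the counter resets when the error *changes* — a new error means the loop is still making progress, so only a repeated wall triggers the fork.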
hidden state — transformer runs. SSM thinks.
before — 410 tokens
## Working Memory
- Before each response, mentally review all prior context to maintain continuity
- Keep a running summary of decisions made, files changed, and approaches tried
- When starting a new task, check if similar work was done earlier in the session
- Compress old context when approaching limits — preserve decisions, drop details
- Carry architectural understanding forward between messages, never start cold
after — 39 tokens
mem:compressed state,not full replay
decisions+changes→persist,details→drop
similar prior work→reuse,don't redo
architecture→carry forward always
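"decisions persist, details drop" is a filter, and a filter is easy to sketch. the event shapes below are hypothetical — real transcripts are whatever your agent logs — but the keep/drop split is the rule from the block above.

```python
# hidden-state rule as code: between messages, persist decisions and
# file changes; drop tool output, retries, and chatter (re-derivable).
def compact(transcript, state=None):
    state = state or {"decisions": [], "files_changed": set()}
    for event in transcript:
        if event["kind"] == "decision":
            state["decisions"].append(event["summary"])
        elif event["kind"] == "edit":
            state["files_changed"].add(event["path"])
        # every other event kind is dropped on purpose
    return state
```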
url inject — skip the form. drop the value.
before — 520 tokens
> "go to the project settings"
1. navigate to dashboard.example.com
2. click "Projects" in the sidebar
3. find the project named "api-v2"
4. click the gear icon
5. dialog: "Save changes?" → click OK
6. scroll to "Webhooks" section
7. type the new URL into the field
8. click "Save"
after — 18 tokens
navigate→dashboard.example.com/projects/api-v2/settings#webhooks
inject URL value→save
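the compression here is replacing eight ui steps with one constructed deep link. a tiny sketch — the route shape mirrors the example above and is an assumption; every dashboard has its own url scheme, so map yours once and reuse it.

```python
# url inject as code: build the deep link once instead of replaying
# navigation steps. host/project/section names follow the example above.
def settings_url(host: str, project: str, section: str) -> str:
    return f"https://{host}/projects/{project}/settings#{section}"

print(settings_url("dashboard.example.com", "api-v2", "webhooks"))
```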

session log
what happened in 4 hours without hitting the usage limit
0:00
CODEBASE AUDIT
Full repository scan across 3 services. 140 files analyzed, dependency tree mapped, 51 issues flagged.
0:45
API INTEGRATION
Payment provider connected from scratch. Endpoint mapping, webhook validation, error handling, retry logic. Tested end-to-end with live sandbox.
1:30
FRONTEND REBUILD
Dashboard rewritten from wireframes. 12 components, responsive grid, real-time data binding.
2:30
DATA MIGRATION
Schema redesigned for scale. Migration script with rollback safety. 50K records moved, integrity verified, zero downtime.
3:30
DEPLOY + MONITOR
CI/CD pipeline assembled. Automated test suite, staging deploy, production cutover.
5 major tasks. max effort the entire session. 1 context compaction. 0 usage limits hit.

questions
does compressed notation actually work?
112 questions tested across instruction following, code generation, architectural reasoning, debugging, and context retention. identical agents, one verbose, one compressed. 110 passed, 2 edge cases: one complex regex generation lost a capture group, one long-session context recall drifted on variable names. both fixed by backing off compression on those blocks. ai tokenizes into subwords, not grammar — the grammar is overhead paid on every message.
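the shape of that a/b run, sketched so you can reproduce it on your own config. `ask` is a stand-in for whatever runs a prompt against an agent plus a config — not a real api — and `grade` is your per-task pass check (the real test graded per-task, not exact string match).

```python
# a/b fidelity harness sketch: same questions, verbose vs compressed
# config, count agreements. `ask` and `grade` are caller-supplied stubs.
def ab_fidelity(ask, verbose_cfg, compressed_cfg, questions, grade):
    passed = sum(
        grade(ask(verbose_cfg, q), ask(compressed_cfg, q)) for q in questions
    )
    return passed, len(questions), passed / len(questions)
```

any question that fails points at the exact config block to back off — which is how the two edge cases above were fixed.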
will this break my setup?
no. ai reads the same instructions — just fewer tokens to express them. test against the original after compressing. if comprehension drops on any task, you went too far on that section — back it off.
what tools does this apply to?
anything ai reads before responding. Claude Code, Cursor, Windsurf, Aider, custom agents. if ai spends tokens on config before doing your work, compression helps.
how long does compression take?
typical CLAUDE.md: 30-60 minutes following the patterns above. memory files and skill descriptions add time. the notation examples on this page cover the core patterns.
what about effort levels?
claude code has four effort levels: low, medium, high, and max. higher effort means deeper reasoning but burns tokens faster. this session ran entirely on max. dropping to high or medium for straightforward tasks can stretch the same token budget further — another lever alongside compression.
who built this?
justin at marow.ai. solo engineer, former platform team lead. i built this after watching my own token budget burn on config overhead every day. the patterns came from 6 months of daily Claude Code sessions on max effort. [email protected] — happy to talk methodology.
research
adaptive quantization blueprinting
memorytune handles the prompt layer. this paper tackles the model layer — what if inference precision adapted per query?
Abstract. Current quantization methods apply fixed precision determined offline. The Quantization Blueprint Engine (QBE) analyzes each query at runtime and generates a per-layer precision blueprint: weight bit-widths, activation precisions, and KV-cache parameters across every transformer layer. QBE integrates JIT kernel synthesis, blueprint caching with semantic fingerprinting, FP4/FP6/FP8 tensor core execution, SLA-driven routing, mid-generation refresh, and type-aware quantization for hybrid transformer-SSM architectures.