The MCP server that turns Claude into the only coding agent hitting 100% on a real benchmark. -77% active tokens, -76% wall time, 0 losses across 96 tasks on Claude Opus 4.7. Structural code navigation + persistent memory. Works with every MCP client.
One MCP server. One profile. 97.9% on tsbench at -80% tokens.
Structural code navigation, persistent memory, and Bash output compaction for AI coding agents.
mibayy.github.io/token-savior -- project site + benchmark landing
github.com/Mibayy/tsbench -- benchmark source + fixtures
| | Plain Claude Code | With Token Savior |
|---|---|---|
| Score | 141 / 180 (78.3%) | 188 / 192 (97.9%) |
| Active tokens / task | 17 221 | 3 395 (-80%) |
| Wall time / task | 110.6 s | 18.9 s (-83%) |
Reproduces with the optimized profile (single env var). See BENCHMARK-SUMMARY.
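
The percentages in the table follow directly from the raw per-task figures. A small sketch of the arithmetic, using only the numbers quoted above (the helper name `pct_change` is illustrative and not part of Token Savior):

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures copied from the table above.
tokens = pct_change(17_221, 3_395)   # active tokens per task
wall = pct_change(110.6, 18.9)       # wall time per task, in seconds

print(f"active tokens: {tokens:.0f}%")
print(f"wall time:     {wall:.0f}%")
print(f"plain score:   {141 / 180:.1%}")
print(f"with TS score: {188 / 192:.1%}")
```
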
Real-world bench against 7 days of transcripts (1130 Bash outputs) drove
this release. Cumulative savings now sit at ~20.4 K tokens/week
(19.3% match rate, 68.9% mean compaction) vs ~12 K/week on v4.2.0.
- pytest regex: now matches python3 -m pytest, uv run pytest, poetry/hatch/pdm/rye run pytest.
- git: fetch, checkout, branch, worktree list, stash list.
- gh repo view, gh pr view, gh issue view, gh pr diff.
- grep, find, cat compactors. Group hits by file, strip common [...]
- cd /root/foo && git status now compacts [...] &&/; chains. Bails on [...]
- jest, vitest, eslint, biome, kubectl get/logs, aws sts/ec2/lambda/logs/iam/dynamodb/s3, npm/yarn/pnpm list, pip list/show, curl. Peaks: 91.7% on aws ec2, 95% on jest all-green.
- capture_get [...]
- ts init --agent {claude,cursor,gemini,codex} CLI. Detects the agent's settings location, dedups by (matcher, command), backs up settings.json, idempotent on re-run.
- ts_discover cross-project + format="adoption" reports TS-vs-native ratio per session.
- git status/diff/log/push/commit/add, pytest, cargo test/build/clippy, tsc, docker ps/logs, gh run list/view. Median 63%, peak 100% (a green pytest -q collapses to one line).
- git status -> --porcelain=v2 --branch, tsc -> --pretty false, pytest -> -q --tb=line, etc. 10 safe rules, [...]
- get_usage_stats v2. ASCII sparkline (30 d), daily breakdown table, [...] format="json".
- ts_discover. Scans ~/.claude/projects/*/*.jsonl for Read->Grep->Read chains, sequential find_symbol, edits without get_edit_context, memory_search without prior memory_index, native shell on code files.

    pip install "token-savior-recall[mcp]"

Add to your MCP config (e.g. Claude Code):

    {
      "mcpServers": {
        "token-savior-recall": {
          "command": "/path/to/venv/bin/token-savior",
          "env": {
            "WORKSPACE_ROOTS": "/path/to/project1,/path/to/project2",
            "TOKEN_SAVIOR_CLIENT": "claude-code",
            "TOKEN_SAVIOR_PROFILE": "optimized"
          }
        }
      }
    }

That's it. TOKEN_SAVIOR_PROFILE=optimized ships the Pareto-optimum
config that wins tsbench. It bundles:
- tiny_plus (15 hot tools manifest)

No other tuning needed.
Bash compaction and the PreToolUse rewriter are opt-in. Two env vars and
one CLI call:

    export TS_BASH_COMPACT=1       # PostToolUse output compactors (34 of them)
    export TS_BASH_REWRITE=1       # PreToolUse command rewriter (10 rules)
    ts init --agent claude --yes   # auto-merge hooks into ~/.claude/settings.json

ts init is idempotent. It detects existing hook entries, dedups by
(matcher, command), prints a unified diff, and backs up settings.json
to .bak-YYYYMMDD-HHMMSS (UTC) before writing. Supported agents:
claude, cursor, gemini, codex. Pass --dry-run to preview, or
--global to write the user-level config.
Optional audit log of every rewrite:

    export TS_BASH_REWRITE_LOG=$HOME/.local/state/token-savior/rewrites.jsonl

| Family | Compactors |
|---|---|
| git | status, diff, log, push/pull, commit, add, fetch, checkout, branch, worktree list, stash list |
| gh | run list, run view, pr diff, pr view, issue view, repo view |
| test/lint | pytest, jest, vitest, eslint, biome, cargo test, cargo build/clippy, tsc |
| cloud | kubectl get, kubectl logs, aws sts, aws ec2, aws lambda, aws logs, aws iam, aws dynamodb, aws s3 |
| docker | docker ps, docker logs |
| packaging | npm/yarn/pnpm list, pip list/show |
| shell catch-alls | grep, find, cat, curl |
Each compactor is a pure function (no I/O, no globals) returning a
token-efficient rendering. The dispatcher returns None when no matcher
fires, leaving the existing sandbox path untouched. Compound commands
(cd ... && cmd) fall through to the last meaningful segment.
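
The dispatcher contract described above can be sketched as follows. This is a minimal illustration and not Token Savior's actual code: the names `REGISTRY`, `dispatch`, `last_segment` and both compaction rules are hypothetical, the chain splitter is naive (it ignores quoting), and "last meaningful segment" is simplified to "last non-empty segment".

```python
import re
from typing import Callable, Optional

Compactor = Callable[[str], str]


def compact_pytest(output: str) -> str:
    """Keep only the final summary line of a pytest run (hypothetical rule)."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    return lines[-1].strip("= ") if lines else ""


def compact_git_status(output: str) -> str:
    """Drop blank lines and git's parenthesised hints (hypothetical rule)."""
    kept = [
        ln.rstrip()
        for ln in output.splitlines()
        if ln.strip() and not ln.strip().startswith("(")
    ]
    return "\n".join(kept)


# (matcher, compactor) pairs. Each compactor is pure: str in, str out.
REGISTRY: list[tuple[re.Pattern[str], Compactor]] = [
    (re.compile(r"^(python3? -m |uv run )?pytest\b"), compact_pytest),
    (re.compile(r"^git status\b"), compact_git_status),
]


def last_segment(command: str) -> str:
    """Return the last non-empty segment of a `&&` / `;` chain."""
    parts = [p.strip() for p in re.split(r"&&|;", command) if p.strip()]
    return parts[-1] if parts else ""


def dispatch(command: str, output: str) -> Optional[str]:
    """Return the compacted output, or None when no matcher fires."""
    segment = last_segment(command)
    for pattern, compactor in REGISTRY:
        if pattern.search(segment):
            return compactor(output)
    return None


print(dispatch("cd /root/foo && pytest -q", "....\n==== 4 passed in 0.12s ====\n"))
print(dispatch("ls -la", "total 0\n"))
```

Returning `None` instead of the unchanged text lets the caller distinguish "no compactor applies" from "the compactor had nothing to remove", which is what allows the existing path to stay untouched.
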
ts_discover -- find missed TS opportunities

New MCP tool that scans your Claude Code transcripts for patterns where
TS tools would have been cheaper than what the agent actually did.

    ts_discover()                        # active project, last 30 days
    ts_discover(project=None)            # ALL transcript projects
    ts_discover(format="adoption")       # TS vs native ratio per session
    ts_discover(format="adoption_json")  # same, JSON

Findings: Read->Grep->Read chains, sequential find_symbol, edits
without get_edit_context, memory_search without memory_index,
native shell on code files. Args are pruned to load-bearing keys
(PII-safe). Streams JSONL with mtime fast-skip.
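
To make the first finding concrete, here is a minimal sketch of detecting a Read->Grep->Read chain in a session's ordered tool calls. It assumes the tool names have already been extracted from the transcript into a list. The transcript schema itself is not shown here, and `find_chains` is a hypothetical helper, not the detector ts_discover uses.

```python
from typing import Iterable


def find_chains(
    tools: Iterable[str],
    pattern: tuple[str, ...] = ("Read", "Grep", "Read"),
) -> list[int]:
    """Return every start index where `pattern` occurs as a consecutive run."""
    seq = list(tools)
    width = len(pattern)
    return [
        i
        for i in range(len(seq) - width + 1)
        if tuple(seq[i : i + width]) == pattern
    ]


# Tool names in call order for one hypothetical session.
session = ["Read", "Grep", "Read", "Grep", "Read", "Bash"]
print(find_chains(session))
```

Overlapping matches are reported separately, so a long Read/Grep alternation shows up as several hits.
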
ts init CLI

    ts init --agent claude [--global] [--dry-run] [--yes]
    ts init --agent cursor
    ts init --agent gemini
    ts init --agent codex

Detects the target agent's settings location, deep-merges the Token
Savior hook config (PostToolUse + PreToolUse), preserves existing
hooks, dedups, prints a unified diff. Backs up to
settings.json.bak-YYYYMMDD-HHMMSS (UTC). Re-running is a no-op.
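
The merge behaviour described above (preserve existing hooks, dedup by (matcher, command), no-op on re-run) can be sketched like this. It is an illustration under assumptions, not the ts init implementation: the flat `{"matcher", "command"}` entry shape is a simplification (a real agent's settings schema may nest hooks differently), and `merge_hooks`, `backup_name` and the command strings are hypothetical.

```python
import copy
from datetime import datetime, timezone
from typing import Optional


def merge_hooks(settings: dict, new_hooks: dict) -> dict:
    """Return a copy of `settings` with `new_hooks` merged in, without duplicates."""
    merged = copy.deepcopy(settings)
    hooks = merged.setdefault("hooks", {})
    for event, entries in new_hooks.items():
        existing = hooks.setdefault(event, [])
        seen = {(e.get("matcher"), e.get("command")) for e in existing}
        for entry in entries:
            key = (entry.get("matcher"), entry.get("command"))
            if key not in seen:  # dedup by (matcher, command)
                existing.append(dict(entry))
                seen.add(key)
    return merged


def backup_name(path: str, now: Optional[datetime] = None) -> str:
    """Build a `.bak-YYYYMMDD-HHMMSS` backup path with a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    return f"{path}.bak-{now.strftime('%Y%m%d-%H%M%S')}"


current = {"hooks": {"PreToolUse": [{"matcher": "Bash", "command": "my-own-hook"}]}}
wanted = {
    "PreToolUse": [{"matcher": "Bash", "command": "example-rewrite-hook"}],
    "PostToolUse": [{"matcher": "Bash", "command": "example-compact-hook"}],
}

once = merge_hooks(current, wanted)
twice = merge_hooks(once, wanted)
print(once == twice)  # merging again changes nothing
print(backup_name("settings.json"))
```

Working on a deep copy keeps the original settings available for the diff and the backup before anything is written.
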
Claude Code reads whole files to answer questions about three lines, and
forgets everything the moment a session ends. Token Savior fixes both,
plus a third axis: it now compacts the noisy Bash output that bloats
turn budgets between code reads.
It indexes your codebase by symbol -- functions, classes, imports, call
graph -- so the model navigates by pointer instead of by cat. Measured
reduction: 97% fewer chars injected across 170+ real sessions.
On top of that sits a persistent memory engine. Every decision, bugfix,
convention, guardrail and session rollup is stored in SQLite WAL + FTS5.
Since v4.1, the Bash compactors and the PreToolUse rewriter sit on top
of both. Bench numbers above.
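
For readers unfamiliar with the storage combination, the following is a minimal sketch of a SQLite database in WAL journal mode with an FTS5 full-text table, using Python's standard `sqlite3` module. The table name, columns and helper functions are hypothetical and do not reflect Token Savior's actual schema. It also assumes the local SQLite build includes FTS5, which is common but not guaranteed.

```python
import os
import sqlite3
import tempfile


def open_memory(path: str) -> sqlite3.Connection:
    """Open (or create) a store in WAL mode with a full-text table."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS observations USING fts5(kind, body)"
    )
    return conn


def remember(conn: sqlite3.Connection, kind: str, body: str) -> None:
    """Store one observation."""
    conn.execute("INSERT INTO observations (kind, body) VALUES (?, ?)", (kind, body))
    conn.commit()


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Full-text search, best matches first."""
    cursor = conn.execute(
        "SELECT kind, body FROM observations "
        "WHERE observations MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cursor.fetchall()


with tempfile.TemporaryDirectory() as tmp:
    conn = open_memory(os.path.join(tmp, "memory.db"))
    remember(conn, "bugfix", "retry loop dropped the last chunk")
    remember(conn, "convention", "timestamps are stored in UTC")
    print(search(conn, "retry"))
    conn.close()
```

WAL mode lets readers proceed while a write is in progress, which suits a store that is queried during a session and appended to by hooks.
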
| Profile | Tools exposed | Manifest tokens | When to use |
|---|---|---|---|
| optimized | 15 | ~1.5 KT | Recommended default -- Pareto win on tsbench |
| auto | adaptive | ~1-2 KT | Per-client telemetry-based (experimental) |
| tiny | 6 | ~0.6 KT | Minimal hot loop |
| lean | 51 | ~4 KT | Legacy -- broader surface |
| full | 68 | ~6 KT | Everything exposed |
You probably want optimized.
| Operation | Plain Claude | Token Savior | Reduction |
|---|---|---|---|
| find_symbol("send_message") | 41M chars (full read) | 67 chars | -99.9% |
| get_function_source("compile") | grep + cat chain | 4.5K chars | direct |
| get_change_impact("LLMClient") | impossible | 16K chars | new capability |
| 96-task tsbench (Opus, plain vs ts) | 17 221 active/task | 3 395 active/task | -80% |
| 7-day Bash output bench (v4.3) | ~30 K tokens/week | ~9.6 K tokens/week | ~20.4 K/week |

    # Install from PyPI
    pip install "token-savior-recall[mcp]"
    # Optional hybrid vector search:
    pip install "token-savior-recall[mcp,memory-vector]"

    # Or run via uvx
    uvx token-savior-recall

    # Register the server with Claude Code
    claude mcp add token-savior -- /path/to/venv/bin/token-savior

    # Development setup
    git clone https://github.com/Mibayy/token-savior
    cd token-savior
    python3 -m venv .venv
    .venv/bin/pip install -e ".[mcp,dev]"
    pytest tests/ -q

Suite size: 1688 passed, 55 skipped on main. CI green on Python
3.11 / 3.12 / 3.13.
The compactor numbers above come from replaying real Claude Code
transcripts through the dispatcher. Two scripts live under scripts/:

    python3 scripts/bench_compactors_real.py       # match rate + mean savings
    python3 scripts/bench_compactors_unmatched.py  # top unmatched commands

The first walks ~/.claude/projects/*/*.jsonl, replays every Bash
output through the registry, and reports per-family savings + overall
match rate. The second buckets the unmatched commands so the next
compactor target is obvious from the histogram.
To reproduce the tsbench score:

    git clone https://github.com/Mibayy/tsbench && cd tsbench
    python3 generate.py --seed 42
    git tag v1
    python3 breaking_changes.py
    git tag v2
    TS_PROFILE=tiny_plus TS_CAPTURE_DISABLED=1 python3 bench.py --tasks all --run B

ts CLI for non-MCP agents

For agents without MCP (Cursor, Aider, Continue, scripts, CI), the ts
command exposes a subset of the tools via shell:

    ts use /path/to/project
    ts get my_function        # JSON output
    ts search 'pattern'
    ts daemon start           # ~145ms per call vs 1.5s cold fork
    ts init --agent cursor    # wire up Bash hooks for non-Claude agents

On Claude Code, prefer the MCP server -- measured cheaper than CLI on
Opus 4.7. The CLI is there for the portability case.
| Var | Purpose |
|---|---|
| TS_BASH_COMPACT=1 | Enable PostToolUse Bash output compactors |
| TS_BASH_REWRITE=1 | Enable PreToolUse Bash command rewriter |
| TS_BASH_REWRITE_LOG | JSONL audit log of every rewrite |
| TS_COMPACT_INLINE_THRESHOLD | Hybrid mode threshold (default 4 KB) |
| TS_COMPACT_TINY_THRESHOLD | Skip-sandbox threshold (default 256 B) |
| TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID | Critical-observation feed |
| TS_VIEWER_PORT | Web viewer dashboard |
| TS_AUTO_EXTRACT=1 + TS_API_KEY | LLM auto-extraction of memory observations |
| TS_CAPTURE_DISABLED=1 | Skip read-side capture sandboxing (default in optimized) |
| TS_MEMORY_DISABLE=1 | Silence memory hooks (clean-context workloads) |
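
As a rough illustration of how a wrapper script might read these variables, the sketch below parses the two thresholds and the `=1` style flags from an environment mapping. It makes several assumptions not stated in the table: that the thresholds are plain integers in bytes, that "4 KB" means 4096, and that a flag is on only when set to exactly `1`. The helper names are hypothetical.

```python
import os
from typing import Mapping


def read_thresholds(env: Mapping[str, str] = os.environ) -> tuple[int, int]:
    """Return (inline_threshold, tiny_threshold), falling back to the documented defaults."""
    inline = int(env.get("TS_COMPACT_INLINE_THRESHOLD", 4096))  # assumed: 4 KB = 4096 bytes
    tiny = int(env.get("TS_COMPACT_TINY_THRESHOLD", 256))
    return inline, tiny


def flag(env: Mapping[str, str], name: str) -> bool:
    """Treat a variable as enabled only when it is set to "1" (assumed convention)."""
    return env.get(name) == "1"


example_env = {"TS_BASH_COMPACT": "1", "TS_COMPACT_INLINE_THRESHOLD": "8192"}
print(read_thresholds(example_env))
print(flag(example_env, "TS_BASH_COMPACT"), flag(example_env, "TS_BASH_REWRITE"))
```
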
MIT
Mibayy/token-savior
March 30, 2026
June 15, 2026
Python