# Claude Code Is Burning Your Budget — And You Have No Idea Where
Claude Code reached $2.5B ARR in February 2026 — nine months after launch. Over 500 companies are now spending more than $1M/year on it. And almost none of them can answer this question:
> "Which repo is eating $40K a month?"
The Anthropic console gives you one number: total spend. No per-repo breakdown. No per-engineer view. No PR-level attribution. Engineering VPs and FinOps teams are flying blind on their fastest-growing AI cost center.
## The Attribution Problem
Claude Code stores session logs locally at ~/.claude/projects/. Each session file records the model, tokens consumed, and project path. The data exists — it just hasn't been aggregated or shared.
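Aggregating those local logs can be sketched in a few lines of Python. The file layout and field names used here (`*.jsonl` files with `usage` and `cwd` keys) are assumptions for illustration; inspect your own `~/.claude/projects/` files to confirm the actual schema before relying on them:

```python
import json
from pathlib import Path

# Sum input/output tokens per project path across all local session logs.
# Assumed schema: one JSON object per line, with "usage" and "cwd" keys.
totals = {}
for session_file in Path.home().glob(".claude/projects/**/*.jsonl"):
    for line in session_file.read_text().splitlines():
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines defensively
        usage = event.get("usage") or {}
        project = event.get("cwd", "unknown")
        rec = totals.setdefault(project, {"input": 0, "output": 0})
        rec["input"] += usage.get("input_tokens", 0)
        rec["output"] += usage.get("output_tokens", 0)

for project, rec in sorted(totals.items()):
    print(f"{project}: {rec['input']:,} in / {rec['output']:,} out")
```

Even this crude pass answers the "$40K repo" question: the project path is the repo-level attribution key.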
In a 10-engineer team, the distribution is rarely even:
| Engineer | Claude Code spend/mo | Repos touched |
|---|---|---|
| Engineer A (platform) | $1,240 | api-core, auth-service |
| Engineer B (frontend) | $380 | web-app |
| Engineer C (data) | $2,100 | pipeline-etl, ml-training |
| Engineers D–J | $90–$340 each | various |
| **Total** | **$6,800/mo** | |
Without attribution, you see $6,800. With it, you see that one data pipeline is responsible for 30% of your Claude budget — and it's running the same generation loop 200 times a day.
## Three Cost Leaks Unique to Claude Code
### 1. Long context windows on large repos
Claude Code reads your entire codebase context before responding. A 200K-token repo context at $3/M input tokens = $0.60 per session cold start. If your team opens 50 sessions per day against a large monorepo, that's $30/day just in context loading — before any actual output.
```python
import tokenfence as tf

# Track Claude Code session costs by repo
with tf.budget(
    workflow="claude-code-session",
    max_usd=2.00,
    tags={"repo": "api-core", "engineer": "eng-a"},
) as budget:
    # Your Claude Code API call here
    budget.record(response.usage)
```
### 2. Iterative refactor loops
Claude Code is optimized for multi-turn refactoring. A developer spending 2 hours on a complex refactor might run 40–60 turns, each with a large context window. This is legitimate use — but it burns 10–15× the tokens of a simple code generation task.
Identify which repo types drive loop-heavy sessions and you can right-size budgets or switch to cheaper models for simpler tasks.
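The "switch to cheaper models" lever can be sketched as a simple routing table. The task names mirror the budget examples later in this post; the Haiku/Sonnet/Opus tiers are real, but the routing rules themselves are an assumption you would tune per team:

```python
# Route tasks to a model tier by declared complexity.
# These mappings are illustrative, not a recommendation.
MODEL_BY_TASK = {
    "boilerplate": "claude-haiku",
    "test-generation": "claude-sonnet",
    "refactor": "claude-sonnet",
    "architecture-review": "claude-opus",
}

def pick_model(task_type: str) -> str:
    # Default to the cheapest tier when the task type is unknown.
    return MODEL_BY_TASK.get(task_type, "claude-haiku")

print(pick_model("refactor"))             # → claude-sonnet
print(pick_model("rename-variable"))      # → claude-haiku
```

Routing on declared task type is deliberately dumb; it still captures most of the savings because loop-heavy refactors and one-shot boilerplate rarely need the same model.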
### 3. Test generation at scale
Teams that use Claude Code to generate test suites often underestimate the cost. Generating tests for a 5,000-line module can cost $8–12 in tokens — and if you're regenerating tests on every PR review, it adds up fast.
```python
# Set per-task budgets based on task type
TASK_BUDGETS = {
    "test-generation": 0.15,       # $0.15 per file
    "code-review": 0.05,           # $0.05 per PR
    "refactor": 0.50,              # $0.50 per session
    "architecture-review": 2.00,   # $2.00 per session
}

with tf.budget(
    workflow=f"claude-code-{task_type}",
    max_usd=TASK_BUDGETS[task_type],
    tags={"repo": repo_name, "task": task_type},
) as budget:
    result = call_claude_code(task)
    budget.record(result.usage)
```
## Building a Team Attribution Layer
The architecture for Claude Code cost attribution is simpler than it looks:
- CLI wrapper — thin wrapper around Claude Code that injects repo and user tags into every session
- Log aggregator — reads `~/.claude/projects/` session files and ships them to a central store
- Dashboard — groups by repo, engineer, and time period; flags spend anomalies
The data granularity you get from session logs:
- Session start/end timestamps
- Model used (Sonnet vs Opus vs Haiku)
- Input + output token counts
- Project path (maps to repo)
- Session duration
What you need to add yourself:
- Engineer identity (git config or SSH key fingerprint)
- PR/ticket association (inject via branch name convention)
- Cost calculation (token count × current model pricing)
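The cost-calculation step is plain arithmetic over the token counts in the logs. A minimal sketch, using the per-million-token prices quoted elsewhere in this post (update the table as pricing changes):

```python
# Per-million-token prices, as cited in this post. Verify against
# current published pricing before using in production.
PRICING_PER_M = {
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus":   {"input": 15.00, "output": 75.00},
}

def session_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    price = PRICING_PER_M[model]
    return (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000

# The cold-start example from earlier: 200K input tokens on Sonnet,
# plus a modest 2K-token reply.
print(f"${session_cost_usd('sonnet', 200_000, 2_000):.2f}")  # → $0.63
```

Note how output tokens barely move the needle on a cold start; input context dominates, which is why the long-context leak above is the first one to attack.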
## The Budget Allocation Model
Once you have per-repo data, you can run attribution budgets at the team level:
```python
import tokenfence as tf

# Monthly budget by repo (scale with repo size + activity)
REPO_BUDGETS_MONTHLY = {
    "api-core": 800,       # $800/mo — high-activity critical service
    "web-app": 300,        # $300/mo — frontend, lighter context
    "ml-pipeline": 1500,   # $1,500/mo — large files, frequent runs
    "auth-service": 200,   # $200/mo — smaller, stable codebase
}

# Daily budget = monthly / 22 working days
daily = {repo: budget / 22 for repo, budget in REPO_BUDGETS_MONTHLY.items()}

# Alert when a repo hits 80% of daily budget
tf.set_alert_threshold(0.8, channel="slack:#eng-costs")
```
Most teams find the 80/20 rule applies strongly: two or three repos account for 70–80% of total Claude Code spend. Identifying those repos is the first step to bringing costs under control.
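Finding those two or three repos is a one-pass computation over the attributed totals. A sketch with illustrative numbers (any real run would use your own aggregated spend):

```python
# Rank repos by spend and find the smallest set covering 80% of the total.
# The figures below are illustrative monthly totals, not real data.
spend = {"ml-pipeline": 2100, "api-core": 1240, "web-app": 380,
         "auth-service": 310, "docs": 120}

total = sum(spend.values())
running, top_repos = 0, []
for repo, usd in sorted(spend.items(), key=lambda kv: kv[1], reverse=True):
    top_repos.append(repo)
    running += usd
    if running / total >= 0.80:
        break

print(f"{len(top_repos)} repos = {running / total:.0%} of spend: {top_repos}")
# → 2 repos = 80% of spend: ['ml-pipeline', 'api-core']
```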
## FinOps Reporting: What Finance Actually Wants
When FinOps or your CFO asks about AI spend, they want four numbers:
1. Total spend by team (engineering, data, product)
2. Cost per PR merged (normalized unit economics)
3. Month-over-month trend (are costs growing faster than headcount?)
4. Forecast for next quarter (based on hiring plan + current per-engineer spend)
Without attribution, you can only answer #1. With per-repo, per-engineer data, you can answer all four — and have a conversation about whether the spend is generating proportional output.
```python
# Weekly spend digest (runs every Monday)
report = tf.get_spend_breakdown(
    period="last_7_days",
    group_by=["tags.repo", "tags.engineer"],
    include_trend=True,
)

for repo, data in report.by_repo.items():
    print(f"{repo}: ${data.total_usd:.2f} ({data.pct_change:+.1f}% vs last week)")
    if data.pct_change > 30:
        print(f"  ⚠️ Anomaly: {repo} spend up {data.pct_change:.0f}%")
```
## Quick Wins Before Full Attribution
If you're not ready to build a full attribution layer, three tactical changes cut Claude Code costs immediately:
- Default to Sonnet, not Opus. For most code tasks, Claude Sonnet 4 at $3/$15 per M tokens is indistinguishable from Opus at $15/$75. That's a 5× cost reduction per session.
- Set session budgets in your IDE plugin config. Claude Code doesn't have native budget limits yet — use a wrapper or a per-developer spend alert to catch runaway sessions.
- Log project paths from day one. Even if you're not analyzing them today, the session logs with project paths give you a retroactive attribution capability when FinOps asks for a breakdown next quarter.
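The second quick win, a per-developer spend alert, needs almost no machinery. A minimal sketch; the `ALERT_USD` threshold and the `notify()` stub are assumptions you would wire to Slack or email in your own environment:

```python
# Per-developer daily spend alert, suitable for a cron job.
# ALERT_USD is an arbitrary example threshold.
ALERT_USD = 25.00

def notify(message: str) -> None:
    print(f"[alert] {message}")  # stand-in for a Slack/email hook

def check_daily_spend(engineer: str, spend_today_usd: float) -> bool:
    """Return True (and fire a notification) if today's spend exceeds the cap."""
    over = spend_today_usd > ALERT_USD
    if over:
        notify(f"{engineer} spent ${spend_today_usd:.2f} today "
               f"(cap ${ALERT_USD:.2f})")
    return over

check_daily_spend("eng-a", 31.40)  # fires an alert
check_daily_spend("eng-b", 9.80)   # silent
```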
## The Bottom Line
Claude Code is the fastest-growing AI development tool on the market. It's generating real productivity gains — but it's also generating real costs that most engineering teams can't explain at the repo or engineer level.
The gap between "total spend" and "attributed spend" is where budget waste hides. Instrument it early, before your Claude Code bill becomes a boardroom question.
```bash
pip install tokenfence   # Python
npm install tokenfence   # Node.js / TypeScript
```
The data is already in your session logs. Time to use it.
Ready to protect your AI budget?
Two lines of code. Per-workflow budgets. Automatic model downgrade. Hard kill switch.