Now in public beta

The cost circuit breaker for AI agents

Wrap your OpenAI/Anthropic clients in 2 lines of code. Set budget caps, automate model downgrades, and stop runaway loops instantly.

2.4k stars · MIT License
main.py
import openai
from tokenfence import guard

client = guard(
    openai.OpenAI(),
    budget="$0.50",
    fallback="gpt-4o-mini",
    on_limit="stop"
)
Budget: $0.12 / $0.50

Trusted by developers at

Acme Corp
Nebula AI
Quantum Labs
Synapse
Cortex
Apex
2,400+ stars
15K+ weekly installs
$2.1M+ saved

AI agents have unpredictable costs

A single agentic loop can burn through your entire monthly budget in minutes. One bad prompt, one recursive call, and you're looking at a four-figure surprise on your next invoice.

67%

of developers cite cost as the #1 barrier to deploying AI agents

$4.2K

average unexpected bill from a single runaway agentic workflow

[Live demo: "Estimated API Cost" counter climbing from $0.00]

Three layers of protection

TokenFence wraps your AI client locally. Zero latency overhead, full cost control.

Budget Caps

Set per-workflow spending limits. Never exceed your budget. Enforce hard limits at the SDK level before requests even fire.

Auto Downgrade

When 80% of budget is spent, automatically route to cheaper models. GPT-4o falls back to GPT-4o-mini seamlessly.

Kill Switch

Hard stop at budget limit. Graceful error handling, clean shutdown, no surprise bills. Your agents fail safely.
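The three layers above can be sketched in plain Python. This is an illustrative toy of the idea, not TokenFence's internals; the `BudgetGuard` and `BudgetExceeded` names are hypothetical, and real cost estimation is more involved.

```python
# Toy sketch of the three protection layers: cap, downgrade, kill switch.
# Names here are illustrative, not TokenFence's actual API.

class BudgetExceeded(RuntimeError):
    """Raised when a request would push spend past the cap."""

class BudgetGuard:
    def __init__(self, budget_usd, fallback=None, downgrade_at=0.8):
        self.budget = budget_usd      # hard spending cap in USD
        self.spent = 0.0              # running total, tracked locally
        self.fallback = fallback      # cheaper model to route to
        self.downgrade_at = downgrade_at  # fraction of budget that triggers fallback

    def route(self, model, est_cost):
        # Kill switch: hard stop before the request even fires.
        if self.spent + est_cost > self.budget:
            raise BudgetExceeded(f"${self.spent:.2f} of ${self.budget:.2f} spent")
        # Auto downgrade: past the threshold, use the cheaper model.
        if self.fallback and self.spent >= self.downgrade_at * self.budget:
            model = self.fallback
        # Budget cap: record spend against the limit.
        self.spent += est_cost
        return model

guard = BudgetGuard(budget_usd=0.50, fallback="gpt-4o-mini")
print(guard.route("gpt-4o", est_cost=0.10))  # gpt-4o
guard.spent = 0.45
print(guard.route("gpt-4o", est_cost=0.04))  # gpt-4o-mini (past 80% of budget)
```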

2 lines. That's it.

No config files, no dashboards to set up. Import, wrap, deploy.

Before
import openai

client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)
# No budget limit.
# No fallback.
# Good luck. 🎉
After
import openai
from tokenfence import guard

client = guard(
    openai.OpenAI(),
    budget="$0.50",
    fallback="gpt-4o-mini",
    on_limit="stop"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)
# Protected. Capped. Safe.

How it works

Five steps from unprotected to fully guarded. All local, all fast.

01

Wrap your AI client

Import TokenFence and wrap your existing OpenAI or Anthropic client. Zero config required.

client = guard(openai.OpenAI())
02

Set your budget

Define spending limits per workflow, per user, or globally. Budgets reset on your schedule.

budget="$0.50"
03

Monitor in real-time

TokenFence tracks every token locally. No proxy, no external calls, no added latency.

# 0ms overhead
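Local tracking boils down to multiplying the token counts the API already reports by a per-model price table. A rough sketch, with placeholder rates (check your provider's current pricing; TokenFence's internal table may differ):

```python
# Sketch of local cost estimation from token counts.
# USD per 1M tokens as (input, output) -- placeholder figures, not live rates.
PRICE_PER_1M = {
    "gpt-4o":      (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}

def estimate_cost(model, input_tokens, output_tokens):
    # Pure local arithmetic: no proxy, no external calls, no added latency.
    inp, out = PRICE_PER_1M[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# 1,000 tokens in, 500 out on gpt-4o-mini:
print(f"${estimate_cost('gpt-4o-mini', 1_000, 500):.6f}")  # $0.000450
```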
04

Auto-downgrade models

At 80% budget, requests automatically route to cheaper models. Quality degrades gracefully.

fallback="gpt-4o-mini"
05

Hard stop at limit

When budget is exhausted, TokenFence stops requests with a clean, catchable exception.

on_limit="stop"
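Failing safely means catching that exception in your agent loop. A minimal sketch; the exception name `BudgetExceeded` is hypothetical here, so check TokenFence's reference for the real class:

```python
# Sketch of handling the hard stop with a catchable exception.

class BudgetExceeded(RuntimeError):
    """Stand-in for the exception a guarded client raises at the cap."""

def guarded_call(spent, budget, est_cost):
    # What a guarded client checks before a request fires.
    if spent + est_cost > budget:
        raise BudgetExceeded(f"would exceed ${budget:.2f} cap")
    return "response"

def run_agent_step(spent, budget, est_cost):
    try:
        return guarded_call(spent, budget, est_cost)
    except BudgetExceeded:
        # Fail safely: shut the loop down cleanly, no surprise bill.
        return None

print(run_agent_step(spent=0.10, budget=0.50, est_cost=0.05))  # response
print(run_agent_step(spent=0.50, budget=0.50, est_cost=0.05))  # None
```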

Simple, transparent pricing

Start free. Upgrade when you need more protection.

Hobby

Free

For side projects and experimentation

  • 50K requests/month
  • Basic budget caps
  • Single project
  • Community support
  • Core SDK access
Recommended

Pro

$49/mo

For production AI applications

  • 500K requests/month
  • Auto model downgrades
  • Per-workflow budget caps
  • Unlimited projects
  • Email support
  • Usage dashboard
  • Team management

Enterprise

Custom

For teams with advanced needs

  • Unlimited requests
  • SLA guarantee
  • On-premise deployment
  • Dedicated support
  • Custom integrations
  • SSO / SAML
  • Audit logs

Frequently asked questions

Everything you need to know about TokenFence.

Stop burning money on runaway agents

Join thousands of developers who ship AI agents with confidence. Free to start, no credit card required.