Function Call Billing
When a model emits a function call, the JSON tool-invocation counts as output tokens · cheap tools still carry token cost, and chained tool calls compound fast.
Basic
Providers bill the JSON representation of function calls as output tokens. A simple tool call like `{"name":"get_weather","args":{"city":"Paris"}}` is ~15 tokens. Multi-tool chains (agent loop: plan → tool call → result → plan again) can easily accumulate 5-20 tool calls, each priced as output. Tool definitions themselves count as input tokens · registering 50 tools on every call bloats the input bill.
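To get a rough feel for the numbers, here is a sketch that prices a chained tool call with a crude 4-characters-per-token heuristic · real tokenizers (tiktoken, Anthropic's) will count somewhat differently, and the $3/M output price is illustrative:

```python
import json

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English/JSON.
    # A real tokenizer will give a somewhat different count.
    return max(1, len(text) // 4)

call = {"name": "get_weather", "args": {"city": "Paris"}}
call_json = json.dumps(call, separators=(",", ":"))

out_tokens = estimate_tokens(call_json)
# A 10-call chain, each invocation billed as output at $3 per million tokens.
chain_cost = 10 * out_tokens * 3 / 1_000_000
print(out_tokens, f"${chain_cost:.6f}")
```

Even a 10-call chain of small invocations costs well under a cent in output tokens · the bill comes from volume and from the context each call drags along, not from any single call.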
Deep
Anthropic added token-efficient tool use in 2025, which strips redundant JSON framing and cuts tool-call output cost roughly 15-20% versus the naive encoding. OpenAI supports parallel tool calls, where one assistant turn emits multiple tool invocations in a single response · this reduces per-call overhead. MCP servers add tool definitions via the client's session context · those definitions hit input tokens once per new session. Caching tool definitions (Anthropic cache_control) drops their input cost to roughly 10% of the base rate on subsequent calls.
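A minimal sketch of what caching tool definitions looks like in an Anthropic-style Messages payload · the cache_control marker on the last tool covers every definition before it. The model name and tool schema here are placeholders, and the payload is illustrative, not sent anywhere:

```python
# Illustrative Anthropic Messages API payload (not sent anywhere).
tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    # ...potentially dozens more tool definitions...
]

# Mark the final tool so the entire tool block up to it becomes a
# cache prefix; subsequent calls read it at the reduced cached rate.
tools[-1]["cache_control"] = {"type": "ephemeral"}

payload = {
    "model": "claude-sonnet-4-20250514",  # placeholder model name
    "max_tokens": 1024,
    "tools": tools,
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
}
```

The marker goes on the last tool because caching covers the prefix up to the marker · marking each tool individually would waste cache breakpoints.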
Expert
Agent pricing math: a typical loop burns 5-20 tool calls, each roughly 50-200 tokens of output. A 10-step run at ~100 tokens per call and $3/M output is ~1,000 output tokens ≈ $0.003. Looks cheap, but it scales with traffic, and the real cost is cumulative context: each tool result feeds into the next turn as input, so per-turn input grows linearly with steps · and total input across the whole run grows quadratically. Best practice: summarize old tool results, drop unneeded context, and cache long-lived tool definitions.
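The cumulative-context effect can be sketched as a toy cost model · every constant here (prices, prompt size, tool-call and result sizes) is an assumption chosen for illustration:

```python
# Illustrative agent-run cost model. All numbers are assumptions:
# prices in $/million tokens, sizes in tokens per item.
IN_PRICE, OUT_PRICE = 3.0, 15.0
SYSTEM_AND_TOOLS = 2_000   # prompt + tool definitions, resent each turn
TOOL_CALL = 100            # JSON tool invocation (billed as output)
TOOL_RESULT = 400          # tool result appended to context (billed as input)

def run_cost(steps: int) -> float:
    total_in = total_out = 0
    for step in range(steps):
        # Each turn re-sends the system prompt, the tools, and every
        # prior tool call + result, so per-turn input grows linearly.
        total_in += SYSTEM_AND_TOOLS + step * (TOOL_CALL + TOOL_RESULT)
        total_out += TOOL_CALL
    return (total_in * IN_PRICE + total_out * OUT_PRICE) / 1_000_000

print(f"10 steps: ${run_cost(10):.4f}")
print(f"20 steps: ${run_cost(20):.4f}")
```

Under these assumptions, doubling the steps roughly triples the cost · the input side dominates the output side long before the loop gets long.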
Depending on why you're here
- Tool calls emit JSON as output tokens
- Tool definitions count as input tokens (per session)
- Anthropic + OpenAI optimize framing for cost
- Cache tool definitions with cache_control
- Use parallel tool calls where supported
- Summarize old tool results · context grows fast in agent loops
- Agent adoption drives token consumption 10-50× vs chat
- Tool-heavy workloads are among providers' most profitable
- MCP's success compounds this · more tools = more calls
- When AI uses tools, each tool call has a cost
- Agents that use many tools cost more to run
- Hidden in your app's AI bill
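One way to act on the summarize-old-tool-results advice · a minimal sketch assuming a simple message list and a stand-in truncating summarizer (a real agent would summarize with a cheap model call instead):

```python
# Trim agent context: keep the last few tool results verbatim and
# collapse older ones. The `messages` shape and the summarizer are
# assumptions for illustration, not any provider's real API.

def summarize(result: str) -> str:
    # Stand-in summarizer: truncate. In practice, use a cheap model call.
    return result[:80] + "…" if len(result) > 80 else result

def trim_context(messages: list[dict], keep_recent: int = 3) -> list[dict]:
    tool_idx = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    old = set(tool_idx[:-keep_recent]) if len(tool_idx) > keep_recent else set()
    return [
        {**m, "content": summarize(m["content"])} if i in old else m
        for i, m in enumerate(messages)
    ]
```

Run before each model call: recent tool results stay intact for the model to reason over, while stale ones stop inflating the input bill every turn.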
Function call billing is invisible until your agent traffic scales. Cache tools, parallelize calls, summarize context · or pay 10× what you need to.