Beta
PricingReading · ~3 min · 59 words deep

Per-Request Pricing

A flat fee per API request regardless of input or output size · common for image generation, web search, and some premium agent products.

TL;DR

A flat fee per API request regardless of input or output size · common for image generation, web search, and some premium agent products.

Level 1

Per-request pricing is an alternative to per-token pricing. OpenAI DALL-E 3 charges $0.04 per image. Anthropic's web search tool charges $0.01 per query. Perplexity's API charges $0.005 per query. The logic: some features have relatively predictable cost, so providers simplify billing as a flat fee rather than metering tokens.

Level 2

Per-request pricing tradeoff: predictability vs fairness. For fixed-output features (image generation, fixed-size embeddings, tool calls), flat fee is simple. For variable-length outputs, per-token is fairer. Some providers blend · flat fee plus per-token for output length. Use cases best served by per-request: web search, image gen, function call billing, tool invocations. Enterprise customers often prefer per-request for predictable budgeting.

Level 3

Per-request pricing is common in enterprise pricing contracts · a "credit" or "unit" abstraction hides token metering. Salesforce Einstein sells "Generative AI credits" that map to requests; Microsoft Copilot sells "Message Units." This abstraction decouples customer pricing from underlying model costs, giving providers margin flexibility when they swap models behind the scenes. Downside for builders: per-request pricing obscures comparisons across providers.

The takeaway for you
If you are a
Researcher
  • ·Flat fee per API call · decoupled from token count
  • ·Common for image gen, web search, tool invocations
  • ·Enables unit-based enterprise pricing abstractions
If you are a
Builder
  • ·Use when output length varies wildly · caps unexpected bills
  • ·Watch for providers hiding per-token costs behind "credits"
  • ·Compare total call cost vs tokenized alternatives
If you are a
Investor
  • ·Unit pricing is the enterprise abstraction layer · decouples margin from cost
  • ·Allows providers to swap models without changing customer bills
  • ·Standard in Salesforce, Microsoft, SAP AI suites
If you are a
Curious · Normie
  • ·A fixed price per AI query · no surprises from long conversations
  • ·Common for AI image makers and AI search
  • ·Simpler than paying per word
Gecko's take

Per-request pricing is how enterprise AI gets sold. Simpler billing beats fair pricing on procurement committees.

When output is long and unpredictable. For short outputs, per-token is usually cheaper.