How often are Pricing entries updated?

Entries auto-sync with underlying seed data. Definitions stay stable; live data callouts (benchmarks, prices, valuations) update daily. See /methodology for the full refresh schedule.

Where does the data come from?

BenchGecko ingests from OpenRouter (pricing), Epoch AI (benchmarks), public earnings filings (companies), vendor spec sheets (chips + memory + systems), and hand-curated seeds for concepts and compliance.

Category · 16 terms · BenchGecko glossary

Pricing

Input, output, cache, arbitrage, free tiers.

Most-read in Pricing

Top 12 terms

Cache Hit Rate

The fraction of input tokens served from provider-side prompt cache · directly impacts effective pricing.
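
As a rough illustration of how cache hit rate feeds into effective pricing, the sketch below blends a cached and an uncached input rate. The function name and all prices are hypothetical and do not reflect any provider's actual rates.

```python
def effective_input_price(base_price: float, cached_price: float, hit_rate: float) -> float:
    """Blended price per million input tokens for a cache hit rate in [0, 1]."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Cached tokens pay the cached rate; the remainder pays the base rate.
    return hit_rate * cached_price + (1.0 - hit_rate) * base_price


# Hypothetical prices: $3.00/M uncached, $0.30/M cached, 80% hit rate.
blended = effective_input_price(3.00, 0.30, 0.80)
print(f"${blended:.2f} per million input tokens")
```

Under these assumed numbers the blended rate falls well below the uncached list price, which is why hit rate matters for workloads with long, repeated prompts.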

Batched Inference

Submitting many prompts at once in a single batch job · providers typically discount around 50% in exchange for delayed, asynchronous completion.

Reserved Capacity

Pre-purchased dedicated throughput · flat hourly fee for guaranteed tokens-per-second from a specific model.

Per-Request Pricing

A flat fee per API call regardless of tokens · used for image generation, search, and some agent products.

Tiered Pricing

Volume discounts at defined usage thresholds · $X/M tokens below 1B tokens, $Y/M above.
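
A minimal sketch of the two-tier shape described above, assuming graduated billing (only tokens above the threshold get the lower rate). Some contracts instead reprice all usage once the threshold is crossed. The function name, rates, and usage figure are hypothetical.

```python
def tiered_cost(tokens: int, threshold: int, rate_below: float, rate_above: float) -> float:
    """Cost in dollars under graduated tiers; rates are dollars per million tokens."""
    below = min(tokens, threshold)      # tokens billed at the first-tier rate
    above = max(tokens - threshold, 0)  # tokens billed at the discounted rate
    return (below * rate_below + above * rate_above) / 1_000_000


# Hypothetical: $2.00/M up to 1B tokens, $1.50/M beyond; 1.5B tokens used.
cost = tiered_cost(1_500_000_000, 1_000_000_000, 2.00, 1.50)
print(f"${cost:,.2f}")
```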

Volume Discounts

Pre-negotiated flat rate applied to all consumption · typical enterprise AI contract shape.

Multi-Modal Pricing

Per-token pricing differs by input type · text tokens vs image tokens vs audio seconds vs video frames.

Reasoning Token Billing

Reasoning models bill thinking tokens separately from output · this can raise effective cost per query by 2-10×.
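
To show how thinking tokens can multiply per-query cost, here is a small sketch that prices one query with and without reasoning tokens. The token counts, rates, and function name are hypothetical. The reasoning rate is assumed equal to the output rate here, and real providers may bill it differently.

```python
def query_cost(input_tokens: int, output_tokens: int, reasoning_tokens: int,
               input_rate: float, output_rate: float, reasoning_rate: float) -> float:
    """Dollar cost of one query; all rates are dollars per million tokens."""
    total = (input_tokens * input_rate
             + output_tokens * output_rate
             + reasoning_tokens * reasoning_rate)
    return total / 1_000_000


# Hypothetical query: 1,000 input tokens at $3/M, 500 output tokens at $15/M.
without_reasoning = query_cost(1_000, 500, 0, 3.0, 15.0, 15.0)
# Same query with 4,000 thinking tokens billed at an assumed $15/M.
with_reasoning = query_cost(1_000, 500, 4_000, 3.0, 15.0, 15.0)

multiplier = with_reasoning / without_reasoning
print(f"{multiplier:.1f}x effective cost")
```

With these assumed numbers the multiplier lands inside the 2-10× range quoted above. The actual figure depends on how many thinking tokens a given model spends per query.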

Function Call Billing

Tool-use / function-call responses count as output tokens · structured JSON tool calls are billed normally.

Spot Pricing

Preemptible GPU capacity at deep discount · AWS Spot, GCP Spot, Azure Spot · 60-90% off on-demand but can be evicted.

BYOK

Bring Your Own Key · you pay the model provider directly, the app just routes your requests.

Input Tokens

Tokens you send to the model · priced separately from output tokens, usually cheaper.

Everything in this category

All 16 · A-Z

Arbitrage (AI Pricing) · Batched Inference · BYOK · Cache Hit Rate · Cache Pricing · Free Tier · Function Call Billing · Input Tokens · Multi-Modal Pricing · Output Tokens · Per-Request Pricing · Reasoning Token Billing · Reserved Capacity · Spot Pricing · Tiered Pricing · Volume Discounts

Explore more

Other categories

Benchmarks

Chips

Memory

Model families

Companies

Foundry

Systems

Concepts

Economy

Agents

Compliance

People

Frequently Asked Questions

The Pricing category covers 16 terms spanning input and output tokens, cache pricing, arbitrage, and free tiers. Every term has four depth levels (TL;DR, Basic, Deep, Expert), role-based takeaways, FAQs, and live BenchGecko data where available.
