Research note · Live pricing dataset

API Pricing Compression Monitor

A live BenchGecko note tracking where AI API prices are compressing, which providers anchor the low cost floor, and which premium models still sit far above the market median.

Dataset date: May 6, 2026 (BenchGecko generated data)
Priced models: 386 (54 providers with input and output prices)
Median input: $0.385 per one million input tokens
Median output: $1.2 per one million output tokens
Low cost band: 155 models at or below $0.50 blended

Finding 01

The low cost cluster is already deep.

Of the 386 priced models, 155 (about 40%) sit at or below $0.50 blended per one million tokens, using a 3 input to 1 output workload mix. That makes the bottom of the market wide enough for real substitution pressure.

Finding 02

Premium pricing still has a long tail.

Of the 386 priced models, 35 (about 9%) sit above $5 blended per one million tokens. The monitor treats these as premium outliers until benchmark strength, latency, context, or specialist capability explains the gap.
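The two thresholds used in Findings 01 and 02 can be sketched as a simple banding rule. This is a minimal illustration, not BenchGecko's implementation: the function name `price_band`, the label strings, and the "mid market" name for the unlabelled middle range are all assumptions made for this example. Only the $0.50 and $5 cut-offs come from the note.

```python
# Thresholds taken from the note, in USD per one million blended tokens.
LOW_COST_MAX = 0.50   # "at or below $0.50 blended"
PREMIUM_MIN = 5.00    # "above $5 blended"


def price_band(blended_price: float) -> str:
    """Assign a blended price to a band.

    The band labels are hypothetical; the note only names the low cost
    band and the premium outliers, not the range in between.
    """
    if blended_price < 0:
        raise ValueError("price must be non-negative")
    if blended_price <= LOW_COST_MAX:
        return "low cost"
    if blended_price > PREMIUM_MIN:
        return "premium"
    return "mid market"


if __name__ == "__main__":
    # Hypothetical blended prices, not taken from the dataset.
    for price in (0.12, 0.50, 2.75, 5.00, 18.00):
        print(f"${price:.2f} -> {price_band(price)}")
```

Note the boundary handling: exactly $0.50 counts as low cost, while exactly $5 does not count as premium, matching the "at or below" and "above" wording.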

Finding 03

Output tokens remain the expensive side.

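The median output price is $1.2 per one million tokens versus $0.385 for input, roughly 3.1 times higher. The ratio of the 90th percentile price to the 10th percentile price is 77.8x for output and 46.2x for input, so output prices are also the more widely dispersed side.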
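A 90th to 10th percentile spread of this kind can be computed as a ratio of two percentiles. The sketch below assumes linear interpolation between ranked values; the note does not say which percentile method BenchGecko uses, so results on the real dataset may differ slightly. The helper names `percentile` and `spread_ratio` are hypothetical.

```python
def percentile(values: list[float], p: float) -> float:
    """Return the p-th percentile (0 to 100) using linear interpolation.

    Assumption: interpolation between the two nearest ranks. Other
    conventions exist and give slightly different answers.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lower = int(rank)                       # index at or below the rank
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def spread_ratio(prices: list[float]) -> float:
    """Ratio of the 90th percentile price to the 10th percentile price."""
    low = percentile(prices, 10)
    if low <= 0:
        raise ValueError("10th percentile must be positive to form a ratio")
    return percentile(prices, 90) / low


if __name__ == "__main__":
    # Hypothetical prices per one million tokens, not dataset values.
    sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
    print(spread_ratio(sample))
```

A ratio is used rather than a difference so that input and output spreads can be compared even though their price levels differ.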

Premium watchlist

High priced models that need capability context before price alone is interpreted as market power.

Methodology and caveats

Prices are taken from BenchGecko model records and expressed per one million tokens where available.

The blended price uses a 3 input to 1 output workload mix. It is a comparison lens, not a universal workload model.
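One plausible reading of the 3 to 1 mix is a weighted average: three parts input price and one part output price, divided by four. The sketch below assumes that reading; the note does not spell out the exact formula, and the function name `blended_price` and its parameters are hypothetical.

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_weight: float = 3.0,
    output_weight: float = 1.0,
) -> float:
    """Weighted average price per one million tokens.

    Assumption: "3 input to 1 output" means the blended price is
    (3 * input + 1 * output) / 4. Prices are in USD per one million tokens.
    """
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


if __name__ == "__main__":
    # Using the two medians reported in the note as example inputs.
    print(f"{blended_price(0.385, 1.2):.3f}")  # prints 0.589
```

Plugging in the two reported medians gives about $0.589, but this is only an illustration: the median of per-model blended prices is generally not the same as the blend of the two medians. Changing the weights shows how sensitive a comparison is to the assumed workload, which is why the mix is a lens and not a universal model.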

Compression does not mean quality convergence. Benchmark coverage, latency, context, tools, and reliability still matter.