The low cost cluster is already deep.
155 priced models sit at or below $0.50 blended per one million tokens using a 3 input to 1 output workload mix. That makes the bottom of the market wide enough for real substitution pressure.
A live BenchGecko note tracking where AI API prices are compressing, which providers anchor the low cost floor, and which premium models still sit far above the market median.
35 priced models sit above $5 blended. The monitor treats these as premium outliers until benchmark strength, latency, context, or specialist capability explains the gap.
The median output price is $1.20 per one million tokens, versus $0.385 for input. The ratio of the 90th percentile price to the 10th percentile price is 77.8x for output and 46.2x for input.
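A percentile spread of this kind can be sketched in a few lines. This is a minimal illustration, not BenchGecko's actual calculation: the function name `percentile_spread`, the sample prices, and the interpolation method (the default of Python's `statistics.quantiles`) are all assumptions.

```python
from statistics import quantiles

def percentile_spread(prices):
    """Return the ratio of the 90th percentile to the 10th percentile of prices."""
    # quantiles(..., n=10) returns the nine decile cut points:
    # index 0 is the 10th percentile, index 8 is the 90th percentile.
    deciles = quantiles(prices, n=10)
    return deciles[8] / deciles[0]

# Hypothetical output prices in USD per one million tokens (not BenchGecko data).
sample_output_prices = [0.10, 0.25, 0.40, 0.80, 1.20, 2.00, 4.00, 8.00, 15.00, 30.00]
print(f"p90/p10 spread: {percentile_spread(sample_output_prices):.1f}x")
```

A different percentile convention would shift the result somewhat, especially on small samples, so the exact figures above depend on whichever method the monitor uses.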
Cheapest blended token prices in the current dataset.
Providers with at least three priced models, ranked by median blended price.
High priced models that need capability context before price alone is interpreted as market power.
Prices are taken from BenchGecko model records and expressed per one million tokens where available.
The blended price uses a 3 input to 1 output workload mix. It is a comparison lens, not a universal workload model.
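As a rough sketch, a 3 to 1 blend can be read as a weighted average of the input and output prices. That reading is an assumption, as are the function names, the "mid market" label, and the two example models below; only the $0.50 and $5 thresholds come from this note.

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price in USD per one million tokens (assumed formula)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight

def price_tier(blended):
    """Bucket a blended price using the thresholds quoted in this note."""
    if blended <= 0.50:
        return "low cost"
    if blended > 5.00:
        return "premium outlier"
    return "mid market"  # hypothetical label for everything in between

# Hypothetical models: (input price, output price) in USD per one million tokens.
examples = {"small-model": (0.20, 0.80), "frontier-model": (5.00, 15.00)}
for name, (input_price, output_price) in examples.items():
    blended = blended_price(input_price, output_price)
    print(f"{name}: ${blended:.2f} blended, {price_tier(blended)}")
```

Changing `input_weight` and `output_weight` shows how sensitive a ranking is to the workload mix, which is why the blend is best treated as a comparison lens.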
Compression does not mean quality convergence. Benchmark coverage, latency, context, tools, and reliability still matter.