What's above this tier?

Claude Opus ($30 to $75 input depending on tier), GPT-5 ($35+), and the o-series ($15+, plus heavy thinking-token burn). The frontier is outside the scope of this page.

When is $20/M justified?

Long-horizon agents, research-grade analysis, legal and medical work where quality matters more than cost, and any task where an error is more expensive than a dozen extra cents.

Does prompt caching help at this tier?

A lot. A $15/M model with cached reads at $1.50/M effectively behaves like a cheap model for repeated system prompts. This can turn a Sonnet-class model into a DeepSeek-class bill.
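
As a rough illustration of the caching arithmetic above, the blended input price can be sketched as a weighted average of the base rate and the cached-read rate. The $15/M and $1.50/M figures come from the answer; the function name, the 90% cached share, and the decision to ignore any cache-write surcharge are assumptions made for this example only.

```python
def effective_input_price(base_price, cached_price, cached_fraction):
    """Blended $ per 1M input tokens when part of the input is served from cache.

    cached_fraction is the share of input tokens billed at the cached-read rate.
    Cache-write surcharges, if a provider bills them, are not modeled here.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    return cached_fraction * cached_price + (1.0 - cached_fraction) * base_price


# $15/M base and $1.50/M cached reads are the figures quoted above.
# The 90% cached share is a hypothetical workload, not a measurement.
blended = effective_input_price(15.00, 1.50, 0.90)
print(f"Blended input price: ${blended:.2f}/M")
```

Under these assumptions the blended rate lands well below the list price, and it moves back toward $15/M as the cached share shrinks, so the benefit depends on how much of each request is a repeated prefix.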

Yes. Route easy queries to sub-$5, escalate hard ones to sub-$20, reserve absolute frontier for final verification. This routing pattern can cut bills by 3x to 5x with no quality loss on real workloads.
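
The routing pattern above can be sketched as a traffic-weighted average price. Everything numeric in this example is hypothetical: the traffic shares, the three per-tier prices, and the $15/M baseline are placeholders chosen to show the calculation, not measurements from a real workload.

```python
def blended_price(routes):
    """Traffic-weighted $ per 1M tokens.

    routes is a list of (traffic_share, price_per_million) pairs
    whose shares must sum to 1.
    """
    total_share = sum(share for share, _ in routes)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(share * price for share, price in routes)


# Hypothetical mix: 80% easy queries on a cheap model, 15% escalated
# to a sub-$20 model, 5% sent to a frontier model for verification.
routed = blended_price([(0.80, 1.00), (0.15, 15.00), (0.05, 30.00)])

# Baseline assumption: every query goes to the $15/M model.
baseline = 15.00
print(f"Routed: ${routed:.2f}/M, baseline: ${baseline:.2f}/M, "
      f"ratio: {baseline / routed:.1f}x")
```

With this particular mix the ratio falls inside the 3x to 5x range quoted above. The result is sensitive to the escalation rate, so it is worth measuring that share on your own traffic before relying on the estimate.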

Budget · Under $20/M

LLMs under $20 per million tokens

Every model priced below $20 per 1M input tokens. Ranked by benchmark score within budget. The "I care about quality" tier.

Models: 40

Top quality: 85.0

Max price: $20.00/M


What this page is

This page ranks every model with input priced below $20 per million tokens, highest quality first. The under-$20 tier includes near-frontier models like Claude Sonnet and Gemini 2.5 Pro along with strong reasoning models. Above this tier you enter Opus and GPT-5 territory. Use this page when you care about quality but still want to keep bills sane.

Ranked by quality within budget

Input under $20/M, highest benchmark score first.

#	Model	Provider	In $/1M	Out $/1M	Context	avg score	Type
1	GPT-5.5	OpenAI	$5.00	$30.00	400K	85.0	Closed
2	GPT-5 Chat	OpenAI	$1.25	$10.00	128K	81.9	Closed
3	Qwen3.5 397B A17B	Alibaba Qwen	$0.39	$2.34	262K	78.4	OSS
4	DeepSeek V3.2 Speciale	DeepSeek	$0.40	$1.20	164K	78.2	OSS
5	Gemini 2.5 Pro Preview 05-06	Google DeepMind	$1.25	$10.00	1.0M	76.9	Closed
6	Step 3.5 Flash	stepfun	$0.10	$0.30	262K	76.9	OSS
7	MiMo-V2-Flash	xiaomi	$0.09	$0.29	262K	73.3	OSS
8	GPT-5.1-Codex-Max	OpenAI	$1.25	$10.00	400K	72.0	Closed
9	o4 Mini High	OpenAI	$1.10	$4.40	200K	72.0	Closed
10	Qwen3.6 Plus	Alibaba Qwen	$0.33	$1.95	1.0M	70.9	OSS
11	GPT-5.2-Codex	OpenAI	$1.75	$14.00	400K	70.6	Closed
12	GLM 5.1	z-ai	$1.05	$3.50	203K	70.2	OSS
13	Palmyra X5	writer	$0.60	$6.00	1.0M	69.7	Closed
14	Grok 3 Beta	xAI	$3.00	$15.00	131K	69.5	Closed
15	MiniMax M2	minimax	$0.26	$1.00	197K	69.5	OSS
16	GLM 4.5	z-ai	$0.60	$2.20	131K	69.2	OSS
17	gpt-oss-120b (free)	OpenAI	$0.00	$0.00	131K	68.7	OSS
18	GPT-5.1-Codex	OpenAI	$1.25	$10.00	400K	68.6	Closed
19	gpt-oss-20b	OpenAI	$0.03	$0.14	131K	67.4	OSS
20	gpt-oss-20b (free)	OpenAI	$0.00	$0.00	131K	66.4	OSS
21	Qwen3 30B A3B Thinking 2507	Alibaba Qwen	$0.08	$0.40	131K	66.2	OSS
22	GPT-4 Turbo (older v1106)	OpenAI	$10.00	$30.00	128K	65.4	Closed
23	Grok 3 Mini Beta	xAI	$0.30	$0.50	131K	64.8	Closed
24	Gemini 2.0 Flash Lite	Google DeepMind	$0.07	$0.30	1.0M	64.2	Closed
25	MiniMax M2.7	minimax	$0.30	$1.20	197K	63.5	OSS
26	Gemma 4 31B	Google DeepMind	$0.13	$0.38	262K	61.6	OSS
27	Qwen3 Next 80B A3B Thinking	Alibaba Qwen	$0.10	$0.78	131K	61.6	OSS
28	Gemini 3.1 Pro Preview	Google DeepMind	$2.00	$12.00	1.0M	60.6	Closed
29	GPT-5.1-Codex-Mini	OpenAI	$0.25	$2.00	400K	60.4	Closed
30	o3 Mini High	OpenAI	$1.10	$4.40	200K	60.4	Closed
31	LongCat Flash Chat	meituan	$0.20	$0.80	131K	59.8	OSS
32	Gemini 2.5 Flash Lite	Google DeepMind	$0.10	$0.40	1.0M	59.1	Closed
33	DeepSeek V3	DeepSeek	$0.32	$0.89	164K	59.0	OSS
34	GPT-5.4	OpenAI	$2.50	$15.00	1.1M	59.0	Closed
35	Qwen3 Max	Alibaba Qwen	$0.78	$3.90	262K	58.3	OSS
36	Qwen3 32B	Alibaba Qwen	$0.08	$0.24	41K	58.2	OSS
37	MiMo-V2-Pro	xiaomi	$1.00	$3.00	1.0M	58.1	Closed
38	R1 0528	DeepSeek	$0.50	$2.15	164K	57.9	OSS
39	Mixtral 8x7B Instruct	Mistral AI	$0.54	$0.54	33K	57.8	OSS
40	GLM 5	z-ai	$0.60	$1.92	203K	57.6	OSS
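
The ordering used in the table (input price under the budget, highest average score first) can be reproduced with a filter and a sort. This is a minimal sketch, not the site's actual ranking code: the `Model` class and `rank_within_budget` function are hypothetical names, only a handful of rows are copied from the table, and how the site breaks ties between equal scores is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float  # $ per 1M input tokens
    avg_score: float


def rank_within_budget(models, max_input_price):
    """Keep models priced strictly below the budget, best score first."""
    eligible = [m for m in models if m.input_price < max_input_price]
    # sorted() is stable, so models with equal scores keep their input order.
    return sorted(eligible, key=lambda m: m.avg_score, reverse=True)


# A few rows copied from the table above, deliberately out of order.
rows = [
    Model("DeepSeek V3.2 Speciale", 0.40, 78.2),
    Model("GPT-5.5", 5.00, 85.0),
    Model("GPT-4 Turbo (older v1106)", 10.00, 65.4),
    Model("GPT-5 Chat", 1.25, 81.9),
]

for rank, m in enumerate(rank_within_budget(rows, 20.00), start=1):
    print(rank, m.name, f"${m.input_price:.2f}", m.avg_score)
```

Lowering `max_input_price` to 5.00 drops GPT-5.5 as well as GPT-4 Turbo, because the comparison is strictly "below" the budget, matching the page's wording.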

Top 3 best-quality sub-$20 models

Best quality under $20

GPT-5.5 (OpenAI) leads the tier with an average score of 85.0, at $5.00 input and $30.00 output per 1M tokens and a 400K context window.

GPT-5 Chat (OpenAI) follows at 81.9, priced at $1.25 input and $10.00 output per 1M tokens with a 128K context window.

Qwen3.5 397B A17B (Alibaba Qwen) is the highest-ranked OSS model at 78.4, priced at $0.39 input and $2.34 output per 1M tokens with a 262K context window.

All three are appropriate when reasoning or reliability matters and you do not need the absolute frontier.

The price gap · cheapest vs most expensive

Cheapest

gpt-oss-120b (free)

$0.00/M

$ per 1M input tokens

Why the gap

At the top of this tier, pricing approaches frontier levels. Moving up to the true frontier (Opus, GPT-5) costs 2x to 5x more for roughly a 5 to 15 percent quality gain on most tasks.

Most expensive

GPT-4 Turbo (older v1106)

$10.00/M

$ per 1M input tokens

Frequently asked questions

Because most workloads don't need this tier. The $20/M ceiling catches Claude Sonnet, Gemini 2.5 Pro Ultra tiers, and some specialized reasoning models. A 10K-token prompt at that ceiling costs $0.20 for input alone. At scale, it adds up fast.
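
The per-request arithmetic above can be checked with a short calculation. The $20/M price and 10K-token prompt are the figures from the answer; the function name and the 100,000-request volume are assumptions added to show how the cost scales, and output tokens are set to zero to match the input-only figure.

```python
def request_cost(input_tokens, output_tokens, in_price, out_price):
    """Cost in dollars for one request, with prices given in $ per 1M tokens."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Input-only cost of a 10K-token prompt at the $20/M ceiling.
per_request = request_cost(10_000, 0, in_price=20.00, out_price=0.00)
print(f"Per request: ${per_request:.2f}")

# Hypothetical volume to show how it scales.
requests = 100_000
print(f"For {requests:,} requests: ${per_request * requests:,.2f}")
```

Passing real output token counts and an output price from the table gives the full request cost, which is often dominated by output for models with a high Out $/1M rate.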
